Part 3 of 3. Part 1 covered pointer chains. Part 2 covered pattern scanning and save editing. This is the one where everything shits the bed and elegant solutions are abandoned in favor of the good enough.

Finally, we arrive at Part 3 of our D2R trainer series. After Parts 1 and 2, the trainer could read and write stats, freeze values, and edit save files. But the player still died. The freeze thread was writing max HP back to the stat array at 100ms intervals (10 times a second), and that sounds fast until you realize what the game is doing under the hood. The damage function reads HP, subtracts damage, checks for death, and writes the result back in a single frame (~16ms at 60fps). Our thread was racing the game and still losing. The player took a hit, HP dropped, the death check fired, and only then did our thread write max HP back, too late to matter.

So, if we couldn’t win the data race, maybe we could stop the damage from happening in the first place. That meant finding the actual damage function in the binary and patching it.

Finding the Damage Function#

D2R.exe on disk is obfuscated. The file you’d load into Ghidra is packed, so all you’d see is garbled nonsense where the code should be. The real instructions only exist unpacked in the live process memory after the game’s own loader decompresses them at launch. That means you can’t do any static analysis. All reversing had to happen on the running game, which makes things significantly more tedious.

The PE Under Wine#

If you read Part 2, you already know that Wine doesn’t map PE binaries in a straightforward way. Under Wine/Proton, D2R’s PE binary maps as a single large rw-s memfd:wine-mapping region with no filename in /proc/pid/maps. The PE optional header reports ImageBase: 0x6ffffa410000, but that’s not where it actually loads; the actual base is in the low address ranges, varying per run due to ASLR.

PE base varies per run: 0x4610000, 0x3e40000, etc.
.text section: ~22MB of executable code

You’ll note that there is 22 MB of executable code here. It’s impractical to scroll through that looking for “the damage function.” This means that we need a strategy to narrow it down.

Caveat: 0x140000000 appears in maps as a ---p placeholder. It’s not the real image.
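
To make the maps hunting concrete, here is a minimal sketch of how one line of /proc/pid/maps could be parsed and tested against the "big rw-s memfd:wine-mapping region" description above. The names MapRegion, parse_maps_line, and is_image_candidate are mine, and the size threshold is an assumption, not something taken from the trainer's source.

```cpp
#include <cstdint>
#include <exception>
#include <optional>
#include <sstream>
#include <string>

// One parsed line of /proc/<pid>/maps (hypothetical helper type).
struct MapRegion {
    std::uint64_t start = 0;
    std::uint64_t end = 0;
    std::string perms;  // e.g. "rw-s" or "---p"
    std::string path;   // empty for anonymous mappings
};

// Parse "start-end perms offset dev inode [path]". Returns nothing on malformed input.
std::optional<MapRegion> parse_maps_line(const std::string& line) {
    std::istringstream in(line);
    std::string range, offset, dev, inode;
    MapRegion r;
    if (!(in >> range >> r.perms >> offset >> dev >> inode)) return std::nullopt;
    const auto dash = range.find('-');
    if (dash == std::string::npos) return std::nullopt;
    try {
        r.start = std::stoull(range.substr(0, dash), nullptr, 16);
        r.end = std::stoull(range.substr(dash + 1), nullptr, 16);
    } catch (const std::exception&) {
        return std::nullopt;
    }
    std::getline(in >> std::ws, r.path);  // rest of the line; stays empty if absent
    return r;
}

// Heuristic (assumption): the image is a shared writable wine-mapping of at least min_size bytes.
bool is_image_candidate(const MapRegion& r, std::uint64_t min_size) {
    return r.perms == "rw-s" &&
           r.path.find("memfd:wine-mapping") != std::string::npos &&
           (r.end - r.start) >= min_size;
}
```

Feeding every line of the maps file through parse_maps_line and keeping the regions that pass is_image_candidate should skip the ---p placeholder at 0x140000000, since its permissions never match.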

Scanning for References#

Here’s where our work from Part 2 pays off. D2R uses stat codes: 6 for HP, 7 for MaxHP, 8 for Mana, 9 for MaxMana (the same codes we used for the pattern scanner). Any function that deals damage has to reference these codes to know which stat to modify. This means that we can search the 22MB .text section for instruction patterns that load these constants, using instructions like lea edx, [r9 + 6]. For the unaware, lea means “load effective address”; technically it’s meant for computing memory addresses, but compilers love abusing it for quick arithmetic. In this case, if r9 is 0, lea edx, [r9 + 6] just puts 6 into edx. It’s faster and shorter than a mov in some contexts, so the compiler reaches for it when loading small values like stat codes.

The search turned up ~12 candidates. Most of them were clearly unrelated once you looked at the surrounding instructions (UI rendering, stat display, save file encoding). Cross-referencing those with nearby sub instructions (because damage = subtraction) narrowed it to one.
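
As a rough illustration of that narrowing step, the sketch below scans a buffer for the encoding of lea edx, [r9 + 6] (41 8D 51 06) and keeps only the hits that have a sub within a small window before them. It is a simplification: it only recognizes the one sub encoding that shows up in the handler (41 2B C7), where a real pass would want a disassembler, and the function names and window size are my own assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return every offset in `hay` where `needle` occurs (overlapping matches included).
std::vector<std::size_t> find_all(const std::vector<std::uint8_t>& hay,
                                  const std::vector<std::uint8_t>& needle) {
    std::vector<std::size_t> hits;
    if (needle.empty()) return hits;
    auto it = hay.begin();
    while ((it = std::search(it, hay.end(), needle.begin(), needle.end())) != hay.end()) {
        hits.push_back(static_cast<std::size_t>(it - hay.begin()));
        ++it;
    }
    return hits;
}

// Keep only the "load stat code 6" hits that have a sub eax, r15d
// somewhere in the `window` bytes before them.
std::vector<std::size_t> damage_candidates(const std::vector<std::uint8_t>& text,
                                           std::size_t window) {
    const std::vector<std::uint8_t> lea_hp = {0x41, 0x8D, 0x51, 0x06};  // lea edx, [r9 + 6]
    const std::vector<std::uint8_t> sub = {0x41, 0x2B, 0xC7};           // sub eax, r15d
    std::vector<std::size_t> out;
    for (std::size_t hit : find_all(text, lea_hp)) {
        const std::size_t lo = hit > window ? hit - window : 0;
        const auto first = text.begin() + static_cast<std::ptrdiff_t>(lo);
        const auto last = text.begin() + static_cast<std::ptrdiff_t>(hit);
        if (std::search(first, last, sub.begin(), sub.end()) != last) out.push_back(hit);
    }
    return out;
}
```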

The Damage Handler#

After performing the search, we found a cleanly structured damage handler at RVA ~0x396111 (relative virtual address, the offset from the PE base). It covers HP, Mana, and Stamina damage in one function with three nearly identical blocks. Here’s the HP block as an example:

mov r15d, [rdi+0x134]     ; load damage amount
test r15d, r15d            ; damage > 0?
jle skip                   ; no damage, skip

; ... GetStat calls read current HP into eax ...

sub eax, r15d              ; current_hp - damage
xor r9d, r9d               ; zero r9d
cmp eax, 0x100             ; alive check (0x100 = 1 displayed HP)
mov r8d, r13d              ; prepare death value (0)
mov rcx, rbx               ; unit pointer
cmovge r8d, eax            ; if alive, use calculated HP; else 0
lea edx, [r9 + 6]          ; stat code 6 = HP
call SetStat               ; write new HP

If you’ve never read x86 assembly before, the important thing to understand is that this is a fairly straightforward sequence. Basically, load the damage amount, check to see if it’s positive, subtract it from current HP, then check if the player is dead, and write the result back. The cmovge (conditional move if greater or equal) is the death check: if the result of the subtraction is less than 0x100 (which is 1 HP in display terms, because of the <<8 encoding from Part 2), the game writes 0 instead. Zero HP = dead.

Some important details about this function:

  • Damage struct offsets: [rdi+0x134] = HP, [rdi+0x124] = Mana, [rdi+0x128] = Stamina
  • HP values are shifted: raw >> 8 = display. Minimum alive = 0x100 (1 displayed HP)
  • rbx = unit pointer. [rbx] gives the unit type: 0=Player, 1=Monster
  • One function to rule them all: player AND monsters go through the same path
  • Three nearly identical blocks: HP at 0x396111, Mana at 0x396177, Stamina at 0x3961DD
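
For readers who would rather see the HP block as plain code, here is a small model of the arithmetic described above: the <<8 encoding plus the subtract-and-clamp step. It only mirrors the listing; the function names are mine, and treating the damage amount as being in the same raw units as HP is an inference from the sub instruction rather than something I have confirmed.

```cpp
#include <cstdint>

constexpr std::int32_t kHpShift = 8;                  // raw = display << 8
constexpr std::int32_t kMinAliveRaw = 1 << kHpShift;  // 0x100 = 1 displayed HP

constexpr std::int32_t to_raw(std::int32_t display) { return display << kHpShift; }
constexpr std::int32_t to_display(std::int32_t raw) { return raw >> kHpShift; }

// Mirrors the HP block: returns the value that would be handed to SetStat.
constexpr std::int32_t apply_damage(std::int32_t raw_hp, std::int32_t raw_damage) {
    if (raw_damage <= 0) return raw_hp;               // test r15d, r15d / jle skip
    const std::int32_t result = raw_hp - raw_damage;  // sub eax, r15d
    return result >= kMinAliveRaw ? result : 0;       // cmp eax, 0x100 / cmovge
}
```

With 100 displayed HP, a 30-point hit leaves to_raw(70), while any hit that would leave less than 0x100 raw is clamped to 0.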

The 24-byte signature we used to find this function at runtime (since ASLR moves the base every launch):

41 2B C7 45 33 C9 3D 00 01 00 00 45 8B C5 48 8B CB 44 0F 4D C0 41 8D 51

That’s the sub eax, r15d through lea edx, [r9+6] sequence, unique enough to have exactly one match in the entire .text section. Same technique as Part 2’s stat array scanner, just applied to code instead of data.
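
A minimal sketch of the "exactly one match" requirement, assuming the .text bytes have already been copied into a local buffer. parse_hex and find_unique are hypothetical helper names, not the trainer's actual functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Turn a spaced hex string such as "41 2B C7" into raw bytes.
std::vector<std::uint8_t> parse_hex(const std::string& text) {
    std::vector<std::uint8_t> out;
    std::istringstream in(text);
    std::string tok;
    while (in >> tok) {
        out.push_back(static_cast<std::uint8_t>(std::stoul(tok, nullptr, 16)));
    }
    return out;
}

// Offset of `sig` inside `hay` if it occurs exactly once; nothing for zero or multiple hits.
std::optional<std::size_t> find_unique(const std::vector<std::uint8_t>& hay,
                                       const std::vector<std::uint8_t>& sig) {
    if (sig.empty()) return std::nullopt;
    const auto first = std::search(hay.begin(), hay.end(), sig.begin(), sig.end());
    if (first == hay.end()) return std::nullopt;
    const auto second = std::search(first + 1, hay.end(), sig.begin(), sig.end());
    if (second != hay.end()) return std::nullopt;  // ambiguous signature
    return static_cast<std::size_t>(first - hay.begin());
}
```

Treating a second hit as a failure is a deliberate choice here: patching the wrong one of two matches is worse than refusing to patch at all.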

One gotcha here was that I had to keep the signature within a single block (HP only) because the Stamina block uses cmovns instead of cmovge. While they do almost the same thing (signed non-negative vs. signed greater-or-equal), they encode to different bytes, so a signature spanning all three blocks wouldn’t match. If you don’t account for that, your scanner returns zero matches and you’re left wondering why your “good” signature isn’t hitting anything.


Code Patching: Three Attempts, Three Crashes#

With the damage function found and understood, the obvious next step was to patch it. If sub eax, r15d is where damage happens, just make the game… not do it. Replace the instruction with something that does nothing, and the player can never lose HP. It sounds simple enough, but the reality proved far different.

Attempt 1: NOP the Subtraction#

This attempt was the most straightforward patch possible. Replace sub eax, r15d with NOPs (no-operation instructions, the CPU equivalent of “do nothing, move along”):

41 2B C7  →  90 90 90   (sub eax, r15d → nop nop nop)

We have to replace with the same number of bytes (3 NOPs for a 3-byte instruction) because x86 instructions are variable-length. If you change the byte count at any point, every subsequent instruction in the function gets misaligned. The CPU then starts trying to decode the middle of one instruction as the start of the next, which is… doubleplus ungood. You’d be executing completely random operations.
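
The same-length rule is easy to enforce in code. This is a small sketch, with pad_patch as a hypothetical helper name, that refuses a replacement longer than the original instruction and pads anything shorter with single-byte NOPs (0x90).

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Build a patch that is exactly `original_len` bytes long.
// Shorter replacements are padded with 0x90 (NOP); longer ones are rejected,
// because they would overwrite the start of the next instruction.
std::vector<std::uint8_t> pad_patch(std::vector<std::uint8_t> replacement,
                                    std::size_t original_len) {
    if (replacement.size() > original_len) {
        throw std::length_error("replacement is longer than the instruction it replaces");
    }
    replacement.resize(original_len, 0x90);
    return replacement;
}
```

Calling pad_patch({}, 3) produces the three NOPs used in this attempt.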

The patch worked immediately and the player was invulnerable. But there was also a problem I should have anticipated: since the same function handles both player and monster damage, monsters also stopped taking full damage. They weren’t invulnerable (other damage paths still existed), but something was clearly wrong with combat. Reflecting the wrongness of the whole situation, about 1-2 minutes into a fight, the game crashed to desktop.

Attempt 2: Code Cave with Player-Only Check#

Clearly, the NOP approach was too blunt, so I needed some kind of conditional patch. Only skip the subtraction if the target is the player. The technique for this is a “code cave”: find unused space in the binary, redirect execution there, do a check, and redirect back. If that sounds familiar to anyone who was around in the early oughts malware scene, it should. Back then we used to use the same trick for binary obfuscation: basically, pad the PE with an extra section, hijack the original entry point to jump into the cave, XOR-decode the real code, then jump back to OEP. Same concept, but with a different goal this time.

Compilers typically pad between functions with 0xCC bytes (the x86 INT3 debug interrupt) for alignment purposes. CPUs fetch instructions in aligned chunks, so pushing each function’s entry point onto a nice boundary means fewer fetch cycles. They use 0xCC instead of zeros or NOPs as a safety net: if execution were to ever accidentally fall through the end of a function into the padding, INT3 triggers a debug breakpoint instead of silently running garbage. Those bytes never execute during normal gameplay, so they’re safe to overwrite with our own code. We found a run of CC padding near the damage function and built a 29-byte trampoline there (a small chunk of code that the original function bounces through before jumping back, like a trampoline):

test rbx, rbx           ; null check (missiles have no unit pointer)
jz do_sub               ; if null, apply damage normally
cmp dword [rbx], 0      ; unit type: 0 = Player, 1 = Monster
jne do_sub              ; not player? apply damage
jmp return_addr         ; player? skip subtraction, god mode

do_sub:
sub eax, r15d           ; original instruction we replaced
jmp return_addr         ; continue normal execution

The assembly above first checks whether we even have a unit pointer (missiles and some environmental damage don’t) and, if not, applies damage normally. Next, if we do have a pointer, it checks whether it’s a player (type 0) or a monster (type 1). Monsters get the original subtraction. Players skip it entirely.
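
Two mechanical pieces of a code cave are worth sketching: finding a long enough run of 0xCC padding, and encoding the jumps in and out. The version below assumes a plain 5-byte near jump (E9 followed by a 32-bit relative offset); find_cave and encode_jmp are illustrative names of mine. Note that a 5-byte jump does not fit in the 3-byte sub on its own, so the hook site has to cover neighboring instructions as well, which this sketch does not handle.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Offset of the first run of at least `need` consecutive 0xCC bytes.
std::optional<std::size_t> find_cave(const std::vector<std::uint8_t>& code, std::size_t need) {
    std::size_t run = 0;
    for (std::size_t i = 0; i < code.size(); ++i) {
        run = (code[i] == 0xCC) ? run + 1 : 0;
        if (need > 0 && run >= need) return i + 1 - need;
    }
    return std::nullopt;
}

// Encode `jmp rel32` placed at address `from` and landing on `to`.
// The offset is measured from the end of the 5-byte instruction.
std::optional<std::array<std::uint8_t, 5>> encode_jmp(std::uint64_t from, std::uint64_t to) {
    const std::int64_t rel =
        static_cast<std::int64_t>(to) - static_cast<std::int64_t>(from) - 5;
    if (rel < std::numeric_limits<std::int32_t>::min() ||
        rel > std::numeric_limits<std::int32_t>::max()) {
        return std::nullopt;  // target is out of reach for a near jump
    }
    const auto bits = static_cast<std::uint32_t>(static_cast<std::int32_t>(rel));
    std::array<std::uint8_t, 5> out{};
    out[0] = 0xE9;
    for (int i = 0; i < 4; ++i) {
        out[1 + i] = static_cast<std::uint8_t>((bits >> (8 * i)) & 0xFF);  // little-endian
    }
    return out;
}
```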

This attempt was more satisfying. Player appeared to be invulnerable. Monsters died normally. Combat felt right. And then… another crash. Same timeframe. Same behavior. About 1-2 minutes of combat and the process just dies.

This was confusing. The crash from Attempt 1 could have been explained by the monster damage issue, maybe the game got into a bad state when monsters couldn’t die properly. But now monsters were dying fine and we were still crashing. Clearly, something else was going on.

Attempt 3: Zero the Damage Load#

Maybe the crash was related to the code cave redirect itself: the jump instruction, the different execution path, something about the control flow change might’ve been triggering a detection mechanism. So for the third attempt, I tried the most minimal possible change: don’t redirect execution at all. Instead, zero out the damage value before it’s ever used.

44 8B BF 34 01 00 00  →  45 31 FF 0F 1F 40 00
(mov r15d,[rdi+134h])     (xor r15d,r15d + 4-byte nop)

xor r15d, r15d is the standard way to zero a register in x86 (XOR anything with itself = 0). The remaining bytes are a 4-byte NOP to pad out to the same instruction length. With r15d=0, the game’s own test r15d, r15d; jle branch naturally skips the entire damage block, and we’re not even changing the control flow, just making the data say “zero damage.” The game’s own logic handles the rest.

This is the cleanest possible patch. Just seven bytes modified, no jumps, no caves, no new code. The original function still runs in exactly the same order, it just sees zero damage every time and exits early through its own branch.

Same result. Seems to work perfectly, then crashes after 1-2 minutes. Maddening.

The Verdict#

Where does this put us? Three completely different patching techniques (NOPs, a code cave with conditional logic, and a minimal register zero) all crashing identically. At this point the evidence was pretty overwhelming:

  • Game ran stable for hours with the trainer connected and NO patches applied
  • All three approaches crashed identically (within 1-2 minutes of combat)
  • No CRC32 instructions visible in .text section, no obvious integrity check mechanism
  • Local crash, not a Battle.net disconnect: this was the game process dying, not Warden kicking us

This leads us to conclude that D2R has integrity checking that detects .text modifications. It doesn’t matter how elegant your patch is or whether you change 3 bytes or 7, the checker sees the modification and pulls the plug.


Pulse Patching#

At this point I knew the integrity checker was catching the modifications, but I didn’t know how often it scanned. If it runs on a timer (say, every 30 seconds), there’s a window between scans where modified bytes exist undetected. The idea behind pulse patching is to exploit that window: apply the patch, enjoy a few seconds of invulnerability, then revert to the original clean bytes before the next scan has a chance to run. If we get the timing right, the checker never sees the modified code.

Apply patches → wait 10s → Revert patches → wait 100ms → repeat

A dedicated thread cycled patches on and off using process_vm_writev. During the 10-second patched window, the player is invulnerable. During the 100ms clean window, the original bytes are back in place and the integrity checker (hopefully) sees unmodified code and moves on. The 100:1 ratio meant the player would only be vulnerable for roughly 1% of the time, although at ~16ms per frame each 100ms clean window still spans about six frames.
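
The cycle itself is simple enough to sketch. To keep the example testable without a target process, the memory writer is passed in as a callback that stands in for a process_vm_writev wrapper; Patch, WriteFn, pulse, and clean_fraction are all names I made up for illustration.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

struct Patch {
    std::uint64_t address;               // where the bytes live in the target
    std::vector<std::uint8_t> original;  // clean bytes to restore
    std::vector<std::uint8_t> patched;   // modified bytes to apply
};

// Stand-in for a remote write (assumption: returns false on failure).
using WriteFn = std::function<bool(std::uint64_t, const std::vector<std::uint8_t>&)>;

// Run `cycles` apply/revert pulses. Stops and returns false on the first failed write;
// on success the target is left with the original bytes in place.
bool pulse(const std::vector<Patch>& patches, const WriteFn& write, int cycles,
           std::chrono::milliseconds patched_for, std::chrono::milliseconds clean_for) {
    for (int i = 0; i < cycles; ++i) {
        for (const auto& p : patches) {
            if (!write(p.address, p.patched)) return false;
        }
        std::this_thread::sleep_for(patched_for);
        for (const auto& p : patches) {
            if (!write(p.address, p.original)) return false;
        }
        std::this_thread::sleep_for(clean_for);
    }
    return true;
}

// Fraction of each cycle during which the clean bytes are in place.
constexpr double clean_fraction(double patched_ms, double clean_ms) {
    return clean_ms / (patched_ms + clean_ms);
}
```

For the 10s/100ms timing used here, clean_fraction(10000, 100) comes out just under 0.01.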

Unfortunately, the game still crashed. I tried adjusting the timing (shorter patched windows, longer clean windows) and nothing helped. The integrity scan is either too frequent, too fast, or triggered by something other than a periodic timer (maybe it checksums on function entry, or uses a hardware watchpoint, or something else entirely). I never did figure out exactly how it detects the changes. Just that it does, reliably, regardless of timing. Maybe this is an exercise best left to the reader :).


DLL Injection#

At this point, I considered the external code patching via process_vm_writev a dead end. No matter what I changed or how carefully it was changed, the integrity checker caught it. The next idea was to get the code inside the process. If we’re running in D2R’s address space, we can use VirtualProtect (Windows API) to change page permissions and patch from within. The reasoning was that integrity checkers typically need to allow-list self-modifications, because the game’s own code performs them during unpacking and runtime code generation. If our modifications come from inside the process, maybe they’d be treated as legitimate.

CreateRemoteThread#

This is the classic Windows DLL injection technique. You allocate memory in the target process, write your DLL path into it, then create a new thread in the target that calls LoadLibraryA with your path as the argument. When the thread runs, Windows loads your DLL and calls its DllMain entry point, giving you a foothold inside the process.

To add to the oddities, since I was doing all of this on Linux, building a Windows DLL meant cross-compiling with MinGW (x86_64-w64-mingw32-g++ targets 64-bit Windows from a Linux host):

# Build the DLL that contains our trainer logic
x86_64-w64-mingw32-g++ -shared -o trainer.dll src/trainer.cpp -lpsapi -static
# Build the injector that loads the DLL into D2R's process
x86_64-w64-mingw32-g++ -o injector.exe src/injector.cpp -lpsapi -static

The good part was that the DLL loaded and ran inside D2R, but the bad was that I immediately hit the PE base address problem from Part 2 again. GetModuleHandle(nullptr) returns 0x6ffffa410000 (the PE ImageBase from the header, i.e. what the binary thinks its address is), not the actual mapped base (where Wine actually put it, somewhere in the low address ranges). The .text at that ImageBase address is encrypted/packed with the real unpacked code living in Wine’s memfd regions at a different address.

Unfortunately, before I could even attempt patching from inside, Warden detected the injected DLL as “unsupported software” and killed the process. Apparently Warden enumerates loaded modules and flags unknown ones, too; good times!

DLL Sideloading (version.dll Proxy)#

Anyway, if Warden detects injected DLLs, what about DLLs that the game loads on its own? Windows applications import from system DLLs like version.dll, and the OS searches the application directory first (before system32). If we place a fake version.dll in D2R’s folder, the game loads it instead of the system one, no injection needed, no CreateRemoteThread, just a file sitting in the right place waiting to be picked up at runtime.

I then built a proxy that forwards all 16 version.dll exports to the real system DLL while running our trainer code on attach:

LIBRARY version
EXPORTS
    GetFileVersionInfoA = Proxy_GetFileVersionInfoA
    GetFileVersionInfoW = Proxy_GetFileVersionInfoW
    ; ... 14 more ...

The way a proxy DLL works is: our fake version.dll first loads the real one from system32 using LoadLibrary. Then for each of the 16 functions that version.dll exports, it looks up the real function’s address with GetProcAddress and stores that pointer. When the game calls any version.dll function, our proxy just forwards the call to the real one. The game has no idea it’s talking to a middleman.

The interesting part happens at load time. When Windows loads a DLL, it calls the DLL’s DllMain entry point with the reason DLL_PROCESS_ATTACH, which is basically saying “hey, you’ve been loaded, do your setup.” Our proxy used that moment to spawn a trainer thread. That thread used VirtualQuery to walk through the process’s memory map looking for executable regions, since the PE ImageBase issue from Part 2 meant we couldn’t trust GetModuleHandle to give us the real base address.

Warden detected this too. Modern Warden apparently checks for known proxy DLL names in the game directory. Whether that’s a hash check, a filename check, or a comparison against a known-good module list, I’m not sure, but the game died within seconds of launch.

Table flip


The Winning Approach: Data-Only God Mode#

So let’s recap the walls we’ve been bouncing off of:

  • External code patching → detected by integrity checker
  • Pulse timing → still detected
  • DLL injection → detected by Warden
  • DLL sideloading → also detected by Warden

After all of that (reverse engineering the damage function down to individual bytes, cross-compiling Windows DLLs on Linux, building code caves, implementing pulse timing) the answer turned out to be embarrassingly easy.

Remember the freeze thread from Part 2? It failed at 100ms because it couldn’t win the race against the damage function. But process_vm_readv and process_vm_writev are fast: each call is a single kernel syscall with very little overhead. What if we just… made it run faster?

The integrity checker is watching the code. Warden is watching the modules. But is anything watching the data itself? What if we stopped trying to change how the game runs and just fixed the results after the fact?

The HP Guardian#

Instead of patching code, a dedicated thread reads HP from ALL discovered player stat arrays and writes MaxHP back whenever HP drops, at 1ms intervals:

std::thread god_thread([&]() {
    while (running) {
        if (!god_mode || pid <= 0) {
            // When god mode is off, sleep longer to avoid wasting CPU
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            continue;
        }

        // The stat array moves in memory (zone transitions, level-ups).
        // Re-scan every 5 seconds to track its current location.
        // We find ALL stat arrays in memory, not just one. D2R sometimes
        // has multiple copies (shadow arrays, cached copies, etc.)
        if (needs_rescan) {
            auto all = find_all_stat_arrays(pid);
            // Filter to player arrays by checking Str/Ene/Dex/Vit values
            // match what we expect for the player character
        }

        // The actual guard: for each known stat array location,
        // read current HP, compare to MaxHP, fix if it dropped.
        for (auto& ga : god_addrs) {
            int32_t hp = 0, maxhp = 0;
            read_memory(pid, ga.hp_addr, &hp, 4);
            read_memory(pid, ga.maxhp_addr, &maxhp, 4);

            // Only write if maxhp is valid AND hp actually dropped.
            // Writing unconditionally would waste syscalls on the 99.9%
            // of iterations where nothing changed.
            if (maxhp > 0 && hp < maxhp) {
                write_memory(pid, ga.hp_addr, &maxhp, 4);
                god_fixes++;  // tracking for the status display
            }
            // Same logic for mana
        }

        // 1ms = 1000 checks per second. Fast enough to catch damage
        // between the SetStat call and the death check.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
});

Why It Works#

The reason all our previous approaches failed is that they modified the game’s code, and the game was watching for exactly that. This approach never touches the .text section. It only reads and writes the data itself, that is, the stat arrays that live in heap memory. From the game’s perspective, nothing suspicious is happening. The integrity checker keeps scanning .text and finding it unchanged. Warden has nothing to flag because there’s no injected DLL, no foreign threads, and no modified instructions.

The speed itself is the other half of the trick. At 1ms polling (1000 checks per second), we’re finally fast enough to catch the window between when the damage function writes reduced HP and when the death check evaluates it. Each iteration does 4 reads and maybe 2 writes, all single kernel syscalls that complete in microseconds. The 5-second rescan handles stat array relocation (zone transitions, level-ups) so we’re never writing to stale addresses.
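
The listing above gates the rescan on a needs_rescan flag without showing how the 5-second cadence is produced. One way it could be done, and this is purely my assumption rather than the trainer's actual code, is a tiny timer object that the loop asks once per iteration:

```cpp
#include <chrono>

// Hypothetical helper: reports when a periodic rescan is due.
class RescanTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit RescanTimer(Clock::duration interval) : interval_(interval) {}

    // True on the first call, and again whenever `interval` has elapsed
    // since the last time this returned true.
    bool due(Clock::time_point now) {
        if (!armed_ || now - last_ >= interval_) {
            last_ = now;
            armed_ = true;
            return true;
        }
        return false;
    }

    // Force the next due() to return true, e.g. after a read from a stale address fails.
    void force() { armed_ = false; }

private:
    Clock::duration interval_;
    Clock::time_point last_{};
    bool armed_ = false;
};
```

In the guard loop this would read as something like needs_rescan = timer.due(RescanTimer::Clock::now()). Using steady_clock rather than the wall clock keeps the interval immune to system time changes.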

There is a catch though. The pattern scanner from Part 2 finds every stat array in the process, not just the player’s. Monsters have the same stat structure with HP, Mana, and the same encoding. If we blindly wrote MaxHP to every match, we’d be healing every enemy in the game too. To filter down to just the player, we check the base stats at known offsets in each array (Strength, Energy, Dexterity, Vitality). Monsters don’t have base stats in the same ranges as a player character, so if the values fall within reasonable bounds for a D2 class, it’s probably us.
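
The filter can be as simple as a range check on the four base stats. The sketch below is illustrative only: the BaseStats layout, the assumption that the values are already in display units, and the 10 to 1023 bounds are placeholders of mine, so tune them against real character data before trusting them.

```cpp
#include <cstdint>

// The four base stats read from a candidate stat array (hypothetical layout).
struct BaseStats {
    std::int32_t strength;
    std::int32_t energy;
    std::int32_t dexterity;
    std::int32_t vitality;
};

constexpr bool in_range(std::int32_t v, std::int32_t lo, std::int32_t hi) {
    return v >= lo && v <= hi;
}

// True when all four stats fall inside bounds that are plausible for a player character.
// The default bounds are assumptions, not values taken from the game.
constexpr bool looks_like_player(const BaseStats& s,
                                 std::int32_t lo = 10, std::int32_t hi = 1023) {
    return in_range(s.strength, lo, hi) && in_range(s.energy, lo, hi) &&
           in_range(s.dexterity, lo, hi) && in_range(s.vitality, lo, hi);
}
```

Requiring all four stats to pass keeps the check cheap while making accidental matches on monster arrays less likely.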

Results#

⚔ GOD MODE. 4 arrays, 117 fixes / 1,400,821 checks, 82 rescans
HP: 380381b4 = 9599/8191
HP: 5bad0684 = 8191/8191
HP: 62f02234 = 8191/8191
HP: 885a8184 = 8191/2048

So, after all of this time, testing, and pain, I was finally able to stand in a crowd of enemies for 45+ seconds without dying. No crashes, no disconnects, no evil Warden detection. The game remained completely stable. Looking at the stats, 117 fixes out of 1.4 million checks means the thread caught and reversed every single damage event. The other 1,400,704 checks found HP already at max and did nothing, just a few microseconds of wasted syscalls per iteration, essentially free.

After building code caves, cross-compiling Windows DLLs, implementing pulse timing, and reverse engineering the damage function down to individual bytes, the solution was to just write to memory really fast. Sometimes the solution is a bit of a facepalm moment in retrospect. But the understanding we built in knowing what didn’t work is what made the simple approach actually work: we knew the exact memory layout, the encoding scheme, the timing constraints, and which detection mechanisms to avoid.


Lessons Learned#

Anti-tamper is far more pervasive than I’d expected#

D2R’s integrity checker catches .text modifications within minutes. NOP patches, code caves, load zeroing, pulse timing, nothing evades it from userspace. The checker likely hashes code sections periodically and triggers a controlled crash on mismatch. I never found the checker itself (it could be in a region I wasn’t scanning, or obfuscated beyond recognition), but the evidence is conclusive.

Warden understands injection techniques#

Both CreateRemoteThread injection and version.dll sideloading were detected immediately. Modern Warden enumerates loaded modules and checks for known proxy DLL names. Getting code inside the process without detection would probably require a kernel driver or manual mapping (allocate memory, copy the DLL image manually, never call LoadLibrary). Neither seemed worth the effort for an offline trainer.

Simple beats clever#

The 1ms polling thread uses essentially 0% CPU (it sleeps 1ms, does a handful of tiny syscalls, sleeps again) and is completely invisible to anti-cheat because it never modifies code. All the sophisticated approaches failed because they triggered detection mechanisms. The brute-force data-only approach worked because it operates in a space the game doesn’t actively monitor.

Wine’s layers of pain#

  • PE binaries map as memfd:wine-mapping with no filename
  • GetModuleHandle returns the PE ImageBase, not the actual load address
  • Hardware watchpoints via ptrace don’t work (SIGSEGV instead of SIGTRAP)
  • Zombie processes can leave VRAM allocated (need kill -17 to parent)
  • Settings.json Window Mode resets on prefix upgrades (use mode 2 for borderless)

Conclusion#

I think one of the broader lessons here is that sometimes you spend inordinate amounts of time building complicated/involved solutions (reverse engineering, cross-compiling, timing exploits) only to find that it was the simple approach that was the correct one all along. The damage function still runs. The anti-tamper still does its thing. But the data gets fixed before anyone notices, and the game has no idea it happened.

I honestly came into this project thinking that it wouldn’t be that hard (silly me!). I knew how to use scanmem: attach to a process, search for a value, change something in-game, filter, repeat until you find the address, poke a new value in, etc. That part is easy. It’s also tedious and 100% non-repeatable. Every time the game restarts, every address changes and you’re doing the whole dance again.

As stated in the beginning of the series, what I wanted to understand was how the people who build trainers for a living (CheatHappens, WeMod, the folks who ship polished tools that just work every patch) actually do it. The answer involves a lot more than I expected: PE format internals, pattern scanning, bitfield-encoded save files, x86 reverse engineering, anti-tamper systems, DLL injection techniques, and ultimately the humbling realization that the solution doesn’t have to be complicated in order to be the right one.

I have a genuine respect now for the people who do this across dozens or hundreds of games, each with their own protection schemes and data formats. It’s certainly legitimate engineering, just pointed at a target that doesn’t get as much credit as it should.

The full source for this project can be found on GitHub: axiom0x0/d2r-trainer