SlimePE-rev — Windows PE32+ x86_64 binary → ASM + C reverse transpiler
Recover NASM and C sources from a vendor-lost Windows .exe, bit-exact.
Convert source-lost Windows x86_64 PE32+ binaries from Microsoft legacy estates,
industrial control systems and vendor-lost .exe / .dll into both NASM intel source
and C source, bit-exact. The emitted NASM is re-assembled with nasm -fwin64 + mingw-link; the emitted C is
compiled with x86_64-w64-mingw32-gcc -nostdlib -e _start; both produce real native
.exe binaries that, when executed under WSL2 binfmt, reproduce the original binary's stdout byte-for-byte.
- Phase A (shared with SlimeELF-rev): x86_64 instruction decoder (~37 opcodes, integer hot-loop subset) + NASM intel emitter + straight-line C emitter
- Phase B entry: CFG recovery (Cooper-Harvey-Kennedy iterative dominator + Aho/Sethi natural loop body) + structured C (do { ... } while (R[1] != 0); + if/else diamond)
- Phase B (b): function-boundary recovery (prologue/epilogue + call/ret + self-recursion)
- Phase B (d): inter-procedural Slot IR, unified schema with SlimeELF-rev, deterministic JSON round-trip
- Phase D, PE32+ specific layer: PECOFF v8.3 parser + DLL import resolution (IAT slot → function name) + Win64 ABI (rcx / rdx / r8 / r9 + shadow space [rsp+0x20]) + WinAPI recipe table (9 functions across kernel32 + msvcrt)
- S9 bench: PE 8 axes 168/168 + Phase E lift v2 21/21 = 189/189 PASS (2026-05-19), 21 samples (msvcrt + kernel32 unified, now including printf/GetModuleHandleA/CreateFileA/CloseHandle). Libc-linked PE binaries and MinGW local-thunk routed IAT calls are all green across ASM and C round-trips
- Phase E lift v2: stack-slot promotion per function + Win64 arg spill recovery ([rbp+0x10], [rbp+0x18]); 21/21 lifted-C .exe match original stdout byte-for-byte
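To make the Phase B entry step concrete, here is a minimal Python sketch of the Cooper-Harvey-Kennedy iterative dominator computation followed by natural-loop body collection on a toy CFG. The dict-of-successors CFG representation, the block names and the function names are assumptions made for illustration; they are not SlimePE-rev's internal API.

```python
def postorder(succ, entry):
    """Blocks reachable from entry, in depth-first postorder."""
    seen, order = {entry}, []

    def visit(node):
        for nxt in succ.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                visit(nxt)
        order.append(node)

    visit(entry)
    return order


def immediate_dominators(succ, entry):
    """Iterative idom computation (Cooper-Harvey-Kennedy)."""
    order = postorder(succ, entry)
    number = {node: i for i, node in enumerate(order)}
    preds = {node: [] for node in order}
    for node in order:
        for nxt in succ.get(node, []):
            preds[nxt].append(node)

    idom = {entry: entry}

    def intersect(a, b):
        # Walk both fingers up the dominator tree until they meet.
        while a != b:
            while number[a] < number[b]:
                a = idom[a]
            while number[b] < number[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for node in reversed(order):  # reverse postorder
            if node == entry:
                continue
            ready = [p for p in preds[node] if p in idom]
            new_idom = ready[0]
            for p in ready[1:]:
                new_idom = intersect(new_idom, p)
            if idom.get(node) != new_idom:
                idom[node] = new_idom
                changed = True
    return idom


def natural_loop(succ, tail, head):
    """Body of the natural loop for the back edge tail -> head."""
    preds = {}
    for node, nxts in succ.items():
        for nxt in nxts:
            preds.setdefault(nxt, []).append(node)
    body, work = {head, tail}, [tail]
    while work:
        node = work.pop()
        if node == head:
            continue  # never walk past the loop header
        for p in preds.get(node, []):
            if p not in body:
                body.add(p)
                work.append(p)
    return body


# Toy CFG shaped like a do-while: entry -> body -> latch -> (body | exit)
cfg = {"entry": ["body"], "body": ["latch"], "latch": ["body", "exit"], "exit": []}
idom = immediate_dominators(cfg, "entry")
loop = natural_loop(cfg, tail="latch", head="body")
```

In this toy graph the edge latch → body is a back edge because body dominates latch. The collected loop body is the region a structurer would wrap in the do { ... } while (...) form.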
A reverse transpiler for “vendor-lost Windows .exe”, built on deterministic translation + 8-axis round-trip auto-regression + audit chain.
Paired with sister product SlimeELF-rev for Linux ELF (same Slot IR, shared decoder), and forward sister SlimeASM for HLASM + MASM forward.
Phase E v3 (loop / natural-cond recovery) and Phase F PSDP (auto OpenMP parallelisation) are operational on the ELF side first; PE32+ extension is on the roadmap. The instruction decoder is shared, so the lift work carries over once MinGW's gcc -O0 CFG patterns are absorbed.
Key measurements (2026-05-19)
- msvcrt + kernel32 unified, printf / CreateFileA / lstrlenA included
- true locals + Win64 arg spill recovered
- WriteFile / GetStdHandle / ExitProcess / lstrlenA / GetModuleHandleA / CreateFileA / CloseHandle / puts / printf
- shared with SlimeELF-rev, call graph as first-class edge
- SIB / movzx / movsx / cdqe / movsxd / shl-shr-sar / and-or, shared with SlimeELF-rev
- kernel32 + msvcrt imported together (sample 19 lstrlenA + puts)
Market context — where source-lost Windows binaries live
| Segment | Context |
|---|---|
| Vendor-lost .exe | Vendor went out of business or refuses to maintain; only the .exe / .dll remain. Source recovery is required for SBOM, CVE patching, and audit. |
| Microsoft legacy estates | Long-running Windows-based business apps with no surviving source repository. C-source recovery enables a cleanroom rewrite plan. |
| Industrial control / OT | Closed Windows-based PLC HMIs, instrumentation daemons, factory MES adapters frozen for 10-30 years. |
| Compliance documentation | FDA / PMDA / IEC 62443 OT-security obligate “complete software description” even when only the binary is on hand. Binary-only components must be lifted to C as auditor-reproducible documentation. |
| Competitive landscape | Ghidra / IDA Pro / Hex-Rays / RetDec already exist. SlimePE-rev differentiates on three axes: (1) determinism + 8-axis round-trip auto-regression proves “lossless”; (2) single unified Slot IR shared with SlimeELF-rev enables cross-OS audit pipelines; (3) decompile output compiles with mingw-gcc and runs as a real .exe whose stdout matches the original — Hex-Rays produces static C but does not guarantee build + run round-trip. |
S9 bench — all 8 axes: PE 168/168 PASS (bit-level)
The same 8-axis S9 bench harness used for ELF, applied to PE32+ at bit-level. Decoder (~37 opcodes) and Slot IR schema are shared with SlimeELF-rev; only the container layer (PE parser, IAT resolution, Win64 ABI) is PE-specific.
| Axis | Check and result |
|---|---|
| Axis 1a PE dialect-detect | DOS magic “MZ” + PE signature “PE\0\0” + COFF Machine=0x8664 (IMAGE_FILE_MACHINE_AMD64) + Optional Header Magic=0x20B (PE32+) + Subsystem=3 (CONSOLE) validated at bit-level. 21/21 PASS. |
| Axis 1b opcode-recover | Every instruction within .text (VirtualSize-limited live region) decoded — db 0xNN fallback count = 0. 21/21 PASS. |
| Axis 2 mutation-detect | 1-bit flip in .text, 5 trials × 21 samples = 105 trials, 105/105 detected. |
| Axis 3 determinism | Same .exe disassembled twice → byte-equal across all 21 samples. 21/21 PASS. |
| Axis 4 ASM round-trip | emit NASM → nasm -fwin64 → mingw-link (-lmsvcrt -lkernel32) → real .exe → WSL2 binfmt execute → stdout matches original. IAT references are emitted as extern __imp_FUNC so the linker regenerates the PE import table. 21/21 PASS. |
| Axis 5 C round-trip | emit C → mingw-gcc -nostdlib -e _start → real .exe → stdout matches original. The IAT-indirect call mov reg, qword [rip+iat]; call reg is collapsed by a peephole into a synthetic call_iat <FUNC> and expanded to a direct kernel32 / msvcrt call with Win64 ABI arguments pulled from rcx/rdx/r8/r9 / [rsp+0x20]. 21/21 PASS. |
| Axis 6 structured-C round-trip | CFG-recovered structured C (do-while + if/else + per-function) → mingw → run → match. 21/21 PASS, including recursive fact(4) = 24. |
| Axis 7 Slot IR round-trip | The PE32+ binary is lifted into the same Slot IR schema used for ELF. SlotFunctions, call graph edges and structural equality all round-trip. 21/21 PASS, with 06_call_simple (2fn/1call) / 07_two_funcs (3fn/2call) / 08_recursion (2fn/2call, self-loop) reconstructing call graphs identical to their Linux counterparts. |
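The Axis 1a checks in the table above can be sketched in a few lines of Python. This is a minimal illustration, not the product's parser: the function names are hypothetical, and the synthetic image built here contains headers only (it is not a loadable .exe). The field offsets used (e_lfanew at 0x3C, Machine at the start of the COFF header, Magic at offset 0 and Subsystem at offset 68 of the Optional Header) follow the published PE/COFF layout.

```python
import struct


def is_pe32plus_console(data: bytes) -> bool:
    """Check the Axis 1a fields: MZ, PE signature, Machine, Magic, Subsystem."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    coff = e_lfanew + 4        # COFF header follows the 4-byte PE signature
    opt = coff + 20            # Optional Header follows the 20-byte COFF header
    if len(data) < opt + 70:   # need bytes up to and including Subsystem
        return False
    if data[e_lfanew:coff] != b"PE\0\0":
        return False
    (machine,) = struct.unpack_from("<H", data, coff)
    (magic,) = struct.unpack_from("<H", data, opt)
    (subsystem,) = struct.unpack_from("<H", data, opt + 68)
    return machine == 0x8664 and magic == 0x20B and subsystem == 3


def synthetic_header(machine=0x8664, magic=0x20B, subsystem=3) -> bytes:
    """Build a header-only test image (hypothetical helper, not a real .exe)."""
    buf = bytearray(0x40 + 4 + 20 + 240)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)             # e_lfanew
    buf[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", buf, 0x44, machine)          # COFF Machine
    struct.pack_into("<H", buf, 0x58, magic)            # Optional Header Magic
    struct.pack_into("<H", buf, 0x58 + 68, subsystem)   # Subsystem
    return bytes(buf)
```

A 32-bit image (Machine=0x14C, Magic=0x10B) or a GUI-subsystem image (Subsystem=2) is rejected by the same predicate, which is the behaviour a dialect gate needs before any decoding starts.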
PE32+ specific layer (Phase D)
- PECOFF v8.3 parser: DOS header (64B, magic MZ + e_lfanew @ 0x3C) → PE signature (4B) → COFF header (20B, Machine=0x8664 required) → Optional Header (PE32+ 240B, Magic=0x20B required) → Section Table parsed at bit-level. VirtualSize bounds the live .text region.
- DLL import resolution: DataDirectory[1] Import Table walked: _IMAGE_IMPORT_DESCRIPTOR array + INT/IAT thunk array (8B, ordinal flag = bit 63) + Hint/Name table. Each IAT slot VA resolves to DLL!function, e.g. 0x140003070 → msvcrt.dll!puts.
- IAT-call peephole: The two-instruction pair mov reg, qword [rip+iat]; call reg that gcc -O0 + MinGW emits is folded into a single synthetic call_iat <FUNC> and lowered to a direct C call via Win64 ABI.
- Win64 ABI recipe: Argument slots: rcx (R[1]) / rdx (R[2]) / r8 (R[8]) / r9 (R[9]) / 5th at mem_r(R[4]+0x20, 8). String arguments (LPCSTR) are reconstructed by reading the virtual VA byte-by-byte via mem_r into a local buffer before passing to the real kernel.
- WinAPI recipe table (extensible): Currently 9 functions. kernel32: GetStdHandle / WriteFile / ExitProcess / lstrlenA / GetModuleHandleA / CreateFileA / CloseHandle; msvcrt: puts / printf (1-arg form). Adding a new API is a 3-step procedure (PROLOGUE dllimport declaration + recipe table entry + link command -l<dll>); the decoder, CFG and Slot IR layers stay untouched.
- Trailing thunk trim (call targets preserved): The jmp qword [rip+iat] thunk table at the .text tail is excluded from the SlotImage (no spurious functions). Local thunks reachable from call rel32 are preserved (e.g. MinGW printf → puts optimisation routes via a local <puts> thunk; dropping it would leave the call target undefined).
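The 8-byte thunk decoding mentioned under DLL import resolution can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, and the Hint/Name reader takes a file offset directly, so the RVA-to-file-offset mapping through the Section Table is deliberately left out.

```python
import struct

ORDINAL_FLAG64 = 1 << 63  # bit 63 set means "import by ordinal" in PE32+


def decode_thunk64(raw: bytes):
    """Classify one 8-byte INT/IAT thunk entry."""
    (value,) = struct.unpack("<Q", raw)
    if value == 0:
        return ("end", None)                   # a zero entry terminates the array
    if value & ORDINAL_FLAG64:
        return ("ordinal", value & 0xFFFF)     # low 16 bits hold the ordinal
    return ("name_rva", value & 0x7FFFFFFF)    # low 31 bits: RVA of Hint/Name entry


def read_hint_name(image: bytes, offset: int):
    """Read a Hint/Name entry: 2-byte hint, then a NUL-terminated ASCII name."""
    (hint,) = struct.unpack_from("<H", image, offset)
    end = image.index(b"\0", offset + 2)
    return hint, image[offset + 2:end].decode("ascii")
```

Walking the thunk array until the "end" entry and resolving each "name_rva" through the Hint/Name table yields the IAT slot → DLL!function map that the later peephole consumes.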
Phase E lift v2 — true-local-variable + Win64 arg spill recovery (21/21)
Transforms Phase D's VM-form C output (R[] + STACK[] + mem_r/mem_w dispatcher) into structured-C emit and applies stack-slot promotion per function (rbp ± offset memory accesses are lifted into named C locals). The C scoping rules eliminate cross-function frame collisions automatically. Win64 ABI arg spills ([rbp+0x10], [rbp+0x18], etc.) are also recovered as locals, bringing the output one step closer to natural C.
All 21 PE samples have their lifted C output rebuilt with mingw-gcc and confirmed to match the original .exe's stdout byte-for-byte under WSL2 binfmt.
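The naming side of stack-slot promotion can be illustrated with a small Python sketch. It assumes the conventional push rbp; mov rbp, rsp prologue, under which [rbp+0x10] through [rbp+0x28] are the home slots for rcx, rdx, r8 and r9 and negative displacements are locals. The local_/arg_ naming scheme and the function name are hypothetical, not the names SlimePE-rev emits.

```python
# Home (spill) slots for the four register arguments, relative to rbp,
# assuming the frame was set up with: push rbp; mov rbp, rsp
HOME_SLOTS = {0x10: "rcx", 0x18: "rdx", 0x20: "r8", 0x28: "r9"}


def promote_slots(displacements):
    """Map each rbp-relative displacement seen in one function to a C local name.

    Displacements that are neither locals nor register home slots
    (saved rbp, return address, stack-passed arguments) are skipped here.
    """
    names = {}
    for disp in sorted(set(displacements)):
        if disp < 0:
            names[disp] = f"local_{-disp:x}"
        elif disp in HOME_SLOTS:
            names[disp] = f"arg_{HOME_SLOTS[disp]}"
    return names


# Accesses observed in one hypothetical function body
slots = promote_slots([-0x8, -0x4, 0x10, -0x8, 0x18])
```

Because the map is built per function, two functions that both touch [rbp-0x8] each get their own local, which is how C scoping removes cross-function frame collisions.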
Sample inventory (21 PE32+ .exe binaries)
C sources functionally equivalent to the Linux ELF 17 sample subset plus four libc-linked Windows specifics (msvcrt!puts / lstrlenA + puts / multi-WinAPI / msvcrt!printf), built with x86_64-w64-mingw32-gcc -nostdlib -e _start into PE32+ .exe and bench-tested under the same 8 axes. The IAT-indirect call (mov rax, qword [rip+iat]; call rax) is folded by a peephole into call_iat <FUNC> and lowered to direct kernel32 / msvcrt calls via the Win64 ABI. stdout verified by direct execution under WSL2 binfmt.
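The peephole described above can be sketched on a toy instruction list. The tuple-based instruction encoding ("mov_iat", "call_reg", "call_iat") and the function name are assumptions for illustration only; the IAT entry reuses the 0x140003070 → msvcrt.dll!puts example from the import-resolution description.

```python
def fold_iat_calls(insns, iat):
    """Fold `mov reg, [rip+iat]; call reg` pairs into one synthetic call_iat.

    insns: list of tuples, e.g. ("mov_iat", "rax", 0x140003070) or ("call_reg", "rax")
    iat:   dict mapping IAT slot VA -> "dll!function"
    """
    out, i = [], 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (cur[0] == "mov_iat" and nxt is not None
                and nxt[0] == "call_reg" and nxt[1] == cur[1]
                and cur[2] in iat):
            out.append(("call_iat", iat[cur[2]]))
            i += 2  # consume both instructions of the pair
        else:
            out.append(cur)
            i += 1
    return out


iat = {0x140003070: "msvcrt.dll!puts"}
stream = [
    ("lea", "rcx", "msg"),
    ("mov_iat", "rax", 0x140003070),
    ("call_reg", "rax"),
    ("ret",),
]
folded = fold_iat_calls(stream, iat)
```

The fold only fires when the call uses the same register the IAT load wrote and the slot is known, so an unrelated call reg or an unresolved slot passes through unchanged.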
| Sample | Output and what it exercises |
|---|---|
| 01 hello (PE) | GetStdHandle + WriteFile + ExitProcess → Hello, PE!. Minimal IAT of 3 functions via Win64 ABI. |
| 02 arith (PE) | 17+25 = 42 written as 2 ASCII digits via WriteFile → SUM=42. cqo + idiv rsi for true signed division. |
| 03 loop (PE) | For-loop computing 1+2+3+4+5 = 15 → SUM=15. cmp DWORD PTR + jle loop body + WriteFile. |
| 04 branch (PE) | if (x >= 5) branch with x=7 → big. Both arms WriteFile a distinct string and converge on a common join. |
| 05 compute (PE) | 6 × 7 = 42 → PROD=42. 2-op imul rax, rbx + idiv. |
| 06 call_simple (PE) | _start → do_print(handle). Win64 rcx passing + nested WriteFile. |
| 07 two_funcs (PE) | add_two() + print_dec(handle, val). SlotImage carries 3 function nodes and 2 inter-procedural edges. |
| 08 recursion (PE) | factorial(4) = 24 via self-recursion → FACT=24. Call graph carries a fact → fact self-loop; callee-saved state preserved via Win64 shadow space at [rsp+0x20]. |
| 09 array_sum (PE) | arr[5] = {3,5,7,9,11} sum → SUM=35. SIB byte mov rax, [rax*8+0x140002000] for indexed access. |
| 10 strlen (PE) | Hand-written strlen on "Hello, World!\n" → LEN=14. movzx eax, BYTE PTR [rax] + test al, al null-terminator loop. |
| 11 signed_array (PE) | Signed-char array {-3, 5, -8, 12, 4} sum → SUM=10. movsx rax, al + signed arithmetic. |
| 12 int_index (PE) | int i array loop + 3-digit print → SUM=150. cdqe + 32-bit op variants + dual divmod (100 / 10). |
| 13 bitshift (PE) | 32 << 3 = 256, >> 1 = 128 → VAL=128. shl/sar r32 (32-bit op variants included). |
| 14 bitmask (PE) | 0xFF12 & 0xFF = 18 → RES=18. and r/m64, imm8 bit-masking. |
| 15 stride (PE) | Stride access arr[i*3] sum → STR=35. add + add expansion for i*3 + SIB load. |
| 16 matrix (PE) | 3×4 matrix nested-loop sum → MAT=78. 2D array + nested CFG; lea rdx, [rax*4+0x0] (SIB-form lea) for row offset. |
| 17 struct (PE) | Array-of-struct pts[3] = {{10,20},{30,40},{50,60}} field sum → PT=210. SIB + displacement field offsets. |
| 18 msvcrt_puts (PE) | First libc-linked sample. msvcrt!puts("Hello, msvcrt!") + kernel32!ExitProcess → Hello, msvcrt!. Demonstrates scalability to DLL imports beyond kernel32. |
| 19 lstrlenA (PE) | kernel32!lstrlenA("Hello, libc!") measures length 12, then msvcrt!puts prints as two digits → LEN=12. Two DLLs (kernel32 + msvcrt) imported from a single binary. |
| 20 winapi_multi (PE) | Multi-WinAPI integration — GetModuleHandleA + CreateFileA + CloseHandle + WriteFile. Exercises the WinAPI recipe table at scale, with file-handle lifecycle preserved through reverse. |
| 21 printf (PE) | msvcrt!printf("Hello, %s!\n", "printf") 1-arg form — verifies the local-thunk routed call (MinGW optimises printf → puts via a local thunk; trim must preserve the target). |
Function = Slot node, call graph as first-class IR (Phase B (d))
Each function becomes a SlotFunction node; call edges are first-class IR (a list of callee names per function). Self-recursion is naturally a self-loop edge. The full SlotImage encodes/decodes via deterministic JSON (Axis 7 round-trip), so call graphs and function structure can flow into external toolchains (audit DBs, SBOM, static analysis) without information loss. The Slot IR schema is unified with SlimeELF-rev, enabling cross-OS audit pipelines.
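A deterministic JSON round-trip of this kind can be sketched in Python. The schema below (a "functions" list with "name" and "callees" fields) is a hypothetical stand-in, since the real Slot IR schema is not shown here; the function and edge counts mirror the 08_recursion sample (2 functions, 2 calls, one self-loop).

```python
import json


def encode_slot_image(image: dict) -> bytes:
    """Canonical encoding: sorted keys and fixed separators give stable bytes."""
    text = json.dumps(image, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return text.encode("ascii")


def decode_slot_image(blob: bytes) -> dict:
    return json.loads(blob.decode("ascii"))


def call_edges(image: dict):
    """Flatten per-function callee lists into (caller, callee) edges."""
    return sorted((f["name"], c) for f in image["functions"] for c in f["callees"])


# Hypothetical image shaped like 08_recursion: _start -> fact, fact -> fact
image = {
    "functions": [
        {"name": "_start", "callees": ["fact"]},
        {"name": "fact", "callees": ["fact"]},
    ]
}
blob = encode_slot_image(image)
```

Sorting keys and fixing separators means two semantically equal images encode to identical bytes, so a sha256 over the blob can serve as the audit fingerprint.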
Audit fitness (finance / defense / medical-device / OT)
- Bit-exact: Same PE input → same sha256 NASM/C output. CFG, function boundaries and instruction stream are all fully deterministic.
- Native .exe round-trip: Emitted NASM is re-assembled via nasm -fwin64 + mingw-link; emitted C is compiled via mingw-gcc -nostdlib; the two resulting native .exe files are executed (under WSL2 binfmt) and their stdout is compared with the original. This is real-machine verification, not simulation.
- Mutation detection: A 1-bit flip in .text changed the disassembly in every trial: 21 × 5 = 105/105 detected. Tampering is immediately visible.
- Determinism: Same .exe disassembled + emitted twice → byte-equal per sample. Stable across parallel and GPU execution.
- Slot IR audit: Function = Slot node + call graph persisted as deterministic JSON. Joins SBOM / audit DB pipelines as a structured artifact.
- Build-time LLM: An LLM is used only at decoder-rule construction time. Runtime is deterministic and rule-based, aligned with finance / defense / OT audit requirements.
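The shape of the mutation-detection and determinism checks can be sketched as a small harness. The listing function below is only a stand-in that prints each byte as db 0xNN; it is not the SlimePE-rev decoder, so this sketch shows how trials are counted and compared by sha256, not how the real disassembler reacts to a flipped bit. All function names are hypothetical.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def listing(text_section: bytes) -> str:
    """Stand-in for the disassembler: one `db 0xNN` line per byte."""
    return "\n".join(f"db 0x{b:02X}" for b in text_section)


def digest(text_section: bytes) -> str:
    return hashlib.sha256(listing(text_section).encode("ascii")).hexdigest()


def mutation_trials(text_section: bytes, bit_indices) -> int:
    """Count trials in which a 1-bit flip changed the listing digest."""
    base = digest(text_section)
    return sum(digest(flip_bit(text_section, i)) != base for i in bit_indices)


code = bytes.fromhex("4831c0c3")  # xor rax, rax; ret
detected = mutation_trials(code, [0, 7, 13, 21, 31])
```

Running the same digest twice on unchanged input is the determinism check; running it on flipped input is the mutation check. Five trials per sample over 21 samples gives the 105 trials cited above.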
Supported instructions (shared with SlimeELF-rev, ~37 patterns)
| Category | Instructions and encodings |
|---|---|
| Data movement | mov reg/mem, imm/reg (B8+r / 89 /r / 8B /r / C7 /0 / 88 /r, 64-bit and 32-bit) / movzx / movsx / lea r64, m (8D /r) / nop / leave |
| Arithmetic | add / sub r/m, r / imul r, r/m (REX.W 0F AF) / idiv r/m (F7 /7) / cqo / cdqe / cdq / movsxd r64, r/m32 |
| Logic | and / or / xor r/m64, r64 / xor reg, reg idiom recognised as zero-init |
| Bit shifts | shl / shr / sar r/m, imm8 (C1 /N) / shl / shr / sar r/m, 1 (D1 /N) |
| Compare / test | cmp r/m64, r64 / cmp r/m, imm8 / test r/m, r |
| Branch | Jcc rel8/32 / jmp rel8/32 / loop rel8 (E2) |
| Call / stack | call rel32 (E8) / ret (C3) / push/pop r64 / push imm |
| System (PE) | IAT-indirect call (mov rax, qword [rip+iat]; call rax) folded by peephole into call_iat <FUNC>, then lowered to a direct WinAPI call via Win64 ABI (rcx / rdx / r8 / r9 + [rsp+0x20]). (ELF uses syscall (0F 05) heuristic; see SlimeELF-rev.) |
| Memory operands | [reg] / [reg+disp8/32] / [rip+disp32] / [base+index*scale+disp] (SIB byte, scale=1/2/4/8) — covers [rbp-disp] local-variable access, [rax*8+disp] array indexing, and Win64 shadow space [rsp+0x20]. |
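The memory-operand forms in the last row all reduce to one effective-address formula, base + index × scale + disp, truncated to 64 bits. The Python sketch below illustrates that arithmetic; the register-file dict and the function name are assumptions for illustration, and rip-relative addressing is left out.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def effective_address(regs, base=None, index=None, scale=1, disp=0):
    """Compute base + index*scale + disp with 64-bit wraparound."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("SIB scale must be 1, 2, 4 or 8")
    if index == "rsp":
        raise ValueError("rsp cannot be used as an index register")
    addr = disp
    if base is not None:
        addr += regs[base]
    if index is not None:
        addr += regs[index] * scale
    return addr & MASK64


regs = {"rax": 3, "rbp": 0x1000, "rsp": 0x7000}
array_slot = effective_address(regs, index="rax", scale=8, disp=0x140002000)  # [rax*8+disp]
local_slot = effective_address(regs, base="rbp", disp=-8)                     # [rbp-8]
fifth_arg = effective_address(regs, base="rsp", disp=0x20)                    # [rsp+0x20]
```

With rax = 3 the array form lands on the fourth 8-byte element, which is the pattern the 09 array_sum sample exercises.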
Next-phase additions: printf-N / malloc / SSE2 / SSE4 (XMM + floating-point), 3-op imul r64, r/m64, imm32, movabs r64, imm64, plus Phase E v3 (loop / natural-cond recovery) and Phase F PSDP (auto OpenMP) carry-over from the ELF side.
License model
| Item | Terms |
|---|---|
| Charged | WASM/WASI converter tool (developer side) |
| Not charged | The produced NASM / C sources (customer asset, perpetual deployment) |
| Method | Ed25519 144B signed license + 3-hop air-gap activation (finance / defense / OT audit ready) |
| Parallelization (PSDP) | Not included. See the independent PSDP SKU under SlimeNENC. |
Related materials
- Sister (Linux reverse): SlimeELF-rev, Linux ELF x86_64 reverse, same Slot IR + shared decoder.
- Sister (forward): SlimeASM, HLASM + Win x64 MASM forward transpiler.
- Reverse family overview: SlimeASM-rev landing, the umbrella reverse-family page.
- Slot IR shared family: SlimeCOBOL / SlimePL/I / SlimeRPG / SlimeMUMPS share the Slot IR (Core64 + Ext32 fixed-bit).
- Patent application: JP application 2026-046620 v15b, claims 11 / 14d.
