SlimePE-rev — Windows PE32+ x86_64 binary → ASM + C reverse transpiler
Recover NASM and C sources from a vendor-lost Windows .exe, bit-exact.
Convert source-lost Windows x86_64 PE32+ binaries from Microsoft legacy estates,
industrial control systems and vendor-lost .exe / .dll into both NASM intel source
and C source, bit-exact. The emitted NASM is re-assembled with nasm -fwin64 + mingw-link; the emitted C is
compiled with x86_64-w64-mingw32-gcc -nostdlib -e _start; both produce real native
.exe binaries that, when executed under WSL2 binfmt, reproduce the original binary's stdout byte-for-byte.
- Phase A (shared with SlimeELF-rev): x86_64 instruction decoder (~37 opcodes, integer hot-loop subset) + NASM intel emitter + straight-line C emitter
- Phase B entry: CFG recovery (Cooper-Harvey-Kennedy iterative dominator + Aho/Sethi natural loop body) + structured C (do { ... } while (R[1] != 0); + if/else diamond)
- Phase B (b): function-boundary recovery (prologue/epilogue + call/ret + self-recursion)
- Phase B (d): inter-procedural Slot IR, unified schema with SlimeELF-rev, deterministic JSON round-trip
- Phase D, PE32+ specific layer: PECOFF v8.3 parser + DLL import resolution (IAT slot → function name) + Win64 ABI (rcx / rdx / r8 / r9 + shadow space [rsp+0x20]) + WinAPI recipe table (9 functions across kernel32 + msvcrt)
- S9 bench: PE 8 axes 168/168 + Phase E lift v2 21/21 = 189/189 PASS (2026-05-19), 21 samples (msvcrt + kernel32 unified, now including printf/GetModuleHandleA/CreateFileA/CloseHandle); libc-linked PE binaries and MinGW local-thunk routed IAT calls all green across ASM and C round-trips
- Phase E lift v2: stack-slot promotion per function + Win64 arg spill recovery ([rbp+0x10], [rbp+0x18]); 21/21 lifted-C .exe match original stdout byte-for-byte
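To make the Phase B entry step more concrete, here is a minimal Python sketch of the Cooper-Harvey-Kennedy iterative dominator computation followed by natural-loop body collection from back edges. It is an illustration under assumptions, not SlimePE-rev's actual code: the CFG is a plain dict of block name → successor list, and the block names (`entry`, `head`, `body`, `exit`) are hypothetical.

```python
def postorder(succ, entry):
    """Return the blocks reachable from entry in DFS postorder."""
    seen, order = set(), []

    def visit(n):
        seen.add(n)
        for s in succ.get(n, []):
            if s not in seen:
                visit(s)
        order.append(n)

    visit(entry)
    return order


def immediate_dominators(succ, entry):
    """Cooper-Harvey-Kennedy: iterate idom[] to a fixed point in reverse postorder."""
    order = postorder(succ, entry)
    index = {n: i for i, n in enumerate(order)}  # postorder number per block
    preds = {n: [] for n in order}
    for n in order:
        for s in succ.get(n, []):
            preds[s].append(n)
    idom = {entry: entry}

    def intersect(a, b):
        # Walk both fingers up the dominator tree until they meet.
        while a != b:
            while index[a] < index[b]:
                a = idom[a]
            while index[b] < index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for n in reversed(order):
            if n == entry:
                continue
            done = [p for p in preds[n] if p in idom]
            new = done[0]
            for p in done[1:]:
                new = intersect(new, p)
            if idom.get(n) != new:
                idom[n] = new
                changed = True
    return idom


def dominates(idom, a, b):
    """True if a dominates b (follow b's idom chain up to the entry)."""
    while True:
        if a == b:
            return True
        if idom[b] == b:
            return False
        b = idom[b]


def natural_loops(succ, entry):
    """Map each loop header to its natural loop body (set of blocks)."""
    idom = immediate_dominators(succ, entry)
    preds = {}
    for n in idom:
        for s in succ.get(n, []):
            preds.setdefault(s, []).append(n)
    loops = {}
    for tail in idom:
        for head in succ.get(tail, []):
            if not dominates(idom, head, tail):
                continue  # not a back edge
            body, work = {head, tail}, [tail]
            while work:
                n = work.pop()
                if n == head:
                    continue
                for p in preds.get(n, []):
                    if p not in body:
                        body.add(p)
                        work.append(p)
            loops.setdefault(head, set()).update(body)
    return loops
```

A loop whose single back edge leaves the bottom of the body is the shape that maps onto the do { ... } while (...) form mentioned above; a header with two successors that rejoin at a common block corresponds to the if/else diamond.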
A reverse transpiler for “vendor-lost Windows .exe”, built on deterministic translation + 8-axis round-trip auto-regression + audit chain.
Paired with sister product SlimeELF-rev for Linux ELF (same Slot IR, shared decoder) and with the forward transpiler SlimeASM (HLASM + MASM).
Phase E v3 (loop / natural-cond recovery) and Phase F PSDP (auto OpenMP parallelisation) are operational on the ELF side first; PE32+ extension is on the roadmap. The instruction decoder is shared, so the lift work carries over once MinGW's gcc -O0 CFG patterns are absorbed.
A SlimeNENC-family tool that moves legacy code into a modern language without changing its meaning. A hand rewrite quietly drifts and causes incidents; SlimeNENC doesn't interpret meaning — it copies only the "skeleton" (structure), so the computed results stay identical to the original. It proves the behaviour first, so migration anxiety disappears. It copies only what it can, honestly, and isolates what it can't.
Hand rewrites drift on subtle numeric/exception differences (boundary conditions), and verifying that (old-vs-new testing) costs enormous labour. SlimeNENC faithfully mirrors language-specific traps, proves "zero divergence" via differential testing, backed by an independent reference implementation. The deliverable is a machine-checkable "certificate of behavioural invariance," not human UAT. No overstating; what it can't do isn't hidden.
Source is projected onto a Slot IR (language-independent structural intermediate form) and transcribed structure-preserving into the target. The statically-determined core is made bit-exact; dynamic, state-dependent parts are honestly isolated (isolate, don't confabulate). Backed by differential fuzzing plus formal methods where applicable, with deterministic verification a third party can reproduce locally.
Projection (π) of source as unique structure, not semantics. Primitives are modelled rigorously in bit-vector theory and formally verified over all inputs; composition/loops are covered by Csmith-style differential fuzzing — a two-tier guarantee. Non-reproducible computation (float/parallel/AI) goes to tier-③: meaning-equivalence + convergence + residual. "Where there is form, prove it in bytes; where there is none, in meaning. Lie on neither face."
Key measurements (2026-05-19)
- S9 bench, 21 samples: msvcrt + kernel32 unified, printf / CreateFileA / lstrlenA included
- Phase E lift v2, 21/21: true locals + Win64 arg spill recovered
- WinAPI recipe table, 9 functions: WriteFile / GetStdHandle / ExitProcess / lstrlenA / GetModuleHandleA / CreateFileA / CloseHandle / puts / printf
- Slot IR: shared with SlimeELF-rev, call graph as first-class edge
- Decoder, ~37 opcodes: SIB / movzx / movsx / cdqe / movsxd / shl-shr-sar / and-or, shared with SlimeELF-rev
- Multi-DLL import: kernel32 + msvcrt imported together (sample 19 lstrlenA + puts)
Market context — where source-lost Windows binaries live
| Segment | Context |
|---|---|
| Vendor-lost .exe | Vendor went out of business or refuses to maintain, so only the .exe / .dll remain. Source recovery is required for SBOM, CVE patching, and audit. |
| Microsoft legacy estates | Long-running Windows-based business apps with no surviving source repository. C-source recovery enables a cleanroom rewrite plan. |
| Industrial control / OT | Closed Windows-based PLC HMIs, instrumentation daemons, factory MES adapters frozen for 10-30 years. |
| Compliance documentation | FDA / PMDA / IEC 62443 OT-security obligate “complete software description” even when only the binary is on hand. Binary-only components must be lifted to C as auditor-reproducible documentation. |
| Competitive landscape | Ghidra / IDA Pro / Hex-Rays / RetDec already exist. SlimePE-rev differentiates on three axes: (1) determinism + 8-axis round-trip auto-regression proves “lossless”; (2) single unified Slot IR shared with SlimeELF-rev enables cross-OS audit pipelines; (3) decompile output compiles with mingw-gcc and runs as a real .exe whose stdout matches the original — Hex-Rays produces static C but does not guarantee build + run round-trip. |
S9 bench — all 8 axes: PE 168/168 PASS (bit-level)
The same 8-axis S9 bench harness used for ELF, applied to PE32+ at bit-level. Decoder (~37 opcodes) and Slot IR schema are shared with SlimeELF-rev; only the container layer (PE parser, IAT resolution, Win64 ABI) is PE-specific.
| Axis | Check and result |
|---|---|
| Axis 1a PE dialect-detect | DOS magic “MZ” + PE signature “PE\0\0” + COFF Machine=0x8664 (IMAGE_FILE_MACHINE_AMD64) + Optional Header Magic=0x20B (PE32+) + Subsystem=3 (CONSOLE) validated at bit-level. 21/21 PASS. |
| Axis 1b opcode-recover | Every instruction within .text (VirtualSize-limited live region) decoded — db 0xNN fallback count = 0. 21/21 PASS. |
| Axis 2 mutation-detect | 1-bit flip in .text, 5 trials × 21 samples = 105 trials, 105/105 detected. |
| Axis 3 determinism | Same .exe disassembled twice → byte-equal across all 21 samples. 21/21 PASS. |
| Axis 4 ASM round-trip | emit NASM → nasm -fwin64 → mingw-link (-lmsvcrt -lkernel32) → real .exe → WSL2 binfmt execute → stdout matches original. IAT references are emitted as extern __imp_FUNC so the linker regenerates the PE import table. 21/21 PASS. |
| Axis 5 C round-trip | emit C → mingw-gcc -nostdlib -e _start → real .exe → stdout matches original. The IAT-indirect call mov reg, qword [rip+iat]; call reg is collapsed by a peephole into a synthetic call_iat <FUNC> and expanded to a direct kernel32 / msvcrt call with Win64 ABI arguments pulled from rcx/rdx/r8/r9 / [rsp+0x20]. 21/21 PASS. |
| Axis 6 structured-C round-trip | CFG-recovered structured C (do-while + if/else + per-function) → mingw → run → match. 21/21 PASS, including recursive fact(4) = 24. |
| Axis 7 Slot IR round-trip | The PE32+ binary is lifted into the same Slot IR schema used for ELF. SlotFunctions, call graph edges and structural equality all round-trip. 21/21 PASS, with 06_call_simple (2fn/1call) / 07_two_funcs (3fn/2call) / 08_recursion (2fn/2call, self-loop) reconstructing call graphs identical to their Linux counterparts. |
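The Axis 1a checks can be sketched in a few lines of Python. This is a hedged illustration, not the bench harness itself: the function names are hypothetical, and the field offsets are the standard PE/COFF ones (e_lfanew at 0x3C, COFF header directly after the 4-byte signature, Subsystem at offset 68 of the optional header).

```python
import struct

IMAGE_FILE_MACHINE_AMD64 = 0x8664
PE32PLUS_MAGIC = 0x20B
SUBSYSTEM_CONSOLE = 3


def detect_pe32plus_console(image: bytes) -> bool:
    """Return True only if every Axis 1a field carries the expected value."""
    if len(image) < 0x40 or image[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    opt = e_lfanew + 4 + 20  # optional header follows signature + COFF header
    if len(image) < opt + 70:
        return False
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return False
    (machine,) = struct.unpack_from("<H", image, e_lfanew + 4)
    (magic,) = struct.unpack_from("<H", image, opt)
    (subsystem,) = struct.unpack_from("<H", image, opt + 68)
    return (machine, magic, subsystem) == (
        IMAGE_FILE_MACHINE_AMD64, PE32PLUS_MAGIC, SUBSYSTEM_CONSOLE)


def make_header(machine=0x8664, magic=0x20B, subsystem=3) -> bytes:
    """Build a synthetic header image (demonstration only, not a loadable PE)."""
    img = bytearray(0x200)
    img[0:2] = b"MZ"
    struct.pack_into("<I", img, 0x3C, 0x80)        # e_lfanew
    img[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", img, 0x84, machine)     # COFF Machine
    struct.pack_into("<H", img, 0x98, magic)       # Optional Header Magic
    struct.pack_into("<H", img, 0x98 + 68, subsystem)
    return bytes(img)
```

Calling detect_pe32plus_console(make_header()) accepts the synthetic image, while changing any single field (for example magic=0x10B, the PE32 value) rejects it.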
PE32+ specific layer (Phase D)
- PECOFF v8.3 parser: DOS header (64B, magic MZ + e_lfanew @ 0x3C) → PE signature (4B) → COFF header (20B, Machine=0x8664 required) → Optional Header (PE32+ 240B, Magic=0x20B required) → Section Table, parsed at bit-level. VirtualSize bounds the live .text region.
- DLL import resolution: the DataDirectory[1] Import Table is walked (_IMAGE_IMPORT_DESCRIPTOR array + INT/IAT thunk array (8B, ordinal flag = bit 63) + Hint/Name table). Each IAT slot VA resolves to DLL!function, e.g. 0x140003070 → msvcrt.dll!puts.
- IAT-call peephole: the two-instruction pair mov reg, qword [rip+iat]; call reg that gcc -O0 + MinGW emits is folded into a single synthetic call_iat <FUNC> and lowered to a direct C call via Win64 ABI.
- Win64 ABI recipe: argument slots are rcx (R[1]) / rdx (R[2]) / r8 (R[8]) / r9 (R[9]) / 5th at mem_r(R[4]+0x20, 8). String arguments (LPCSTR) are reconstructed by reading the virtual VA byte-by-byte via mem_r into a local buffer before passing to the real kernel.
- WinAPI recipe table (extensible): currently 9 functions. kernel32: GetStdHandle / WriteFile / ExitProcess / lstrlenA / GetModuleHandleA / CreateFileA / CloseHandle; msvcrt: puts / printf (1-arg form). Adding a new API is a 3-step procedure (PROLOGUE dllimport declaration + recipe table entry + link command -l<dll>); the decoder, CFG and Slot IR layers stay untouched.
- Trailing thunk trim (call targets preserved): the jmp qword [rip+iat] thunk table at the .text tail is excluded from the SlotImage (no spurious functions). Local thunks reachable from call rel32 are preserved (e.g. MinGW printf → puts optimisation routes via a local <puts> thunk; dropping it would leave the call target undefined).
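As a rough illustration of the import-resolution and peephole steps above, the following Python sketch classifies an 8-byte PE32+ thunk entry and folds the mov/call pair into a synthetic call_iat. The instruction tuples (`mov_rip_load`, `call_reg`) and the function names are hypothetical stand-ins for whatever the real decoder emits; only the bit layout (ordinal flag at bit 63, ordinal in the low 16 bits, Hint/Name RVA in the low 31 bits) comes from the PE/COFF format.

```python
ORDINAL_FLAG = 1 << 63  # PE32+ thunk entries are 64-bit


def classify_thunk(value):
    """Split an INT/IAT thunk entry into an ordinal or a Hint/Name RVA."""
    if value & ORDINAL_FLAG:
        return ("ordinal", value & 0xFFFF)
    return ("name_rva", value & 0x7FFFFFFF)


def fold_iat_calls(insns, iat):
    """Fold `mov reg, qword [rip+iat]; call reg` into one call_iat node.

    insns: list of tuples, e.g. ("mov_rip_load", "rax", 0x140003070)
           where the third field is the already-resolved absolute VA.
    iat:   dict mapping IAT slot VA -> "dll!function".
    """
    out, i = [], 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        if (cur[0] == "mov_rip_load" and cur[2] in iat
                and nxt is not None
                and nxt[0] == "call_reg" and nxt[1] == cur[1]):
            out.append(("call_iat", iat[cur[2]]))
            i += 2  # both instructions are consumed by the fold
        else:
            out.append(cur)
            i += 1
    return out
```

The fold only fires when the loaded slot is a known IAT entry and the call goes through the same register, so an unrelated rip-relative load followed by an indirect call is left untouched.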
Phase E lift v2 — true-local-variable + Win64 arg spill recovery (21/21)
Transforms Phase D's VM-form C output (R[] + STACK[] + mem_r/mem_w dispatcher) into structured-C emit and applies stack-slot promotion per function (rbp ± offset memory accesses are lifted into named C locals). The C scoping rules eliminate cross-function frame collisions automatically. Win64 ABI arg spills ([rbp+0x10], [rbp+0x18], etc.) are also recovered as locals, bringing the output one step closer to natural C.
All 21 PE samples have their lifted C output rebuilt with mingw-gcc and confirmed to match the original .exe's stdout byte-for-byte under WSL2 binfmt.
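A minimal Python sketch of the stack-slot promotion idea, under stated assumptions: each function is represented only by the list of rbp-relative displacements it touches, the frame uses the usual push rbp / mov rbp, rsp prologue (so [rbp+0x10] is the home slot of the first register argument), and the generated names (`arg0`, `local_8`) are hypothetical. The real lift also has to track access sizes and types, which this sketch ignores.

```python
def promote_slots(displacements):
    """Map rbp-relative displacements of one function to C-style names."""
    names = {}
    for disp in sorted(set(displacements)):
        if disp >= 0x10:
            # Win64 home slots: rcx -> [rbp+0x10], rdx -> [rbp+0x18], ...
            names[disp] = f"arg{(disp - 0x10) // 8}"
        elif disp < 0:
            names[disp] = f"local_{-disp:x}"
        # [rbp+0] and [rbp+8] hold the saved rbp and return address,
        # so they are never promoted.
    return names


def promote_per_function(frames):
    """Promote each function separately so equal offsets never collide."""
    return {fn: promote_slots(disps) for fn, disps in frames.items()}
```

Because the mapping is computed per function, [rbp-0x8] in one function and [rbp-0x8] in another become two unrelated locals, which is the frame-collision property the paragraph above attributes to C scoping.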
Sample inventory (21 PE32+ .exe binaries)
The samples are C sources functionally equivalent to the 17-sample Linux ELF subset, plus four Windows-specific additions (msvcrt!puts / lstrlenA + puts / multi-WinAPI / msvcrt!printf). They are built with x86_64-w64-mingw32-gcc -nostdlib -e _start into PE32+ .exe and bench-tested under the same 8 axes. The IAT-indirect call (mov rax, qword [rip+iat]; call rax) is folded by a peephole into call_iat <FUNC> and lowered to direct kernel32 / msvcrt calls via the Win64 ABI. stdout is verified by direct execution under WSL2 binfmt.
| Sample | Description |
|---|---|
| 01 hello (PE) | GetStdHandle + WriteFile + ExitProcess → Hello, PE!. Minimal IAT of 3 functions via Win64 ABI. |
| 02 arith (PE) | 17+25 = 42 written as 2 ASCII digits via WriteFile → SUM=42. cqo + idiv rsi for true signed division. |
| 03 loop (PE) | For-loop computing 1+2+3+4+5 = 15 → SUM=15. cmp DWORD PTR + jle loop body + WriteFile. |
| 04 branch (PE) | if (x >= 5) branch with x=7 → big. Both arms WriteFile a distinct string and converge on a common join. |
| 05 compute (PE) | 6 × 7 = 42 → PROD=42. 2-op imul rax, rbx + idiv. |
| 06 call_simple (PE) | _start → do_print(handle). Win64 rcx passing + nested WriteFile. |
| 07 two_funcs (PE) | add_two() + print_dec(handle, val). SlotImage carries 3 function nodes and 2 inter-procedural edges. |
| 08 recursion (PE) | factorial(4) = 24 via self-recursion → FACT=24. Call graph carries a fact → fact self-loop; callee-saved state preserved via Win64 shadow space at [rsp+0x20]. |
| 09 array_sum (PE) | arr[5] = {3,5,7,9,11} sum → SUM=35. SIB byte mov rax, [rax*8+0x140002000] for indexed access. |
| 10 strlen (PE) | Hand-written strlen on "Hello, World!\n" → LEN=14. movzx eax, BYTE PTR [rax] + test al, al null-terminator loop. |
| 11 signed_array (PE) | Signed-char array {-3, 5, -8, 12, 4} sum → SUM=10. movsx rax, al + signed arithmetic. |
| 12 int_index (PE) | int i array loop + 3-digit print → SUM=150. cdqe + 32-bit op variants + dual divmod (100 / 10). |
| 13 bitshift (PE) | 32 << 3 = 256, >> 1 = 128 → VAL=128. shl/sar r32 (32-bit op variants included). |
| 14 bitmask (PE) | 0xFF12 & 0xFF = 18 → RES=18. and r/m64, imm8 bit-masking. |
| 15 stride (PE) | Stride access arr[i*3] sum → STR=35. add + add expansion for i*3 + SIB load. |
| 16 matrix (PE) | 3×4 matrix nested-loop sum → MAT=78. 2D array + nested CFG; lea rdx, [rax*4+0x0] (SIB-form lea) for row offset. |
| 17 struct (PE) | Array-of-struct pts[3] = {{10,20},{30,40},{50,60}} field sum → PT=210. SIB + displacement field offsets. |
| 18 msvcrt_puts (PE) | First libc-linked sample. msvcrt!puts("Hello, msvcrt!") + kernel32!ExitProcess → Hello, msvcrt!. Demonstrates scalability to DLL imports beyond kernel32. |
| 19 lstrlenA (PE) | kernel32!lstrlenA("Hello, libc!") measures length 12, then msvcrt!puts prints as two digits → LEN=12. Two DLLs (kernel32 + msvcrt) imported from a single binary. |
| 20 winapi_multi (PE) | Multi-WinAPI integration — GetModuleHandleA + CreateFileA + CloseHandle + WriteFile. Exercises the WinAPI recipe table at scale, with file-handle lifecycle preserved through reverse. |
| 21 printf (PE) | msvcrt!printf("Hello, %s!\n", "printf") 1-arg form — verifies the local-thunk routed call (MinGW optimises printf → puts via a local thunk; trim must preserve the target). |
Function = Slot node, call graph as first-class IR (Phase B (d))
Each function becomes a SlotFunction node; call edges are first-class IR (a list of callee names per function). Self-recursion is naturally a self-loop edge. The full SlotImage encodes/decodes via deterministic JSON (Axis 7 round-trip), so call graphs and function structure can flow into external toolchains (audit DBs, SBOM, static analysis) without information loss. The Slot IR schema is unified with SlimeELF-rev, enabling cross-OS audit pipelines.
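The deterministic JSON round-trip can be pictured with a small Python sketch. The schema below (a `functions` list whose entries carry `name` and `calls`) and the function names `_start` / `fact` are assumptions made for illustration; the actual Slot IR schema is not given on this page. Determinism here comes from sorted keys and fixed separators, so the same image always serialises to the same bytes.

```python
import hashlib
import json


def encode_image(image) -> str:
    """Serialise with sorted keys and fixed separators for byte-stable output."""
    return json.dumps(image, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=True)


def decode_image(text: str):
    return json.loads(text)


def image_digest(image) -> str:
    """sha256 over the canonical encoding, usable as an audit fingerprint."""
    return hashlib.sha256(encode_image(image).encode("ascii")).hexdigest()


def call_edges(image):
    """Flatten per-function callee lists into sorted (caller, callee) edges."""
    return sorted((f["name"], callee)
                  for f in image["functions"]
                  for callee in f["calls"])


# Hypothetical image shaped like sample 08: 2 functions, 2 calls, one self-loop.
RECURSION_IMAGE = {
    "functions": [
        {"name": "_start", "calls": ["fact"]},
        {"name": "fact", "calls": ["fact"]},
    ]
}
```

With this encoding, decode_image(encode_image(x)) == x and encode_image(decode_image(s)) == s both hold for any image built from dicts, lists, strings and integers, which is the property an Axis 7 style check would assert.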
Audit fitness (finance / defense / medical-device / OT)
- Bit-exact: Same PE input → same sha256 NASM/C output. CFG / function boundaries / instruction stream all fully deterministic.
- Native .exe round-trip: Emitted NASM re-assembled via nasm -fwin64 + mingw-link; emitted C compiled via mingw-gcc -nostdlib; two real native .exe files executed (under WSL2 binfmt) and stdout compared with the original. Not simulation: real-machine verification.
- Mutation detection: 1-bit flip in .text always changes disasm. 21 × 5 = 105/105 detected, so tampering is immediately visible.
- Determinism: Same .exe disassembled + emitted twice → byte-equal per sample. Stable across parallel and GPU execution.
- Slot IR audit: Function = Slot node + call graph persisted as deterministic JSON. Joins SBOM / audit DB pipelines as a structured artifact.
- Build-time LLM: LLM only at decoder-rule construction time. Runtime is deterministic rule-based, aligned with finance / defense / OT audit requirements.
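The mutation-detection and determinism properties above can be checked with a harness along these lines. This Python sketch is an assumption-laden illustration: `toy_disassemble` is a stand-in that simply prints each byte, whereas the real check would call the actual disassembler, and the helper names are hypothetical.

```python
import hashlib
import random


def listing_digest(disassemble, code: bytes) -> str:
    """sha256 of the textual listing produced for a code region."""
    return hashlib.sha256(disassemble(code).encode()).hexdigest()


def flip_bit(code: bytes, bit_index: int) -> bytes:
    """Return a copy of code with exactly one bit inverted."""
    buf = bytearray(code)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def mutation_trials(disassemble, code: bytes, trials=5, seed=0) -> int:
    """Count how many random 1-bit flips change the listing digest."""
    rng = random.Random(seed)  # fixed seed keeps the trial set reproducible
    base = listing_digest(disassemble, code)
    detected = 0
    for _ in range(trials):
        mutated = flip_bit(code, rng.randrange(len(code) * 8))
        if listing_digest(disassemble, mutated) != base:
            detected += 1
    return detected


def is_deterministic(disassemble, code: bytes) -> bool:
    """Two runs over the same bytes must yield the same digest."""
    return listing_digest(disassemble, code) == listing_digest(disassemble, code)


def toy_disassemble(code: bytes) -> str:
    """Stand-in for a real disassembler: one line per byte."""
    return "\n".join(f"db 0x{b:02X}" for b in code)
```

Running mutation_trials with trials=5 over each of 21 samples gives the 105 trials quoted above; a sample passes when the returned count equals the number of trials.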
Supported instructions (shared with SlimeELF-rev, ~37 patterns)
| Category | Supported forms |
|---|---|
| Data movement | mov reg/mem, imm/reg (B8+r / 89 /r / 8B /r / C7 /0 / 88 /r, 64-bit and 32-bit) / movzx / movsx / lea r64, m (8D /r) / nop / leave |
| Arithmetic | add / sub r/m, r / imul r, r/m (REX.W 0F AF) / idiv r/m (F7 /7) / cqo / cdqe / cdq / movsxd r64, r/m32 |
| Logic | and / or / xor r/m64, r64 / xor reg, reg idiom recognised as zero-init |
| Bit shifts | shl / shr / sar r/m, imm8 (C1 /N) / shl / shr / sar r/m, 1 (D1 /N) |
| Compare / test | cmp r/m64, r64 / cmp r/m, imm8 / test r/m, r |
| Branch | Jcc rel8/32 / jmp rel8/32 / loop rel8 (E2) |
| Call / stack | call rel32 (E8) / ret (C3) / push/pop r64 / push imm |
| System (PE) | IAT-indirect call (mov rax, qword [rip+iat]; call rax) folded by peephole into call_iat <FUNC>, then lowered to a direct WinAPI call via Win64 ABI (rcx / rdx / r8 / r9 + [rsp+0x20]). (ELF uses syscall (0F 05) heuristic; see SlimeELF-rev.) |
| Memory operands | [reg] / [reg+disp8/32] / [rip+disp32] / [base+index*scale+disp] (SIB byte, scale=1/2/4/8) — covers [rbp-disp] local-variable access, [rax*8+disp] array indexing, and Win64 shadow space [rsp+0x20]. |
Next-phase additions: printf-N / malloc / SSE2 / SSE4 (XMM + floating-point); 3-op imul r64, r/m64, imm32; movabs r64, imm64; plus Phase E v3 (loop / natural-cond recovery) and Phase F PSDP (auto OpenMP) carry-over from the ELF side.
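For the SIB memory-operand forms in the table, a short Python sketch shows how the SIB byte splits into scale, index and base, and how the effective address follows. The helper names are hypothetical and the sketch covers only the SIB byte itself (with the REX.X / REX.B extension bits), not full ModRM decoding.

```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
MASK64 = (1 << 64) - 1


def decode_sib(mod: int, sib: int, rex: int = 0):
    """Return (base, index, scale) names for a SIB byte.

    mod is the ModRM.mod field; rex is the REX prefix byte (0 if absent).
    """
    scale = 1 << (sib >> 6)
    index = ((sib >> 3) & 7) | ((rex & 0x2) << 2)  # REX.X extends the index
    base = (sib & 7) | ((rex & 0x1) << 3)          # REX.B extends the base
    index_name = None if index == 4 else REGS[index]  # 100b means no index
    # base field 101b with mod=00 means "no base, disp32 follows"
    base_name = None if (sib & 7) == 5 and mod == 0 else REGS[base]
    return base_name, index_name, scale


def effective_address(regs, base, index, scale, disp):
    """Compute base + index*scale + disp, wrapped to 64 bits."""
    addr = disp
    if base is not None:
        addr += regs[base]
    if index is not None:
        addr += regs[index] * scale
    return addr & MASK64
```

As an example, the bytes 48 8B 44 CB 10 encode mov rax, [rbx+rcx*8+0x10]: ModRM 0x44 has mod=01 with rm=100 (SIB follows), and SIB 0xCB decodes to base rbx, index rcx, scale 8.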
License model
| Item | Terms |
|---|---|
| Charged | WASM/WASI converter tool (developer side) |
| Not charged | The produced NASM / C sources (customer asset, perpetual deployment) |
| Method | Ed25519 144B signed license + 3-hop air-gap activation (finance / defense / OT audit ready) |
| Parallelization (PSDP) | Not included. See the independent PSDP SKU under SlimeNENC. |
Related materials
- Sister (Linux reverse): SlimeELF-rev, Linux ELF x86_64 reverse, same Slot IR + shared decoder.
- Sister (forward): SlimeASM, HLASM + Win x64 MASM forward transpiler.
- Reverse family overview: SlimeASM-rev landing, the umbrella reverse-family page.
- Slot IR shared family: SlimeCOBOL / SlimePL/I / SlimeRPG / SlimeMUMPS share the Slot IR (Core64 + Ext32 fixed-bit).
- Patent application: JP application 2026-046620 v15b, claims 11 / 14d.
