SlimeELF-rev — Linux ELF x86_64 binary → ASM + C reverse transpiler
Recover NASM and C sources from a tag-less native Linux ELF, bit-exact.
Convert source-lost / vendor-lost Linux x86_64 native binaries from banking RHEL fleets,
defense embedded Linux, and medical-device daemons back into both NASM intel source and C source,
bit-exact. The emitted NASM is re-assembled with nasm + ld; the emitted C is
compiled with gcc -O0 -nostdlib -static; both produce real native
ELFs that, when run, reproduce the original binary's stdout byte-for-byte — the
strictest round-trip axis we know how to write.
- Phase A: ELF64 minimal parser + x86_64 instruction decoder (integer hot-loop subset) + NASM intel emitter + straight-line C emitter
- Phase B entry: CFG recovery (Cooper-Harvey-Kennedy iterative dominator + Aho/Sethi natural loop body) + structured C (do { ... } while (R[1] != 0); + if/else diamond)
- Phase B (b): function-boundary recovery (prologue/epilogue + call/ret + self-recursion). Each function is emitted as its own C function, with call → C call and ret → return;
- Phase B (d): inter-procedural Slot IR — function = Slot node, call graph lifted as first-class IR edges, deterministic JSON round-trip; self-recursion expressed naturally
- S9 bench: ELF 8 axes 200/200 + Phase E lift v2 25/25 = 225/225 PASS (2026-05-19), 25 samples (NASM hand-written 8 + gcc -O0 -nostdlib -static 17) — recursive fact(4) = 24, 2D matrix nested loops, struct field offsets and SIB-byte indexing all green across ASM and C round-trips
- Phase E lift v2: stack-slot promotion per function recovering true local variables on ELF (25/25 lifted-C binaries match the original stdout byte-for-byte)
- Phase E v3 PoC (loop + natural-cond recovery): a regex transform lifts the goto + BB-label form into human-readable while (var <= 5) { ... } etc.; 7/7 ELF loop samples (03_loop / 09_array_sum / 10_strlen / 11_signed_array / 12_int_index / 15_stride / 16_matrix) build+run PASS, 6/7 with natural-cond recovery
- Phase F PSDP PoC (auto OpenMP parallelisation): detects reduction loops and inserts #pragma omp parallel for reduction(...) automatically. A 3-reduction sample (sum + product + xor) confirms that gcc -fopenmp 4-thread output matches serial byte-for-byte, demonstrating the full binary → equivalent C → lifted C → parallel C 3-stage pipeline end-to-end
A reverse transpiler for “source-lost native binaries” in Linux x86_64 banking, defense and embedded systems, built on deterministic translation + 8-axis round-trip auto-regression + audit chain.
Paired with the sister product SlimePE-rev for Windows PE32+ (same Slot IR, shared decoder), and with the forward-direction sister SlimeASM (HLASM + MASM).
A SlimeNENC-family tool that moves legacy code into a modern language without changing its meaning. A hand rewrite quietly drifts and causes incidents; SlimeNENC doesn't interpret meaning — it copies only the "skeleton" (structure), so the computed results stay identical to the original. It proves the behaviour first, so migration anxiety disappears. It copies only what it can, honestly, and isolates what it can't.
Hand rewrites drift on subtle numeric/exception differences (boundary conditions), and verifying that (old-vs-new testing) costs enormous labour. SlimeNENC faithfully mirrors language-specific traps, proves "zero divergence" via differential testing, backed by an independent reference implementation. The deliverable is a machine-checkable "certificate of behavioural invariance," not human UAT. No overstating; what it can't do isn't hidden.
Source is projected onto a Slot IR (language-independent structural intermediate form) and transcribed structure-preserving into the target. The statically-determined core is made bit-exact; dynamic, state-dependent parts are honestly isolated (isolate, don't confabulate). Backed by differential fuzzing plus formal methods where applicable, with deterministic verification a third party can reproduce locally.
Projection (π) of source as unique structure, not semantics. Primitives are modelled rigorously in bit-vector theory and formally verified over all inputs; composition/loops are covered by Csmith-style differential fuzzing — a two-tier guarantee. Non-reproducible computation (float/parallel/AI) goes to tier-③: meaning-equivalence + convergence + residual. "Where there is form, prove it in bytes; where there is none, in meaning. Lie on neither face."
Key measurements (2026-05-19)
- Sample set: NASM-hand 8 + gcc -O0 17
- Phase E lift v2: true local vars + Win64-agnostic frame recovery
- Phase E v3 PoC: 7 loop samples build+run, 6/7 with natural-cond
- Phase F PoC: OpenMP 4-thread and serial outputs byte-match
- Slot IR: shared with SlimePE-rev, call graph as first-class edge, self-recursion native
- Decoder coverage: SIB / movzx / movsx / cdqe / movsxd / shl-shr-sar / and-or, shared with SlimePE-rev
Market context — where source-lost Linux ELF binaries live
| Segment | Where source-lost binaries appear |
|---|---|
| Banks (Linux x86_64) | During modernization projects, native ELF / .so libraries with no source and no surviving maintenance vendor. Heirloom / Astadia focus on mainframe HLASM and do not target Linux native binaries. |
| Defense / aerospace | Closed binaries (embedded Linux ELF / instrumentation daemons) frozen for 10-30 years. The originating vendor cannot supply source, but a C-source recovery with audit chain is required. |
| Embedded / medical devices | FDA / PMDA / IEC 62304 obligate “complete software description”. Binary-only components must be lifted to C as auditor-reproducible documentation. |
| Legacy documentation | “Working but untouchable” daemons must be lifted to C so static analysis, SBOM and CVE auditing can apply. |
| Competitive landscape | Ghidra (NSA OSS) / IDA Pro / Hex-Rays / RetDec already exist. SlimeELF-rev differentiates on three axes: (1) determinism + 8-axis round-trip auto-regression proves “lossless” via the bench harness; (2) single unified Slot IR shared with SlimePE-rev enables cross-OS audit pipelines; (3) decompile output compiles directly with gcc/ld and runs with stdout matching the original. |
S9 bench — all 8 axes: ELF 200/200 PASS
The S9 bench harness validates ELF at bit-level. The x86_64 instruction decoder built in Phase B (~37 opcodes) is shared with the sister product SlimePE-rev; only the container layer (ELF parser, syscall heuristic) is ELF-specific.
| Axis | Check and result |
|---|---|
| Axis 1a dialect-detect | Tokenizer recognises ELF magic + ELFCLASS64 + EM_X86_64 (e_machine = 0x3E). 25/25 PASS. |
| Axis 1b opcode-recover | All 1,397 .text instructions across 25 samples (NASM 177 + gcc 1,220) decoded — db 0xNN fallback count = 0. 25/25 PASS. |
| Axis 2 mutation-detect | 1-bit flip in .text, 5 trials × 25 samples = 125 trials, 125/125 detected. Disasm output must differ — invariant. |
| Axis 3 determinism | NASM emit twice, byte-equal across all 25 samples. 25/25 PASS. |
| Axis 4 ASM round-trip | emit NASM → nasm + ld → run → original stdout match. The strictest axis: two real native binaries (original + ours) executed and compared. 25/25 PASS, including recursive fact(4) = 24, 2D matrix nested loops, struct field offsets and SIB-byte array indexing. |
| Axis 5 C round-trip | emit C → gcc -O0 -nostdlib -static → run → original stdout match. Straight-line PC dispatch + byte-addressable STACK[] modelling call/ret/push/pop is bit-faithful. 25/25 PASS. |
| Axis 6 structured-C round-trip | CFG-recovered structured C (do-while + if/else + per-function + call → C call + ret → return;) → gcc → run → match. 25/25 PASS (nested loops included). |
| Axis 7 Slot IR round-trip | SlotImage → JSON → SlotImage → structural equality + JSON byte-equal double check. Function = Slot node and the call graph are preserved completely. 25/25 PASS. |
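The Axis 1a check can be pictured with a small header probe. This is a minimal sketch, not the product's tokenizer: the function name `is_elf64_x86_64` is hypothetical, and it assumes a little-endian (ELFDATA2LSB) image, which is what x86_64 Linux uses. The constants and field offsets come from the ELF-64 header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constants from the ELF-64 header layout. */
enum {
    ELF_EI_CLASS    = 4,    /* index of the class byte in e_ident          */
    ELF_EI_DATA     = 5,    /* index of the data-encoding byte in e_ident  */
    ELF_CLASS64     = 2,    /* ELFCLASS64                                  */
    ELF_DATA2LSB    = 1,    /* little-endian                               */
    ELF_OFF_MACHINE = 18,   /* byte offset of the 16-bit e_machine field   */
    ELF_EM_X86_64   = 0x3E  /* EM_X86_64                                   */
};

/* Hypothetical helper: returns 1 if buf looks like a little-endian
   x86_64 ELF64 image, 0 otherwise. */
int is_elf64_x86_64(const uint8_t *buf, size_t len)
{
    static const uint8_t magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (buf == NULL || len < 20)                 /* e_ident(16) + e_type(2) + e_machine(2) */
        return 0;
    if (memcmp(buf, magic, sizeof magic) != 0)   /* ELF magic */
        return 0;
    if (buf[ELF_EI_CLASS] != ELF_CLASS64)        /* 64-bit container */
        return 0;
    if (buf[ELF_EI_DATA] != ELF_DATA2LSB)        /* assumption: little-endian only */
        return 0;

    uint16_t machine = (uint16_t)(buf[ELF_OFF_MACHINE] |
                                  (buf[ELF_OFF_MACHINE + 1] << 8));
    return machine == ELF_EM_X86_64;
}
```

Calling it on the first 20 bytes of a file is enough to make the accept/reject decision; section and program headers are not touched at this stage.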
Phase E lift v2 — true-local-variable recovery (25/25)
Transforms Phase D's VM-form C output (R[] + STACK[] + mem_r/mem_w dispatcher) into structured-C emit and applies stack-slot promotion per function (rbp ± offset memory accesses are lifted into named C locals). The C scoping rules eliminate cross-function frame collisions automatically.
// Phase D VM form (before lifting)
mem_w((R[5] + (uint64_t)((int64_t)(-8LL))), (uint64_t)(R[0]), 8);
R[0] = (uint64_t)(mem_r((R[5] + (uint64_t)((int64_t)(-8LL))), 8));
// Phase E lift v2 (after lifting)
int64_t var_m8 = 0; /* [rbp-8] */
var_m8 = (int64_t)(R[0]);
R[0] = (uint64_t)var_m8;
All 25 ELF samples have their lifted C output rebuilt with gcc -nostdlib and confirmed to match the original binary's stdout byte-for-byte.
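To make the before/after pair above concrete, here is a sketch of what the VM-form helpers could look like and why the lifted form computes the same value. The bodies of `mem_r` / `mem_w`, the `STACK_SIZE` value, the address wrap-around and the two wrapper functions are assumptions for illustration; only the call shapes and the `R[]` / `STACK[]` names come from the emitted code shown on this page.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SIZE 4096u               /* assumed size, illustration only */

static uint8_t STACK[STACK_SIZE];      /* byte-addressable stack model */
uint64_t R[16];                        /* R[0] = rax, R[5] = rbp (hardware register numbers) */

/* Little-endian store of the low `size` bytes (1..8) of val.
   Addresses wrap modulo STACK_SIZE purely to keep the sketch memory-safe. */
void mem_w(uint64_t addr, uint64_t val, int size)
{
    for (int i = 0; i < size; i++)
        STACK[(addr + (uint64_t)i) % STACK_SIZE] = (uint8_t)(val >> (8 * i));
}

/* Little-endian load of `size` bytes (1..8). */
uint64_t mem_r(uint64_t addr, int size)
{
    uint64_t v = 0;
    for (int i = 0; i < size; i++)
        v |= (uint64_t)STACK[(addr + (uint64_t)i) % STACK_SIZE] << (8 * i);
    return v;
}

/* VM form of: mov [rbp-8], rax ; mov rax, [rbp-8] */
uint64_t vm_form(uint64_t rax_in)
{
    R[5] = STACK_SIZE - 16;            /* stand-in frame pointer */
    R[0] = rax_in;
    mem_w((R[5] + (uint64_t)((int64_t)(-8LL))), (uint64_t)(R[0]), 8);
    R[0] = 0;                          /* clobber rax so the reload is observable */
    R[0] = (uint64_t)(mem_r((R[5] + (uint64_t)((int64_t)(-8LL))), 8));
    return R[0];
}

/* Lifted form: the [rbp-8] slot becomes a named C local. */
uint64_t lifted_form(uint64_t rax_in)
{
    int64_t var_m8 = 0;                /* [rbp-8] */
    var_m8 = (int64_t)rax_in;
    return (uint64_t)var_m8;
}
```

Both functions return their input unchanged for any 64-bit value, which is the property the 25/25 stdout comparison exercises at whole-program scale.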
Phase E v3 PoC — loop + natural-cond recovery (7 ELF samples operational)
Phase E v3 works in two steps. The minimum step lifts the structured-C emit's goto BB_TEST; BB_BODY: body; BB_TEST: cmp; if (cond) goto BB_BODY; form (PC dispatch + BB labels) into a while (1) { test; if (!cond) break; body; } shape. The full step then rewrites the cmp + ZF/SF/OF + Jcc bit-level encoding into natural expressions such as while (var <= 5) {...}, following the Jcc condition semantics (je→==, jne→!=, jl→<, jge→>=, jle→<=, jg→>). All 7 ELF loop samples build and run correctly (7/7), and 6 of the 7 also recover a natural condition; 10_strlen uses a different test/jne pattern and stays in while(1)+break form, but still passes.
This phase is currently ELF-first; PE32+ extension is on the roadmap (instruction encoding is shared; CFG patterns from MinGW's gcc output differ slightly).
| sample | v3 minimum | v3 full | build+run |
|---|---|---|---|
| 03_loop | ✓ | ✓ (var <= 5) | SUM=15 |
| 09_array_sum | ✓ | ✓ | SUM=35 |
| 10_strlen | ✓ | − (test/jne different pattern, kept in while(1)+break form) | LEN=14 |
| 11_signed_array | ✓ | ✓ | SUM=10 |
| 12_int_index | ✓ | ✓ | SUM=150 |
| 15_stride | ✓ | ✓ | STR=35 |
| 16_matrix | ✓ | ✓ | MAT=78 |
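The Jcc-to-operator step described for Phase E v3 is essentially a table lookup. The sketch below is an illustration under assumptions: the function names are hypothetical, and it covers only the six signed mappings listed in the text. Anything else returns "not recoverable", mirroring how 10_strlen stays in while(1)+break form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct jcc_map { const char *mnemonic; const char *c_op; };

/* Jcc taken after `cmp a, b` corresponds to `a OP b` (signed compare). */
static const struct jcc_map JCC_TABLE[] = {
    { "je",  "==" }, { "jne", "!=" },
    { "jl",  "<"  }, { "jge", ">=" },
    { "jle", "<=" }, { "jg",  ">"  },
};

/* Returns the C operator for a mnemonic, or NULL if it is not in the table. */
const char *jcc_to_c_op(const char *mnemonic)
{
    for (size_t i = 0; i < sizeof JCC_TABLE / sizeof JCC_TABLE[0]; i++)
        if (strcmp(JCC_TABLE[i].mnemonic, mnemonic) == 0)
            return JCC_TABLE[i].c_op;
    return NULL;
}

/* Formats "while (lhs OP rhs)" into out.
   Returns 0 on success, -1 if the Jcc is unknown or the buffer is too small. */
int emit_while_header(char *out, size_t cap,
                      const char *lhs, const char *jcc, const char *rhs)
{
    const char *op = jcc_to_c_op(jcc);
    if (op == NULL)
        return -1;                     /* caller keeps the while(1)+break form */
    int n = snprintf(out, cap, "while (%s %s %s)", lhs, op, rhs);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

For `cmp var, 5` followed by `jle BB_BODY`, `emit_while_header(buf, sizeof buf, "var", "jle", "5")` produces the header shown for 03_loop. Unsigned conditions (jb / jae / jbe / ja) are left out on purpose, since they would need unsigned operand types to stay faithful.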
Phase F PSDP PoC — automatic OpenMP parallelisation of reduction loops
Takes natural-form C (the kind Phase E v3 produces) and detects reduction-style loops to insert #pragma omp parallel for reduction(...) automatically. Supports 5 reduction operators (+= / *= / |= / &= / ^=) and aggregates multiple reductions into a single pragma. A 3-reduction sample (sum + product + xor simultaneously) confirms gcc -fopenmp 4-thread output matches the serial version byte-for-byte.
// input (natural C)
for (int i = 0; i < 5; i = i + 1) {
sum += arr[i]; prod *= arr[i]; xorv ^= arr[i];
}
// automatic transform output (Phase F PSDP PoC)
#pragma omp parallel for reduction(+:sum) reduction(*:prod) reduction(^:xorv)
for (int i = 0; i < 5; i = i + 1) {
sum += arr[i]; prod *= arr[i]; xorv ^= arr[i];
}
// → gcc -fopenmp -O2 + 4-thread run → "SUM=015 PROD=120 XOR=001"
// → byte-matches the serial version (semantics preserved from the original ELF binary)
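A self-contained way to check the serial-versus-parallel claim is sketched below. The array contents and the output format string are assumptions: the sample's data is not shown on this page, and {1, 2, 3, 4, 5} with `%03d` fields was chosen because it reproduces the quoted "SUM=015 PROD=120 XOR=001". The pragma is guarded by `_OPENMP`, so the file also builds without -fopenmp, in which case both functions run serially.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed input data; see the note above. */
static const int ARR[5] = { 1, 2, 3, 4, 5 };

/* Serial reference: three reductions in one loop. */
void reduce_serial(char *out, size_t cap)
{
    int sum = 0, prod = 1, xorv = 0;
    for (int i = 0; i < 5; i = i + 1) {
        sum += ARR[i]; prod *= ARR[i]; xorv ^= ARR[i];
    }
    snprintf(out, cap, "SUM=%03d PROD=%03d XOR=%03d", sum, prod, xorv);
}

/* Same loop with the aggregated reduction pragma (active under -fopenmp). */
void reduce_parallel(char *out, size_t cap)
{
    int sum = 0, prod = 1, xorv = 0;
#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum) reduction(*:prod) reduction(^:xorv)
#endif
    for (int i = 0; i < 5; i = i + 1) {
        sum += ARR[i]; prod *= ARR[i]; xorv ^= ARR[i];
    }
    snprintf(out, cap, "SUM=%03d PROD=%03d XOR=%03d", sum, prod, xorv);
}
```

Integer addition, multiplication and xor are associative and commutative, so the reduction result does not depend on how iterations are split across threads; comparing the two formatted strings with `strcmp` is then a direct byte-match test.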
To our knowledge, this is the only same-project demonstration of the 3-stage pipeline (binary → equivalent C → human-readable C → parallel C). Hex-Rays / Ghidra / IDA Pro stop at producing human-readable C, with no build, run or parallelism guarantee; SlimeELF-rev verifies all three stages by stdout match under build + run + parallel execution.
Currently ELF-first; PE32+ Phase F extension is roadmapped.
Sample inventory (25 ELF binaries)
8 hand-written NASM samples (research PoC) + 17 real-world C sources built with gcc -O0 -nostdlib -static. Coverage includes array indexing (SIB byte), signed/unsigned extension (movsx / movzx / movsxd / cdq / cdqe), bit operations (shl / shr / sar / and / or), nested loops, 2D arrays, struct field offsets and recursion.
NASM-source 8 (research PoC, hand-written integer hot-loop subset)
| Sample | Description |
|---|---|
| 01 hello | Syscall write of "Hello, ELF!\n". Smallest ELF: 1 BB / 8 instructions. |
| 02 arith | 17 + 25 = 42 printed as 2 ASCII digits. Exercises idiv, add al, mov [rip+disp], etc. |
| 03 loop | loop sum_loop for 1+2+3+4+5 = 15. CFG has a back edge; structured C recovers as do { ... } while (R[1] != 0);. |
| 04 branch | cmp + jge diamond. Structured C recovers as if (cond) { ... } else { ... } meeting at a common join BB. |
| 05 compute | imul rax, rbx for 6 × 7 = 42. |
| 06 call_simple | _start → do_print. Prologue (push rbp; mov rbp, rsp) + epilogue (pop rbp; ret) recognised as function boundary and split into independent C functions. |
| 07 two_funcs | 3 functions (_start → add_two + print_dec). Two inter-procedural call graph edges. |
| 08 recursion | factorial(4) = 24 via self-recursion. Call graph carries a self-loop edge (fact → fact); push rbx / pop rbx caller-saved spill is preserved bit-faithfully via STK[]. |
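As an illustration of the 03 loop row, the sketch below shows how an x86 `loop` back edge maps onto the do-while shape. The loop body (`add rax, rcx`) and the function name are assumptions, since the sample's NASM source is not reproduced here; the `R[1] != 0` condition and the register numbering (R[0] = rax, R[1] = rcx) follow the emitted C shown on this page.

```c
#include <assert.h>
#include <stdint.h>

/* Structured-C shape for:
 *     mov rcx, count
 *   sum_loop:
 *     add rax, rcx        ; assumed body
 *     loop sum_loop       ; rcx -= 1, branch back while rcx != 0
 */
uint64_t sum_loop_structured(uint64_t count)
{
    uint64_t R[16] = { 0 };       /* R[0] = rax, R[1] = rcx */
    R[1] = count;
    do {
        R[0] = R[0] + R[1];       /* add rax, rcx */
        R[1] = R[1] - 1;          /* decrement performed by `loop` */
    } while (R[1] != 0);          /* back edge taken while rcx != 0 */
    return R[0];
}
```

With count = 5 the body runs for rcx = 5, 4, 3, 2, 1 and the result is 15. A count of 0 is deliberately not special-cased: like the real instruction, the counter would wrap and keep looping, which is the behaviour a bit-faithful translation has to preserve.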
gcc -O0 -nostdlib -static 17 (production target, real-world C sources)
| Sample | Description |
|---|---|
| 01 hello (gcc) | Freestanding C printing "Hello, gcc -O0!\n". 35 instructions. |
| 02 arith (gcc) | 17 + 25 = 42. True signed-division path via cqo + idiv. 69 instructions. |
| 03 loop (gcc) | For-loop computing 1+2+3+4+5 = 15. 71 instructions. |
| 04 branch (gcc) | if (x >= 5) branch. Full function epilogue including leave. 46 instructions. |
| 05 compute (gcc) | 6 × 7 = 42 with 2-op imul; multiple syscall calls. 68 instructions. |
| 06 call_simple (gcc) | _start calls static void do_print(void) then sys_exit. Three-function boundary recovery. 41 instructions. |
| 07 two_funcs (gcc) | Helpers add_two() + print_dec(). Call graph has 5 function nodes; inter-procedural argument passing via rdi is bit-faithful. 84 instructions. |
| 08 recursion (gcc) | C-written fact(4) = 24 recursion. Self-loop edge detected in the call graph. 85 instructions. |
| 09 array_sum (gcc) | arr[5] = {3,5,7,9,11} sum → SUM=35. SIB byte indexed array access. 72 instructions. |
| 10 strlen (gcc) | Hand-written strlen on "Hello, World!\n" → LEN=14. movzx + test al, al driven null-terminator loop. 81 instructions. |
| 11 signed_array (gcc) | Signed-char array {-3, 5, -8, 12, 4} sum → SUM=10. movsx and signed arithmetic. 74 instructions. |
| 12 int_index (gcc) | int i array loop + 3-digit print → SUM=150. cdqe + 32-bit op variants. 89 instructions. |
| 13 bitshift (gcc) | 32 << 3 = 256, >> 1 = 128 → VAL=128. shl / shr / sar. 86 instructions. |
| 14 bitmask (gcc) | 0xFF12 & 0xFF = 18 → RES=18. and r/m64, imm8 bit-masking. 65 instructions. |
| 15 stride (gcc) | Stride access arr[i*3] sum → STR=35. SIB indexed load. 72 instructions. |
| 16 matrix (gcc) | 3×4 matrix sum via nested loop → MAT=78. 2D array + nested CFG; SIB-form lea computes the row offset. 85 instructions. |
| 17 struct (gcc) | Array-of-struct pts[3] = {{10,20},{30,40},{50,60}} field sum → PT=210. SIB + displacement field offsets. 107 instructions. |
Function = Slot node, call graph as first-class IR (Phase B (d))
Each function becomes a SlotFunction node; call edges are first-class IR (a list of callee names per function). Self-recursion is naturally a self-loop edge. The full SlotImage encodes/decodes via deterministic JSON (Axis 7 round-trip), so call graphs and function structure can flow into external toolchains (audit DBs, SBOM, static analysis) without information loss. The Slot IR schema is unified with SlimePE-rev, enabling cross-OS audit pipelines.
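The idea of "function = node, callee list = edges, deterministic JSON" can be sketched as follows. The struct layout, field names and JSON keys are hypothetical (the real SlotFunction schema is not shown on this page), and only the emit half of the round-trip is illustrated. Determinism here comes from a fixed key order, no whitespace, and callees written in stored order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_CALLEES 8

/* Hypothetical function node: a name plus an ordered list of callee names. */
struct slot_function {
    const char *name;
    const char *callees[MAX_CALLEES];
    int n_callees;
};

/* Returns 1 if the node has a call edge to itself (self-recursion). */
int slot_is_self_recursive(const struct slot_function *f)
{
    for (int i = 0; i < f->n_callees; i++)
        if (strcmp(f->callees[i], f->name) == 0)
            return 1;
    return 0;
}

/* Emits {"name":"...","calls":["...",...]} with fixed key order and no
   whitespace. Names are assumed to need no JSON escaping.
   Returns the string length, or -1 if the buffer is too small. */
int slot_to_json(const struct slot_function *f, char *out, size_t cap)
{
    int n = snprintf(out, cap, "{\"name\":\"%s\",\"calls\":[", f->name);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    size_t pos = (size_t)n;

    for (int i = 0; i < f->n_callees; i++) {
        n = snprintf(out + pos, cap - pos, "%s\"%s\"",
                     i ? "," : "", f->callees[i]);
        if (n < 0 || (size_t)n >= cap - pos)
            return -1;
        pos += (size_t)n;
    }

    n = snprintf(out + pos, cap - pos, "]}");
    if (n < 0 || (size_t)n >= cap - pos)
        return -1;
    return (int)(pos + (size_t)n);
}
```

For the recursion sample, a node named `fact` with the single callee `fact` serialises to the same bytes on every call, and `slot_is_self_recursive` reports the self-loop edge.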
Audit fitness (finance / defense / medical-device)
- Bit-exact: Same ELF input → same sha256 NASM/C output. CFG, function boundaries and instruction stream are all fully deterministic.
- Native ELF round-trip: Emitted NASM is re-assembled via nasm + ld; emitted C is compiled via gcc -nostdlib; the two real native ELFs are executed and their stdout compared with the original. This is real-machine verification, not simulation.
- Mutation detection: A 1-bit flip in .text always changes the disasm. 25 × 5 = 125/125 detected, so tampering is immediately visible.
- Determinism: Same ELF disassembled + emitted twice → byte-equal per sample. Stable across parallel and GPU execution.
- Slot IR audit: Function = Slot node + call graph persisted as deterministic JSON. Joins SBOM / audit DB pipelines as a structured artifact.
- Build-time LLM: An LLM is used only at decoder-rule construction time. Runtime is deterministic and rule-based, aligned with bank / defense audit requirements.
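The mutation-detection and determinism items above can be pictured as a small harness. This sketch swaps in a stand-in emitter that renders every byte as a NASM `db 0xNN` line, which is not what the product emits (it decodes instructions and reports zero `db` fallbacks); the point is only the shape of the two checks. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in emitter: one `db 0xNN` line per input byte.
   Returns the listing length, or -1 if the buffer is too small. */
int emit_db_listing(const uint8_t *text, size_t len, char *out, size_t cap)
{
    size_t pos = 0;
    if (cap == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        int n = snprintf(out + pos, cap - pos, "db 0x%02X\n", text[i]);
        if (n < 0 || (size_t)n >= cap - pos)
            return -1;
        pos += (size_t)n;
    }
    return (int)pos;
}

/* Flips exactly one bit of the buffer in place. */
void flip_bit(uint8_t *text, size_t bit_index)
{
    text[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
}

/* Returns 1 when (a) emitting the same bytes twice is byte-equal and
   (b) a 1-bit flip at bit_index changes the listing; 0 otherwise. */
int check_mutation_and_determinism(const uint8_t *text, size_t len,
                                   size_t bit_index)
{
    char first[1024], second[1024], mutated_out[1024];
    uint8_t mutated[64];

    if (len == 0 || len > sizeof mutated || bit_index >= len * 8)
        return 0;
    memcpy(mutated, text, len);
    flip_bit(mutated, bit_index);

    if (emit_db_listing(text, len, first, sizeof first) < 0) return 0;
    if (emit_db_listing(text, len, second, sizeof second) < 0) return 0;
    if (emit_db_listing(mutated, len, mutated_out, sizeof mutated_out) < 0) return 0;

    return strcmp(first, second) == 0 && strcmp(first, mutated_out) != 0;
}
```

Running the check over several bit positions per sample gives the same trial structure as the 5 trials × 25 samples described above, with the real decoder in place of the stand-in.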
Supported instructions (shared with SlimePE-rev, ~37 patterns)
| Category | Instructions |
|---|---|
| Data movement | mov reg/mem, imm/reg (B8+r / 89 /r / 8B /r / C7 /0 / 88 /r, both 64-bit and 32-bit) / movzx r32/64, r/m8 (0F B6 /r, zero-extend) / movsx r32/64, r/m8 (0F BE /r, sign-extend) / lea r64, m (8D /r, SIB / [reg+disp] / [rip+disp] all forms) / nop (90) / leave (C9) |
| Arithmetic | add / sub r/m, r / imul r, r/m (REX.W 0F AF) / idiv r/m (F7 /7) / cqo / cdqe / cdq / movsxd r64, r/m32 |
| Logic | and / or / xor r/m64, r64 / xor reg, reg idiom recognised as zero-init |
| Bit shifts | shl / shr / sar r/m, imm8 (C1 /N) / shl / shr / sar r/m, 1 (D1 /N) |
| Compare / test | cmp r/m64, r64 / cmp r/m, imm8 (83 /7) / test r/m, r |
| Branch | Jcc rel8/32 (je / jne / jge / jg / jl / jle / jb / jae / jbe / ja / js / jns ...) / jmp rel8/32 / loop rel8 (E2) |
| Call / stack | call rel32 (E8) / ret (C3) / push/pop r64 (50-5F) / push imm |
| System (ELF) | syscall (0F 05) — sys_write (rax=1) / sys_exit (rax=60) recognised by heuristic. (PE32+ uses IAT-indirect calls; see SlimePE-rev.) |
| Memory operands | [reg] / [reg+disp8/32] / [rip+disp32] / [base+index*scale+disp] (SIB byte, scale=1/2/4/8, REX.X index extension) — full coverage of [rbp-disp] local-variable access and [rax*8+disp] array indexing. |
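The [base+index*scale+disp] form in the Memory operands row follows directly from the SIB byte layout (scale in bits 7-6, index in bits 5-3, base in bits 2-0, with REX.X and REX.B supplying a fourth bit). The sketch below is a simplified illustration with hypothetical names; it omits the mod=00, base=101 special case (disp32 with no base register), so it is not a complete decoder.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded SIB fields. scale is stored as the multiplier (1, 2, 4 or 8). */
struct sib {
    unsigned scale;
    unsigned index;   /* register number 0..15 */
    unsigned base;    /* register number 0..15 */
};

/* Splits a SIB byte; rex_x / rex_b extend index / base to 4 bits. */
struct sib decode_sib(uint8_t byte, int rex_x, int rex_b)
{
    struct sib s;
    s.scale = 1u << (byte >> 6);
    s.index = ((byte >> 3) & 7u) | (rex_x ? 8u : 0u);
    s.base  = (byte & 7u) | (rex_b ? 8u : 0u);
    return s;
}

/* Effective address = base + index*scale + disp, in 64-bit wrap-around
   arithmetic. Index number 4 (the rsp slot without REX.X) means "no index". */
uint64_t effective_address(const uint64_t R[16], struct sib s, int64_t disp)
{
    uint64_t idx = (s.index == 4u) ? 0 : R[s.index] * s.scale;
    return R[s.base] + idx + (uint64_t)disp;
}
```

For example, SIB byte 0x85 decodes to scale 4, index rax, base rbp; with rax = 3, rbp = 0x1000 and disp = -0x20 the address is 0x1000 + 12 - 0x20, the kind of access a gcc -O0 int-array loop produces.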
Next-phase additions: printf / malloc / PLT/GOT dynamic linking, SSE2 / SSE4 (XMM + floating-point), 3-op imul r64, r/m64, imm32, movabs r64, imm64.
License model
| Item | Terms |
|---|---|
| Charged | WASM/WASI converter tool (developer side) |
| Not charged | The produced NASM / C sources (customer asset, perpetual deployment) |
| Method | Ed25519 144B signed license + 3-hop air-gap activation (finance / defense audit ready) |
| Parallelization (PSDP) | Not included. See the independent PSDP SKU under SlimeNENC. |
Related materials
- Sister (same OS, forward): SlimeASM, the HLASM + Win x64 MASM forward transpiler.
- Sister (Windows reverse): SlimePE-rev, Windows PE32+ x86_64 reverse, same Slot IR + shared decoder.
- Reverse family overview: SlimeASM-rev landing, the umbrella reverse-family page.
- Slot IR shared family: SlimeCOBOL / SlimePL/I / SlimeRPG / SlimeMUMPS share the Slot IR (Core64 + Ext32 fixed-bit).
- Patent application: JP application 2026-046620 v15b, claims 11 / 14d.
