SlimeELF-rev — Linux ELF x86_64 binary → ASM + C reverse transpiler
Recover NASM and C sources from a tag-less native Linux ELF, bit-exact.
Convert source-lost or vendor-lost Linux x86_64 native binaries (banking RHEL fleets, defense embedded Linux, medical-device daemons) back into both NASM Intel-syntax source and C source, bit-exact. The emitted NASM is re-assembled with nasm + ld; the emitted C is compiled with gcc -O0 -nostdlib -static. Both produce real native ELFs that, when run, reproduce the original binary's stdout byte-for-byte, which is the strictest round-trip axis we know how to write.
- Phase A: ELF64 minimal parser + x86_64 instruction decoder (integer hot-loop subset) + NASM intel emitter + straight-line C emitter
- Phase B entry: CFG recovery (Cooper-Harvey-Kennedy iterative dominator + Aho/Sethi natural loop body) + structured C (do { ... } while (R[1] != 0); + if/else diamond)
- Phase B (b): function-boundary recovery (prologue/epilogue + call/ret + self-recursion). Each function emits as its own C function, call → C call, ret → return;
- Phase B (d): inter-procedural Slot IR: function = Slot node, call graph lifted as first-class IR edges, deterministic JSON round-trip; self-recursion expressed naturally
- S9 bench: ELF 8 axes 200/200 + Phase E lift v2 25/25 = 225/225 PASS (2026-05-19), 25 samples (NASM hand-written 8 + gcc -O0 -nostdlib -static 17): recursive fact(4) = 24, 2D matrix nested loops, struct field offsets and SIB-byte indexing all green across ASM and C round-trips
- Phase E lift v2: stack-slot promotion per function recovering true local variables on ELF (25/25 lifted-C binaries match the original stdout byte-for-byte)
- Phase E v3 PoC (loop + natural-cond recovery): regex transform lifts goto + BB-label form into human-readable while (var <= 5) { ... } etc.; 7/7 ELF loop samples (03_loop / 09_array_sum / 10_strlen / 11_signed_array / 12_int_index / 15_stride / 16_matrix) build+run PASS, 6/7 with natural-cond recovery
- Phase F PSDP PoC (auto OpenMP parallelisation): detects reduction loops and inserts #pragma omp parallel for reduction(...) automatically; a 3-reduction sample (sum + product + xor) confirms gcc -fopenmp 4-thread output matches serial byte-for-byte, demonstrating the full binary → equivalent C → lifted C → parallel C 3-stage pipeline end-to-end
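
The dominator step named in the Phase B entry can be illustrated with a small sketch. It follows the Cooper-Harvey-Kennedy iterative scheme (postorder numbering plus an intersect walk) on a hand-built CFG, then uses the result to test for a loop back edge (an edge t → h where h dominates t). The `Cfg` struct, the fixed array sizes and every function name here are hypothetical and chosen for illustration; they are not SlimeELF-rev's internal API.

```c
#include <assert.h>
#include <string.h>

#define MAX_BB 16
#define UNDEF (-1)

/* Hypothetical CFG: block 0 is the entry, each block has at most 2 successors. */
typedef struct {
    int n;
    int nsucc[MAX_BB];
    int succ[MAX_BB][2];
} Cfg;

static int post_num[MAX_BB];  /* postorder number per block */
static int post_seq[MAX_BB];  /* blocks listed in postorder */
static int post_cnt;

void cfg_init(Cfg *g, int n) {
    assert(n > 0 && n <= MAX_BB);
    memset(g, 0, sizeof *g);
    g->n = n;
}

void cfg_edge(Cfg *g, int from, int to) {
    assert(from >= 0 && from < g->n && to >= 0 && to < g->n && g->nsucc[from] < 2);
    g->succ[from][g->nsucc[from]++] = to;
}

static void dfs(const Cfg *g, int b) {
    post_num[b] = -2;  /* mark as being visited */
    for (int k = 0; k < g->nsucc[b]; k++)
        if (post_num[g->succ[b][k]] == UNDEF) dfs(g, g->succ[b][k]);
    post_num[b] = post_cnt;
    post_seq[post_cnt++] = b;
}

/* Walks both candidates up the dominator tree until they meet. */
static int intersect(const int *idom, int a, int b) {
    while (a != b) {
        while (post_num[a] < post_num[b]) a = idom[a];
        while (post_num[b] < post_num[a]) b = idom[b];
    }
    return a;
}

/* Fills idom[b] with the immediate dominator of b (UNDEF if unreachable). */
void compute_idom(const Cfg *g, int *idom) {
    for (int i = 0; i < g->n; i++) { post_num[i] = UNDEF; idom[i] = UNDEF; }
    post_cnt = 0;
    dfs(g, 0);
    idom[0] = 0;
    for (int changed = 1; changed; ) {
        changed = 0;
        /* reverse postorder, skipping the entry (last in postorder) */
        for (int i = post_cnt - 2; i >= 0; i--) {
            int b = post_seq[i], new_idom = UNDEF;
            for (int p = 0; p < g->n; p++) {          /* scan predecessors of b */
                if (idom[p] == UNDEF) continue;       /* not processed yet */
                for (int k = 0; k < g->nsucc[p]; k++)
                    if (g->succ[p][k] == b)
                        new_idom = (new_idom == UNDEF) ? p : intersect(idom, p, new_idom);
            }
            if (idom[b] != new_idom) { idom[b] = new_idom; changed = 1; }
        }
    }
}

/* Returns 1 if block a dominates block b. */
int dominates(const int *idom, int a, int b) {
    if (idom[b] == UNDEF) return 0;
    while (b != a && b != idom[b]) b = idom[b];
    return b == a;
}
```

For a CFG shaped like a simple counted loop (0 → 1 → 2, 2 → 1, 2 → 3), `dominates(idom, 1, 2)` holds, so the edge 2 → 1 is a back edge and block 1 is the loop header.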
A reverse transpiler for “source-lost native binaries” in Linux x86_64 banking, defense and embedded systems, built on deterministic translation + 8-axis round-trip auto-regression + audit chain.
Paired with sister product SlimePE-rev for Windows PE32+ (same Slot IR, shared decoder) and with the forward transpiler SlimeASM (HLASM + MASM).
Key measurements (2026-05-19)
- Samples (25) = NASM-hand 8 + gcc -O0 17
- Phase E lift v2 (25/25) = true local vars + Win64-agnostic frame recovery
- Phase E v3 PoC (7/7) = 7 loop samples build+run, 6/7 with natural-cond
- Phase F PoC: OpenMP 4-thread and serial outputs byte-match
- Slot IR: shared with SlimePE-rev, call graph as first-class edge, self-recursion native
- Decoder (~37 patterns): SIB / movzx / movsx / cdqe / movsxd / shl-shr-sar / and-or, shared with SlimePE-rev
Market context — where source-lost Linux ELF binaries live
| Area | Situation |
|---|---|
| Banks (Linux x86_64) | During modernization projects, native ELF / .so libraries with no source and no surviving maintenance vendor. Heirloom / Astadia focus on mainframe HLASM and do not target Linux native binaries. |
| Defense / aerospace | Closed binaries (embedded Linux ELF / instrumentation daemons) frozen for 10-30 years. The originating vendor cannot supply source, but a C-source recovery with audit chain is required. |
| Embedded / medical devices | FDA / PMDA / IEC 62304 obligate “complete software description”. Binary-only components must be lifted to C as auditor-reproducible documentation. |
| Legacy documentation | “Working but untouchable” daemons must be lifted to C so static analysis, SBOM and CVE auditing can apply. |
| Competitive landscape | Ghidra (NSA OSS) / IDA Pro / Hex-Rays / RetDec already exist. SlimeELF-rev differentiates on three axes: (1) determinism + 8-axis round-trip auto-regression proves “lossless” via the bench harness; (2) single unified Slot IR shared with SlimePE-rev enables cross-OS audit pipelines; (3) decompile output compiles directly with gcc/ld and runs with stdout matching the original. |
S9 bench — all 8 axes: ELF 200/200 PASS
The S9 bench harness validates ELF at bit-level. The x86_64 instruction decoder built in Phase B (~37 opcodes) is shared with the sister product SlimePE-rev; only the container layer (ELF parser, syscall heuristic) is ELF-specific.
| Axis | Check and result |
|---|---|
| Axis 1a dialect-detect | Tokenizer recognises ELF magic + ELFCLASS64 + EM_X86_64 (e_machine = 0x3E). 25/25 PASS. |
| Axis 1b opcode-recover | All 1,397 .text instructions across 25 samples (NASM 177 + gcc 1,220) decoded — db 0xNN fallback count = 0. 25/25 PASS. |
| Axis 2 mutation-detect | 1-bit flip in .text, 5 trials × 25 samples = 125 trials, 125/125 detected. Disasm output must differ — invariant. |
| Axis 3 determinism | NASM emit twice, byte-equal across all 25 samples. 25/25 PASS. |
| Axis 4 ASM round-trip | emit NASM → nasm + ld → run → original stdout match. The strictest axis: two real native binaries (original + ours) executed and compared. 25/25 PASS, including recursive fact(4) = 24, 2D matrix nested loops, struct field offsets and SIB-byte array indexing. |
| Axis 5 C round-trip | emit C → gcc -O0 -nostdlib -static → run → original stdout match. Straight-line PC dispatch + byte-addressable STACK[] modelling call/ret/push/pop is bit-faithful. 25/25 PASS. |
| Axis 6 structured-C round-trip | CFG-recovered structured C (do-while + if/else + per-function + call → C call + ret → return;) → gcc → run → match. 25/25 PASS (nested loops included). |
| Axis 7 Slot IR round-trip | SlotImage → JSON → SlotImage → structural equality + JSON byte-equal double check. Function = Slot node and the call graph are preserved completely. 25/25 PASS. |
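
As an illustration of what the Axis 1a check looks at, here is a minimal C sketch that tests the three header fields named in the table (ELF magic, ELFCLASS64, EM_X86_64). The function name is hypothetical, and the little-endian check on EI_DATA is an added assumption that the table does not mention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf starts with an ELF64 header for x86_64, else 0. */
int is_elf64_x86_64(const uint8_t *buf, size_t len) {
    assert(buf != NULL);
    if (len < 20) return 0;   /* e_ident (16 bytes) + e_type (2) + e_machine (2) */
    if (buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return 0;             /* ELF magic */
    if (buf[4] != 2) return 0;  /* EI_CLASS must be ELFCLASS64 */
    if (buf[5] != 1) return 0;  /* EI_DATA little-endian (assumption, see lead-in) */
    /* e_machine sits at offset 18, stored little-endian */
    uint16_t e_machine = (uint16_t)(buf[18] | (buf[19] << 8));
    return e_machine == 0x3E;   /* EM_X86_64 */
}
```

A real parser would go on to read the section and program headers; this sketch stops at the identification step.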
Phase E lift v2 — true-local-variable recovery (25/25)
Transforms Phase D's VM-form C output (R[] + STACK[] + mem_r/mem_w dispatcher) into structured-C emit and applies stack-slot promotion per function (rbp ± offset memory accesses are lifted into named C locals). The C scoping rules eliminate cross-function frame collisions automatically.
// Phase D VM form (before lifting)
mem_w((R[5] + (uint64_t)((int64_t)(-8LL))), (uint64_t)(R[0]), 8);
R[0] = (uint64_t)(mem_r((R[5] + (uint64_t)((int64_t)(-8LL))), 8));
// Phase E lift v2 (after lifting)
int64_t var_m8 = 0; /* [rbp-8] */
var_m8 = (int64_t)(R[0]);
R[0] = (uint64_t)var_m8;
All 25 ELF samples have their lifted C output rebuilt with gcc -nostdlib and confirmed to match the original binary's stdout byte-for-byte.
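
The naming step of stack-slot promotion can be sketched as below. The `var_m8` name for `[rbp-8]` comes from the example above; the `var_p` prefix for positive displacements, the fixed `int64_t` type and both function names are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes a C local name for an [rbp+disp] slot into out and returns out.
   Negative displacements get "var_m<abs>", as in the lifted sample above;
   the "var_p<disp>" form for positive displacements is a guess. */
char *slot_name(int32_t disp, char *out, size_t cap) {
    assert(out != NULL && cap >= 16);
    if (disp < 0)
        snprintf(out, cap, "var_m%lld", -(long long)disp);
    else
        snprintf(out, cap, "var_p%lld", (long long)disp);
    assert(strlen(out) < cap);
    return out;
}

/* Emits one declaration line with the original slot recorded in a comment.
   Every slot is typed int64_t here; a real lifter would presumably pick the
   type from the access width. Returns 0 on success, -1 if out is too small. */
int emit_decl(int32_t disp, char *out, size_t cap) {
    char name[16];
    slot_name(disp, name, sizeof name);
    int n = snprintf(out, cap, "int64_t %s = 0; /* [rbp%+lld] */",
                     name, (long long)disp);
    return (n > 0 && (size_t)n < cap) ? 0 : -1;
}
```

Because each function gets its own set of declarations, two functions that both use `[rbp-8]` end up with two unrelated locals named `var_m8`, which is the scoping effect described above.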
Phase E v3 PoC — loop + natural-cond recovery (7 ELF samples operational)
Lifts the structured-C emit's goto BB_TEST; BB_BODY: body; BB_TEST: cmp; if (cond) goto BB_BODY; form (PC dispatch + BB labels) first into while (1) { test; if (!cond) break; body; } shape (Phase E v3 minimum), then rewrites the cmp + ZF/SF/OF + Jcc bit-level encoding into natural expressions (while (var <= 5) {...}) following the Jcc condition semantics (je→==, jne→!=, jl→<, jge→>=, jle→<=, jg→>). 7/7 ELF loop samples build+run PASS, 6/7 also recover natural cond (10_strlen uses a different test/jne pattern but still passes in while(1)+break form).
This phase is currently ELF-first; PE32+ extension is on the roadmap (instruction encoding is shared; CFG patterns from MinGW's gcc output differ slightly).
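
The Jcc-to-operator mapping listed above can be written as a small lookup table. This is a sketch under stated assumptions: it covers only the six signed/equality conditions the text names, and the function names and the NULL fallback (meaning "keep the while(1)+break form") are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Maps a Jcc mnemonic to the C relational operator for `cmp lhs, rhs; jcc`.
   Returns NULL for conditions this sketch does not cover. */
const char *jcc_to_c_op(const char *mnemonic) {
    static const struct { const char *jcc; const char *op; } table[] = {
        {"je", "=="}, {"jne", "!="}, {"jl", "<"},
        {"jge", ">="}, {"jle", "<="}, {"jg", ">"},
    };
    assert(mnemonic != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(mnemonic, table[i].jcc) == 0) return table[i].op;
    return NULL;
}

/* Formats a natural loop header such as "while (var <= 5)".
   Returns 0 on success, -1 if the condition is not covered or out is too small. */
int format_while(const char *lhs, const char *jcc, const char *rhs,
                 char *out, size_t cap) {
    const char *op = jcc_to_c_op(jcc);
    if (op == NULL) return -1;
    int n = snprintf(out, cap, "while (%s %s %s)", lhs, op, rhs);
    return (n > 0 && (size_t)n < cap) ? 0 : -1;
}
```

A loop whose test is `cmp var, 5` followed by `jle` back to the body therefore formats as `while (var <= 5)`, matching the 03_loop row in the table below.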
| sample | v3 minimum | v3 full | build+run |
|---|---|---|---|
| 03_loop | ✓ | ✓ (var <= 5) | SUM=15 |
| 09_array_sum | ✓ | ✓ | SUM=35 |
| 10_strlen | ✓ | − (test/jne different pattern, kept in while(1)+break form) | LEN=14 |
| 11_signed_array | ✓ | ✓ | SUM=10 |
| 12_int_index | ✓ | ✓ | SUM=150 |
| 15_stride | ✓ | ✓ | STR=35 |
| 16_matrix | ✓ | ✓ | MAT=78 |
Phase F PSDP PoC — automatic OpenMP parallelisation of reduction loops
Takes natural-form C (the kind Phase E v3 produces) and detects reduction-style loops to insert #pragma omp parallel for reduction(...) automatically. Supports 5 reduction operators (+= / *= / |= / &= / ^=) and aggregates multiple reductions into a single pragma. A 3-reduction sample (sum + product + xor simultaneously) confirms gcc -fopenmp 4-thread output matches the serial version byte-for-byte.
// input (natural C)
for (int i = 0; i < 5; i = i + 1) {
    sum += arr[i];
    prod *= arr[i];
    xorv ^= arr[i];
}
// automatic transform output (Phase F PSDP PoC)
#pragma omp parallel for reduction(+:sum) reduction(*:prod) reduction(^:xorv)
for (int i = 0; i < 5; i = i + 1) {
    sum += arr[i];
    prod *= arr[i];
    xorv ^= arr[i];
}
// → gcc -fopenmp -O2 + 4-thread run → "SUM=015 PROD=120 XOR=001"
// → byte-matches the serial version (semantics preserved from the original ELF binary)
To our knowledge, this is the only demonstration within a single project of the full pipeline (binary → equivalent C → human-readable C → parallel C). Hex-Rays / Ghidra / IDA Pro produce human-readable C without a build/run/parallelism guarantee; SlimeELF-rev verifies every stage by stdout match under build + run + parallel execution.
Currently ELF-first; PE32+ Phase F extension is roadmapped.
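
The clause-aggregation part of Phase F (several reductions folded into one pragma) can be sketched as follows. This covers only the construction of the pragma line from already-detected reductions, not the loop detection itself. The `Reduction` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One detected reduction: the compound assignment seen in the loop body
   (for example "+=") and the accumulator variable name. */
typedef struct { const char *op; const char *var; } Reduction;

/* Maps a compound-assignment operator to the OpenMP reduction identifier.
   Only the five operators named in the text are covered. */
const char *omp_reduction_op(const char *assign_op) {
    static const char *const pairs[][2] = {
        {"+=", "+"}, {"*=", "*"}, {"|=", "|"}, {"&=", "&"}, {"^=", "^"},
    };
    for (size_t i = 0; i < sizeof pairs / sizeof pairs[0]; i++)
        if (strcmp(assign_op, pairs[i][0]) == 0) return pairs[i][1];
    return NULL;
}

/* Builds a single pragma line carrying one reduction clause per entry.
   Returns 0 on success, -1 on an unsupported operator or a too-small buffer. */
int build_pragma(const Reduction *r, size_t n, char *out, size_t cap) {
    assert(r != NULL && out != NULL && n > 0);
    int used = snprintf(out, cap, "#pragma omp parallel for");
    if (used < 0 || (size_t)used >= cap) return -1;
    for (size_t i = 0; i < n; i++) {
        const char *op = omp_reduction_op(r[i].op);
        if (op == NULL) return -1;
        size_t room = cap - (size_t)used;
        int w = snprintf(out + used, room, " reduction(%s:%s)", op, r[i].var);
        if (w < 0 || (size_t)w >= room) return -1;
        used += w;
    }
    return 0;
}
```

Fed the three reductions from the sample above (sum, prod, xorv), the sketch yields the same pragma line shown in the transform output.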
Sample inventory (25 ELF binaries)
8 hand-written NASM samples (research PoC) + 17 real-world C sources built with gcc -O0 -nostdlib -static. Coverage includes array indexing (SIB byte), signed/unsigned extension (movsx / movzx / movsxd / cdq / cdqe), bit operations (shl / shr / sar / and / or), nested loops, 2D arrays, struct field offsets and recursion.
NASM-source 8 (research PoC, hand-written integer hot-loop subset)
| Sample | Description |
|---|---|
| 01 hello | Syscall write of "Hello, ELF!\n". Smallest ELF: 1 BB / 8 instructions. |
| 02 arith | 17 + 25 = 42 printed as 2 ASCII digits. Exercises idiv, add al, mov [rip+disp], etc. |
| 03 loop | loop sum_loop for 1+2+3+4+5 = 15. CFG has a back edge; structured C recovers as do { ... } while (R[1] != 0);. |
| 04 branch | cmp + jge diamond. Structured C recovers as if (cond) { ... } else { ... } meeting at a common join BB. |
| 05 compute | imul rax, rbx for 6 × 7 = 42. |
| 06 call_simple | _start → do_print. Prologue (push rbp; mov rbp, rsp) + epilogue (pop rbp; ret) recognised as function boundary and split into independent C functions. |
| 07 two_funcs | 3 functions (_start → add_two + print_dec). Two inter-procedural call graph edges. |
| 08 recursion | factorial(4) = 24 via self-recursion. Call graph carries a self-loop edge (fact → fact); the push rbx / pop rbx spill of callee-saved rbx is preserved bit-faithfully via STK[]. |
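
To show why the `loop` instruction in 03 loop maps onto `do { ... } while (R[1] != 0);`, here is a small C model. R[1] is rcx in x86_64 register numbering (consistent with R[5] standing for rbp in the lifted output elsewhere on this page). The loop body `add rax, rcx` is an assumption about the sample, which is not reproduced here, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models: mov rcx, n / sum_loop: add rax, rcx / loop sum_loop */
uint64_t sum_with_loop(uint64_t n) {
    assert(n != 0);        /* `loop` entered with rcx == 0 would wrap around */
    uint64_t R[16] = {0};  /* R[0] = rax, R[1] = rcx */
    R[1] = n;
    do {
        R[0] += R[1];      /* assumed body: add rax, rcx */
        R[1] -= 1;         /* `loop` first decrements rcx ... */
    } while (R[1] != 0);   /* ... then branches back while rcx != 0 */
    return R[0];
}
```

The test sits at the bottom because `loop` always executes the body once before checking rcx, which is exactly the do-while shape.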
gcc -O0 -nostdlib -static 17 (production target, real-world C sources)
| Sample | Description |
|---|---|
| 01 hello (gcc) | Freestanding C printing "Hello, gcc -O0!\n". 35 instructions. |
| 02 arith (gcc) | 17 + 25 = 42. True signed-division path via cqo + idiv. 69 instructions. |
| 03 loop (gcc) | For-loop computing 1+2+3+4+5 = 15. 71 instructions. |
| 04 branch (gcc) | if (x >= 5) branch. Full function epilogue including leave. 46 instructions. |
| 05 compute (gcc) | 6 × 7 = 42 with 2-op imul; multiple syscall calls. 68 instructions. |
| 06 call_simple (gcc) | _start calls static void do_print(void) then sys_exit. Three-function boundary recovery. 41 instructions. |
| 07 two_funcs (gcc) | Helpers add_two() + print_dec(). Call graph has 5 function nodes; inter-procedural argument passing via rdi is bit-faithful. 84 instructions. |
| 08 recursion (gcc) | C-written fact(4) = 24 recursion. Self-loop edge detected in the call graph. 85 instructions. |
| 09 array_sum (gcc) | arr[5] = {3,5,7,9,11} sum → SUM=35. SIB byte indexed array access. 72 instructions. |
| 10 strlen (gcc) | Hand-written strlen on "Hello, World!\n" → LEN=14. movzx + test al, al driven null-terminator loop. 81 instructions. |
| 11 signed_array (gcc) | Signed-char array {-3, 5, -8, 12, 4} sum → SUM=10. movsx and signed arithmetic. 74 instructions. |
| 12 int_index (gcc) | int i array loop + 3-digit print → SUM=150. cdqe + 32-bit op variants. 89 instructions. |
| 13 bitshift (gcc) | 32 << 3 = 256, >> 1 = 128 → VAL=128. shl / shr / sar. 86 instructions. |
| 14 bitmask (gcc) | 0xFF12 & 0xFF = 18 → RES=18. and r/m64, imm8 bit-masking. 65 instructions. |
| 15 stride (gcc) | Stride access arr[i*3] sum → STR=35. SIB indexed load. 72 instructions. |
| 16 matrix (gcc) | 3×4 matrix sum via nested loop → MAT=78. 2D array + nested CFG; SIB-form lea computes the row offset. 85 instructions. |
| 17 struct (gcc) | Array-of-struct pts[3] = {{10,20},{30,40},{50,60}} field sum → PT=210. SIB + displacement field offsets. 107 instructions. |
Function = Slot node, call graph as first-class IR (Phase B (d))
Each function becomes a SlotFunction node; call edges are first-class IR (a list of callee names per function). Self-recursion is naturally a self-loop edge. The full SlotImage encodes/decodes via deterministic JSON (Axis 7 round-trip), so call graphs and function structure can flow into external toolchains (audit DBs, SBOM, static analysis) without information loss. The Slot IR schema is unified with SlimePE-rev, enabling cross-OS audit pipelines.
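
A minimal sketch of the idea, assuming a much simplified schema: each function carries its name and an ordered list of callee names, and serialisation walks both in input order so the same input always yields the same bytes. The struct layout, the JSON keys `name` and `calls`, and the function names are all hypothetical; the real Slot IR schema is not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CALLEES 8

typedef struct {
    const char *name;
    const char *callees[MAX_CALLEES];  /* call edges, in first-seen order */
    size_t ncallees;
} SlotFunction;

/* Returns 1 if the function calls itself (self-loop edge). */
int has_self_loop(const SlotFunction *f) {
    for (size_t i = 0; i < f->ncallees; i++)
        if (strcmp(f->callees[i], f->name) == 0) return 1;
    return 0;
}

/* Appends s to out; returns 0 on success, -1 if it would not fit. */
static int put(char *out, size_t cap, size_t *used, const char *s) {
    size_t len = strlen(s);
    if (*used + len >= cap) return -1;
    memcpy(out + *used, s, len + 1);
    *used += len;
    return 0;
}

/* Serialises the call graph as compact JSON. Names are assumed to need no
   escaping (no quotes or backslashes). Returns 0 on success, -1 on overflow. */
int slot_to_json(const SlotFunction *fs, size_t n, char *out, size_t cap) {
    assert(fs != NULL && out != NULL && cap > 0);
    size_t used = 0;
    out[0] = '\0';
    if (put(out, cap, &used, "[")) return -1;
    for (size_t i = 0; i < n; i++) {
        if (i && put(out, cap, &used, ",")) return -1;
        if (put(out, cap, &used, "{\"name\":\"") ||
            put(out, cap, &used, fs[i].name) ||
            put(out, cap, &used, "\",\"calls\":[")) return -1;
        for (size_t k = 0; k < fs[i].ncallees; k++) {
            if (k && put(out, cap, &used, ",")) return -1;
            if (put(out, cap, &used, "\"") ||
                put(out, cap, &used, fs[i].callees[k]) ||
                put(out, cap, &used, "\"")) return -1;
        }
        if (put(out, cap, &used, "]}")) return -1;
    }
    return put(out, cap, &used, "]");
}
```

Serialising the same graph twice and comparing the two buffers byte-for-byte is the kind of double check Axis 7 describes.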
Audit fitness (finance / defense / medical-device)
- Bit-exact: Same ELF input → same sha256 NASM/C output. CFG, function boundaries and instruction stream are all fully deterministic.
- Native ELF round-trip: Emitted NASM re-assembled via nasm + ld; emitted C compiled via gcc -nostdlib; two real native ELFs executed and stdout compared with the original. Not simulation: real-machine verification.
- Mutation detection: Every sampled 1-bit flip in .text changed the disassembly. 25 × 5 = 125/125 detected, so tampering is immediately visible.
- Determinism: Same ELF disassembled + emitted twice → byte-equal per sample. Stable across parallel and GPU execution.
- Slot IR audit: Function = Slot node + call graph persisted as deterministic JSON. Joins SBOM / audit DB pipelines as a structured artifact.
- Build-time LLM: LLM only at decoder-rule construction time. Runtime is deterministic and rule-based, aligned with bank / defense audit requirements.
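
The mutation step behind the "Mutation detection" item can be sketched in a few lines of C. The sketch only flips a bit and compares two byte strings; in the real harness the comparison would be applied to the disassembly text of the original and the mutated .text, which is not modelled here. Function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flips one bit in place. bit_index counts from bit 0 of byte 0. */
void flip_bit(uint8_t *buf, size_t len, size_t bit_index) {
    assert(buf != NULL && bit_index < len * 8);
    buf[bit_index / 8] ^= (uint8_t)(1u << (bit_index % 8));
}

/* Returns 1 if the two byte strings differ in length or content. */
int outputs_differ(const uint8_t *a, size_t alen, const uint8_t *b, size_t blen) {
    return alen != blen || memcmp(a, b, alen) != 0;
}
```

As a concrete case, flipping bit 1 of the opcode byte 0x89 gives 0x8B, which turns `mov r/m, r` into `mov r, r/m` (both opcodes appear in the supported-instruction table), so the disassembly of that instruction necessarily changes.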
Supported instructions (shared with SlimePE-rev, ~37 patterns)
| Category | Instructions and encodings |
|---|---|
| Data movement | mov reg/mem, imm/reg (B8+r / 89 /r / 8B /r / C7 /0 / 88 /r, both 64-bit and 32-bit) / movzx r32/64, r/m8 (0F B6 /r, zero-extend) / movsx r32/64, r/m8 (0F BE /r, sign-extend) / lea r64, m (8D /r, SIB / [reg+disp] / [rip+disp] all forms) / nop (90) / leave (C9) |
| Arithmetic | add / sub r/m, r / imul r, r/m (REX.W 0F AF) / idiv r/m (F7 /7) / cqo / cdqe / cdq / movsxd r64, r/m32 |
| Logic | and / or / xor r/m64, r64 / xor reg, reg idiom recognised as zero-init |
| Bit shifts | shl / shr / sar r/m, imm8 (C1 /N) / shl / shr / sar r/m, 1 (D1 /N) |
| Compare / test | cmp r/m64, r64 / cmp r/m, imm8 (83 /7) / test r/m, r |
| Branch | Jcc rel8/32 (je / jne / jge / jg / jl / jle / jb / jae / jbe / ja / js / jns ...) / jmp rel8/32 / loop rel8 (E2) |
| Call / stack | call rel32 (E8) / ret (C3) / push/pop r64 (50-5F) / push imm |
| System (ELF) | syscall (0F 05) — sys_write (rax=1) / sys_exit (rax=60) recognised by heuristic. (PE32+ uses IAT-indirect calls; see SlimePE-rev.) |
| Memory operands | [reg] / [reg+disp8/32] / [rip+disp32] / [base+index*scale+disp] (SIB byte, scale=1/2/4/8, REX.X index extension) — full coverage of [rbp-disp] local-variable access and [rax*8+disp] array indexing. |
Next-phase additions: printf / malloc / PLT/GOT dynamic linking, SSE2 / SSE4 (XMM + floating-point), 3-op imul r64, r/m64, imm32, movabs r64, imm64.
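
The SIB memory-operand form from the table above ([base+index*scale+disp], with REX.X extending the index) can be decoded as in the following sketch. The bit layout follows the standard x86_64 encoding; the `Sib` struct and function names are hypothetical and not taken from the shared decoder.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned scale;  /* 1, 2, 4 or 8 */
    int index;       /* register number 0-15, or -1 when no index is encoded */
    int base;        /* register number 0-15, or -1 for the disp32-only form */
} Sib;

/* Decodes a SIB byte. mod is the 2-bit ModRM.mod field; rex is the REX
   prefix byte (0x40-0x4F) or 0 when no prefix is present. */
Sib decode_sib(uint8_t sib, uint8_t mod, uint8_t rex) {
    assert(mod < 3);                          /* mod == 3 has no SIB byte */
    assert(rex == 0 || (rex & 0xF0) == 0x40);
    Sib out;
    unsigned rex_x = (rex >> 1) & 1u;         /* extends the index field */
    unsigned rex_b = rex & 1u;                /* extends the base field */
    unsigned idx = ((sib >> 3) & 7u) | (rex_x << 3);
    out.scale = 1u << (sib >> 6);
    out.index = (idx == 4) ? -1 : (int)idx;   /* 100b without REX.X: no index */
    if ((sib & 7u) == 5 && mod == 0)
        out.base = -1;                        /* [index*scale + disp32] form */
    else
        out.base = (int)((sib & 7u) | (rex_b << 3));
    return out;
}

/* Computes base + index*scale + disp over a 16-entry register file. */
uint64_t effective_address(const uint64_t R[16], Sib s, int32_t disp) {
    uint64_t ea = (uint64_t)(int64_t)disp;
    if (s.base >= 0) ea += R[s.base];
    if (s.index >= 0) ea += R[s.index] * s.scale;
    return ea;
}
```

For instance, SIB byte 0x85 with mod = 01 is [rbp+rax*4+disp8], the usual shape of an indexed load from a local int array at -O0, and SIB byte 0xC5 with mod = 00 is the base-less [rax*8+disp32] form mentioned in the table.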
License model
| Item | Terms |
|---|---|
| Charged | WASM/WASI converter tool (developer side) |
| Not charged | The produced NASM / C sources (customer asset, perpetual deployment) |
| Method | Ed25519 144B signed license + 3-hop air-gap activation (finance / defense audit ready) |
| Parallelization (PSDP) | Not included. See the independent PSDP SKU under SlimeNENC. |
Related materials
- Sister (same OS, forward): SlimeASM, the HLASM + Win x64 MASM forward transpiler.
- Sister (Windows reverse): SlimePE-rev, Windows PE32+ x86_64 reverse, same Slot IR + shared decoder.
- Reverse family overview: SlimeASM-rev landing, the umbrella reverse-family page.
- Slot IR shared family: SlimeCOBOL / SlimePL/I / SlimeRPG / SlimeMUMPS share the Slot IR (Core64 + Ext32 fixed-bit).
- Patent application: JP application 2026-046620 v15b, claims 11 / 14d.
