JP application 2026-046620 (claims 11 / 14d — legacy-language coverage including native binary reverse direction)

SlimeASM-rev — Linux ELF x86_64 binary → ASM + C reverse transpiler

Recover NASM and C sources from a tag-less native ELF, bit-exact.

Convert Linux x86_64 native binaries whose source or vendor has been lost (banking, defense and embedded systems) back into both NASM Intel-syntax source and C source, bit-exact. The emitted NASM is re-assembled with nasm + ld; the emitted C is compiled with gcc -O0 -nostdlib -static. Both produce real native ELFs that, when run, reproduce the original binary's stdout byte-for-byte, which is the strictest round-trip axis we know how to write.
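The two regeneration paths described above can be sketched as a small driver, written in Python to match the s9_bench/bench.py harness. This is a minimal sketch, not the harness itself: the function names and the file-naming scheme are hypothetical, and `-f elf64` is an assumed NASM output-format flag (the text only says "nasm + ld").

```python
import subprocess

def build_commands(stem):
    """Command lines for the ASM path and the C path (file names are hypothetical)."""
    asm_path = [
        ["nasm", "-f", "elf64", f"{stem}.asm", "-o", f"{stem}.o"],  # -f elf64 is an assumption
        ["ld", f"{stem}.o", "-o", f"{stem}_asm"],
    ]
    c_path = [
        ["gcc", "-O0", "-nostdlib", "-static", f"{stem}.c", "-o", f"{stem}_c"],
    ]
    return asm_path, c_path

def stdout_of(binary):
    """Run a native binary and return its raw stdout bytes."""
    return subprocess.run([binary], capture_output=True, check=False).stdout

def round_trip_ok(original, rebuilt):
    """True when both binaries print byte-identical stdout."""
    return stdout_of(original) == stdout_of(rebuilt)
```

Comparing raw bytes rather than decoded text keeps the check strict: a trailing newline or encoding difference counts as a failure.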

  • Phase A: ELF64 minimal parser + x86_64 instruction decoder (integer hot-loop subset, ~14 patterns) + NASM intel emitter + straight-line C emitter
  • Phase B entry: CFG recovery (Cooper-Harvey-Kennedy iterative dominator + Aho/Sethi natural loop body) + structured C (do { ... } while (R[1] != 0); + if/else diamond)
  • Phase B (b): function-boundary recovery (prologue/epilogue + call/ret + self-recursion). Each function emits as its own C function, call → C call, ret → return;
  • Phase B (d): inter-procedural Slot IR — function = Slot node, call graph lifted as first-class IR edges, deterministic JSON round-trip; self-recursion expressed naturally
  • S9 bench: all 8 axes 64/64 PASS (2026-05-11), 8 samples including recursive fact(4) = 24, both ASM and C round-trips green

A reverse transpiler for “source-lost native binaries” in Linux x86_64 banking and defense systems, built on deterministic translation + 8-axis round-trip auto-regression + audit chain.
It is the SlimeNENC family's first reverse-direction product, paired with the existing forward family (COBOL / HLASM / MASM / MUMPS / PL/I / RPG / FORTRAN / Natural).

Reverse PoC / Request Materials →

Key measurements (2026-05-11)

  • 64 / 64: 8 samples × all 8 S9 axes = ALL AXES PASS (Phase A + B entry + (b) + (d))
  • 8 / 8: ASM round-trip. NASM emit → nasm + ld → run → original stdout match
  • 8 / 8: C round-trip. C emit → gcc -nostdlib -static → run → original stdout match
  • 8 / 8: Structured-C round-trip. CFG + do-while/if-else + per-function → gcc → run → match
  • 2 native paths: native binary regeneration. ASM (via NASM) + C (via GCC), both verified on real ELFs
  • function = Slot node: inter-procedural Slot IR. Call graph as first-class edge, self-recursion native

Market context — where source-lost native binaries live

Banks (Linux x86_64): During modernization projects, native ELF / .so libraries with no source and no surviving maintenance vendor. Heirloom / Astadia focus on mainframe HLASM and do not target Linux native binaries.
Defense / aerospace: Closed binaries (embedded Linux ELF / instrumentation daemons) frozen for 10-30 years. The originating vendor cannot supply source, but a C-source recovery with audit chain is required.
Embedded / medical devices: FDA / PMDA / IEC 62304 obligate "complete software description". Binary-only components must be lifted to C as auditor-reproducible documentation.
Legacy documentation: "Working but untouchable" daemons must be lifted to C so static analysis, SBOM and CVE auditing can apply.
Competitive landscape: Ghidra (NSA OSS) / IDA Pro / Hex-Rays / RetDec already exist. SlimeASM-rev's differentiator is determinism + 8-axis round-trip auto-regression: every conversion is checked against the bench harness, so "lossless" is a measured result on the sample set rather than an assertion.

S9 bench — all 8 axes (8 samples PASS)

Axis 1a dialect-detect: Tokenizer recognises ELF magic + ELFCLASS64 + EM_X86_64 (e_machine = 0x3E). 8/8 PASS.
Axis 1b opcode-recover: All 177 .text instructions across 8 samples decoded; db 0xNN fallback count = 0. 8/8 PASS.
Axis 2 mutation-detect: 1-bit flip in .text, 5 trials × 8 samples = 40 trials, 40/40 detected. Invariant: the disasm output must differ.
Axis 3 determinism: NASM emit twice, byte-equal across all 8 samples. 8/8 PASS.
Axis 4 ASM round-trip: emit NASM → nasm + ld → run → original stdout match. The strictest axis: two real native binaries (original + ours) executed and compared. 8/8 PASS, including recursive fact(4) = 24.
Axis 5 C round-trip: emit C → gcc -O0 -nostdlib -static → run → original stdout match. Straight-line PC dispatch + STK[]-modelled call/ret/push/pop is bit-faithful. 8/8 PASS.
Axis 6 structured-C round-trip: CFG-recovered structured C (do-while + if/else + per-function + call → C call + ret → return;) → gcc → run → match. 8/8 PASS.
Axis 7 Slot IR round-trip: SlotImage → JSON → SlotImage → structural equality + JSON byte-equal double check. Function = Slot node and the call graph are preserved completely. 8/8 PASS.
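The three checks named in Axis 1a (ELF magic, ELFCLASS64, EM_X86_64) can be illustrated with a few lines of Python. This is a sketch of the header test only, assuming a little-endian file as on x86_64 Linux; `is_elf64_x86_64` and the hand-built `sample` header are hypothetical and not taken from the product.

```python
import struct

EM_X86_64 = 0x3E   # e_machine value quoted in Axis 1a
ELFCLASS64 = 2     # e_ident[EI_CLASS] value for 64-bit objects

def is_elf64_x86_64(header: bytes) -> bool:
    """Check ELF magic, ELFCLASS64 and EM_X86_64 on the first 20 header bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return False
    if header[4] != ELFCLASS64:
        return False
    # e_type and e_machine (2 bytes each) follow the 16-byte e_ident array;
    # little-endian byte order is assumed here.
    _e_type, e_machine = struct.unpack_from("<HH", header, 16)
    return e_machine == EM_X86_64

# Hand-built header prefix: magic, 64-bit class, little-endian, version 1,
# 9 padding bytes, then e_type = 2 (ET_EXEC) and e_machine = 0x3E.
sample = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 0x3E)
```

A file that fails any of the three tests would be rejected before instruction decoding starts.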

Sample inventory (8 binaries, NASM-source → ELF)

01 hello: Syscall write of "Hello, ELF!\n". Smallest ELF: 1 BB / 8 instructions.
02 arith: 17 + 25 = 42 printed as 2 ASCII digits. Exercises idiv, add al, mov [rip+disp], etc.
03 loop: loop sum_loop for 1+2+3+4+5 = 15. CFG has a back edge; structured C recovers as do { ... } while (R[1] != 0);.
04 branch: cmp + jge diamond. Structured C recovers as if (cond) { ... } else { ... } meeting at a common join BB.
05 compute: imul rax, rbx for 6 × 7 = 42.
06 call_simple: _start → do_print. Prologue (push rbp; mov rbp, rsp) + epilogue (pop rbp; ret) recognised as function boundary and split into independent C functions.
07 two_funcs: 3 functions (_start → add_two + print_dec). Two inter-procedural call graph edges.
08 recursion: factorial(4) = 24 via self-recursion. Call graph carries a self-loop edge (fact → fact); push rbx / pop rbx caller-saved spill is preserved bit-faithfully via STK[].

Translation example 1 — loop recovered as structured do-while (sample 03)

The original NASM uses loop sum_loop to compute 1+2+3+4+5. The CFG carries a back edge; SlimeASM-rev recognises a natural loop body of one BB whose terminator is a `loop` instruction, and recovers structured C as do { ... } while (R[1] != 0); — no goto:

; Original NASM (sample 03, loop region)
        xor     rax, rax
        mov     rcx, 5
sum_loop:
        add     rax, rcx
        loop    sum_loop            ; dec rcx; jnz sum_loop

// SlimeASM-rev recovered C (structured)
R[0] = R[0] ^ R[0];                  // xor rax, rax
R[1] = 0x5;                          // mov rcx, 5
do {
    R[0] = R[0] + R[1];              // add rax, rcx
    /* e2 fb  loop sum_loop (terminator absorbed) */
    R[1] = R[1] - 1;
} while (R[1] != 0);

Cooper-Harvey-Kennedy iterative dominator detects the back edge; Aho/Sethi natural-loop body analysis collapses the self-loop to a single BB; the structured emitter then renders the whole loop as do-while.
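The pipeline in the paragraph above (iterative dominators, back-edge detection, natural-loop body) can be sketched on the three-block CFG of sample 03. This is a minimal Python illustration of the Cooper-Harvey-Kennedy scheme, not the product's code; the block names `entry`, `loop` and `exit` and all function names are hypothetical.

```python
def reverse_postorder(succ, entry):
    seen, order = set(), []
    def dfs(n):
        seen.add(n)
        for s in succ.get(n, []):
            if s not in seen:
                dfs(s)
        order.append(n)
    dfs(entry)
    return order[::-1]

def immediate_dominators(succ, entry):
    """Cooper-Harvey-Kennedy: iterate idom[] over reverse postorder to a fixed point."""
    rpo = reverse_postorder(succ, entry)
    index = {n: i for i, n in enumerate(rpo)}
    preds = {n: [] for n in rpo}
    for n in rpo:
        for s in succ.get(n, []):
            preds[s].append(n)
    idom = {entry: entry}

    def intersect(a, b):
        # Walk both fingers up the dominator tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for n in rpo[1:]:
            done = [p for p in preds[n] if p in idom]
            new = done[0]
            for p in done[1:]:
                new = intersect(new, p)
            if idom.get(n) != new:
                idom[n] = new
                changed = True
    return idom

def dominates(idom, a, b):
    while a != b:
        if b == idom[b]:      # reached the entry without meeting a
            return False
        b = idom[b]
    return True

def back_edges(succ, idom):
    """An edge tail -> head is a back edge when head dominates tail."""
    return [(t, h) for t in idom for h in succ.get(t, []) if dominates(idom, h, t)]

def natural_loop(succ, tail, head):
    """Body = head plus every block that reaches tail without passing through head."""
    preds = {}
    for n, ss in succ.items():
        for s in ss:
            preds.setdefault(s, []).append(n)
    body, work = {head}, [tail]
    while work:
        n = work.pop()
        if n not in body:
            body.add(n)
            work.extend(preds.get(n, []))
    return body

# Sample 03 shape: straight-line entry, a self-looping block, then the exit.
cfg = {"entry": ["loop"], "loop": ["loop", "exit"], "exit": []}
idom = immediate_dominators(cfg, "entry")
edges = back_edges(cfg, idom)
```

On this CFG the only back edge is the self-loop, and its natural-loop body is the single block `loop`, which is the precondition for emitting a one-block do-while.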

Translation example 2 — branch recovered as structured if/else (sample 04)

The original NASM uses cmp + jge; both arms reconverge at a common join BB. SlimeASM-rev recognises a conditional BB whose two successors reconverge at a single join, and recovers structured C as if (cond) { ... } else { ... }:

; Original NASM (sample 04, branch region)
        mov     rax, 7
        cmp     rax, 5
        jge     is_big
        ; small branch
        ... write "small\n" ...
        jmp     done
is_big:
        ... write "big\n" ...
done:
        ... exit ...

// SlimeASM-rev recovered C (structured, both arms join)
R[0] = 0x7;                          // mov rax, 7
{ int64_t _diff = (int64_t)R[0] - (int64_t)5;
  ZF = (_diff == 0); SF = (_diff < 0); }
OF = 0;
/* 7d 1a  jge is_big (terminator absorbed) */
if (SF == OF) {                      // jge condition
    /* big arm */
    ... write "big\n" ...
} else {
    /* small arm */
    ... write "small\n" ...
}
/* done: common join BB → exit */

The diamond is detected via intra-function dominator + post-dominator agreement; the unconditional jmp done at the end of the fall-through (small) arm is absorbed by the structured form, leaving plain C if/else with no goto.
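As a worked check of the jge condition used in the listing (SF == OF), the flag arithmetic of a 64-bit cmp can be modelled in Python. This is an illustrative sketch under two's-complement assumptions; `cmp_flags` and `jge_taken` are hypothetical names. For the operands in sample 04 (7 and 5) the subtraction does not overflow, so the constant OF = 0 shown in the listing agrees with the full definition; the overflow term matters only for operands near the int64 limits.

```python
MASK = (1 << 64) - 1

def to_signed(x):
    """Interpret a 64-bit pattern as a signed two's-complement value."""
    x &= MASK
    return x - (1 << 64) if x >> 63 else x

def cmp_flags(a, b):
    """Return (ZF, SF, OF) as a 64-bit `cmp a, b` would set them."""
    a &= MASK
    b &= MASK
    diff = (a - b) & MASK
    zf = int(diff == 0)
    sf = diff >> 63
    # Signed overflow on a - b: the operands differ in sign and the
    # result's sign differs from a's sign.
    of = ((a ^ b) & (a ^ diff)) >> 63
    return zf, sf, of

def jge_taken(a, b):
    """jge branches when SF == OF, i.e. when a >= b as signed values."""
    _zf, sf, of = cmp_flags(a, b)
    return sf == of
```

For example, `cmp_flags(7, 5)` gives ZF = 0, SF = 0, OF = 0, so the branch to is_big is taken and the "big" arm runs.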

Translation example 3 — per-function + self-recursion showcase (sample 08)

factorial(4) = 24 via self-recursion, 6 BBs in one function. The CFG is neither a clean loop nor a join-converging diamond (each branch arm terminates with its own ret), so the structured C still carries goto BB_xxxx; labels. We include it as a showcase of function-boundary recovery + call → C call + ret → return; + caller-saved spill (push rbx / pop rbx) preserved bit-faithfully via STK[] + self-recursion:

; Original NASM (sample 08, fact function)
fact:
        push    rbp
        mov     rbp, rsp
        cmp     rbx, 1
        jg      recurse
        mov     rax, 1               ; base: 1! = 1
        pop     rbp
        ret
recurse:
        push    rbx                  ; spill rbx across recursive call
        sub     rbx, 1
        call    fact                 ; rax := (rbx-1)!
        pop     rbx                  ; restore rbx
        imul    rax, rbx             ; rax := rbx * (rbx-1)!
        pop     rbp
        ret

// SlimeASM-rev recovered C (per-function, self-recursion, caller-saved spill)
static void func_401060(void) {           // fact()
  BB_401060:;
    STK[--RSP_IDX] = R[5];               // push rbp
    R[5] = R[4];                         // mov rbp, rsp
    /* cmp rbx, 1: the ZF / SF / OF updates are not shown in this excerpt */
    if (ZF == 0 && SF == OF) goto BB_401071;  // jg recurse
  BB_40106a:;
    R[0] = 1;                            // mov rax, 1
    R[5] = STK[RSP_IDX++];               // pop rbp
    return;                              // ret → C return
  BB_401071:;
    STK[--RSP_IDX] = R[3];               // push rbx
    R[3] -= 1;                           // sub rbx, 1
    func_401060();                       // call fact → C function call (self-recursion)
    R[3] = STK[RSP_IDX++];               // pop rbx
    R[0] = R[0] * R[3];                  // imul rax, rbx
    R[5] = STK[RSP_IDX++];               // pop rbp
    return;
}

Compiling this with gcc -O0 -nostdlib -static and running yields the same stdout (FACT=24) as the original ELF — Axis 6 structured-C round-trip PASS. The remaining goto labels (cond BB + each arm an independent ret) will be eliminated by the multi-BB loop / tail-duplication extensions in later Phase B work.
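The register-file and STK[] model in the listing can be mirrored in a few lines of Python to see why the spill survives recursion. This is a hand-written model, not tool output: the `Machine` class, the 64-slot stack size and the use of if/else in place of goto labels are assumptions made for illustration. Register indices follow the listing (R[0] = rax, R[3] = rbx, R[4] = rsp, R[5] = rbp).

```python
class Machine:
    """Register array plus a downward-growing STK[], as in the recovered C."""
    def __init__(self, stack_slots=64):   # stack size is an assumption
        self.R = [0] * 16
        self.STK = [0] * stack_slots
        self.rsp_idx = stack_slots

    def push(self, value):
        self.rsp_idx -= 1
        self.STK[self.rsp_idx] = value

    def pop(self):
        value = self.STK[self.rsp_idx]
        self.rsp_idx += 1
        return value

def fact(m):
    m.push(m.R[5])            # push rbp
    m.R[5] = m.R[4]           # mov rbp, rsp
    if m.R[3] > 1:            # cmp rbx, 1 / jg recurse
        m.push(m.R[3])        # push rbx (spill across the recursive call)
        m.R[3] -= 1           # sub rbx, 1
        fact(m)               # call fact
        m.R[3] = m.pop()      # pop rbx
        m.R[0] = m.R[0] * m.R[3]   # imul rax, rbx
    else:
        m.R[0] = 1            # mov rax, 1
    m.R[5] = m.pop()          # pop rbp, then ret

m = Machine()
m.R[3] = 4                    # argument passed in rbx, as in sample 08
fact(m)
```

After the call, rax holds 24, rbx is back to 4, and the stack index has returned to its starting value, which is the balance property the push/pop pairs are meant to preserve.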

Function = Slot node, call graph as first-class IR (Phase B (d))

The same SlimeNENC-family Slot IR (Core64 + Ext32 fixed-bit, claim 11) applied in reverse: each function becomes a SlotFunction node, and call edges are first-class IR (a list of callee names per function). Self-recursion is naturally a self-loop edge:

sample            functions   call edges
01-05 (linear)    1           0
06 call_simple    2           1 (_start → func_40100f)
07 two_funcs      3           2 (_start → func_401014, _start → func_401027)
08 recursion      2           2 (_start → func_401060, func_401060 → func_401060)

The full SlotImage encodes/decodes via deterministic JSON (Axis 7 round-trip), so call graphs and function structure can flow into external toolchains (audit DBs, SBOM, static analysis) without information loss.
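The Axis 7 check (structural equality plus byte-equal JSON) can be sketched with Python's standard json module. The dictionary layout below is hypothetical: the real SlotImage schema is not given in this page, so the `functions`, `name` and `calls` keys are placeholders. Only the function names and edges for sample 08 come from the table above.

```python
import json

def encode(image):
    """Deterministic JSON: sorted keys and fixed separators give stable bytes."""
    return json.dumps(image, sort_keys=True, separators=(",", ":"), ensure_ascii=True)

def decode(text):
    return json.loads(text)

def call_edges(image):
    """Flatten the per-function callee lists into (caller, callee) pairs."""
    return [(f["name"], callee) for f in image["functions"] for callee in f["calls"]]

# Sample 08: 2 functions, 2 call edges, one of them a self-loop.
image = {
    "functions": [
        {"name": "_start", "calls": ["func_401060"]},
        {"name": "func_401060", "calls": ["func_401060"]},
    ]
}

first = encode(image)
restored = decode(first)
second = encode(restored)
```

Sorting keys and pinning the separators removes the two usual sources of byte-level drift between encoders, so the second encoding can be compared to the first with a plain equality test.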

Audit fitness (finance / defense / medical-device)

  • Bit-exact: Same ELF input → same sha256 NASM/C output. CFG / function boundaries / instruction stream all fully deterministic.
  • Native ELF round-trip: Emitted NASM re-assembled via nasm + ld; emitted C compiled via gcc -nostdlib; two real native ELFs executed and stdout compared with the original. Not simulation: real-machine verification.
  • Mutation detection: 1-bit flip in .text always changes disasm. 8 samples × 5 trials = 40/40 detected, so tampering is immediately visible.
  • Determinism: Same ELF disassembled + emitted twice → byte-equal per sample. Stable across parallel and GPU execution.
  • Slot IR audit: Function = Slot node + call graph persisted as deterministic JSON. Joins SBOM / audit DB pipelines as a structured artifact.
  • Build-time LLM: LLM only at decoder-rule construction time. Runtime is deterministic and rule-based, aligned with bank / defense audit requirements.
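The determinism and mutation-detection properties in the list above reduce to two small checks, sketched here in Python. `toy_emit` is a stand-in that prints one `db 0xNN` line per byte; it is NOT the real decoder and is used only so the sketch runs on its own. The five bytes are the sample 03 loop body (the standard encoding of add rax, rcx followed by the e2 fb shown in the listing).

```python
import hashlib

def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

def toy_emit(text: bytes) -> str:
    """Stand-in emitter (hypothetical): one `db 0xNN` line per byte."""
    return "\n".join(f"db 0x{b:02x}" for b in text) + "\n"

def digest(listing: str) -> str:
    return hashlib.sha256(listing.encode("ascii")).hexdigest()

def mutation_detected(text: bytes, bit: int) -> bool:
    """A trial passes when the mutated listing hashes differently."""
    return digest(toy_emit(flip_bit(text, bit))) != digest(toy_emit(text))

text = bytes.fromhex("4801c8e2fb")   # add rax, rcx ; loop sum_loop
deterministic = digest(toy_emit(text)) == digest(toy_emit(text))
all_detected = all(mutation_detected(text, bit) for bit in range(len(text) * 8))
```

With a real decoder in place of `toy_emit`, the same two predicates would express the determinism and mutation-detection axes for one sample.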

Supported instructions (Phase A, integer hot-loop subset, ~14 patterns)

Data movement: mov reg, imm/reg (B8+r / 89 /r / C7 /0 / 88 /r) / lea reg, [rip+disp32] (8D /r mod=00 r/m=101)
Arithmetic: add r/m64, r64 (01 /r) / add al, imm8 (04) / add r/m8, imm8 (80 /0) / sub r/m64, imm8 (83 /5) / imul r64, r/m64 (REX.W 0F AF) / idiv r/m64 (F7 /7)
Logic: xor r/m64, r64 (31 /r; xor reg, reg idiom recognised as zero-init)
Compare: cmp r/m64, r64 (39 /r) / cmp r/m64, imm8 (83 /7)
Branch: Jcc rel8 (70-7F: je/jne/jge/jg/jl/jle/...) / Jcc rel32 (0F 80-8F) / jmp rel8/32 (EB/E9) / loop rel8 (E2)
Call/stack: call rel32 (E8) / ret (C3) / push/pop r64 (50-5F) / push imm (6A/68)
System: syscall (0F 05); sys_write (rax=1) / sys_exit (rax=60) recognised by heuristic
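To make a few rows of the table concrete, here is a toy Python decoder covering only register-to-register add (REX.W 01 /r with mod = 11), loop rel8, push/pop of the low eight registers, ret and syscall. It is a sketch of the table-driven idea, not the product's decoder; the function name, the output format and the base address 0x401000 in the usage note are hypothetical.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def decode(code: bytes, base: int = 0):
    """Decode a tiny subset of the table; unknown bytes fall back to `db`."""
    out, i = [], 0
    while i < len(code):
        b = code[i]
        if b == 0x48 and i + 2 < len(code) and code[i + 1] == 0x01 and code[i + 2] >> 6 == 3:
            modrm = code[i + 2]            # mod=11: r/m is the destination, reg the source
            out.append(f"add {REG64[modrm & 7]}, {REG64[(modrm >> 3) & 7]}")
            i += 3
        elif b == 0xE2 and i + 1 < len(code):
            rel = code[i + 1] - 256 if code[i + 1] >= 128 else code[i + 1]
            out.append(f"loop 0x{base + i + 2 + rel:x}")   # target = next ip + rel8
            i += 2
        elif 0x50 <= b <= 0x57:
            out.append(f"push {REG64[b - 0x50]}")
            i += 1
        elif 0x58 <= b <= 0x5F:
            out.append(f"pop {REG64[b - 0x58]}")
            i += 1
        elif b == 0xC3:
            out.append("ret")
            i += 1
        elif code[i:i + 2] == b"\x0f\x05":
            out.append("syscall")
            i += 2
        else:
            out.append(f"db 0x{b:02x}")    # fallback, counted by Axis 1b
            i += 1
    return out
```

Calling `decode(bytes.fromhex("4801c8e2fb"), base=0x401000)` yields `add rax, rcx` followed by a `loop` whose target is the address of the add itself, matching the self-loop in sample 03.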

Phase B onward will extend coverage to multi-BB loops / loop nest trees, libc-linked binaries (printf / malloc, PLT/GOT dynamic linking) and SSE2 / SSE4 (XMM registers). A 30+ sample bench from `gcc -O0` C builds (rather than hand-written NASM) will become the production regression target.

License model

Charged: WASM/WASI converter tool (developer side)
Not charged: The produced NASM / C sources (customer asset, perpetual deployment)
Method: Ed25519 144B signed license + 3-hop air-gap activation (finance / defense audit ready)
Parallelization (PSDP): Not included. See the independent PSDP SKU under SlimeNENC.

Related materials

  • Technical overview: SlimeNENC Technical Overview (reverse-direction ASM/C chapter being prepared)
  • Patent application: JP application 2026-046620 v15b, claims 11 / 14d, target legacy-language dialect handling for COBOL / MUMPS / PL/I / RPG / assembler / native binary reverse direction.
  • Sister product (forward pair): SlimeASM (HLASM + Win x64 MASM forward). Together they cover both directions of the native-code surface.
  • Sister products (Slot IR shared): SlimeCOBOL / SlimePL/I / SlimeRPG / SlimeMUMPS share the Slot IR (Core64 + Ext32 fixed-bit).
  • Benchmarks: S9 bench harness (8 axes of correctness), s9_bench/bench.py auto-regresses 8 samples × 8 axes = 64/64.
