Per-line SHA-256 (FFI overhead)
Compute a 32-byte content digest of Frankenstein (Project Gutenberg #84, ~7,700 lines) by hashing each line with the runner's bundled SHA-256 primitive and XOR-folding the digests into a running accumulator. Measures interpreter throughput and per-call native FFI overhead in equal measure.
Runtime · median per inner-loop window
Full statistics
| Runner | N | Compile | Runtime | P95 | Stddev | RSS | vs piko | Status |
|---|---|---|---|---|---|---|---|---|
| Native Go (compiled) | 10 | 181 ms | 2.08 ms | 2.09 ms | 4.21 µs | 69 MiB | 0.05× | OK |
| Piko interp (bytecode VM) | 10 | 1.71 ms | 39.3 ms | 39.8 ms | 261 µs | 117 MiB | 1.00× | OK |
| CPython 3.13 (bytecode VM) | 10 | 276 µs | 57.3 ms | 58.2 ms | 1.07 ms | n/a | 1.46× | OK |
| PyPy 7.3 (tracing JIT) | 10 | 233 µs | 31.4 ms | 32.7 ms | 666 µs | n/a | 0.80× | OK |
| tengo (bytecode VM) | 10 | 200 µs | 87.7 ms | 108 ms | 10.3 ms | 428 MiB | 2.23× | OK |
| scriggo (bytecode VM) | 10 | 288 µs | 82.6 ms | 106 ms | 7.31 ms | 407 MiB | 2.10× | OK |
| mvm (bytecode VM) | 10 | 251 µs | 124 ms | 141 ms | 5.19 ms | 62 MiB | 3.16× | OK |
| yaegi (AST walker) | 10 | 286 µs | 55.9 ms | 74.1 ms | 8.26 ms | 66 MiB | 1.42× | OK |
Workload & symmetry rules
Workload
Walk Frankenstein's text (~441 KiB, ~7,700 lines) byte-by-byte to find newline boundaries. For each line, call the runner's bundled SHA-256 primitive once. XOR-fold every 32-byte digest into a running accumulator. Emit the accumulator as 64 lowercase hex characters.
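The workload can be sketched in Python as follows. This is a minimal illustration, not the benchmark's actual `cpython.py`: the function name `fold_line_digests` is hypothetical, and two details are assumptions the description does not settle (the newline byte is excluded from the hashed line, and a final line without a trailing newline is still hashed).

```python
import hashlib


def fold_line_digests(corpus: bytes) -> bytearray:
    """Hash every newline-delimited line and XOR-fold the 32-byte digests."""
    acc = bytearray(32)          # running accumulator, starts as all zeros
    start = 0                    # index of the first byte of the current line
    n = len(corpus)
    i = 0
    while i < n:                 # byte-by-byte scan, no str.splitlines
        if corpus[i] == 0x0A:    # b"\n" marks a line boundary
            # One native SHA-256 call per line (assumption: newline excluded).
            digest = hashlib.sha256(corpus[start:i]).digest()
            for j in range(32):  # XOR fold runs in interpreted code
                acc[j] ^= digest[j]
            start = i + 1
        i += 1
    if start < n:                # assumption: hash an unterminated last line
        digest = hashlib.sha256(corpus[start:n]).digest()
        for j in range(32):
            acc[j] ^= digest[j]
    return acc
```

Because XOR is its own inverse, two identical lines cancel each other in the accumulator, which makes the fold order-independent but also insensitive to duplicated lines.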
Symmetry rules
- One SHA-256 call per line (~7,700 native FFI dispatches per inner iteration). Whole-buffer hashing (`sha256.Sum256(corpus)` / `hashlib.sha256(corpus).hexdigest()`) is banned because it would do the work in a single native call, hiding the per-call signal.
- Line scan and XOR fold run in interpreted code on every runner. No `strings.Split`, no `str.splitlines`.
- Hand-rolled hex emission (no `encoding/hex`, no `binascii.hexlify`) so the trailing format step also runs in the interpreter.
- Tengo's stdlib does not bundle SHA-256, so the harness registers a `sha256_sum` builtin that wraps Go's `crypto/sha256.Sum256`. Same shape of "register a host function as a native callout" that every other runner uses, just via Tengo's `UserFunction` surface.
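As an illustration of the hand-rolled hex rule, a Python version might look like the sketch below. The names `HEX_DIGITS` and `to_hex` are hypothetical and the benchmark's own script may build the string differently; the point is only that each nibble is looked up in interpreted code rather than delegated to `binascii.hexlify`.

```python
HEX_DIGITS = "0123456789abcdef"


def to_hex(data: bytes) -> str:
    """Emit two lowercase hex characters per byte without binascii or bytes.hex."""
    out = []
    for b in data:
        out.append(HEX_DIGITS[b >> 4])     # high nibble
        out.append(HEX_DIGITS[b & 0x0F])   # low nibble
    # The final join is a single native call; the per-byte work stays interpreted.
    return "".join(out)
```

Applied to the 32-byte accumulator, this yields the 64-character output the workload specifies.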
Why this benchmark exists
This is the only benchmark that exercises the script-orchestrates-compiled-primitive pattern, which is how most production embedded interpreters are actually used (NumPy from Python, engine bindings from Lua, host functions from Tengo). Per-call FFI dispatch overhead is what discriminates runners: with ~7,700 calls per inner iteration, a 1 µs-per-call difference adds up to ~8 ms of wall time, visible against a ~100 ms baseline. Every runner uses the same SHA-256 algorithm routed through the same kind of compression primitive (OpenSSL on Python, Go's crypto/sha256 with ASM where available on the Go family), so the per-block hash cost is comparable across the board.
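The back-of-envelope estimate above can be written out as a small calculation. The helper name `ffi_overhead_ms` is hypothetical, and the inputs are the rounded figures from the text (7,700 calls, a 1 µs per-call difference, a 100 ms baseline), not measured values.

```python
def ffi_overhead_ms(calls: int, per_call_delta_us: float) -> float:
    """Wall-time difference in milliseconds caused by a per-call overhead gap."""
    return calls * per_call_delta_us / 1000.0


calls_per_iteration = 7_700   # approximate line count of the corpus
delta_ms = ffi_overhead_ms(calls_per_iteration, 1.0)
baseline_ms = 100.0           # rough baseline quoted in the text
share = delta_ms / baseline_ms

print(f"{delta_ms:.1f} ms extra, {share:.1%} of the baseline")
```

Under these assumptions the gap is about 7.7 ms, a bit under 8% of the baseline, which is large enough to show up in the runtime column.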
Source code
- piko / Go: `piko_source.go`
- native Go: `native_main.go`
- CPython / PyPy: `cpython.py`
- tengo: `script.tengo`