Parallel word count (Monte Cristo)
Parallel word-frequency count over ~2.7 MB of Project Gutenberg text. Each runner uses its idiomatic CPU-parallel primitive.
Runtime · median per inner-loop window
Full statistics
| Runner | N | Compile | Runtime | P95 | Stddev | RSS | vs piko | Status |
|---|---|---|---|---|---|---|---|---|
| Native Go (compiled) | 10 | 181 ms | 19.0 ms | 21.2 ms | 1.35 ms | 72 MiB | 0.18× | OK |
| Piko interp (bytecode VM) | 10 | 4.11 ms | 108 ms | 110 ms | 2.48 ms | 418 MiB | 1.00× | OK |
| CPython 3.13 (bytecode VM) | 10 | 671 µs | 181 ms | 184 ms | 1.70 ms | n/a | 1.68× | OK |
| PyPy 7.3 (tracing JIT) | 10 | 540 µs | 353 ms | 359 ms | 5.10 ms | n/a | 3.28× | OK |
| tengo (bytecode VM) | 0 | n/a | n/a | n/a | n/a | n/a | n/a | unsupported |
| scriggo (bytecode VM) | 0 | n/a | n/a | n/a | n/a | n/a | n/a | unsupported |
| mvm (bytecode VM) | 10 | 679 µs | 302 ms | 307 ms | 4.57 ms | 93 MiB | 2.80× | OK |
| yaegi (AST walker) | 0 | n/a | n/a | n/a | n/a | n/a | n/a | unsupported |
Workload & symmetry rules
Workload
- Load The Count of Monte Cristo (Project Gutenberg #1184) corpus from disk.
- Split into 16 word-aligned chunks.
- Spawn 16 workers (`goroutine` + `WaitGroup` on Go/piko; `multiprocessing.Pool` on Python).
- Each worker tokenises via byte-walk and builds a local `word → count` map.
- Merge maps, compute top-50 by `(count desc, word asc)` via hand-rolled insertion-sort.
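The chunk, count, and merge steps above can be sketched in Go as follows. This is a minimal illustration and not the benchmark's actual source: the function names (`splitChunks`, `countWords`, `parallelCount`), the ASCII-letters-only definition of a word, and the lowercasing of tokens are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// isWordByte is an assumed word definition: ASCII letters only.
func isWordByte(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// splitChunks cuts data into at most n pieces. Each nominal cut point is
// moved forward until it no longer falls inside a word, so no word is
// split across two chunks.
func splitChunks(data []byte, n int) [][]byte {
	chunks := make([][]byte, 0, n)
	start := 0
	for i := 1; i <= n && start < len(data); i++ {
		end := len(data) * i / n
		if end < start {
			end = start
		}
		for end < len(data) && isWordByte(data[end]) {
			end++
		}
		if end > start {
			chunks = append(chunks, data[start:end])
			start = end
		}
	}
	return chunks
}

// countWords walks the chunk byte by byte and builds a local word → count map.
// Lowercasing is an assumption of this sketch.
func countWords(chunk []byte) map[string]int {
	counts := make(map[string]int)
	word := make([]byte, 0, 32)
	for _, b := range chunk {
		if isWordByte(b) {
			if b >= 'A' && b <= 'Z' {
				b += 'a' - 'A'
			}
			word = append(word, b)
			continue
		}
		if len(word) > 0 {
			counts[string(word)]++
			word = word[:0]
		}
	}
	if len(word) > 0 {
		counts[string(word)]++
	}
	return counts
}

// parallelCount runs one goroutine per chunk, waits on a WaitGroup,
// then merges the local maps on the calling goroutine.
func parallelCount(data []byte, workers int) map[string]int {
	chunks := splitChunks(data, workers)
	locals := make([]map[string]int, len(chunks))
	var wg sync.WaitGroup
	for i, chunk := range chunks {
		wg.Add(1)
		go func(i int, chunk []byte) {
			defer wg.Done()
			locals[i] = countWords(chunk) // each worker writes only its own slot
		}(i, chunk)
	}
	wg.Wait()
	merged := make(map[string]int)
	for _, local := range locals {
		for w, c := range local {
			merged[w] += c
		}
	}
	return merged
}

func main() {
	text := []byte("the count of monte cristo; the Count, the prisoner")
	counts := parallelCount(text, 4)
	fmt.Println(counts["the"], counts["count"], counts["prisoner"]) // prints "3 2 1"
}
```

Because every worker fills its own map and its own slice slot, the sketch needs no mutex; the only synchronisation point is `wg.Wait()` before the merge.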
Symmetry rules
- Each runner uses its idiomatic CPU-parallel primitive. For Python that is `multiprocessing.Pool` (not `threading`, because of the GIL).
- No `str.split`, no `regexp`, no `sort.*` / `sorted(key=...)`: tokenisation and ranking are hand-rolled.
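As an illustration of ranking without `sort.*`, the following Go sketch selects the top k entries by `(count desc, word asc)` with a bounded insertion sort. It is a sketch under assumptions, not the benchmark's code: the `entry` type, the `less` and `topK` names, and the choice to keep only k sorted slots (rather than sorting the full vocabulary) are hypothetical.

```go
package main

import "fmt"

// entry pairs a word with its merged count (hypothetical type for this sketch).
type entry struct {
	word  string
	count int
}

// less reports whether a ranks before b: higher count first, ties broken
// by ascending word.
func less(a, b entry) bool {
	if a.count != b.count {
		return a.count > b.count
	}
	return a.word < b.word
}

// topK keeps a sorted slice of at most k entries. Each candidate that beats
// the current last entry is placed in the last slot and shifted left into
// position, which is an insertion sort bounded to k slots.
func topK(counts map[string]int, k int) []entry {
	if k <= 0 {
		return nil
	}
	top := make([]entry, 0, k)
	for w, c := range counts {
		e := entry{word: w, count: c}
		if len(top) == k && !less(e, top[k-1]) {
			continue // does not make the cut
		}
		if len(top) < k {
			top = append(top, e)
		} else {
			top[k-1] = e // evict the lowest-ranked entry
		}
		for i := len(top) - 1; i > 0 && less(top[i], top[i-1]); i-- {
			top[i], top[i-1] = top[i-1], top[i]
		}
	}
	return top
}

func main() {
	counts := map[string]int{"the": 5, "of": 3, "and": 3, "count": 1}
	for _, e := range topK(counts, 3) {
		fmt.Println(e.word, e.count)
	}
}
```

Since words are unique map keys, the ordering is total, so the result does not depend on Go's randomised map iteration order.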
Why this benchmark exists
This benchmark directly measures the GIL-free advantage. Piko inherits Go's goroutine scheduler, whereas CPython has to fork worker processes to use multiple cores.
Source code
- piko / Go: `piko_source.go`
- native Go: `native_main.go`
- CPython / PyPy: `cpython.py`