Gzip storage transformer
Stream transformer that gzips byte streams on the way to the storage provider and decompresses them on the way back, with constant memory usage.
Overview
Gzip is the most universally supported compression format. Every language, tool, browser, and storage system reads it. The transformer compresses uploads through klauspost/compress/gzip, a faster drop-in replacement for the standard library, at a configurable level, and reverses on downloads. It streams with constant memory. The source reader feeds the compressor and the consumer pulls the compressed result without buffering the whole object.
The transformer implements StreamTransformerPort directly. One RegisterTransformer call plugs it into the storage provider chain, with no glue code. It composes with other transformers through an integer priority, so chaining compress-then-encrypt is declarative ordering, not a hand-written pipeline.
Levels follow the standard gzip semantics: gzip.NoCompression (0), gzip.BestSpeed (1) through gzip.BestCompression (9), and gzip.DefaultCompression (-1, the default). The Reverse path caps decompressed output to guard against decompression bombs on untrusted downloads. See Decompression cap.
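The streaming behaviour described in this overview can be sketched with an io.Pipe. This is a minimal illustration, not the transformer's actual implementation: it uses the standard library compress/gzip as a stand-in for klauspost/compress/gzip, and compressStream, decompressStream, and roundTrip are hypothetical helper names.

```go
package main

import (
	"bytes"
	"compress/gzip" // stdlib stand-in for klauspost/compress/gzip in this sketch
	"fmt"
	"io"
	"strings"
)

// compressStream returns a reader that yields the gzip-compressed form of src.
// Compression runs in a goroutine as the consumer reads, so memory use is
// bounded by the compressor's buffers rather than by the object size.
func compressStream(src io.Reader, level int) (io.ReadCloser, error) {
	// Validate the level before starting the goroutine.
	if _, err := gzip.NewWriterLevel(io.Discard, level); err != nil {
		return nil, err
	}
	pr, pw := io.Pipe()
	go func() {
		zw, _ := gzip.NewWriterLevel(pw, level) // level was validated above
		_, err := io.Copy(zw, src)
		if cerr := zw.Close(); err == nil {
			err = cerr
		}
		// A nil error makes the read side see io.EOF.
		pw.CloseWithError(err)
	}()
	return pr, nil
}

// decompressStream wraps src in a gzip reader that inflates on demand.
func decompressStream(src io.Reader) (io.ReadCloser, error) {
	zr, err := gzip.NewReader(src)
	if err != nil {
		return nil, err
	}
	return zr, nil
}

// roundTrip compresses and then decompresses input, returning the result.
func roundTrip(input string, level int) (string, error) {
	compressed, err := compressStream(strings.NewReader(input), level)
	if err != nil {
		return "", err
	}
	defer compressed.Close()

	plain, err := decompressStream(compressed)
	if err != nil {
		return "", err
	}
	defer plain.Close()

	var out bytes.Buffer
	if _, err := io.Copy(&out, plain); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	got, err := roundTrip(strings.Repeat("piko ", 1000), gzip.BestSpeed)
	fmt.Println(len(got), err)
}
```

The pipe keeps producer and consumer in lockstep: the compressor only advances when the consumer reads, which is what keeps memory use flat regardless of object size.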
Configuration
import (
	"github.com/klauspost/compress/gzip"
	"piko.sh/piko/wdk/storage/storage_transformer_gzip"
)

transformer, err := storage_transformer_gzip.NewGzipTransformer(storage_transformer_gzip.Config{
	Name:                 "gzip",                  // optional, default "gzip"
	Priority:             100,                     // optional, default 100
	Level:                gzip.DefaultCompression, // optional, default DefaultCompression
	MaxDecompressedBytes: 256 * 1024 * 1024,       // optional, default 256 MiB
})
if err != nil {
	return err
}
NewGzipTransformer returns an error when Level is outside the valid range. storage_transformer_gzip.DefaultConfig() returns a Config populated with the same defaults, so you can start from it and override a single field.
Config fields:
- Name. The transformer identifier in the registry. Defaults to gzip.
- Priority. The position in the transform chain. Lower values run first on writes. Defaults to 100.
- Level. The gzip compression level. A zero value maps to gzip.DefaultCompression.
- MaxDecompressedBytes. The cap on bytes produced by Reverse. A zero value uses the 256 MiB default. A negative value disables the cap.
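The defaulting rules in the list above can be sketched as a small resolver. This is an illustration under stated assumptions, not the package's code: the config type and resolve function are hypothetical, the assumption that a zero Priority falls back to 100 is inferred from the "optional, default 100" comment, and the accepted level range is assumed to match the standard library (gzip.HuffmanOnly through gzip.BestCompression).

```go
package main

import (
	"compress/gzip" // stdlib constants; assumed to match klauspost/compress/gzip
	"errors"
	"fmt"
)

// defaultMaxDecompressedBytes is the 256 MiB default stated in the docs.
const defaultMaxDecompressedBytes int64 = 256 * 1024 * 1024

// config is a hypothetical mirror of storage_transformer_gzip.Config.
type config struct {
	Name                 string
	Priority             int
	Level                int
	MaxDecompressedBytes int64
}

// resolve fills zero-valued fields with the documented defaults and rejects
// levels outside the assumed valid range.
func resolve(c config) (config, error) {
	if c.Name == "" {
		c.Name = "gzip"
	}
	if c.Priority == 0 { // assumption: zero means "use the default"
		c.Priority = 100
	}
	if c.Level == 0 {
		c.Level = gzip.DefaultCompression
	}
	if c.Level < gzip.HuffmanOnly || c.Level > gzip.BestCompression {
		return config{}, errors.New("gzip: compression level out of range")
	}
	if c.MaxDecompressedBytes == 0 {
		c.MaxDecompressedBytes = defaultMaxDecompressedBytes
	}
	// A negative MaxDecompressedBytes is kept as-is and means "no cap".
	return c, nil
}

func main() {
	c, err := resolve(config{})
	fmt.Println(c, err)
}
```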
WithMaxDecompressedBytes(maxBytes int64) sets the same cap as a construction option:
transformer, err := storage_transformer_gzip.NewGzipTransformer(
	storage_transformer_gzip.DefaultConfig(),
	storage_transformer_gzip.WithMaxDecompressedBytes(64*1024*1024), // 64 MiB cap
)
if err != nil {
	return err
}
Bootstrap
Gzip registers against the storage service transformer registry, not through piko.With*. RegisterTransformer is the only supported wiring. There is no NewService option for transformers. Register after building the service:
if err := service.RegisterTransformer(ctx, transformer); err != nil {
	return err
}
Use a transformer on writes and reads
PutObject takes a provider name and a *storage.PutParams and returns only an error. Set TransformConfig.EnabledTransformers to the transformer names that apply to the call. The TransformerOptions map keys on the transformer name and overrides the level for that one operation:
params := &storage.PutParams{
	Key:    "report.json",
	Reader: r,
	TransformConfig: &storage.TransformConfig{
		EnabledTransformers: []string{"gzip"},
		TransformerOptions: map[string]any{
			"gzip": map[string]any{"level": 9}, // per-call level override
		},
	},
}
if err := service.PutObject(ctx, providerName, params); err != nil {
	return err
}
GetObject mirrors the write path. Pass the same EnabledTransformers so the chain reverses the gzip step on the way back:
reader, err := service.GetObject(ctx, providerName, storage.GetParams{
	Key: "report.json",
	TransformConfig: &storage.TransformConfig{
		EnabledTransformers: []string{"gzip"},
	},
})
if err != nil {
	return err
}
defer func() { _ = reader.Close() }()
A custom ProviderPort exposes the same step through Put(ctx, *storage.PutParams), which also returns only an error.
Ordering with encryption
The transform chain sorts by Priority(). It runs ascending on writes and reverses for reads, so a lower priority compresses first and decompresses last. To chain compress-then-encrypt, gzip must run before the crypto transformer on writes.
The priority numbers involved come from different sources, and they conflict. The gzip default is 100. The crypto package recommends 250 for encryption. The framework bootstrap, however, auto-registers the crypto transformer at priority 100, the same number as the gzip default. The chain sorts with slices.SortFunc, which is not a stable sort, so a tie between gzip at 100 and crypto at 100 leaves their relative order undefined. The default deployment therefore does not guarantee compress-then-encrypt.
Set gzip to a priority below 100 so it always compresses before encryption and decompresses after decryption:
transformer, err := storage_transformer_gzip.NewGzipTransformer(storage_transformer_gzip.Config{
	Priority: 50, // below the crypto transformer's 100, so gzip runs first on writes
})
if err != nil {
	return err
}
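The ordering rule in this section can be sketched as follows. The step type and the writeOrder and readOrder helpers are hypothetical stand-ins for the registry's internals. Only the use of slices.SortFunc and the ascending-on-write, reversed-on-read behaviour come from the text above. The sketch needs Go 1.21 or later for the slices and cmp packages.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// step is a hypothetical stand-in for a registered transformer.
type step struct {
	name     string
	priority int
}

// writeOrder returns transformer names sorted ascending by priority.
// slices.SortFunc is not guaranteed to be stable, so steps with equal
// priority may come out in either order.
func writeOrder(steps []step) []string {
	sorted := slices.Clone(steps)
	slices.SortFunc(sorted, func(a, b step) int {
		return cmp.Compare(a.priority, b.priority)
	})
	names := make([]string, 0, len(sorted))
	for _, s := range sorted {
		names = append(names, s.name)
	}
	return names
}

// readOrder is the write order reversed, so the last step applied on a
// write is the first one undone on a read.
func readOrder(steps []step) []string {
	order := writeOrder(steps)
	slices.Reverse(order)
	return order
}

func main() {
	steps := []step{
		{name: "crypto", priority: 100},
		{name: "gzip", priority: 50},
	}
	fmt.Println("write:", writeOrder(steps))
	fmt.Println("read: ", readOrder(steps))
}
```

With distinct priorities the result is deterministic: gzip runs before crypto on writes and after it on reads. Giving both steps priority 100 would make the order depend on the sort implementation, which is the tie this section warns about.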
Decompression cap
Reverse wraps the gzip reader so reads beyond MaxDecompressedBytes surface ErrDecompressedTooLarge instead of inflating without bound. The default cap is 256 MiB. This protects callers that buffer a download against a small upload that expands to gigabytes. Use errors.Is(err, storage_transformer_gzip.ErrDecompressedTooLarge) to distinguish the cap from a normal end of stream. Lower the cap for stricter limits on untrusted objects, or set a negative value to disable it for fully trusted input.
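One way such a cap could work is a reader wrapper that counts bytes and fails once the limit is exceeded. The sketch below is an assumption about the general technique, not the package's implementation: cappedReader, inflate, deflate, and errDecompressedTooLarge are hypothetical names, and the standard library compress/gzip stands in for klauspost/compress/gzip.

```go
package main

import (
	"bytes"
	"compress/gzip" // stdlib stand-in for klauspost/compress/gzip in this sketch
	"errors"
	"fmt"
	"io"
)

// errDecompressedTooLarge is a hypothetical stand-in for
// storage_transformer_gzip.ErrDecompressedTooLarge.
var errDecompressedTooLarge = errors.New("decompressed output exceeds cap")

// cappedReader passes through at most `remaining` bytes and then reports
// errDecompressedTooLarge if the underlying reader still has data.
type cappedReader struct {
	r         io.Reader
	remaining int64
}

func (c *cappedReader) Read(p []byte) (int, error) {
	if c.remaining <= 0 {
		// Budget is spent: probe one byte to tell "exactly at the cap"
		// (clean EOF) apart from "over the cap".
		var probe [1]byte
		n, err := c.r.Read(probe[:])
		if n > 0 {
			return 0, errDecompressedTooLarge
		}
		return 0, err
	}
	if int64(len(p)) > c.remaining {
		p = p[:c.remaining]
	}
	n, err := c.r.Read(p)
	c.remaining -= int64(n)
	return n, err
}

// inflate decompresses data, enforcing limit unless it is negative.
func inflate(data []byte, limit int64) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()

	var src io.Reader = zr
	if limit >= 0 {
		src = &cappedReader{r: zr, remaining: limit}
	}
	return io.ReadAll(src)
}

// deflate gzips plain in memory; writes to a bytes.Buffer cannot fail.
func deflate(plain []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	_, _ = zw.Write(plain)
	_ = zw.Close()
	return buf.Bytes()
}

func main() {
	bomb := deflate(make([]byte, 1<<20)) // 1 MiB of zeros compresses to very little
	_, err := inflate(bomb, 1024)
	fmt.Println(errors.Is(err, errDecompressedTooLarge))
}
```

The probe read matters at the boundary: an object that inflates to exactly the cap ends with a normal io.EOF, while anything larger surfaces the sentinel error that callers can match with errors.Is.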
Observability
The package emits OpenTelemetry metrics under storage.transformer.gzip.*: operation duration, an operations counter, an errors counter, bytes processed, and a compression ratio histogram. They surface throughput and effectiveness without extra instrumentation.
Tradeoffs
Gzip is generally slower and produces larger output than zstd at comparable settings. The reasons to pick gzip over zstd are interoperability, where third parties read your objects directly, and ecosystem maturity, where every audit tool already speaks gzip. Reach for the Zstandard transformer when you control both ends of the pipe and want a better ratio at similar speed. Reach for the crypto transformer when at-rest encryption matters more than size.
See also
Other storage transformers:
- Zstd transformer, modern compression with better ratio at similar speed.
- Crypto transformer, at-rest encryption; chain after compression for compress-then-encrypt.
Storage providers:
- Amazon S3, GCS, Cloudflare R2, Disk. Any provider works; the transformer is provider-agnostic.
Framework docs:
- How to handle file storage, storage pipeline overview including transformers.
- Storage API reference, StreamTransformerPort, RegisterTransformer, transformer priorities.
External:
- klauspost/compress/gzip, the underlying compression library.
- RFC 1952: gzip file format, protocol specification.