Hungarian language pack

Hungarian bundle for Piko's linguistics service: a Snowball stemmer, a Hungarian phonetic encoder, and a stop-word list.

Overview

The Hungarian pack registers three adapters under the language code hungarian. The stemmer uses kljensen/snowball configured for Hungarian. Stemming matters here because Hungarian is agglutinative: a single root can carry a long chain of suffixes, and search needs to match all of those forms. The phonetic encoder applies Hungarian-specific rules for digraphs and trigraphs (for example cs, gy, sz, dzs) and produces a code capped at six characters by default. The stop-word provider holds 127 entries covering pronouns, case-suffix particles, postpositions, conjunctions, copula forms, demonstratives, question words, and quantifiers.
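The digraph and trigraph handling can be illustrated with a greedy longest-match segmenter. This is a minimal sketch and not the pack's actual encoder: the names segment and capGraphemes are hypothetical, the list is the standard set of Hungarian multi-letter consonants, and the cap of six is applied to graphemes here purely to show the idea of a bounded output.

```go
package main

import (
	"fmt"
	"strings"
)

// multigraphs lists the Hungarian multi-letter consonants. The trigraph comes
// first so that "dzs" wins over "dz". All entries are ASCII, so len(m) is
// also the number of runes in m.
var multigraphs = []string{"dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs"}

// segment splits a word into graphemes using greedy longest match.
// Hypothetical helper for illustration only.
func segment(word string) []string {
	runes := []rune(strings.ToLower(word))
	var out []string
	for i := 0; i < len(runes); {
		matched := false
		for _, m := range multigraphs {
			n := len(m)
			if i+n <= len(runes) && string(runes[i:i+n]) == m {
				out = append(out, m)
				i += n
				matched = true
				break
			}
		}
		if !matched {
			out = append(out, string(runes[i]))
			i++
		}
	}
	return out
}

// capGraphemes keeps at most max graphemes, mirroring the idea of a
// length-capped code. The real encoder's cap applies to its own output.
func capGraphemes(gs []string, max int) []string {
	if len(gs) > max {
		return gs[:max]
	}
	return gs
}

func main() {
	fmt.Println(segment("gyümölcs"))                      // [gy ü m ö l cs]
	fmt.Println(segment("dzsem"))                         // [dzs e m]
	fmt.Println(capGraphemes(segment("egészségedre"), 6)) // [e g é sz s é]
}
```

Greedy matching is only an approximation. In a compound such as község (köz + ség) the z and s belong to different morphemes, yet this sketch reads them as the digraph zs, so a production encoder may need additional rules.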

The package exports nothing. It blank-imports linguistics_phonetic_hungarian, linguistics_stemmer_hungarian, and linguistics_stopwords_hungarian for their init() side effects.

Each adapter implements a domain port (StemmerPort, PhoneticEncoderPort, StopWordsProviderPort) and registers a factory keyed by the hungarian language code. The pack reuses the same registry machinery and the same kljensen/snowball library as every other language pack, so Hungarian search behaves consistently with English, French, and the rest. All three adapters are pure Go with no CGO, system libraries, or build tags, so the pack runs in compiled builds and in interpreted dev mode (dev-i).
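The port-plus-factory registration described above follows a common Go pattern, sketched below using only the standard library. Everything here is an assumption for illustration: the Stem method on StemmerPort, the StemmerFactory type, RegisterStemmer, LookupStemmer, and the placeholder identityStemmer are hypothetical and may not match the real linguistics registry.

```go
package main

import (
	"fmt"
	"sync"
)

// StemmerPort borrows the port name from the text; its method set is assumed.
type StemmerPort interface {
	Stem(word string) string
}

// StemmerFactory builds a stemmer on demand (hypothetical type).
type StemmerFactory func() (StemmerPort, error)

var (
	mu       sync.RWMutex
	stemmers = map[string]StemmerFactory{}
)

// RegisterStemmer stores a factory under a language code.
func RegisterStemmer(lang string, f StemmerFactory) {
	mu.Lock()
	defer mu.Unlock()
	stemmers[lang] = f
}

// LookupStemmer resolves a language code to a stemmer instance.
func LookupStemmer(lang string) (StemmerPort, error) {
	mu.RLock()
	f, ok := stemmers[lang]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("no stemmer registered for %q", lang)
	}
	return f()
}

// identityStemmer is a placeholder; a real adapter would delegate to a
// stemming library.
type identityStemmer struct{}

func (identityStemmer) Stem(word string) string { return word }

// init runs when the package is loaded, which is why a blank import is
// enough to make the adapter available.
func init() {
	RegisterStemmer("hungarian", func() (StemmerPort, error) {
		return identityStemmer{}, nil
	})
}

func main() {
	s, err := LookupStemmer("hungarian")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(s.Stem("házak"))
}
```

Keying factories by language code keeps consumers decoupled from individual adapters: they ask for a language and receive whatever was registered at init time, or an error if nothing was.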

The linguistics_bigrams_hungarian adapter for gibberish detection is a distinct package. This language pack does not bundle it. Import it separately if you need bigram analysis.

Bootstrap

A blank import is enough. Each sub-package's init() registers its factory with the linguistics domain registry.

import (
    _ "piko.sh/piko/wdk/linguistics/linguistics_language_hungarian"
)

After import, cache_linguistics and any other consumer of the linguistics registry can request the hungarian analyser by name. To build an analyser directly, pass linguistics.WithLanguage("hungarian"), which looks up the registered stemmer, phonetic encoder, and stop-word provider in one step.

config := linguistics.DefaultConfigForLanguage("hungarian")
analyser := linguistics.NewAnalyser(config, linguistics.WithLanguage("hungarian"))
