This is the site of the Hungarian Gigaword Corpus (HGC),
which contains more than one billion running words.
It is the renewed version of the Hungarian National Corpus.
>> The latest version, v2.0.5,
has been available since 8 June 2018. <<
It includes the whole material of the old corpus,
but it has many new features and advantages.
- Larger. The size of the corpus has increased considerably. The material is a more recent sample of present-day Hungarian.
- Better language analysis. The corpus contains high-quality linguistic analysis, which provides information about compounds, phonological features, and derivation, among others.
- New features. Results can be saved and filtered, various frequency lists can be created, and collocations can be investigated.
- Sophisticated search. Complex queries can be built with the help of the detailed search interface and the CQL query language.
- All results. In contrast to the former maximum of 500 hits, all hits can now be viewed.
- Faster. The response time of the search interface has been reduced.
Get acquainted with the new corpus and the new search interface,
which is available after registration.
Earlier registrations also apply to the new corpus.
There is a detailed guide
about usage (in Hungarian).
Data on the sizes of the subcorpora
are also available (in Hungarian).
The annotation and user interface of the HGC
may change from time to time;
in case of significant change the previous versions
If you have a comment, please contact us at
Please refer to the following article and
let us know
about any work or publication
that has been created using the Hungarian Gigaword Corpus.
Also, try our other corpus query tool, the
Verb Argument Browser
to investigate verbs and arguments directly.