MNSZ2 - HGC

This site hosts the Hungarian Gigaword Corpus (HGC), which contains more than one billion running words. It is the renewed version of the Hungarian National Corpus (HNC).
>> The latest version, v2.0.5, has been available since 8 June 2018. <<
It includes all the material of the old HNC, but it has many new features and advantages.

Larger. The size of the corpus has increased considerably. The material is a more recent sample of present-day Hungarian.
Better language analysis. The corpus contains high-quality linguistic analysis, which provides information about compounds, phonological features, and derivation, among others.
New features. Results can be saved and filtered, various frequency lists can be created, and collocations can be investigated.
Sophisticated search. Complex queries can be formulated with the help of the detailed search interface and the CQL query language.
All results. In contrast to the former maximum of 500 hits, it is now possible to view all hits.
Faster. The response time of the search interface has been shortened.

Get acquainted with the new corpus and the new search interface, which is available after a free registration. Previous HNC registrations are also valid for the new corpus. Detailed help on usage is available (in Hungarian). Data on the sizes of the subcorpora are also available (in Hungarian). The annotation and user interface of HGC may change from time to time; in case of a significant change, the previous versions remain available. If you have any comments, please contact us at mnsz[at]nytud.hu.
Please refer to the following article and let us know about any work or publication that has been created using the Hungarian Gigaword Corpus:

Oravecz Csaba, Váradi Tamás, Sass Bálint: The Hungarian Gigaword Corpus. In: Proceedings of LREC 2014, 2014.

Also, try our other corpus query tool, the Verb Argument Browser, to investigate verbs and their arguments directly.