DIALEKT, a corpus of Czech dialects, has been continuously curated and expanded by the Spoken Corpora section of the Institute of the Czech National Corpus. This paper first gives a concise characterization of the corpus, addressing its sociolinguistic parameters and the possible subcorpora that can be derived from them, its two-layer approach to the transcription of dialect recordings, and its lemmatization and morphological tagging. We then present examples of how linguists can use the corpus and discuss two related projects that extend the currently available possibilities: an archive of dialect-specific differential phones of the Czech language (completed) and an interactive web environment for map-based visualization of data from all kinds of spoken corpora (in preparation). Thanks in part to these additional tools, the DIALEKT corpus should serve both experts in the field and the general public.
Keywords
- spoken corpus
- dialect corpus
- dialectology
- corpus design
- transcription