Volume 72 (2021): Issue 2 (December 2021)
NLP, Corpus Linguistics and Interdisciplinarity
Journal Details
Format: Journal
eISSN: 1338-4287
First Published: 05 Mar 2010
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Capturing Numerals and Pronouns at the Morphological Layer in the Prague Dependency Treebanks of Czech

Published Online: 30 Dec 2021
Page range: 454 - 464
Abstract

The paper presents a novel and unified morphological description of numerals and pronouns, as compiled for the newest edition of the Prague Dependency Treebank (Prague Dependency Treebank – Consolidated 1.0) and its integral part, the morphological dictionary MorfFlex. Particular changes were proposed on the basis of considerable experience with real data annotation and with the use of the morphological dictionary. For both parts of speech, a new set of subtypes was proposed, based mainly on the morphological criterion and its combination with semantic properties and other relevant features, such as definiteness in numerals and possessivity, reflexivity, and clitichood in pronouns. Each subtype has a specific value at the 2nd position of the morphological tag, which also serves as an indicator of the applicability of other tag categories.
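The role of the 2nd tag position can be illustrated with a minimal sketch, assuming a positional tag in which the 1st character encodes the part of speech and the 2nd character encodes the subtype. The tag values (`X`, `a`, `b`), the category names, and the `APPLICABLE` table are hypothetical placeholders chosen for illustration; they are not values taken from the paper or from MorfFlex.

```python
from typing import NamedTuple


class TagParts(NamedTuple):
    pos: str      # 1st position: part of speech
    subtype: str  # 2nd position: subtype within the part of speech
    rest: str     # remaining positions, left uninterpreted in this sketch


# Hypothetical applicability table: (pos, subtype) -> categories that apply.
# Keys and category names are placeholders, not the actual tagset values.
APPLICABLE = {
    ("X", "a"): {"gender", "number", "case"},
    ("X", "b"): {"case"},
}


def split_tag(tag: str) -> TagParts:
    """Split a positional tag into part of speech, subtype, and the rest."""
    if len(tag) < 2:
        raise ValueError("a positional tag needs at least two positions")
    return TagParts(tag[0], tag[1], tag[2:])


def applicable_categories(tag: str) -> set[str]:
    """Return the categories the subtype marks as applicable (empty if unknown)."""
    parts = split_tag(tag)
    return APPLICABLE.get((parts.pos, parts.subtype), set())
```

Under these assumptions, `applicable_categories("Xb---")` looks up only the first two characters of the tag, which mirrors the idea that the subtype value alone decides which of the remaining tag categories are meaningful for a given word form.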

