Volume 3 (2017): Issue 1 (December 2017)
Journal Details
Format: Journal
First Published: 19 Nov 2014
Publication timeframe: 1 time per year
Languages: English
Access type: Open Access

Challenges of annotation and analysis in computer-assisted language comparison: A case study on Burmish languages

Published Online: 13 Sep 2017
Page range: 47 - 76

The use of computational methods in comparative linguistics is growing in popularity. The increasing deployment of such methods draws into focus those areas in which they remain inadequate, as well as those areas where classical approaches to language comparison are untransparent and inconsistent. In this paper, we illustrate specific challenges which both computational and classical approaches encounter when studying South-East Asian languages. With the help of data from the Burmish language family, we point to the challenges resulting from missing annotation standards and insufficient methods for analysis, and we illustrate how to tackle these problems within a computer-assisted framework in which computational approaches are used to pre-analyse the data while linguists attend to the detailed analyses.

