Keynote

The following speakers have graciously agreed to give keynotes at PACLIC 2023.

Speaker: Charles Yang


Talk title: Learning by Satisficing

In his pioneering work, Herbert Simon observed that under ecological constraints, human learning and decision making often do not strive for the optimal solution but merely one that is good enough. However, the absence of a precise theory of what counts as good enough—which Simon called satisficing—has hampered its utility in the psychological and computational studies of learning.

Language offers a perfect opportunity to study satisficing. Almost all linguistic rules have exceptions but are apparently good enough. For example, the English past-tense rule of adding “-ed” is far from perfect, as it must contend with some 150 irregular verbs. Yet it generalizes to every new verb that comes into existence (e.g., “google-googled”). I will present evidence for a surprisingly simple, parameter-free principle of satisficing that specifies the boundary condition of what counts as good enough. Such a principle leads to efficient and interpretable learning models that are nevertheless competitive against neural network models, especially when the training data is limited, as is the case in child language acquisition.

Charles Yang is an accomplished scholar and educator in the field of cognitive science and linguistics. With a background in computer science from the renowned MIT AI Lab, he went on to pursue his academic career at Yale University before joining the University of Pennsylvania. Currently, he holds the position of director of the Program in Cognitive Science while also teaching linguistics, computer science, and psychology. Throughout his career, Charles has delved into various areas of research, including language acquisition, variation, and change. His expertise extends to natural language processing (NLP) and the study of the human mind, encompassing numerical and conceptual development in children. Charles is a published author with notable works such as “The Price of Linguistic Productivity: How Children Learn to Break the Rules of Language” (MIT Press 2016), which earned him the esteemed Leonard Bloomfield Award from the Linguistic Society of America. Recognized for his contributions, Charles has received fellowships from esteemed institutions like the National Science Foundation (1995) and the John Simon Guggenheim Memorial Foundation (2018). His leadership is evident in his co-direction of the Integrated Language Science and Technology initiative at the University of Pennsylvania, building upon the institution’s historic legacy as a pioneer in interdisciplinary linguistic research.

Affiliation: University of Pennsylvania

Speaker: Lori Levin


Talk title: Interlinear Gloss in Natural Language Processing and Language Documentation

Interlinear Glossed Text (IGT) is the main mechanism used by linguists to document morphosyntax for the purpose of language documentation and linguistic research. For endangered languages, larger corpora annotated with IGT would enable corpus linguistic studies, corpus-based dictionaries, and language learning programs. Nevertheless, a large proportion of field linguistic recordings go untranscribed and unannotated. The wav2gloss project at Carnegie Mellon University and Gettysburg College aims to speed up the process of transcription and annotation using end-to-end neural models with human linguists in the loop and human knowledge in the form of weighted finite state transducers. The talk will cover the new Generalized Glossing Guidelines (G3) for documenting non-concatenative morphology and their application to tonal morphology in a large corpus of Yoloxochitl Mixtec. We will also cover our efforts to convert field data archives containing audio and IGT in many languages to a machine-learning-ready format for use in multi-lingual speech recognition models and multi-lingual models of IGT.

Lori Levin is a Research Professor in the Language Technologies Institute at Carnegie Mellon University, with a Ph.D. in linguistics (MIT 1986). She has been Principal Investigator and Co-Principal Investigator of many funded projects in which she provides the linguistic resources and corpus annotation for natural language processing systems. She has supervised work on dozens of languages and many types of corpus annotation, including morphology, syntax, semantics, pragmatics, and discourse. For more than twenty years, she has specialized in NLP for low-resource languages and endangered languages. She is also the co-founder (2007) and co-chair of the North American Computational Linguistics Open competition.