Computational Approaches to Language Acquisition
CUNY Graduate Center – Fall 2023


This course is an overview of research in language acquisition, focusing on the important connection between what children know and how they come to know it. We will devote special attention to the use of simple computational and mathematical models in explaining the mechanisms children possess to learn and use language.

The format of the course will be a mixture of instructor-led lectures and student-led paper discussions. Each week or two will consist of an investigation of a new topic.

Exact topics may vary with student interest but will include some subset of: the proper characterization of the primary linguistic data; the role of innate ability and environment (individual variation and universal tendencies); the nature of speech perception and specialization to the native language; the stages that child learners go through as they acquire their grammars; similarities and differences between child language acquisition and adult learning/adaptation; the structure and acquisition of words and conceptual knowledge; the acquisition of pragmatic and social interpretation of sentences in context; language learning impairments and the biological basis of language acquisition; and the role of language learning in language change.

Goals and Objectives

  • Gain familiarity with the core questions, debates, and methods common in contemporary research in language acquisition as a formal/cognitive science
  • Foster an understanding of the ways in which acquisition interfaces with and may provide explanations within the core sub-fields of linguistics (phonology, syntax, etc.)
  • Be able to read and critique the primary literature on language acquisition, including the empirical methods and primary data common in the field
  • Practice designing a study and writing original research and/or a literature review

Materials

    While there is no official textbook for the course, we will read a lot of source material from the relevant literature. All readings assigned throughout the term (both required and optional) will be posted to the course schedule/website.

    You are welcome to bring your laptop to class to access material in real time. However, I encourage you to consider setting electronics aside during discussions; in my experience, they are often a greater source of distraction than students realize.


    Sixty percent of each student's grade will be based on active participation and attendance throughout the term. Each student will additionally lead a paper/topic discussion at least once per term, which will contribute 10% of the grade. Finally, students will write an original research proposal or literature review. There will be an opportunity to implement and pursue novel research towards the end of, or following, the semester, potentially including analysis of primary data, behavioral experiments, or computational modeling. This final term paper will constitute the remaining 30% of the grade.

    Since this is the first time I am teaching this course, there is some uncertainty about the pace at which we will move through the content. I may assign a couple of small problem sets during the term. If this happens, completing them will be folded into the 60% portion of the grade for participation and attendance.


    The instructor will attempt to provide all reasonable accommodations to students upon request. If you believe you are covered under the Americans With Disabilities Act, please direct accommodations requests to Vice President for Student Affairs Matthew G. Schoengood.


    Students are expected to attend all seminar meetings in person. However, students who have reason to believe they may be contagious with COVID-19 or another infectious disease should attend the course online after contacting the instructor. Other absences will not be excused, and the instructor reserves the right to tie grades to attendance records. The instructor is not responsible for reviewing materials missed due to absence.


    In line with the Student Handbook policies on plagiarism, students are expected to complete their own work. The general ethos of the integrity policy is that actions which shortcut the learning process are forbidden while actions which promote learning are encouraged. Studying and discussing notes, papers, and ideas together provides a fruitful avenue for learning and is encouraged. Using a classmate’s solution to a problem set or having someone write a portion of your presentation, however, is prohibited because it circumvents the learning process. If you have any questions about what is or is not permissible, please contact your instructor.

    The instructor reserves the right to refer violations to the Academic Integrity Officer.

    Problem Sets / HWs

    HW1 (due Sept 29th)

    HW2 (due Oct 20th)

    HW3 (due Nov 22nd)

    Final Paper Details

    Weekly Schedule

    (Please note that this is subject to change.)

    I realize it would be a herculean task to thoroughly read all the materials listed below, some of which may be quite far removed from the core areas of study for most participants. I list them because I think they are important, for me anyway, for developing a broader perspective on language and learning -- as well as to provide a window on my thinking.

    Week 0 (8/30) – Introduction to the Problem of Language Acquisition

    Slides (L1)

    Lewontin, R.C. 1983 The organism as the subject and object of evolution Scientia. 118, 53-82.
    Labov, W., 1989. The child as linguistic historian. Language variation and change. 1(1), pp.85-97.

    Quine, W.V. 1957 The scope and language of science. The British Journal for the philosophy of Science. 8(29), pp.1-17.
    Chomsky, N. 1965. Aspects of the theory of syntax. MIT Press. Chapter 1.

    Week 1 (9/06) – Categories, Concepts, and Learning Words

    Slides (L2)

    (No readings this week)

    Week 2 (9/13) – Learning words (continued)

    Slides (L3)

    Gleitman, L. R. & Trueswell, J. 2020 Easy Words: Reference Resolution in a Malevolent Referent World. Topics in Cognitive Science. 12(1), 22-47
    Stevens, J. S., Gleitman, L., Trueswell, J., & Yang, C. 2017 The Pursuit of Word Meanings. Cognitive science. 41, 638-676

    Caplan, S. 2021 Word Learning as Category Formation (Ch. 2 of my dissertation: "Immediacy of Linguistic Computation") (Doctoral dissertation, University of Pennsylvania)
    Fisher, C., Hall, D. G., Rakowitz, S., & Gleitman, L. 1994 When it is better to receive than to give: Syntactic and conceptual constraints on vocabulary growth. Lingua. 92, 333-375.
    Rehder, B., & Hoffman, A. B. 2005 Eyetracking and selective attention in category learning. Cognitive psychology. 51(1), 1-41
    Medin, D. L., & Smith, E. E. 1984 Concepts and concept formation. Annual review of psychology. 35(1), 113-138

    Week 3 (9/20) – Generalization and Learning Phonological Categories

    Slides (L4)

    Cui, A. 2020 The Emergence of Phonological Categories (Doctoral dissertation, University of Pennsylvania) Just read Ch 1-3
    Johnson, E. K., & White, K.S. 2019 Six Questions in Infant Speech. Human Language: From Genes and Brains to Behavior pg. 99-113

    Feldman, N.H., Griffiths, T.L. and Morgan, J.L. 2013 A role for the developing lexicon in phonetic category acquisition. Psychological review 120(4), p.751.
    Kuhl, P. K., Williams, K. A., Lacerda, F., Stevens, K. N., & Lindblom, B. 1992 Linguistic experience alters phonetic perception in infants by 6 months of age. Science 255(5044), 606-608.

    Week 4 (9/27) – Phonological Categories and Word Segmentation

    Slides (L5)

    Lignos, C. 2013 Modeling Words in the Mind: Word segmentation (Doctoral dissertation, University of Pennsylvania)

    Saffran, J., Newport, E., & Aslin, R. 1996 Word Segmentation: The Role of Distributional Cues Journal of memory and language 35(4), 606-621.
    Lignos, C. 2013 Modeling Words in the Mind: Building computational models of cognitive processes (Doctoral dissertation, University of Pennsylvania)

    Week 5 (10/04) – Word Segmentation

    Slides (L6)

    Week 6 (10/11) – Corpora; Critical Periods; Negative Evidence

    Slides (L7)

    Lenneberg, E. H. 1967 The biological foundations of language Hospital Practice 2(12), 59-67.

    Week 7 (10/18) – Syntactic Parameters and the Variational Learner

    Slides (L8)

    Yang, C. 2001 Knowledge and Learning in Natural Language: Chapter 2 (Doctoral dissertation, MIT)

    Gibson, E., & Wexler, K. 1994 Triggers Linguistic Inquiry 25(3), 407-454.

    Week 8 (10/25) – Competing Grammars; Determiner War

    Slides (L9)

    Yang, C. 2013 Ontogeny and phylogeny of language Proceedings of the National Academy of Sciences 110(16), 6324-6327.
    Yang, C. 2000 Dig-dug, think-thunk. Review of Steven Pinker's Words and Rules: The ingredients of language. London Review of Books.

    Pine, J. M., Freudenthal, D., Krajewski, G., & Gobet, F. 2013 Do young children have adult-like syntactic categories? Zipf’s law and the case of the determiner. Cognition 127(3), 345-360.
    Kroch, A. S. 1989 Reflexes of grammar in patterns of language change. Language variation and change 1(3), 199-244.
    Han, C. H., Musolino, J., & Lidz, J. 2016 Endogenous sources of variation in language acquisition. Proceedings of the National Academy of Sciences 113(4), 942-947.

    Week 9 (11/01) – Rules and Exceptions; The Tolerance Principle

    Slides (L10)

    Week 10 (11/08) – Tolerance Principle; Case Studies; Gaps

    Slides (L11)

    Yang, C. 2016 The Price of Linguistic Productivity: How Children Learn to Break the Rules of Language Chapters 2-3.

    Murray, W. S. & Forster, K. I. 2004 Serial Mechanisms in Lexical Access: The Rank Hypothesis Psychological Review 111(3), 721.
    Schuler, K., Yang, C., & Newport, E. 2016 Testing the Tolerance Principle: Children form productive rules when it is more computationally efficient.
    Yang, C. 2016 Price of Productivity Chapter 5

    Week 11 (11/15) – To-Dative and Double-Object; Learning to count; Clever Hans Effect

    Slides (L12)

    Yang, C. 2016 The Linguistic Origin of the Next Number Unpublished Manuscript

    Week 12 (11/22) – No class (🦃Thanksgiving)

    Week 13 (11/29) – Syntax in NNs; Functionalism; Vowel Harmony

    Slides (L13)

    Caplan, S., Kodner, J., & Yang, C. 2020 Miller's monkey updated: Communicative efficiency and the statistics of words in natural language Cognition 205, 104466.
    Caplan, S. & Kodner, J. 2018 The Acquisition of Vowel Harmony from Simple Local Statistics Proceedings of the Cognitive Science Society

    Kodner, J. & Gupta, N. 2020 Overestimation of Syntactic Representation in Neural Language Models Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 1757–1762

    Week 14 (12/06) – Paper Presentations

    Week 15 (12/13) – No class (reading days)

    Week 16 (12/20) – Term paper due

    Some useful resources

  • The CHILDES database
  • The SUBTLEX Corpus
  • English Lexicon Project
  • The CMU Pronouncing Dictionary
  • Python Natural Language Toolkit