Spencer Caplan

Upcoming Semester (Spring 2026)

(Spring 2026)

Ideals and reality often diverge. If we have made such strides in the last fifty years in many source disciplines of cognitive science individually (computer science, linguistics, mathematics, neuroscience, psychology), why has this not manifested as massive progress in fundamental understanding of the mind? Where is the glorious cognitive utopia promised by our large advances in empirical methods, the scale and availability of data, and the unreasonable effectiveness of engineering tools?

The root of this tension must, at least partially, reside in a distinction between prediction and explanation. It should be obvious to a reasoned thinker that an observation and its cause are two very different objects. This is echoed in the theoretical study of language in the divide between a speaker's i(nternal)-language (the system for generating and interpreting sentences) and their e(xternalized)-language (the utterances and texts that we can actually record). Here i-language is the intended object of scientific study despite e-language being the paradigmatic (and perhaps necessary) locus of measurement.

Attempts to collapse this distinction are endemic. This has, perhaps, been brought to a head by modern so-called AI systems whose application draws no such distinction between predicting external behavior and explaining its internal cause. This "collapsed" view, in fact, has a long history, as Celsus described a prominent school of Greek physicians nearly two thousand years ago: "[this group does] indeed accept evident causes as necessary; but they contend that inquiry about obscure causes and natural actions is superfluous, because nature is not to be comprehended." But if messy, stochastic observations are all we have, then why bother?

In this seminar we take an optimistic view: in order to treat the study of cognition as a mature science we must re-emphasize a commitment to the idea that the world (mind included) has a mechanistic basis that can be understood as such. This seminar will cover a mechanistic understanding of mental representations in language. We will explore how factors such as perception, memory, learning, categorization, and conceptual development interact with and constrain the properties of human language and inference, including its acquisition, processing, and variation.

Office Hours: Tuesday 3:30-4:30PM, GC 7400.02 and by appt.

Various GC Courses in my Rotation

(Spring 2025, Spring 2024)

Language is perhaps the best window we have into cognition. Humans' knowledge of language consists not only in grammatical representation, but in the processes which operate over such representation. Thus, how we are able to convert gradient, continuous, ephemeral perceptual signals into discrete mental symbols (and vice versa) is of fundamental importance to work toward a fuller understanding of the cognitive system of language. This discussion-based seminar will provide a wide-ranging but in-depth overview of topics in the real-time processing of language, speech, and related perceptual domains. Special attention will be devoted to the use of simple algorithmic models and related experiments in order to explore the specific mechanisms involved in the mental representation and use of language.

(Fall 2023)

This course is an overview of research in language acquisition, focusing on the important connection between what children know and how they come to know it. We will devote special attention to the use of simple computational and mathematical models in explaining the mechanisms children possess to learn and use language.

(Fall 2025)

This class provides an introduction to statistical and quantitative data analysis from various areas in linguistics research. Topics covered include probability, descriptive and inferential statistics, hypothesis testing, analyses of variance, regression models (linear, logistic, and mixed-effects), and approaches to corpus and experimental data. Students will learn to use the R statistical environment and a wide variety of methods for data wrangling, visualization, and statistical inference, and will gain experience with best practices for clearly and fairly reporting results. Emphasis will be placed on developing statistical reasoning, understanding the assumptions behind common tests, and critically evaluating methods in the (psycho)linguistics literature.

(Fall 2025, Fall 2024, Fall 2023)

This course is the first of a two-semester series introducing modern software development. The intended audience is students interested in speech and language processing technologies, though the materials will be beneficial to all language researchers.

Using the Python programming language, students will be able to write programs which count the frequencies of various linguistic phenomena in text. They will be able to process text stored in various structured data formats. They will come to understand how computers encode multilingual text. They will learn the basic principles of command-line design and master regular expressions.
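As a rough illustration of the kind of program described above (not taken from the course materials), here is a minimal sketch that uses a regular expression to tokenize text and count word frequencies. The function name `word_frequencies` and the sample sentence are hypothetical, and the tokenization rule is a deliberate simplification.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word token to its frequency.

    Hypothetical helper for illustration only. Tokens are runs of word
    characters; for Python 3 str patterns, \\w also matches non-ASCII
    letters, so the same code works on multilingual text.
    """
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


# Made-up sample sentence, used only to demonstrate the function.
sample = "The cat saw the dog. The dog saw the cat!"
counts = word_frequencies(sample)

# Show the most frequent token together with its count.
print(counts.most_common(1))
```

Real linguistic data usually needs a more careful tokenizer (clitics, hyphenation, punctuation inside words), so the `\w+` pattern should be read as a starting point rather than a recommendation.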

(Spring 2025, Spring 2024)

This course is the second of a two-semester series introducing computational linguistics and software development. The intended audience is students interested in speech and language processing technologies, though the materials will be beneficial to all language researchers.

The Old Days

Computer Science

Cognitive Science

Swarthmore College

(Spring 2022, Fall 2020)

This course will introduce you to a broad range of topics in natural language processing, including language modeling, part-of-speech tagging, machine translation, syntactic parsing, vector semantics, and text classification, as well as the application of computational tools to cognitive modeling and psycholinguistics.

(Spring 2021)

This course will introduce fundamental ideas in computer science while also teaching you how to write computer programs. We will study algorithms for solving problems and implement solutions in the Python programming language. Python is an interpreted language that is known for its ease of use. We also introduce object-oriented programming and data structures. This course is appropriate for all students who want to learn how to write computer programs and think like computer scientists. It is the usual first course for computer science majors and minors.

(Fall 2021)

This is the second semester in a broad introduction to computer science. Topics to be covered include object-oriented programming in C++, advanced data structures (such as priority queues, trees, hash tables, and graphs), advanced algorithms (and analysis of asymptotic complexity), as well as software design and verification. These topics are central to every sub-discipline in computer science, and also connect to central concepts across the sciences.
