Specifications

You are responsible for a term paper that counts something of linguistic interest using a non-trivial number of Python features.

If you are not sure whether your project satisfies the above specifications, email a brief description to Spencer before proceeding.

A brief list of ideas:

  1. Count the words most associated with each of the 12 zodiac signs in a corpus of horoscopes
  2. Count the number of words ending in various derivational suffixes in a digital dictionary
  3. Count the number of words ending in syllabic sonorants in a pronunciation dictionary
  4. Count the frequencies of the different pronunciations of the word "live" using a tagger (n.b.: this works because one pronunciation is used when it's an adjective, and another when it's a verb)
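
To give a sense of scale, idea 2 might start from something like the sketch below. It is only an illustration under stated assumptions: the suffix inventory, the function name `count_suffixes`, and the toy word list are all made up here, and a real project would read its words from an actual digital dictionary. Note too that it matches spelling only, so it would also count words like "city" under "-ity"; deciding what to do about such cases is part of the project.

```python
from collections import Counter

# Hypothetical suffix inventory; a real project would justify its own choices.
SUFFIXES = ("ness", "ity", "ment", "tion")


def count_suffixes(words, suffixes=SUFFIXES):
    """Counts how many distinct (case-folded) words end in each suffix."""
    counts = Counter()
    for word in {w.lower() for w in words}:
        for suffix in suffixes:
            # Require some stem material so that, e.g., "ment" alone is skipped.
            if word.endswith(suffix) and len(word) > len(suffix):
                counts[suffix] += 1
    return counts


# Toy word list standing in for a real digital dictionary.
toy_words = ["happiness", "sadness", "Sadness", "purity", "payment", "nation", "cat"]
suffix_counts = count_suffixes(toy_words)
print(suffix_counts.most_common())
```

The other ideas follow the same basic pattern: read the data, decide what counts as a match, and tally the matches with a `Counter`.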

What to submit

Your submission should include:

  1. Any interesting samples of code (though I will only be minimally reviewing code quality in my grading)
  2. Data used (or instructions or code to obtain it, if it's more than 10 MB or so)
  3. A write-up of 3-4 pages describing:
    1. the data you used
    2. what you counted
    3. what the counts were (please make a nicely formatted table; don't just dump Python output here)
    4. why this might be a linguistically interesting thing to count
    5. how the project might be extended if you had more time or more experience programming
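
One way to get from raw counts to a presentable table is to have Python write the table rows for you. The sketch below is a suggestion rather than a requirement: the function name `latex_table`, the column headers, and the example counts are invented for illustration. It emits a plain LaTeX `tabular` environment and does not escape special characters such as `&` or `%`, so it assumes the items being counted contain none.

```python
from collections import Counter


def latex_table(counts, header=("Item", "Count")):
    """Returns a LaTeX tabular listing the counts, largest first."""
    lines = [
        r"\begin{tabular}{lr}",
        r"\hline",
        header[0] + " & " + header[1] + r" \\",
        r"\hline",
    ]
    for item, count in counts.most_common():
        # One row per counted item: the label, then the right-aligned count.
        lines.append(f"{item} & {count} " + r"\\")
    lines.append(r"\hline")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)


# Made-up counts, purely for illustration.
example_counts = Counter({"-ness": 2, "-ity": 1})
table = latex_table(example_counts, header=("Suffix", "Count"))
print(table)
```

The printed text can be pasted into the write-up or saved to a file and pulled in with `\input`.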

(While not strictly mandatory, I would strongly encourage you to typeset your paper in LaTeX. Proficiency in LaTeX is an important skill you'll need in graduate school and beyond, and online tools such as Overleaf make it much easier to get started than it was when I was a young graduate student!)

Rubric

The term paper will be graded on the degree to which the submission satisfies the above specifications.

The term paper is officially due 12/20 (the final day of the semester), but I will grade submissions up to the point where I am required to submit grades to the registrar's office; this is usually a week or so after the end of the semester. If I have not received a term paper by then, you will receive an "I" (incomplete) grade until you submit the term paper.

Hints

  1. While it's technically possible to work with audio data for this project, it's a lot harder than working with data that's already discrete (e.g., text) unless you've also studied acoustic phonetics and/or signal processing.
  2. It's okay (good, even) if this harmonizes with some other projects you're doing for credit (e.g., qualifying papers), so long as you make it clear in your write-up what part of the project is unique to the term paper.