Robert Daland
Department of Linguistics, Northwestern University
2016 Sheridan Rd., Evanston IL, 60201
I am involved in several research projects, each involving some subset of the following: sound structure, language acquisition, word structure, statistical learning theory, computational modeling, language change, Russian. Projects:

Diphone-based Lexical Bootstrapping (my dissertation!)

Word segmentation and word learning are mutually dependent processes -- to learn a new word you must perceive the boundaries between it and its neighbors, but the simplest way to do this is to recognize the neighbors -- which requires having learned the neighbors already... Segmentation is simple for adults, who know most of the words they encounter, but it is unclear how infants solve this bootstrapping problem. My dissertation is a cross-linguistic investigation of the proposal that diphones (sequences consisting of a consonant or vowel followed by another consonant or vowel) mediate word segmentation in both infants and adults. The project consists of a computational model and behavioral experiments, with Russian and English data and participants. The computational model uses diphones to guess the position of word boundaries in phonologically transcribed corpora. The "adult" model, which possesses an adult lexicon, achieves 90% accuracy in both English and Russian. I am developing an "infant" model, which uses the same diphone principles to guess word boundaries from raw speech, without any lexicon. The behavioral experiments test whether word segmentation and word learning are mutually facilitatory.

The Persistence of Gaps (with Andrea Sims and Janet Pierrehumbert)

Languages occasionally exhibit idiosyncratic failures of morphological productivity known as paradigmatic gaps. For example, Russian has about 100 verbs which do not have a first-person non-past, e.g. *pylesoshu 'I will vacuum'. Historical records indicate that these gaps have persisted in the language for roughly 150 years. Because these gaps are lexically specific, they must be learned. So listeners must somehow figure out that the gapped form occurs less often than expected. We propose that learners track the relative frequency with which each root co-occurs with its inflections.
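
As a rough illustration of what tracking these relative frequencies could look like (a sketch only, not the model used in the project), the code below counts root-inflection co-occurrences in a toy corpus and compares each observed relative frequency with a baseline. The function name, the made-up roots, and the baseline (every root is assumed to follow the corpus-wide distribution over inflections) are all assumptions introduced here for illustration.

```python
from collections import Counter

def relative_frequencies(tokens):
    """Return {(root, inflection): (observed, expected)} for a toy corpus.

    tokens: iterable of (root, inflection) pairs (hypothetical input format).
    observed = P(inflection | root), estimated from the counts.
    expected = corpus-wide P(inflection); this baseline is an assumption
    of the sketch, not a claim about the actual learning model.
    """
    tokens = list(tokens)
    pair_counts = Counter(tokens)
    root_counts = Counter(root for root, _ in tokens)
    infl_counts = Counter(infl for _, infl in tokens)
    total = len(tokens)

    table = {}
    for root in root_counts:
        for infl in infl_counts:
            observed = pair_counts[(root, infl)] / root_counts[root]
            expected = infl_counts[infl] / total
            table[(root, infl)] = (observed, expected)
    return table

# Made-up data: the root "gapA" is never attested in the 1sg cell.
toy = ([("regA", "1sg")] * 3 + [("regA", "3sg")] * 7
       + [("gapA", "3sg")] * 10)
freqs = relative_frequencies(toy)
observed, expected = freqs[("gapA", "1sg")]
print(f"gapA 1sg: observed {observed:.2f}, expected {expected:.2f}")
```

A form whose observed relative frequency falls well below the baseline is a candidate gap under this toy criterion.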
We demonstrate with a social network model that this proposal is adequate to explain the persistence of Russian gaps. The initial generation is seeded with a lexicon drawn from the Russian National Corpus, and its members talk according to the grammar they induce from these data. Following generations learn from their elders and pass on their language to their children. We show that the number of gaps in the language is relatively stable across 10 generations. In future work we will extend these results to Greek, focusing more on the role of social structure.

Interaction of Phonetic Categories and Phonotactics (with Matt Goldrick and Jessica Maye)

Phonetic categories and phonotactics interact in adults, and the timecourses of their acquisition in infants are conspicuously similar. We propose that these two aspects of speech perception are mediated by a single representational mechanism, formally unifying these two kinds of knowledge. To test this proposal, we exposed a connectionist sequence learning model to an artificial language containing both phonetic and phonotactic dependencies, and developed a novel testing method that demonstrated overlapping emergence of sensitivity to each kind of dependency.

Language Dynamics, Categorical Grammars, and Dialect Formation (with Janet Pierrehumbert and Brady Clark)

This research project consists of analyzing an idealized social network in which each agent possesses a grammar. Agents produce utterances according to their grammar, and update their grammar according to the utterances produced by their neighbors. Crucially, while utterances are categorical, grammars are stochastic, meaning that agents are in principle able to command multiple variants. First, we show that only mild conditions are required for all members of a community to converge to one or the other categorical variant. In other words, the categorical nature of a community-level grammar can emerge from repeated cycles of perception, production, and learning.
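
The production and learning cycle described above can be sketched as a toy simulation. This is only an illustration under assumptions chosen here: each grammar is reduced to a single probability of producing variant "A", the update is a simple linear nudge toward the last utterance heard, and the network is fully connected. It does not reproduce the conditions analyzed in the project, and whether or how fast a community converges depends on these choices.

```python
import random

def simulate(probs, neighbors, steps, rate=0.1, seed=0):
    """Run a toy production/learning loop and return the final grammars.

    probs: each agent's probability of producing variant "A"
           (a stand-in for a stochastic grammar; an assumption).
    neighbors: dict mapping an agent index to a list of neighbor indices.
    Each step, one random listener hears one categorical utterance from a
    random neighbor and moves its own probability toward what it heard.
    """
    rng = random.Random(seed)
    probs = list(probs)
    for _ in range(steps):
        listener = rng.randrange(len(probs))
        speaker = rng.choice(neighbors[listener])
        heard_a = rng.random() < probs[speaker]  # utterance is categorical
        target = 1.0 if heard_a else 0.0
        probs[listener] += rate * (target - probs[listener])
    return probs

# Fully connected community of five agents with mixed initial grammars.
n = 5
fully_connected = {i: [j for j in range(n) if j != i] for i in range(n)}
final = simulate([0.2, 0.4, 0.5, 0.6, 0.8], fully_connected, steps=5000)
print([round(p, 3) for p in final])
```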
Thus, there is no need to stipulate that individual grammars must be categorical. Then, we show how the same mechanisms can give rise to dialect splitting. Dialect splitting is something of a mystery because it simultaneously involves reduction of intra-community variation and amplification of inter-community variation. The analysis reveals how both effects result from an interaction between cognitive (grammatical) and social factors.

Modeling Speech Errors with Harmonic Grammars (with Matt Goldrick)

Speech errors are sometimes regarded as extra-grammatical. However, they are apparently sensitive to grammatical factors such as markedness. In particular, speakers make errors in which a marked target is replaced by an unmarked competitor (e.g. [g] --> [k]), and they also make the reverse error, in which an unmarked target is replaced by a marked competitor ([k] --> [g]), but the former type of error is more common (Goldrick, unpublished dissertation). Thus, there is a markedness asymmetry in the relative frequency of speech error types. We propose to account for this fact using Harmonic Grammars. The basic idea is to take the existing formalism and add noise. So, we first extend the existing formalism to define a speech error. Then, we give a rigorous proof that explains the markedness asymmetry effect. Finally, we show in a series of simulations that the proposed theory gives a very tight fit to experimental data on speech errors.

Perceptual Sensitivity to Subject-Verb Agreement in Toddlers (with Jessica Maye)

The explosion of literature on the remarkable statistical learning abilities of infants shows that children have the cognitive machinery to take advantage of regularities in their input for a wide variety of language learning tasks. The subject-verb agreement relation is a wonderful source of data for this purpose, since it is highly frequent and reliable.
However, in order for children to make use of this source of data, they must be able to perceive subject-verb agreement morphemes when they occur. When can infants do this? We are testing 19-month-olds on intransitive sentences with grammatical and ungrammatical agreement. Perceptual sensitivity to agreement morphemes should be evident in a different pattern of attention to ungrammatical sentences than to grammatical ones.
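
Returning to the speech-error project above, the intuition behind the markedness asymmetry can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the two constraint names (`FAITH`, `MARK`), their weights, and the noise model (independent Gaussian noise added to each candidate's harmony). It is not the formal definition or the proof from the project.

```python
import math

def harmony(violations, weights):
    """Harmony = negative weighted sum of constraint violations."""
    return -sum(weights[c] * n for c, n in violations.items())

def error_probability(target, competitor, weights, sigma=1.0):
    """P(competitor's noisy harmony exceeds the target's).

    Assumes independent N(0, sigma^2) noise on each candidate's harmony,
    so the noise difference is N(0, 2 * sigma^2). This noise model is an
    assumption of the sketch.
    """
    margin = harmony(target, weights) - harmony(competitor, weights)
    z = margin / (sigma * math.sqrt(2.0))
    # 1 - Phi(z), written with the error function.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))

# Hypothetical weights: faithfulness outranks markedness.
weights = {"FAITH": 2.0, "MARK": 1.0}

# Marked target [g]: faithful output violates MARK; the error [k] violates FAITH.
p_g_to_k = error_probability({"FAITH": 0, "MARK": 1},
                             {"FAITH": 1, "MARK": 0}, weights)

# Unmarked target [k]: faithful output violates nothing; the error [g]
# violates both FAITH and MARK.
p_k_to_g = error_probability({"FAITH": 0, "MARK": 0},
                             {"FAITH": 1, "MARK": 1}, weights)

print(f"P([g] --> [k]) = {p_g_to_k:.4f}, P([k] --> [g]) = {p_k_to_g:.4f}")
```

In this toy setting the harmony margin protecting a marked target (FAITH minus MARK) is smaller than the margin protecting an unmarked target (FAITH plus MARK), so the same noise produces [g] --> [k] errors more often than [k] --> [g] errors.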