The dual nature of the Tense Aspect system in English
Learners of English are sometimes overwhelmed when having to choose which tense to use in a sentence, but we have some good news: simple tenses are the most commonly used by English speakers and they are easy to learn! By running a simple, biologically plausible learning algorithm, we found that tense aspect (TA) combinations can actually be learned via exposure to language.
We trained our algorithm on individual sentences from the British National Corpus and asked it to retrieve the TA combination used in each sentence based on the immediate context of the sentence and the verb lemma. Our model was not only quite successful, but it also helped us unveil an interesting fact about the TA system in English: it appears to be a dual system in which (i) the simple tenses make up the bulk of TA combinations used, are easy to learn, and rely on the verb used, while (ii) the complex tenses are much less frequently used and rely more on contextual information.
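To make the setup concrete, here is a minimal sketch of how a learner of this kind could map a verb lemma and its context words onto a TA combination. This summary does not name the algorithm, so the sketch assumes an error-driven (Rescorla-Wagner style) update as one example of a simple, biologically plausible learning rule. The cue names, outcome labels, toy events, learning rate, and function names are all hypothetical illustrations; none of them come from the paper or from the British National Corpus.

```python
from collections import defaultdict

# Hypothetical TA labels, for illustration only.
OUTCOMES = ("present.simple", "past.simple", "present.perfect")

# Hypothetical toy events: (cues, TA combination actually used).
# Cues are the verb lemma plus a few words of immediate context.
EVENTS = [
    (("lemma:know", "ctx:i", "ctx:answer"), "present.simple"),
    (("lemma:know", "ctx:she", "ctx:him"), "present.simple"),
    (("lemma:walk", "ctx:yesterday", "ctx:we"), "past.simple"),
    (("lemma:walk", "ctx:since", "ctx:we"), "present.perfect"),
    (("lemma:live", "ctx:since", "ctx:they"), "present.perfect"),
]


def train(events, outcomes, rate=0.1, epochs=50):
    """Learn cue-to-outcome weights with an error-driven update (assumed rule)."""
    weights = defaultdict(float)  # (cue, outcome) -> association strength
    for _ in range(epochs):
        for cues, target in events:
            for outcome in outcomes:
                # Summed support that the present cues give to this outcome.
                activation = sum(weights[(cue, outcome)] for cue in cues)
                # Prediction error: 1 if the outcome occurred, else 0, minus activation.
                error = (1.0 if outcome == target else 0.0) - activation
                for cue in cues:
                    weights[(cue, outcome)] += rate * error
    return weights


def predict(weights, cues, outcomes):
    """Return the outcome with the highest summed activation for the given cues."""
    return max(outcomes, key=lambda o: sum(weights.get((c, o), 0.0) for c in cues))


weights = train(EVENTS, OUTCOMES)
print(predict(weights, ("lemma:know",), OUTCOMES))
print(predict(weights, ("lemma:walk", "ctx:since"), OUTCOMES))
```

In this toy data the lemma alone is enough to pick out a simple tense for "know", whereas "walk" needs the context cue "since" before the perfect wins out. That loosely mirrors the lexical versus contextual split described above, but it is only an illustration and not a reproduction of the study.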
This dual system says a lot about language and how we learn and use it. Taking a cognitive linguistic, experiential view of language, and considering languages and our knowledge of them a product of our interaction with the world, allowed us to explore what the use of different tense and aspect combinations reveals about the interaction between our experience of time and the cognitive demands that talking about time puts on the language user. The finding that the simplex TA combinations, which are the most frequent, are essentially lexical in nature and that the more complex TA combinations typically require contextual information leads us to argue for a rethink of tense and aspect as grammatical categories. Instead of a separation of tense and aspect as such, it appears that the distinction lies in a simplex versus complex paradigm that emerges from the interaction of language use and language cognition, learning in particular.
You can find out more and read our paper here.
In line with the Open Data policy of our project, data can be downloaded from the University of Birmingham edata repository UBIRA and code is on our GitHub page.