Tobias Scheer

The corpus was, is and will be a valuable tool that helps in pursuing a goal. Its ontological status as a tool will not change, no matter how fabulous the computational power, storage capacity, access and transmission speed, and whatever the size of the corpus. The corpus is one data source among others (notably grammaticality judgements), with specific advantages and limitations that the user needs to be aware of, as with any other tool.

Drowned in the ambient utilitarianism and project-hysteria, many people believe, overtly or tacitly (or without being aware that they do), that research which involves building a corpus and exploiting it with a "powerful" computer programme is more serious than competing research which does not. Some even believe that the whole purpose of a research project may be the creation of a corpus, and that the corpus will produce science by itself, i.e. substitute for reasoning and the data-expectation dialectic. The same ideology promotes the idea that whatever scientific statement is made, it needs to be statistically relevant. This is where the corpus stops being a tool, i.e. where the system goes mad. And it did so on a large scale in the past decade or so. Poor corpora are in the middle of this thunderstorm, and are abundantly abused by the ideology in place.

Publication details

DOI: 10.4000/corela.3006

Full citation:

Scheer, T. (2013). The corpus: A tool among others. Corela 13 (HS), pp. n/a.

