
School of Languages, Linguistics and Cultures

GerManC Plus: the complete project

Work on the complete GerManC project commenced in September 2008 with Professor Martin Durrell as Principal Investigator, Dr Paul Bennett as Co-Investigator, and Dr Silke Scheible and Dr Richard J. Whitt as Research Associates. The first stage will involve extending the corpus to include the remaining genres, namely drama, sermons, personal letters, journals, narrative prose (fiction and biographies), and academic, medical and legal texts. The parameters established in the pilot project will be followed for these genres: three 2000-word samples will be taken for each of the five regions within each of the three fifty-year sub-periods.
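As a rough illustration of what these parameters imply for each genre, the arithmetic can be sketched as follows. The variable and function names are purely illustrative, and the word total is only approximate, since it assumes every sample is exactly 2000 words long.

```python
# Design parameters as stated in the paragraph above.
SAMPLES_PER_CELL = 3     # samples per region within each sub-period
WORDS_PER_SAMPLE = 2000  # assumed exact here; real samples may vary slightly
REGIONS = 5
SUB_PERIODS = 3          # fifty-year sub-periods


def genre_totals():
    """Return (number of samples, approximate number of words) for one genre."""
    samples = SAMPLES_PER_CELL * REGIONS * SUB_PERIODS
    words = samples * WORDS_PER_SAMPLE
    return samples, words


samples, words = genre_totals()
print(f"{samples} samples, roughly {words} words per genre")
```

On these assumptions each genre contributes 45 samples, or roughly 90,000 words.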

Building on the achievements of the pilot project, software will be developed to enable full analysis of the corpus material. In particular, in collaboration with colleagues at the Institute for German Language (IDS) and other institutions in Germany working on the Deutsch Diachron Digital Project, we aim to find ways of retrieving all occurrences of a particular word in the corpus automatically, despite the considerable variation in spelling in the period the texts come from. We also intend to adapt existing software that identifies the part of speech (noun, verb, adjective, etc.) of each word, classifies it according to grammatical category (case, gender, tense, etc.), and automatically specifies the basic structure of each sentence. Ideally, such programs will have the potential for wider application to other languages whose grammar is comparable in complexity to that of German. The whole corpus will be set up with interfaces to ensure maximum ease of access.
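One simple way to approach the spelling problem described above is to reduce every word form to a normalised search key, so that variant spellings share a single index entry. The following is a minimal sketch of that general idea only, not the method adopted by the project. The three rewrite rules and the function names normalise, build_index and find_occurrences are assumptions made for illustration, and rules as crude as these would merge some unrelated words in a real corpus.

```python
import re
from collections import defaultdict

# Illustrative rewrite rules (assumptions, not the project's actual rules).
# Each rule maps a common historical spelling pattern to a modern-style key.
RULES = [
    (r"^v(?=[^aeiouäöü])", "u"),  # initial v before a consonant: vnd -> und
    (r"th", "t"),                 # thun -> tun
    (r"ey", "ei"),                # seyn -> sein
]


def normalise(word):
    """Reduce a word form to a lower-case search key by applying each rule."""
    key = word.lower()
    for pattern, replacement in RULES:
        key = re.sub(pattern, replacement, key)
    return key


def build_index(tokens):
    """Map each normalised key to the positions of all tokens that share it."""
    index = defaultdict(list)
    for position, token in enumerate(tokens):
        index[normalise(token)].append(position)
    return index


def find_occurrences(index, query):
    """Return the positions of every token whose key matches the query's key."""
    return index.get(normalise(query), [])


tokens = ["Vnd", "sie", "thun", "und", "tun", "seyn"]
index = build_index(tokens)
print(find_occurrences(index, "und"))   # [0, 3]
print(find_occurrences(index, "sein"))  # [5]
```

Because the query is normalised in the same way as the corpus tokens, a search for a modern spelling retrieves the historical variants as well. A fuller system would need a much larger, carefully evaluated rule set or a lexicon of attested variants.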