LexMC's directors have been involved in English corpus lexicography since its beginnings in the early 1980s. They have played a leading role in the design and development of several major corpora, including the British National Corpus (BNC). They are now at the forefront of new initiatives to use the Web as a source of corpus data.

What we do

Drawing on our extensive experience in the development and exploitation of corpora, we can offer a complete corpus-gathering service, including any or all of the following:

  • advice on design principles and corpus building
  • data collection and copyright clearance
  • document header design and other documentation
  • encoding and annotation: standardization, lemmatization, tokenization, POS-tagging, shallow parsing

Corpus initiatives we have been involved in: