Adam Kilgarriff – Publications
Journal Papers
2005
Language is never ever ever random. Corpus Linguistics and Linguistic Theory 1 (in press)
(with Michael Rundell and Elaine Uí Dhonnchadha) Efficient corpus development for lexicography: building the New Corpus for Ireland. Language Resources and Evaluation Journal (submitted)
2003
(with Gregory Grefenstette) Introduction to the Special Issue on Web as Corpus. Computational Linguistics 29 (3). (Also guest editors for the Special Issue)
2002
(with Philip Edmonds) Introduction to the Special Issue on Evaluating Word Sense Disambiguation Systems. Journal of Natural Language Engineering 8 (4). (Also guest editors for the Special Issue)
2001
“Comparing Corpora.” International Journal of Corpus Linguistics 6 (1): 1-37.
2000
(with Joseph Rosenzweig) “English Framework and Results.” Computers and the Humanities 34 (1-2), Special Issue on SENSEVAL.
(with Martha Palmer) Introduction to the Special Issue on SENSEVAL. Computers and the Humanities 34 (1-2). (Also guest editors for the Special Issue)
“Business Models for Dictionaries and NLP.” International Journal of Lexicography 13 (2): 107-118.
(with Wim Peters) “Discovering Semantic Regularity in Lexical Resources.” International Journal of Lexicography 13 (4): 287-312.
1998
“Gold Standard Datasets for Evaluating Word Sense Disambiguation Programs.” Computer Speech and Language 12 (3), Special Issue on Evaluation of Speech and Language Technology, edited by Robert Gaizauskas: 453-472.
1997
“Putting frequencies in the dictionary.” International Journal of Lexicography 10 (2): 135-155.
“I don’t believe in word senses” Computers and the Humanities 31: 91-113.
To be reprinted in Readings in the Lexicon Pustejovsky, Wilks and Castaño, editors. MIT Press.
Reprinted in Polysemy: Flexible patterns of meaning in language and mind Nerlich, Todd, Herman and Clarke, editors. Walter de Gruyter. Pp 361-392.
“The hard parts of lexicography.” International Journal of Lexicography 11 (1): 51-54.
1993
“Dictionary Word Sense Distinctions: An enquiry into their nature.” Computers and the Humanities 26 (1-2): 365-387.
Book Chapters
2005
Use of Computers in Lexicography. Article in Encyclopedia of Language and Linguistics. Elsevier.
(with Hiram Calvo and Alexander Gelbukh) Automatic Thesaurus vs. WordNet: A Comparison of Backoff Techniques for Unsupervised PP Attachment. Proc. CICLING, 5th Int. Conf. on Intelligent Text Processing and Computational Linguistics, Mexico City. Springer Verlag. Conference best presentation prize.
2003
(with Rob Koeling) “An evaluation of a lexicographer’s workbench incorporating word sense disambiguation” Proc. CICLING, 3rd Int Conf on Intelligent Text Processing and Computational Linguistics, Mexico City. Springer Verlag.
2002
(with David Tugwell) “Sketching words” Lexicography and Natural Language Processing: A Festschrift in Honour of B. T. S. Atkins. Marie-Hélène Corréard (Ed.) EURALEX: 125-137.
2001
“Generative lexicon meets corpus data: the case of non-standard word uses” The Language of Word Meaning. Pierrette Bouillon and Frederica Busa (Eds.) Cambridge University Press: 312-330.
1995
“Inheriting Polysemy” Computational Lexical Semantics. Patrick Saint-Dizier and Evelyne Viegas (Eds.) Cambridge University Press: 319-335.
(with Gerald Gazdar) “Polysemous Relations” Grammar and Meaning: Essays in Honour of Sir John Lyons. Frank Palmer (Ed.) Cambridge University Press: 1-25.
Conference Papers
2005
Putting the corpus into the dictionary. In: Proc. MEANING Workshop. Trento, Italy, February.
(with Chu-Ren Huang, Pavel Rychly, Simon Smith, David Tugwell) Chinese word sketches. Proc. Asialex, Singapore, June.
Linking Dictionary and Corpus. Proc. Asialex, Singapore, June.
(with Michael Rundell and Elaine Uí Dhonnchadha) Corpus creation for lexicography. Proc. Asialex, Singapore, June.
(with Chu-Ren Huang, Jan Pomikálek, Michael Rundell, Pavel Rychly, Simon Smith, David Tugwell, Elaine Uí Dhonnchadha) Word sketches for Irish and Chinese. Proc. Corpus Linguistics 2005, Birmingham, UK. July.
2004
How dominant is the commonest sense of a word? In: Text, Speech, Dialogue. Lecture Notes in Artificial Intelligence Vol. 3206. Sojka, Kopecek and Pala, Eds. Springer Verlag: 103-112.
(with Rada Mihalcea and Timothy Chklovski) The Senseval-3 English Lexical Sample Task. In Proc. Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text. Barcelona, July: 25-28.
(with Pavel Rychly, Pavel Smrz and David Tugwell) The Sketch Engine. Proc. Euralex. Lorient, France, July: 105-116.
2003
Thesauruses for Natural Language Processing. Keynote lecture. Proceedings of NLPKE, Beijing, October.
What computers can and cannot do for lexicography, or, Us precision, them recall. Keynote lecture. In Proceedings of ASIALEX, Tokyo, August.
(with Roger Evans, Rob Koeling, Michael Rundell, David Tugwell) WASPBench: a lexicographers’ workstation incorporating word sense disambiguation. Demo and research note. European ACL, Budapest, April.
(with Rob Koeling, David Tugwell, Roger Evans) An evaluation of a lexicographer’s workbench: Building lexicons for machine translation. Workshop on MT tools, European ACL, Budapest, April.
(with Tomaz Erjavec, Roger Evans, Nancy Ide) The CONCEDE lexical databases. Proc. COMPLEX, Budapest, April.
No-bureaucracy evaluation. Workshop on Evaluation Initiatives in NLP, European ACL, Budapest, April.
Linguistic Search Engine. Proc. Corpus Linguistics 2003. Lancaster, March.
2002
(with Michael Rundell) Lexical profiling software and its lexicographic applications – a case study. Proc EURALEX, Copenhagen, August: 807-818.
(with Rob Koeling) Evaluating the WASPbench, a lexicography tool incorporating word sense disambiguation. Proc. ICON International Conference on Natural Language Processing. December, Mumbai. Vikas Publishing, New Delhi: 165-174.
2001
(with Gabriela Cavaglià) Corpora from the web. Proc 4th CLUK colloquium, Sheffield, January.
Web as Corpus. Proc Corpus Linguistics 2001, Lancaster, UK. March. Reprinted in Corpus Linguistics: Readings in a widening discipline, Sampson and McCarthy, editors. Continuum International.
(with David Tugwell) WORD SKETCH: Extraction and Display of Significant Collocations for Lexicography. Proc ACL workshop COLLOCATION: Computational Extraction, Analysis and Exploitation. Toulouse, July: 32-38.
(with David Tugwell) WASP-Bench: an MT Lexicographers’ Workstation Supporting State-of-the-art Lexical Disambiguation. Proc MT Summit VIII, Santiago de Compostela, Spain, September: 187-190.
English Lexical Sample Task Description. Proc ACL-SIGLEX SENSEVAL workshop, Toulouse, July: 17-20.
(with David Tugwell) WASPBENCH: a lexicographic tool supporting WSD. Proc ACL-SIGLEX SENSEVAL workshop, Toulouse, July: 151-154.
2000
(with Joseph Rosenzweig) English SENSEVAL: Framework and Results. Proc. Second Intnl Conf on Language Resources and Evaluation. Athens, Greece: 1239-1244.
(with Colin Yallop) What’s in a thesaurus? Proc. Second Intnl Conf on Language Resources and Evaluation. Athens, Greece: 1371-1379.
(with Tomaz Erjavec, Roger Evans, Nancy Ide) The Concede Model for Lexical Databases. Proc. Second Intnl Conf on Language Resources and Evaluation. Athens, Greece: 355-362.
(with Nancy Ide, Laurent Romary) A formal model of dictionary structure and content. Proc. EURALEX, Stuttgart, Germany: 113-126.
(with David Tugwell) Harnessing the Lexicographer in the Quest for Accurate Word Sense Disambiguation. Proc. 3rd Int. Workshop on Text, Speech, Dialogue (TSD 2000), Brno, Czech Republic. Lecture Notes in Artificial Intelligence, Springer-Verlag: 9-14.
1999
(with Nadjet Bouayad-Agha) Duplication in Corpora. Proc 2nd CLUK Colloquium, Colchester, Essex, Jan.
95% Replicability for Manual Word Sense tagging. Proc. EACL, Bergen, Norway, June: 277-288.
1998
SENSEVAL: An Exercise in Evaluating Word Sense Disambiguation Programs. Proc. EURALEX, Liège, Belgium, August: 167-174.
Bridging the gap between lexicon and corpus: convergence of formalisms. Proc. Workshop on Adapting Lexical and Corpus Resources. Granada, Spain, May.
(with Tony Rose) Measures for corpus similarity and homogeneity. Proc. 3rd Conf. on Empirical Methods in Natural Language Processing (EMNLP-3), Granada, Spain, June: 46-52.
SENSEVAL: An Exercise in Evaluating Word Sense Disambiguation Programs. Proc. First Intnl Conf on Language Resources and Evaluation. Granada, Spain, May: 581-588.
The Generative Lexicon and the Nunbergian Lexicon. Proc Utrecht Congress on Storage and Computation in Linguistics. Utrecht, The Netherlands, October.
1997
Evaluating word sense disambiguation: progress report. Proc. SALT Workshop on Evaluation of Speech and Language Technology Sheffield, June: 114-120.
Foreground and Background Lexicons and Word Sense Disambiguation for Information Extraction. Proc. Workshop on Lexicon Driven Information Extraction. Frascati, Italy, July: 51-62.
Using Word Frequency Lists to Measure Corpus Homogeneity and Similarity between Corpora. Proc. 5th ACL SIGDAT Workshop on Very Large Corpora. Beijing and Hong Kong, August: 231-245.
What is Word Sense Disambiguation Good For? Proc. Natural Language Processing in the Pacific Rim (NLPRS ’97), Phuket, Thailand, December: 209-214.
1996
Which words are particularly characteristic of a text? A survey of statistical approaches. Language Engineering for Document Analysis and Recognition. Proceedings, AISB Workshop, Falmer, Sussex.
(with Raphael Salkie) Corpus similarity and homogeneity via word frequency. Proc. EURALEX, Gothenburg, Sweden: 121-130.
Why chi-square doesn’t work, and an improved LOB-Brown comparison. Proc. ALLC-ACH Conference. Bergen, Norway: 169-172.
Word senses are not bona fide objects: implications for cognitive science, formal semantics, NLP. Proc. 5th Conf. on the Cognitive Science of Natural Language Processing, Dublin: 193-200.
1995
(with Roger Evans) MRDs, Dictionaries, and How To Do Lexical Engineering. Proc. Second Language Engineering Convention. London: 125-134.
1994
The Myth of Completeness and Some Problems with Consistency. Proc. EURALEX, Amsterdam: 101-106.
Corpus Use at Longman. Teaching and Language Corpora. Lancaster.
A Dictionary for Language Generation. Papers on Computational Lexicography. Budapest: Hungarian Academy of Sciences: 127-136.
1993
Inheriting Verb Alternations. Proc. 6th Conference of the European Chapter of the Assn. for Computational Linguistics. Utrecht, Holland: 213-221.
1992
Inheriting Polysemy. Proc. Second Seminar on Computational Lexical Semantics. Toulouse, France: 198-211.
1991
Corpus Word Usages and Dictionary Word Senses: What is the Match? Using Corpora: Proc. Seventh Ann. Conf. of the UW Centre for the New OED. Oxford: 23-29.
1990
An Analysis of Distinctions Between Dictionary Word Senses. Papers on Computational Lexicography: Balatonfured, Hungary: 111-126.
Book Reviews
Christiane Fellbaum, ed. WordNet: an electronic lexical database. Review in Language 76 (3) 2000: 706-708.
Randolph Quirk Grammatical and Lexical Variance in English. Review in Linguistics 36, 1998: 209-212.
Chengming Guo Machine Tractable Dictionaries: Design and Construction. Review in Linguistics 35 (5), 1997.
Yorick Wilks, Brian Slator and Louise Guthrie Electric Words: Dictionaries, Computers and Meanings. Review in Linguistics 35 (5), 1997.
Henri Béjoint Tradition and Innovation in Modern English Dictionaries. Review in Linguistics 34 (6), 1996. Pp 1278-1281.
Donald Walker, Antonio Zampolli and Nicoletta Calzolari Automating the Lexicon: Research and Practice in a Multilingual Environment. Review in Machine Translation Review 3, 1996. Pp 35-38.
Douglas Biber Dimensions of Register Variation. Review in Journal of Natural Language Engineering 1 (4), 1995: 396-399.
Gregory Grefenstette Explorations in Automatic Thesaurus Discovery. Review in Journal of Natural Language Engineering 1 (4), 1995: 396-399.
Christina Alm-Arvius The English Verb See: A Study in Multiple Meaning. Review in Linguistics 33 (3), 1995. Pp 611-613.
Garrison Cottrell A Connectionist Approach to Word Sense Disambiguation. Review in AISB Quarterly 72, 1990. Pp 41-42.
Journalism, various
Words in Asia: report on the Asialex Conference. EURALEX Newsletter, Autumn.
ELSNEWS 8.2 (June 1999) Don’t be a Dictionary Dentist.
LE Journal (Summer 1998) Lexicographic Quality.
ELRA (European Languages Resources Association) Newsletter 2 (2) 1997: Evaluating Word Sense Disambiguation Programs.
English Language Gazette, August 2002. How to learn about fish.