Award-winning classic papers in ML and NLP

I was trying to find a consolidated list of papers in machine learning (ICML, NIPS, AAAI, SIGIR) and natural language processing (ACL, EMNLP, NAACL) published since 2000 that are held in high regard, for example by winning a test-of-time award at one of these major conferences. However, there seems to be no such list, or if there is one, it is buried deep enough that it may be quicker to prepare a list of my own. I will list the papers in reverse chronological order of their publication year.

2009

A General, Abstract Model of Incremental Dialogue Processing. David Schlangen and Gabriel Skantze. EACL 2009. Honorable mention at NAACL 2018
An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems. Ehud Reiter and Anja Belz. Computational Linguistics (2009). Honorable mention at NAACL 2018

2008

A unified architecture for natural language processing: deep neural networks with multitask learning. Ronan Collobert and Jason Weston. ICML 2008. Test-of-time award at ICML 2018
Cheap and Fast – But is it Good?: Evaluating Non-Expert Annotations for Natural Language Tasks. Snow, O’Connor, Jurafsky, and Ng. EMNLP 2008. Honorable mention at NAACL 2018
Modeling Local Coherence: An entity-based approach. Regina Barzilay and Mirella Lapata. Computational Linguistics (2008). Honorable mention at NAACL 2018

2007

Random features for large scale kernel machines. Ali Rahimi and Ben Recht. NIPS 2007. Test-of-time award at NIPS 2017
Combining Online and Offline Knowledge in UCT. Sylvain Gelly and David Silver. ICML 2007. Test-of-time award at ICML 2017
Pegasos: Primal estimated sub-gradient solver for SVM. Shalev-Shwartz et al. ICML 2007. Honorable mention at ICML 2017
A Bound on the Label Complexity of Agnostic Active Learning. Steve Hanneke. ICML 2007. Honorable mention at ICML 2017
Frustratingly Easy Domain Adaptation. Hal Daumé III. ACL 2007. Honorable mention at NAACL 2018

2006

Dynamic topic models. David Blei and John Lafferty. ICML 2006. Test-of-time award at ICML 2016
Improving web search ranking by incorporating user behavior information. Agichtein et al. SIGIR 2006. Test-of-time award at SIGIR 2018

2005

Learning to Rank Using Gradient Descent. Burges et al. ICML 2005. Test-of-time award at ICML 2015
Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Wilson, Wiebe, and Hoffmann. EMNLP 2005. Honorable mention at NAACL 2018

2004

Multiple kernel learning, conic duality, and the SMO algorithm. Bach, Lanckriet, and Jordan. ICML 2004. 10 year paper award at ICML 2014
A Linear Programming Formulation for Global Inference in Natural Language Tasks. Dan Roth and Wen-tau Yih. CoNLL 2004. Honorable mention at NAACL 2018
Evaluating Content Selection in Summarization: The Pyramid Method. Ani Nenkova and Rebecca Passonneau. NAACL 2004. Honorable mention at NAACL 2018
TextRank: Bringing Order into Texts. Rada Mihalcea and Paul Tarau. EMNLP 2004. Honorable mention at NAACL 2018
Trainable sentence planning for complex information presentation in spoken dialog systems. Stent, Prasad, and Walker. ACL 2004. Honorable mention at NAACL 2018

2003

Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions. Zhu, Ghahramani, and Lafferty. ICML 2003. Classic paper prize at ICML 2013
Online Convex Programming and Generalized Infinitesimal Gradient Ascent. Martin Zinkevich. ICML 2003. Classic paper prize at ICML 2013
Anaphora and Discourse Structure. Webber et al. Computational Linguistics (2003). Honorable mention at NAACL 2018
Minimum Error Rate Training In Statistical Machine Translation. Franz Och. ACL 2003. Honorable mention at NAACL 2018
Probabilistic Text Structuring: Experiments with Sentence Ordering. Mirella Lapata. ACL 2003. Honorable mention at NAACL 2018
Sentence Level Discourse Parsing using Syntactic and Lexical Information. Radu Soricut and Daniel Marcu. NAACL 2003. Honorable mention at NAACL 2018

2002

An Unsupervised Method for Word Sense Tagging using Parallel Corpora. Mona Diab and Philip Resnik. ACL 2002. Honorable mention at NAACL 2018
BLEU: a Method for Automatic Evaluation of Machine Translation. Papineni et al. ACL 2002. Test-of-time award at NAACL 2018
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. Michael Collins. EMNLP 2002. Test-of-time award at NAACL 2018
Thumbs up?: Sentiment Classification using Machine Learning Techniques. Pang, Lee, and Vaithyanathan. EMNLP 2002. Test-of-time award at NAACL 2018
Unsupervised Discovery of Morphemes. Mathias Creutz and Krista Lagus. SIGPHON 2002. Honorable mention at NAACL 2018

2001

2000

Algorithms for non-negative matrix factorization. Daniel Lee and H. Sebastian Seung. NIPS 2000. Classic paper award at NIPS 2013
Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers. Erin Allwein, Robert Schapire, and Yoram Singer. ICML 2000. Best 10 year paper award at ICML 2010
PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. Natalya Noy and Mark Musen. AAAI 2000. Classic paper award at AAAI 2018

Some random observations:

NLP venues didn’t really have a classic paper section until this year’s NAACL, which is probably why so many papers were nominated.
2001 seems to have been a dismal year for NLP, with no papers that held up in the long run. By contrast, the community appears to have bounced back the following year, with all three NAACL 2018 test-of-time awards going to papers from 2002.
I have no idea why BLEU won. It was supposed to be an “understudy,” which is pretty clear from its name. The fact that it is still used as an evaluation metric says more about a general failure to construct better metrics than about its strength.
Since the papers are all from before 2010, deep learning is conspicuous by its absence. In fact, Collobert and Weston’s ICML’08 paper on a unified architecture for NLP is the only deep learning paper on the list.
Ali Rahimi’s “ML is alchemy” talk at NIPS’17 got a lot of attention, probably much more than his paper on random features.

Other similar lists

Best paper award winners in Computer Science