I was trying to find a consolidated list of papers in machine learning (ICML, NIPS, AAAI, SIGIR) and natural language processing (ACL, EMNLP, NAACL), published since 2000, that are held in high regard, for instance by winning a test-of-time award at one of these major conferences. However, there seems to be no such list, or if there is one, it is buried too deep to find, so it may be quicker to put one together myself. I will list the papers in reverse chronological order of publication year.
2009
- A General, Abstract Model of Incremental Dialogue Processing. David Schlangen and Gabriel Skantze. EACL 2009. Honorable mention at NAACL 2018
2008
- A unified architecture for natural language processing: deep neural networks with multitask learning. Ronan Collobert and Jason Weston. ICML 2008. Test-of-time award at ICML 2018
- Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks. Snow, O’Connor, Jurafsky, and Ng. EMNLP 2008. Honorable mention at NAACL 2018
- Modeling Local Coherence: An entity-based approach. Regina Barzilay and Mirella Lapata. Computational Linguistics (2008). Honorable mention at NAACL 2018
2007
- Random features for large scale kernel machines. Ali Rahimi and Ben Recht. NIPS 2007. Test-of-time award at NIPS 2017
- Combining Online and Offline Knowledge in UCT. Sylvain Gelly and David Silver. ICML 2007. Test-of-time award at ICML 2017
- Pegasos: Primal estimated sub-gradient solver for SVM. Shalev-Shwartz et al. ICML 2007. Honorable mention at ICML 2017
- A Bound on the Label Complexity of Agnostic Active Learning. Steve Hanneke. ICML 2007. Honorable mention at ICML 2017
- An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems. Ehud Reiter and Anja Belz. Computational Linguistics (2009). Honorable mention at NAACL 2018
- Frustratingly Easy Domain Adaptation. Hal Daumé III. ACL 2007. Honorable mention at NAACL 2018
2006
- Dynamic topic models. David Blei and John Lafferty. ICML 2006. Test-of-time award at ICML 2016
- Improving web search ranking by incorporating user behavior information. Agichtein et al. SIGIR 2006. Test-of-time award at SIGIR 2018
2005
- Learning to Rank Using Gradient Descent. Burges et al. ICML 2005. Test-of-time award at ICML 2015
- Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis. Wilson, Wiebe, and Hoffmann. EMNLP 2005. Honorable mention at NAACL 2018
2004
- Multiple kernel learning, conic duality, and the SMO algorithm. Francis Bach, Gert Lanckriet, and Michael Jordan. ICML 2004. 10 year paper award at ICML 2014
- A Linear Programming Formulation for Global Inference in Natural Language Tasks. Dan Roth and Wen-tau Yih. CoNLL 2004. Honorable mention at NAACL 2018
- Evaluating Content Selection in Summarization: The Pyramid Method. Ani Nenkova and Rebecca Passonneau. NAACL 2004. Honorable mention at NAACL 2018
- TextRank: Bringing Order into Texts. Rada Mihalcea and Paul Tarau. EMNLP 2004. Honorable mention at NAACL 2018
- Trainable sentence planning for complex information presentation in spoken dialog systems. Stent, Prasad, and Walker. ACL 2004. Honorable mention at NAACL 2018
2003
- Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions. Zhu, Ghahramani, and Lafferty. ICML 2003. Classic paper prize at ICML 2013
- Online Convex Programming and Generalized Infinitesimal Gradient Ascent. Martin Zinkevich. ICML 2003. Classic paper prize at ICML 2013
- Anaphora and Discourse Structure. Webber et al. Computational Linguistics (2003). Honorable mention at NAACL 2018
- Minimum Error Rate Training In Statistical Machine Translation. Franz Och. ACL 2003. Honorable mention at NAACL 2018
- Probabilistic Text Structuring: Experiments with Sentence Ordering. Mirella Lapata. ACL 2003. Honorable mention at NAACL 2018
- Sentence Level Discourse Parsing using Syntactic and Lexical Information. Radu Soricut and Daniel Marcu. NAACL 2003. Honorable mention at NAACL 2018
2002
- An Unsupervised Method for Word Sense Tagging using Parallel Corpora. Mona Diab and Philip Resnik. ACL 2002. Honorable mention at NAACL 2018
- BLEU: a Method for Automatic Evaluation of Machine Translation. Papineni et al. ACL 2002. Test-of-time award at NAACL 2018
- Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. Michael Collins. EMNLP 2002. Test-of-time award at NAACL 2018
- Thumbs up?: Sentiment Classification using Machine Learning Techniques. Pang, Lee, and Vaithyanathan. EMNLP 2002. Test-of-time award at NAACL 2018
- Unsupervised Discovery of Morphemes. Mathias Creutz and Krista Lagus. SIGPHON 2002. Honorable mention at NAACL 2018
2001
2000
- Algorithms for non-negative matrix factorization. Daniel Lee and H. Sebastian Seung. NIPS 2000. Classic paper award at NIPS 2013
- Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers. Erin Allwein, Robert Schapire, and Yoram Singer. ICML 2000. Best 10 year paper award at ICML 2010
- PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. Natalya Noy and Mark Musen. AAAI 2000. Classic paper award at AAAI 2018
Some random observations:
- NLP venues didn’t really have a classic paper award until this year’s NAACL (2018), which is probably why so many papers were nominated.
- 2001 seems to have been a dismal year for NLP, with no papers from that year recognized here. By contrast, the community appears to have bounced back the following year, with all 3 NAACL 2018 test-of-time awards given to papers from 2002.
- I have no idea why BLEU won. It was supposed to be an “understudy,” which is pretty clear from its name (Bilingual Evaluation Understudy). The fact that it is still being used as an evaluation metric says more about a general failure to construct better metrics than about its strength.
- Since the papers are from before 2010, deep learning is conspicuous by its absence. In fact, Collobert and Weston’s ICML’08 paper on a unified architecture for language is the only such paper.
- Ali Rahimi’s “ML is alchemy” talk at NIPS’17 got a lot of attention, probably much more than his paper on random features.