Research Papers
Salton Award Lecture
(July 29, 9:00 - 10:30)
Session 1: Retrieval Models
(July 29, 11:00 - 12:30)
-
Bayesian extension to the Language Model for ad hoc Information Retrieval.
Hugo Zaragoza (Microsoft Research), Djoerd Hiemstra (U. Twente), Michael Tipping (Microsoft Research), Stephen Robertson (Microsoft Research).
-
Beyond independent relevance: Methods and evaluation metrics for subtopic retrieval.
ChengXiang Zhai (University of Illinois at Urbana-Champaign), William W. Cohen (Carnegie Mellon University), John Lafferty (Carnegie Mellon University).
-
Empirical Development of an Exponential Probabilistic Model for Text Retrieval: Using Textual Analysis to Build a Better Model.
Jaime Teevan (MIT AI Lab), David Karger (MIT LCS).
Session 2: Question Answering
(July 29, 14:00 - 15:30)
-
Question Classification using Support Vector Machines.
Dell Zhang (Department of Computer Science, School of Computing, National University of Singapore), Wee Sun Lee (Department of Computer Science, School of Computing, National University of Singapore).
-
Structured Use of External Knowledge for Event-based Open Domain Question Answering.
Hui Yang (School of Computing, National University of Singapore), Tat-Seng Chua (School of Computing, National University of Singapore), Shuguang Wang (School of Computing, National University of Singapore), Chun-Keat Koh (School of Computing, National University of Singapore).
-
Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering.
Stefanie Tellex (MIT AI Lab), Boris Katz (MIT AI Lab), Jimmy Lin (MIT AI Lab), Aaron Fernandes (MIT AI Lab), Gregory Marton (MIT AI Lab).
Session 3: Web
(July 29, 16:00 - 17:30)
-
Building a Web Thesaurus from Web Link Structure.
Zheng Chen (Microsoft Research Asia), Shengping Liu (Peking University of China), Liu Wenyin (City Univ. of Hong Kong), Geguang Pu (Peking University of China), Wei-Ying Ma (Microsoft Research Asia).
-
User Access Pattern Enhanced Small Web Search.
Gui-Rong Xue (Computer Science and Engineering Department, Shanghai Jiao-Tong University), Hua-Jun Zeng (Microsoft Research Asia), Zheng Chen (Microsoft Research Asia), Wei-Ying Ma (Microsoft Research Asia), Hong-Jiang Zhang (Microsoft Research Asia), Chao-Jun Lu (Computer Science and Engineering Department, Shanghai Jiao-Tong University).
-
Query Type Classification for Web Document Retrieval.
In-Ho Kang (KAIST), GilChang Kim (KAIST).
Session 4: Human Interaction
(July 30, 9:00 - 10:30)
-
Stuff I've Seen: A System for Personal Information Retrieval and Re-Use.
Susan Dumais (Microsoft Research), Edward Cutrell (Microsoft Research), JJ Cadiz (Microsoft Research), Gavin Jancke (Microsoft Research), Raman Sarin (Microsoft Research), Daniel C. Robbins (Microsoft Research).
-
Search Strategies in Content-Based Image Retrieval.
Sharon McDonald (University of Sunderland), John Tait (University of Sunderland).
-
Using Terminological Feedback for Web Search Refinement - A Log-based Study.
Peter Anick (AltaVista).
Session 5A: Text Categorization
(July 30, 11:00 - 12:30)
-
A Scalability Analysis of Classifiers in Text Categorization.
Yiming Yang (Carnegie Mellon University), Jian Zhang (Carnegie Mellon University), Bryan Kisiel (Carnegie Mellon University).
-
A repetition based measure for verification of text collections and for text categorization.
Dmitry V. Khmelev (University of Toronto and Moscow State University), William J. Teahan (University of Wales, Bangor).
-
Using Asymmetric Distributions to Improve Text Classifier Probability Estimates.
Paul N. Bennett (Carnegie Mellon University).
Session 5B: Multimedia Information Retrieval
(July 30, 11:00 - 12:30)
-
Automatic Image Annotation and Retrieval Using Cross-Media Relevance Models.
J. Jeon (University of Massachusetts, Amherst), V. Lavrenko (University of Massachusetts, Amherst), R. Manmatha (University of Massachusetts, Amherst).
-
Modeling Annotated Data.
David M. Blei (U.C. Berkeley), Michael I. Jordan (U.C. Berkeley).
-
Experimental Result Analysis for a Generative Probabilistic Image Retrieval Model.
Thijs Westerveld (CWI), Arjen P. de Vries (CWI).
Session 6A: Structured Documents
(July 30, 14:30 - 15:30)
-
Combining Document Representations for Known-Item Search.
Paul Ogilvie (Carnegie Mellon University), Jamie Callan (Carnegie Mellon University).
-
Searching XML documents via XML fragments.
David Carmel (IBM Research Lab in Haifa), Yoelle Maarek (IBM Research Lab in Haifa), Matan Mandelbrod (IBM Research Lab in Haifa), Yosi Mass (IBM Research Lab in Haifa), Aya Soffer (IBM Research Lab in Haifa).
Session 6B: Text Representation
(July 30, 14:30 - 15:30)
-
Word Sense Disambiguation in Information Retrieval Revisited.
Christopher Stokoe (The University of Sunderland), John Tait (The University of Sunderland), Michael Oakes (The University of Sunderland).
-
Probabilistic Term Variant Generator for Biomedical Terms.
Yoshimasa Tsuruoka (CREST, Japan Science and Technology Corporation), Jun'ichi Tsujii (the University of Tokyo).
Session 7A: Text Categorization
(July 30, 16:00 - 17:30)
-
A Maximal Figure-of-Merit Learning Approach to Text Categorization.
Sheng Gao (Institute for Infocomm Research), Wen Wu (National University of Singapore), Chin-Hui Lee (Georgia Institute of Technology), Tat-Seng Chua (National University of Singapore).
-
Text Categorization by Boosting Automatically Extracted Concepts.
Lijuan Cai (Brown University), Thomas Hofmann (Brown University).
-
Robustness of Regularized Linear Classification Methods in Text Categorization.
Jian Zhang (School of Computer Science, Carnegie Mellon University), Yiming Yang (School of Computer Science, Carnegie Mellon University).
Session 7B: Human Interaction
(July 30, 16:00 - 17:30)
-
Building and Applying a Concept Hierarchy Representation of a User Profile.
Nikolaos Nanas (The Open University), Victoria Uren (The Open University), Anne De Roeck (The Open University), John Domingue (The Open University).
-
Query Length in Interactive Information Retrieval.
Nicholas Belkin (Rutgers University), Colleen Cool (Queens College, CUNY), Diane Kelly (Rutgers University), Giyeong Kim (Rutgers University), Ja-Young Kim (Rutgers University), Hyuk-Jin Lee (Rutgers University), Gheorghe Muresan (Rutgers University), Muh-Chyun Tang (Rutgers University), Xiao-Jun Yuan (Rutgers University).
-
Re-examining the potential effectiveness of interactive query expansion.
Ian Ruthven (University of Strathclyde).
Session 8A: IR Theory
(July 31, 11:00 - 12:30)
-
The Number of Orthogonal Factors in Latent Semantic Analysis.
Georges Dupret (IBM).
-
A Frequency-based and a Poisson-based Definition of the Probability of being Informative.
Thomas Roelleke (Queen Mary University of London).
-
Table Extraction Using Conditional Random Fields.
David Pinto (University of Massachusetts Amherst), Andrew McCallum (University of Massachusetts Amherst), Xing Wei (University of Massachusetts Amherst), W. Bruce Croft (University of Massachusetts Amherst).
Session 8B: Filtering and Retrieval Models
(July 31, 11:00 - 12:30)
-
Building a Filtering Test Collection for TREC 2002.
Ian Soboroff (NIST), Stephen Robertson (Microsoft Research Cambridge).
-
An Empirical Study on Retrieval Models for Different Document Genres: Patents and Newspaper Articles.
Makoto Iwayama (Hitachi Ltd.), Atsushi Fujii (University of Tsukuba), Noriko Kando (NII), Yuzo Marukawa (Tokyo Institute of Technology).
-
Collaborative Filtering via Gaussian Probabilistic Latent Semantic Analysis.
Thomas Hofmann (Brown University).
Session 9A: Clustering
(July 31, 14:00 - 15:30)
-
Document Clustering Based On Non-negative Matrix Factorization.
Wei Xu (NEC Labs America), Xin Liu (NEC Labs America), Yihong Gong (NEC Labs America).
-
ReCoM: Reinforcement Clustering of Multi-Type Interrelated Data Objects.
Jidong Wang (Microsoft Research, Asia), Huajun Zeng (Microsoft Research, Asia), Zheng Chen (Microsoft Research, Asia), Hongjun Lu (Department of Computer Science, Hong Kong University of Science and Technology), Li Tao (Microsoft Research, Asia), Wei-Ying Ma (Microsoft Research, Asia).
-
A Comparative Study on Content-based Music Genre Classification.
Tao Li (University of Rochester), Qi Li (University of Delaware), Mitsunori Ogihara (University of Rochester).
Session 9B: Distributed Information Retrieval
(July 31, 14:00 - 15:30)
-
Methods of estimating retrieval quality for a probabilistic model of resource selection.
Henrik Nottelmann (University of Duisburg-Essen), Norbert Fuhr (University of Duisburg-Essen).
-
Relevant Document Distribution Estimation Method for Resource Selection.
Luo Si (Carnegie Mellon University), Jamie Callan (Carnegie Mellon University).
-
SETS: Search Enhanced by Topic Segmentation.
Mayank Bawa (Stanford University), Gurmeet S Manku (Stanford University), Prabhakar Raghavan (Verity Inc.).
Session 10A: Novelty and Topic Change
(July 31, 16:00 - 17:30)
-
Retrieval and Novelty Detection at the Sentence Level.
James Allan (University of Massachusetts at Amherst), Courtney Wade (University of Massachusetts at Amherst), Alvaro Bolivar (University of Massachusetts at Amherst).
-
Domain-independent Text Segmentation Using Anisotropic Diffusion and Dynamic Programming.
Xiang Ji (Pennsylvania State University), Hongyuan Zha (Pennsylvania State University).
-
A System for New Event Detection.
Thorsten Brants (Palo Alto Research Center), Francine Chen (Palo Alto Research Center), Ayman Farahat (Palo Alto Research Center).
Session 10B: Cross-Lingual Information Retrieval
(July 31, 16:00 - 17:30)
-
Probabilistic Structured Query Methods.
Kareem Darwish (ECE), Douglas W. Oard (UMIACS).
-
Fuzzy Translation of Cross-lingual Spelling Variants.
Ari Pirkola (University of Tampere), Jarmo Toivonen (Tampere University of Technology), Heikki Keskustalo (University of Tampere), Kari Visala (University of Tampere), Kalervo Järvelin (University of Tampere).
-
Automatic Transliteration for Japanese-to-English Text Retrieval.
Yan Qu (Clairvoyance Corporation), Gregory Grefenstette (Clairvoyance Corporation), David A. Evans (Clairvoyance Corporation).