RI: Medium: Collaborative Research: Closing the User-Model Loop for Understanding Topics in Large Document Collections

Research into effective environments for information exploration has traditionally been split between two distinct areas: models and interfaces. Machine learning researchers build ever more complicated models, and human-computer interaction researchers build shiny interfaces that assume a static model. The lack of communication between these groups comes at a price: for many years, the objective function that machine learning researchers used to build topic models (likelihood on held-out data) was negatively correlated with human judgments of topic model quality. Similarly, existing interfaces do not allow the user to correct or guide topic models. This proposal bridges this divide by bringing together machine learning researchers and human-computer interaction researchers to build topic model interfaces that allow users to interactively refine the entirety of the topic modeling process: the number and granularity of topics; vocabulary selection, that is, the words considered (or ignored) by the model; and interactive constraints on which words appear together in topics. Moreover, documents do not appear in isolation; effective analysis must also include document metadata, which lets users explore and interact with the data through a metadata map interface. While these options have been available to machine learning experts for years, they have not been incorporated into new, more broadly available interfaces.

Duration: 
August 2014 - July 2017
Funder: 
National Science Foundation
Total Award Amount: 
$650,000

Principal Investigator: