Fostering New Collaborations in Open Online Community Data Research: Prototyping an Open Collaboration Data Factory

Open online communities (OOC) have emerged as significant drivers of innovation and social well-being. Data plays a pivotal role in understanding and managing OOCs, but the diversity of tools, idiosyncratic data formats, and divergent domain perspectives limits our ability to work across data sets and extract value from them. Principles and prototypes for sharing data and detailed methodological approaches have not yet been developed for the study of OOCs. The proposed project closes these gaps by building a community centered on developing a novel and potentially transformative alternative model for interdisciplinary research on cross-domain OOC datasets: an Open Collaboration Data Factory (OCDF). Through a series of meetings, workshops, and hackathons, we will guide the community in 1) empirically identifying and validating the dimensions, criteria, and principles underlying shareable OOC datasets, 2) developing specifications for future tools, and 3) piloting the infrastructure for routinely creating this type of data resource.

Open online communities have emerged as significant drivers of innovation, economic activity, and social well-being. OOCs play important roles in a wide variety of areas, including but not limited to software development, general knowledge management, education, health, and scientific discovery. Scholars and practitioners from different disciplines (e.g. computer science, sociology, mathematics, economics, physics, anthropology, organization science, communications) engage in OOC research to improve our ability to understand these communities and to help citizens successfully manage and grow them. Data now plays a pivotal role in these attempts as tools for data collection, handling, and analysis have improved, but the diversity of tools, idiosyncratic data formats, and divergent domain perspectives has severely limited our ability to work across data sets, curtailing the value that can be extracted from these data resources.
Because interest in OOCs crosses both disciplinary and problem domain boundaries, techniques used to develop data standards within a domain are unlikely to prove successful.

Duration: 
August 2014 - July 2018
Funder: 
National Science Foundation
Total Award Amount: 
$128,121

Principal Investigator:

Additional Investigators