Andrea Thomer, assistant professor at the University of Arizona School of Information, discussed natural history data curation.
An estimated 1 to 3 billion natural history specimens exist worldwide, and many of them are uncataloged: seashells and insects stuffed in drawers, animal remains floating in jars. Digitizing these specimens is laborious. The specimens, field notes, and other data stored in natural history collections can be crucial for studies of past and ongoing climate change, but only if they can be transformed into computationally ready datasets.
In her talk on March 1, Thomer described the CHANGES (Collections, Heterogenous data And Next Generation Ecological Synthesis) project, in which her team is developing approaches to curate the rich but underutilized longitudinal datasets that are often stored in the archives of natural history collections and surveys. Working with more than 100 years of archival records from the Michigan Institute for Fisheries Research, the team used the Zooniverse community science platform to ask friendly strangers from the internet to help transcribe more than 100,000 data cards. Extensive data curation is needed both before and after records are entered in Zooniverse. Although the team has developed some workflows that will likely generalize to similar projects, considerable curation ‘by hand’ is still needed. The team also found that digitization reveals the human idiosyncrasies that inevitably shape any artifact created by many people over many years.
Watch a recording of the talk here: