Label Reconstruction from Fluid-Preserved Lots

Chae Young Lee
David Skelly, Nelson Rios
Start Date: September 2019

(This project is conducted under Professor Skelly’s First-Year Seminar, EVST 040.) Collectively, U.S. museums and herbaria hold more than 500 million biological specimens. The number worldwide is estimated to be nearly 3 billion, with more than 90 percent of the associated data still locked away in cabinets, ledgers, labels, and other physical archives. Over the past three decades, digitization efforts have focused on bringing this “dark data” into the light of the digital world. To meet the growing need for digitization at scale, new workflows and technologies that enable high-throughput data capture from natural history specimens will need to be pioneered. This project will evaluate the potential of whole-jar scanning through multi-view imaging and subsequent label reconstruction for fluid-preserved specimen lots. It will generate multi-view images from a selection of lots and apply various techniques to extract label data from the images.