DATABASE

A Brief History

One of the most valuable assets of UCFC is its specimen database. As of June 2013, the entire pinned collection has been databased at the specimen level, with 507,280 specimen records, and this figure is growing rapidly. Fullerton initiated the retroactive databasing of the specimens in 1997, when the collection held about 148,000 specimens. He decided to format the unique identifier label as the collection coden (UCFC) followed by a 7-digit number (e.g. UCFC-0148001), and this format continues today. UCFC’s simple identifier labels are human-readable and serve the purpose of digitizing the collection well, so there is no plan to retroactively place barcode labels on the collection’s holdings.
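
For readers who work with the identifiers programmatically, the label format described above can be sketched in a few lines of Python. This is only an illustration, not code used by UCFC: the function names are hypothetical, and the hyphen separator is an assumption taken from the example UCFC-0148001.

```python
import re

# Assumption: the separator is a hyphen, as in the example UCFC-0148001.
COLLECTION_CODE = "UCFC"
LABEL_PATTERN = re.compile(r"UCFC-([0-9]{7})")


def format_label(serial: int) -> str:
    """Build a label such as 'UCFC-0148001' from a serial number."""
    if not 0 <= serial <= 9_999_999:
        raise ValueError(f"serial {serial} does not fit in 7 digits")
    # Zero-pad the serial to exactly seven digits.
    return f"{COLLECTION_CODE}-{serial:07d}"


def parse_label(label: str) -> int:
    """Return the serial number encoded in a label, or raise ValueError."""
    match = LABEL_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a valid UCFC label: {label!r}")
    return int(match.group(1))
```

A fixed-width, zero-padded number keeps the labels sortable as plain text, which is one practical advantage of this kind of format.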

In 1997, Fullerton worked with several undergraduate students with computer programming skills to create a relational database using FileMaker Pro version 3. Because there was no standard for digitizing natural history collections at that time, he decided to include as much information as possible in the database. Also, because most specimens were collected using traps, and every specimen from a given trap sample shares the same collecting information, he created two separate but associated databases: one capturing descriptive data from specimens (basic taxonomic information, sex, and unique specimen identifier) and another capturing lot data from collecting information (locality, geographic coordinates, date, habitat, trapping method, collector). Fullerton and the student volunteers began by databasing the collection retroactively; once they had caught up, new material was databased as it was added to the collection. In 2005, Fullerton hired a part-time computer programmer to upgrade the software to FileMaker Pro version 7 and make the database available online with some search capabilities via Instant Web Publishing (see Legacy Database). After 16 years of curation and databasing, the entire pinned collection has been digitized and 90% of the collection has been identified to at least the genus level. However, the FileMaker database is limited to simple search functions and, most importantly, it does not comply with the Biodiversity Information Standards, and the data are not provided to the Global Biodiversity Information Facility (GBIF).
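
The two-part design described above, in which many specimen records point to a single lot record, can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than FileMaker Pro; the table names, column names, and sample rows are all hypothetical placeholders and do not reflect the actual UCFC schema or data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one lot and two specimens."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Lot data: collecting information shared by a whole trap sample.
        CREATE TABLE lot (
            lot_id          INTEGER PRIMARY KEY,
            locality        TEXT,
            latitude        REAL,
            longitude       REAL,
            collecting_date TEXT,
            habitat         TEXT,
            trapping_method TEXT,
            collector       TEXT
        );
        -- Specimen data: one row per pinned specimen, linked to its lot.
        CREATE TABLE specimen (
            specimen_id TEXT PRIMARY KEY,
            family      TEXT,
            genus       TEXT,
            species     TEXT,
            sex         TEXT,
            lot_id      INTEGER NOT NULL REFERENCES lot(lot_id)
        );
        """
    )
    # Placeholder values only; none of these are real UCFC records.
    conn.execute(
        "INSERT INTO lot VALUES (1, ?, ?, ?, ?, ?, ?, ?)",
        ("Example locality", None, None, "1997-01-01",
         "example habitat", "example trap", "A. Collector"),
    )
    conn.executemany(
        "INSERT INTO specimen VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("UCFC-0000001", "ExampleFamily", "Examplegenus", None, "female", 1),
            ("UCFC-0000002", "ExampleFamily", "Othergenus", None, "male", 1),
        ],
    )
    conn.commit()
    return conn


def specimens_with_lot(conn: sqlite3.Connection) -> list:
    """Join each specimen to the collecting information of its lot."""
    return conn.execute(
        "SELECT s.specimen_id, s.genus, s.sex, l.locality, l.trapping_method "
        "FROM specimen AS s JOIN lot AS l ON l.lot_id = s.lot_id "
        "ORDER BY s.specimen_id"
    ).fetchall()
```

Because the collecting information is stored once per lot, it only has to be entered and corrected in one place, however many specimens came out of the same trap sample.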

In recognition of this major shortcoming, in 2011 we began converting the UCFC data into a format that is interoperable with the network of other natural history collections around the world, by developing a collaborative relationship with Norman Johnson at the Charles A. Triplehorn Insect Collection at Ohio State University (OSUC). OSUC is a registered data provider for GBIF through a DiGIR interface that delivers the data in Darwin Core format. Johnson is a worldwide leader in collection digitization and bioinformatics and, beginning in 1996, has built a very powerful yet extremely flexible database system using the Oracle RDBMS. There is also an ongoing collaboration between OSUC and iDigBio, and thus our data can be seamlessly integrated into the large network of natural history collections. As a regional collection, UCFC does not have the bioinformatics expertise needed to keep its data compliant with the ever-changing environment of collection digitization efforts.

Currently, we are converting our FileMaker data into a format that can be easily imported into the OSUC database system. Although the backbone of the new database resides at OSUC, the front end for displaying the UCFC holdings has been implemented natively in our own website and is currently available to the public (see Search UCFC Holdings). Data for the first two thousand specimens have already been converted, and we expect to complete the data conversion by the end of 2013. New material will be digitized in a manner compatible with the OSUC database, using an in-house web application to manage information within the database.
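
To give a rough idea of what such a conversion involves, the sketch below renames the fields of one combined specimen-and-lot record to Darwin Core terms. It is an illustration only and is not the OSUC import format: the input field names and the function name are hypothetical, while the output keys (catalogNumber, specificEpithet, eventDate, and so on) are standard Darwin Core terms.

```python
# Hypothetical local field names (left) mapped to Darwin Core terms (right).
FIELD_TO_DWC = {
    "specimen_id": "catalogNumber",
    "family": "family",
    "genus": "genus",
    "species": "specificEpithet",
    "sex": "sex",
    "locality": "locality",
    "latitude": "decimalLatitude",
    "longitude": "decimalLongitude",
    "collecting_date": "eventDate",
    "habitat": "habitat",
    "trapping_method": "samplingProtocol",
    "collector": "recordedBy",
}


def to_darwin_core(record: dict) -> dict:
    """Rename the fields of one record, skipping empty or missing values."""
    dwc = {
        term: record[field]
        for field, term in FIELD_TO_DWC.items()
        if record.get(field) not in (None, "")
    }
    # Pinned specimens are preserved specimens in Darwin Core vocabulary.
    dwc["basisOfRecord"] = "PreservedSpecimen"
    return dwc
```

For example, a record with specimen_id "UCFC-0148001" and an empty habitat field would yield a dictionary that contains catalogNumber but omits habitat.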