Research Software & Building Useful Data from Absence
On February 7th, one of the Seminar’s very own participants led our lunchtime discussion: Dr. Matthew Lincoln, a research software engineer at Carnegie Mellon University Libraries, talked with us about museum informatics, archive management, and computational approaches to humanities projects. Although his transition to software engineering is relatively recent, his experience with data modelling and analysis is not. Before his move to Carnegie Mellon, Dr. Lincoln earned a Ph.D. in art history from the University of Maryland, where he used computational methods to study 16th- to 18th-century Dutch printmakers. This, along with his data engineering work on the Getty Research Institute’s Getty Provenance Index Databases, makes him uniquely attuned to multiple aspects of building data sets and archiving. As Dr. Lincoln explained during his talk, using large data sets as a Ph.D. candidate (what he called the “available technology”) alerted him to particular data absences within library and museum holdings; in other words, researchers can only carry out the large-scale digital projects for which data actually exist. If you’ve ever searched for an eBook only to find that a digital version of the text does not (yet) exist, you know this feeling. It is, on a smaller scale, the same feeling a researcher might have if they wanted, for example, to compare one library system’s entire collection to another’s, only to find that there is no usable data with which to do such a project. The project idea is there; the necessary data are not. This is where and why Dr. Lincoln’s job becomes so essential; his work has helped …