Matthew Lincoln

Research Software & Building Useful Data from Absence

On: February 7, 2020

Tagged: Curation, Data, data visualization, Information Ecosystems, Museums

On February 7th, one of the Seminar’s very own participants headed our lunchtime discussion; Dr. Matthew Lincoln, a research software engineer at Carnegie Mellon University Libraries, talked with us about museum informatics, archive management, and computational approaches to humanities projects. Although his transition to software engineer is relatively recent, his experience with data modelling and analysis is definitely not—before his move to Carnegie Mellon, Dr. Lincoln earned a Ph.D. in art history from University of Maryland, where he used computational methods to study 16th-18th century Dutch printmakers. This, along with his work on data engineering at the Getty Research Institute’s Getty Provenance Index Databases, makes him uniquely attuned to multiple aspects of building data sets and archiving. As Dr. Lincoln himself articulated during his talk, using large data sets as a Ph.D. candidate—what he worded as the “available technology”—alerted him to particular data absences within library and museum holdings; in other words, researchers can only carry out the large-scale digital projects that data actually exist for. If you’ve ever searched for an eBook only to find that a digital version of this text does not (yet) exist, you know this feeling; it is, on a smaller scale, the same feeling a researcher might have if they, for example, wanted to compare one particular library system’s entire collection to another—but there is no usable data with which to do such a project. The project idea is there, the necessary data is not. This is where and why Dr. Lincoln’s job becomes so essential; his work has helped Read More

What you can see in museums is just the tip of the iceberg

By: Erin O'Rourke

On: February 6, 2020

In: Matthew Lincoln

Tagged: Curation, Data, Information Ecosystems, Linked Open Data, Matt Lincoln, Museums

While all of the Sawyer Seminar speakers so far have been scholars or users of information ecosystems, Matt Lincoln is potentially unique in coding them. His Ph.D. in Art History, time as a data research specialist at the Getty Research Institute, and most recently, work as a research software engineer at Carnegie Mellon University have given him substantial knowledge about museums’ information systems, as well as the broader context of the seminar. For Lincoln, “data” consists of collections of art and associated facts and metadata. In his public talk, entitled “Ways of Forgetting: The Librarian, The Historian, and the Machine,” Dr. Lincoln focused on a case study from his time at the Getty, in which he was working on a project restructuring the way art provenance data were organized in databases. Lincoln argued that depending on who the creator or end-user of the information would be (whether librarian, historian or computer), the way the data are structured can vary. A historian would likely prefer open-ended text fields in which to establish a rich context with details specific to the piece, whereas a librarian would opt to record the same details about every piece, and a computer would prefer the data to be stored in some highly structured format, with lists of predefined terms that can populate each field. On top of balancing these disparate goals, Lincoln cited a particularly poignant Jira ticket, which asked: “Are we doing transcription of existing documents or trying to represent reality?” This question might well be answered with “both” since the Read More

Tags