Information Ecosystems

Information, Power, and Consequences


Research Software & Building Useful Data from Absence

By: Jane Rohrer
On: February 7, 2020
In: Matthew Lincoln
Tagged: Curation, Data, data visualization, Information Ecosystems, Museums

On February 7th, one of the Seminar’s very own participants led our lunchtime discussion: Dr. Matthew Lincoln, a research software engineer at Carnegie Mellon University Libraries, talked with us about museum informatics, archive management, and computational approaches to humanities projects. Although his transition to software engineering is relatively recent, his experience with data modelling and analysis is not. Before his move to Carnegie Mellon, Dr. Lincoln earned a Ph.D. in art history from the University of Maryland, where he used computational methods to study 16th- to 18th-century Dutch printmakers. This, along with his data engineering work on the Getty Research Institute’s Getty Provenance Index Databases, makes him uniquely attuned to multiple aspects of building data sets and archiving. As Dr. Lincoln explained during his talk, working with large data sets as a Ph.D. candidate (what he called the “available technology”) alerted him to particular data absences within library and museum holdings; in other words, researchers can only carry out the large-scale digital projects for which data actually exist.

If you’ve ever searched for an eBook only to find that a digital version of the text does not (yet) exist, you know this feeling. It is, on a smaller scale, the same feeling a researcher might have if they wanted, for example, to compare one library system’s entire collection to another’s, only to find that no usable data exist for such a project. The project idea is there; the necessary data are not. This is where and why Dr. Lincoln’s job becomes so essential. His work has helped individuals browse museum archives, and exploratory browsing tools are incredibly useful if you, like so many people, don’t yet know exactly what you’re looking to do or discover within an archive. But if you do happen to know, research software engineers like Dr. Lincoln will invest their time and data prowess in carrying out a project with a deliverable. Such projects he has worked on include CMU’s Encyclopedia of the History of Science and konigsbergr, an R package for finding a path across all of the bridges in Pittsburgh.

Dr. Lincoln pointed out that our current version of the internet is neither inevitable nor permanent. Before we arrived here, plenty of people had plenty of ideas about how the internet might look, feel, and function. Many of those hopes included plans for massive, centralized, and institutionalized databases and archives, even more centralized and powerful than, say, the National Archives or Facebook’s massive log of data on its users’ clicks, conversations, likes, locations, and restaurant reviews. While Dr. Lincoln was careful to note that these massive collections carry particular and important dangers (Facebook is an especially useful case study), there is certainly something exciting about the possibility of creating and maintaining archives that are truly accessible to the everyday user.

So let’s say someone visits Philadelphia’s Barnes Foundation museum on their day off. They almost certainly won’t have time to take in every piece of art in the collection, and they might not make regular trips back to see later exhibits, meaning that this individual really only experiences a slim selection of the Foundation’s full holdings. It is an exciting thing, then, that you can browse and search through the entire collection online. You don’t have to know exactly what you want, either: the website allows users to search by similarities between pieces in colors, lines, light, and space. And if you’re even less picky, you can just click “shuffle” on the whole thing. This is not a replacement for the experience of actually moving your body through a museum. Rather, the website promises that there is far more out there to find, discover, and analyze than a single afternoon’s browse could possibly contain.

Archives can help us learn what we don’t already know about our world and the ones before ours, just as informatics helps us understand the big picture of the data we use and create, often without even realizing it, every day. The structures we use to access information or objects (whether it’s a library book, a sculpture, a biographical note about your favorite poet, or that citation that would perfectly complement your research paper’s argument) do not materialize on their own. Research software engineers like Dr. Lincoln track down information, make it usable, and see that it continues to be usable for those who come after us. It is important to remember that whoever first conceived of CMU’s Encyclopedia of the History of Science is unlikely to be the last or only person to find it extraordinarily useful. And, as Dr. Lincoln himself notes, it is impossible to predict how existing models and structures might spark future knowledge, let alone models we don’t yet have.

So while the project of building cross-disciplinary consensus around data and their use is still very much in progress, and a truly accessible (or perhaps Open) data world can thus seem like a pipe dream, Matthew Lincoln’s Sawyer Seminar talk was a highly useful reminder that there are already excellent resources out there to help us build and maintain the data we wish existed.

 

 

 
