Information Ecosystems

Information, Power, and Consequences

Darwin

Data Pipelines, Data Fluidity: Colin Allen on the “Useful Fiction” of Curated Data

2020-02-28
By: Jane Rohrer
On: February 28, 2020
In: Colin Allen
Tagged: Big Data, Darwin, data pipelines, Topic modeling

Colin Allen, distinguished professor in the Department of History and Philosophy of Science at the University of Pittsburgh, is both an invited speaker and an ongoing participant in our Seminar; on February 28th, Dr. Allen talked with his fellow participants about his work on what he (and others) call “data pipelines.” Broadly speaking, a data pipeline means that data are collected and recorded in one of many particular ways but are eventually used for purposes other than those for which they were originally collected. And this means, Dr. Allen pointed out, that data are highly fluid, flexible, and even self-perpetuating. An especially potent example of this in Allen’s own work is his current role as Associate Editor of the Stanford Encyclopedia of Philosophy. While this project has a single, discrete start date back in 1995, it has been anything but static since then; as of March 2018, the site had approximately 1,600 entries, each of which is routinely reviewed and updated. Each new post adds to what is now a highly dynamic reference work containing data culled from all over the web: a pipeline, indeed. Dr. Allen thoughtfully pointed out that as our relationship to data changes over our collective futures, it is important to remember that data do not enter our world on their own; rather, they are collected and curated. Allen co-authored an article, “Exploration and Exploitation of Victorian Science in Darwin’s Reading Notebooks,” with Jaimie Murdock and Simon DeDeo in 2017. Charles Darwin left careful records of the books he read from 1837 to 1860, making this Read More

Self-perpetuating data and “guided serendipity”: Colin Allen’s reflection on Charles Darwin, topic modeling, and Margaret Floy Washburn

2020-02-27
By: Briana Wipf
On: February 27, 2020
In: Colin Allen
Tagged: Darwin, Topic modeling, Washburn

In his computational work, Colin Allen, distinguished professor in the Department of History and Philosophy of Science at the University of Pittsburgh, embraces the fact that the textual data he uses often depends not on his choices, but on someone else’s. Data does not emerge, fully formed, for him and his colleagues to study. He discussed this characteristic of data when he addressed the Information Ecosystems Mellon Sawyer Seminar at the University of Pittsburgh on Friday, Feb. 28. Data, as Johanna Drucker has memorably argued, isn’t data as much as it’s capta. If we remember that the Latin meaning of data is “things given” while capta is “things taken,” Drucker’s argument makes sense. The stuff we generate in our experiments or gather in the world doesn’t exist naturally. Rather, it’s taken or made (in which case I suppose we’d call it facta). Drucker’s formulation reminds us that data isn’t neutral but often exists according to the individual choice of this or that researcher, or this or that curator. Allen points out that the textual corpus he uses for a project (Darwin’s reading list, for example) is his data, and that it yields its own data when he runs a topic model of the corpus. The topics produced by the model are data he can then interpret in his own work. In this way, Allen explained to me when I interviewed him for an upcoming episode of the Information Ecosystems podcast, data has a habit of begetting more data. “I think it’s important to realize that Read More

    The Information Ecosystems Team 2023

    This site is part of Humanities Commons.