From Archive Data to DH Practice: Starting a new Digital Humanities Project

At the University of Miami’s Cuban Heritage Collection (CHC), archivist Amanda Moreno has been collaborating with the Digital Humanities Support team of the Richter Libraries to create “Global Cuba Collections,” a digital map of Cuban archives located in the U.S. and around the world. Below, I discuss some of the initial steps in developing “Global Cuba Collections” (GCC) as a Digital Humanities project. This discussion may be relevant to anyone working to create a new DH project based on data from an archive or library of interest to them.

The GCC project aims to provide a comprehensive list of Cuban archives and information about their holdings, as a starting point for research scholars. The GCC would also include the location and contact information for any given Cuban archive, aiding scholars in making travel and research plans. As a DH Research Fellow at UM’s Richter Libraries (courtesy of the UGrow Program), I worked with digital scholarship librarian Paige Morgan to support Moreno’s project. The points below are my reflections drawn from this experience.

Creating/Categorizing Data

Putting together the relevant data is one of the first steps in any DH project. Creating a spreadsheet with a few main categories of information helps to crystallize one’s project idea. After consulting with Morgan, who is UM’s Digital Scholarship Librarian and Scholarly Publishing Officer, I asked Moreno to share with us the information that she wanted to feature in the GCC project. Moreno shared a Google spreadsheet with a sample list of Cuban archives, including the names of institutions that hold Cuban archives, the names of relevant collections within them, the historical period documented, and a brief description of each archive.
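The four categories in Moreno’s spreadsheet could be sketched as a simple record type, which also makes it easy to spot rows that are still incomplete. This is only an illustration: the field names, the `ArchiveRecord` class, the `missing_fields` helper, and the sample values are hypothetical placeholders, not data from the actual GCC spreadsheet.

```python
from dataclasses import dataclass, fields


@dataclass
class ArchiveRecord:
    """One spreadsheet row; the field names are assumed for illustration."""
    institution: str   # institution holding the archive
    collection: str    # relevant collection within the institution
    period: str        # historical period documented
    description: str   # brief description of the archive


def missing_fields(record):
    """Return the names of any categories left blank in a record."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Placeholder row, not real GCC data.
sample = ArchiveRecord(
    institution="Example Institution A",
    collection="Example Collection 1",
    period="1900-1950",
    description="",
)

print(missing_fields(sample))  # prints ['description']
```

Checking for blank categories early is one way to see which parts of the data still need attention before moving to a mapping platform.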

Data curation is not only a key step in turning one’s research into a DH project, but also an important skill in itself. Creating a data spreadsheet is good practice for a project’s sustainability, since it generates a research object that can be saved outside the project’s main software platform or even offline. For instance, users conducting their project on the Omeka platform (online) may be able to preserve their data in a spreadsheet (a .csv file), which can be opened in MS Excel and other applications.
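As a minimal sketch of what preserving data in a .csv file involves, the following Python example writes a few rows to a CSV file and reads them back using only the standard library. The column names, the file name `gcc_sample.csv`, and the rows themselves are assumptions made for illustration; they do not come from the GCC spreadsheet or from Omeka’s own export format.

```python
import csv
import os
import tempfile

# Assumed column names, mirroring the categories described above.
COLUMNS = ["institution", "collection", "period", "description"]


def save_rows(path, rows):
    """Write rows (a list of dicts) to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


def load_rows(path):
    """Read the CSV file back into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Placeholder rows, not real GCC data.
rows = [
    {"institution": "Example Institution A", "collection": "Example Collection 1",
     "period": "1900-1950", "description": "Letters, photographs, and maps"},
    {"institution": "Example Institution B", "collection": "Example Collection 2",
     "period": "1959-1980", "description": "Oral history recordings"},
]

path = os.path.join(tempfile.mkdtemp(), "gcc_sample.csv")
save_rows(path, rows)
restored = load_rows(path)
```

Because the resulting file is plain text, it can be opened in a spreadsheet application or imported into another platform if the project later changes tools.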

Finding a model DH project

Finding a DH project that is worthy of emulation (or one that contains some of the key features you are trying to incorporate into your project) can provide concrete direction to your work. It also facilitates collaboration by giving your colleagues a clear sense of your project’s goals.

Moreno shared with us a model DH project on ArcGIS StoryMaps, a web platform for cartographic and audiovisual information, titled “A Selection of Frank Lloyd Wright’s Buildings”. This model project quickly gave me a sense of what she wanted to achieve with the GCC data. Seeing that the project uses the Map Series (Tabbed) feature within the StoryMaps app helped me understand what visual and conceptual features Moreno wanted in the GCC project. For instance, each institution would be represented by a separate tab in the Map Series, and within each tab, the various collections would be represented by thumbnail images. Clicking on a thumbnail would reveal the collection’s location on the map panel and further information on how to access it.
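The tab structure described above amounts to grouping spreadsheet rows by institution. A rough sketch of that grouping in Python follows; it is independent of ArcGIS StoryMaps and does not use any of its tools, and the `group_into_tabs` function, the column names, and the rows are hypothetical placeholders.

```python
from collections import defaultdict


def group_into_tabs(rows):
    """Map each institution (one tab) to the list of its collections."""
    tabs = defaultdict(list)
    for row in rows:
        tabs[row["institution"]].append(row["collection"])
    return dict(tabs)


# Placeholder rows, not real GCC data.
rows = [
    {"institution": "Example Institution A", "collection": "Example Collection 1"},
    {"institution": "Example Institution A", "collection": "Example Collection 2"},
    {"institution": "Example Institution B", "collection": "Example Collection 3"},
]

tabs = group_into_tabs(rows)
for institution, collections in tabs.items():
    print(institution, "->", collections)
```

Laying the data out this way before building anything on a platform shows how many tabs the map would need and how many collections each tab would hold.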

In my discussion with Moreno, I showed her a model project on the Neatline platform called “Perspectives on the Haram,” which collates travel writing on the Masjid Al-Haram over several years. This model project features nested tabs that could be used to represent archival institutions and collections within them. Further, clicking on the collection tab would reveal not only its cartographic location, but also a horizontal timeline indicating the historical span covered by it.

Minimum Viable Project or Proof of Concept

As you embark on your DH project, it is immensely helpful to conceive a minimal form that it can take. You can think of it as a proof-of-concept version of the project that can be expanded on further. The Minimum Viable Project (MVP) not only gives new DH practitioners a sense of the challenges involved in the project but also enables them to get feedback from collaborators in their field and in the library. Often, the MVP can be shared with potential collaborators or grant-making organizations to help expand it into a full-scale project.

Based on the spreadsheet information initially provided to me by Moreno, I put together a minimal sample on ArcGIS StoryMaps that functioned as a proof of concept for the GCC. This presentation on the ArcGIS StoryMaps platform can be shared with other people at the CHC, as we conceive ways to expand the GCC project.

Platform shopping

With the increasing number of software platforms available, committing to a single one for your entire research project can be a big decision. If more than one software platform provides the features you need for your DH project, you may want to weigh issues of access, sustainability, and ease of use. It is increasingly common for new DH projects to use proprietary or open-source software platforms, rather than software created by an individual specialist or an academic institution. DH projects built on proprietary software (such as ArcGIS StoryMaps) or open-source software (such as Neatline) tend to be more sustainable in the long run: because researchers and software specialists may migrate from one institution to another, a project that depends on custom-built software can be left without anyone to maintain it.

The Eugenics Archive is a good example of a project whose software platform was created by an individual and tailored to the project’s specific needs. ArcGIS StoryMaps, which belongs to a private company named ESRI, ranks highly in terms of ease of use and range of features. Since the University of Miami currently licenses and supports ArcGIS StoryMaps and an array of ESRI products, this platform was easily accessible to Moreno and me. If your academic institution does not license the full suite of ArcGIS products, they can be a bit pricey. However, a version of the ArcGIS StoryMaps platform is available to the public.

Neatline is an open-source platform maintained by the University of Virginia. Since it is free to download, institutional access to it is unlikely to be an issue. Neatline offers an array of features, but it is not the most user-friendly software, and it is currently available as an add-on to the Omeka platform rather than as a standalone tool. These factors may come into play as we decide how to take the GCC project forward.