Environmental Justice Research Repository

Interconnected histories of racism, urban ecology, and environmental activism in Eugene, Oregon.

Contents: Overview | Data Curation and Metadata for Digital Collections | Making Metadata Interoperable with CollectionBuilder | Metadata Remediation and Refinement Activity

Data Curation and Metadata Interoperability Learning Sequence

🖋️ Lesson Plan for Instructors

This learning sequence’s instructor lesson plan is found on a different page. Please use this resource to help facilitate teaching the learning sequence. It also contains instructions for next steps after the sequence is finished.

🖋️ Note for Learning Sequence Advisory Board Reviewer: You do not need to complete the “Let’s reflect” activity. The only activity you need to complete is the “Metadata Remediation and Refinement Activity”.

The Google Folder referenced in the lesson plan and learning sequence contains everything you need to complete the activity. The metadata sheet does not contain all the metadata used in the digital collection.

Once you have completed the learning sequence, act as if you are the instructor and upload the objects and metadata into your own CollectionBuilder GitHub repository. This will show the NEH Learn-Static team whether you were able to complete the metadata remediation and refinement activity.

If you are unsure how to set up a CollectionBuilder repository or upload objects and metadata into it, visit the CollectionBuilder documentation page for instructions.

Overview

This learning sequence will prepare you for working with the Beyond Toxic Dataset before using the dataset for your own research purposes. Before anyone is able to use the dataset, it must first be made publicly available online using CollectionBuilder, the platform used to host, search, and browse the dataset.

This learning sequence gives you an introduction to data curation and standardizing metadata to be interoperable. It also asks you to complete a Beyond Toxic metadata remediation and refinement activity. By correcting and adding metadata to the Beyond Toxic dataset, you are directly contributing to making a digital collection for Beyond Toxic, a nonprofit environmental justice organization based in Eugene, Oregon, to use for their advocacy purposes.

What will you learn after completing this learning sequence?

Data Curation and Metadata for Digital Collections

What is Data Curation?

Data Curation, sometimes called Digital Curation, refers to the steps required to actively manage historical and scientific records. Data includes files representing photographs, tabulated data, sound recordings, video recordings, textual documents, 3D models, etc., along with the accompanying metadata (data about data, which provides context and meaning for humans and computers).

Data Curation involves people using computers to do the following activities:

These activities can take shape through the following actions, but are not limited to this list:

Within the library, museum, and archives professions, data curation tends to be handled by a team of information science professionals charged with overseeing all care activities. Here’s an example from the Smithsonian’s Archives, which has a digital curation team responsible for this type of work. Professionals take on specific roles connected to data acquisition, accessioning, processing, archiving, and sharing.

What is Metadata?

At its core, metadata adds meaning and context to how data is described, technically represented, and structured for humans and computers to manage.

There are many different categories of metadata types.

Digital Collections Metadata in Context: The Digital Public Library of America (DPLA)

DPLA is an online database where libraries, museums, and archives across the United States aggregate their digital collections to be found in one place instead of siloed in their own database systems.

DPLA allows anyone in the world to search millions of digital and digitized primary sources made publicly available by different types of digital collection programs: research libraries and archives (University of Oregon and Oregon State University); government-managed archives (the Library of Congress); museums (the Smithsonian and the Getty Museum of Art); and public libraries (Boston Public Library and New York Public Library).

How does metadata work with DPLA?

Libraries, museums, and archives contributing to DPLA follow a metadata standard, which gives DPLA the ability to aggregate institutionally and organizationally curated digital collections.

A metadata standard ensures that digital objects are cataloged in consistent ways. For this lesson, we’ll focus primarily on how DPLA wants descriptive metadata standardized rather than on the other types of metadata previously mentioned.

Let’s look at two different digital objects in DPLA

These photographs show two different libraries and were contributed to DPLA by two different universities, the University of Oregon and Oregon State University.

Each of these objects is described using similar labels:

Let us reflect!

Take 2 minutes to write a quick reflection on the lecture. You can use a notetaking app on your computer or paper and a writing instrument.

Answer these questions:

Making Metadata Interoperable with CollectionBuilder

What is interoperability?

Interoperability means that people set up and express metadata so that computers can aggregate, read, and integrate data across different computer systems, such as online databases.

How can metadata become interoperable?

Data dictionaries, also known as metadata application profiles (MAPs), support metadata interoperability and standardization so that data curators, computer systems, and data users are able to make sense of and use metadata.

An example of metadata interoperability in practice

Data curators at the University of Oregon and Oregon State University follow a metadata standard so that digital collections can be found and used not only in Oregon Digital, UO and OSU’s jointly managed digital repository for unique cultural heritage materials, but also in DPLA. In order for digital collections to live in two places, the metadata has to conform to the description and upload requirements set by both the Oregon Digital repository system and DPLA’s aggregation requirements.

The library photos coming from Oregon Digital follow DPLA’s MAP. This MAP sets what is required for anything that goes into the DPLA repository system. It ultimately acts as a data dictionary that tells data curators how to prepare metadata so it is interoperable.

Look at these photographs and look for similarities in descriptive metadata labels, identify how dates are displayed, and click on the Subject field to find other primary sources. Something to note: when you click on a hyperlinked subject term, the system is able to find related objects because data curators use controlled vocabularies. This allows people to find categorized topics related to the “aboutness” of an object.
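To see why controlled vocabularies make this kind of linking possible, here is a minimal Python sketch. It is not how DPLA or Oregon Digital is implemented, and the records, titles, and subject terms are invented for illustration. It only shows the general idea: when curators enter a subject term the same way every time, a system can group objects under that term, while a variant wording ends up isolated.

```python
from collections import defaultdict

# Hypothetical records; titles and subject terms are invented for illustration.
records = [
    {"title": "Library exterior, 1937", "subject": ["Libraries", "Universities"]},
    {"title": "Library reading room", "subject": ["Libraries"]},
    {"title": "Campus aerial view", "subject": ["Universities"]},
    # An uncontrolled variant of "Libraries" that will not be grouped with it.
    {"title": "Branch library opening", "subject": ["library bldgs."]},
]


def build_subject_index(records):
    """Map each subject term to the titles of the objects described with it."""
    index = defaultdict(list)
    for record in records:
        for term in record["subject"]:
            index[term].append(record["title"])
    return dict(index)


index = build_subject_index(records)
print(index["Libraries"])  # ['Library exterior, 1937', 'Library reading room']
```

The fourth record is about a library too, but because its term does not match the controlled term exactly, a search on "Libraries" would miss it.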

How does interoperability relate to the Beyond Toxic Dataset?

As a data curator, you will need to follow the class data dictionary that will be supplied in the forthcoming activity. This data dictionary has been created to follow metadata description standards. It complies with how CollectionBuilder (the computer system) wants data entry to be standardized so that data can be integrated upon system ingest.

By following the data dictionary and adding objects (newspaper clippings stored as .pdf or .jpg files) and metadata to CollectionBuilder, the people (data users) who come to the Beyond Toxic Data Repository will be able to search, browse, read, find, and download any data added to CollectionBuilder.

What CollectionBuilder system requirements need to be followed by data curators?

CollectionBuilder wants data curators to add values in a very specific way for certain fields. "Value" is a term from the computer science discipline used to describe numbers, letters, and unique combinations of them.

Here is a specific example of how to do data entry for the Subject field.

subject:

Example value to add to a spreadsheet cell: Dogs; Cats; Zebras

For the data dictionary used in class, please review the required fields for CollectionBuilder-GH to see how the application wants values expressed in a required way.
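The Subject example above can also be sketched in code. This is a minimal Python illustration, assuming (as the example value suggests) that a semicolon separates multiple terms in one spreadsheet cell. The function names `split_subjects` and `join_subjects` are hypothetical helpers written for this example and are not part of CollectionBuilder.

```python
def split_subjects(cell_value):
    """Split a semicolon-separated cell into a list of trimmed, non-empty terms."""
    return [term.strip() for term in cell_value.split(";") if term.strip()]


def join_subjects(terms):
    """Join terms back into the 'Term; Term; Term' form shown in the example."""
    return "; ".join(terms)


print(split_subjects("Dogs; Cats; Zebras"))  # ['Dogs', 'Cats', 'Zebras']

# A messy cell with stray spaces and an empty entry is tidied into the same form.
print(join_subjects(split_subjects("Dogs ;Cats;; Zebras ")))  # Dogs; Cats; Zebras
```

Splitting and rejoining like this is one way to spot stray spaces or doubled semicolons before a spreadsheet is uploaded.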

Okay, now let’s apply this knowledge through an instructor “Show and Tell”

Watch your instructor demonstrate how to access the Beyond Toxic Dataset (objects and metadata spreadsheet) and work through how to remediate and refine metadata records to conform to CollectionBuilder’s requirements. After they are finished, you will need to complete the activity below.

Metadata Remediation and Refinement Activity

Activity Goal: The Beyond Toxic Dataset needs remediation and refinement so objects and their descriptions can be successfully added to CollectionBuilder. Your goal is to follow the class data dictionary and remediate and refine 3 metadata records (the rows in the Google Sheet) associated with a unique object found in the dataset.

After the class has completed the activity, objects and metadata records will be added into the Beyond Toxic Data Repository (CollectionBuilder) by the instructor.

Materials for the Activity

❗ Important Note to Learners ❗ Everyone is going to be working in the same space. Before starting the activity, you will need to find the objects and metadata that are assigned to you.

First, locate the metadata record that has been assigned to you.

  1. Open the Google Drive Folder.
  2. Open the spreadsheet.
  3. Find the column header titled: “Cataloger”. This metadata field identifies which metadata records are assigned to you. By the way, this is a type of administrative metadata not seen by your data users.

Second, locate the object’s file that is assigned to you.

  1. Open the Object Directory.
  2. Find the object you have to catalog.
  3. Read and review the object before you start refining and remediating metadata.

Begin Your Metadata Refinement and Remediation Process

  1. Open the Google Drive Folder and open the Object Directory.
  2. Locate the objects and metadata records that you will be working with.
  3. Open the class data dictionary, which is located in the Google Drive Folder.
  4. Read the data dictionary to understand what the metadata standard is asking you to do with your object’s metadata record.
  5. Locate the following metadata fields identified in the data dictionary:
    • objectid
    • filename
    • subjects
    • format
    • longitude and latitude
    • rightsstatement
  6. For each metadata record and each field identified in the previous step, compare the values against the standard set by the data dictionary.

    • Look for discrepancies to correct.
    • Some spreadsheet cells will have values requiring remediation and refinement.
    • Some cells will not have any data. If a cell has no value, add one according to the data dictionary.
  7. Start correcting your metadata records.

  8. Once you are done, let the instructor know. When everyone has completed the activity, the instructor will demo how objects and the metadata sheets are uploaded to CollectionBuilder.

If you have questions, be sure to raise your hand and ask the instructor for help.
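For readers who are curious, the "look for empty cells" part of step 6 can also be sketched in Python. This is only an illustration under stated assumptions: it imagines the Google Sheet exported as CSV, it uses the field names from the list in step 5 (with latitude and longitude assumed to be separate columns), and every value in the sample is invented. It does not check whether a filled-in value follows the class data dictionary, which still requires reading each record yourself.

```python
import csv
import io

# Field names taken from the activity's list; "latitude" and "longitude" are
# assumed to be separate columns. Adjust to match the class data dictionary.
FIELDS_TO_CHECK = ["objectid", "filename", "subjects", "format",
                   "latitude", "longitude", "rightsstatement"]

# Hypothetical CSV export of the metadata sheet; all values are invented.
SAMPLE_CSV = """objectid,filename,subjects,format,latitude,longitude,rightsstatement
bt_001,bt_001.pdf,Air quality; Eugene (Or.),application/pdf,44.05,-123.09,http://rightsstatements.org/vocab/InC/1.0/
bt_002,,Pesticides,image/jpeg,,-123.10,
"""


def find_empty_cells(csv_text, fields=FIELDS_TO_CHECK):
    """Return (objectid, field) pairs for every checked cell that is blank."""
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for field in fields:
            if not (row.get(field) or "").strip():
                problems.append((row.get("objectid", ""), field))
    return problems


for objectid, field in find_empty_cells(SAMPLE_CSV):
    print(f"{objectid}: '{field}' is empty")
```

With the sample data, the sketch reports that record bt_002 is missing its filename, latitude, and rightsstatement values.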