OpenRefine - In-Person

David Huynh describes OpenRefine as “a power tool for working with messy data”, but what does this mean? It is probably easiest to describe the kinds of data OpenRefine is good at working with and the sorts of problems it can help you solve.

OpenRefine is most useful when you have data in a simple tabular format, such as a spreadsheet, a comma-separated values (CSV) file, or a tab-delimited (TSV) file, that contains internal inconsistencies in data formats, in where data appears, or in the terminology used. OpenRefine can be used to standardize and clean data across your file. It can help you:

  • Get an overview of a data set
  • Resolve inconsistencies in a data set, for example standardizing date formatting
  • Split data up into more granular parts, for example splitting cells with multiple authors into separate cells
  • Match local data up to other data sets, for example matching local subjects against the Library of Congress Subject Headings
  • Enhance a data set with data from other sources

Some common scenarios might be:

  • Where you want to know how many times a particular value (name, publisher, subject) appears in a column in your data
  • Where you want to know how values are distributed across your whole data set
  • Where you have a list of dates that are formatted in different ways and want to change all the dates in the list to a single common date format

Date: Wednesday, July 10, 2019
Time: 1:00pm - 4:00pm
Time Zone: Auckland
Location: Poutama (Central Library Level 3)
Categories: Software Carpentry
Registration has closed.

Contact for help: Anton Angelo