Overview
This website is the public portal for an ongoing, collaborative corpus-based research project to study rock music, conducted by Trevor de Clercq [TdC] (trevor.declercq@gmail.com), David Temperley [DT] (dtemperley@esm.rochester.edu), Ethan Lustig, and Ivan Tan.
At this website, you can access:
- An overview of our corpus (below), including a list of the songs
- Harmonic analyses for the songs in our corpus and an explanation of our harmonic notational system
- Melodic transcriptions of the same songs and an explanation of our melodic notational system
- Timing data for these songs, as referenced to the original audio files
- Lyrics and syllabic stress information aligned with the melodies for a subset of the corpus
- Custom-written computer programs, used to extract data and aggregate statistics from our corpus
- Specific documentation and archived files related to our 2011 paper in the journal Popular Music
The Corpus
Our corpus is based on Rolling Stone magazine's list of the "500 Greatest Songs of All Time." This list, which we refer to as the "RS 500," was originally published in an article of the same name in the December 9, 2004 issue (no. 963), pages 65-165. The list was available online directly through the Rolling Stone magazine website as late as 2009, although it appears to have since been taken down.
An archive of the web version of the complete Rolling Stone list may still be found at web.archive.org/web/20080622145429/www.rollingstone.com/news/coverstory/500songs. In case this archive disappears, we have created a tab-delimited text file of the RS 500 songs, available for download. The columns in this file are: 1) the song rank, 2) the song title, 3) the artist name, and 4) the year.
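For readers who want to load this list programmatically, the following is a minimal Python sketch of reading the four columns described above. It is an illustration only, not one of the project's own programs. The names Song and read_song_list are hypothetical, and the sketch assumes that the rank and year fields are plain integers and that any line whose first field is not a number (for example, a header) can be skipped.

```python
import csv
import io
from typing import NamedTuple


class Song(NamedTuple):
    rank: int
    title: str
    artist: str
    year: int


def read_song_list(handle):
    """Parse tab-delimited lines of rank, title, artist, year into Song records."""
    songs = []
    # QUOTE_NONE keeps quotation marks inside titles as literal characters.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Skip blank lines, short lines, and any line whose first field is not a number.
        if len(row) < 4 or not row[0].strip().isdigit():
            continue
        rank, title, artist, year = (field.strip() for field in row[:4])
        songs.append(Song(int(rank), title, artist, int(year)))
    return songs


# Placeholder rows, not actual entries from the list.
sample = io.StringIO("1\tTitle A\tArtist A\t1965\n2\tTitle B\tArtist B\t1971\n")
songs = read_song_list(sample)
print(songs[0].title, songs[0].year)
```

To read a downloaded copy instead of the in-memory sample, pass a file opened in text mode, for example with open(path, newline="", encoding="utf-8"), where path is wherever you saved the file. The encoding is an assumption and may need adjusting.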
Our current corpus is only a subset of the original 500-song list. We continue to expand the corpus, and its history is detailed below.
Our initial corpus was a selection of 100 songs, as reported in our 2011 paper. We wanted to create a subset with chronological balance, since the RS 500 is somewhat skewed towards earlier decades. We therefore took the top 20 songs on the RS 500 list from each decade, the 1950s through the 1990s, to create a 100-song list that we call the RS 5x20. (One of these songs, Public Enemy's "Bring the Noise", was considered not to contain any triadic harmony, so we did not analyze it; in effect, then, the corpus contained only 99 songs.) For reference, you can download a tab-delimited text file of the list of RS 5x20 songs, ordered by filename. The columns are: 1) the filename convention we used, 2) the song rank in the original RS 500 list, 3) the song title, 4) the artist, and 5) the year.
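The selection rule described above (the 20 best-ranked songs from each decade, 1950s through 1990s) can be sketched in Python as follows. This is a hedged illustration rather than the procedure actually used to build the RS 5x20: the function name top_per_decade is hypothetical, the decade is assumed to be derived from the year column by integer division, and the sketch does not model the later exclusion of a song that was not analyzed.

```python
from collections import Counter


def top_per_decade(songs, per_decade=20, first_decade=1950, last_decade=1990):
    """Pick the best-ranked songs from each decade.

    songs: iterable of (rank, title, artist, year) tuples; a lower rank is better.
    """
    counts = Counter()
    selected = []
    for song in sorted(songs, key=lambda s: s[0]):
        decade = song[3] // 10 * 10  # e.g. 1967 -> 1960
        if first_decade <= decade <= last_decade and counts[decade] < per_decade:
            counts[decade] += 1
            selected.append(song)
    return selected


# Synthetic placeholder data: 300 ranked songs with years spread over 1950-1999.
placeholder = [(rank, f"Title {rank}", f"Artist {rank}", 1950 + (rank * 7) % 50)
               for rank in range(1, 301)]
subset = top_per_decade(placeholder)
print(len(subset))
```

Walking the list in rank order and keeping a per-decade counter means the result stays sorted by rank, and a decade with fewer than 20 eligible songs simply contributes fewer entries.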
Since the publication of our 2011 paper, we have expanded our corpus by adding the 101 next-highest-ranked songs from the original RS 500 list that were not in the 99-song RS 5x20 set, creating a total set of 200 songs. We refer to our current corpus as the RS 200. For reference, you can download a tab-delimited text file of the list of RS 200 songs, ordered by filename. The columns are: 1) the filename convention we used, 2) the song rank in the original RS 500 list, 3) the song title, 4) the artist, 5) the year, and 6) an asterisk if the song was in the original RS 5x20 set.
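The expansion step (99 existing songs plus the 101 best-ranked songs not already included, for 200 in total) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' own code: the function name expand_corpus is hypothetical, songs are identified by their RS 500 rank, and any further exclusions that may have applied are not modeled.

```python
def expand_corpus(ranked_songs, existing_ranks, n_new):
    """Return the n_new best-ranked songs whose rank is not in existing_ranks."""
    candidates = [s for s in ranked_songs if s[0] not in existing_ranks]
    candidates.sort(key=lambda s: s[0])  # a lower rank is better
    return candidates[:n_new]


# Placeholder data: 500 ranked entries and an arbitrary 99-rank "existing" set.
ranked = [(rank, f"Title {rank}") for rank in range(1, 501)]
existing = set(range(1, 298, 3))  # 99 arbitrary ranks, for illustration only
added = expand_corpus(ranked, existing, 101)
print(len(existing) + len(added))
```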
Over the course of expanding the corpus, we have also offered an increasing number of annotations. We track these with version numbers. Note that small changes to the annotations may occur from version to version; for example, one author may revise his harmonic analysis of a particular song.
- Version 1.0: Harmonic analyses (each song by both authors) of the RS 5x20.
- Version 1.1: Harmonic analyses (each song by both authors) of the RS 200.
- Version 2.0: Harmonic analyses (each song by both authors), melodic transcriptions (each song by one author), and timing data for the RS 200.
- Version 2.1: Harmonic analyses (each song by both authors), melodic transcriptions (each song by one author), and timing data for the RS 200.
(Version 2.1 fixes a problem with octave notations that existed in Version 2.0.)
More complete explanations of the harmonic analyses, melodic transcriptions, and timing data can be found on the relevant pages.
N.B.: All work (including all data, transcriptions, scripts, and programs) posted on our rock corpus web site (http://theory.esm.rochester.edu/rock_corpus/) can be used under the terms of the CC BY 4.0 License. This means that it can be freely used and adapted in academic or commercial projects, without our permission, as long as appropriate credit is given.