This is going to be a short post, because the only thing I did for this package was blatantly copy Colin Morris’s work on song lyrics self-similarity matrices (a.k.a. “SongSim”) into R. My friend sent me this video:

And I was blown away. After that, it took me just a few minutes to replicate Colin’s SongSim matrices in R and wrap them in a package for further research.

What is a SongSim Matrix?

First, you probably should watch the video.

Then, you should probably play with Colin’s cool React demo, and read everything that’s there.

Then, you should probably read Colin’s excellent blog posts: here and here.

At this point you should probably just go ahead, install the songsim package and make SongSim matrices for your favorite songs. But if you’re still here, you’re probably expecting me to explain what SongSim matrices are.

Take Madonna’s Material Girl. It has 339 words, and these 339 words correspond to our 339 rows x 339 columns of the matrix. Now, each cell \((i, j)\) of the matrix will be filled (in my R representation it will have a value of 1), if word \(i\) is the same as word \(j\). That simple.
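To make the definition concrete, here is a minimal base R sketch of the same idea on a handful of words. It is not the package's actual code, and the lower-casing and whitespace splitting are my own simplifying assumptions:

```r
# Minimal sketch of a self-similarity matrix in base R (not the songsim source).
lyrics <- "Some boys kiss me some boys hug me"

# Lower-case and split on whitespace to get one element per word.
words <- strsplit(tolower(lyrics), "\\s+")[[1]]

# Cell (i, j) is 1 when word i equals word j, and 0 otherwise.
sim <- outer(words, words, "==") * 1L
dimnames(sim) <- list(words, words)

dim(sim)    # 8 x 8: one row and one column per word
sim[1, 5]   # 1, both words are "some"
sim[3, 7]   # 0, "kiss" vs "hug"
```

The main diagonal is always filled, since every word equals itself, and the matrix is symmetric. Repetition shows up as filled cells away from the diagonal.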

library(songsim)

songsim("~/material_girl.txt")

Colin suggests giving each distinct word that appears more than once its own color:

songsim("~/material_girl.txt", colorfulMode = TRUE, mainTitle = "Material Girl - Madonna")
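One way to think about the coloring is as an integer id per repeated word. The sketch below is only an illustration in base R; the ids, the use of 0 for words that occur once, and the variable names are my assumptions, and colorfulMode may well do this differently:

```r
# Sketch of the colouring idea in base R; songsim's colorfulMode may differ.
words <- c("some", "boys", "kiss", "me", "some", "boys", "hug", "me")

# Words that occur more than once get their own integer colour id (1, 2, ...).
counts   <- table(words)
repeated <- names(counts)[counts > 1]
word_id  <- match(words, repeated)   # NA for words that occur only once

# Single-occurrence words get id 0; they only ever match themselves.
ids <- ifelse(is.na(word_id), 0L, word_id)

# A filled cell takes the id of its word; cells where the words differ stay NA.
same    <- outer(words, words, "==")
colored <- matrix(NA_integer_, length(words), length(words))
colored[same] <- ids[row(same)[same]]

colored[1, 5]   # id of "some", a repeated word
colored[3, 3]   # 0, "kiss" occurs once and sits only on the diagonal
colored[3, 7]   # NA, "kiss" and "hug" do not match
```

A matrix like this can then be handed to any plotting function that maps integers to a palette, with one color reserved for the 0 cells.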

The SongSim matrix helps in seeing patterns of repetition in the song’s lyrics. Can you see the straightforward pattern here?

Madonna starts with a verse:

Some boys kiss me
Some boys hug me
I think they’re ok

We see this start in the matrix as the start of the somewhat black diagonal. Then Madonna “peeks” into the chorus:

’Cause we are living in a material world
And I am a material girl

We see this as the first somewhat green square area around the diagonal. Then again, a verse:

Some boys romance
Some boys slow dance
That’s all right with me

We see this again as a “dead” area around the diagonal. Then again the chorus, in full power:

’Cause we’re living in a material world
And I am a material girl
You know that we are living in a material world
And I am a material girl
Living in a material world
And I am a material girl
You know that we are living in a material world
And I am a material girl
Living in a material world (material)
Living in a material world
Living in a material world (material)
Living in a material world

We see this as a big greenish square in the middle of the matrix.

This kind of pop song structure - verse-chorus-verse-chorus-bridge-chorus - is very common. Colin demonstrates it on Ke$ha’s Tik Tok.

Where it starts to get interesting and visually pleasing is where this pattern breaks:

songsim("~/all_you_need_is_love.txt", colorfulMode = TRUE, mainTitle = "All You Need Is Love - The Beatles")

songsim("~/times_they_are_a_changin.txt", colorfulMode = TRUE, mainTitle = "The Times They Are A Changin' - Bob Dylan")

songsim("~/chandelier.txt", colorfulMode = TRUE, mainTitle = "Chandelier - Sia")

songsim("~/bohemian_rhapsody.txt", colorfulMode = TRUE, mainTitle = "Bohemian Rhapsody - Queen")

And I’ll let Colin’s posts do the rest of the talking.

What’s in the Package?

As seen above, the songsim package holds essentially a single function, songsim, which needs only a path to a txt file containing a song’s lyrics. You can look up other interesting parameters in my examples above and in the help file.

I have also attempted to plot the SongSim matrix interactively by specifying interactiveMode = TRUE, so we can see the actual words by hovering over a matrix cell. I used the wonderful heatmaply package by Tal Galili, and this should work only if you have this package (and its dependencies) installed. Caution: I do not know how this will turn out in mobile browsers.

songsim("~/bohemian_rhapsody.txt", interactiveMode = TRUE, singleColor = "blue",
        mainTitle = "Bohemian Rhapsody - Queen")
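For the curious, the hover idea can be sketched roughly as follows. This is an illustration under my own assumptions, not the code behind interactiveMode: plot_interactive is a hypothetical helper, and the hover text format is made up for the example. It relies on heatmaply's custom_hovertext argument, which takes a character matrix with the same dimensions as the data:

```r
# Sketch of hover text for a similarity matrix (not the songsim internals).
words <- c("living", "in", "a", "material", "world")
sim   <- outer(words, words, "==") * 1L

# Hover text for cell (i, j): the pair of words that the cell compares.
hover <- outer(words, words, function(a, b) paste0(a, " / ", b))

# Hypothetical helper: draw the matrix with heatmaply, with clustering
# switched off so that the words stay in lyric order.
plot_interactive <- function(sim, hover) {
  heatmaply::heatmaply(
    sim,
    Rowv = FALSE, Colv = FALSE,   # keep the original word order
    custom_hovertext = hover,     # text shown when hovering over a cell
    showticklabels = FALSE,
    hide_colorbar = TRUE
  )
}

# Only draw in an interactive session where heatmaply is installed.
if (interactive() && requireNamespace("heatmaply", quietly = TRUE)) {
  plot_interactive(sim, hover)
}
```

Turning off row and column reordering matters here: a heatmap that clusters rows and columns would scramble the word order and with it the song's structure.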