Wednesday, February 4, 2015

Data visualizations based on Spotify listens

I've been tracking the music I listen to on Spotify since early November. An IFTTT recipe automatically updates a Google Spreadsheet with the song name, artist name, album title, and timestamp every time I listen to a song.

While other people might be interested in the type of music I listen to and how much of it, I am likely the main audience for this visualization. Whether the audience is just me, or family members, friends, classmates, or others, some might make assumptions about what types of music I listen to on a regular basis. People who know I have a daughter would understand the kids' songs in the treemap; those who don't know about my daughter might find the kids' songs strange. The audience also probably has some assumptions about how they listen to music throughout the day, and may compare my listening habits to theirs. Someone who needs silence while working will not understand how I listen to so much music. People may also judge the types of music I listen to, and think better or worse of me.

My questions are: Which artists do I listen to the most, and do I like a lot of those artists' songs or are they one-hit wonders to me? What day of the week do I listen to the most music? What time of day do I listen to the most music?
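As a rough illustration of how these three questions map onto the logged data, here is a minimal sketch. It is not the code behind the visualizations (that is in R, on GitHub); it is a Python stand-in using only the standard library. The sample rows, the function name summarize, and the timestamp format are all made up for the example, and the real spreadsheet's timestamp format may differ.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sample rows in the logged column order:
# (song name, artist name, album title, timestamp). Values are invented.
SAMPLE_ROWS = [
    ("Song A", "Artist X", "Album 1", "2014-11-03 09:15"),
    ("Song A", "Artist X", "Album 1", "2014-11-03 09:19"),
    ("Song B", "Artist Y", "Album 2", "2014-11-03 14:02"),
    ("Song C", "Artist Y", "Album 2", "2014-11-07 14:30"),
    ("Song A", "Artist X", "Album 1", "2014-11-07 16:45"),
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def summarize(rows, fmt="%Y-%m-%d %H:%M"):
    """Count plays per artist, distinct songs per artist, and plays per
    weekday and hour. `fmt` is an assumed timestamp format."""
    plays_per_artist = Counter()
    songs_per_artist = defaultdict(set)
    plays_per_weekday = Counter()
    plays_per_hour = Counter()

    for song, artist, _album, stamp in rows:
        when = datetime.strptime(stamp, fmt)
        plays_per_artist[artist] += 1
        songs_per_artist[artist].add(song)
        plays_per_weekday[WEEKDAYS[when.weekday()]] += 1
        plays_per_hour[when.hour] += 1

    # Many plays but few distinct songs suggests a "one-hit wonder to me".
    distinct_songs = {a: len(s) for a, s in songs_per_artist.items()}
    return plays_per_artist, distinct_songs, plays_per_weekday, plays_per_hour


if __name__ == "__main__":
    artists, distinct, weekdays, hours = summarize(SAMPLE_ROWS)
    for artist, plays in artists.most_common():
        print(artist, plays, "plays,", distinct[artist], "distinct songs")
    print(weekdays.most_common(1), hours.most_common(1))
```

Comparing an artist's play count with the number of distinct songs is one simple way to tell "I like a lot of their songs" apart from "I keep replaying one song"; other measures would work too.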

I have several other questions, but I don't currently have the proper data to answer them. Those questions are: Do I listen to certain genres of music on specific days? I'm guessing that music on Mondays is different than music on Fridays. How does the number of meetings I have during the day affect how many songs, and what types of songs, I listen to? Again, I'm guessing that more meetings might mean fewer songs, and that the genres would change (since many meetings tend to put me in a bad mood, especially when they are poorly run). To make these analyses work, I would need to manually input genres for each song, and would need to go through my calendar and manually record the number of meetings I have each day.

I'm using R for the data preparation and analysis, and Illustrator to fine-tune the images.
For the treemap I used the R package 'portfolio'. Flowingdata.com helped with the steps I needed to take to make this work. For the bar graph, I used barplot. For the line graph, I used ggplot.

I'm also including my initial sketches of how I wanted the visualization to look. This helped a lot for planning my attack, especially with regard to the data cleanup.

Visualizations are below; all code is on GitHub.

Sketch of what I wanted:

Treemap of artists I listen to most:

Songs listened to per hour:

Songs listened to by day: