Wilco Discography Analysis

Prepared by Karl Duckett - April 2021

If you don't know who Wilco are, have a listen on Spotify: https://open.spotify.com/artist/2QoU3awHVdcHS8LrZEKvSM

Imports & Setup

In [101]:
# Not all of these are used in this report; this is my standard import block, pasted into every Jupyter notebook I write.
import pandas as pd
import numpy as np
import seaborn as sns
import os
import re
import matplotlib.pyplot as plt
from IPython.display import display, HTML
import plotly.offline as py
import plotly.graph_objs as go
import plotly.express as px

from numbers import Number
from tabulate import tabulate
from scipy import stats
import datetime

from PIL import Image
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

pd.options.plotting.backend = "plotly"
pd.options.display.max_columns = 100
pd.options.display.max_colwidth = None
py.init_notebook_mode(connected=True)
In [2]:
# Import the data - this data was extracted from Wikipedia and the Lyric Genius website
df = pd.read_csv('Wilco.csv')

Clean Up

Create new columns: convert each duration from minutes and seconds to total seconds, and extract the word count of the lyrics.

In [3]:
def time_convert(x):
    # Convert an "m:ss" duration string (e.g. "2:59") to total seconds.
    m, s = map(int, x.split(':'))
    return m * 60 + s
In [4]:
df['Seconds'] = df['Duration'].apply(time_convert)
In [5]:
df['WordCount'] = df['Lyrics'].str.split().str.len()
In [83]:
df.head(2)
Out[83]:
Title Album Duration Tempo Key Chords Lyrics Seconds WordCount
0 I Must Be High A.M 2:59 NaN NaN NaN You always wanted more time,\nTo do what you always wanted to do\nNow you got it\n\nAnd I, I must be high,\nTo say goodbye\nBye bye bye\n\nYou never said you needed this\nAnd you're pissed that you missed the very last kiss,\nFrom my lips\n\nAnd I, I must be high,\nTo say goodbye\nBye bye bye\n\nAnd you never looked in my eyes,\nLong enough to find any piece of mind\nBut now you got it\n\nAnd I, I must be high,\nTo let you say goodbye\nBye bye bye 179 94
1 Casino Queen A.M 2:49 NaN NaN NaN Well the money's pouring down and the people all look down,\nAnd it's floating out of town\nI hit the second deck and I spend my paycheck,\nAnd my wife that I just met, she's looking like a wreck\n\nCasino Queen, my lord you're mean\nI've been gambling like a fiend on your tables so green\n\nI always bet on black, blackjack,\nI'll pay you back\nThe room fills with smoke and I'm already broke,\nAnd the dealer keeps on joking as he takes my last token\n\nCasino Queen, my lord you're mean\nI've been gambling like a fiend on your tables so green\n\nCasino Queen, my lord you're mean\nI've been gambling like a fiend on your tables so green 169 121
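
As a quick sanity check on the two clean-up steps, here is a minimal standalone sketch in plain Python. The function names `duration_to_seconds` and `count_words` are mine, chosen for illustration; they mirror what `time_convert` and the `str.split().str.len()` chain do, but on single values rather than on a DataFrame column. The two durations are the ones from the rows shown above; the short lyric string is just a fragment used as sample input.

```python
def duration_to_seconds(duration):
    """Convert an "m:ss" string such as "2:59" to total seconds."""
    minutes, seconds = map(int, duration.split(":"))
    return minutes * 60 + seconds


def count_words(lyrics):
    """Count whitespace-separated tokens, so newlines also separate words."""
    return len(lyrics.split())


# Durations taken from the two rows shown above.
print(duration_to_seconds("2:59"))  # 179
print(duration_to_seconds("2:49"))  # 169

# A short lyric fragment used as sample input.
print(count_words("To say goodbye\nBye bye bye"))  # 6
```

Splitting on whitespace treats punctuation as part of the word ("high," counts once), which is fine for a rough word count.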

Create Elements

Word Clouds

In [29]:
# Join every song's lyrics into one string; dropna() skips any songs without lyrics.
text = " ".join(lyrics for lyrics in df['Lyrics'].dropna())
stopwords = set(STOPWORDS)
stopwords.update(["know", "go", "want", "will", "see"])
In [82]:
import random
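def grey_color_func(word, font_size, position, orientation, random_state=None,
                    **kwargs):
    # Returns a random shade of grey. Defined here but not applied below;
    # pass color_func=grey_color_func to recolor() to use it.
    return "hsl(0, 0%%, %d%%)" % random.randint(60, 100)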

# %time reports how long the statement on this line took to execute (this line only).
%time wordcloud = WordCloud(stopwords=stopwords, background_color="white", width=1600,height=900).generate(text)
plt.figure(figsize=(32, 18))
# recolor() updates the word cloud in place, so one imshow call is enough.
plt.imshow(wordcloud.recolor(random_state=3), interpolation="bilinear")
plt.axis("off")
plt.show()
Wall time: 2.53 s
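
A word cloud sizes each word roughly by how often it appears once the stopwords are removed. The sketch below shows that counting step on its own, using only the standard library. It is an illustration under stated assumptions, not what the wordcloud package does internally: `SAMPLE_STOPWORDS` is a small hypothetical set (far shorter than the package's `STOPWORDS`), `word_frequencies` is a name I made up, and the sample text is a fragment of the first song's lyrics.

```python
import re
from collections import Counter

# Hypothetical stopword set for illustration only; the real STOPWORDS
# list from the wordcloud package is much longer.
SAMPLE_STOPWORDS = {"i", "to", "you", "and", "the", "be", "must",
                    "know", "go", "want", "will", "see"}


def word_frequencies(text, stopwords):
    """Count lowercase words in text, skipping anything in stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)


sample = "And I, I must be high,\nTo say goodbye\nBye bye bye"
freqs = word_frequencies(sample, SAMPLE_STOPWORDS)
print(freqs.most_common(1))  # [('bye', 3)]
```

Looking at the top counts this way is also a handy way to spot filler words worth adding to the stopword set before regenerating the cloud.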