Project by Anne Ensign
An analysis of the relationship between statistics from Christopher Walken's film career and popular music that uses the cowbell.
The Saturday Night Live sketch "More Cowbell" aired on April 8, 2000.[1] It quickly became a pop culture sensation. Will Ferrell, an SNL cast member and writer, penned the famous sketch.[2] On The Tonight Show Starring Jimmy Fallon in 2019, Ferrell recounted, tongue in cheek, a conversation he had backstage with Walken after a play:
“You know, you’ve ruined my life. People, during the curtain call, bring cowbells and ring them. The other day I went for an Italian food lunch, and the waiter asked if I wanted more cowbell with my pasta bolognese.”[3]
I am curious to see whether there is any correlation between Walken's film career (ratings, number of films per year, etc.) and popular songs that include cowbell among the percussion instruments used in the recording.
The sample comprised:
- 111 Christopher Walken films
- 4284 songs with cowbell
- 1369 popular songs, cross-checked against the cowbell songs to find titles on both lists
- A final list of 130 songs that appear on both
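The cross-check between the cowbell list and the popular list can be sketched in plain Python. This is only an illustration: the notebook does the matching with pandas, the `normalize` and `mutual_songs` helpers are hypothetical names, and the titles are stand-in samples rather than rows from the project's CSVs.

```python
def normalize(title):
    """Lower-case a title and collapse whitespace so near-identical spellings match."""
    return " ".join(title.lower().split())


def mutual_songs(cowbell_titles, popular_titles):
    """Return the cowbell titles that also appear in the popular list, sorted."""
    popular = {normalize(t) for t in popular_titles}
    seen = set()
    matches = []
    for title in cowbell_titles:
        key = normalize(title)
        # Keep only the first spelling of each title that is on both lists.
        if key in popular and key not in seen:
            seen.add(key)
            matches.append(title)
    return sorted(matches)


# Stand-in sample titles, not rows from the project's CSVs.
cowbell = ["Honky Tonk Women", "Low Rider", "low  rider", "Mississippi Queen"]
popular = ["Honky Tonk Women", "LOW RIDER", "Hey Jude"]
print(mutual_songs(cowbell, popular))
```

Matching on title alone treats two different songs with the same name as one, so a stricter version would probably compare artist as well.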
Walken Data
I accessed www.RottenTomatoes.com for Walken's films. Since the list was fairly short, it was easiest to copy/paste the data into a spreadsheet and create a CSV.
Cowbell Data
I made three web scrapers to collect song data:
cowbell.py
This scraper ran through 4247 pages of UltimateCowbell.com[4] to extract song information and wrote the resulting dataframe to a CSV.
script.js
This was a manual scraper. RollingStone.com[5] had a list of "The 500 Greatest Songs of All Time," which I wanted to cross-check against the UltimateCowbell list. Rolling Stone's site loads its content dynamically, so the Python scraper couldn't read the contents of the HTML. I only needed to run the scraper on 10 pages, so I wrote a JavaScript snippet to run in the browser console and copy/pasted the output into a spreadsheet.
billboard.py
This scraper ran through Wikipedia's lists of Billboard Hot 100 singles[6] between 1970 and 2020 and wrote a CSV. The lists include every single that reached the #1 position on the charts during this time, with pages separated by decade.
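As a rough illustration of what a table scraper like these does once a page has been downloaded, the sketch below pulls the rows out of an HTML table. It is not the code in billboard.py: the project's scrapers use Beautiful Soup, while this version swaps in the standard library's `html.parser` so it runs without extra packages. The `TableRows` class and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text only counts while a cell is open; links inside a cell are flattened.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical markup standing in for one downloaded page.
SAMPLE = """
<table>
  <tr><th>Date</th><th>Single</th><th>Artist</th></tr>
  <tr><td>January 1</td><td><a href="#">Example Song</a></td><td>Example Band</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE)
header, *records = parser.rows
print(header)
print(records)
```

A parser like this only sees the HTML it is given, which is why a page that fills in its content with JavaScript after loading needs a different approach.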
I also copy/pasted a couple of smaller lists of "best cowbell songs" and placed them in another CSV to add to my cowbell sample size.[7][8]
In total, more than 5,600 songs were collected and analyzed. The final list came to 131 cowbell songs that are popular/mainstream, with release dates ranging from 1964 to 2017 and genres including Rock, Country, R&B, Pop and more. See the Methodology section for more detailed information.
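One way to summarise a span of release years like this is to bucket them by decade. The sketch below reuses the decade labels from the notebook's pie-chart sample (`pre50`, `the50s`, and so on), but whether the notebook derives its counts this way is an assumption, `decade_label` is a hypothetical helper, and the years are made-up sample values.

```python
from collections import Counter


def decade_label(year):
    """Map a release year to a label such as 'the70s'; years before 1950 share 'pre50'."""
    if year < 1950:
        return "pre50"
    # Keep the last two digits of the year and round them down to the decade.
    return "the{:02d}s".format(year % 100 // 10 * 10)


# Made-up release years, not values from the final song list.
years = [1964, 1969, 1975, 1978, 1983, 2005, 2017]
counts = Counter(decade_label(y) for y in years)
print(counts)
```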
If you have a Google account, you can easily run this program in Colab: no downloads, packages or virtual environments needed.
- CLICK HERE FOR COLAB LINK
- Once open, click File > Save a Copy in Drive.
- Your copied version should automatically open.
- On the top tab bar, click Runtime > Run all.
- Ta-da! That's it.
This repo uses a number of tools, frameworks and libraries that are all included with Anaconda (see the Anaconda documentation and install instructions). Anaconda released an update on September 24, 2020, and this repo runs on that release.
You can update by opening your terminal and entering:
conda update conda
conda install anaconda=VersionNumber
Click here for notes on updating.
If you do not wish to install Anaconda, be sure your machine has the following:
- Python 3.0 or higher
- Jupyter Notebook
- pandas
- NumPy
- Matplotlib
- pip, to install the following from PyPI:
  - SciPy
  - seaborn
To add these packages with pip, run:
pip install notebook
pip install pandas
pip install numpy
python -m pip install -U pip
python -m pip install -U matplotlib
pip install scipy
pip install seaborn
- Clone the repository.
- Save the folder.
- Open jupyter notebook from the command line or start menu.
- Navigate to the saved location of the repo.
- Open walken.ipynb.
- Click the Cell tab and then Run All.
If you cloned or downloaded this entire repo, you should already have the CSVs that the web scrapers produced. These include:
cowbell1_4247.csv
billboard.csv
If you want to run these on your own:
- Install Beautiful Soup.
- If desired, change the folder location where the CSVs are written. For example, change

      df.to_csv(f'billboard.csv', sep=',', encoding='utf-8-sig', index = False)

  to

      df.to_csv(f'./csv/billboard.csv', sep=',', encoding='utf-8-sig', index = False)

- Run the file in your terminal:

      python cowbell.py
      python billboard.py
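If the output path is changed to a subfolder such as `./csv/`, that folder has to exist before the file is written. The sketch below shows one way to guarantee that, assuming the folder may not be there yet. It uses the standard library's `csv` module rather than pandas so it runs on its own; `write_rows` is a hypothetical helper, and the row is made-up sample data.

```python
import csv
import tempfile
from pathlib import Path


def write_rows(path, header, rows):
    """Write rows to a CSV file, creating the parent folder first if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # utf-8-sig matches the encoding the scrapers pass to to_csv.
    with path.open("w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


# Made-up row; a temporary folder keeps the demo from touching the repo.
with tempfile.TemporaryDirectory() as tmp:
    out = write_rows(Path(tmp) / "csv" / "billboard.csv",
                     ["Date", "Single", "Artist"],
                     [["January 1", "Example Song", "Example Band"]])
    written = out.read_text(encoding="utf-8-sig").splitlines()

print(written)
```

In the scrapers themselves, the same effect could probably be had by calling `Path('./csv').mkdir(exist_ok=True)` just before `df.to_csv(...)`.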
Requirements met for Code Louisville:
- Create a dictionary or list, populate it with several values, retrieve at least one value, and use it in your program.
PIE CHARTS

    # Wedges and labels
    decades = [pre50, the50s, the60s, the70s, the80s, the90s, the00s, the10s]
    labels = ['pre50', 'the50s', 'the60s', 'the70s', 'the80s', 'the90s', 'the00s', 'the10s']
    ...
    # Pie Chart
    ax1.pie(decades, explode=explode, labels=labels, autopct='%1.1f%%')
- Read data from an external file, such as text, JSON, CSV, etc and use that data in your application.
SEVERAL EXAMPLES, INCLUDING

    # Use pandas to read CSV
    walken = pd.read_csv('./csv/c_walken.csv')
- Create and call at least 3 functions or methods, at least one of which must return a value that is used somewhere else in your code.
METHOD- REPLACING M & K WITH 000s FOR DATAFRAME

    # Replace M and K with corresponding 0s in millions and thousands
    walken['box'] = walken['box']\
        .replace(r'[KM]+$', '', regex=True)\
        .astype(float) \
        * (walken['box'].str\
            .extract(r'[\d\.]+([KM]+)', expand=False)\
            .fillna(1)\
            .replace(['K','M'], [10**3, 10**6])\
            .astype(int))

FUNCTION- REPLACE Y TICKS WITH 'M' ON GRAPH

    # Set Y1 Ticks
    def millions(x, pos):
        """The two args are the value and tick position."""
        return '${:1.1f}M'.format(x*1e-6)

METHOD- REPLACE ALPHA WITH NUMERIC

    # Replace month names with numerals in Date column
    newdates = {
        'Date': {
            'January': '1',
            'February': '2',
            'March': '3',
            'April': '4',
            'May': '5',
            'June': '6',
            'July': '7',
            'August': '8',
            'September': '9',
            'October': '10',
            'November': '11',
            'December': '12'}
    }
    bill.replace(newdates, regex=True, inplace=True)
- Analyze text and display information about it (ex: how many words in a paragraph).
COMBINE .isin(), CONCAT, DROP DUPLICATES IN DF. USED TO GRAPH DATA

    # Combine songs that are on both UltimateCowbell and Billboard
    combo2 = songs[songs['Song']\
        .isin(bill.Single)]\
        .sort_values(by='Band', ascending=True)

    # Make a new df
    newlist2 = pd.DataFrame(data=combo2, columns=['Band', 'Song', 'Year'])
    newlist2.columns = ['artist', 'song', 'year']

    # Read CSV
    extrabell = pd.read_csv('./csv/extra_cowbell.csv')

    # Make final df of cowbell songs
    frames = [newlist, extrabell, newlist2]
    final_list = pd.concat(frames)

    # Drop duplicate songs
    final_list = final_list.drop_duplicates(subset=['song'])
- Visualize data in a graph, chart, or other visual representation of data.
SEVERAL
- Implement a “scraper” that can be fed a type of file or URL and pull information off of it.
billboard.py, cowbell.py, script.js
- Use pandas, matplotlib, and/or numpy to perform a data analysis project. Ingest 2 or more pieces of data, analyze that data in some manner, and display a new result to a graph, chart, or other display.
SEVERAL
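Returning to the box-office conversion listed under the functions requirement above: the same K/M expansion can be written as a small standalone function, which may be easier to read and test one value at a time. This is a sketch, not the notebook's code; `box_to_number` is a hypothetical name, and it assumes values look like `1.5M`, `250K` or a bare number with no currency symbol.

```python
import re

# Multipliers for the suffixes the notebook's regex handles.
_MULTIPLIERS = {"K": 10**3, "M": 10**6}
_PATTERN = re.compile(r"^([\d.]+)([KM]?)$")


def box_to_number(text):
    """Convert a string such as '1.5M' or '250K' to a float."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised box office value: {text!r}")
    number, suffix = match.groups()
    # Values without a suffix are taken as already being in whole units.
    return float(number) * _MULTIPLIERS.get(suffix, 1)


# Made-up sample values, not figures from the Walken CSV.
for sample in ["1.5M", "250K", "1200"]:
    print(sample, "->", box_to_number(sample))
```

With a pandas column of strings, a function like this could be applied with `walken['box'].map(box_to_number)`, assuming the column has no missing values.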
Footnotes
1. More Cowbell - SNL, Saturday Night Live, YouTube.
2. Recording Session (More Cowbell), interview with Will Ferrell, Rolling Stone Magazine.
3. Will Ferrell Ruined Christopher Walken's Life with SNL's More Cowbell Sketch, The Tonight Show Starring Jimmy Fallon, YouTube.