Spreadsheets are the tool of choice for data analysis and reporting among a majority of business professionals, who are well-acquainted with common data processes like sorting, filtering, and aggregation.
But spreadsheets have their limitations, which templates and complex formulas fail to solve. For quick, dynamic and repeatable data processes, Python is a helpful addition. The thought of learning a programming language may be intimidating to spreadsheet users, but they know more about data and coding than they think. This course serves to bridge the gap between spreadsheets and Python, by helping users pivot their existing knowledge about data into a new application.
What you'll learn — and how you can apply it By the end of this live, hands-on, online course, you’ll understand:
- Python’s powerful universe of open-source packages and tools
- The Python equivalent of common spreadsheet tasks such as PivotTables and lookups
- Where Python can augment and automate common data preparation and manipulation tasks often done in spreadsheets
- The role of variables, objects, and functions in Python
And you’ll be able to:
- Load, view and write spreadsheet files from Python
- Perform common data wrangling tasks such as sorting, filtering, and aggregation
- Navigate and execute code in Jupyter notebooks
- Identify, install and implement useful packages for your needs
This training is for you because...
- You're an analyst responsible for collecting, analyzing, and interpreting data for business insights.
- You want to learn how to use Python inside Jupyter Notebooks along with its most common packages.
- You would like to speed up, automate, and validate your reporting and analysis using open-source software.
- You’re a spreadsheet user interested in learning more about data science.
Prerequisites What prior knowledge or experience is necessary?
- You should be comfortable with basic tasks and functions in Excel, including sorting and filtering, and the use of IF statements. Ideally, you've also used conditional aggregates like SUMIF() and COUNTIF() as well as PivotTables and VLOOKUP().
- No prior programming knowledge needed. As an Excel user, you already know more programming than you realize!
Recommended preparation Any setup instructions or links to Safari-based content? Any supplemental materials, like PDF worksheets or links to code repositories?
- Read Chapters 1 “Preliminaries” and 2 “Python Language Basics, IPython, and Jupyter Notebooks” in Python for Data Analysis, 2nd edition **(book)
- This course will run Python on the cloud using O’Reilly’s JupyterHub, so no setup is required.
Recommended follow-up Links to Safari-based content for further learning
- Download an Anaconda distribution of Python on your computer (See Anaconda’s instructions).
- Read Automate the Boring Stuff with Python, 2nd edition (book)
- Read Python for Data Analysis, 2nd edition (book)
Up and running from spreadsheets to Python (50 minutes)
- Presentation: Welcome to Planet Python
- What is Python and when would you use it instead of a spreadsheet?
- Presentation: Hello from Jupyter
- Navigating and executing Python code from Jupyter notebooks
- Presentation: From spreadsheet cells and ranges to Python lists and dictionaries
- Creating, inspecting and manipulating lists and dictionaries
- Exercise: Practice assigning, indexing and subsetting variables in Python
- Q&A
Break (10 minutes) Working with tabular data (50 minutes)
- Presentation: From spreadsheet tables to Python DataFrames
- Creating, inspecting and manipulating DataFrames
- Importing spreadsheet data into Python
- Presentation: From lookups and PivotTables to Pandas manipulation
- Manipulating tabular data in Pandas: sorting, summarizing, merging, re-shaping, exporting to spreadsheets
- Exercise: Practice manipulating and analyzing tabular data with Pandas
- Q&A
Python for data analysis (50 minutes)
- Presentation: Data visualization with seaborn
- Visualizing univariate and bivariate distributions: bar charts, histograms, scatter plots, line charts
- Customizing plots and themes
- Presentation: From “That’s hard in spreadsheets” to “That’s easy in Python!”
- Conducting an end-to-end data analysis project: appending, transposing, summarizing and visualizing a set of csv files
- Exercise: Build an end-to-end Python data analysis project from spreadsheet data
- Q&A
Break (10 minutes)