I recommend doing these problems in a Jupyter notebook.
- Read in the comma-separated file "client_list.csv". Assign as variable `df1`.
- Read in the delimited file "client_list.table". Assign as variable `df2`.
- Read in the fixed-width file "client_list.txt". Assign as variable `df3`.
- Read in the comma-separated file "client_list.csv", skip the first 3 rows, and ignore the header. Do not assign to a variable (just display the result).
- Read in the comma-separated file "client_list.csv". Set the column headers in all caps. Assign as variable `df`.
- Read in the comma-separated file "client_list_practice.csv" and only extract the columns `["FIRST_NAME","AGE","EYE_COLOR"]`. Do not assign to a variable.
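As a warm-up for the reading problems above, here is a minimal sketch of the relevant `pandas` reader options. It runs on small made-up text held in memory rather than on the exercise files, so the column names, the pipe delimiter, and the values are all assumptions for illustration.

```python
import io
import pandas as pd

# Made-up stand-in for a client file; the real column names may differ.
csv_text = "first_name,age,eye_color\nAna,34,brown\nBen,27,blue\nCy,41,green\n"

# Plain read: the first line of the file becomes the header.
toy = pd.read_csv(io.StringIO(csv_text))

# Upper-case every column label.
toy.columns = toy.columns.str.upper()

# Skip the first 2 lines of the file and treat what remains as headerless data.
no_header = pd.read_csv(io.StringIO(csv_text), skiprows=2, header=None)

# Keep only selected columns while reading.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["first_name", "age"])

# A file with another delimiter: pass sep (a pipe here, chosen arbitrarily).
piped = pd.read_csv(io.StringIO(csv_text.replace(",", "|")), sep="|")

# Fixed-width text: read_fwf infers column boundaries from whitespace by default.
fwf_text = "name  age\nAna   34\nBen   27\n"
fixed = pd.read_fwf(io.StringIO(fwf_text))
```

For the actual problems, pass the file name in place of the `io.StringIO(...)` object, and open the delimited file in a text editor first to see which separator it uses.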
- Slice rows 5 through 11 of `df`. Can you provide two ways of doing this?
- Return only the columns `['LAST_NAME','AGE','HAIR_COLOR']` for `df`. Can you provide two ways of doing this?
- Combine problems 7 and 8: return rows 5 through 11 and columns `['LAST_NAME','AGE','HAIR_COLOR']` for `df`. Can you provide two ways of doing this?
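If the "two ways" part of the slicing problems is unclear, the sketch below contrasts position-based and label-based selection on a made-up five-row frame. The data is invented, and the row range is deliberately different from the one in the problems.

```python
import pandas as pd

# Invented data; only the column names echo the exercise.
toy = pd.DataFrame({
    "LAST_NAME": ["Ito", "Khan", "Lee", "Moss", "Novak"],
    "AGE": [25, 31, 47, 38, 52],
    "HAIR_COLOR": ["black", "red", "brown", "red", "gray"],
})

# Rows by position: the end of an iloc slice is excluded.
rows_by_position = toy.iloc[1:4]

# Rows by label: the end of a loc slice is included.
rows_by_label = toy.loc[1:3]

# Columns with double brackets, or with loc and a full row slice.
cols_brackets = toy[["LAST_NAME", "AGE"]]
cols_loc = toy.loc[:, ["LAST_NAME", "AGE"]]

# Rows and columns together: one loc call, or chained selections.
both_loc = toy.loc[1:3, ["LAST_NAME", "AGE"]]
both_chained = toy.iloc[1:4][["LAST_NAME", "AGE"]]
```

The two row selections agree here only because the frame has the default integer index, where labels and positions coincide.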
- Find the subset of `df` where the client's last name is "Smith".
- Find the subset of `df` where the client's hair color is not black.
- Find the subset of `df` where the client's hair color is red and reset the values to "ginger".
- Find the subset of `df` where the clients are females older than 30 years.
- Repeat problem 13, but return only the hair color and eye color.
- Find the unique combination of hair and eye color for women older than 25 years.
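The subsetting problems above all rest on boolean masks. Here is a minimal sketch on invented data; the `GENDER` column and its `"F"`/`"M"` coding are assumptions, so check how the real file encodes this before reusing the conditions.

```python
import pandas as pd

# Invented data; GENDER and its coding are assumed, not taken from the file.
toy = pd.DataFrame({
    "LAST_NAME": ["Smith", "Khan", "Smith", "Moss", "Novak"],
    "GENDER": ["F", "M", "F", "F", "M"],
    "AGE": [25, 31, 47, 38, 52],
    "HAIR_COLOR": ["black", "red", "red", "red", "gray"],
    "EYE_COLOR": ["brown", "blue", "green", "green", "blue"],
})

# A comparison yields a boolean Series that selects the matching rows.
smiths = toy[toy["LAST_NAME"] == "Smith"]
not_black = toy[toy["HAIR_COLOR"] != "black"]

# Combine conditions with & and wrap each one in parentheses.
is_woman_over_30 = (toy["GENDER"] == "F") & (toy["AGE"] > 30)
women_over_30 = toy[is_woman_over_30]

# loc takes a row mask and a column list in one call.
colors = toy.loc[is_woman_over_30, ["HAIR_COLOR", "EYE_COLOR"]]

# drop_duplicates keeps one row per distinct combination.
unique_pairs = colors.drop_duplicates()

# Assigning through loc changes the original frame in place.
toy.loc[toy["HAIR_COLOR"] == "red", "HAIR_COLOR"] = "ginger"
```

Assigning through `toy.loc[mask, column]` rather than through a filtered copy is what makes the change stick in the original frame.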
- Perform a `merge` using "client_list.csv" and "customer_id_list.csv". Assign the resulting dataframe as `clients`.
- Perform a `merge` using `clients` and "purchase_log.csv" and limit the subset to only clients who made purchases. Assign the resulting dataframe as `detailed_sales`.
- Use `groupby` to find the client who spent the most money on purchases. Determine how much he/she spent. HINT: save the intermediate dataframe from using `groupby` as `spenders` before applying slicing to determine the client who spent the most money on purchases.
- (BONUS) Modify the answer to problem 18 slightly to determine exactly what items were purchased by the top spending client.
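For the merge and `groupby` problems above, the pattern can be sketched on two tiny invented tables. The key column `CLIENT_ID` and the `ITEM` and `PRICE` columns are hypothetical; the real files may use different names.

```python
import pandas as pd

# Invented tables; CLIENT_ID, ITEM and PRICE are hypothetical column names.
toy_clients = pd.DataFrame({
    "CLIENT_ID": [1, 2, 3],
    "LAST_NAME": ["Ito", "Khan", "Lee"],
})
toy_purchases = pd.DataFrame({
    "CLIENT_ID": [1, 1, 3],
    "ITEM": ["lamp", "desk", "chair"],
    "PRICE": [20.0, 150.0, 80.0],
})

# An inner merge keeps only keys present in both tables,
# so client 2, who bought nothing, drops out.
toy_sales = toy_clients.merge(toy_purchases, on="CLIENT_ID", how="inner")

# Total spending per client.
toy_spenders = toy_sales.groupby("CLIENT_ID")["PRICE"].sum()

# idxmax returns the index label (here the client id) of the largest total.
top_id = toy_spenders.idxmax()
top_total = toy_spenders.max()

# Filter the merged table back down to the top spender's items.
top_items = toy_sales.loc[toy_sales["CLIENT_ID"] == top_id, "ITEM"].tolist()
```

Sorting `toy_spenders` with `sort_values(ascending=False)` and taking the first entry is an equally reasonable route to the top spender.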
- Save `detailed_sales` as a csv file named "df_out.csv" with no indices.
- Save `detailed_sales` to a pickle file named "df_out.p".
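The two saving problems above map onto `to_csv` and `to_pickle`. The sketch below writes an invented frame to a temporary directory and reads it back; the file names are placeholders, not the ones the problems ask for.

```python
import os
import tempfile
import pandas as pd

# Invented frame standing in for the real result.
toy = pd.DataFrame({"ITEM": ["lamp", "desk"], "PRICE": [20.0, 150.0]})

# Write into a throwaway directory so nothing in the working folder is touched.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "toy_out.csv")
pickle_path = os.path.join(out_dir, "toy_out.p")

# index=False leaves the row index out of the csv file.
toy.to_csv(csv_path, index=False)

# A pickle stores the frame as is, including dtypes and index.
toy.to_pickle(pickle_path)

# Read both files back to check the round trip.
from_csv = pd.read_csv(csv_path)
from_pickle = pd.read_pickle(pickle_path)
```

Without `index=False`, reading the csv back produces an extra unnamed column holding the old index.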