Motivation

With nearly 350,000 employees across 30,000 retail locations, Starbucks is one of the largest multinational coffeehouse chains in the world. Employees are one of its key resources. Starbucks calls its employees partners because they are all partners in shared success. Part of the Starbucks experience is walking into a store and being greeted by great employees who know your name and your favorite drink. The longer a partner works at Starbucks, the more relationships they build and the more experiences they contribute to, which translates into better customer experiences, increased competitive advantage, and greater customer lifetime value.

The question this project sets out to answer is: "How can Starbucks predict when high-value employees are at risk of leaving, so that steps can be taken to minimize turnover?"

Starbucks has a relatively high turnover rate of 65 percent for full-time partners. Replacing a worker can cost as much as 33% of that worker's annual salary. If we assume this statistic holds true for Starbucks, employee turnover could be costing the company approximately $2 billion per year. Reducing that cost by just 0.1% could save about $2 million per year.
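The savings figure follows directly from the two numbers quoted above. A minimal check of the arithmetic, using only the $2 billion estimate and the 0.1% reduction from the text (the variable names are illustrative):

```python
# Figures quoted in the text above
annual_turnover_cost = 2_000_000_000  # approx. $2 billion per year (the text's estimate)
reduction_pct = 0.1                   # a 0.1% reduction in that cost

# 0.1% of the annual cost, rounded to whole dollars
annual_saving = round(annual_turnover_cost * reduction_pct / 100)

print(f"Estimated annual saving: ${annual_saving:,}")
```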

Initial Dataset

Data Preprocessing

The dataset I received was in time-series format. Time series analysis has a number of weaknesses, including problems with generalization from a single study, difficulty in obtaining appropriate measures, and problems with accurately identifying the correct model to represent the data. Therefore, I used Python and Pandas to transform the time-series data, which included over 100M rows, into a data frame of independent observations through an ETL process, and extracted the data for 6,100 talented employees.
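A minimal sketch of this kind of transformation is shown here: it collapses repeated snapshots into one row per employee. The column names (`employee_id`, `snapshot_date`, `position`, `active`) and the toy records are assumptions made for illustration; they are not the actual schema or the notebook's code.

```python
import pandas as pd

# Toy time-series records: one row per employee per snapshot (hypothetical columns)
records = pd.DataFrame({
    "employee_id": [1, 1, 1, 2, 2],
    "snapshot_date": pd.to_datetime(
        ["2019-01-01", "2019-07-01", "2020-01-01", "2019-03-01", "2019-09-01"]
    ),
    "position": ["barista", "barista", "shift supervisor", "barista", "barista"],
    "active": [True, True, True, True, False],
})

# Sort so that "last" really means the most recent snapshot per employee
records = records.sort_values(["employee_id", "snapshot_date"])

# Collapse the snapshots into one independent observation per employee
per_employee = records.groupby("employee_id").agg(
    first_seen=("snapshot_date", "min"),
    last_seen=("snapshot_date", "max"),
    n_positions=("position", "nunique"),
    last_active=("active", "last"),
).reset_index()

# Derived features: observed tenure in years and whether the employee has left
per_employee["tenure_years"] = (
    (per_employee["last_seen"] - per_employee["first_seen"]).dt.days / 365.25
)
per_employee["left"] = ~per_employee["last_active"].astype(bool)

print(per_employee)
```

Each resulting row summarizes one employee, so the rows can be treated as independent observations by a classifier.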

Data Exploration

Built with

Python Pandas
Matplotlib (data exploration and visualization)

Key Skills Learned

Machine Learning - Logistic Regression
Data Extract, Transform, Load
Matplotlib data exploration and visualization

Supervised Learning - Logistic Regression
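To show the mechanics of logistic regression, here is a small from-scratch sketch using only the Python standard library. It is not the project's model: the two features (`tenure_years`, `years_in_position`), the made-up training rows, and the helper names `fit_logistic` and `predict_proba` are all assumptions for illustration. In practice a library implementation would normally be used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: p(left) minus the true label
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that the employee leaves (label 1)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up toy data: [tenure_years, years_in_position]; 1 = left, 0 = stayed
X = [[0.2, 0.1], [0.4, 0.3], [0.6, 0.5], [2.0, 1.5], [2.5, 2.0], [3.0, 2.5]]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X, y)
preds = [int(predict_proba(w, b, x) >= 0.5) for x in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
print(f"Training accuracy on the toy data: {accuracy:.2f}")
```

The toy data is cleanly separable, so its training accuracy says nothing about the accuracy reported for the real model. A realistic evaluation would score the model on held-out employees.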

Conclusion

Identified talented partners as those who have worked at Starbucks for more than 1.09 years and stayed in one position for more than 0.83 years.
Developed a supervised machine learning model with 98% accuracy in predicting whether employees are about to leave or stay, and derived a cost analysis showing a potential saving of $1,220 per employee.


hej6853/Starbucks_People_Analytics
