(Almost) everything you need to know as an applied mathematician / statistician concerning coding and system administration.
- Joseph Salmon : [email protected]
- Benjamin Charlier : [email protected]
This course material was improved with the help of some students including:
- Amelie Vernay
- Tanguy Lefort
Students are expected to know the basic notions of probability, optimization, linear algebra and statistics for this course. Some rudiments of coding (if, for, while, functions) are recommended but not mandatory.
This course focuses on discovering good practices for professional coding (the language used is Python, but some elements of bash and git will also be useful).
A special focus on data processing and visualization will be at the heart of the course.
We will mostly focus on basic programming concepts, as well as on discovering the Python scientific libraries, including numpy, scipy, pandas, matplotlib and seaborn.
Beyond pandas ninja skills, we will also introduce modern practices for coders: unit tests, version control, documentation generation, etc.
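To give a flavor of the unit tests mentioned above, here is a minimal sketch. The function `running_mean` and the test names are hypothetical, chosen only for illustration, and the `test_` naming convention assumes a test runner such as pytest, which this page does not specify.

```python
def running_mean(values):
    """Return the cumulative means of a sequence of numbers."""
    means, total = [], 0.0
    for i, v in enumerate(values, start=1):
        total += v
        means.append(total / i)
    return means


# Each test checks one small, well-defined behavior of the function.
def test_running_mean_known_values():
    assert running_mean([2, 4, 6]) == [2.0, 3.0, 4.0]


def test_running_mean_empty_input():
    # An empty input should give an empty output, not an error.
    assert running_mean([]) == []
```

Saved in a file such as `test_running_mean.py` (a hypothetical name), these functions could be collected and run automatically by a test runner.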
- BC : (09/09/2022) Introduction to Linux essentials and command line tools: regexp, grep, find, rename
- BC : (16/09/2022) IDE: VSCode, Python virtual environments: Anaconda, terminal, etc.
- BC : (23/09/2022) Git: a first introduction, github, ssh key creation, various git commands, conflict, pull request; see also Bonus/, hands on git
- BC (quiz 1) + JS : (30/09/2022) Create a Python Module, classes (__init__, __call__, etc.), operator overloading, file handling
- JS : (03/10/2022 + 07/10/2022) unit tests
- JS : (10/10/2022 + 14/10/2022) Pandas: first steps / missing data
- JS : (17/10/2022 + 21/10/2022) scipy, numpy: Images/channel
- JS (quiz 2) : (28/10/2022) Sparse matrices, graphs and memory
- BC : (18/11/2022) Documentation with Sphinx
- JS + BC : (09/12/2022) The end: Project presentations
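As a taste of the session on classes and operator overloading listed above, the following is a small illustrative sketch, not course material. The `Polynomial` class and its attribute `coeffs` are hypothetical names introduced here; the example only shows how `__init__`, `__call__` and an overloaded `+` fit together.

```python
class Polynomial:
    """Polynomial with coefficients stored from lowest to highest degree."""

    def __init__(self, coeffs):
        # Called when the object is created: Polynomial([1, 2]) is 1 + 2x.
        self.coeffs = list(coeffs)

    def __call__(self, x):
        # Makes the object callable like a function: p(x).
        # Evaluation uses Horner's scheme.
        result = 0
        for c in reversed(self.coeffs):
            result = result * x + c
        return result

    def __add__(self, other):
        # Overloads the + operator: pad the shorter list with zeros,
        # then add the coefficients term by term.
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return Polynomial([u + v for u, v in zip(a, b)])

    def __repr__(self):
        return f"Polynomial({self.coeffs})"


p = Polynomial([1, 2])     # 1 + 2x
q = Polynomial([0, 0, 3])  # 3x^2
s = p + q                  # 1 + 2x + 3x^2
print(s, s(2))             # prints: Polynomial([1, 2, 3]) 17
```

Storing the coefficients in a plain list keeps the sketch short; a real module would likely validate its inputs and define further operators.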
Two short quizzes of 20 min each (on Moodle). Each quiz is individual work.
- Quiz 1 BC (30/09/2022, 10%)
- Quiz 2 JS (28/10/2022, 10%)
Warning: the precise details of the projects might evolve before the allocation phase, and a precise grading grid will be given in the project section.
Warning: the project repository must show a balanced contribution between group members; grades may vary within a group to reflect an unbalanced workload.
One bonus point on the final grade of the course can be obtained for contributions improving the course material (practicals, Readme, etc.). See the Bonus section for more details on how to proceed.
The resources for the course are available on the present github repository. Additional elementary material on Python (in French) is available in the course HLMA310 and the associated lecture notes IntroPython.pdf.
- (General) The Missing Semester of Your CS Education
- (Data Science) J. VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data, 2016, https://jakevdp.github.io/PythonDataScienceHandbook/
- (General) Skiena, The Algorithm Design Manual, 1998
- (General) Courant et al., Informatique pour tous en classes préparatoires aux grandes écoles : Manuel d'algorithmique et programmation structurée avec Python, 2013 (in French)
- (General/data science) Guttag, Introduction to Computation and Programming Using Python: With Application to Understanding Data, 2016
  Associated videos: http://jakevdp.github.io/blog/2017/03/03/reproducible-data-analysis-in-jupyter/
- (Code and style) Boswell and Foucher, The Art of Readable Code, 2011
- (Scientific computing tools for Python) http://www.scipy-lectures.org/
- (Visualization) http://openclimatedata.net/