Some fundamental machine learning and data-analysis techniques are explained through realistic examples.

ignasineira/Machine_Learning

 
 

Machine_Learning

This repo contains introductions to some of the most important machine learning and data-analysis techniques.

Each entry is preceded by its date in DDMMYY format (DDMMYYYY for later entries). For descriptions and more, check the Wiki Page.

190818: PCA_Muller.py, Principal component analysis example with the breast cancer data-set.

270918: RidgeandLin.py, LassoandLin.py: Lasso and Ridge regression examples.

081018: bank.csv, data-set from a Portuguese company selling products to random customers over phone call(s). Data-set description is available here.

161018: gender_purchase.csv, two-column data-set describing whether customers bought a product, by gender.

111118: winequality-red.csv, red wine data set, where the output is the quality column which ranges from 0 to 10.

121118: pipelineWine.py, A simple example of applying Pipeline and GridSearchCV together using the red wine data.
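
The Pipeline plus GridSearchCV pattern can be sketched as follows. This is a minimal illustration, not the contents of pipelineWine.py: the synthetic data from `make_classification`, the `SVC` estimator, the step names and the parameter grid are all assumptions chosen to keep the snippet self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption); the script itself uses winequality-red.csv.
X, y = make_classification(n_samples=300, n_features=11, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Scaling lives inside the pipeline, so every CV fold fits the scaler
# on its own training split only.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Putting the scaler inside the pipeline, rather than scaling the whole data set up front, keeps information from the validation folds out of the preprocessing step.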

24112018: lagmult.py, This program demonstrates a simple constrained optimization problem using figures.
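
As a rough illustration of constrained optimization with a Lagrange multiplier, the sketch below solves a hypothetical problem (not necessarily the one plotted in lagmult.py): maximize f(x, y) = x * y subject to x + y = 1. The analytic solution is then checked by brute force along the constraint.

```python
# Hypothetical problem (assumption, chosen for illustration):
# maximize f(x, y) = x * y subject to g(x, y) = x + y - 1 = 0.

def f(x, y):
    return x * y

# Stationarity of L = f - lam * g gives y = lam and x = lam;
# substituting into the constraint x + y = 1 gives lam = 1/2.
lam = 0.5
x_star, y_star = lam, lam

# Brute-force check: walk along the constraint line y = 1 - x
# and keep the point with the largest objective value.
candidates = [i / 1000 for i in range(-1000, 2001)]
x_best = max(candidates, key=lambda x: f(x, 1 - x))

print(x_star, y_star, f(x_star, y_star))
print(x_best, 1 - x_best)
```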

11122018: Consumer_Complaints_short.csv, three columns: the complaint, product_label and category. Complete file can be obtained from Govt.data.

13122018: Text-classification_compain_suvo.py, Classifies the consumer complaints data described above.

1912018: SVMdemo.py, This program shows the effect of using an RBF kernel to map from 2D space to 3D space. The animation requires ffmpeg on a Unix system.
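
The idea of lifting 2D points into 3D can be sketched without any plotting. The snippet below is an illustrative simplification, not the code in SVMdemo.py: it adds a single explicit coordinate, the RBF similarity to the origin, whereas the feature space implied by a true RBF kernel is infinite-dimensional. The ring radii, `gamma` and `threshold` are hypothetical values.

```python
import math
import random

random.seed(0)

def ring(radius, n, label):
    """Sample n noisy points near a circle of the given radius."""
    points = []
    for _ in range(n):
        angle = random.uniform(0, 2 * math.pi)
        r = radius + random.gauss(0, 0.05)
        points.append((r * math.cos(angle), r * math.sin(angle), label))
    return points

# Two concentric rings: no straight line in 2D separates them.
data = ring(0.5, 100, 0) + ring(2.0, 100, 1)

gamma = 1.0  # hypothetical kernel width

def lift(x, y):
    """Add a third coordinate: the RBF similarity to the origin."""
    return (x, y, math.exp(-gamma * (x * x + y * y)))

# In 3D, the horizontal plane z = threshold separates the two rings.
threshold = 0.3  # hypothetical cut between the two rings' z values
predictions = [0 if lift(x, y)[2] > threshold else 1 for x, y, _ in data]
accuracy = sum(p == label for p, (_, _, label) in zip(predictions, data)) / len(data)
print(accuracy)
```

Points near the origin get a z value close to 1 and distant points a value close to 0, so a flat plane in the lifted space corresponds to a circular boundary in the original 2D space.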

05032019: IBM_Python_Web_Scrapping.ipynb, Deals with basic web scraping, string handling and image manipulation.

06042019: datacleaning, Folder containing files and images related to data cleaning with pandas.

08062010: DBSCAN_Complete, Folder containing files and images related to applying the DBSCAN algorithm to cluster weather stations in Canada.

13072019: SVM_Decision_Boundary, Pipeline and GridSearchCV are used to find the best-fit parameters for an SVM, and the decision function contours of the SVM classifier are then plotted for binary classification.

28122019: DecsTree, Folder contains a notebook using a decision tree classifier on the Bank Marketing Data-Set.

07032020: Conjugate Prior, Folder contains a notebook where the concept of a conjugate prior is discussed, including an introduction to PyMC3.
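
A conjugate prior keeps the posterior in the same family as the prior, so the update reduces to simple arithmetic on the parameters. The sketch below uses the standard Beta-Binomial pair as an example; it is plain Python rather than PyMC3, and the prior and the observed counts are made-up illustrative numbers, not taken from the notebook.

```python
def beta_binomial_update(alpha, beta, successes, trials):
    """Beta(alpha, beta) prior plus binomial data gives a Beta posterior."""
    return alpha + successes, beta + (trials - successes)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

prior = (2.0, 2.0)  # hypothetical prior, weakly centred on 0.5
posterior = beta_binomial_update(*prior, successes=7, trials=10)

print(posterior)              # (9.0, 5.0)
print(beta_mean(*posterior))  # 9 / 14
```

No sampling is needed here because the posterior is available in closed form; a sampler such as the one in PyMC3 becomes useful once the model stops being conjugate.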

29052020: ExMax_Algo, Folder contains a notebook explaining the Expectation-Maximization algorithm in full.

11092020: AdaptiveLoss.ipynb, File contains a description and a simple implementation of the robust and adaptive loss function. Original Paper by J. Barron. More details on TDS.

31092020: pima_diabetes.ipynb, File describes data preparation and choosing the best machine learning algorithm for a binary classification task. A little more detail is in the Kaggle kernel.

15112020: terrorism_kaggle.ipynb, Notebook contains detailed examples of how to think about problems and interpret large-scale data using the Global Terrorism Database. Apart from the Pandas groupby and crosstab methods, I have also used the Folium and Basemap libraries to visualize Leaflet maps and 2D data on maps, respectively. More on The Startup.

15022021: FocalLoss_Ex.ipynb, Notebook explains in detail how Focal Loss works. Please read the original Focal Loss paper. An example of implementing Focal Loss with TensorFlow is also shown. For more detail, check the post on TDS.
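
For orientation, the binary focal loss from the paper, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), can be sketched in a few lines. This is plain Python rather than the notebook's TensorFlow implementation; the function name and the clipping constant `eps` are assumptions, and the defaults gamma = 2 and alpha = 0.25 are the values reported in the paper.

```python
import math

def binary_focal_loss(y_true, p, gamma=2.0, alpha=0.25):
    """Focal loss for one binary example; p is the predicted probability of class 1."""
    eps = 1e-7  # assumed clipping constant to avoid log(0)
    p = min(max(p, eps), 1 - eps)
    p_t = p if y_true == 1 else 1 - p              # probability given to the true class
    alpha_t = alpha if y_true == 1 else 1 - alpha  # class-balancing weight
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)

easy = binary_focal_loss(1, 0.9)  # confident, correct prediction
hard = binary_focal_loss(1, 0.1)  # confident, wrong prediction
print(easy, hard)
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified examples much more than that of misclassified ones, and with gamma = 0 the expression reduces to alpha-weighted cross-entropy.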

19062021: Augly_Try.ipynb, Notebook contains examples of image augmentation using Facebook's AugLy library. For more detail, check the notebook and the TDS post.
