A list of topics for Google Summer of Code (GSoC) 2011
Mentor : O. Grisel
Goal : Devise an intuitive yet efficient API dedicated to the incremental fitting of some scikit-learn estimators (on an infinite stream of samples for instance).
See this thread on the mailing list for a discussion of such an API. Design decisions will be made by implementing or adapting three concrete models:
- text feature extraction
- online clustering with sequential k-means
- generalized linear model fitting with Stochastic Gradient Descent (both for regression and classification)
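To make the idea of incremental fitting concrete, here is a minimal sketch of the second model, sequential k-means, fed one mini-batch at a time. The class name `SequentialKMeans`, the method name `partial_fit`, and the `stream_batches` generator are assumptions chosen for illustration; they are not the API decided by this project. The sketch uses only the standard library so that it runs on its own, whereas a real implementation would rely on NumPy arrays.

```python
class SequentialKMeans:
    """Hypothetical estimator sketch; not an API defined by the scikit."""

    def __init__(self, n_clusters):
        self.n_clusters = n_clusters
        self.centers_ = []  # one list of floats per cluster
        self.counts_ = []   # number of samples absorbed by each center

    def _nearest(self, x):
        # Index of the center with the smallest squared Euclidean distance to x.
        def sq_dist(center):
            return sum((v - c) ** 2 for v, c in zip(x, center))
        return min(range(len(self.centers_)),
                   key=lambda j: sq_dist(self.centers_[j]))

    def partial_fit(self, X):
        """Update the centers with one mini-batch of samples."""
        for x in X:
            x = [float(v) for v in x]
            if len(self.centers_) < self.n_clusters:
                # Seed the centers with the first n_clusters samples seen.
                self.centers_.append(x)
                self.counts_.append(1)
                continue
            j = self._nearest(x)
            self.counts_[j] += 1
            rate = 1.0 / self.counts_[j]
            # Running-mean update: move the winning center toward the sample.
            self.centers_[j] = [c + rate * (v - c)
                                for c, v in zip(self.centers_[j], x)]
        return self

    def predict(self, X):
        return [self._nearest([float(v) for v in x]) for x in X]


def stream_batches():
    """Stand-in for an unbounded stream: yields small batches of 2-D points."""
    yield [[0.0, 0.0], [10.0, 10.0]]
    yield [[1.0, 1.0], [9.0, 9.0]]
    yield [[0.5, 0.5], [10.5, 10.5]]


model = SequentialKMeans(n_clusters=2)
for batch in stream_batches():
    model.partial_fit(batch)  # the model never sees the whole dataset at once

labels = model.predict([[0.2, 0.1], [9.8, 10.1]])
```

Each call to `partial_fit` only needs the current batch and the per-center counts, so memory use stays constant however long the stream runs. Whether the same method name and state-keeping convention would suit text feature extraction and SGD-based linear models is exactly the kind of design question the project is meant to settle.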
Mentor : Gael Varoquaux, Alex Gramfort
The objective is to bring to the scikit some recent and very popular methods known as Dictionary Learning or Sparse Coding. These methods involve heavy numerical computing and have many applications, from general signal and image processing to applied fields such as biomedical imaging. The project will start from existing code snippets (see below) and will require design decisions that keep the API as simple yet powerful as the rest of the scikit.
Some useful resources with a compatible license:
- NMF + Hoyer method in milk
Mentor : Satra
Mentor : Fabian Pedregosa
Mentor : Satra
(There is already a preliminary implementation in my fork.) I would combine this with boosting/bagging.
Mentor : Mathieu Blondel?
There is an LSH implementation in pybrain (pybrain/supervised/knn/lsh)
Mentor : ?
Mentor : Vincent Michel?