Google Summer of Code 2015 Ideas

Daniel Mendler edited this page Mar 2, 2015 · 26 revisions

Contact

Feel free to reach us by joining #sciruby on chat.freenode.net or via our mailing list.

Instructions for students

See also: GSoC Student Application

We strongly recommend that you pick one of the ideas listed below. We value contributions in advance of GSoC, even if they're just little ones. Go pick out something in one of our trackers and work on it, talk to folks on the listserv, and get an idea for what features are needed.

You don't need to know a lot about Ruby to work on a project: depending on how much you already know, it'll be pretty easy to learn enough to be able to contribute. However, you may need some familiarity with scientific computation. If you don't have any, take a look at "Numerical Recipes in C", which you'll probably find in your university's library.

In any case, if you feel your skills aren't enough for some project, please ask us on our IRC channel (see contact section above) or our Google Group (see sciruby.com to sign up) and we can help you.

Our number-one priority right now as an organization is NMatrix. Our number-two priority is most likely visualization. If we write good visualization software, SciRuby will become much more accessible to people.

Read this before you commit your first patches

SciRuby's main landing page on GitHub mostly holds the stable versions of the SciRuby gems. Developers and contributors should instead work on the very latest (bleeding-edge) repositories, so that changes can be committed without conflicts arising.

If you would like a brief introduction to finding the latest development gems on GitHub, try reading Finding The SciRuby Development Repositories on Github.

How to submit a patch ("pull request")

Here's a great tutorial: http://www.thinkful.com/learn/github-pull-request-tutorial/

Have a look and feel free to ask if you have any questions.

Note about "recommended skills"

We used to say "required skills," but realized there may exist cultural as well as gender differences in how people interpret this phrase. We would like you to have at least one of the listed skills. More is better. Remember that GSoC is a learning experience, and we expect that you'll be lacking in some areas of knowledge.

One of the most important skills in science and engineering is knowing how to say, "I don't know." If you don't know something, look it up, try to understand it, and then feel free to ask for help on our listserv or in IRC.

Project ideas

NMatrix projects

NMatrix is SciRuby's numerical matrix core, implementing dense matrices as well as two types of sparse matrix storage (linked-list-based and Yale/CSR). NMatrix is a fairly well-established project which has received Summer-of-Code-like grants from both Brighter Planet and the Ruby Association (in other words, from Matz, who created Ruby). Those who contribute to NMatrix will likely eventually become authors of a jointly published, peer-reviewed science article on the library. Additionally, NMatrix is a good place to gain practical C and C++ experience while also working to improve Ruby.

NMatrix currently relies on ATLAS/CBLAS/CLAPACK and standard LAPACK for several of its linear algebra operations. In some cases, native versions of the functions are implemented, so that the libraries are not required. There are quite a number of areas for growth in terms of the capabilities of NMatrix here.

Adding Linear Mixed Model (LMM) support to SciRuby

  • Mentors: Pjotr Prins (@pjotrp), Carlos Agarie (@agarie)
  • We would like to add LMM support to SciRuby in a separate gem that uses nmatrix, BLAS and friends for the matrix manipulations. The implementation can be derived from Python's statsmodels and R's lme4 package.
  • Recommended skills: You should be strong in statistics, willing to learn Ruby, and able to read the Python and R implementations.
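
For orientation, the standard textbook form of a linear mixed model is given below. The symbols are the conventional ones and are not taken from this page: y is the response vector, X and Z are the fixed-effect and random-effect design matrices, and G and R are covariance matrices. A gem would estimate the fixed effects β and the covariance parameters, which is where most of the matrix work comes in.

```latex
\begin{aligned}
  y &= X\beta + Zu + \varepsilon, \\
  u &\sim \mathcal{N}(0, G), \qquad \varepsilon \sim \mathcal{N}(0, R), \\
  \operatorname{Var}(y) &= Z G Z^{\top} + R .
\end{aligned}
```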

Abstraction of ATLAS/CBLAS/CLAPACK or OpenBLAS into a separate gem.

  • Mentors: John Woods (@mohawkjohn), Carlos Agarie (@agarie), Colin Fuller (@cjfuller)
  • Right now, NMatrix is able to do some math natively, in C, and some by linking to ATLAS/CBLAS/CLAPACK. This can cause problems because some systems are not ATLAS compatible, or have different flavors of LAPACK. NMatrix core should have as few dependencies as possible, and a separate gem (nmatrix-atlas) should be constructed which enables ATLAS extensions to work.
  • Another possibility is to have an external gem that interfaces NMatrix with OpenBLAS, an easier to install implementation of BLAS that also includes LAPACK.
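
One way to picture the split between a dependency-free core and an optional accelerated gem is sketched below. This is only an illustration, not NMatrix's actual design: the module `MatrixBackend`, the method `select_backend`, and the feature name `hypothetical_atlas_backend` are all made up for the example. The sketch tries to `require` the optional extension and falls back to a plain Ruby implementation when it is not installed.

```ruby
# Hypothetical sketch: module, method, and feature names are assumptions
# made for illustration; they are not part of NMatrix.
module MatrixBackend
  # Pure-Ruby fallback used when no accelerated extension is installed.
  module Native
    def self.backend_name
      :native
    end

    # Naive matrix product on arrays of rows.
    def self.matmul(a, b)
      cols = b.transpose
      a.map { |row| cols.map { |col| row.zip(col).sum { |x, y| x * y } } }
    end
  end

  # Try to load an optional extension. If the require succeeds, the block
  # returns the accelerated backend; otherwise the fallback is used.
  def self.select_backend(feature, fallback: Native)
    require feature
    block_given? ? yield : fallback
  rescue LoadError
    fallback
  end
end

# The feature name below does not exist, so the native fallback is chosen.
backend = MatrixBackend.select_backend('hypothetical_atlas_backend')
product = backend.matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

With this shape, the core never fails to load just because ATLAS or OpenBLAS is missing; the separate gem only has to make its feature requirable.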

Ability to interface with external libraries beyond ATLAS.

  • Mentors: John Woods (@mohawkjohn), Colin Fuller (@cjfuller)
  • In addition to the discussion in the previous idea, NMatrix should have the ability to leverage other libraries that might be installed, such as eigen3, or maybe even boost (nmatrix-eigen3, nmatrix-boost, and perhaps nmatrix-gsl). NMatrix should be able to switch seamlessly between them. One important design question to think about when applying: How does NMatrix choose which library to use if all three implement a given function? For example, if eigen3 and atlas both have matrix multiplication, which one should be used?
  • A related project is the writing of eigen3 and boost interfaces for NMatrix, though these are lower priority than adapting NMatrix to ATLAS. Another option is the Intel Math Kernel Library. Work in these areas would likely depend upon the F2RB project discussed further down the page, or perhaps FFI.
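
The design question above (which library wins when several implement the same function) could be approached with a small priority registry. The sketch below is one possible answer under stated assumptions, not a proposal endorsed by the mentors: `BackendRegistry` and its methods are hypothetical, and the lambdas are plain Ruby stand-ins for real library calls.

```ruby
# Hypothetical registry; none of these names come from NMatrix itself.
class BackendRegistry
  Entry = Struct.new(:name, :priority, :operations)

  def initialize
    @entries = []
  end

  # Register a backend with a priority and a hash of operation => callable.
  def register(name, priority:, operations:)
    @entries << Entry.new(name, priority, operations)
    self
  end

  # Return the highest-priority backend that implements the operation.
  def backend_for(operation)
    candidates = @entries.select { |e| e.operations.key?(operation) }
    raise KeyError, "no backend implements #{operation}" if candidates.empty?

    candidates.max_by(&:priority)
  end

  # Run the operation on whichever backend wins.
  def call(operation, *args)
    backend_for(operation).operations.fetch(operation).call(*args)
  end
end

# Stand-in implementations: real backends would call into C libraries.
dot = ->(a, b) { a.zip(b).sum { |x, y| x * y } }

registry = BackendRegistry.new
registry.register(:native, priority: 0,
                           operations: { dot: dot, transpose: ->(m) { m.transpose } })
registry.register(:atlas, priority: 10, operations: { dot: dot })
```

Here `registry.backend_for(:dot)` picks the `:atlas` entry because of its higher priority, while `:transpose` falls through to `:native`, the only backend that provides it. A real design would also need to decide whether users can override the priorities.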

Ruby/GSL gem updates

  • Mentors: John Woods (@mohawkjohn), Pjotr Prins (@pjotrp)
  • SciRuby has its own fork of GSL which provides NMatrix compatibility (in lieu of NArray). Unfortunately, it's in need of some clean-up. While this is not in-and-of-itself an entire GSoC project, it could easily be combined with components of other projects.
  • Recommended skills: You should be comfortable with C and be willing to learn Ruby, or vice-versa.

Fast Ruby to Julia bindings through the LLVM

  • Mentors: Maurice Diamantini, Pjotr Prins (@pjotrp)

  • Julia is a new LLVM-based computer language aimed at statistics on large datasets. Julia offers Matlab-style linear algebra and libraries for optimization, graphs, etc. It is gaining traction in the scientific community and can be used from R and Python through so-called bindings. Rubinius is a Ruby implementation that also targets the LLVM (see Rubinius design). In principle, this has the advantage that methods and data can be called natively between Ruby and Julia, possibly outperforming other bindings. At this point the binding strategy is not decided. One possibility is to generate a wrapper, as was done from Julia to Clang, or in our case from Rubinius to Julia. The output of this project would be a Ruby gem that brings Julia functionality to Rubinius while avoiding the copying of large data structures, possibly providing a more Ruby-style interface. The gem does not need to cover all Julia libraries, but it should contain the fundamental bindings that other people can use as templates for future work.

  • Recommended skills: You should be comfortable with Ruby and be willing to learn Julia and the LLVM. Ideally you have some experience with language bindings.

Visualization projects

Ruby native visualization tools

  • Mentors: Pjotr Prins (@pjotrp), John Woods (@mohawkjohn)
  • Over the past few Google Summers of Code, a number of prototype visualization libraries have been developed or extended by students, such as Nyaplot and Plotrb (both created by students) and Rubyvis. None of these are complete, but all of them have some nice features. Much work remains to be done in making these plotting tools useful for a wide array of visualization types.
  • Recommended skills: You should be comfortable with Ruby metaprogramming concepts, or should be prepared to learn them during the application process. You should also teach yourself about how Protovis and D3 work during the application process, and expect to understand how other pieces of plotting software function.

Gnuplot gem update

  • Mentors: John Woods (@mohawkjohn)
  • There's a long-standing Ruby Gnuplot gem which was written several years ago, but it's fallen behind Gnuplot's innovation curve. It'd be great to provide a more robust Ruby Gnuplot which, among other things, produces plots that can be updated in real time by, for example, a pair of ZeroMQ publish–subscribe sockets. While @mohawkjohn has been working on just such a project, his version consists mainly of hacks which make the old Gnuplot gem work with live updates. @jtprince has also created an add-on which supports multiplotting. It'd be better to redesign Ruby Gnuplot from the ground up. This gem is an ideal Summer of Code project because it's only a few hundred lines of code, but does an effective job of leveraging an extremely robust plotting tool.
  • Recommended skills: You should be comfortable with Ruby and have some familiarity with meta-programming. Familiarity with Gnuplot is a plus, but you can learn it easily during the application period. If you work on this project, you may get to work with @mohawkjohn on some space applications; he's currently writing some basic visualization tools for spacecraft guidance, navigation, and control.
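
To give a feel for how thin a Ruby layer over Gnuplot can be, here is a minimal sketch. It is not the API of the existing gnuplot gem: the class `TinyPlot` and its methods are hypothetical. It builds a Gnuplot script that sends data inline through the `'-'` pseudo-file, and `render` assumes a `gnuplot` executable is on the PATH.

```ruby
# Hypothetical helper for illustration; not the existing gnuplot gem's API.
class TinyPlot
  def initialize(title:)
    @title = title
    @series = []
  end

  # Add one named series given as an array of [x, y] pairs.
  def add_series(name, points)
    @series << [name, points]
    self
  end

  # Build the Gnuplot script, sending data inline via the '-' pseudo-file.
  # Each inline data block is terminated by a line containing only "e".
  def to_script
    raise ArgumentError, 'no series to plot' if @series.empty?

    plots = @series.map { |name, _| "'-' with lines title '#{name}'" }
    data = @series.map do |_, points|
      points.map { |x, y| "#{x} #{y}" }.join("\n") + "\ne\n"
    end
    "set title '#{@title}'\nplot #{plots.join(', ')}\n#{data.join}"
  end

  # Pipe the script to gnuplot (assumed to be installed and on the PATH).
  def render
    IO.popen(%w[gnuplot -persist], 'w') { |io| io.write(to_script) }
  end
end

plot = TinyPlot.new(title: 'demo').add_series('squares', [[0, 0], [1, 1], [2, 4]])
puts plot.to_script
```

Calling `plot.render` would open a plot window if Gnuplot is available. Live updating, multiplotting, and quoting of titles that contain apostrophes are left out of this sketch; those are the kinds of things a redesigned gem would need to handle.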

User Interface: IRuby notebook and integration with other scientific tools

  • Mentors: Daniel Mendler (@minad)
  • The IRuby system needs to be improved in stability, ease of installation, and integration with the other scientific Ruby tools (e.g. plotting). IPython 3 is coming soon and will be a major change for other language kernels like IRuby. Its improved integration will make other languages first-class citizens in IPython, but there will be breaking protocol changes.
  • The goal of this project is to get IRuby from its current state to something which is ready for production use! I consider this a very important project since IRuby acts (or can act in the future) as a central component of the SciRuby framework, allowing you to access all the numerical and plotting functionality in a very beginner-friendly way.
  • This project will also require a fair amount of communication with the other SciRuby projects to help them integrate better with IRuby and with each other.
  • Recommended skills: You should be comfortable with common Ruby programming concepts. It would be helpful if you are interested in other technologies and languages too, e.g. for digging into the IPython code or the 0mq-protocol.

Math API projects

Ruby needs efficient tools in scientific domains aside from linear algebra: graph algorithms, mathematical programming, etc. For efficiency, these tools should either be new code written from scratch in C/C++ (which would need several years of work) or bind to existing stable libraries.

LEMON graph library API

  • Mentors: Pjotr Prins (@pjotrp), Maurice Diamantini
  • The LEMON C++ graph library (Library for Efficient Modeling and Optimization in Networks) is a good candidate for a Ruby binding because it has a clean C++ interface. It also provides a general, solver-independent MIP (Mixed Integer Programming) interface to various other free or commercial mathematical solvers (GLPK, Clp, CPLEX, Gurobi, and so on). It is well maintained, and its integration in the COIN-OR set of tools is a gauge of its quality. Such a binding would be a great advance for the operations research and combinatorial optimization Ruby community.
  • Candidates should be comfortable in C++ and familiar with Ruby.