GSoC 2024 ‐ Project Idea List

Frédéric Collonval edited this page Aug 13, 2024 · 1 revision

JupyterLab Google Summer of Code 2024 Project Idea List

GSoC 2024 Mentors

Frederic Collonval (@fcollonval), Michał Krassowski (@krassowski), Eric Charles (@echarles)

Information for Students

This page lists the potential projects available to GSoC 2024 contributors. Interested applicants can contact us to brainstorm before submitting their application: open an issue on the code repository, join our Gitter chat, post in the dedicated topic on our forum, or send us an email.

An extensible environment for interactive and reproducible computing, based on the Jupyter Notebook.

JupyterLab is the multi-document user interface for Project Jupyter offering all the familiar building blocks of the single-document Jupyter Notebook (notebook, terminal, text editor, file browser, rich outputs, etc.) in a flexible and powerful user interface.

The projects presented here are our top picks for GSoC 2024. See our general projects page for more project ideas. We are open for discussion if you are interested in another project, but we recommend that you contact us early to discuss your ideas.

Project Ideas

  1. Improve user experience using the Jupyter toolkit
  2. Make the plugin system data-based
  3. Create a Swiss army knife builder CLI
  4. Advanced outputs from CDN

Idea 1 - Improve user experience using the Jupyter toolkit

To ease development and improve the user experience, a Jupyter toolkit has been created. It became part of JupyterLab core in the latest minor version.
As a small project, the next step would be to increase its usage in the core code. The scope could be extended to a medium project by adding new advanced components.

  • Complexity: Medium
  • Duration: 90 hours

Description

Leverage the component toolkit to increase UI homogeneity and reduce maintenance burden. A step-by-step plan would be:

  • Use the toolkit search/input components for all search and input fields: file browser, extension manager, debugger kernel sources
  • Use the toolkit button for all buttons: dialogs, extension manager, notifications, running tabs
  • Use the toolkit tree view for all tree views: table of contents, debugger variables and running tabs
  • Use toolkit for the settings editor components

Reference: https://github.com/jupyterlab/jupyterlab/issues/15707

Required Skills

  • TypeScript
  • React
  • Git
  • [optionally] Web components
  • [optionally] GitHub

Expected outcome(s)

  • Multiple pull requests on the core JupyterLab repository, one for each toolkit component used to replace heterogeneous code.
  • Pull requests on the toolkit to add features needed by the core repository and not yet covered by the toolkit.

Idea 2 - Make the plugin system data-based

The JupyterLab application is built by combining a large number of plugins. The plugin system comes from a core library of the project called Lumino.

The current system forces the application to load most of the plugin code at startup. The idea is to explore describing the additional elements as a data model (likely JSON), so that the application can present placeholders for plugin elements without loading the real code. That approach is used successfully in the popular Visual Studio Code editor.

  • Complexity: Hard
  • Duration: 350 hours

Description

Before laying down a plan for Lumino plugins, here is an analysis of the VS Code extension API, which allows plugins to be loaded conditionally. There are two key points:

  • Entry points must be defined in a no-code way
  • Trigger events must be defined in a no-code way

In VS Code, those elements are defined in the extension package.json. The entry points are for example (see contributions documentation):

  • Commands
  • Menus
  • Views

The trigger events are for example (see activation events documentation):

  • onCommand: When a command of the extension is triggered
  • onView: When a specific view is restored
  • onStartupFinished: After the application has started

Defining the contributed elements in a no-code format has a side effect: an evaluable string is needed to determine whether an element can be activated. VS Code calls this a when clause.
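The two key points can be illustrated with a minimal sketch, written in TypeScript rather than as a real package.json. The interfaces, the example-git extension, and both helper functions are hypothetical and only mirror the general shape of VS Code contributions and activation events. In particular, the when field is reduced here to a single context key, whereas real when clauses are full expressions.

```typescript
// Hypothetical, simplified model of a declarative extension manifest.
// It is inspired by VS Code's package.json but is not its actual schema.
interface CommandContribution {
  id: string;
  title: string;
  // Simplification: name of one context key that must be true.
  when?: string;
}

interface ExtensionManifest {
  name: string;
  // Trigger events, e.g. "onCommand:<command id>" or "onView:<view id>".
  activationEvents: string[];
  // Entry points described as data only.
  contributes: { commands: CommandContribution[] };
}

const manifest: ExtensionManifest = {
  name: 'example-git',
  activationEvents: ['onCommand:git.clone', 'onView:git.panel'],
  contributes: {
    commands: [
      { id: 'git.clone', title: 'Clone a Repository' },
      { id: 'git.push', title: 'Push', when: 'gitRepositoryOpen' }
    ]
  }
};

// Decide whether the extension code must be loaded for a given event.
function shouldActivate(m: ExtensionManifest, event: string): boolean {
  return m.activationEvents.some(e => e === event);
}

// Decide whether a contributed command can be activated in the current context.
function isEnabled(
  command: CommandContribution,
  context: Record<string, boolean>
): boolean {
  return command.when === undefined || context[command.when] === true;
}
```

With this shape, the host application can list both commands in its menus and evaluate their availability without ever importing the extension code.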

Back to JupyterLab and Lumino: an equivalent of onStartupFinished could be implemented quickly, but it may have unwanted side effects. For example, if a plugin provides a widget that is used in a notebook output loaded at startup, the widget may or may not be rendered. To ensure robustness, other events should be available; for example, the widget plugin should load before a notebook is opened.

Plugins adding new panels in the main area or in the side panel, like jupyterlab-git, are probably the best candidates for optimization. This requires defining side panels and main area widgets in a no-code way. In JupyterLab, keyboard shortcuts and menus are already defined in settings schema files, so those files seem a more natural place than the extension package.json.

To be able to add a panel, widget titles need to be defined in a no-code way. This requires in particular the definition of icons in a no-code way.

Some first actions could be:

  • We need to figure out how to load the settings defining the entry points and the trigger events without loading the JavaScript code. One possibility would be to define entry points as we do for keyboard shortcuts or menus, i.e. in the settings file definitions. The entry points / no-code definitions, in order of implementation, would be:
    • Icon definition (needed for commands, widget title, file type,...)
    • Command definition
    • Panel definition (via the widget title to be displayed as placeholder in sidebar for example).
    • File type definition (needed for document factory)
    • Document factory
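The first actions above could be prototyped along these lines. This is only a sketch under stated assumptions: none of the names below (PanelDefinition, PluginDescription, LazyPluginRegistry, example-git) exist in Lumino or JupyterLab, and the loader is a synchronous function for brevity, where a real implementation would more likely use a dynamic import().

```typescript
// All names below are hypothetical; none exist in Lumino or JupyterLab.

// Declarative part: plain data that could live in a settings schema file.
interface PanelDefinition {
  id: string;
  title: string;
  icon: string; // icon name, resolvable without loading plugin code
  area: 'left' | 'right';
}

interface PluginDescription {
  id: string;
  panels: PanelDefinition[];
}

// Code part: only evaluated the first time one of the panels is opened.
type PluginModule = { renderPanel: (panelId: string) => string };
type PluginLoader = () => PluginModule;

class LazyPluginRegistry {
  private descriptions: PluginDescription[] = [];
  private loaders: Record<string, PluginLoader> = {};
  private loaded: Record<string, PluginModule> = {};

  register(description: PluginDescription, loader: PluginLoader): void {
    this.descriptions.push(description);
    this.loaders[description.id] = loader;
  }

  // Placeholders are built from data only; no loader is called here.
  placeholders(): PanelDefinition[] {
    const result: PanelDefinition[] = [];
    for (const description of this.descriptions) {
      for (const panel of description.panels) {
        result.push(panel);
      }
    }
    return result;
  }

  // Called when the user opens a placeholder: load the plugin once, then render.
  openPanel(panelId: string): string {
    for (const description of this.descriptions) {
      if (description.panels.some(panel => panel.id === panelId)) {
        const pluginId = description.id;
        if (!Object.prototype.hasOwnProperty.call(this.loaded, pluginId)) {
          this.loaded[pluginId] = this.loaders[pluginId]();
        }
        return this.loaded[pluginId].renderPanel(panelId);
      }
    }
    throw new Error(`Unknown panel: ${panelId}`);
  }
}

let loadCount = 0;
const registry = new LazyPluginRegistry();
registry.register(
  {
    id: 'example-git',
    panels: [{ id: 'git-panel', title: 'Git', icon: 'git-icon', area: 'left' }]
  },
  () => {
    loadCount += 1; // stands in for loading the real plugin code
    return { renderPanel: id => `rendered ${id}` };
  }
);
```

Calling registry.placeholders() is enough to draw the sidebar entry with its title and icon; the loader only runs on the first registry.openPanel('git-panel') call.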

Reference: https://github.com/jupyterlab/lumino/issues/601

Required Skills

  • TypeScript
  • Git
  • [optionally] JSON schema
  • [optionally] GitHub

Expected outcome(s)

  • Create a proof of concept and a path to introduce this concept in the Lumino plugin system.
  • This will materialize as a code contribution to a dedicated development branch in the Lumino repository.

Idea 3 - Create a Swiss army knife builder CLI

Extract the build tools from the core repository into a separate one.

  • Complexity: Easy
  • Duration: 90 hours

Description

The builder tooling is currently deeply entangled with the core code of JupyterLab.

The goals of this project are to:

  • extract that tooling into a new, separate package to ease maintenance (for core and extension developers)
  • update the existing package to use the new one
  • make it configurable so that it can be reused for other applications.

Further information can be found in the following issue: https://github.com/jupyterlab/jupyterlab/issues/13456

A quick attempt has been started in https://github.com/jupyterlab/hatch-jupyter-builder/pull/107

Required Skills

  • Python 3
  • Git
  • [optionally] TypeScript
  • [optionally] GitHub

Expected outcome(s)

  • A new repository with the extracted code in the JupyterLab GitHub organization
  • Pull requests to modify the core repository to use the new repository

Idea 4 - Advanced outputs from CDN

Jupyter Notebooks are documents mixing text, code snippets and their results. Those results can be highly complex and interactive thanks to JavaScript libraries. The version of those libraries used to produce the document is currently not saved as part of the document. As a result, the same file may not render properly when opened a couple of years later. This idea proposes a solution to that issue.

  • Complexity: Hard
  • Duration: 175 hours

Description

An experiment has been started to get the data output views for notebook cells from CDN: https://github.com/jupyterlab/richoutput-js

That approach would improve:

  • the ability to open very old notebooks in a compatible way
  • the ease of integrating Jupyter output renderers in other applications.
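To make the idea concrete, here is a minimal sketch of how a viewer could resolve a version-pinned renderer from a CDN. It does not describe how richoutput-js works: the RendererInfo metadata shape, the rendererUrl function and the example-renderer package are all hypothetical, and the jsDelivr npm URL layout is used only as one example of a CDN.

```typescript
// Hypothetical metadata saved alongside an output; the notebook format
// does not define these fields today.
interface RendererInfo {
  package: string; // npm package providing the output view
  version: string; // exact version used when the output was produced
  entry: string; // path of the JavaScript asset inside the package
}

// Build a version-pinned URL for the renderer asset.
// The default base follows the jsDelivr layout: <base>/<package>@<version>/<file>.
function rendererUrl(
  info: RendererInfo,
  cdn: string = 'https://cdn.jsdelivr.net/npm'
): string {
  const entry = info.entry.replace(/^\/+/, ''); // drop leading slashes
  return `${cdn}/${info.package}@${info.version}/${entry}`;
}

const url = rendererUrl({
  package: 'example-renderer',
  version: '1.2.3',
  entry: 'dist/index.js'
});
// url is 'https://cdn.jsdelivr.net/npm/example-renderer@1.2.3/dist/index.js'
```

Because the version is part of the URL, a notebook opened years later would still request the renderer it was produced with, and any application able to load a script from a URL could display the output.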

The main challenge of the approach is supporting the existing interactive widgets, which need to communicate with the execution process on the server through a websocket.

So the milestones would be:

  • Demonstrate the validity of the approach to deal with interactive widgets
  • Propose the required changes to the Jupyter project to make the approach official
  • Update the project template to automatically configure the generation of JavaScript assets compatible with the new output renderer.

Required Skills

  • TypeScript
  • Git
  • [optionally] GitHub

Expected outcome(s)

Resources for writing the proposal

Candidates must refer to the GSoC guide regarding the application process and proposal writing. You will also find specific advice there.

You should also look at the NumFocus documentation; in particular, you will find a template for the proposal there. Google also provides two example proposals: example 1 and example 2.