philippjfr edited this page Jun 8, 2015 · 9 revisions

A single plot can represent at most a few dimensions before it becomes visually overwhelming and cluttered. Real-world datasets often have far greater dimensionality, however, so we face a tradeoff between representing the full dimensionality of our data and keeping the visual representation intelligible and therefore effective. In practice we are limited to two or at most three spatial axes, in addition to the color, angle and size of the visual elements. To explore higher-dimensional spaces effectively we therefore have to find other solutions.

One way of dealing with this problem is to lay out multiple plots spatially; some plotting packages [Was14], [Wkh09] have shown how this can be done easily with various grid-based layouts. Another solution, made possible by web-based technologies, is to introduce interactivity, allowing the user to reveal further dimensionality by interacting with the plots. Various solutions exist to bring interactivity to scientific visualization, including IPython notebook widgets, Bokeh and the R language's shiny [shiny] web application framework. While these tools can provide extremely polished interactive graphics, setting them up typically requires additional effort and custom code, placing a barrier in front of their primary use case, the interactive exploration of data.

In HoloViews we solve this problem through a number of declarative and composable data structures, which can embed visual elements in an arbitrarily dimensioned space. These data structures make it easy to lay your data out in time and space, and they make it simple to transform one into another, so the user can iterate until they find how the data is best presented. These dimensioned spaces include:

  • HoloMaps - The core data structure in HoloViews, allowing visual Elements to be embedded in an n-dimensional space, which can be rendered as an animation (i.e. video) or as an interactive widget.
  • GridSpaces - A one- or two-dimensional data structure laying plots out in a grid, where each grid axis maps onto a data dimension.
  • NdLayouts/NdOverlays - Dimensioned equivalents to the Layout and Overlay data structures introduced in the collections section.

Since these data structures are all composable, even very highly dimensioned data can be explored without writing custom code to draw subplots or generate interactive widgets. This allows the user to focus on exploring their data and gaining insights rather than on laying out their plots in one particular format.

This approach makes it easy to generate complex figures, widgets and animations, and it encourages a declarative, readable style. It also has another major benefit: it is always clear where the data used to generate a figure was drawn from, and the data always remains accessible. In fact, just like all the basic Element types, Spaces can be tabularized into a HoloViews table or a pandas dataframe and then be built back up into a different visual representation in just a few lines of code.
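The round trip between a dimensioned space and a flat table can be illustrated without HoloViews at all. The sketch below is only an analogy in plain Python: a dictionary keyed by tuples stands in for a space and a list of rows stands in for a table. The dimension names 'Trial' and 'Condition', the sample values and the helpers `to_rows` and `from_rows` are all hypothetical and are not part of the HoloViews API.

```python
# Plain-Python analogy only; none of these helpers are HoloViews functions.
kdims = ['Trial', 'Condition']  # hypothetical key dimensions

# A stand-in "space": each key tuple maps to a small list of (x, y) samples.
space = {
    (1, 'control'): [(0, 0.0), (1, 0.5)],
    (1, 'treated'): [(0, 0.1), (1, 0.9)],
    (2, 'control'): [(0, 0.0), (1, 0.4)],
}

def to_rows(space):
    """Flatten the space into rows: key columns first, then the sample columns."""
    return [key + sample for key, samples in space.items() for sample in samples]

def from_rows(rows, n_kdims):
    """Rebuild a keyed space by splitting each row into key and sample columns."""
    rebuilt = {}
    for row in rows:
        key, sample = row[:n_kdims], row[n_kdims:]
        rebuilt.setdefault(key, []).append(sample)
    return rebuilt

rows = to_rows(space)                       # one row per sample
roundtrip = from_rows(rows, len(kdims))     # same structure as `space`
```

Because every row carries its key columns, the rows could just as well be regrouped by a different subset of columns, which is the sense in which a table can be built back up into a different representation.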

To get a real sense of how composing data and generating complex figures works within this framework, we can explore some artificial data. Here we will simply vary the phase and frequency of a sine wave. We therefore declare the dimensions of our data as 'Phase' and 'Frequency'. This declares the space your data lives in. Once the dimensions are declared, the HoloMap and the other space data structures work very much like a regular Python dictionary and can be constructed in the same way: either by supplying a dictionary directly, keyed by tuples whose length matches the number of dimensions, or item by item via __setitem__.

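     import numpy as np
     import holoviews as hv

     dims        = ['Phase', 'Frequency']   # key dimensions of the space
     xs          = np.linspace(0, np.pi, 20)
     frequencies = np.linspace(1, 5, 5)
     phases      = np.linspace(0, 2*np.pi, 9)
     # One Curve per (phase, frequency) combination, keyed by that tuple
     data        = {(phase, freq):
                    hv.Curve(np.sin((xs+phase)*freq))
                    for phase in phases
                    for freq in frequencies}
     curves      = hv.HoloMap(data, kdims=dims)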

Now that we have declared a dimensioned space for the data, that space can be transformed in any number of ways by grouping particular dimensions into the different data structures. For example, we can assign the 'Frequency' dimension to be overlaid:

     curves.overlay('Frequency')
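Conceptually, assigning a dimension to an overlay regroups the keys: the remaining dimensions still index the space, while the chosen dimension moves inside each item. The following is a plain-Python sketch of that regrouping under simplifying assumptions, not HoloViews' implementation. Plain dictionaries stand in for the HoloMap and its overlays, the helper `group_by_dimension` is hypothetical, and a smaller set of phases and frequencies is used to keep the example short.

```python
import math

dims = ['Phase', 'Frequency']
phases = [0.0, math.pi / 2, math.pi]
frequencies = [1, 2]
xs = [i * math.pi / 4 for i in range(5)]

# Stand-in for the HoloMap: (phase, freq) -> list of y-values.
data = {(phase, freq): [math.sin((x + phase) * freq) for x in xs]
        for phase in phases for freq in frequencies}

def group_by_dimension(data, dims, inner):
    """Move the `inner` dimension inside each item; the others stay as outer keys."""
    idx = dims.index(inner)
    grouped = {}
    for key, value in data.items():
        outer = key[:idx] + key[idx + 1:]          # key without the inner dimension
        grouped.setdefault(outer, {})[key[idx]] = value
    return grouped

# Outer keys are now (phase,) tuples; each value maps frequency -> y-values.
overlaid = group_by_dimension(data, dims, 'Frequency')
```

Grouping by 'Phase' instead only requires changing the last argument, which mirrors how a different dimension could be chosen for the overlay without rebuilding the data.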

Rather than committing to one particular layout of the data, we can quickly and interactively experiment with different representations until we arrive at the most effective one.

[Was14] Michael Waskom et al., seaborn: v0.5.0, Zenodo, 10.5281/zenodo.12710, November 2014.
[Wkh09] Hadley Wickham, ggplot2: elegant graphics for data analysis, Springer New York, 2009.
[shiny] RStudio, Inc., shiny: Easy web applications in R, http://shiny.rstudio.com, 2014.