Sample data #1

Open
mccalluc opened this issue Oct 23, 2016 · 1 comment
Comments

@mccalluc

@ngehlenborg: No hurry,

  • but can you point me at a dataset that would be good to use for a first demo? Or suggest how many rows and columns should be accommodated?
  • For the volcano plot and the PCA, is it worthwhile to do the computation in JS? It would make it easier to plug in new sample data, and maybe it could be useful for smaller datasets in the long run, too.
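
For the volcano plot part of that question, here is a rough sketch of what the client-side computation could look like. It assumes the usual volcano convention (x = log2 fold change, y = -log10 p-value) and a hypothetical per-gene input record; `GeneStat`, `VolcanoPoint` and `toVolcanoPoints` are made-up names, not part of any existing code in this project.

```typescript
// Hypothetical input shape: one record per gene (an assumption, not a project type).
interface GeneStat {
  gene: string;
  foldChange: number; // ratio of mean expression between two conditions, must be > 0
  pValue: number; // raw p-value, expected in (0, 1]
}

interface VolcanoPoint {
  gene: string;
  x: number; // log2 fold change
  y: number; // -log10 p-value
}

// Map per-gene statistics to volcano plot coordinates.
// Records that cannot be log-transformed are dropped rather than plotted as NaN or Infinity.
function toVolcanoPoints(stats: GeneStat[]): VolcanoPoint[] {
  return stats
    .filter((s) => s.foldChange > 0 && s.pValue > 0 && s.pValue <= 1)
    .map((s) => ({
      gene: s.gene,
      x: Math.log2(s.foldChange),
      y: -Math.log10(s.pValue),
    }));
}
```

This is a single pass over the genes, so the transform itself should not be the bottleneck at ~25,000 rows; the statistics feeding into it (and the PCA) are the more expensive part.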
@ngehlenborg

Still looking for a good sample data set.

Typical data set sizes we need to be prepared for: ~25,000 genes (= rows) and say 100 conditions/samples (= columns).
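
Until a real data set turns up, a synthetic matrix of that size could stand in for layout and performance testing. The sketch below is only a placeholder under several assumptions: values are uniform random numbers in [0, 1) (not realistic expression values), storage is a row-major `Float32Array`, and `ExpressionMatrix`, `makeSeededRandom` and `makeSyntheticMatrix` are hypothetical names.

```typescript
// Hypothetical container: genes are rows, conditions/samples are columns.
interface ExpressionMatrix {
  rows: number;
  cols: number;
  values: Float32Array; // row-major: values[r * cols + c]
}

// Simple linear congruential generator so the fake data is reproducible.
// Not suitable for anything beyond test data.
function makeSeededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // scale to [0, 1)
  };
}

// Fill a rows x cols matrix with reproducible pseudo-random values.
function makeSyntheticMatrix(rows: number, cols: number, seed: number): ExpressionMatrix {
  const random = makeSeededRandom(seed);
  const values = new Float32Array(rows * cols);
  for (let i = 0; i < values.length; i++) {
    values[i] = random();
  }
  return { rows, cols, values };
}

// Sized like the typical case above: 25,000 genes x 100 samples,
// i.e. 2.5 million 4-byte floats (about 10 MB).
const demo = makeSyntheticMatrix(25000, 100, 42);
```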

In heatmaps, the genes will be filtered to something more manageable (dozens to hundreds), and obviously we can't render 25,000 rows even if we make them 1 pixel high. There will either be overplotting (initial solution) or we find smart ways to aggregate (later solution?).
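
One possible form of the "aggregate" option, sketched under assumptions: consecutive rows are grouped into as many bins as there are available pixel rows, and each bin is drawn as the per-column mean. Mean is just one choice (max or median would show different things), and `aggregateRows` is a hypothetical helper operating on a row-major `Float32Array`.

```typescript
// Reduce a rows x cols row-major matrix to at most targetRows rows by
// averaging consecutive rows within each bin, column by column.
function aggregateRows(
  values: Float32Array,
  rows: number,
  cols: number,
  targetRows: number
): Float32Array {
  // Never produce more output rows than input rows, and always at least one.
  const outRows = Math.max(1, Math.min(rows, targetRows));
  const out = new Float32Array(outRows * cols);
  for (let b = 0; b < outRows; b++) {
    // Bin b covers input rows [start, end); since outRows <= rows, end > start.
    const start = Math.floor((b * rows) / outRows);
    const end = Math.floor(((b + 1) * rows) / outRows);
    for (let c = 0; c < cols; c++) {
      let sum = 0;
      for (let r = start; r < end; r++) {
        sum += values[r * cols + c];
      }
      out[b * cols + c] = sum / (end - start);
    }
  }
  return out;
}
```

For example, 25,000 rows squeezed into 500 pixel rows would average 50 genes per bin. Whether averaging neighbouring genes is meaningful depends on the row order (for instance after clustering), which is an open question here.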

The scatterplots (PCA and volcano) should support the worst case of ~25,000 items. Overplotting at that size is best handled using opacity, although that will make rendering much slower. Note that PCA can be done on both the genes and the samples; the sample case has far fewer points and is much easier to handle in the visualization. I suggest you build on the canvas-based scatter plot that @sgratzl started to work on.
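
To get a feel for how per-point opacity behaves under heavy overplotting, here is a small sketch. It assumes same-colored points drawn with standard "source-over" compositing (the canvas default), in which case n overlapping points with alpha a accumulate to an opacity of 1 - (1 - a)^n. `stackedOpacity` and `alphaForTarget` are hypothetical helpers, not part of the existing scatter plot code.

```typescript
// Accumulated opacity of n overlapping points, each drawn with the given alpha,
// assuming source-over compositing of a single color.
function stackedOpacity(alpha: number, n: number): number {
  return 1 - Math.pow(1 - alpha, n);
}

// Inverse: the per-point alpha at which n overlapping points reach targetOpacity.
function alphaForTarget(targetOpacity: number, n: number): number {
  return 1 - Math.pow(1 - targetOpacity, 1 / n);
}
```

For instance, two overlapping points at alpha 0.5 give 0.75. Picking the alpha so that a dense region of, say, 50 overlapping points only just saturates is one way to keep sparse outliers visible while still showing density.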
