Convert a table into a dict of Jagged arrays #378

nfoppiani · 2019-10-15T04:40:46Z

This issue is mainly a question.
I would like to know how to convert a table obtained with uproot.open(root_filename)[tree].lazyarrays() into a dict of jagged arrays, like the one normally returned by uproot.open(root_filename)[tree].arrays().
The goal is to take a table, compute a function that produces a mask (a vector of True or False values with one entry per row of the table), and then create a dict of jagged arrays without the masked rows.

I need this because I want to run functions that produce new jagged arrays (or new columns, if we think in terms of tables), and these functions apply not to all rows but only to those for which the mask is True.

Additionally, I noticed that operations on the dict of arrays run faster than the same operations on the table. Is this expected?

jpivarski · 2019-10-15T10:22:03Z

You can do this:

{n: a[n] for n in a.columns}

for some jagged array a containing a Table. Each a[n] is a zero-cost projection through the structure (one of the advantages of a columnar format).

There are also MaskedArray techniques to produce filtered arrays that have the same length as the original, but maybe try this first.
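The row-filtering step from the original question can be sketched as follows. This is only an illustration: plain NumPy arrays stand in for the jagged columns, the branch names "pt" and "eta" and the cut value are made up, and it assumes each column accepts a boolean row mask the way a NumPy array does.

```python
import numpy as np

# Hypothetical stand-in for the dict of columns built with
# {n: a[n] for n in a.columns}; the branch names are invented.
arrays = {
    "pt": np.array([10.0, 25.0, 3.0, 40.0]),
    "eta": np.array([0.1, -1.2, 2.4, 0.7]),
}

# One boolean per row: True means the row is kept.
mask = arrays["pt"] > 5.0

# Apply the same row mask to every column, producing a new dict.
selected = {name: column[mask] for name, column in arrays.items()}
```

Because the mask is computed once and applied per column, every entry of the resulting dict keeps the same number of rows.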

nfoppiani · 2019-10-15T17:45:14Z

It worked, thanks!
But I would love to know how to use MaskedArrays to assign a new column to a table when that column exists only for a subset of the rows.
For example, the function that computes the values of the new column may be computable only on a subset of the rows.

jpivarski · 2019-10-15T21:51:06Z

If I remember right, ufuncs applied to a MaskedArray are only computed for the unmasked elements. They're less often used than jagged arrays and tables, but @nsmith- has had some success with them.

nsmith- · 2019-10-16T18:28:02Z

For example, this emits no RuntimeWarnings:

import awkward
import numpy as np

a = np.random.normal(size=100)
# The mask is True where a < 0, so the negative entries are masked out
ma = awkward.MaskedArray(a < 0, a)
np.sqrt(ma)

However, it copies the valid subset of the array under the hood. An older issue proposes using the (fairly new) where argument of NumPy ufuncs instead: scikit-hep/awkward-0.x#110
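For reference, the where argument of a NumPy ufunc can be sketched in plain NumPy as below. This is not the awkward implementation, only an illustration of the NumPy feature; the input values and the choice of NaN as the fill value are assumptions made for the example.

```python
import numpy as np

a = np.array([4.0, -1.0, 9.0, -16.0, 0.25])
valid = a >= 0

# Pre-fill the output so that skipped positions hold a known value (NaN here).
out = np.full_like(a, np.nan)

# sqrt is evaluated only where `valid` is True; the other slots of `out`
# are left untouched, so the negative inputs are never passed to sqrt.
np.sqrt(a, out=out, where=valid)
```

Passing out explicitly matters: without it, the positions where the condition is False are left uninitialized in the result.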

jpivarski · 2020-06-18T00:58:55Z

I think this is done and can be closed. Let me know if I'm wrong!

jpivarski closed this as completed Jun 18, 2020
