Read trees are not in sync #176

Open
kreczko opened this issue Jul 31, 2019 · 0 comments

kreczko commented Jul 31, 2019

One big issue with reading multiple trees is that each tree has to be read independently, so nothing guarantees that the trees stay in sync with each other, e.g.

TreeChain(
    treeName,
    self.input_files,
    cache=True,
    events=self.nevents,
)

or

uproot.iterate(
    self.input_files,
    treeName,
    entrysteps=self._batch_size,
)

The easiest way to safeguard ourselves against trees drifting out of sync is to switch to file-based reading: one file at a time. For vectorized processing, this means calling uproot.open on each file and then, for each tree, lazyarrays(entrysteps=batch_size).
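A minimal sketch of the file-based approach, written without a hard dependency on uproot so the control flow is easy to see. The names `iterate_in_sync` and `open_file` are hypothetical and not part of this project or of uproot. `open_file` is assumed to return a mapping from tree name to something that supports `len()` and slicing; in the real code that role would presumably be played by `uproot.open` followed by `lazyarrays` on each tree.

```python
from typing import Callable, Dict, Iterator, Mapping, Sequence, Tuple


def iterate_in_sync(
    input_files: Sequence[str],
    tree_names: Sequence[str],
    batch_size: int,
    open_file: Callable[[str], Mapping[str, Sequence]],
) -> Iterator[Tuple[str, int, int, Dict[str, Sequence]]]:
    """Yield (file, start, stop, {tree_name: batch}), one file at a time.

    Every tree in a batch is sliced with the same entry range of the
    same file, so the trees cannot drift apart.
    """
    for path in input_files:
        # Hypothetical opener: maps tree name -> sliceable, sized object.
        trees = open_file(path)

        # Refuse to continue if the trees in this file disagree on length.
        lengths = {name: len(trees[name]) for name in tree_names}
        if len(set(lengths.values())) > 1:
            raise ValueError(f"{path}: trees differ in length: {lengths}")

        n_entries = next(iter(lengths.values()), 0)
        for start in range(0, n_entries, batch_size):
            stop = min(start + batch_size, n_entries)
            batch = {name: trees[name][start:stop] for name in tree_names}
            yield path, start, stop, batch
```

Because the entry range is computed once per file and applied to all trees, a mismatch between trees shows up as an explicit error at the file where it occurs rather than as silently misaligned events.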

@kreczko kreczko added this to the version 0.6 milestone Jul 31, 2019
kreczko added a commit to kreczko/cms-l1t-analysis that referenced this issue Aug 2, 2019
kreczko added a commit to kreczko/cms-l1t-analysis that referenced this issue Aug 2, 2019
kreczko added a commit to kreczko/cms-l1t-analysis that referenced this issue Aug 6, 2019