Extend uproot.newtree for use with pandas dataframes? #416
Comments
In principle, the That is, just as we can now write tfile["some_hist"] = (np.array(...), np.array(...)), we'd be able to write tfile["some_tree"] = pd.DataFrame(...). That's not a bad idea. I'm leaving this as an open issue in case someone wants to take it up.
I can give it a try, though if someone beats me to it, all the better :)
That would be great, if you have a chance. For reference, the histogram types are translated in a systemized way in uproot-methods: https://github.com/scikit-hep/uproot-methods/blob/9e98414d5c155fa902d13cf40d1c66dd0a1461d4/uproot_methods/convert.py#L14-L54

However, you probably can't just add this case to that, because that mechanism converts each histogram type into an object with the fields uproot is looking for; nothing dynamic. This conversion is a little different because the TTree interface is not just " TTrees are special, so they can be handled with special code in uproot. (I put all the histogram-handlers in uproot-methods because data analysis types get beyond uproot's mission of being only I/O.) I don't think it would be a bad separation of concerns to put this special-case check directly in

Just be sure that you detect the DataFrame without forcing Pandas to be loaded. The user might not even have Pandas, and they wouldn't want it to be "accidentally" imported just to find out whether the thing on the right-hand side might be a DataFrame. (It might not.) You can use something like this to check the object's type non-invasively: https://github.com/scikit-hep/uproot/blob/163bf0ab0a5b9d16e7aee61b8ab19e0b0412a83d/uproot/tree.py#L119

It might sound like a bad idea to check an object's type by string, but in Python, that's essentially what any type check is. (When you import a module, that's a particular name in a global namespace.) It would be a problem if Pandas moves the internal location of

Thanks!
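The string-based type check described above could look roughly like the sketch below. The helper name `is_pandas_dataframe` is hypothetical and this is not the code at the linked line of uproot/tree.py; it only illustrates inspecting a class's name and module without importing Pandas.

```python
def is_pandas_dataframe(obj):
    """Hypothetical helper: guess whether obj is a pandas DataFrame.

    Only the class's name and module strings are inspected, so this
    function never imports pandas itself.
    """
    return any(
        cls.__name__ == "DataFrame"
        and (cls.__module__ or "").split(".")[0] == "pandas"
        for cls in type(obj).__mro__
    )


# Stand-in class so the sketch runs without pandas installed; a real
# DataFrame's class reports a module under the "pandas" package.
FakeDataFrame = type("DataFrame", (), {"__module__": "pandas.core.frame"})

print(is_pandas_dataframe(FakeDataFrame()))   # True
print(is_pandas_dataframe({"x": [1, 2, 3]}))  # False
```

Comparing only the top-level package name, rather than the full module path, is one way to keep the check working if the class is relocated inside the package. Walking the MRO means subclasses of DataFrame would also be recognized.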
Can I still commit here or should I push to uproot4 instead?
I'll still see it here. I've been moving other issues to uproot4 because that's where the new development is happening, but for this one, file-writing hasn't started in uproot4 yet, and it will likely have a different interface (to try to learn from issues faced with this one), so comments on that are probably more relevant here than there.
Great. Will try to have the PR here by the end of the week.
Is it possible to extend the uproot.newtree functionality to take a pandas DataFrame as input? It seems possible, as the type to be written can be inferred from the dtype of each column. The only catch I see is columns with dtype 'O' (object), which could be ignored.
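As a rough illustration of the dtype inference described above, the sketch below maps each column to its dtype name and skips object columns. The helper `branch_types` and the `FakeDtype` stand-in are hypothetical, introduced only so the example runs without NumPy or Pandas; real dtypes expose the same `kind` and `name` attributes.

```python
from collections import namedtuple

# Stand-in for a numpy dtype (assumption for this sketch): only the
# `kind` and `name` attributes are used below.
FakeDtype = namedtuple("FakeDtype", ["kind", "name"])


def branch_types(column_dtypes):
    """Hypothetical helper: map column name -> dtype name.

    Columns whose dtype kind is "O" (object) are skipped, since no
    fixed-width type can be inferred for them.
    """
    return {
        name: dtype.name
        for name, dtype in column_dtypes.items()
        if dtype.kind != "O"
    }


columns = {
    "pt": FakeDtype("f", "float64"),
    "n": FakeDtype("i", "int32"),
    "label": FakeDtype("O", "object"),
}
print(branch_types(columns))  # {'pt': 'float64', 'n': 'int32'}
```

With a real DataFrame df, the corresponding call would presumably be branch_types(df.dtypes), since df.dtypes yields (column, dtype) pairs from .items(). The resulting dict of names to type strings is assumed here to be the kind of argument uproot.newtree accepts.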
I guess one implementation could be