-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Table
output in Data.read_many
#11546
Conversation
4c4b4e0
to
68df118
Compare
type Return_As_Table | ||
## Returns a table with a new column `Value` containing the objects loaded | ||
from each file. | ||
|
||
When the source for files to load was a table, all columns from the | ||
original table are also retained. In case of name clashes, the newly | ||
added columns will get a suffix. | ||
|
||
When the source was a simple Vector, the returned table will also contain | ||
a `File Name` column. | ||
Table_Of_Objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name Table_Of_Objects
was a working name. I'm not sure if we want to keep it or come up with something better?
Definitely Table_Of_Tables
in Excel sounded much better. But here it may not necessarily be tables, it may be various objects. So Table_Of_Objects
is quite precise, but it sounds weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should change approach altogether and try something like Added_Column
? (meaning that the loaded files are added as a column to the input)
Values_Column
Table_With_New_Column
any thoughts on these names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Table_With_New_Column
or Table_Of_Objects
seem the best to me, Values_Column
I think could end up being confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about more of a boolean? Merged/UnMerged. Merged True/False? I feel like it would be hard for someone to guess what table_of_objects is going to do without reading the help or trying it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if:
type Return_As
## Single table with values expanded
Merged_Table
## PRIVATE
Add this for backwards compatiblity then hide in widget?
Table_Of_Tables
##
Individual_Cells
## Objects returned as a raw vector allowing zipping to input
Vector
Not sure on name yet either - will think some more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sorry that wasn't clear. When I said boolean I didn't mean a boolean type. More text options that were Merged or Not Merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vector
is handled byReturn_As_Base
- the type needs to be split up so that theVector
return works even ifTable
is not imported. It is opaque to the user - the dropdown displays all available types and autoscoping works as shown in tests.I'd not add
Table_Of_Tables
toData.read_many
. It is supported inExcel_Workbook.read_many
to retain backwards compatibility. But adding it to a brand new function seems unnecessary - why add an obsolete option?
I was in my head thinking of it as a single type used by both functions but I appreciate we are actually building this dynamically. Just for working out how it might look in the dropdown thinking as a single type is a bit easier!
If the ..Table_Of_Tables
was an option not shown in the dropdown but in one of the mix in types then no special handling would be needed for the two reads to share the same object was my thought.
I wonder if:
..As_Vector
..With_New_Column
..As_Merged_Table
Might be cleanest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the
..Table_Of_Tables
was an option not shown in the dropdown but in one of the mix in types then no special handling would be needed for the two reads to share the same object was my thought.
But then more code would be affected by the legacy options, I'd rather not grow places that take legacy options when not needed. Unless there's any problem with the current approach, I'd keep the compatibility conversion as-is.
I wonder if:
..As_Vector
..With_New_Column
..As_Merged_Table
Might be cleanest?
That sounds quite clear to me 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I like those options
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The options outlined in the comment above have been implemented. Excel_Workbook.read_many
also allows old options for compatibility, but Data.read_many
(the new component) only works with new ones.
@@ -16,146 +7,3 @@ type Match_Columns | |||
|
|||
Note: column names are not compared. | |||
By_Position | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No meaningful code changes in this file, just moving the helpers to separate module - Match_Columns_Helpers
.
…API for now TODO: to be revisited in #11681
# TODO ideally we'd like to use `Meta` but it currently has no support for UnresolvedConstructor so we use a 'hack'. | ||
is_the_default = value.to_text == "..As_Merged_Table" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking to_text
feels like a bit of a hack, but currently Meta
seems to lack support for UnresolvedConstructor
- Meta.meta
returns Primitive
and does not allow any inspecting.
It feels like it may prove useful in the future, @jdunkerley do we want to create a ticket for the engine team to add support for Unresolved_Constructor
in Meta
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have done it via
type My_Private_Type
As_Merged_Table
and trying to convert to that.
I agree we could probably do with something on Meta.
cc: @JaroslavTulach for visibility.
# TODO ideally we'd like to use `Meta` but it currently has no support for UnresolvedConstructor so we use a 'hack'. | ||
is_the_default = value.to_text == "..As_Merged_Table" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have done it via
type My_Private_Type
As_Merged_Table
and trying to convert to that.
I agree we could probably do with something on Meta.
cc: @JaroslavTulach for visibility.
Pull Request Description
Data.read_many
#11311Important Notes
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
or the Snowflake database integration, a run of the Extra Tests has been scheduled.