Tighten up data/metadata mismatch section #8
In particular, mention (without recommending) alternative approaches to VOTable/parquet column mapping in complex cases, and make explicit the reader behaviour that should defend against encountering such alternatives.
I don't know whether to add here some explicit text about how parquet types are mapped to VOTable types in straightforward cases, something like:
That in turn would raise the question of what you do with types that are somewhat but not quite like those ones:
In both of those cases I don't know what's best. For the signed/unsigned issue the right answer may be a DALI-defined Given that I expect a good majority of tables won't run into either issue, and that implementation experience may be required to get this right, is it best to:
I'm inclined to go with (1) or (2), but others might disagree.
VOParquet may serve at least 2 purposes: From my understanding (I do not remember whether it is explicitly mentioned or not), the current version of the document covers case A. If we want to cover case B, I think that we should mention at least (2). Then, to go further, it raises the question of VOTable as a pivot format. I think that the VO lacks a description of tabular data and metadata which would be file format agnostic, i.e. decoupled
I've written the document from the point of view of A. There is some feature creep into B (e.g. adding RESPONSEFORMAT=parquet in DALI) but I feel like trying to solve all the problems associated with that is too much at this stage. In TOPCAT if I see parquet columns that I can't make into
I've added some text that explicitly lists those column types that can be mapped to VOTable in a straightforward way. Since this comes before the paragraph about what to do for types that can't be straightforwardly mapped, I think it covers (1) and a nod towards (2).
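For readers following the thread, here is a rough sketch of what a "straightforward" mapping plus defensive reader behaviour could look like. It is an illustration only and not the list added to the document: the table `STRAIGHTFORWARD_TYPES`, the helper names, and the Arrow-style spellings of the parquet type names are all assumptions made for this example. The one fact it leans on is that VOTable's integer datatypes are `unsignedByte`, `short`, `int` and `long`, so signed 8-bit and unsigned 16/32/64-bit columns have no exact counterpart.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup from parquet column type names (Arrow-style spelling,
# an assumption of this sketch) to VOTable datatype attribute values.
# Only types with an exact VOTable counterpart are listed.
STRAIGHTFORWARD_TYPES: Dict[str, str] = {
    "bool": "boolean",
    "uint8": "unsignedByte",
    "int16": "short",
    "int32": "int",
    "int64": "long",
    "float": "float",
    "double": "double",
}


def votable_datatype(parquet_type: str) -> Optional[str]:
    """Return the VOTable datatype for a parquet type, or None if there is
    no straightforward equivalent (e.g. int8, uint32)."""
    return STRAIGHTFORWARD_TYPES.get(parquet_type)


def split_columns(schema: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Partition columns into those with a direct VOTable datatype and those
    a reader has to handle some other way, rather than guessing a mapping."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for name, parquet_type in schema.items():
        datatype = votable_datatype(parquet_type)
        if datatype is None:
            unmapped.append(name)
        else:
            mapped[name] = datatype
    return mapped, unmapped
```

Calling `split_columns({"ra": "double", "flags": "uint32"})` would put `ra` in the mapped set and report `flags` as unmapped, leaving the decision about what to do with it (skip, widen, warn) to the reader implementation.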