-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
coordinate limits more flexible to allow for other coordinate systems #32
Conversation
circle, range, and polygon refer to point so this is specified in a single place
Well... I'm still unhappy that we're outlawing negative coordinates in
polygons overlapping the stitching line. Sure, once we have properly
moved to left-winding polygons, this is at least possible (it's not
now, for me, given the way pgsphere handles polygons now). But it's
certainly inconvenient to users.
So... I increasingly wonder whether all this is worth the effort at
all. circle(367, 20, 1) is clearly defined, and I'd be surprised if
there were services doing something odd with it.
circle(367, 95, 2), on the other hand, has no sensible interpretation
at all in any sort of spherical coordinates, and I don't think anyone
would be tempted to accept it. One may have doubts with circle(367,
91, 2), but frankly I don't think anyone will accept it.
So... does anyone remember exactly say why we are trying to define
the limits in the first place? Do we still find these reasons
convincing?
Conversely, I *totally* can see a point defining what should happen if a
service rejects a certain geometry. This is particularly non-trivial
in ADQL: are invalid geometries NULL? I could make a pitch in favour
of that (without being convinced).
If, against that, they must raise an error, can we come up with a way
that helps users to figure out what went wrong where? Imagine
someone has written something like
select circle(ra1-ra2, dec, 5) from some_table
[indulge me here, as I have no plausible idea why someone would write
this; but I hope you can believe people might want to have
expressions in geometry constructors]: if the query just aborts, can
we help a user to figure out it was the ra1<ra2 that killed the
query? Imagine some_table having 1e9 rows...
Anyway: I think I'd much rather work on robust, interoperable ways to
communicate rejection than on trying to clarify when a service must
or must not reject coordinates, in particular when all we can say is
vague anyway.
|
The current DALI-1.1 already states coordinate value limits:
That's either too restrictive (implying usable only for equatorial) or vague (only equatorial coords have limits). So we have to improve it somehow. The intent at the time was to limit the serialized values (that are sent between clients<->services) so that handling input correctly was simple. The burden then lies on any s/w doing output of values to conform (e.g. perform range reduction before output). Personally, I don't have a strong position either way. I know what I'd have to change if we completely dropped limits. It is always the case that when something does fail we must get a useful error message in front of user eyeballs so they are empowered to resolve the problem if possible. Two questions:
|
On Thu, Nov 09, 2023 at 10:35:59AM -0800, Patrick Dowler wrote:
1. Given the above sentence from REC-DALI-1.1, what would be the
impact of dropping rather than clarifying the limits? Is that
feasible in a minor version?
On the server side: What was ok remains ok, so we're not making
anything invalid that was valid before. Hence, I don't see how we'd
be breaking anything.
On the client side: The only situation I can see where we're breaking
previous behaviour is when a client relies on the server to reject
RAs outside of [0, 360). Disregarding the question why software
would do that, it would also fail right now with all DaCHS instances
(which don't validate coordinate intervals at the moment and can't
really do that because of cross-stitching-line polygons). So, I'd
argue we'd not be breaking anything there, either.
As far as I can see, we're in the clear on the "can we do it?".
2. ADQL-2.1 is deferring to DALI with respect to literal values in
geometry constructors (hence the attempt to clarify this). What is
the impact on ADQL spec and implementations if we drop the limits?
Well... I give you it sucks to have a dangling referral. But if what
we find is that we don't have much sensible to day in DALI, either
(and that, frankly, is my nagging suspicion; I, anyway, don't have),
then perhaps we should drop that referral, too? ADQL still is in PR,
after all.
|
I tried to run a cone-search query against Simbad-TAP with this circle, so:
Surprisingly, it seems to be exactly equivalent to:
Why? I don't know. So, indeed, I agree that it "has no sensible interpretation at all", according to me. Anyway, I bet this result is expected by nobody. |
It comes from the RFC of ADQL-2.1 and more especially about the fact that we decided to drop (deprecate for the moment) the coordinate system argument though the document was still enforcing coordinate value limits. These limits were for the equatorial system. At least two persons in the RFC complain about the fact that to do things properly, this part of the ADQL standard should be changed to also include other coordinate systems (the galactic one was taken as example). |
I am not sure dropping (and thus, ignoring) this coordinate limit issue is the right approach. I'd rather prefer some text in DALI stating that no ideal solution can be found than nothing at all. The question will keep coming at some point in the IVOA community. At least, writing somewhere (and indeed DALI seems to be the right place....or CoordsDM) a warning about "weird coordinate values" or a description of the issue would help users and implementers to know the status of this issue. And maybe sometime, one will find a good (or not too bad) solution to this. In the meantime, standards like ADQL can still continue to exist with just a reference to the document describing this issue (here, DALI), and they do not have to take care about it. Plus, I have to admit, and this is probably not a good reason/argument, that I do not like the idea to change again ADQL-2.1 PR so close to the TCG meeting and its voting stage. It would mean postponing again ADQL-2.1-REC just for this, and I'd prefer not to do that after so many years working on it without a solid reason. |
My answers would be: Q1: Yes, existing valid values remain valid. OK Q2: I think a vague ref in ADQL is ok - it means we only have to improve/fix/specify this in one place and implementers of ADQL have to deal with that complexity|simplicity. And that's why I created this PR: to impove the existing limits to be more open/flexible and allow for other coordinate system and their inherent limits. |
So, do we retain limits and improve the description? Or, do we remove limits and make it clear that all s/w has to be prepared to to do range-reduction as necessary to function correctly? aside, I think that 367,95 -> 187,85 is logical.... just think about is 367,85 and then moving 10 deg north (over the pole) to the other side of the sphere at 187,85. For longutude you can apply range reduction (+/- 360) but for latitude past the pole you flip longitude by 180 and latitude by lat-90... if you did longitude range reduction first you get the same thing: |
Explained like this, yes, it does more sense. Again, thank you for the explanation :-)
I don't know about keeping the limits with a
I like this option. The issue is not lost and it is made clear to implementers what level of freedom they have, and to users what they can expect (or not) if they are providing "weird" coordinates. In this case, it is more like a warning than a rule to follow. It is probably the least we can do without breaking anything (while waiting for an acceptable solution sometime in the future). |
I'm also in favour of removing mandatory limits at this time but making a statement in DALI that limits are service/implementation dependent. That way both clients and services understand that implementations may vary. |
It's true that ADQL may consider incorrect geometries as NULL, although it may not help the user figuring out what went wrong with his/her query and then to fix it. On the other hand, it may not be very helpful either to kill the query execution when a geometry becomes incorrect for a given row ; one has to return a very precise and useful error message. And that is indeed not trivial to do. Either solution is indeed not perfect. I just wanted to highlight something else that @jd-au and I just thought to: we have to be careful to make something that can work with "simple" access protocols (e.g. ConeSearch, SIA, ...) but also with more complex protocols like TAP. If we speak about geometry, currently it will concern only TAP (as far as I can think right now), so it is fine. But if we say that in case of incorrect input coordinates, an error should be raised, it should work for both "simple" and complex protocols. The thing becomes more complicated if a "simple" protocol supports input geometries. |
update to not impose limit... better? OK? aside: I have been looking/thinking about SIAv2 and SODA because they depend on this too |
On Thu, Nov 09, 2023 at 03:54:13PM -0800, Patrick Dowler wrote:
update to not impose limit... better? OK?
I can live with just about anything, although I can't say I'm
enthusiastic about specifications that contain "more likely". But
feel free to merge; I don't have ideas on improving these rules.
aside: I have been looking/thinking about SIAv2 and SODA because
they depend on this too
Absolutely -- everything parsing xtype="point|circle|polygon" in
VOTables does, too, which is why I'd so much like to provide a
guideline on what to do on errors. Imagine a VOTable client receiving
a geometry specification it doesn't like. It'd suck if it outright
refused to load the entire table (99.9% of which will be ok). But
it'd suck if it siliently put in... something inscrutinable as well.
The only halfway acceptable idea I have to say "services mark these
as invalid geometries, which are distinct, non-intersecting,
non-overlapping with all geometries (including themselves); where
appropriate, the representations of these are NaN NaN for points, NaN
NaN NaN for circles, [NaN NaN]*n_points for polygons, and an
empty string for MOCs."
And that thought again makes me dislike multipolygons, in particular
with the serialisation currently suggested.
Be that as it may: I think that's a different PR; let's merge this
for now.
|
That's better, even if it's not quite satisfactory, as Markus said. But I can't think of a better solution too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I give you my approval, but I let you decide whether you want to wait for a better solution or not before merging this PR.
Replace the text "the previous version" with "DALI 1.1" so that this text does not need to be updated in future document versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a couple of minor changes. I think this is an improvement on the previous text, so approve.
circle, range, and polygon refer to point so this is specified in a single place
I'm not 100% happy with how explciit or vague this change is so this is still draft. I would like to include a second example. Once in an acceptable state, we can think about an erratum for DALI-1.1