-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support of ISO19115-3:2016 Geospatial Standard #212
Comments
A good starting point would be implementing support for ISO19115-3:2016 in OWSLib. |
@kalxas thanks, as in it's not supported in https://github.com/geopython/OWSLib ? and the spatial extension for CKAN is using this as a dependancy? If I look to get this solved, would be great to make sure peoples input, knowledge and approaches can be included :) Any other insight into chasing this would be appreciated (feel free to brain dump here). |
Actually there is an iso parser in https://github.com/ckan/ckanext-spatial/blob/master/ckanext/spatial/model/harvested_metadata.py but OWSLib is also used in other places. |
It's a bit of a hack but as part of setting up cioos.ca we have forked ckanext-spatial and started modifying it to support harvesting iso19115-3. https://github.com/cioos-siooc/ckanext-spatial Most of the changes to date have been in https://github.com/cioos-siooc/ckanext-spatial/blob/cioos-siooc/ckanext/spatial/model/harvested_metadata.py. I would love a figure out a way to do this so that we could maintain support for iso19139 as well and contribute back to the community. |
Hi @fostermh, If so I could try to provide a pull req based on your fork very soon. |
It's not 100% coverage, unfortunately. It does handle most of the required and recommended fields but some of the optional fields are not currently part of our standard and so have not yet been addressed. At a quick look, there appear to be around 25 fields that do not currently have an iso19115-3 xpath. It would probably take a day or two to finish that up. If there is interest I could work on finishing up the last few fields next week. |
I'm checking changes, I see quite a lot of interesting stuff, maybe I can cherry pick some other than Any suggestions? (f.e.: I see an interesting GeoNetworkHarvester, currently I was planning to use CSW with a full profile but I'm not sure it will ship all the iso fields) Should I / would you like to create a raw pull req (only to share comments and improvements) looking forward for comments / suggestions. Note: I'd love to keep PR as small and focused as possible so it will be easy to merge and review. |
I think if the focus is ISO19115-3 support, then ckanext-spatial/ckanext/spatial/model/harvested_metadata.py is likely the only file of interest. Yes, we did quite a bit with harvesters. That was more related to how data flows between organizations in CIOOS so probably best to ignore it for the moment. No need to confuse the issue more than it already is. :-) A raw pull request with just the minimal changes needed sounds like a great idea. I will put a PR together and tag this issue when it's ready. |
Great, quick note, I'm testing a port of that file to my branch but looks like it requires (over Alpine-3.14):
due to osgeo import:
Adding so many dependencies can be tricky (but can be discussed with @amercader and maintainers) do you think that feature is necessary or not reproducible withoud gdal? Thanks |
It seems likely to me that spatial information will come in the form of GML, I added a couple of other options because I could but we are using GML internally I think. It seemed troublesome to store the GML in solr, and not very useful down the line, so I converted to GeoJSON. I used ogr for this. If we only support GML input then a custom function to parse it would be possible for sure. So.. yes I think it could be done without osgeo but seemed like a lot of work at the time. |
Ciao, I'm porting all the necessary changes to the schema (heavily based on your work) here: That is 'my_master' and is shipping also some minor improvement I'm providing as isolated pull request (mainly over csw), but necessary to go on supporting other types and ... in my opinion not bad to have. Focusing on the ckanext/spatial/model/harvested_metadata.py I've some question related to backward compatibility.... I'm seeing that most of them are compatible but some of them (please search for '#TODO' string) are not, f.e. change in multiplicity I'm suspecting that a new ISO family should be created instead of enriching the existing one. This implies a better iso 'guesser' which is doable but could be tricky and verbose!!! @amercader @fostermh what do you suggest? Thank you! |
Unfortunately, the link you provided doesn't work for me. As to your comments regarding backward compatibility, this was exactly my issue as well. The paths and multiplicity can likely be taken care of with relative ease. for example, if iso19115 expects multiple entries and iso19139 expected one we could adjust the value in an 'infer' function based on which standard was being harvested. The primary issue I was having is the namespaces that need to change between standards. If we use the same standard for all datasets in a harvester (possibly across one ckan) then we should be able to adjust based on a harvester config setting or ckan.ini setting. If we want to support a mix of standards then it gets tricky. It has been a while since I looked at it, but I believe the namespaces are set during the first harvest creation. So if the first dataset being harvested is ISO19115 standard, the namespaces are set up to support this standard and all harvest objects after will use the same namespace. Obviously, that will not work if you are harvesting a mix of ISO19139 and ISO19115 xml files, for example. |
Mmm this is tricky, I'm not sure I'm understanding. What I'm thinking is for sure to
I would really appreciate to have support on any of these from anyone (also approvals or kind of blockers...).
Right I've also had some issue to download them with csw, now should be possible to configure an OutputSchema from the json which configure the harvester. (see below)
This is interesting, but can be handled (not easi but doable) with different harvesters configured as specified here: #259 Thank you for your help! FYI: My short term goal is to implement a set of harvester to import the following formats:
|
ok I see TODO's and can lend a hand there. do you want me to do a PR to https://github.com/ccancellieri/ckanext-spatial or would you like to start a draft PR from https://github.com/ccancellieri/ckanext-spatial to https://github.com/ckan/ckanext-spatial and I can comment/push there? I think we talked about me starting a raw PR but it looks like you have most of it already and I would just need to remove/comment on a few bits. |
I added line by line comments #iso19115-3 as you did. I did that with blocks (separated by a space to the end. FYI: this branch is rebased to master and isolated to those 2 motels files but still requires #259 for multiple formats |
The plugin looks far more complex than expected, I think: We've to provide a new harvester implementation providing an ad-hoc ISOElement(s) hierarchy not touching base.py and harvested_metadata.py (which are dedicated to iso19139). For this reason, please hold-on few days I'll try to produce a better integration and I'll give you the go to in a couple of days to pull if interested in contributing. I'll try to give both options @fostermh stay tuned! :) |
Ok with the latest push (I apologize, I have had to rebase and squash to make it clear) we are ready to discuss: There are several ticket/issues to open to be able to reuse code but I think most of them are summarized by: https://github.com/ccancellieri/ckanext-spatial/blob/iso19115-3/ckanext/spatial/harvesters/iso19115/spatial_harvester.py Now why we have 2 harvester? The explanation is quite complex and mainly depends on several tickets that can be opened discussed (look at TODO). The the main harvester is the iso19115_csw_harvester (harvester.py - which must be selected as harvester type into the configuration type), it implements IHarvester delegating as much as possible to the parent the tasks (CSWHarvester). The iso19115 should be loaded as well (spatial_harvester.py - must be present into the plugins list) but it is mainly used to implement ISpatialHarvester to override some other method from base.py which is not so fault tolerant (most of this code may go if master will accept contributions. As you see now we have a separate model: Hope now we can remove all the 19139 parts from there, feel free to contribute on it, it should not be backward compatible anymore (but still may/should support all the 19115 versions). Which is used if any of the validators validate the incoming metadata otherwise other plugins will be used to test against the incoming metadata and the first match will be used. This means that the plugin is able to harvest both is19115-(1,2,3) and iso19139 and the recognition is automatic. @amercader I'm trying to develop following indications from ISpatialHarvester and IHarvester but looks like they are not so well integrated, should I open some tickets to fix them? I'll probably open a discussion since I feel there's still some missing but useful method to add to the ISpatialHarvester interface @torfsen over to you on the model.py! |
@ccancellieri: I'm not sure what you meant when mentioning me ("over to you on the model.py"). Unfortunately I'm currently not able to contribute much time to CKAN, but if you have a specific question for me I'll see what I can do. |
Thanks @ccancellieri, I have been sidetracked by a few issues but will get back to this soon. |
Is there anything that can be done to support this effort? |
What would be require to add support for this standard in this module?
We have agencies that are outputting CSW originating from this standard and the current CSW harvest in this module error when attempting to ingest this.
I note this is referenced over in this GH issue too GSA/data.gov#764
Keen to get a read on what's required so I can look at potentially tackling this soon.
Thanks all.
The text was updated successfully, but these errors were encountered: