switch from rdfs:domain/range to schema:domainIncludes/rangeIncludes #14
Comments
What is the problem with the following?
If you cannot say anything about the domain, you can still add a |
@Inf3rno The current hierarchy is BodyOfWater<NaturalPlace and Planet<CelestialBody. So I would say that your proposals introduce useless abstract classes. "Useless" is not an overly strong term: schema.org has rejected the creation of "Agent", a super-class of Person and Organization, after a substantial discussion. I personally think such a class is needed, but the community has disagreed. Why should we accept useless abstract classes? Where do we stop: do we also introduce mixins like Nameable, Measurable, etc.? Better to use polymorphic characteristics like schema:domainIncludes instead of monomorphic ones like rdfs:domain. |
We really discussed a lot about how to structure the DBpedia Ontology. I think in the end it is super convenient to have it flat and simple. This makes it more maintainable and also has I am thinking about something called |
@VladimirAlexiev |
@Inf3rno Right! You can't/shouldn't use RDFS inference with DBpedia. Eg from |
@VladimirAlexiev just because the hierarchy is modest doesn't mean you can't do RDFS reasoning. Actually, a lot of people do RDFS inferencing, i.e.
OWL inference such as sameAs and equivalentClass is also there, and more sophisticated stuff will come soon, although I would like to exploit SHACL more. So why do you say this? |
@VladimirAlexiev also a question. Do you know the correct semantic interpretation of:
This is interpreted as inferring both types, right? Also we have type specific properties:
|
@kurzum yes, it should infer that the subject is both Planet and BodyOfWater, which is nonsense (except in the Waterworld movie). Similarly for
This is exactly why I've proposed this issue. In DBpedia, domain and range are purely advisory because the extractor does not enforce them. (That was the case 3y ago, and is still true afaik). |
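To make the inference discussed above concrete, here is a minimal Python sketch of the RDFS domain rule (rdfs2): if a property has an rdfs:domain C and a subject uses that property, the subject is inferred to be of type C. It uses plain tuples rather than a real RDF library, and the resource `dbr:Example_Lake` and its value are hypothetical, not taken from a DBpedia dump.

```python
# Sketch of RDFS rule rdfs2: (P rdfs:domain C) and (S P O) entail (S rdf:type C).
# Triples are plain (subject, predicate, object) string tuples for illustration.

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Return the set of rdf:type triples entailed by rdfs:domain declarations."""
    # Collect every declared domain per property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    # Each use of a property entails one type per declared domain (conjunctive).
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


triples = {
    ("dbo:maximumTemperature", RDFS_DOMAIN, "dbo:Planet"),
    ("dbo:maximumTemperature", RDFS_DOMAIN, "dbo:BodyOfWater"),
    # Hypothetical resource and value, assumed for this example only.
    ("dbr:Example_Lake", "dbo:maximumTemperature", "25.0"),
}

inferred = infer_domain_types(triples)
for triple in sorted(inferred):
    print(triple)
```

With two rdfs:domain declarations, the single temperature statement yields two type triples, so the lake is typed as both a Planet and a BodyOfWater.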
Type-specific props are a bad idea because
Further, there are no type-specific Object props, so they are not relevant to the discussion |
There has been a post-processing clean-up step that is configured to remove such triples. It is easily extensible but is currently configured to remove triples when the object is of a type that is owl:disjointWith the expected range. The same for the rdfs:domain see |
The post-processing is still in place; see e.g. https://databus.dbpedia.org/dbpedia/mappings/mappingbased-objects/2019.09.01. The ranges of datatype properties in fact steer the parsers during extraction and trigger unit conversion for the "specific properties" in this dedicated dataset. In my opinion the domain/range of the properties in question is not well defined: it should be owl:Thing or some really generic class like MaterialThing for temperature. Moreover, the mapping process should not accept the usage of properties like this, or should at least show warnings. Well-defined property domains and ranges, in combination with the post-processing (domain/range check), aim exactly at filtering out false triples like the example ?s dbo:mother :England. When using the filtered files for RDFS reasoning only, it should work in most cases. What would be the advantages of using schema ranges/domains instead of rdfs w.r.t. reasoning and error filtering? |
I didn't know about this post-processing. It's a good step, but not equivalent to enforcing rdfs:domain/range because it works based on explicit Disjoint declarations. Good example: although http://dbpedia.org/ontology/firstAscentYear is defined only for dbo:Mountain, it will be preserved on dbo:Volcano because the two are not declared Disjoint (in fact both are subclasses of dbo:NaturalPlace).
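The difference between a disjointness-based filter and real domain enforcement can be sketched in a few lines of Python. This is a simplified, hypothetical model and not the actual DBpedia post-processing code: the `DISJOINT` axiom, the `dbo:NaturalPlace < dbo:Place` link, the walk over superclasses, and the symmetric lookup are all assumptions made for illustration.

```python
# Hypothetical, simplified model of a disjointness-based domain check.
# Mountain and Volcano under NaturalPlace follows the comment above;
# the NaturalPlace < Place link is assumed for this example.
SUBCLASS_OF = {
    "dbo:Mountain": "dbo:NaturalPlace",
    "dbo:Volcano": "dbo:NaturalPlace",
    "dbo:NaturalPlace": "dbo:Place",
}

# Assumed disjointness axiom, for illustration only.
DISJOINT = {("dbo:Person", "dbo:Place")}


def ancestors(cls):
    """Return cls followed by all of its superclasses."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_disjoint(a, b):
    """True if an ancestor of a is declared disjoint with an ancestor of b."""
    for x in ancestors(a):
        for y in ancestors(b):
            # Look up both directions, since axioms may be stated one way only.
            if (x, y) in DISJOINT or (y, x) in DISJOINT:
                return True
    return False


def keep_triple(subject_type, declared_domain):
    """Keep the triple unless the subject type is disjoint with the domain."""
    return not is_disjoint(subject_type, declared_domain)


# A Volcano using a Mountain-only property survives the filter.
print(keep_triple("dbo:Volcano", "dbo:Mountain"))
# A Person using a Mountain-only property is removed.
print(keep_triple("dbo:Person", "dbo:Mountain"))
```

Under these assumptions the filter only rejects a triple when an explicit disjointness axiom applies, which is weaker than requiring the subject to be an instance of the declared domain.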
There are 100-200 Volcanos for which dbo:firstAscentYear is known:
select * {
?x a dbo:Volcano; dbo:firstAscentYear ?y
}
If you apply RDFS reasoning, all will be inferred dbo:Mountain. That may be ok for some of them, but the ontology creators didn't think it appropriate to declare Volcano subClassOf Mountain, so that inference is not right.
At present there are only about 25 disjointness axioms. Most are about dbo:Person, and they are not rendered symmetric:
select * {
?x owl:disjointWith ?y
}
But even that may be too restrictive, eg these disjoints
don't account for the often occurring conflation of an organization and its (headquarters) building, which happens especially often for museums/libraries.
Some other ontology queries I played with:
Number of props:
select (count(*) as ?c) {
?x a rdf:Property
filter(strstarts(str(?x),"http://dbpedia.org/ontology"))
}
2727
Breakdown into object vs data prop (there is a well-defined dichotomy):
select (count(*) as ?c) (sum(?obj) as ?object) (sum(?dat) as ?data) {
?x a rdf:Property
filter(strstarts(str(?x),"http://dbpedia.org/ontology"))
bind(exists {?x a owl:ObjectProperty} as ?obj)
bind(exists {?x a owl:DatatypeProperty} as ?dat)
}
object 1105, data 1622
Props with defined range:
select (count(*) as ?c) {
?x a rdf:Property; rdfs:range ?range
filter(strstarts(str(?x),"http://dbpedia.org/ontology"))
}
2450
Thus 277 props have no defined range. Not all of them are dataProps, eg http://dbpedia.org/ontology/subClassis is object prop:
select * {
?x a rdf:Property
filter(strstarts(str(?x),"http://dbpedia.org/ontology"))
filter not exists {?x rdfs:range ?range}
}
Breakdown by data/obj, then range:
select ?range (count(*) as ?c) (sum(?obj) as ?object) (sum(?dat) as ?data) {
?x a rdf:Property
filter(strstarts(str(?x),"http://dbpedia.org/ontology"))
bind(exists {?x a owl:ObjectProperty} as ?obj)
bind(exists {?x a owl:DatatypeProperty} as ?dat)
optional {?x rdfs:range ?range}
} group by ?range order by desc(?data) ?range
|
If you examine dbo:firstAscentPerson, you'll see plenty of nok:
|
This need is also borne out by domain/range validation: http://mappings.dbpedia.org/validation/index.html
E.g. filter to lang="en", predicate="temp":
This shows that maximumTemperature is only defined for Planet but is also used for Lake, Sea, etc.
Tracing the class hierarchy doesn't show a useful super-class of these, so we need two classes:
Planet, BodyOfWater
Therefore I would suggest to allow several domains & ranges in the ontology definition.
But if both were declared as rdfs:domain, RDFS semantics would then infer that any resource with maximumTemperature is BOTH a Planet and a BodyOfWater.
If the rdfs:domain/range declarations were correct, we should delete all statements appearing in domain/range validation, but that's obviously wrong.
So switching to schema:domainIncludes and schema:rangeIncludes will reflect more accurately the meaning in the ontology wiki.
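A minimal Python sketch of how schema:domainIncludes could be read as an advisory, disjunctive check: a subject conforms if its type (or a superclass) is any one of the listed classes, and nothing is inferred about the subject. The `DOMAIN_INCLUDES` table, the `conforms` helper and the Lake/Sea subclass links are assumptions for illustration, not part of the DBpedia ontology tooling.

```python
# Class hierarchy: BodyOfWater < NaturalPlace and Planet < CelestialBody come
# from the thread; the Lake and Sea links are assumed for this example.
SUBCLASS_OF = {
    "dbo:Lake": "dbo:BodyOfWater",
    "dbo:Sea": "dbo:BodyOfWater",
    "dbo:BodyOfWater": "dbo:NaturalPlace",
    "dbo:Planet": "dbo:CelestialBody",
}

# Hypothetical declaration: the property is expected on either class.
DOMAIN_INCLUDES = {
    "dbo:maximumTemperature": {"dbo:Planet", "dbo:BodyOfWater"},
}


def superclasses(cls):
    """Return cls followed by all of its superclasses."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def conforms(prop, subject_type):
    """True if subject_type matches at least one domainIncludes class."""
    allowed = DOMAIN_INCLUDES.get(prop)
    if allowed is None:
        # Nothing is declared about the domain, so nothing can be violated.
        return True
    return any(cls in allowed for cls in superclasses(subject_type))


for subject_type in ("dbo:Planet", "dbo:Lake", "dbo:Mountain"):
    print(subject_type, conforms("dbo:maximumTemperature", subject_type))
```

Unlike rdfs:domain, this reading only validates: a Lake passes because it is a BodyOfWater, and it is never additionally typed as a Planet.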