In a Nutshell: Qanary Question Answering Components

The Qanary Framework is dedicated to creating Question Answering systems. Question Answering (QA) draws on several research fields, so building a QA system involves expensive and time-consuming engineering work that can hold back research. Typical problems and use cases that occur while developing a Question Answering system are:

- an algorithm requires analyzing textual questions and annotating the found entities, relations, classes, etc.
  - it is time-consuming as many services/algorithms/tools need to be compared
- your QA process needs to be improved
  - following traditional development approaches requires additional efforts for testing and debugging of code to uncover possible flaws
- the quality of components dedicated to a particular task needs to be analyzed
  - it is expensive to integrate all of the particular components due to a missing generalized interface

This repository stores the components of the Qanary framework. All components are implemented in Java, and each provides a Docker container for lightweight maintenance.

Build and run a minimal set of components

To demonstrate the Qanary methodology and its functionality, a tiny template-based Question Answering system was designed. It can answer questions asking for the real name of a superhero, such as "What is the real name of Captain America?". For this purpose, just two components are used:
a) Qanary DBpedia Spotlight component: The component finds superhero names and links them to the DBpedia knowledge base (such a process is called Named Entity Recognition and Disambiguation).
b) Qanary Query Builder for Superhero Names: The component creates SPARQL SELECT queries to be executed on DBpedia (such a component is typically called a Query Builder) if the given question follows the template `What is the real name of <superheroname>`.

Hence, given a question following the described pattern, the result is a SPARQL query that can be executed such that the real name of the superhero is retrieved from DBpedia.
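The template-based idea described above can be sketched in a few lines of Python. This is only an illustration, not the code of the Qanary Query Builder component: the function name `build_query`, the regular expression, and in particular the DBpedia property `http://dbpedia.org/property/alterEgo` are assumptions chosen for the example. The entity URI is expected to come from a preceding entity linking step such as the DBpedia Spotlight component.

```python
import re
from typing import Optional

# Template from the text: "What is the real name of <superheroname>".
TEMPLATE = re.compile(
    r"^\s*what is the real name of (?P<name>.+?)\s*\??\s*$", re.IGNORECASE
)

# ASSUMPTION: this property is used for illustration only; the real
# component may rely on a different DBpedia property.
REAL_NAME_PROPERTY = "http://dbpedia.org/property/alterEgo"


def build_query(question: str, entity_uri: str) -> Optional[str]:
    """Return a SPARQL SELECT query if the question follows the template.

    `entity_uri` is the DBpedia resource found by the entity linking step.
    Returns None when the question does not match the template.
    """
    if TEMPLATE.match(question) is None:
        return None
    return (
        "SELECT ?realName WHERE { "
        f"<{entity_uri}> <{REAL_NAME_PROPERTY}> ?realName . "
        "}"
    )


if __name__ == "__main__":
    print(build_query(
        "What is the real name of Captain America?",
        "http://dbpedia.org/resource/Captain_America",
    ))
```

A question that does not follow the template yields `None`, which mirrors the behaviour described above: the query builder only produces a query for matching questions.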

Run a minimalistic Question Answering system

Install the Qanary core components
Clone the current repository:

git clone https://github.com/WDAqua/Qanary-question-answering-components.git

Switch to the folder Qanary-question-answering-components:

cd Qanary-question-answering-components

Build the minimal set of components using the Maven profile "tinytutorial" (here we skip creating the corresponding Docker images by adding the parameter -Ddockerfile.skip=true to the Maven command):

mvn clean package -Ddockerfile.skip=true -P tinytutorial

The output should look like the following, indicating that the components `qa.NED-DBpedia-Spotlight` and `qanary_component-QB-SimpleRealNameOfSuperHero` were created:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] qa.NED-DBpedia-Spotlight 2.1.0 ..................... SUCCESS [  3.717 s]
[INFO] qanary_component-QB-SimpleRealNameOfSuperHero 2.0.0  SUCCESS [  1.083 s]
[INFO] mvn.reactor 0.1.1-SNAPSHOT ......................... SUCCESS [  0.073 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

Now both components can be started using the JAR files (replace X.Y.Z with the version that was built):

java -jar qanary_component-NED-DBpedia-Spotlight/target/qa.NED-DBpedia-Spotlight-X.Y.Z.jar
java -jar qanary_component-QB-SimpleRealNameOfSuperHero/target/qanary_component-QB-SimpleRealNameOfSuperHero-X.Y.Z.jar

Build and start a Qanary pipeline
Once the Qanary components and the Qanary pipeline are installed and running with the standard configuration, you can access a trivial Question Answering frontend at http://localhost:8080/startquestionansweringwithtextquestion
- Use the question "What is the real name of Captain America?".
- The question can be answered using the given two components.
- Thereafter, the triplestore will hold a SPARQL query that was created by the QueryBuilder component SimpleRealNameOfSuperHero (for DBpedia). It could be used to retrieve the actual answer from DBpedia. The UI shows the graph ID where the computed information was stored.
  - Retrieve the SPARQL query from your Qanary triplestore using:

PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
PREFIX qa: <http://www.wdaqua.eu/qa#> 

SELECT *
FROM <ADD-YOUR-GRAPH-ID-HERE>
WHERE {
    ?s a qa:AnnotationOfAnswerSPARQL.
    ?s oa:hasBody ?sparqlQueryOnDBpedia .
    ?s oa:annotatedBy ?annotatingService .
}
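As a minimal sketch, the graph ID shown in the UI can be filled into the query above programmatically. The helper names `build_annotation_query` and `build_request_url`, the example graph ID, and the endpoint URL are hypothetical; the only thing assumed about the triplestore is that it accepts queries via the `query` parameter of a GET request, as defined by the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

# The query from the text, with a placeholder for the graph ID.
QUERY_TEMPLATE = """PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
PREFIX qa: <http://www.wdaqua.eu/qa#>

SELECT *
FROM <{graph_id}>
WHERE {{
    ?s a qa:AnnotationOfAnswerSPARQL .
    ?s oa:hasBody ?sparqlQueryOnDBpedia .
    ?s oa:annotatedBy ?annotatingService .
}}"""


def build_annotation_query(graph_id: str) -> str:
    """Insert the graph ID reported by the Qanary UI into the query."""
    # Reject values that would break the <...> IRI syntax.
    if not graph_id or any(ch in graph_id for ch in "<> \n\t"):
        raise ValueError("graph_id must be a non-empty IRI without spaces or angle brackets")
    return QUERY_TEMPLATE.format(graph_id=graph_id)


def build_request_url(endpoint: str, graph_id: str) -> str:
    """Build a GET URL for a SPARQL endpoint (endpoint URL is an assumption)."""
    return endpoint + "?" + urlencode({"query": build_annotation_query(graph_id)})


if __name__ == "__main__":
    # Both values below are hypothetical placeholders.
    print(build_request_url("http://localhost:8890/sparql", "urn:graph:example"))
```

The resulting URL can be opened with any HTTP client; the `?sparqlQueryOnDBpedia` binding in the response holds the query created by the Query Builder component.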

Big Picture

Qanary provides the methodology for a knowledge-driven, vocabulary-based approach. Our long-term agenda is to create a knowledge-driven ecosystem for the field of Question Answering. It is part of the WDAqua project where Question Answering systems are researched and developed.
Qanary Framework provides the core framework for creating Question Answering systems following the Qanary methodology. You might consider the Qanary Framework as a reference implementation of the Qanary methodology as a microservice-based component architecture.
Qanary components covers the QA components compatible with the Qanary framework.
Frankenstein is a supporting framework that establishes a toolset for rapid orchestration and benchmarking of Qanary components. For example, it provides the tools to create 380 QA systems from 29 components.

For questions, ideas, or any feedback related to Qanary, please do not hesitate to contact the core developers. If you would like to see a QA system originally built using the Qanary framework, one of our core developers has built a complete end-to-end QA system that allows you to query several RDF data stores: http://wdaqua.eu/qa.

Please go to the GitHub Wiki page of the Qanary repository to get more insights on how to use this framework, how to add new components, etc.

How to Cite

Introducing a Vocabulary for knowledge-driven Question Answering Processes

Kuldeep Singh, Andreas Both, Dennis Diefenbach, Saeedeh Shekarpour: Towards a Message-Driven Vocabulary for Promoting the Interoperability of Question Answering Systems. ICSC 2016: 386-389 DOI 10.1109/ICSC.2016.59

Introducing the Qanary Framework

Andreas Both, Dennis Diefenbach, Kuldeep Singh, Saeedeh Shekarpour, Didier Cherix, Christoph Lange: Qanary - A Methodology for Vocabulary-Driven Open Question Answering Systems. ESWC 2016: 625-641 DOI 10.1007/978-3-319-34129-3_38

Analytics of NER/NED Components

Dennis Diefenbach, Kuldeep Singh, Andreas Both, Didier Cherix, Christoph Lange, Sören Auer: The Qanary Ecosystem: Getting New Insights by Composing Question Answering Pipelines. ICWE 2017: 171-189 DOI 10.1007/978-3-319-60131-1_10

For further publications please see the following wiki page.

Qanary Components

The following components are contained in this repository.

Question Answering Named Entity Recognition (NER) and Disambiguation (NED) Components

Entity Classifier 2 (NER)

It uses a rule-based grammar to extract entities from a text.

Qanary Entity Classifier 2 for NER

Stanford NLP Tool (NER)

The Stanford Named Entity Recognizer is an open-source tool that uses Gibbs sampling for information extraction to spot entities in a text.

Qanary Stanford NLP Tool for NER

Babelfy

It is a multilingual, graph-based approach that uses random walks and the densest subgraph algorithm to identify and disambiguate entities present in a text.

Qanary Babelfy for NED
Qanary Babelfy for NER

AGDISTIS (NED)

It is a graph-based disambiguation tool that couples the HITS algorithm with label expansion strategies and string similarity measures to disambiguate entities in a given text.

Qanary AGDISTIS for NED
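To illustrate the HITS algorithm mentioned above, here is a minimal, generic sketch of its power iteration on a small directed graph. It is not the AGDISTIS implementation: the function name `hits`, the edge-list input, and the fixed iteration count are assumptions for this example, and the label expansion and string similarity steps are left out. In a disambiguation setting, the nodes would be candidate entities and their neighbours in the knowledge graph, and the candidates with the highest authority scores would be preferred.

```python
import math
from typing import Dict, List, Tuple


def hits(edges: List[Tuple[str, str]], iterations: int = 50):
    """Compute hub and authority scores for a directed graph.

    `edges` is a list of (source, target) pairs. Scores are L2-normalised
    after every update step.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub: Dict[str, float] = {n: 1.0 for n in nodes}
    auth: Dict[str, float] = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum of hub scores of nodes linking to n.
        auth = {n: sum(hub[s] for s, t in edges if t == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}

        # Hub update: sum of authority scores of nodes that n links to.
        hub = {n: sum(auth[t] for s, t in edges if s == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}

    return hub, auth


if __name__ == "__main__":
    # Toy graph: "c" is linked from two nodes, "d" from one.
    hub_scores, auth_scores = hits([("a", "c"), ("b", "c"), ("a", "d")])
    print(max(auth_scores, key=auth_scores.get))
```

In the toy graph, node `c` receives links from both `a` and `b`, so it ends up with the highest authority score.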

DBpedia Spotlight

It is a web service that uses a vector-space representation of entities and cosine similarity to recognize and disambiguate entities.

Qanary DBpedia Spotlight for NED
Qanary DBpedia Spotlight for NER
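The role of cosine similarity in disambiguation can be shown with a toy bag-of-words sketch. This is not how the DBpedia Spotlight service is implemented internally: the function names, the word lists, and the second candidate URI are made up for illustration. Each candidate entity is represented by a vector of words that typically occur around it, and the candidate whose vector is closest to the question context is selected.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context_words: List[str], candidates: Dict[str, List[str]]) -> str:
    """Return the candidate URI whose word vector best matches the context."""
    context = Counter(context_words)
    return max(
        candidates,
        key=lambda uri: cosine_similarity(context, Counter(candidates[uri])),
    )


if __name__ == "__main__":
    # Hypothetical candidate descriptions for the surface form "Captain America".
    candidates = {
        "http://dbpedia.org/resource/Captain_America":
            ["superhero", "comics", "captain", "america"],
        "http://example.org/resource/Captain_America_film":
            ["film", "2011", "captain", "america"],
    }
    context = ["real", "name", "captain", "america", "superhero"]
    print(disambiguate(context, candidates))
```

Because the context contains the word "superhero", the first candidate shares more terms with it and receives the higher similarity score.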

Tag Me

It matches terms in a given text with Wikipedia, i.e., it links text to recognize named entities. Furthermore, it uses the in-link graph and the page dataset to disambiguate recognized entities to their Wikipedia URIs.

Qanary Tag Me for NED
Qanary Tag Me for NER

Other NER and NED Tools

TextRazor (homepage) is a startup providing software that helps developers rapidly build text analytics into their applications.
- Qanary TextRazor for NER
Dandelion (homepage) is a startup specialized in Semantics & Big Data.
- Qanary Dandelion for NED
- Qanary Dandelion for NER
Ontotext (homepage) provides a complete set of Semantic Technologies enabling better content management, knowledge discovery, and semantic search.
- Qanary Ontotext for NED
- Qanary Ontotext for NER
Ambiverse (homepage) is a spin-off from the Max Planck Institute for Informatics, which develops technologies to automatically understand, analyze, and manage Big Text collections.
- Qanary Ambiverse for NED
- Qanary Ambiverse for NER
Meaningcloud (homepage) is a company based in New York City that specializes in software for semantic analysis.
- Qanary Meaningcloud for NED
- Qanary Meaningcloud for NER

Question Answering Relation Linking (RL) Components

ReMatch

It maps natural language relations to knowledge graph properties by using dependency parsing characteristics with adjustment rules. It then carries out a match against knowledge base properties, enhanced with the WordNet lexicon, via a set of similarity measures. It is an open-source tool.
Qanary ReMatch for RL

RelationLinker2 (RelationMatch)

It devises a semantic-index-based representation of PATTY (Nakashole et al., EMNLP 2012), a knowledge corpus of linguistic patterns and their associated properties in DBpedia, together with a search mechanism over this index, with the purpose of enhancing the relation linking task.
Qanary RelationLinker2 for RL

OKBQA DiambiguationProperty (ReLMatch)

The disambiguation module (DM) of the OKBQA framework provides disambiguation of entities, classes, and relations present in a natural language question.
Qanary DiambiguationProperty for RL

RelNliodRel (RNLIWOD)

The Natural Language Interfaces for the Web of Data (NLIWOD) community group (https://www.w3.org/community/nli/) provides reusable components for enhancing the performance of QA systems. We utilise one of its components to build a similar relation linking component.
Qanary RelNliodRel for RL

Spot Property (AnnotationofSpotProperty)

This component is a combination of RNLIWOD and the OKBQA disambiguation module for the relation linking task.
Qanary AnnotationofSpotProperty for RL

Question Answering Class Linking (CL) Components

ClsNliodCls (NLIWOD CLS)

The NLIWOD Class Identifier is one of several tools provided by the NLIWOD community for reuse. The code for the class identifier is available on GitHub.
Qanary ClsNliodCls for CL

AnnotationofSpotClass (OKBQA Class linker)

This component is part of the OKBQA disambiguation module.
Qanary AnnotationofSpotClass for CL

Question Answering Query Builder (QB) Components

QueryBuilder (NLIWOD Template-based QB)

Template-based query builders are widely used in the QA community for SPARQL query construction. This component is similar to the existing template-based components.
Qanary QueryBuilder for QB

SINA (QB)

SINA is a keyword and natural language query search engine that is based on Hidden Markov Models for choosing the correct dataset to query. We decoupled the original implementation to obtain a query builder.
Qanary SINA for QB
