Question-Answering Engine based on Natural Language Generation #33

Open
akolonin opened this issue Jul 12, 2020 · 0 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Overall task and design:
Based on #22, we need to provide an extended version of the Question Answering engine to replace or extend the current placeholder:
https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/peer/Answerer.java
The code may go to org.aigents.nlp.qa or to the respective package of the Aigents Platform Core.
There are a few things to be done, written in the following pseudo-code, which is to be refined during the implementation phase:

interface Indexer {
    void clear();//clears the current index
    void index(String text);//indexes text in the internal model where the model can be any
    Linker retrieve(String query);//retrieves the ranked list of relevant words for a single query applied to the scope of all texts indexed to date, see https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/Linker.java
}

//Candidate implementation of the Indexer relying on the existing code
class GraphIndexer implements Indexer {
    Graph graph;//see https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/Graph.java 
    int Mskip = 2;//width of skipping window to build word pairs
    // will be used to index any number of input texts in a graph object
    @Override
    public void index(String text){
        // tokenize text with Parser.parse https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/Miner.java#L580
        // for each sentence, build word-word links of types "pred" and "succ" for word pairs co-occurring within a distance of Mskip, and store them in the graph with link weight W = Mskip / distance (closer words get larger weights: an adjacent word is weighted Mskip and a word at the maximum distance Mskip is weighted 1)
    }
    @Override
    public Linker retrieve(String query){
        // tokenize query with Parser.parse https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/Miner.java#L580
        // compute the ranks of nodes in the graph using the GraphOrder.directed algorithm https://github.com/aigents/aigents-java/blob/master/html/ui/aigents-graph.js#L537 (this function needs to be added to the Graph class), initialized with the word nodes found in the query, each given an initial weight of 1 divided by the word frequency from https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/LangPack.java#L85
        // retrieve the computed word ranks from the Graph and return them in a Linker implementation such as https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/Counter.java or https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/data/Summator.java
    }
}
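
The pair-weighting rule W = Mskip / distance described in index() can be illustrated with a minimal, self-contained sketch. It is not the Aigents implementation: SkipWindowPairs and succWeights are hypothetical names, a plain Map stands in for the Graph class, tokens are assumed to come from one already-tokenized sentence, and summing the weights of repeated pairs is an assumption. Only "succ" links are shown; "pred" links would be the same pairs reversed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; a Map stands in for the Aigents Graph.
final class SkipWindowPairs {
    /**
     * Builds "succ" link weights for one sentence, keyed as "left>right".
     * Every ordered pair of words at distance d (1 <= d <= mskip) gets
     * weight mskip / d; weights of repeated pairs are summed (an assumption).
     */
    static Map<String, Double> succWeights(List<String> tokens, int mskip) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            for (int d = 1; d <= mskip && i + d < tokens.size(); d++) {
                String key = tokens.get(i) + ">" + tokens.get(i + d);
                weights.merge(key, (double) mskip / d, Double::sum);
            }
        }
        return weights;
    }
}
```

With Mskip = 2 and the tokens "the", "cat", "sat", adjacent pairs ("the>cat", "cat>sat") get weight 2 and the pair at distance 2 ("the>sat") gets weight 1, matching the rule stated in the pseudo-code.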

class AnswerGenerator extends Answerer { //to be re-used in https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/peer/Answerer.java 
    Indexer indexer; //see above
    Generator generator; //see #22
    int maxWords;//configured hard cap on the number of words to be used to build the reply
    String answer(String query){
        Linker words = indexer.retrieve(query);
        if (words == null || words.size() == 0)
            return "No.";
        Collection<String> top = getTopWordsFromLinker(words, maxWords); //at most maxWords highest-ranked words
        String response = generator.generate(top); //see #22 
        return response;
    }
}
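
The top-word selection step in answer() could look roughly like the sketch below. It is an illustration under stated assumptions, not the Aigents API: TopWords and top are hypothetical names, a Map of word to rank stands in for the Linker returned by the indexer, and breaking ties alphabetically is an arbitrary choice made only to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; a Map stands in for the Aigents Linker.
final class TopWords {
    /** Returns up to maxWords words, highest rank first, ties broken alphabetically. */
    static List<String> top(Map<String, Double> ranks, int maxWords) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(ranks.entrySet());
        entries.sort((a, b) -> {
            int byRank = Double.compare(b.getValue(), a.getValue()); // descending rank
            return byRank != 0 ? byRank : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < maxWords; i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

The resulting list is what would be handed to the generator from #22; the hard cap keeps the generated reply short regardless of how many words the index ranks.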

Task outline:

  1. Complete Natural language production based on formal grammar #22
  2. Implement the above
  3. Find the baseline/train/test set for Question Answering from Kaggle or papers online
  4. Fine-tune the design, implementation, and parameters to achieve reasonable results on the data set from item 3 above
  5. Integrate with Aigents chat-script functionality
    5.1. Extend, replace, or override the existing Aigents Answerer https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/peer/Answerer.java using the Intenter plugin replacement design https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/agent/Demo.java#L82
    5.1.1. Solve the simplest summarization problem: given a single text as input and a few words as a seed, create a brief summary of the larger text body, as the public static String summarize(java.util.Set words, String text) function does in https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/peer/Answerer.java#L163
    5.1.2. Solve the more complex answering problem: given multiple texts, extract the relevant summary answering the question from the combination of the text bodies, as the Collection searchSTMwords(Session session, final SearchContext sc) function does in https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/peer/Answerer.java#L82
    5.2. Extend unit tests such as https://github.com/aigents/aigents-java/blob/master/php/agent/agent_chat.php
    5.3. Test in Telegram chat-bot
    5.4. Consider if some code should be moved to Aigents Core Platform from the org.aigents.nlp.qa
    5.5. TBD
  6. TBD
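
As a baseline for item 5.1.1, seed-based summarization can be approximated by keeping only the sentences that mention a seed word. The sketch below is an assumption about what a simplest solution could look like, not a description of the existing Answerer.summarize: SeedSummarizer is a hypothetical class, sentence splitting is a naive split on terminal punctuation, matching is by exact lowercase token, and seed words are assumed to be lowercase already.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical baseline for illustration only; not the existing Aigents summarize().
final class SeedSummarizer {
    /** Returns the sentences of text that contain at least one seed word, in order. */
    static String summarize(Set<String> seedWords, String text) {
        StringBuilder summary = new StringBuilder();
        // Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        for (String sentence : text.split("(?<=[.!?])\\s+")) {
            String lower = sentence.toLowerCase(Locale.ROOT);
            // Tokenize on anything that is not a letter or a digit.
            for (String token : lower.split("[^\\p{L}\\p{Nd}]+")) {
                if (seedWords.contains(token)) {
                    if (summary.length() > 0) summary.append(' ');
                    summary.append(sentence.trim());
                    break; // one match is enough to keep the sentence
                }
            }
        }
        return summary.toString();
    }
}
```

Because matching is by exact token, the seed "cat" selects "The cat sat." but not "Cats sleep a lot."; handling word forms would need stemming or the grammar-based approach from #22.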

References:
https://blog.singularitynet.io/an-understandable-language-processing-3848f7560271

@akolonin akolonin added enhancement New feature or request help wanted Extra attention is needed labels Jul 12, 2020
@akolonin akolonin added progress In progress help wanted Extra attention is needed and removed help wanted Extra attention is needed progress In progress labels Jul 15, 2020