Decoupling test generation and test execution #41

Jaxan · 2016-08-29T11:57:16Z

Many of the equivalence oracles in LearnLib are based on testing (this includes RandomWordEQOracle, W(p)MethodEQOracle, CompleteExplorationEQOracle, ...) and so each of these classes will contain code like this (in many flavours):

D hypOutput = output.computeOutput(queryWord);
sulOracle.processQueries(Collections.singleton(query));
if (!Objects.equals(hypOutput, query.getOutput()))
    return query;

It is a bit annoying to have to do this every time one comes up with a new test generation algorithm. My proposal is to introduce the concept of a test generator, then you implement the above code just once (i.e. one can make an equivalence oracle from any test generator).

Pros:

Implement test execution just once.
Allows to batch tests in order to execute them in parallel (now you'd have to implement this batching in each class, RandomWordEQOracle for example does this).
Allows to bound the test generation, withouth having to add a counter to each test generator.
Allows to interleave different test generation methods.

There are no cons ;-), besides having yet another abstraction.

Ideally we would implement this with coroutines, but Java does not have them. So I think iterators will do fine. Note that a collection is not sufficient, since many test generation methods are infinite. A supplier is also not sufficient because it cannot stop (some test generations methods may be finite).

In terms of interfaces, I think we would like to have two concepts. First there is the test generator, which may be just an iterator. Then there is a test generation method which has a function taking a hypothesis and returning a test generator. I have no opinion on whether we should use standard Java interfaces (like iterator) or define our own.

What are your thoughts?

Also should this be in LearnLib, or as something separate?

The text was updated successfully, but these errors were encountered:

praseodym · 2016-08-30T19:19:29Z

This sounds like something reactive streams are well suited for. Reactive streams can be infinite, can be cancelled, are asynchronous and can be easily parallelised.

Jaxan · 2016-09-01T13:50:06Z

I don't necessarily want the streams themselves to be asynchronous (producing a test is almost an instant operation, it's executing which takes time, so there is little gain in having produces being asynchronuous). So it might be overkill for something we can express using a simple iterator.

However, if the library for reactive streams already gives a lot of reusable building blocks (we think we will use), it might be worth looking into it. Reusable blocks which I imagine being used for testing are: bounds on test suites (for termination), interleaving of test suites (zip), extending test sequences (map), removing test cases (filter).

If your suggestion is to implement this ourselves, I would rather go for something simple like iterators. If your suggestion is to use a library, then I would like to know whether you have experience with such a lib.

Cheers,
Joshua

ricksmet · 2016-09-02T10:34:21Z

Another pro of Joshua's proposal is the possibility to use the PassiveLearningAlgorithm interface (and, by extension, the rich collection of passive learning algorithms proposed in literature) for learning reactive systems.

Generally speaking, the PassiveLearningAlgorithm consists of the self explanatory void addSample(DefaultQuery<I,D> sample) and M computeModel() method signatures. The proposed test generator interface could let a (collection of) test generation methods provide samples for the learning algorithm.

It would be ideal if, in turn, the test generation methods could use an up-to-date (partial) hypothesis from computeModel() to tailor their test selection procedure, but as of now I am unsure on how to achieve this.

It would be interesting to see how such a setup would compare to the conventional active learning approach in terms of performance and results. I am under the impression that active learning algorithms (at least some variations of L*, much less so TTT) add overhead (in terms of queries asked), that do not benefit the result of the learning process.

Jaxan · 2016-09-02T11:42:22Z

It would be ideal if, in turn, the test generation methods could use an up-to-date (partial) hypothesis from computeModel() to tailor their test selection procedure, but as of now I am unsure on how to achieve this.

So my plan was to have an interface TestMethod which returns an TestGenerator for every hypothesis. Meaning that you get a new test generator for each hypothesis. But maybe a single entity with some updateHypothesis function is easier to use.

What I envision for active learning is something where we can write the following (where all these components are provided in the library):

EquivalenceOracle<I, O> eqOracle = new TestingOracle(new Bound(1000, new Interleave(new WMethod(...), new LeeYannakakisMethod(...))));

where TestingOracle adapts a test method into an equivalence oracle (i.e. it is the test executor). And Bound just picks the first 1000 tests and then stops. Interleave will zip the two test methods.

fhowar · 2016-09-06T09:06:13Z

I like the idea and think we should implement this in LearnLib!

The one method that does not match this pattern is a random walk - but I think that is OK.

praseodym · 2016-09-06T19:43:03Z

I don't necessarily want the streams themselves to be asynchronous (producing a test is almost an instant operation, it's executing which takes time, so there is little gain in having produces being asynchronuous). So it might be overkill for something we can express using a simple iterator.

Asynchronous execution does have benefits for parallel execution, but I agree that the performance gain would not be that big.

However, if the library for reactive streams already gives a lot of reusable building blocks (we think we will use), it might be worth looking into it. Reusable blocks which I imagine being used for testing are: bounds on test suites (for termination), interleaving of test suites (zip), extending test sequences (map), removing test cases (filter).

That's exactly the purpose of these libraries. For example, see the list of ReactiveX operators or the reactor-core Flux JavaDoc (which implements Reactive Streams). Unfortunately terminology differs a little between the ReactiveX and Reactive Streams camps, but the ideas are mostly the same.

Your example:

EquivalenceOracle<I, O> eqOracle = new TestingOracle(new Bound(1000, new Interleave(new WMethod(...), new LeeYannakakisMethod(...))));

could be implemented with standard operators (take, zip) on two test publishers (observables); the resulting stream of tests is then subscribed on (observed) by the equivalence oracle.

If your suggestion is to implement this ourselves, I would rather go for something simple like iterators. If your suggestion is to use a library, then I would like to know whether you have experience with such a lib.

This definitely shouldn't be something you want to implement by yourself. Project Reactor has a very nice core library, although documentation for version 3 is still lacking. RxJava is a little more mature, but it does not implement Reactive Streams by default.

I have positive experiences with Reactor in other projects and I'd be happy to help here as well.

Jaxan · 2016-09-07T08:38:13Z

@fhowar You are right about the random walk oracle. The design I propose is "word-based". I think it will work well for all equivalence oracles which use a membership oracle (the random walk oracle does not do that). If the random walk oracle is the only exception, I think it's fine to just leave it like that. If you can think of more of these examples, that would be nice, we need more examples!

@praseodym I have to admit I am a bit overwhelmed by all the terminology (especially considering I am trying to solve an easy problem). But I can get used to that. One question though. It seems the test executor will be a subscriber in that model. That means I have to implement something like:

class TestExecutor {
  CE findCounterExample(...) {
    // start generation
    // and subscribe
    // and return ???
  }
  void onNext(...) {
    // execute test
    // if counter example, return it to learner
  }
  void onCompletion(...) {
    // return null to learner
  }
}

I do not really see yet, how to use the subscriber/publisher design here. Do I have to fork a thread in findCounterExample in order to let things flow? In comparison, the iterator style would look like:

class TestExecutor {
  CE findCounterExample(...) {
    Iterator it = ...
    for(Word w : it) {
      // execute test
      // if counter example, return it
    }
    return null;
  }
}

Also keep in mind that each hypothesis will have its own generation. But that I only want to construct the "chain" of operations once.

praseodym · 2016-09-07T20:47:14Z

@Jaxan I agree that it is definitely overwhelming at first. Actually it might not be a bad idea to start off implementing the iterator style and then adding reactive streams later (but of course before implementing things like Bound and Interleave).

An iterator/iterable can be wrapped into a Flux (Publisher) quite easily; executing tests could be done with something like:

counterExample = Flux.fromIterable(() -> testGenerator)
                     .take(10)
                     .filter(query -> testForCounterExample(query, hypothesis))
                     .next();

Or with parallel membership queries (of course, testForCounterExample should be thread-safe):

counterExample = Flux.fromIterable(() -> testGenerator)
                     .take(10)
                     .parallel()
                     .runOn(Schedulers.parallel())
                     .filter(query -> testForCounterExample(query, hypothesis))
                     .sequential()
                     .next();

Implementing your own Publisher is a little more tricky, because it should run asynchronously and handle backpressure to be fully reactive streams-aware.

Jaxan · 2016-10-16T13:52:24Z

Just to give an update: next week, @praseodym, @ricksmet, others and myself will have a meeting to discuss this (among other learning related things). I will give an update after the meeting.

mtf90 · 2017-07-14T19:51:56Z

I'm quite fond of this idea as well and I think both @Jaxan and @praseodym have pointed out valid requirements/concerns already.

I'm not a fan of adding a big dependency such as RXJava for such a "small" use case. Also this would mean some kind of "vendor lock-in" because everyone who wants to use this feature would have to exclusively use (and understand) the chosen library. Yet I think, some comfort (e.g. interleaving multiple sources) is quite nice to have. In my opinion a good middle ground is given by using Java 8's Streams.

I dabbled around a little bit with Streams in my learnlib fork. I added the basic interfaces you guys already proposed and tried to model some of the existing EQ oracles with the new approach, which already lead to cleaner code. (I know, I butchered the "batched" mode of the RandomWord oracle, but I just wanted a use case for unlimited streams). I also used the JOOL library to model some of the more advanced stuff, where multiple sources are cross-joined. (Btw, I think this library is quite nice, because it's relatively small and since Sequences inherit from Stream, they can be directly plugged in the proposed API)

But there are still some things I need to further think about. One thing is parallelism in Streams. In the best case, we could somehow cover the ParallelEQOracle with this approach as well. The other thing is: A big advantage of this approach is the lazy evaluation of generated test words. However, a lot of EQ oracles still compute all test words in a bulk operation and then simply break the iteration loop, once a counterexample is found. So to fully utilize this approach, there still needs to be some work done in lazily computing things like a characterizing set.

But for now I'm interested in your opinion about this proposal. Do you think this API is rich enough to allow all the crazy stuff, yet slim enough to not pull in a specific framework? Or do you have any other improvement suggestions?

Jaxan · 2017-07-31T10:21:01Z

Hi @mtf90, I gave your fork a quick look, and it seems very nice! Especially, the W-method is elegantly coded. Do you have any idea how a state-dependent methods will look like (e.g. the Wp-method)?

About the parallelism of different test methods, I see two use cases:

Combining two test methods. This is not necessarily parallelism, as simple interleaving would be enough (with some kind of coroutines, for example).
Running a simple tester while the smart tester is computing stuff. For example, we may want to test random tests while we compute a characteristic set. For most of the test methods I know, generating the needed data is actually not very expensive. (A char. set can be obtained in O(n log n), and local state identifiers in O(n^2) worst case.)

I am not sure if both use cases have to be solved within the same framework. I can imagine solving both in different ways. I would say it has a low priority.

I am not against using libraries, per se. As long as someone can still use LearnLib without having to write against those interfaces himself. JOOL seems to do that, since everything is just a java-Stream in the end. I'm all for it!

Thanks for the great work!

mtf90 · 2017-07-31T20:35:32Z

Regarding parallelism, I was initially wondering if I could somehow use the Stream#parallel mechanism to process queries in parallel. However, I figured we don't need this, when we have proper batching support, because we can use the existing ParallelOracle abstraction for this (e.g. use a parallel oracle with 4 instances and a batch size of 20, so you process 5 queries in parallel). This is actually pretty nice, because

it keeps the TestingOracle pretty simple, because we don't have to worry about threading, etc. and batching can be easily implemented
allows to use simple(r) libaries such as JOOL that only support sequential streams
all oracles based on this TestingOracle get proper batching and parallelism support for free

Your second point is a little bit more complex, though. With the Stream approach you basically have a lazy/ondemand work queue, where each operation queries its source for the next item and currently the EQs would only start to compute the next test word if asked for it. You want to realize some kind of "anytime test generator" that can always return a test word if asked, even if the actual method has not yet finished computing. While I think this could be implemented via Streams, this sounds much more native to asynchronous stream frameworks (such as JavaRX). But again, I'm not a fan of pulling big dependencies into APIs (just like you argued). However, a promising news is that asynchronous streams will be part of Java 9 which is planned for later this year. So maybe we could cover this case with "onboard" solutions as well. But for now (you called it a low priority, too) I think even the simpler approach will be an improvement already.

Lastly, your point on state dependent methods is quite good, because I haven't thought about that yet. In case of the WpMethod it is actually quite easy, because the state is local to the test generation, so you can simply incorporate it into the stream construction. See e.g. my commit mtf90/learnlib-ce-streams@7a123961d2b072bec384ad4569efcb978860ac5c (again, since JOOL only supports sequential streams, we don't have to worry about threading/synchronization). However, oracles such as the SampleSetEQOracle are quite problematic, because they require information about the answered queries. I thought about adding something like a callback, where the extending subclass can extract these information again, but maybe these are such corners cases (such as the RandomWalkOracle) where we should just leave it be for now. They still can be added later if the need arises.

praseodym · 2017-07-31T20:47:52Z

Just to add a short note here: the Reactive Streams API has been available for a while, and adding only the 3KB API dependency (i.e. not an actual implementation) could probably suffice. If someone were to use these features they could pull in Reactor or RxJava 2.x. No need to wait for Java 9 here.

It's been a while since I used LearnLib, so it might well be possible to add these features in a completely separate module (i.e. outside of LearnLib entirely). To me, that would be preferred above any specific interface implementation.

mtf90 · 2017-08-01T09:15:41Z

I looked into reactive streams a little bit further and start to see the appeal of it. It's nice that they have a separate API (I previously thought the situation was like with JavaRx1) and it's even nicer that they plan to be conformant to the JDK9 interfaces (such as Guava functions extending j.u.f.Function)

I tried to integrate reactive-streams in mtf90/learnlib-ce-streams@ba8cd064b5a7b91aed648943bcffc1b79674bce9 in a similar approach to the stream-based oracle, but I think there needs to be (a lot) more work done to fullfill the RS contract and integrate into existing EQOracle interfaces. For fully utilizing the asynchronous aspect we would also have to ramp up the ParallelOracle API, to (e.g.) delegate single items to free MembershipOracles as they come.

While this definitely sounds interesting from a HPC standpoint, I'm not sure about the cost/value ratio for now. Especially for this FR, which is originally not concerned with performance, but convenience / clean API design. If think, if you plan to use RS a Publisher -> Iterator -> Stream chain will integrate in the proposed draft just fine.

Jaxan mentioned this issue Aug 29, 2016

Adds two randomized equivalence oracles based on the W method #40

Merged

aschieweck added the type: new feature label Oct 16, 2016

mtf90 added a commit that referenced this issue Sep 6, 2017

eqtests: add abstract oracle based on test word generation (see #41)

d0aa27e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decoupling test generation and test execution #41

Decoupling test generation and test execution #41

Jaxan commented Aug 29, 2016

praseodym commented Aug 30, 2016

Jaxan commented Sep 1, 2016

ricksmet commented Sep 2, 2016

Jaxan commented Sep 2, 2016

fhowar commented Sep 6, 2016

praseodym commented Sep 6, 2016 •

edited

Loading

Jaxan commented Sep 7, 2016

praseodym commented Sep 7, 2016 •

edited

Loading

Jaxan commented Oct 16, 2016

mtf90 commented Jul 14, 2017

Jaxan commented Jul 31, 2017

mtf90 commented Jul 31, 2017

praseodym commented Jul 31, 2017

mtf90 commented Aug 1, 2017

Decoupling test generation and test execution #41

Decoupling test generation and test execution #41

Comments

Jaxan commented Aug 29, 2016

praseodym commented Aug 30, 2016

Jaxan commented Sep 1, 2016

ricksmet commented Sep 2, 2016

Jaxan commented Sep 2, 2016

fhowar commented Sep 6, 2016

praseodym commented Sep 6, 2016 • edited Loading

Jaxan commented Sep 7, 2016

praseodym commented Sep 7, 2016 • edited Loading

Jaxan commented Oct 16, 2016

mtf90 commented Jul 14, 2017

Jaxan commented Jul 31, 2017

mtf90 commented Jul 31, 2017

praseodym commented Jul 31, 2017

mtf90 commented Aug 1, 2017

praseodym commented Sep 6, 2016 •

edited

Loading

praseodym commented Sep 7, 2016 •

edited

Loading