Re-designing the answer test for integration #1229

anst-i · 2024-07-23T14:12:38Z

Over the years, the Int answer test has grown considerably and can now handle many different situations, depending on the options given. While this made the test versatile, it went against one of the design principles for answer tests: they should only do one thing, and they should do it well. The Int answer test currently does many things at once:

It may or may not check for a constant of integration (see NOCONST flag in the documentation)
It may or may not check whether the student answer and the teacher answer have equivalent derivatives but a "different form" (see the FORMAL flag in the documentation)
It guesses if the teacher used absolute values in the argument of logarithms, and if so, can again reject a student answer for not being formally equivalent.

This has several shortcomings:

The result of the answer test is hard to predict
Not all combinations of options currently work! (I can dig this out if it's needed, I think NOCONST is ignored if FORMAL is given)
The code for the answer test is difficult to maintain

The documentation comments this as:

The test cannot cope with some situations. Please contact the developers when you find some of these. This test is already rather overloaded, so please don't expect every request to be accommodated!

In an attempt to improve the situation, I've started a dev branch called ~~at-antidiff~~ iss1229 which tries to untangle the different mechanisms of the Int answer test into several new answer tests:

ATAntiDiff: Checks if the student answer and the teacher answer are antiderivatives of the same function
ATAddConst: Given a list of variables in the options, checks whether the student answer contains exactly one additional variable not in the list, and whether it is added as a +C to the expression.
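To make the split concrete, here is a minimal Maxima sketch of the two ideas. It is not the code in the iss1229 branch: the helper names `antidiff_equiv` and `addconst_check` are hypothetical, and the sketch assumes plain Maxima with default simplification rather than the STACK sandbox.

```maxima
/* Hypothetical helper: are sa and ta antiderivatives of the same function in v? */
antidiff_equiv(sa, ta, v) :=
    is(ratsimp(diff(sa, v) - diff(ta, v)) = 0)$

/* Hypothetical helper: does sa contain exactly one variable outside `known`,
   and does that variable enter as a plain "+ C"? */
addconst_check(sa, known) := block([extra],
    extra: sublist(listofvars(sa), lambda([u], not member(u, known))),
    if is(length(extra) = 1)
        then is(diff(sa, first(extra)) = 1)
        else false)$

antidiff_equiv(x^2/2 + c, x^2/2, x);   /* true */
addconst_check(x^2/2 + c, [x]);        /* true */
addconst_check(x^2/2, [x]);            /* false: no constant at all */
```

Keeping the two checks as separate functions mirrors the proposal: a question author could combine them, or use only the first when a constant of integration is not required.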

This is work in progress, and some feedback would be nice. Open questions to me are:

How and when do you use or not use the FORMAL flag in the current Int answer test? Is it important that the new answer tests can check for formal equivalence? If yes, can this be done by other, already existing answer tests?
Do you use the fact that Int checks for absolute values in logarithms? Should there be a separate answer test to check for this?
In ATAddConst: Would you like to have an option to allow for "weird constants"? Technically, adding +C^3 to a basic antiderivative also gives you the set of antiderivatives, so mathematically it would be wrong to reject it. However, such a check might be difficult to program (one would have to check whether the function in C is surjective on the reals), and it's not so clear if anyone would really use it.


LukeLongworth · 2024-07-23T21:57:19Z

Hi Andreas,

Thanks for looking at this. I have a couple of comments/questions:

What would happen to the existing ATInt function in existing questions? Will it just run the two new functions one after the other?
I didn't know the FORMAL flag existed until I read this! On first read, I'm not sure why you would not want this behaviour if you were still able to check for a constant of integration. My experience with students is that at least one in every cohort will find a way to rearrange expressions in an unexpected, but correct (ignoring introduced domain restrictions) way, so having a flexible check like this seems like a no-brainer.
I don't think it's worth expanding the support for constants of integration much. As you say, there are all sorts of weird ways to write them. The only other form of an arbitrary constant that I feel could be worth checking for would be int(1/x,x) = ln(abs(kx)) because affine shifts to a log function always feel a bit odd to me. A general check sounds very difficult and likely isn't worth it.
The absolute value checking has caused me a few headaches over the years. For example, the following set-up has no correct solution at all due to the mismatched absolute value terms:

ATInt(ta,ta,x) will return false unless you remove the absolute value from the ln(x) term, which we don't want in most cases. In an ideal world I'd have the test check that the student's answer is an antiderivative over the entire domain of the given function, because a MORE general solution to int(1/x,x) is f(x):= if x<0 then ln(-x) + C1 else if x>0 then ln(x) + C2. This is obviously not a reasonable answer for a student to give to a STACK question, but it illustrates that throwing an absolute value into the log is a quick fix at best.
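As a small illustration of the piecewise point, the two branches can be differentiated separately in plain Maxima. This is only a sketch: `log` is Maxima's natural logarithm (STACK also accepts `ln`), and `c1`, `c2`, `left_branch` and `right_branch` are illustrative names.

```maxima
/* Each branch carries its own independent constant. */
left_branch:  log(-x) + c1$   /* intended for x < 0 */
right_branch: log(x)  + c2$   /* intended for x > 0 */

/* Both derivatives should reduce to 1/x on their respective half-lines. */
diff(left_branch, x);
diff(right_branch, x);
```

Because the domain of 1/x has two connected components, nothing forces c1 and c2 to be equal, which is why a single ln(abs(x)) + C does not describe every antiderivative.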

anst-i · 2024-07-25T10:42:44Z

Thank you Luke!

ATInt would stay as it is, but we could mark it as deprecated in the documentation. Writing a script which rewrites all existing questions would be difficult without breaking anything.
I personally agree that I don't care how students write their antiderivatives, but apparently some people do. I wonder if a different answer test like EqualComAssRules would work for them.
I'm currently checking that the constant is additive by comparing the derivative of the expression with respect to the constant with 1. To be less strict, I could just check that this derivative does not contain any variable other than the constant itself. This would allow for your ln(abs(kx)) case. Might it be worthwhile to implement a nonstrict option?
I'm sorry, I can't recreate your example. On both the 4.4.2 and 4.7.0 STACK sandboxes, ATInt returns true for me. What am I missing? Anyway, I have studied these constants of integration for non-connected domains quite a bit recently, and yes, if you want all antiderivatives, you need to add two different constants. But the question is, what do we want? STACK should be mathematically correct, but it should also be usable. To stay mathematically correct, we'd have to find all connected components of the natural domain of the expression and then check for a constant for each of them. I think we agree that this is certainly not what teachers would expect, and for some cases (like if the domain of the answer expression contains infinitely many holes) that's downright unusable. The question is thus about a compromise between correctness and user-friendliness, and I think having the AddConst answer test as proposed here might be such a compromise.

LukeLongworth · 2024-07-25T21:39:44Z

Sounds good to me!
I think the tools already available in STACK are sufficient for teachers who want to closely analyse the form of student expressions. Using ATAntiDiff in conjunction with some more testing feels like it would be suitable in most cases to me.
I like the idea of a nonstrict option. You're suggesting that, by default, we check is(diff(expr,C)=1), but when nonstrict is enabled we instead check something like emptyp(delete(C,listofvars(diff(expr,C))))?
It's entirely possible that I am doing something wrong here, as I only rarely work with the AT functions directly. This is what I tried:

My understanding is that the second element of this list is correct/incorrect, and that the first element is whether the code ran error-free. I am running STACK 4.4.6 here, and the behaviour is matched when I run it inside an actual question, whether I use abs in either, both or neither of the log terms. I can fix this by adding assume(x>0) which makes the whole thing abs-neutral - also not desired!

And yes, my remarks about multiple constants were really just a short rant about something I find interesting rather than a feature request. The main issue I would like to see fixed is the above, though I recognise this might be difficult if no-one can recreate it.

sangwinc · 2024-07-26T08:14:09Z

@LukeLongworth just to clarify one thing about the return format of answer test functions.
https://docs.stack-assessment.org/en/Authoring/Answer_Tests/

[Errors, Result, FeedBack, Note] = AnswerTest(StudentAnswer, TeacherAnswer, [Opt], [Raw])

The "Result" is a Boolean true/false which is what you normally want to see, so yes look for the 2nd argument in the sandbox.
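For readers trying this in the sandbox, a minimal sketch of picking out the individual fields might look like the following. It assumes the STACK sandbox, where `ATInt` is defined (plain Maxima does not provide it); the variable names `ret`, `errors` and `result` are illustrative only.

```maxima
/* Call the answer test with identical student and teacher answers. */
ret: ATInt(x^2/2 + c, x^2/2 + c, x)$

/* Fields follow the documented order [Errors, Result, FeedBack, Note]. */
errors: ret[1];
result: ret[2];
```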

anst-i · 2024-08-26T17:08:16Z

> I like the idea of a nonstrict option. You're suggesting that, by default, we check is(diff(expr,C)=1), but when nonstrict is enabled we instead check something like emptyp(delete(C,listofvars(diff(expr,C))))?

Yep, pretty much. This is now implemented in the iss1229 branch.
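The two checks quoted above can be tried out in plain Maxima roughly as follows. This is a sketch under default simplification, not the implementation in the branch; the wrapper names `strict_const` and `nonstrict_const` are hypothetical.

```maxima
/* Strict: the constant must enter as a plain "+ C". */
strict_const(expr, cvar) := is(diff(expr, cvar) = 1)$

/* Nonstrict: the derivative with respect to the constant may depend
   on the constant only, not on any other variable. */
nonstrict_const(expr, cvar) :=
    emptyp(delete(cvar, listofvars(diff(expr, cvar))))$

strict_const(x^2/2 + c, c);      /* true */
nonstrict_const(x^2/2 + c, c);   /* true */

strict_const(c*x^2, c);          /* false: derivative is x^2 */
nonstrict_const(c*x^2, c);       /* false: x remains in the derivative */

/* The ln(abs(kx)) case from the discussion; log is Maxima's natural log. */
strict_const(log(abs(k*x)), k);
nonstrict_const(log(abs(k*x)), k);
```

The last pair is left without expected values, since the outcome depends on how Maxima simplifies the absolute value, and, as noted later in this thread, results can differ when simp is disabled.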

> It's entirely possible that I am doing something wrong here, as I only rarely work with the AT functions directly. This is what I tried:
>
> My understanding is that the second element of this list is correct/incorrect, and that the first element is whether the code ran error-free. I am running STACK 4.4.6 here, and the behaviour is matched when I run it inside an actual question, whether I use abs in either, both or neither of the log terms. I can fix this by adding assume(x>0) which makes the whole thing abs-neutral - also not desired!

It turns out that you are not doing anything wrong: this seems to depend on whether simp is enabled! Can you open another issue for this so that we don't forget?

georgekinnear mentioned this issue Aug 7, 2024

Answer Test based on validator function #1243

Closed

anst-i added this to the 4.9.0 milestone Dec 16, 2024

anst-i self-assigned this Dec 16, 2024
