
rethink constants #344

Open
mzuenni opened this issue Sep 18, 2024 · 26 comments

@mzuenni
Contributor

mzuenni commented Sep 18, 2024

Right now we replace constants in the following places:

  1. problem statements
  2. input and output validators
  3. included code
  4. example submissions
  5. testdata.yaml

There are a few things that I want to mention.

  1. "Replacement in example submissions and included code" What is the use case here and why do we differentiate between example submissions and other submissions? (Note that I don't want to argue that we should do replacement for all submissions, rather I think that we should not touch any code at all).

  2. In most cases text replacement is just inconvenient. We already restrict validators to C++ and Python, and in both cases there are nicer ways to automagically introduce constants. For example, in C++ we could add constexpr <std::string_view, long long, double> <constant name> = <constant value>, and something similar in Python. In LaTeX, we could introduce the constants as macros (see the sketch below).
    I know that this makes constants more complicated to implement on the tooling side, but IMO it is much more convenient for the user. (For markdown and testdata.yaml I have no better idea...)
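
For illustration, a minimal sketch of what the tooling could prepend to a C++ validator under this proposal; the constant names and values here are made up:

#include <string_view>

// hypothetical: injected by the tooling from the constants in problem.yaml
constexpr long long maxn = 100000;
constexpr double tol = 1e-8;
constexpr std::string_view failstring = "No Solution";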

@evouga
Collaborator

evouga commented Sep 18, 2024

The use case is stuff like

#define MAXN 100000
int dp[MAXN];

in judge solutions, which can now be

int dp[{{MAXN}}];

I agree that this is not as crucial as constant expansion in validators and problem statements, but it’s nice to be able to tweak problem bounds without needing to go through and fix RTEs in judge solutions.

Expanding constants in contestant submissions seems to me a bad idea; for one thing contestants don’t know what constants are available, and for another we should not be doing any messing around with submission source code that could accidentally break a solution that works locally for the student.

I’m not sure why text replacement is “just inconvenient”?

@RagnarGrootKoerkamp
Collaborator

Text replacement badly messes up editor highlighting and autoformatting and such. Wouldn't it be nicer in your example if we did a #define MAX_N 1000 in the background? And similarly we could \newcommand{\maxn}{1000} for LaTeX.

@evouga
Collaborator

evouga commented Sep 18, 2024

Is there concrete evidence that the proposed balanced-braces “badly” messes up anything in practice?

I think it is good to make it obvious which variables are externally defined problem constants and which are statically scoped, and to make it more or less impossible to accidentally shadow the problem constants.

@mzuenni
Contributor Author

mzuenni commented Sep 18, 2024

regarding the change in judge submissions: I am not sure if this really makes things much easier, since you still need to change all the test data anyway (note that we don't define generators). And distinguishing between user submissions and jury submissions makes stuff more complicated than necessary. Further, we don't allow a constant to write solutions like "do random stuff until you are out of time" for which automatic updates would be useful.

regarding messing stuff up: my problem is more that this messes up standard build processes. For C++ I can't compile anything any more, but if it's done with macros or as a constant I can simply tell my compiler which macros exist. Same for LaTeX. The curly-bracket syntax breaks latexmk rebuilds, whereas with macros this cannot happen.
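
As a concrete example (a sketch; MAXN is a made-up constant name), a macro-based scheme keeps the file compilable both locally and in the official build:

#ifndef MAXN
#define MAXN 100000  // local fallback; the tooling (or g++ -DMAXN=...) supplies the real value
#endif

int dp[MAXN];

int main() {
    // ... solution code using MAXN ...
}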

@Tagl
Contributor

Tagl commented Sep 18, 2024

Yeah I think defines and environment variables might be better approaches here.

@evouga
Collaborator

evouga commented Sep 18, 2024

For LaTeX, don't you already need a script that does preprocessing before being able to compile the problem statement? You need to wrap the problem statement in a document environment, define all of the problem-package-specific macros, etc.

I agree that it's annoying that C++ code cannot be compiled locally using e.g. -D{{maxn}}=100000, and that it may be quite difficult to convince my judges to use constants in the judge solutions for this reason.

So:

  • we could try to come up with some language-specific way to include the constants (note that this would need to include all valid input validator languages, including stuff like CheckTestData, not just Python, TeX, and C++).
  • we could remove judge submissions from the list of preprocessed files, if this is the main sticking point. I am OK with this, though I don't really see the harm---judge teams who can't be bothered to deal with problem constants during local compilation don't have to use them.

But we had long discussions already about these issues at Lund, and considered alternatives like __maxN__ and XXMAXNXX and other schemes that could be implemented as macros, and the status quo was a compromise balancing factors such as simplicity, flexibility across multiple languages, readability, and not-breaking-syntax-highlighting-too-badly. I don't think the current proposal fits everybody's workflow perfectly, but I'm also hesitant to relitigate the feature from scratch, unless there is a major undiscovered problem with the proposed approach.

@mzuenni
Contributor Author

mzuenni commented Sep 18, 2024

For LaTeX, don't you already need a script that does preprocessing before being able to compile the problem statement?

you can make latexmk do all that. it just can't preprocess the .tex file with text replacement... (especially not with live PDF generation)

we could try to come up with some language-specific way to include the constants (note that this would need to include all valid input validator languages, including stuff like CheckTestData, not just Python, TeX, and C++).

we only allow ctd, viva, c++ and python as checkers

@evouga
Collaborator

evouga commented Sep 18, 2024

we only allow ctd, viva, c++ and python as checkers

Yes, you are right. So we could specify different mechanisms for these languages (plus yaml and markdown) and remove support for constants in included code and submissions.

you can make latexmk do all that. it just can't preprocess the .tex file with text replacement... (especially not with live PDF generation)

And it cannot be configured to sed the input files? (Though this is a moot point if you have a nice solution along the lines of the above).

@mzuenni
Contributor Author

mzuenni commented Sep 18, 2024

you can make latexmk do all that. it just can't preprocess the .tex file with text replacement... (especially not with live PDF generation)

And it cannot be configured to sed the input files? (Though this is a moot point if you have a nice solution along the lines of the above).

I see no good way to do that. But if we add a mechanism for C++ etc., then one for LaTeX is also fine, i.e. if you have the constant blub we define the LaTeX macro \blub or \constant_blub or something like this.

@niemela
Member

niemela commented Sep 18, 2024

  1. why do we differentiate between example submissions and other submissions?

"example submissions" are the only kind of submissions in the problem packages.

Expanding constants in contestant submissions seems to me a bad idea; [...]

Yes, definitely. I don't think anybody has suggested or wants that.

regarding the change in judge submissions: I am not sure if this really makes things much easier since you still need to change all the test data anyway (note that we don't define generators).

I agree that constants are significantly less needed for (judge) submissions, simply because the agreement between submissions and statement/data/validators is checked. If there is a change of a limit that breaks a submission, then installation will fail.

The reason why we wanted something like constants to begin with is that we verify that submissions agree with data/validators, but we have to trust problem authors to make sure that the statement also agrees. In fact, the most common error with problem packages is exactly this.

And distinguishing between user submissions and jury submissions makes stuff more complicated than necessary.

I don't understand what this means? There are no "user submissions" in a problem package. Clearly (?) constants are not meant to apply outside of the package, i.e. they are absolutely not meant to be applied to submissions sent to some system that has installed the package.

Further, we don't allow a constant to write solutions like "do random stuff until you are out of time" for which automatic updates would be useful.

I can't quite parse this. Is "constant" supposed to be "contestant"? If so, why would constants help with writing "do stuff until time runs out" style submissions?

we only allow ctd, viva, c++ and python as checkers

Yes, you are right. So we could specify different mechanisms for these languages (plus yaml and markdown) and remove support for constants in included code and submissions.

A minor point is that this could make it harder to add additional languages in the future. That said, we could always say that constants are not available for some languages (and then we would be no worse off than if we didn't define constants).

@mzuenni
Contributor Author

mzuenni commented Sep 19, 2024

  1. why do we differentiate between example submissions and other submissions?

"example submissions" are the only kind of submissions in the problem packages.

Yes, but in the end the problem gets uploaded to a system that handles both the example submissions and participant submissions. Now it has to distinguish them (which will likely also make it impossible to upload an example submission alone).

Further, we don't allow a constant to write solutions like "do random stuff until you are out of time" for which automatic updates would be useful.

I can't quite parse this. Is "constant" supposed to be "contestant"? If so, why would constants help with writing "do stuff until time runs out" style submissions?

Regarding use cases for constants in submissions: I think inserting something like the time limit into a submission would be more useful than inserting constants, but we already decided that that won't be possible.

Overall I am just in favor of not having constants in submissions and using them for their main purpose: making checkers and statements consistent.

we only allow ctd, viva, c++ and python as checkers

Yes, you are right. So we could specify different mechanisms for these languages (plus yaml and markdown) and remove support for constants in included code and submissions.

A minor point is that this could make it harder to add additional languages in the future. That said, we could always say that constants are not available for some languages (and then we would be no worse off than if we didn't define constants).

that is true, but I would prefer to change the constants like this:

  • If it's C++, do X (for example macros)
  • If it's Python, do Y (for example insert a variable with that value)
  • If it's TeX, do Z (for example introduce a macro)
  • Else, replace {{constant}} with its value (we don't (yet?) have better ideas for ctd checkers, viva checkers, markdown or testdata.yaml).

@niemela
Member

niemela commented Sep 19, 2024

"example submissions" are the only kind of submissions in the problem packages.

Yes, but in the end the problem gets uploaded to a system that handles both the example submissions and participant submissions. Now it has to distinguish them (which will likely also make it impossible to upload an example submission alone).

I disagree with that claim. A system does not have to distinguish submissions in the problem package from any user submissions after installation. To upload package submissions that use constants, you would of course have to apply the constant replacements first, but that is exactly why the target system does not even have to know that our constants exist.

Overall I am just in favor of not having constants in submissions and using them for their main purpose: making checkers and statements consistent.

The consensus certainly seems to be that statements and validators are the more important use case. Is there anybody who sees a good use for constants in submissions (and included code)?

[...] I would prefer to change the constants like this:

  • If it's C++, do X (for example macros)
  • If it's Python, do Y (for example insert a variable with that value)
  • If it's TeX, do Z (for example introduce a macro)
  • Else, replace {{constant}} with its value (we don't (yet?) have better ideas for ctd checkers, viva checkers, markdown or testdata.yaml).

Ok, that would work, but:

  • It's more complicated
  • It still has the issue you are complaining about. The programs won't work until after you have made the replacements. If you're willing to do the replacements first (as is the intent) with these rules, why won't it work with the simpler rules?

@mzuenni
Contributor Author

mzuenni commented Sep 19, 2024

"example submissions" are the only kind of submissions in the problem packages.

Yes, but in the end the problem gets uploaded to a system that handles both the example submissions and participant submissions. Now it has to distinguish them (which will likely also make it impossible to upload an example submission alone).

I disagree with that claim. A system does not have to distinguish submissions in the problem package from any user submissions after installation. To upload package submissions that use constants, you would of course have to apply the constant replacements first, but that is exactly why the target system does not even have to know that our constants exist.

Well, I think the system you upload the package to has to do the replacement (so for example DOMjudge)?
Anyway, if I want to upload an example submission independently it won't work. As a problem setter, this makes it unlikely for me to use this feature for submissions. I get highlighting errors, I get syntax errors from my IDE's compiler, and I can't upload my submission, and all that only for the small benefit of having one or two constants updated automatically?

Ok, that would work, but:

  • It's more complicated
  • It still has the issue you are complaining about. The programs won't work until after you have made the replacements. If you're willing to do the replacements first (as is the intent) with these rules, why won't it work with the simpler rules?

Yes, but I rarely run validators on my own (unlike submissions)? And with what I propose, at least I don't get errors from my IDE while writing those validators.
Further, for C++ my IDE (and even plain g++ with -D) can do macro insertion. Same for LaTeX (-usepretex). But none of my tools can directly do the replacement we currently specify. So I would need to write some script to add an extra build step that somehow does the replacement, and in the case of latexmk -pvc (continual rebuild) even that won't be sufficient...
Overall, this does not solve all problems, especially if the language in question does not have a more convenient way of insertion, but IMO in the cases where it is possible we should use those features.

Also, yes, what I propose makes it harder for us as tool authors, but it makes it more convenient for the people who use our tools? In the end, our goal should be that constants are used as much as possible for the purpose of consistency, i.e. ensuring that validators and therefore test cases match what is specified in the problem statement. Therefore, my aim is to make it as convenient as possible.

@niemela
Member

niemela commented Sep 19, 2024

Well, I think the system you upload the package to has to do the replacement (so for example domjudge)?

Yes, true, but only when installing the problem (or, depending on how you look at it, before installing it). At that point user submissions do not exist, so I would disagree with the statement that "it has to distinguish them".

Also, yes, what I propose makes it harder for us as tool authors, but it makes it more convenient for the people who use our tools? In the end, our goal should be that constants are used as much as possible for the purpose of consistency, i.e. ensuring that validators and therefore test cases match what is specified in the problem statement. Therefore, my aim is to make it as convenient as possible.

Good point.

@niemela
Member

niemela commented Sep 19, 2024

I'm (again) hearing strong consensus for constants in statements and validators, some good arguments against constants in submissions, and (so far) no one who explicitly wants constants in submissions (which is a lower bar than having a good argument for it), only that it could in theory maybe be useful (which is an even lower bar).

If you really think we should allow constants in submissions (and included code), please speak up.

@jsannemo
Contributor

@simonlindholm are included code hacks for faster interactive validators still a relevant thing? If so, this might be nice.

Also, just because constant replacement exists for submissions, nothing forces you to use it if you find it inconvenient.

@simonlindholm
Member

@simonlindholm are included code hacks for faster interactive validators still a relevant thing?

They are, and included code hacks are relevant for other weird kinds of multipass stuff as well (e.g. you can imagine two submission instances running in parallel and communicating with each other, which wouldn't be covered by the newly added multipass support, or you might want the time limit to apply across the sum of runs).

I personally wouldn't want to use {{}}-style constants for them, though; it makes it too painful to test the code outside of verifyproblem. Instead I'd prefer to embed constants in the test data that the included code reads from stdin. (In the case of fake interactive problems you also want to make them truly interactive for judge UI purposes, so the interactive output validator would be responsible for communicating the constants at the start of the run, reducing the problem to that of getting output validators to know the constants.)
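
A rough sketch of that embedding idea (the extra constants line and its layout are entirely made up):

// included helper code; assumes the generator prepends one line of
// constants, e.g. "100000 1e-8", to every input file (hypothetical format)
#include <iostream>

long long maxn;
double tol;

void read_constants(std::istream& in) {
    in >> maxn >> tol;  // consume the constants line before the real test data
}

int main() {
    read_constants(std::cin);
    // ... the actual submission logic follows, using maxn and tol ...
}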

For input/output validators I think injected C++ defines could be decent enough UX, but for a more language-portable solution we could also use command-line arguments (with some custom syntax for embedding constants in "*_validators_args" yaml strings?) or environment variables.
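
To illustrate the environment-variable route, a minimal validator sketch; the variable names MAXN and FAILSTRING, and the idea that the tooling exports them before running the validator, are assumptions:

#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    const char* n = std::getenv("MAXN");        // e.g. "100000"
    const char* f = std::getenv("FAILSTRING");  // e.g. "No Solution"; spaces are fine in env vars
    if (n == nullptr || f == nullptr) {
        std::cerr << "constants not set\n";
        return 1;
    }
    long long maxn = std::atoll(n);
    std::string failstring = f;
    // ... validate against maxn and failstring ...
    return 42;  // accept (assuming the usual exit-code convention)
}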

@eldering
Collaborator

eldering commented Sep 22, 2024

I don't have a strong opinion either way on allowing constants in submissions. And I agree with @niemela: this only applies to jury/example submissions and is completely unrelated to team submissions, because any constant substitution should happen before these jury submissions are uploaded to the CCS (at least for DOMjudge). So I'd consider it part of the "exported" problem package, like calculated time limits and generated test data.

On checktestdata: you can define constants on the command line, so that's a way to pass them without replacement of {{ ... }} in the script if necessary.

@evouga
Collaborator

evouga commented Sep 23, 2024

What I will definitely require my judges to use is constants in the validators, problem statement, and testdata.yaml.

I.e. in problem.yaml:

constants:
  minn: "1"
  maxn: "100"
  maxcoord:     "1e9"
  maxcoord_tex: "10^9"
  failstring: "No Solution"
  tol:     "1e-8"
  tol_tex: "10^{-8}"

In problem.en.tex:

The first line of input contains one integer $n ({{minn}} \leq n \leq {{maxn}})$, the number of points. 
The following $n$ lines will contain two integers each, $x$ and $y$, satisfying $0 \leq x,y \leq {{maxcoord_tex}}.$

Print the radius of the smallest disk containing all $n$ input points, if such a circle exists. 
Otherwise print \texttt{{{failstring}}}. 
Your answer will be judged correct if it is within relative or absolute error ${{tol_tex}}$.

In validator.ctd:

INT({{minn}}, {{maxn}}, n) NEWLINE
REP(n)
INT(0, {{maxcoord}}) SPACE INT(0, {{maxcoord}}) NEWLINE
END
EOF

In testdata.yaml:

output_validator_args: "float_tolerance {{tol}}"

In a potential output validator:

std::string s;
std::getline(std::cin, s);
if(s == "{{failstring}}") {
  // ...
}
  1. As was mentioned at the CLI, it is annoying that I need to use separate _tex constants, but I am OK with this as there is not any great alternative. The above is already much better than the status quo, as scanning the constants in one place is less error-prone than across multiple files, especially when many people are making changes to bounds to tweak time limit behavior etc.

  2. I (still) think that constants in submissions are potentially useful, i.e.:

int main()
{   
  // some math
  if (ok)
    std::cout << std::setprecision(30) << answer << std::endl;
  else
    std::cout << "{{failstring}}" << std::endl;
}

Inconsistencies in e.g. capitalization of {{failstring}} arise not infrequently during problem development. @mzuenni and others above have convinced me I probably won't be able to force my judges to use them, since independent compilation of judge solutions is indeed useful. But I would use them in my solutions (after I am done debugging them locally), and I also don't see the harm in allowing them, unless there's a risk of accidental constant replacement in code that's not intended to have constants?

  3. I agree with @niemela that independent compilation of the .tex, validators, etc. is fundamentally in tension with the whole point of the global constants (which is to tightly couple these components in a way that reduces the incidence of broken problems). If people really want to independently compile the .tex etc., and it is possible with a few light scripts (especially if scripts to do the constant replacement are available in the tools and easy to use), I think that is OK.

  4. I am OK with some kind of language-specific idiomatic injection scheme, as long as it Just Works for all constants I want to use in practice. That includes all types of literals (including floating-point numbers and big integers) and strings (including ones with internal spaces, as in the above). (Pasting boilerplate into all of my validators and problem statements to parse these things from environment variables does not count as "Just Works.") I am still not entirely convinced that {{this}} syntax is all that bad, though. The above snippets were easy to write, and it is clear at a glance where the global constants are.

@Tagl
Contributor

Tagl commented Sep 23, 2024

I am still not entirely convinced that {{this}} syntax is all that bad,

What about conflicts with, for example, Python's f-strings?

@eldering
Collaborator

eldering commented Sep 23, 2024

I am still not entirely convinced that {{this}} syntax is all that bad,

What about conflicts with, for example, Python's f-strings?

Not really though, right? Those only use a single {this}, and also you can always declare a variable and then use that:

this = {{this}}
this_string = "{{this_string}}"
print(f'{this} and {this_string}')

@niemela
Member

niemela commented Sep 23, 2024

  1. As was mentioned at the CLI, it is annoying that I need to use separate _tex constants, but I am OK with this as there is not any great alternative. The above is already much better than the status quo, as scanning the constants in one place is less error-prone than across multiple files, especially when many people are making changes to bounds to tweak time limit behavior etc.

Another solution would be to write some formatting function in TeX, so that instead of using {{tol_tex}} you would use \format_nicely{{{tol}}}. I would think that due to the simplicity of the _tex constants, they are more likely to be used, and the downside (i.e. that {{tol}} and {{tol_tex}} could actually in theory become out of sync) is mitigated by the fact that they will be defined right next to each other.
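
A rough sketch of such a formatting macro (the macro name is hypothetical, and underscores aren't allowed in TeX macro names; siunitx's \num already parses e-notation, so it could do the heavy lifting):

\usepackage{siunitx}
\sisetup{print-unity-mantissa = false} % so \num{1e-8} prints as 10^{-8}
\newcommand{\formatnicely}[1]{\num{#1}}

% after constant replacement, \formatnicely{{{tol}}} becomes:
... judged correct within relative or absolute error $\formatnicely{1e-8}$.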

@simonlindholm
Member

(Pasting boilerplate into all of my validators and problem statements to parse these things from environment variables does not count as "Just Works.")

FWIW for input validators I somewhat disagree -- CTD can use the custom syntax, and any non-CTD syntax validator will need to use some kind of testlib/template to correctly validate whitespace/no trailing zeroes/etc. because it's too error-prone to hand-roll. If you combine CTD and non-CTD you're probably putting the limits in the CTD file, but even if not it's really not hard to read data from env vars: in Python it's int(os.getenv("MYVAR")) and in C++ atoi(getenv("MYVAR")).

@RagnarGrootKoerkamp
Collaborator

RagnarGrootKoerkamp commented Sep 28, 2024

  • I'm neutral on substitution in jury submissions. I don't mind supporting it, but it's not the critical use case.
  • For LaTeX, introducing \maxa and the like sounds good. It could have a prefix, but I don't think it's really needed.
  • For Python and C++ validators, I'm initially not a big fan of environment variables, but it may come out quite clean. One thing that could then be done is that what is currently v.read_int("a", 0, 1000) could then just become v.read_int("a"), which can read the mina and maxa environment variables automatically. So actually this would lead to less code :) (see the sketch after this list)
  • The question then is whether we should 'standardize' that such bounds are inclusive. It's still easy to mess up a \leq \maxa and a < \maxa. But I think exclusive bounds are quite rare, so maybe a non-issue.
  • This is BAPCtools-specific, but we should also pass these to generators in the same way as we do for validators. It would be nice if command-line args like maxa=100 could in some way cleanly override the equivalent environment variables without needing to include a library.
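
A minimal sketch of that read_int idea, written in C++ for concreteness; the min<name>/max<name> environment naming scheme and the helper names are assumptions:

#include <cstdlib>
#include <iostream>
#include <string>

// look up a bound such as "minn"/"maxn" in the environment (assumed naming)
long long bound(const std::string& var) {
    const char* v = std::getenv(var.c_str());
    if (v == nullptr) {
        std::cerr << "constant " << var << " is not set\n";
        std::exit(1);
    }
    return std::atoll(v);
}

// read an integer and check it against the bounds derived from its name
long long read_int(std::istream& in, const std::string& name) {
    long long lo = bound("min" + name), hi = bound("max" + name);
    long long x;
    if (!(in >> x) || x < lo || x > hi) {
        std::cerr << name << " out of range [" << lo << ", " << hi << "]\n";
        std::exit(1);
    }
    return x;
}

int main() {
    long long n = read_int(std::cin, "n");  // checks minn <= n <= maxn automatically
    std::cout << "read n = " << n << "\n";
}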

@evouga
Collaborator

evouga commented Sep 29, 2024

any non-CTD syntax validator will need to use some kind of testlib/template to correctly validate whitespace/no trailing zeroes/etc.

This is true for output validators, but not always for input validators: one usage pattern is to have CTD handle the input syntax and the C++/Python validators check more complicated semantics assuming correct syntax.

Anyway, I have a negative bias against environment variables, but maybe it is clean? What would my example above look like in this scheme? Would LaTeX also use environment variables, or something else? Would it work when the variables have TeX embedded in them?

@evouga
Collaborator

evouga commented Sep 29, 2024

One thing that could then be done is that what is currently v.read_int("a", 0, 1000) could then just become v.read_int("a"), which can read the mina and maxa environment variables automatically. So actually this would lead to less code :)
The question then is whether we should 'standardize' that such bounds are inclusive. It's still easy to mess up a \leq \maxa and a < \maxa. But I think exclusive bounds are quite rare, so maybe a non-issue.

My sense is that this is too narrow/special-case. Some of the North American judges have very strong opinions about the style of the problem statement...
