Clarify the behavior of the default output validator and the semantics of the default output validator flags #348

evouga · 2024-10-10T19:00:42Z

The behavior of the default output validator is not well specified.

What output encoding does it require? UTF-8 without BOM? How is the output tokenized? How is the "type" of the token (floating-point vs. string, I guess? Maybe integer too, see below?) determined?
Are integers compared as strings (with leading zeros, hexadecimal encoding, etc. being judged as incorrect), or is there some special handling for "integer-type" tokens?
case_sensitive: notoriously underspecified. How exactly is case-insensitive comparison performed? I assume that this flag only applies to tokens of "string type" and not floating-point?
space_change_sensitive: The default behavior is mentioned only in a parenthetical remark ("the default is that any sequence of 1 or more whitespace characters are equivalent"), but this is incomplete since it does not mention leading or trailing whitespace (which I understand the current default validator ignores by default). Moreover, "indicates that changes in the amount of whitespace should be rejected" is imprecise: do changes in the type of whitespace (e.g. newlines to spaces) cause submissions to be rejected? What counts as whitespace anyway? The 25 whitespace Unicode code points?
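
To make these ambiguities concrete, here is a small Python sketch. It is not the default validator's implementation; the helper names (`split_ascii`, `split_unicode`, `looks_like_float`) are hypothetical and exist only to show that reasonable readings of the current wording give different verdicts.

```python
# Illustrative sketch only; NOT the actual default validator.
# All helper names below are hypothetical.

ASCII_WHITESPACE = " \t\n\r\v\f"


def split_ascii(text):
    """Tokenize on ASCII whitespace only; leading/trailing runs are dropped."""
    tokens, current = [], []
    for ch in text:
        if ch in ASCII_WHITESPACE:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def split_unicode(text):
    """Tokenize on everything Python's str.split() treats as whitespace."""
    return text.split()


def looks_like_float(token):
    """One possible rule for deciding whether a token is 'floating-point'."""
    try:
        float(token)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Whitespace: is NO-BREAK SPACE (U+00A0) a separator or part of a token?
    nbsp = "a\u00a0b"
    print(split_ascii(nbsp), split_unicode(nbsp))

    # Case: simple lowercasing and full case folding disagree on some strings.
    print("STRASSE".lower() == "straße".lower())
    print("STRASSE".casefold() == "straße".casefold())

    # Integers: string comparison and numeric comparison disagree.
    print("007" == "7", int("007") == int("7"))

    # Token type: which spellings count as floating-point?
    print(looks_like_float("0x10"), looks_like_float("1e3"), looks_like_float("nan"))
```

Each pair of checks is a case where two plausible implementations would judge the same submission differently, which is why I think the specification should pin down one behavior explicitly.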
