Generex currently replaces predefined character classes by wrapping them in square brackets: \d becomes [0-9].
However, if the \d is already inside a character class expression, then [\d] becomes [[0-9]]. java.util.regex.Pattern compiles this correctly, but dk.brics.automaton.Automaton, which Generex uses, does not.
Simple regex replacement is apparently not enough. It looks like contextual replacement is needed: scanning char by char, tracking whether \d is inside [..] and whether the \ has already been escaped.
Input: [\d] (Java String literal "[\\d]")
Expected output:
transformed regex [0-9]
all matched strings:
0
1
2
3
4
5
6
7
8
9
Actual output:
transformed regex [[0-9]]
all matched strings:
0]
1]
2]
3]
4]
5]
6]
7]
8]
9]
[]
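The contextual replacement described above could look roughly like the following. This is only a minimal sketch, not Generex's actual code: the class and method names (PredefinedClassExpander, expandDigits) are hypothetical, only \d is handled, and nested classes or a literal ] at the start of a class are deliberately ignored.

```java
public final class PredefinedClassExpander {

    /**
     * Hypothetical helper: expands \d to 0-9 inside a character class
     * and to [0-9] outside of one, scanning the regex char by char.
     */
    public static String expandDigits(String regex) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // true while between an unescaped [ and ]
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\' && i + 1 < regex.length()) {
                char next = regex.charAt(++i); // consume the escaped character
                if (next == 'd') {
                    out.append(inClass ? "0-9" : "[0-9]");
                } else {
                    // Any other escape (including an escaped backslash) is copied unchanged.
                    out.append(c).append(next);
                }
            } else {
                if (c == '[') {
                    inClass = true;
                } else if (c == ']') {
                    inClass = false;
                }
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, the regex [\d] becomes [0-9], a bare \d becomes [0-9], and \\d (an escaped backslash followed by d) is left unchanged, because each escape is consumed as a pair.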
It should also be taken into consideration that the backslash in front of the class letter may itself be escaped.
Input: \\d (Java String literal "\\\\d")
Expected output:
transformed regex: \\d (no change)
matched string: \d
Actual output:
transformed regex: \[0-9]
matched string: [0-9]
Proposed solution
Pattern.compile(
    "(?<!\\\\)" +               // (?<!\\)  no preceding backslash allowed
    "(?<slashes>(\\\\\\\\)*)" + // (\\\\)*  literal backslash allowed {0,} times
    "\\\\d"                     // \\d      single backslash and the character class letter
);
Don't forget to put the captured slashes group back when replacing. The escaped backslashes have to be captured (in pairs) because Java's negative look-behind does not allow an unbounded * repetition.
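To illustrate how the captured group might be used in the replacement, here is a minimal sketch. The class and method names (DigitClassReplacer, replaceDigits) are hypothetical and not part of Generex; only the pattern itself comes from the proposal above.

```java
import java.util.regex.Pattern;

public final class DigitClassReplacer {

    // The proposed pattern, written as a Java string literal.
    private static final Pattern DIGIT_CLASS = Pattern.compile(
            "(?<!\\\\)"                 // no preceding backslash
            + "(?<slashes>(\\\\\\\\)*)" // zero or more escaped backslashes, in pairs
            + "\\\\d");                 // a single backslash followed by d

    /** Hypothetical helper: replaces unescaped \d with [0-9]. */
    public static String replaceDigits(String regex) {
        // ${slashes} re-inserts the captured backslash pairs in front of the expansion.
        return DIGIT_CLASS.matcher(regex).replaceAll("${slashes}[0-9]");
    }
}
```

With this sketch, \d becomes [0-9], \\d is left unchanged, and \\\d becomes \\[0-9]. Note that it only addresses the escaped backslash case: [\d] is still turned into [[0-9]], so tracking the character class context is needed in addition.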