-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime JIT support detection #157
base: master
Are you sure you want to change the base?
Conversation
9dee7e8
to
6f5c4da
Compare
In create_tempfile() we look for a suitable place to put the temporary file into and, among others, look at $TMPDIR. If the value of this environment variable exceeds the bounds of the local tmp_name[] buffer, we ignore it. However, we still change the value of 'tmp_name_len' which leads to follow-up errors. On debug builds this can lead to hitting the assertion as can be seen below: $ TMPDIR=$(perl -e 'print "A"x1024') ./bin/array_access Assertion failed at sljit_src/sljitProtExecAllocator.c:147 Aborted For non-debug builds, however, this can lead to a memory corruption, by abusing the fact that we change a trailing '/' to '\0' later on. With a sufficiently high enough value for 'tmp_name_len' this can corrupt stack frames up in the call chain. Fix this by setting 'tmp_name_len' only if value it is based on is found to be valid -- just like it was prior to commit 98323bd82218. Fixes: 98323bd82218 ("protexec: refactor create_tempfile() (PCRE2Project#37)") Signed-off-by: Mathias Krause <[email protected]>
SELinux or PaX/grsecurity based kernels may deny creating writable and executable mappings, leading to errors when trying to allocate JIT memory, even though JIT support is generally available. Provide a function to probe for the runtime availability of rwx maps to support users like libpcre2 which can use it to announce the lack of JIT and fall back to the interpreter instead. This function is only needed for Linux and only if we're using the default JIT memory allocator, as all others implement workarounds via double mappings. Signed-off-by: Mathias Krause <[email protected]>
SELinux or PaX/grsecurity based kernels may deny creating writable and executable mappings, leading to errors when trying to allocate JIT memory, even though JIT support is generally available. Instead of failing hard in this case, allow to use the interpreter fallback by probing and announcing the availability of JIT mode at runtime through the PCRE2_CONFIG_JIT hook. This only happens for configurations using only the default JIT memory allocator, i.e. not the SELinux aware one. However, we still mark the latter as experimental and distributions like Debian don't enable it. The current behaviour leads to nasty user visible errors on such systems, e.g. when running 'git grep': $ git grep peach fatal: Couldn't JIT the PCRE2 pattern 'peach', got '-48' With this change in place, it'll fall back to the interpreter and "just work", providing a much more pleasant user experience. Signed-off-by: Mathias Krause <[email protected]>
6f5c4da
to
c2dfa1e
Compare
How can you tell the difference between "jit is not compiled" and "jit needs extra permissions"? If it is not possible, this direction is a feature loss. |
Easy: #ifdef SUPPORT_JIT
printf("JIT support compiled in %s\n", !!PRIV(jit_supported)() ? "and runtime supported" : "but runtime disabled");
#else
printf("JIT support not compiled in\n");
#endif
What kind of feature loss? Users of PCRE2 just need to know if JIT will be working or not. Everything else doesn't matter. It's like as if one's complaining that glibc has an AVX2 implementation of memcpy() but that can't be used on an Atom box. |
This is pcre2 code, SUPPORT_JIT is unknown for a library user. Is there another way to do it? |
We could change the API and make
However, that's an ABI-breaking and backwards incompatible change, as users such as But what's the intended use case anyway? Why would a user of PCRE2 need to know why JIT isn't working? |
Some (future) OS may allow giving exec rights on demand, so the application sees it cannot create executable memory, but it can requests the OS to do so (e.g a modal form requests admin password), and then retry the compilation. Furthermore the application can give different suggestions how to enable jit based on the return value of the config. Offline (pic) code generation might also be possible at some point similar to saving compiled patterns. Honestly the whole who has which rights are out-of-scope from the perspective of jit. It would be good to have a separate library which handles / manages these things. Maybe there are some libraries for that. |
I agree, but this has nothing to do with this PR. That future OS may never materialize and if it does, it better handles the modal request dialog during the syscall in question. Everything else (application retrying syscalls (how often?)) makes no sense. Anything else that blocks this bug from getting fixed? |
This is an api breaking feature request, not a bug. The current code is working as intended, and this case can be handled as well. |
Oh, sorry, I must be missing something. I tried very hard to not break the API (nor ABI). Where does the change do that? (I guess you don't mean f852788 which fixes a memory corruption bug that might be considered "API.") And I have to disagree with you, again, the current code is not working as intended, as outlined multiple times already: c2dfa1e, zherczeg/sljit#136 (comment) (and various other comments of mine in that issue). |
I don't mind if you disagree with me. The patch collection changes the behaviour of PCRE2_CONFIG_JIT, so it is an api breaking change. It is like changing the fread to work like fwrite. The fix part can be landed. You could work on a library which doing things you like. You could create a function there which checks PCRE2_CONFIG_JIT and adds your extra checks for your platform. It can also update the system config automatically when needed, filter the messages you don't like, etc. |
The patch changes the behavior of
My understanding of this wording is that a call to I mean, after all it's a function call. If one would merely be interested if JIT support is compiled in (in contrast of being supported), a C preprocessor define would be sufficient. But it's a call, so looking at the current runtime constraints is very much in line with answering the question "Does libpcre2 support JIT compilation (for this very process)?". So IMO it's an API fix, not an API break. |
The pcre2_config() function is intended to tell a client what configuration options were used when PCRE2 was compiled - hence its name. So it returns 1 or 0 according as SUPPORT_JIT was defined or not at PCRE2 compile time. A client cannot use a C preprocessor test because SUPPORT_JIT is not included in pcre2.h. If pcre2_config() returns 0, you know that calls to JIT functions will never succeed. If it returns 1, they might - the only way to find out is to try. |
Even if the description is unclear, the code worked this way since beginning, and existing software rely on this. |
What's the use case of Isn't it much more sensible to make |
Why x86 32 bit has only 8 registers, when it is totally inefficient? It is designed that way and compatibility must be maintained. Your use case is valid, but it seems it works well as a separate project. You create a library, which checks user rights, and this library could provide an extended version of |
The pcre2_jit_config() function has always just reflected the state of SUPPORT_JIT at compile time. I have made a note to add more explanation to the documentation at some point. Changing this is not something I would want to do - PCRE2 is very widely used and I know from past experience that changes of this sort, however minor, can cause issues. // Note that there have been times in the past when pcre2_jit_compile() worked for some patterns but not for others - when new features were added to the interpreter before they were supported in JIT. Therefore, the only way to tell if JIT will work for this pattern is to try it. |
I'm not arguing against handling errors of pcre2_jit_*() functions, not at all! I'm just saying, it makes no sense to announce JIT support when we know for sure that all patterns will fail (because the JIT cannot allocate memory for its code generation). Or, to get back to @zherczeg's x86 analogy, what sense does it make trying to execute AES-NI instructions on a i486 and force the caller to handle invalid opcode exceptions -- just because some crypto library ships an AES-NI optimized implementation along with a generic one? |
I cannot say anything about the "makes sense" part. For some people, something makes sense, and for the others the opposite is true. As for pcre, API compatibility is one of the most important aspect. When we couldn't extend pcre anymore, we have created pcre2. So the next time we do breaking changes will be pcre3 (no plans for that at the moment). Maybe we should discuss this topic then. |
FWIW, It's a gross hack, but the best Another observation, while developing the workaround, it became clear that the PCRE2 API simply doesn't allow to tell the following two cases apart: JIT compiling a pattern failed because...
Both cases will simply fail with I still think this detection logic belongs to PCRE2/SLJIT, as that's where the JIT allocator lives and where all required details are known. Users of PCRE2 should have no need to guess or poke around any of this. It's an internal implementation detail of PCRE2/SLJIT after all. But I got it, you think differently, so we have to live with what we have. However, could you please tell how to distinguish error cases 1 and 2 above without making use of heuristics to maybe detect case 1? |
I think tools use a very simple pattern, e.g. an empty string pattern to check initial allocation. But it is possible to an return error code when compiling is failed because of an unsupported pattern. There are various reasons for that, but maybe 1-2 return values should be enough to cover all cases. I will think about it. |
Can you provide examples of such tools so I've something to compare with and also confirm that this is considered PCRE2 API, i.e. will be a sufficient test that can detect JIT runtime support for current, future and past versions of PCRE2? Regarding the error codes, I guess changing them is no option for PCRE2, as that would break the API and, from what I got, that's out of scope. |
Yes, we cannot change existing error codes, only add new ones. Existing applications will not understand the new code, they will simply accept something is wrong. I will think about the best option. |
Even adding new ones will break existing applications, as pointed out in #213 (review). |
Trying to resurrect the discussion... Even though my urgent needs where solved by implementing the First of all, the runtime probing hack is needed just because The probing code itself is, well, relying on a specific error code which isn't even uniquely identifying the case (and further may break in the future, e.g. via changes like #213, causing breakage in other applications checking the error code as well) and assuming JIT compiling the pattern But the biggest issue I have with the hack, it needs to be repeated for each and every project making use of PCRE2 (or SLJIT) and that's quite cumbersome. Having it implemented right at the core, within SLJIT, actually, feels much more sensible than having to redo the same thing for each user. I did a Debian Code search -- which is by no means a complete view but should provide a data point for what Debian and derived distribution are making use of -- and here are examples that still need patches and are currently broken if
The following examples do handle the error and fallback to the interpreter, however, do so in a suboptimal way:
If, instead, PCRE2 would be changed as proposed, to do a test at runtime for The error handling examples would also be better off by avoiding unnecessary error message or even log flood spam, as is the case for NGINX or R. That's why I'm asking you again to reconsider the (purely theoretical?) concerns about changing the semantics of From above examples it's also clear that the assumption of a lot of projects is that |
#433 is a good fix for this |
No, it's not. It's adding a new option for |
You can call it after |
Adding a call is completely missing the point of not having to modify the usage side. |
This PR adds runtime detection for the system's support of JIT memory allocations and allows users of pcre2 to fall back to the interpreter mode if rwx mappings are prohibited.
The most prominent user would be
git grep
currently failing hard on such systems. With this change in place, it simply uses the interpreter mode and "just works."The first two commits are also send as a PR to the
sljit
repository. So you can either wait until these are merged and integrate the changes through the upstream project or you take all three of them directly through this PR.Thanks!