From 292bf2e633fa8edd511f79fb4a69732fe4a6e99e Mon Sep 17 00:00:00 2001
From: matthewb-splunk <145702538+matthewb-splunk@users.noreply.github.com>
Date: Wed, 27 Sep 2023 00:56:58 -0400
Subject: [PATCH] The bug is in the generation of a call relative instruction
 (because this is PIC code). Call relative of 0 will run the next instruction.
 So call relative is relative to the next instruction, not the current
 instruction. (#208)

In x86_64 (ia32e) the call relative instruction can have a 32-bit or a 64-bit operand. Both versions are allowed (the 16-bit version is not). The 32-bit operand is preferred as the instruction is shorter. For this reason the JIT code in PCRE2 will attempt to use the 32-bit version when it can, and only fall back to the 64-bit version when it must. This code can also potentially use several other jump instructions, depending on the circumstances, including an 8-bit jump instruction.

The calculation to decide whether to use the 64-bit jump instruction looks like this: if ((sljit_sw)(label_addr - (jump->addr + 1)) > HALFWORD_MAX || (sljit_sw)(label_addr - (jump->addr + 1)) < HALFWORD_MIN)

jump->addr refers to the location in the instruction where the jump address is stored. In the case of the 32-bit call this is a 4-byte address.
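
To illustrate why the displacement is relative to the next instruction, here is a minimal sketch of encoding a 5-byte `call rel32` (opcode 0xE8 followed by a 4-byte little-endian displacement). The helper name encode_call_rel32 and its parameters are hypothetical and are not part of SLJIT; only the encoding itself is taken from the x86 instruction set.

```c
#include <stdint.h>

#define CALL_REL32_LEN 5 /* 1 opcode byte (0xE8) + 4 displacement bytes */

/* Hypothetical helper, not SLJIT code: encode "call rel32" located at
 * insn_addr so that it transfers control to target.
 * Returns 0 on success, -1 if the displacement does not fit in 32 bits. */
static int encode_call_rel32(uint8_t *buf, uint64_t insn_addr, uint64_t target)
{
	/* The displacement is measured from the END of the instruction,
	 * i.e. from the address of the next instruction. */
	int64_t disp = (int64_t)(target - (insn_addr + CALL_REL32_LEN));
	uint32_t d;

	if (disp > INT32_MAX || disp < INT32_MIN)
		return -1;

	d = (uint32_t)(int32_t)disp;
	buf[0] = 0xE8;
	buf[1] = (uint8_t)(d & 0xff);
	buf[2] = (uint8_t)((d >> 8) & 0xff);
	buf[3] = (uint8_t)((d >> 16) & 0xff);
	buf[4] = (uint8_t)((d >> 24) & 0xff);
	return 0;
}
```

With this sketch, a call placed at 0x1000 that targets 0x1005 (the next instruction) is encoded with a displacement of 0, while a call that targets its own first byte needs a displacement of -5.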

The problem is that this calculation ignores the space the address itself takes (and I believe there is a case where the instruction can be 2 bytes long; see the last else case of type == SLJIT_JUMP).

The longest this instruction can be (if we go with 32-bit or smaller jumps) is 6 bytes, and the shortest is 2 bytes. Since we don't know which we will pick yet, we should be conservative and ensure the 32-bit jump will work regardless of whether the instruction is 6 bytes or 2 bytes long. The instruction can never be only 1 byte long.

This calculation should thus look like this: if ((sljit_sw)(label_addr - (jump->addr + 2)) > HALFWORD_MAX || (sljit_sw)(label_addr - (jump->addr + 6)) < HALFWORD_MIN)

The first case checks a jump to a larger address. Here we want to check against the smallest possible starting address, which gives the largest possible total jump. The second case checks a jump to a smaller address. Here we want to check against the largest possible starting address, which again gives the longest possible jump distance. Note that the change to the first case is not strictly necessary: +1 is merely overly conservative by one byte.
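
The boundary case can be sketched as a standalone C program fragment. This is a simplified model, not the SLJIT source: it assumes the displacement is measured from jump_addr + len, where len is the final instruction length (2 to 6 bytes, as described above), and the helper names needs_far_jump_old, needs_far_jump_new, fits_rel32 and boundary_example are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HALFWORD_MAX 0x7fffffffLL
#define HALFWORD_MIN (-HALFWORD_MAX - 1)

/* Hypothetical helper: the range check as it was before this patch. */
static int needs_far_jump_old(uint64_t label_addr, uint64_t jump_addr)
{
	int64_t d = (int64_t)(label_addr - (jump_addr + 1));
	return d > HALFWORD_MAX || d < HALFWORD_MIN;
}

/* Hypothetical helper: the range check as changed by this patch. */
static int needs_far_jump_new(uint64_t label_addr, uint64_t jump_addr)
{
	return (int64_t)(label_addr - (jump_addr + 2)) > HALFWORD_MAX
		|| (int64_t)(label_addr - (jump_addr + 6)) < HALFWORD_MIN;
}

/* Assumed model: the displacement that is actually encoded is relative to
 * the end of the instruction, which is len bytes after jump_addr. */
static int fits_rel32(uint64_t label_addr, uint64_t jump_addr, unsigned len)
{
	int64_t d = (int64_t)(label_addr - (jump_addr + len));
	return d <= HALFWORD_MAX && d >= HALFWORD_MIN;
}

/* A backward jump sitting exactly on the boundary of the old check. */
static void boundary_example(void)
{
	uint64_t jump_addr = 0x100000000ULL;
	uint64_t label_addr = 0x80000001ULL; /* jump_addr + 1 + HALFWORD_MIN */

	/* The old check accepts a near jump here ... */
	assert(!needs_far_jump_old(label_addr, jump_addr));
	/* ... but a 5-byte instruction would need a displacement of
	 * HALFWORD_MIN - 4, which does not fit in 32 bits. */
	assert(!fits_rel32(label_addr, jump_addr, 5));
	/* The new check correctly requests the far jump. */
	assert(needs_far_jump_new(label_addr, jump_addr));
}
```

Under this model, whenever needs_far_jump_new returns 0 the displacement fits for every instruction length from 2 to 6 bytes, which is the conservative property the patch aims for.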
---
 sljit_src/sljitNativeX86_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sljit_src/sljitNativeX86_common.c b/sljit_src/sljitNativeX86_common.c
index 3e8cf1f6..c894fb45 100644
--- a/sljit_src/sljitNativeX86_common.c
+++ b/sljit_src/sljitNativeX86_common.c
@@ -607,7 +607,7 @@ static sljit_u8* generate_near_jump_code(struct sljit_jump *jump, sljit_u8 *code
 		label_addr = jump->u.target - (sljit_uw)executable_offset;
 
 #if (defined SLJIT_CONFIG_X86_64 && SLJIT_CONFIG_X86_64)
-	if ((sljit_sw)(label_addr - (jump->addr + 1)) > HALFWORD_MAX || (sljit_sw)(label_addr - (jump->addr + 1)) < HALFWORD_MIN)
+	if ((sljit_sw)(label_addr - (jump->addr + 2)) > HALFWORD_MAX || (sljit_sw)(label_addr - (jump->addr + 6)) < HALFWORD_MIN)
 		return generate_far_jump_code(jump, code_ptr);
 #endif