Compatibility with Solr 9.7 and Lucene 9.11 (fixes #457)
This one was fun. Lucene 9.11 changed the signature of the `FieldHighlighter` constructor, adding an 8th parameter. This means that we have to call different constructors for different Solr/Lucene versions. Unfortunately, Java requires that the superclass constructor invocation be the first top-level statement in a subclass constructor, so it cannot be wrapped in an `if`/`else`, i.e. dynamic selection is not possible at the Java source level. It is however no problem at the bytecode level, and so we supply a handcrafted (via `jasm`) `FieldHighlighterAdapter.class` that does the dynamic superclass constructor invocation. It's shipped as a precompiled .class (along with the commented .jasm source code and build instructions) since I could not for the life of me figure out how to integrate jasm into the Maven build. Maven woes also required removing the JavaDoc plugin, since it was unable to pick up the .class from the resources directory. Since the plugin is not really used as a library anyway, I think this is an acceptable tradeoff.
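To make the source-level restriction concrete, here is a minimal sketch. `Base`, `Derived` and `SuperCallSketch` are hypothetical stand-ins and not Lucene classes; the two `Base` constructors only mimic an old and a new overload. It shows what Java does allow (varying an argument value inside a single `super(...)` call) next to what it rejects (choosing between overloads at runtime).

```java
public class SuperCallSketch {
    // Hypothetical stand-in for a superclass with an old and a new constructor overload.
    static class Base {
        final int arity;

        Base(String name, Object formatter) {
            this.arity = 2;
        }

        Base(String name, Object formatter, Object comparator) {
            this.arity = 3;
        }
    }

    static class Derived extends Base {
        Derived(String name, boolean legacy) {
            // Legal: a conditional expression inside the argument list. It only varies
            // an argument value; the overload is still fixed at compile time.
            super(name, legacy ? "legacy-formatter" : "new-formatter", null);

            // Rejected by javac, because super(...) may not be nested in a conditional:
            // if (legacy) { super(name, null); } else { super(name, null, null); }
        }
    }

    public static void main(String[] args) {
        // Both calls end up in the three-argument overload.
        System.out.println(new Derived("ocr_text", true).arity);
        System.out.println(new Derived("ocr_text", false).arity);
    }
}
```

Since the overload is resolved by the compiler, the only way to defer that choice to runtime within a single class is to emit the two `invokespecial` instructions by hand, which is what the `.jasm` file in this commit does.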
Showing 8 changed files with 156 additions and 41 deletions.
10 changes: 10 additions & 0 deletions
src/main/java/com/github/dbmdz/solrocr/util/LuceneVersionInfo.java
@@ -0,0 +1,10 @@
package com.github.dbmdz.solrocr.util;

import org.apache.lucene.util.Version;

public class LuceneVersionInfo {
  public static boolean versionIsBefore(int major, int minor) {
    return Version.LATEST.major < major
        || (Version.LATEST.major == major && Version.LATEST.minor < minor);
  }
}
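The comparison in `versionIsBefore` can be tried out without Lucene on the classpath. The sketch below is an illustration only: `VersionCheckSketch` and `isBefore` are hypothetical names, and the running version is passed in explicitly instead of being read from `Version.LATEST`.

```java
public class VersionCheckSketch {
    /**
     * Returns true if the version (curMajor.curMinor) is strictly older than
     * (major.minor). Same logic as versionIsBefore, but with the current
     * version passed in as parameters (an assumption made for testability).
     */
    static boolean isBefore(int curMajor, int curMinor, int major, int minor) {
        return curMajor < major || (curMajor == major && curMinor < minor);
    }

    public static void main(String[] args) {
        System.out.println(isBefore(9, 10, 9, 11)); // true: 9.10 would take the 7-parameter branch
        System.out.println(isBefore(9, 11, 9, 11)); // false: 9.11 would take the 8-parameter branch
        System.out.println(isBefore(10, 0, 9, 11)); // false: a newer major version is never "before"
    }
}
```

Only major and minor are compared, so bugfix releases of the same minor version all land on the same branch.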
Binary file added: +1.17 KB
src/main/resources/com/github/dbmdz/solrocr/lucene/FieldHighlighterAdapter.class
Binary file not shown.
105 changes: 105 additions & 0 deletions
src/main/resources/com/github/dbmdz/solrocr/lucene/FieldHighlighterAdapter.jasm
@@ -0,0 +1,105 @@
/** This is a small adapter class that allows inheriting from
 * FieldHighlighter in Lucene versions both older than 9.11 and
 * newer. There was a breaking change in Lucene 9.11 that added
 * an 8th parameter to the constructor, breaking backwards
 * compatibility.
 *
 * This cannot be worked around at the Java source level, due to
 * strict requirements surrounding the `super()` call in a
 * subclass constructor.
 *
 * However, using JVM bytecode, we can easily work around that
 * and implement a class that can dynamically select the superclass
 * constructor based on the Lucene version.
 */
public class com/github/dbmdz/solrocr/lucene/FieldHighlighterAdapter
extends org/apache/lucene/search/uhighlight/FieldHighlighter {

  /** This constructor is a simple adapter that forwards the
   * parameters to the correct superclass constructor based on
   * the Lucene version.
   *
   * The bytecode corresponds to the following (illegal) Java code:
   *
   * ```java
   * public FieldHighlighterAdapter(
   *     String fieldName,
   *     FieldOffsetStrategy fieldOffsetStrategy,
   *     PassageScorer passageScorer,
   *     int maxPassages,
   *     int maxNoHighlightPassages
   * ) {
   *   if (LuceneVersionInfo.versionIsBefore(9, 11)) {
   *     super(fieldName, fieldOffsetStrategy, null, passageScorer, maxPassages, maxNoHighlightPassages, null);
   *   } else {
   *     super(fieldName, fieldOffsetStrategy, null, passageScorer, maxPassages, maxNoHighlightPassages, null, null);
   *   }
   * }
   * ```
   *
   * @param fieldName The name of the field to highlight
   * @param fieldOffsetStrategy The strategy to use for field
   *                            offsets
   * @param passageScorer The scorer to use for passages
   * @param maxPassages The maximum number of passages to return
   * @param maxNoHighlightPassages The maximum number of passages
   *                               to return if no highlighting is possible
   */
  public <init>(
    java/lang/String,
    org/apache/lucene/search/uhighlight/FieldOffsetStrategy,
    org/apache/lucene/search/uhighlight/PassageScorer,
    int,
    int
  ) void {

    // Load common constructor parameters from method params onto the stack
    aload 0      // `this`, i.e. object reference
    aload 1      // String fieldName
    aload 2      // FieldOffsetStrategy
    aconst_null  // BreakIterator
    aload 3      // PassageScorer
    iload 4      // int maxPassages
    iload 5      // int maxNoHighlightPassages
    aconst_null  // PassageFormatter

    // Check if Lucene version is lower than 9.11
    bipush 9     // major version
    bipush 11    // minor version
    invokestatic com/github/dbmdz/solrocr/util/LuceneVersionInfo.versionIsBefore(int, int) boolean

    // go to new constructor if return value was false
    ifeq NEW_CONSTRUCTOR

    // Version check indicated a version <9.11, so we call the old
    // constructor signature with 7 parameters
    invokespecial org/apache/lucene/search/uhighlight/FieldHighlighter.<init>(
      java/lang/String,
      org/apache/lucene/search/uhighlight/FieldOffsetStrategy,
      java/text/BreakIterator,
      org/apache/lucene/search/uhighlight/PassageScorer,
      int,
      int,
      org/apache/lucene/search/uhighlight/PassageFormatter
    ) void
    goto BEACH

  NEW_CONSTRUCTOR:
    // Lucene versions >= 9.11 (Solr >= 9.7) need a Comparator as the
    // 8th parameter for the new constructor signature
    aconst_null  // Comparator
    invokespecial org/apache/lucene/search/uhighlight/FieldHighlighter.<init>(
      java/lang/String,
      org/apache/lucene/search/uhighlight/FieldOffsetStrategy,
      java/text/BreakIterator,
      org/apache/lucene/search/uhighlight/PassageScorer,
      int,
      int,
      org/apache/lucene/search/uhighlight/PassageFormatter,
      java/util/Comparator
    ) void

  BEACH:
    return
  }
}
27 changes: 27 additions & 0 deletions
src/main/resources/com/github/dbmdz/solrocr/lucene/README.md
@@ -0,0 +1,27 @@
`FieldHighlighterAdapter.class` is a hand-crafted Java class that dynamically selects a
`FieldHighlighter` constructor based on the Solr/Lucene version.

This is necessary because the `FieldHighlighter` class changed its constructor signature
between Solr 9.6 and 9.7 (Lucene 9.11), introducing an 8th parameter.

Since dynamically selecting a superclass constructor to call isn't possible at the Java source
level, we have to drop down to the bytecode level to achieve this.

The class file is defined in the `FieldHighlighterAdapter.jasm` file, which is a [jasm][1] assembly
file.

To compile the class file, [download `jasm`][2] and run it on the `.jasm` file from the project root:

```bash
$ ./jasm-0.7.0/bin/jasm src/main/resources/com/github/dbmdz/solrocr/lucene/FieldHighlighterAdapter.jasm
```

I tried for multiple hours to get this to build automatically as part of the Maven build, but had
to give up; Maven is just too painful for this sort of thing.

As to why we don't use the same bytecode patching technique as we did for the Solr 7 -> 8 API breakage:
this is a breakage within a single major version. If we created a separate JAR for
each and every one of those, we'd end up with a huge number of JARs over the years, which is not ideal.

[1]: https://github.com/roscopeco/jasm
[2]: https://github.com/roscopeco/jasm/releases