Implement concurrent highlighting on multiple threads (#64) #429

Merged on May 24, 2024 (8 commits)
9 changes: 9 additions & 0 deletions docs/changes.md
@@ -1,3 +1,12 @@
## Unreleased

**Changed:**
- Added support for multithreaded highlighting. It uses all available logical CPU cores by default and
can be tuned with the `numHighlightingThreads` and `maxQueuedPerThread` attributes on the
`OcrHighlightComponent` in `solrconfig.xml`.
- Removed `PageCacheWarmer`, which is no longer needed now that highlighting is multithreaded.


## 0.8.5 (2024-04-25)
[GitHub Release](https://github.com/dbmdz/solr-ocrhighlighting/releases/tag/0.8.5)

52 changes: 20 additions & 32 deletions docs/performance.md
@@ -16,13 +16,17 @@ Before you start tuning the plugin, it is important to spend some time on analyz

- Check Solr queries with `debug=timing`: How much of the response time is actually spent in the OCR highlighting
component?
- On newer Linux kernels (4.20 and later), check the Pressure Stall Information (PSI) metrics with `htop` or by looking
at `/proc/pressure/{io,cpu}`. This can indicate whether the system is I/O-bottlenecked or
CPU-bottlenecked.
- On the operating system level (if you're on a Linux system), use [BCC Tools](https://github.com/iovisor/bcc),
especially `{nfs/xfs/ext/...}slower` and `{nfs/xfs/ext/...}dist` to check if the performance issues are due to I/O
latency.

## Storage Layer
The plugin spends a lot of time on randomly reading small sections of the target files from disk. This means that
the performance characteristics of the underlying storage system have a huge effect on the performance of the plugin.
The plugin spends a lot of time reading small sections of the target files from disk. This means that
the performance characteristics of the underlying storage system have a huge effect on the performance
of the plugin.

Important factors include:

@@ -33,36 +37,20 @@ Generally speaking, local storage is better than remote storage (like NFS or CIF
flash-based storage is better than disk-based storage, due to the lower random read latency and the possibility to
do parallel reads. A RAID1/10 setup is preferred over a RAID0/JBOD setup, due to the increased potential for parallel reads.

## Plugin configuration
The plugin offers the possibility to perform a **concurrent read-ahead of highlighting target files**. This will perform
"dummy" reads on multiple parallel threads, with the intent to fill the operating system's page cache with the contents
of the highlighting targets, so that the actual highlighting process is performed on data read from the cache (which
resides in main memory). This is mainly useful for storage layers that benefit from parallel reads, since the highlighting
process is strongly sequential and performing the read-ahead concurrently can reduce latency.

To enable it, add the `enablePreload=true` attribute on the OCR highlighting component in your core's `solrconfig.xml`.
It is important to accompany this with benchmarking and monitoring, since the available settings should be tuned to the
environment:

- `preloadReadSize`: Size in bytes of read-ahead block reads, should be aligned with file system block size
(or `rsize` for NFS file systems). Defaults to `32768`.
- `preloadConcurrency`: Number of threads to perform read-ahead. Optimal settings have to be determined via
experimentation. Defaults to `8`.

This approach relies on the OS-level page cache, so make sure you have enough spare RAM available on your machine to
actually benefit from this! Use BCC's `*slower` tools to verify that it's a `solr-ocrhighlight` thread that performs
most of the reads and not the actual query thread (`qtp....`). If you run the same query twice, you shouldn't see a lot
of reads from either the `qtp...` or `solr-ocrhighlight` threads on the second run.

Example configuration tuned for remote NFS storage mounted with `rsize=65536`:
```xml
<searchComponent
class="solrocr.OcrHighlightComponent"
name="ocrHighlight" enablePreload="true" preloadReadSize="65536"
preloadConcurrency="8"
/>
```

## Concurrency
The plugin can read multiple files in parallel and also process them concurrently. By default, it will
use as many threads as there are available logical CPU cores on the machine, but this can be tweaked
with the `numHighlightingThreads` and `maxQueuedPerThread` parameters on the `OcrHighlightComponent`
in your `solrconfig.xml`. Tune these parameters to match your hardware and storage layer.

- `numHighlightingThreads`: The number of threads that will be used to read and process the OCR files.
Defaults to the number of logical CPU cores. Set this higher if you're I/O-bottlenecked and can
support more parallel reads than you have logical CPU cores (very likely for modern NVMe drives).
- `maxQueuedPerThread`: By default, only a limited number of documents is queued per thread, so as not
to stall other requests. If this number is reached, all highlighting will be done single-threaded on
the request thread. You usually don't have to touch this setting, but if you have large result sets
with many concurrent requests, it can help to reduce the number of threads that are active at
the same time, at least as a stopgap.
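
As a rough sketch, both attributes could be set on the component as shown below. The attribute names are the ones described above; the `class` and `name` values follow the component definition used elsewhere in these docs, and the numeric values are purely illustrative assumptions, not recommendations or defaults. Benchmark against your own hardware and storage before settling on them.

```xml
<!-- Illustrative values only (assumed, not defaults): 16 threads for an
     I/O-bound setup on fast storage, and a per-thread queue limit of 8. -->
<searchComponent
  class="solrocr.OcrHighlightComponent"
  name="ocrHighlight"
  numHighlightingThreads="16"
  maxQueuedPerThread="8"
/>
```

Leaving both attributes out should keep the default behavior described above, with one thread per logical CPU core.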

## Runtime configuration
Another option to influence the performance of the plugin is to tune some runtime options for highlighting.