Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure scanning large image #889

Closed
sammarshallou opened this issue Apr 5, 2023 · 8 comments
Closed

Failure scanning large image #889

sammarshallou opened this issue Apr 5, 2023 · 8 comments

Comments

@sammarshallou
Copy link

Describe the bug

When using a very large image (attached), 11840x7500, 9MB JPEG:

We get an error scanning. This is using the clamav/clamav:latest from https://hub.docker.com/layers/clamav/clamav/latest/images/sha256-eb1057073c6178d3e7f3e6e1edb43feec84db6809f12fbf900d9f7056380793a?context=explore as of yesterday.

The error happens with multiple images of this size (originally we had a user report, this test image is one I made).

It does not happen with normal-sized images, nor with other files that aren't images, even if they are larger (eg 13MB .dll file).

We are using a Moodle installation to do this and it fails across multiple different instances (test, live, etc). I tried downloading the latest 'portable win64' ClamAV standalone installation, but it is able to successfully scan this file.

How to reproduce the problem

On our Moodle system connected to a ClamAV container (version as above), upload the attached file; it fails with the 'error scanning file' popup message. (This is different from the 'your file has a virus' popup message, but that's the only information I have.)

Attachments

BIG

@micahsnyder
Copy link
Contributor

@sammarshallou I am unable to reproduce any failure scanning the attached image with clamav/clamav:latest.

I tried this first:


❯ docker run -it --mount type=bind,source=/home/micah/Downloads,target=/downloads clamav/clamav:latest /usr/bin/clamscan /downloads/230097030-f8f997ea-030f-45a1-b3b1-d2dab8bbbe5e.jpg
Loading:    14s, ETA:   0s [========================>]    8.66M/8.66M sigs
Compiling:   3s, ETA:   0s [========================>]       41/41 tasks

/downloads/230097030-f8f997ea-030f-45a1-b3b1-d2dab8bbbe5e.jpg: OK

----------- SCAN SUMMARY -----------
Known viruses: 8660651
Engine version: 1.0.1
Scanned directories: 0
Scanned files: 1
Infected files: 0
Data scanned: 9.56 MB
Data read: 8.98 MB (ratio 1.06:1)
Time: 21.932 sec (0 m 21 s)
Start Date: 2023:04:10 20:37:26
End Date:   2023:04:10 20:37:48

Then when that worked, I tried starting clamd and using clamdscan, instead. So I ran this first in one terminal:


❯ docker run -it --mount type=bind,source=/home/micah/Downloads,target=/downloads clamav/clamav:lates

Starting Freshclamd
Starting ClamAV
Socket for clamd not found yet, retrying (0/1800) ...ClamAV update process started at Mon Apr 10 20:39:43 2023
daily database available for update (local version: 26868, remote version: 26871)
Current database is 3 versions behind.
Downloading database patch # 26869...
Time:    0.5s, ETA:    0.0s [========================>]   14.50KiB/14.50KiB 6.5s [=>                 BDownloading database patch # 26870...
Time:    0.1s, ETA:    0.0s [========================>]    9.85KiB/9.85KiB
Downloading database patch # 26871...
Time:    0.1s, ETA:    0.0s [========================>]   13.45KiB/13.45KiB
Testing database: '/var/lib/clamav/tmp.a970cdad21/clamav-ef2abb81bc3a1750b9bc4a0dc538de94.tmp-daily.cld' ...
Socket for clamd not found yet, retrying (7/1800) ...Database test passed.
daily.cld updated (version: 26871, sigs: 2029300, f-level: 90, builder: raynman)
main.cvd database is up-to-date (version: 62, sigs: 6647427, f-level: 90, builder: sigmgr)
bytecode.cvd database is up-to-date (version: 334, sigs: 91, f-level: 90, builder: anvilleg)
WARNING: Clamd was NOT notified: Can't connect to clamd through /tmp/clamd.sock: No such file or directory
Socket for clamd not found yet, retrying (16/1800) ...Mon Apr 10 20:39:59 2023 -> Limits: Global time limit set to 120000 milliseconds.
Mon Apr 10 20:39:59 2023 -> Limits: Global size limit set to 419430400 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: File size limit set to 104857600 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: Recursion level limit set to 17.
Mon Apr 10 20:39:59 2023 -> Limits: Files limit set to 10000.
Mon Apr 10 20:39:59 2023 -> Limits: MaxEmbeddedPE limit set to 41943040 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: MaxHTMLNormalize limit set to 41943040 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: MaxHTMLNoTags limit set to 8388608 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: MaxScriptNormalize limit set to 20971520 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: MaxZipTypeRcg limit set to 1048576 bytes.
Mon Apr 10 20:39:59 2023 -> Limits: MaxPartitions limit set to 50.
Mon Apr 10 20:39:59 2023 -> Limits: MaxIconsPE limit set to 100.
Mon Apr 10 20:39:59 2023 -> Limits: MaxRecHWP3 limit set to 16.
Mon Apr 10 20:39:59 2023 -> Limits: PCREMatchLimit limit set to 100000.
Mon Apr 10 20:39:59 2023 -> Limits: PCRERecMatchLimit limit set to 2000.
Mon Apr 10 20:39:59 2023 -> Limits: PCREMaxFileSize limit set to 104857600.
Mon Apr 10 20:39:59 2023 -> Archive support enabled.
Mon Apr 10 20:39:59 2023 -> AlertExceedsMax heuristic detection disabled.
Mon Apr 10 20:39:59 2023 -> Heuristic alerts enabled.
Mon Apr 10 20:39:59 2023 -> Portable Executable support enabled.
Mon Apr 10 20:39:59 2023 -> ELF support enabled.
Mon Apr 10 20:39:59 2023 -> Mail files support enabled.
Mon Apr 10 20:39:59 2023 -> OLE2 support enabled.
Mon Apr 10 20:39:59 2023 -> PDF support enabled.
Mon Apr 10 20:39:59 2023 -> SWF support enabled.
Mon Apr 10 20:39:59 2023 -> HTML support enabled.
Mon Apr 10 20:39:59 2023 -> XMLDOCS support enabled.
Mon Apr 10 20:39:59 2023 -> HWP3 support enabled.
Mon Apr 10 20:39:59 2023 -> Self checking every 600 seconds.
Mon Apr 10 20:39:59 2023 -> Set stacksize to 1048576
socket found, clamd started.
Mon Apr 10 20:50:15 2023 -> SelfCheck: Database status OK.

And once started, I ran this:

❯ docker ps
CONTAINER ID   IMAGE                  COMMAND   CREATED          STATUS                    PORTS                NAMES
2c826b3bcb2c   clamav/clamav:latest   "/init"   32 minutes ago   Up 32 minutes (healthy)   3310/tcp, 7357/tcp   dazzling_greider

~
❯ docker exec -it dazzling_greider /usr/bin/clamdscan /downloads/230097030-f8f997ea-030f-45a1-b3b1-d2dab8bbbe5e.jpg
/downloads/230097030-f8f997ea-030f-45a1-b3b1-d2dab8bbbe5e.jpg: OK

----------- SCAN SUMMARY -----------
Infected files: 0
Time: 3.060 sec (0 m 3 s)
Start Date: 2023:04:10 21:12:57
End Date:   2023:04:10 21:13:00

Seems to be working okay to me.

Please share any output you have that shows an error.

@shrgabor
Copy link

shrgabor commented Apr 12, 2023

@micahsnyder

Is the scan time within expected range with this setup?
First: Time: 21.932 sec (0 m 21 s)
Second: Time: 3.060 sec (0 m 3 s)
I guess the difference is because of the starting time, but even though, 21 secs seems a lot, and 3 secs is not fast either.

Never mind, just found 877

@micahsnyder
Copy link
Contributor

Most of that 21.932 sec from the first scan was time spent loading the signatures. The 3.060 sec is the actual scan time of the image.
And you're correct about it being from issue 877. Calculating the fuzzy hash for large images does take some time.

In any case, you described an error in your initial post. Where is this error?

@sammarshallou
Copy link
Author

@micahsnyder I've got no error message so this may not be reproducible, maybe you need to close this, apologies.

To give a bit more detail, we are running a docker image inside Amazon ECS, and sending it the file data via a socket connection from a PHP application running separately. After the data is sent (fwrite of 'nINSTREAM\n', followed by the byte size and the actual data, followed by a zero), the PHP code reads one line from the socket using this command:

            $output = stream_get_line($socket, 4096);

For a working file, the value of $output which I'm logging is 'stream: OK\n'. For a non-working file, this function returns FALSE, which generally means the connection was closed.

There is a health check on connecting to the clamd port, which then fails, and ECS kills the task and starts a new one. So it looks like it crashed. There is nothing in the clamd log except the stuff that gets logged when the new instance starts up.

@micahsnyder
Copy link
Contributor

@sammarshallou That's strange! Clamd does have have read timeout where it will close the connection after a configurable time limit, using ReadTimeout in clamd.conf. The default is 30 seconds. I wouldn't expect a disconnect sooner, but perhaps it's really taking that long?

@sammarshallou
Copy link
Author

@micahsnyder I had logging from the PHP application of the time when the write of the file data finished, and the time when the response read finished (or failed in the case of this large image). While it's much longer than the tiny working image I was comparing against, It's still only 3 seconds so I don't think there can be a timeout:

[2023-04-17 14:44:37.387127 GMT] [php:notice] [pid 144:tid [client 10.52.1.104:39134] ClamAV done write...
[2023-04-17 14:44:40.084740 GMT] [php:notice] [pid 144:tid [client 10.52.1.104:39134] ClamAV output: [[]] FALSE

I checked in clamd.conf and our ReadTimeout is set to 600, CommandReadTimeout left at default 30. While looking in the file I did notice there are verbose and debug logging options, so I figured out how to rebuild our container, turned both those on and repeated the test. It's a bit hard to tell what happens because the same log includes multiple containers, but I think when it's failing, the log ends like this:

| Tue Apr 18 09:06:38 2023 -> Got chunksize: 0 |
| Tue Apr 18 09:06:38 2023 -> Chunks complete |
| Tue Apr 18 09:06:38 2023 -> Number of file descriptors polled: 1 fds |
| LibClamAV debug: Recognized JPEG file |
| Tue Apr 18 09:06:38 2023 -> THRMGR: queue (single) crossed low threshold -> signaling |
| Tue Apr 18 09:06:38 2023 -> THRMGR: queue (bulk) crossed low threshold -> signaling |
| LibClamAV debug: in cli_parsejpeg() |
| LibClamAV debug: segment[1] = 0xe0, Length 16 |
| LibClamAV debug: JFIF application marker |
| LibClamAV debug: segment[2] = 0xfe, Length 19 |
| LibClamAV debug: Comment (COM) segment marker |
| LibClamAV debug: segment[3] = 0xe2, Length 688 |
| LibClamAV debug: ICC Profile, chunk # 1 |
| LibClamAV debug: segment[4] = 0xdb, Length 67 |
| LibClamAV debug: Quantization Tables definitions (DQT) |
| LibClamAV debug: segment[5] = 0xdb, Length 67 |
| LibClamAV debug: Quantization Tables definitions (DQT) |
| LibClamAV debug: segment[6] = 0xc2, Length 17 |
| LibClamAV debug: Start of Frame (S0F) c2 |
| LibClamAV debug: segment[7] = 0xc4, Length 27 |
| LibClamAV debug: Huffman Tables definitions (DHT) |
| LibClamAV debug: segment[8] = 0xc4, Length 26 |
| LibClamAV debug: Huffman Tables definitions (DHT) |
| LibClamAV debug: segment[9] = 0xda, Length 12 |
| LibClamAV debug: Start of Scan (SOS) segment marker |

I.e. after that line there is no more about the jpg, it stops there.

This compares to a successful (small) jpg scan which continued after that line:

| Tue Apr 18 09:06:28 2023 -> Got chunksize: 0 |
| Tue Apr 18 09:06:28 2023 -> Chunks complete |
| Tue Apr 18 09:06:28 2023 -> Number of file descriptors polled: 1 fds |
| Tue Apr 18 09:06:28 2023 -> THRMGR: queue (single) crossed low threshold -> signaling |
| Tue Apr 18 09:06:28 2023 -> THRMGR: queue (bulk) crossed low threshold -> signaling |
| LibClamAV debug: Recognized JPEG file |
| LibClamAV debug: in cli_parsejpeg() |
| LibClamAV debug: segment[1] = 0xe0, Length 16 |
| LibClamAV debug: JFIF application marker |
| LibClamAV debug: segment[2] = 0xdb, Length 67 |
| LibClamAV debug: Quantization Tables definitions (DQT) |
| LibClamAV debug: segment[3] = 0xdb, Length 67 |
| LibClamAV debug: Quantization Tables definitions (DQT) |
| LibClamAV debug: segment[4] = 0xc2, Length 17 |
| LibClamAV debug: Start of Frame (S0F) c2 |
| LibClamAV debug: segment[5] = 0xc4, Length 28 |
| LibClamAV debug: Huffman Tables definitions (DHT) |
| LibClamAV debug: segment[6] = 0xc4, Length 27 |
| LibClamAV debug: Huffman Tables definitions (DHT) |
| LibClamAV debug: segment[7] = 0xda, Length 12 |
| LibClamAV debug: Start of Scan (SOS) segment marker |
| LibClamAV debug: Image hash: 9c6d4762299b64bc |
| LibClamAV debug: Descriptor[11]: Continuing after file scan resulted with: No viruses detected |
| LibClamAV debug: Checking image fuzzy hash '9c6d4762299b64bc' for signature match |
| LibClamAV debug: matcher_run: performing regex matching on full map: 0+91668(91668) >= 91668 |
| LibClamAV debug: Checking image fuzzy hash '9c6d4762299b64bc' for signature match |
| LibClamAV debug: matcher_run: performing regex matching on full map: 0+91668(91668) >= 91668 |
| LibClamAV debug: hashtab: Freeing hashset, elements: 0, capacity: 0 |
| LibClamAV debug: Descriptor[11]: Continuing after file scan resulted with: No viruses detected |
| LibClamAV debug: Descriptor[11]: Continuing after file scan resulted with: No viruses detected |
| LibClamAV debug: cli_magic_scan: returning 0 at line 4997 |
| LibClamAV debug: Descriptor[11]: Continuing after file scan resulted with: No viruses detected |
| Tue Apr 18 09:06:28 2023 -> Finished scanthread |
| Tue Apr 18 09:06:28 2023 -> Scanthread: connection shut down (FD 10) |

So it does appear like whatever is failing on our setup might be something that happens while computing the hash.

@sammarshallou
Copy link
Author

Following further investigation it looks like this happens for us when the machine runs out of memory, so it's not really a bug (although it might be nice if there was an 'out of memory' error message in the log). We've increased the memory capacity of the ClamAV container, and the problem is now resolved for us, so I'll close this issue, apologies.

@micahsnyder
Copy link
Contributor

@sammarshallou Glad to hear you figured out it was a memory issue. Sorry it was so frustrating to solve. It certainly would be nice if clamd could detect it was about to exceed system memory and warn before the OS kills it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants