RAM usage keeps increasing when deleting and reimporting data #1240

gpicciuca · 2024-01-19T13:42:00Z

Hello there,
I'm experimenting a bit with Virtuoso as we're trying to figure out if it's a good choice for our project. While doing some tests, I stumbled upon a weird memory usage case.

In this particular case, the Queries I run are:
DELETE FROM DB.DBA.RDF_QUAD
DB.DBA.TTLP_MT (file_to_string_output ('/path/to/ttl_file.ttl'), '', 'http://localhost:8890/XXX')

The dataset is about 5.7 MB and contains roughly 80k triples.

Every time I run the above queries, the RAM usage of Virtuoso increases by about 0.5 MB. Occasionally a few MB (1-2 MB) get reclaimed, but other than that it keeps increasing.
The test I ran lasted 40 minutes. The queries above were executed continuously at 5-second intervals, and each time the RAM usage increased by 0.3 - 0.6 MB (mostly 0.5 MB).
Virtuoso started at 99.2 MB and after the first run (delete + import) it increased to 237.1 MB.
After 7:30 min it was sitting at 302 MB.
After 10:50 min it was at 318.2 MB, and after 40 min it was above 400 MB.
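
As a rough cross-check of these figures, the average growth per cycle can be derived from the two precise readings (7:30 and 10:50). This is only a small Python sketch of the arithmetic; the function name is made up for illustration, and the 40-minute reading is left out because it was only noted as "above 400 MB".

```python
# Readings reported above: (elapsed seconds, memory in MB).
readings = [(7 * 60 + 30, 302.0), (10 * 60 + 50, 318.2)]
INTERVAL_S = 5  # one delete + import cycle every 5 seconds


def growth_per_cycle(first, second, interval_s=INTERVAL_S):
    """Average memory growth in MB per delete/import cycle between two readings."""
    (t0, mb0), (t1, mb1) = first, second
    cycles = (t1 - t0) / interval_s
    return (mb1 - mb0) / cycles


rate = growth_per_cycle(*readings)
print(f"{rate:.3f} MB per cycle")
```

This gives about 0.4 MB per cycle, which sits inside the 0.3 - 0.6 MB range observed per run. Extrapolating linearly from the 10:50 reading to 40 minutes would give roughly 460 MB, which is consistent with the "above 400 MB" observation.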

I rebuilt Virtuoso with debug symbols so that I could inspect the issue with Valgrind, and these are the findings:

==471315== LEAK SUMMARY:
==471315== definitely lost: 467,890 bytes in 26,239 blocks
==471315== indirectly lost: 955,209 bytes in 41,848 blocks
==471315== possibly lost: 206,207,327 bytes in 314,533 blocks
==471315== still reachable: 17,579,054 bytes in 131,026 blocks
==471315== suppressed: 0 bytes in 0 blocks

However, I also ran the Valgrind Massif heap profiler, and there it shows constant allocations with occasional spikes which return to normal right after.

I'm running the develop/7 branch locally on Ubuntu 20.04.

Configuration is unchanged, running default values.

Any idea what could be the cause? Are there any known issues that have yet to be fixed perhaps?

P.S.: I cannot share the dataset as it's confidential (company stuff).


HughWilliams · 2024-01-19T15:27:47Z

So basically you are continuously deleting and loading the same RDF dataset ?

What do the Linux `top` command and the Virtuoso `status();` command (run from `isql`) report?

Please also provide a copy of the virtuoso.ini file in use.

gpicciuca · 2024-01-19T15:42:47Z

Correct. I'm continuously deleting and reloading the same dataset.

Top shows that Virtuoso started at 0.2 %MEM (I have 64 GB RAM). After the first import it reaches 0.5%.
After multiple deletions and imports it's at 0.6% (05:35 min runtime, queries run every 5 sec).

I was using Ubuntu's System Monitor to check the RAM usage.

status(); in iSQL says:

And the virtuoso.ini:
virtuoso.ini.txt

HughWilliams · 2024-01-19T17:41:40Z

Looking at the virtuoso.ini config file, I note the following settings:

ThreadCleanupInterval		= 0
ResourcesCleanupInterval	= 0

Setting these to 0 results in no cleanup of thread and other memory resources, which can be construed as a memory leak, as detailed in the configuration parameters documentation. You can set both to 1 to force the clean up of unused threads/resources, and thereby reduce memory consumption by the Virtuoso server.
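
A quick way to verify these two values in a given INI file is sketched below in Python. This is only an illustrative check, not part of Virtuoso: the helper name is made up, the sample text is a hypothetical excerpt, and placing the keys under a `[Parameters]` section is an assumption based on a stock virtuoso.ini.

```python
import configparser

# Hypothetical excerpt; the [Parameters] section name is an assumption.
SAMPLE_INI = """
[Parameters]
ThreadCleanupInterval\t\t= 0
ResourcesCleanupInterval\t= 0
"""

CLEANUP_KEYS = ("ThreadCleanupInterval", "ResourcesCleanupInterval")


def disabled_cleanup_settings(ini_text, section="Parameters"):
    """Return the cleanup keys that are explicitly set to 0 in the INI text."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive, as written
    parser.read_string(ini_text)
    disabled = []
    for key in CLEANUP_KEYS:
        value = parser.get(section, key, fallback=None)
        if value is not None and value.strip() == "0":
            disabled.append(key)
    return disabled


print(disabled_cleanup_settings(SAMPLE_INI))
```

An empty list would mean neither key is explicitly set to 0 in the inspected text; keys that are absent are not reported, since this sketch makes no claim about their default values.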

gpicciuca · 2024-01-22T07:36:06Z

> Looking at the virtuoso.ini config file, I note the following settings:
> ThreadCleanupInterval		= 0
> ResourcesCleanupInterval	= 0
> Setting these to 0 results in no cleanup of thread and other memory resources, which can be construed as a memory leak, as detailed in the configuration parameters documentation. You can set both to 1 to force the clean up of unused threads/resources, and thereby reduce memory consumption by the Virtuoso server.

I gave it a try just now, setting both parameters to 1 in the ini file. I also confirmed via the web interface that the parameters are loaded.

However, the memory usage still only increases. Nothing is being freed.
Resources are not freed either after I stop my stress-test tool and just let Virtuoso run on its own doing "nothing".

Here's a short recording:
https://github.com/openlink/virtuoso-opensource/assets/124195270/7814d00f-efdc-48ba-8f42-a1f9dac97801

The queries being executed are always the same as mentioned in the posts above as well as the dataset being loaded.

HughWilliams · 2024-01-22T13:10:18Z

I assume the Virtuoso instance was restarted when the INI file parameters were changed, such that they will take effect? (These settings do not take effect on a running instance without a restart, though the Conductor editor will immediately show the values have been changed in the INI file.)

Looking at your loop test case again, i.e.,

DELETE FROM DB.DBA.RDF_QUAD
DB.DBA.TTLP_MT (file_to_string_output ('/path/to/ttl_file.ttl'), '', 'http://localhost:8890/XXX')

This is a bad test case, as the RDF_QUAD table contains a number of system graphs that are used for managing the RDF Quad Store, that would have been deleted by the blanket DELETE FROM DB.DBA.RDF_QUAD query.

Even if you were to qualify it with the actual graph name being loaded, i.e., DELETE FROM DB.DBA.RDF_QUAD WHERE g = iri_to_id ('http://localhost:8890/XXX'), there are other RDF-related tables that would be touched when loading the data with the TTLP_MT() function or SPARQL insert queries, and these would not be cleaned. So, you should use the SPARQL CLEAR GRAPH <http://localhost:8890/XXX> query to remove the graph being loaded, which should clean all the required RDF-related tables.

You should probably also run the COMMIT WORK command after each iteration of the loop...

gpicciuca · 2024-01-22T14:19:31Z

I have restarted the server multiple times between the tests, and as noted in the screenshot above the parameters were active in the running instance.

I re-ran the tests with the following changes:

I replaced the DELETE query with SPARQL CLEAR GRAPH <http://localhost:8890/XXX> as you suggested;

I was already committing the transaction at the end of each cycle (with SQLEndTran(SQL_HANDLE_DBC, hdbc_, SQL_COMMIT)), but now I also run a COMMIT WORK query in addition.

The actions on each loop iteration are now:

SPARQL CLEAR GRAPH <graph name>
DB.DBA.TTLP_MT(file_open(...), '', 'graph_name')
SQLEndTran with SQL_COMMIT
Manual Query: COMMIT WORK
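
For reference, one iteration of the sequence above could be sketched as follows in Python. This is not the actual test tool (which uses the ODBC C API); it assumes `conn` is any DB-API 2.0 style connection to Virtuoso, for example one opened through an ODBC bridge, and the function names are made up for illustration. The file path and graph IRI are embedded as literals, so the sketch simply rejects values containing quotes or backslashes instead of trying to escape them.

```python
def iteration_statements(ttl_path, graph_iri):
    """Build the two statements of one clear + reload cycle."""
    for value in (ttl_path, graph_iri):
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    if ">" in graph_iri:
        raise ValueError(f"unsupported character in {graph_iri!r}")
    return [
        f"SPARQL CLEAR GRAPH <{graph_iri}>",
        f"DB.DBA.TTLP_MT (file_to_string_output ('{ttl_path}'), '', '{graph_iri}')",
    ]


def run_iteration(conn, ttl_path, graph_iri):
    """Clear the graph, reload the Turtle file, then commit twice as described."""
    cur = conn.cursor()
    try:
        for statement in iteration_statements(ttl_path, graph_iri):
            cur.execute(statement)
        conn.commit()               # driver-level commit, analogous to SQLEndTran
        cur.execute("COMMIT WORK")  # the additional manual commit query
    finally:
        cur.close()
```

Calling `run_iteration(conn, '/path/to/ttl_file.ttl', 'http://localhost:8890/XXX')` in a timed loop would reproduce the cycle described here, under the assumptions stated above.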

Result: Memory still keeps increasing. The parameters ThreadCleanupInterval and ResourcesCleanupInterval are still set to 1 in the configuration file.

Also tested:

replacing file_to_string_output with file_open, and vice versa
replacing TTLP_MT() with TTLP()

Made no difference.

Edit: I downloaded and compiled the latest release, v7.2.11, to see if the problem would manifest there too, and indeed it does. The behavior is the same as on the current develop/7 branch.

imitko · 2024-01-22T15:35:02Z

@gpicciuca

Please could you check the VmSize & VmRSS process stats, e.g.,

cat /proc/{*virtuoso-pid*}/status |grep Vm

What are these statistics after continuous operation, and how do they change?
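
If it helps to log these values over time rather than reading them by hand, a small Python sketch along these lines could be used. The function names and the sample text are made up for illustration; only the `/proc/<pid>/status` file and its `Vm*` fields come from the command above.

```python
import re


def parse_vm_stats(status_text):
    """Extract the Vm* fields (in kB) from the text of /proc/<pid>/status."""
    stats = {}
    for line in status_text.splitlines():
        match = re.match(r"^(Vm\w+):\s+(\d+)\s+kB$", line.strip())
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def read_vm_stats(pid):
    """Read and parse the Vm* fields of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_stats(handle.read())


# Made-up sample values, only to show the shape of the result.
sample = "Name:\texample\nVmSize:\t  500000 kB\nVmRSS:\t  240000 kB\nThreads:\t12\n"
print(parse_vm_stats(sample))
```

Calling `read_vm_stats(pid)` once per test cycle and recording `VmSize` and `VmRSS` would show whether both grow together or only one of them does.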

gpicciuca · 2024-01-23T08:09:37Z

@imitko

I changed the cycle time to 1sec instead of 5sec just to speed things up a bit.

Here's a recording of how the VM values change over a period of 6-7 minutes:
https://github.com/openlink/virtuoso-opensource/assets/124195270/a3bd7f37-af64-4f7e-a73f-01b8ca13ffea
