We have a global variable that counts malloc'd bytes and is updated on every malloc call. When multiple threads call malloc concurrently, they contend on this counter, which adds measurable overhead.
The following was measured with Julia GCBenchmarks, using the multithreaded benchmarks with 8 mutator threads. For a fair comparison, both builds return 0 in `vm_live_bytes()`; the `no-malloc-counter` build additionally omits the malloc counter update. The results show measurable overhead for some benchmarks, e.g. a 2% slowdown for mergesort_parallel.
One way to mitigate this is to update the global counter less often. Each thread could keep a local counter of malloc'd bytes and only add it to the global counter once every X bytes allocated (X could be 16K or so).
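A minimal sketch of that idea in C11, not the actual implementation: every name here (`global_malloc_bytes`, `local_malloc_bytes`, `count_malloc`, `flush_malloc_counter`, `MALLOC_FLUSH_THRESHOLD`) is hypothetical, and 16 KiB is just the example value of X from above. Decrementing on free is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical flush threshold: the "X" from the proposal, here 16 KiB. */
#define MALLOC_FLUSH_THRESHOLD ((size_t)16 * 1024)
static_assert(MALLOC_FLUSH_THRESHOLD > 0, "threshold must be positive");

/* Global counter shared by all threads (hypothetical name). */
static _Atomic size_t global_malloc_bytes = 0;

/* Per-thread bytes that have not been published to the global counter yet. */
static _Thread_local size_t local_malloc_bytes = 0;

/* Publish this thread's pending bytes with a single atomic add. */
static void flush_malloc_counter(void)
{
    if (local_malloc_bytes != 0) {
        atomic_fetch_add_explicit(&global_malloc_bytes, local_malloc_bytes,
                                  memory_order_relaxed);
        local_malloc_bytes = 0;
    }
}

/* Called on every malloc. Only touches shared memory once the
 * thread-local count reaches the threshold. */
static void count_malloc(size_t nbytes)
{
    local_malloc_bytes += nbytes;
    if (local_malloc_bytes >= MALLOC_FLUSH_THRESHOLD)
        flush_malloc_counter();
}

/* Read the published total; it may lag behind the true total. */
static size_t read_global_malloc_bytes(void)
{
    return atomic_load_explicit(&global_malloc_bytes, memory_order_relaxed);
}
```

The trade-off is accuracy: the global value can under-report by up to just under the threshold per thread, so a thread would probably need to call `flush_malloc_counter()` when it exits or before a GC reads the counter.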
This PR introduces a fixed heap size for stock Julia. When built with `WITH_GC_FIXED_HEAP=1` and run with `--fixed-heap-size=...`, Julia bypasses all the existing GC triggering heuristics and only does a GC when the heap size reaches the defined heap size. It only does a full heap GC if the free memory after a GC is less than 20% of the heap size.
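The two rules can be sketched as a pair of predicates. This is only an illustration under assumptions: the function names and parameters are hypothetical, the real PR code may track heap usage differently, and the description does not say whether the full heap GC runs immediately or at the next trigger.

```c
#include <stdbool.h>
#include <stddef.h>

/* Rule 1: collect only once the heap has reached the fixed size.
 * `heap_used` and `fixed_heap_size` are in bytes (hypothetical parameters). */
static bool should_collect(size_t heap_used, size_t fixed_heap_size)
{
    return heap_used >= fixed_heap_size;
}

/* Rule 2: ask for a full heap GC when less than 20% of the fixed heap
 * is free after a collection. `fixed_heap_size / 5` rounds down. */
static bool needs_full_gc(size_t heap_used_after_gc, size_t fixed_heap_size)
{
    size_t free_bytes = heap_used_after_gc >= fixed_heap_size
                            ? 0
                            : fixed_heap_size - heap_used_after_gc;
    return free_bytes < fixed_heap_size / 5;
}
```

For example, with a fixed heap of 1000 bytes, 800 bytes in use after a GC leaves exactly 20% free and does not ask for a full GC, while 801 bytes in use does.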
This PR also introduces a global counter for malloc'd bytes, which slows down malloc. MMTK Julia uses the same kind of counter (see mmtk/mmtk-julia#141). I plan to do another PR to fix this for both MMTK Julia and stock Julia.