While looking at perf metrics of the mainnet chain, I noticed that v43-walletFactory has a very large XS heap (and thus also has a very large XS heap snapshot), and is still growing.
Here is a graph of the v43 XS heap snapshot size versus deliveryNum (which is not the same as time, but monotonically increases with time):
The piecewise-linear regions are separated by upgrade events. The last time we upgraded v43-walletFactory was in upgrade-17, which both made functional improvements to the walletFactory and (supposedly) fixed the passStyleOf problem that caused heap growth for every non-retired Presence that traversed the marshalling code (mostly from #8400 and #8401, which should be flattened by upgrade-18 and eventually drained by slowly deleting the old price-feed vats sometime after upgrade-19). That upgrade event happened at about deliveryNum/snapPos 21M, and forms the start of the right-most linear segment in that graph.
We don't know why this vat is still growing: either the passStyleOf fix didn't take, or something else is going on. And I have even fewer ideas about why it might appear to be growing faster than before. Note that the X-axis is deliveries, not time, so I can only say with confidence that our bytes-per-delivery rate is higher. I'm not entirely sure that the bytes-per-time rate is higher: maybe we're doing fewer deliveries per second, but have more growth per delivery. To get a bytes-per-time slope from my data, we need to use the slogs to extract the blockTime of each sample. (I'll see if I can at least find the endpoints of each sawtooth.)
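To illustrate why the two slopes can disagree, here is a minimal sketch in Python. It assumes each sawtooth endpoint has already been reduced to a (deliveryNum, blockTime, snapshot size) triple; the `Sample` type, the `growth_rates` helper, and all of the numbers are hypothetical placeholders, not measurements from mainnet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One heap-snapshot measurement (hypothetical record layout)."""
    delivery_num: int    # deliveryNum / snapPos of the snapshot
    block_time: int      # blockTime in seconds, as extracted from the slogs
    snapshot_bytes: int  # size of the XS heap snapshot


def growth_rates(start: Sample, end: Sample) -> tuple[float, float]:
    """Return (bytes per delivery, bytes per second) between two endpoints."""
    d_bytes = end.snapshot_bytes - start.snapshot_bytes
    d_deliveries = end.delivery_num - start.delivery_num
    d_seconds = end.block_time - start.block_time
    if d_deliveries <= 0 or d_seconds <= 0:
        raise ValueError("end sample must come strictly after start sample")
    return d_bytes / d_deliveries, d_bytes / d_seconds


# Placeholder endpoints for two sawtooth segments (made-up values).
older = growth_rates(Sample(0, 0, 0),
                     Sample(1_000_000, 1_000_000, 10_000_000))
newer = growth_rates(Sample(0, 0, 0),
                     Sample(500_000, 2_000_000, 10_000_000))

print("older segment:", older)
print("newer segment:", newer)
```

With these placeholder values the newer segment has the steeper bytes-per-delivery slope but the shallower bytes-per-second slope, which is why the deliveryNum graph alone cannot settle the question.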
#10493 is about restarting this vat, which would reset the state back down to the bottom of a sawtooth, which would buy us some time to figure out the real problem. And the upgrade-18 fixes that retire QuotePayment objects (and break the Zoe cycles) may help, whether or not the passStyleOf fix is working.
But we need to figure out the root cause. One step would be to take a v43 heap snapshot and use the Moddable tools to dump its contents. I'm going to guess that the CHUNKS section is fairly small (strings and Array backing stores), and the SLOTS section is very large (objects and their properties). With a heap that large, random sampling of the SLOTS section would mostly return the things that we have too much of: if we see more than three samples showing the same kind of object, that's the problem right there, and the second step will be to figure out where it's coming from, and why it's being retained.
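The sampling idea can be sketched as follows, assuming the SLOTS section has already been decoded into a list of per-slot "kind" labels. That decoding step is not shown, and `suspicious_kinds`, its parameters, and the labels in the usage example are all hypothetical; this is only an illustration of the counting logic, not a tool that reads real snapshots.

```python
import random
from collections import Counter


def suspicious_kinds(slot_kinds, sample_size, threshold=3, seed=None):
    """Randomly sample slot labels and report kinds seen more than `threshold` times.

    slot_kinds:  sequence of kind labels, one per slot (hypothetical input)
    sample_size: number of slots to draw without replacement
    threshold:   a kind is reported only if it appears MORE than this many times
    seed:        optional seed, to make a sampling run repeatable
    """
    rng = random.Random(seed)
    picked = rng.sample(slot_kinds, min(sample_size, len(slot_kinds)))
    counts = Counter(picked)
    return {kind: n for kind, n in counts.items() if n > threshold}


# Usage with made-up labels: one kind dominates the heap, the rest are rare.
heap = ["leakedThing"] * 9_900 + [f"other{i}" for i in range(100)]
print(suspicious_kinds(heap, sample_size=20, seed=1))
```

Because the sample is uniform over slots, any kind that dominates the heap will also dominate the sample, so a small number of draws is usually enough to surface it.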
I made a new graph where date is the x-axis, which shows that the bytes-per-second is in fact higher now:
(This graph doesn't include all the same datapoints as above: I manually extracted a handful of samples, and only went as far back as April 2024 and the end of the upgrade-13 era, but I think it still accurately reflects the growth rate versus time for the last eight months.)