How to deal with Garbage Collection #1812
Replies: 20 comments
-
Yep, so around 4% of the instructions in our code are due to us using a garbage collector. Once in a while, the GC goes through all the garbage-collected objects and decides whether they should be freed, which takes a long time: for structures/enums with members, it has to go through all their members one by one. This can be alleviated first by only deriving Then, environments have a clear hierarchy, so we don't really need a GC. Once the parent environment goes out of scope, all inner environments no longer need to exist. If we use a GC, they will at some point be freed by the GC, but we don't need to wait for that; this can be reasoned about at compile time by Rust by using In a hierarchical tree, the Rust guide for Rc gives us a nice usage:
So, we can remove our GC'd environments in favour of an Rc'd environment. This would make it easier to implement the string interner (#279) too, since Rust now expects circular references in environments, which would make lifetimes be
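A minimal sketch of the Rc'd environment idea, not Boa's actual code. The names `Environment`, `declare`, `lookup` and `scope_demo` are hypothetical, and bindings are assumed to hold plain numbers only, which sidesteps the closure references raised in the replies. Each child holds a strong reference to its parent and nothing points downwards, so the whole chain is freed as soon as the last handle to the innermost environment is dropped.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::{Rc, Weak};

/// Hypothetical, simplified environment record: bindings plus a parent link.
pub struct Environment {
    bindings: RefCell<HashMap<String, f64>>,
    // A child keeps its parent alive; a parent never points at its children,
    // so the chain contains no reference cycle.
    parent: Option<Rc<Environment>>,
}

impl Environment {
    pub fn new(parent: Option<Rc<Environment>>) -> Rc<Environment> {
        Rc::new(Environment {
            bindings: RefCell::new(HashMap::new()),
            parent,
        })
    }

    pub fn declare(&self, name: &str, value: f64) {
        self.bindings.borrow_mut().insert(name.to_string(), value);
    }

    /// Look a name up in this scope, then walk outwards through the parents.
    pub fn lookup(&self, name: &str) -> Option<f64> {
        let local = self.bindings.borrow().get(name).copied();
        local.or_else(|| self.parent.as_ref().and_then(|p| p.lookup(name)))
    }
}

/// Builds global -> outer -> inner, then drops every strong handle.
/// Returns what `inner` could see and whether the global scope was freed
/// exactly when the last handle (`inner`) went away.
pub fn scope_demo() -> (Option<f64>, Option<f64>, bool) {
    let global = Environment::new(None);
    global.declare("x", 1.0);
    let outer = Environment::new(Some(Rc::clone(&global)));
    outer.declare("y", 2.0);
    let inner = Environment::new(Some(Rc::clone(&outer)));

    let seen_x = inner.lookup("x");
    let missing = inner.lookup("z");

    // A weak probe lets us observe deallocation without keeping anything alive.
    let probe: Weak<Environment> = Rc::downgrade(&global);
    drop(global);
    drop(outer);
    // `inner` still holds the chain alive at this point.
    let alive_while_referenced = probe.upgrade().is_some();
    drop(inner);
    let freed = alive_while_referenced && probe.upgrade().is_none();
    (seen_x, missing, freed)
}
```

Calling `scope_demo()` exercises both the outward lookup and the deterministic drop. The sketch stops being safe from leaks once a binding can store a function that itself references the environment, since that creates a cycle.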
-
Using the diagram above, it's possible for outer() to return inner2 as a closure.
-
In that case, would the parent environment of the function change? Or do we still need the information of the previous environment?
-
I implemented It seems this is the only object that contains an environment. One small question about this. Is this the inner environment or the outer environment? To implement How is this environment used?
-
From what I understand (please take a second look at the spec as I may be wrong), a function's [[Environment]] property is a reference to the outer environment. The inner environment doesn't exist until the function is actually called. When the inner environment is created, its parent is set to the function's [[Environment]]. This allows callbacks to happen at a later point whilst still maintaining lexical scope; however, it also means parent environments could be referenced long after they've gone out of scope. (Diagram above: outer finishes but inner1 is a callback somewhere.) Because functions are "run-to-completion" there's no need to store the inner environment. However, with generators I think that changes.
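The relationship described above can be sketched roughly as follows. This is an illustration under assumptions, not Boa's implementation: `Scope`, `Function`, `make_counter` and the `environment` field (standing in for the spec's [[Environment]] slot) are hypothetical names, and values are plain integers.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

pub struct Scope {
    vars: RefCell<HashMap<String, i32>>,
    parent: Option<Rc<Scope>>,
}

impl Scope {
    pub fn new(parent: Option<Rc<Scope>>) -> Rc<Scope> {
        Rc::new(Scope { vars: RefCell::new(HashMap::new()), parent })
    }

    pub fn declare(&self, name: &str, value: i32) {
        self.vars.borrow_mut().insert(name.to_string(), value);
    }

    pub fn get(&self, name: &str) -> Option<i32> {
        let local = self.vars.borrow().get(name).copied();
        local.or_else(|| self.parent.as_ref().and_then(|p| p.get(name)))
    }

    /// Assign to the nearest enclosing scope that declares `name`.
    pub fn assign(&self, name: &str, value: i32) -> bool {
        let declared_here = self.vars.borrow().contains_key(name);
        if declared_here {
            self.vars.borrow_mut().insert(name.to_string(), value);
            true
        } else {
            self.parent.as_ref().map_or(false, |p| p.assign(name, value))
        }
    }
}

pub struct Function {
    /// Stand-in for [[Environment]]: the scope the function was created in.
    environment: Rc<Scope>,
    body: fn(&Scope) -> i32,
}

impl Function {
    pub fn call(&self) -> i32 {
        // The inner environment exists only for the duration of this call,
        // and its parent is the function's captured outer environment.
        let inner = Scope::new(Some(Rc::clone(&self.environment)));
        (self.body)(&*inner)
    }
}

fn increment(env: &Scope) -> i32 {
    let next = env.get("count").unwrap_or(0) + 1;
    env.assign("count", next);
    next
}

/// Plays the role of `outer()` returning a closure: the local handle to the
/// outer scope is moved into the function, which keeps that scope alive.
pub fn make_counter() -> Function {
    let outer = Scope::new(None);
    outer.declare("count", 0);
    Function { environment: outer, body: increment }
}
```

Each call to `make_counter()` yields a function with its own outer scope, and repeated calls to `call()` on the same function observe the state left by the previous call even though `make_counter` has long since returned.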
-
Looking into this, I noticed we have the That Do we really need it to be of In any case, implementing
-
You should hold a Trace instead of an Any.
-
Would this require a special In any case, this brings the extra problem, since we can no longer use
-
You can have a trait that inherits from Any and Trace and make a trait object of it.
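One way this suggestion might look, as a sketch only. The `Trace` trait below is a local stand-in written for this example (the real rust-gc `Trace` trait is unsafe and has more methods), and `NativeObject`, `as_any`, `Counter`, `Label` and `inspect` are hypothetical names. The `as_any` helper is a common workaround for the fact that one trait object cannot be cast directly into another.

```rust
use std::any::Any;

/// Local stand-in for a GC tracing trait; it only counts visited objects.
pub trait Trace {
    fn trace(&self, visited: &mut usize);
}

/// Hypothetical combined trait: values stored behind it are both traceable
/// and downcastable to their concrete type.
pub trait NativeObject: Any + Trace {
    fn as_any(&self) -> &dyn Any;
}

// Every sized type that is `Any + Trace` gets `NativeObject` for free.
impl<T: Any + Trace> NativeObject for T {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

pub struct Counter {
    pub hits: u32,
}

impl Trace for Counter {
    fn trace(&self, visited: &mut usize) {
        *visited += 1;
    }
}

pub struct Label(pub String);

impl Trace for Label {
    fn trace(&self, visited: &mut usize) {
        *visited += 1;
    }
}

/// Works on the trait object only: traces it through the supertrait, then
/// tries to recover a `Counter` through the `Any` view.
pub fn inspect(obj: &dyn NativeObject) -> (usize, Option<u32>) {
    let mut visited = 0;
    obj.trace(&mut visited);
    let hits = obj.as_any().downcast_ref::<Counter>().map(|c| c.hits);
    (visited, hits)
}
```

With this shape, the collector can hold a `dyn NativeObject` and still trace it, while code that knows the concrete type can downcast through `as_any()`.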
-
It seems that this is not enough, or at least I'm not able to make it work. I get the errors listed in https://github.com/jasonwilliams/boa/pull/387/files#r422625290
-
Right, you can't cast between trait objects directly. But in this case what you need is for that trait to inherit from Trace as well.
-
To fix those errors you just need to cast to
-
I don't know if it applies here, but did you take a look at this: https://github.com/kyren/gc-arena
-
Trait object issues are going to occur in any type that is of the form
-
Okay, but how about https://github.com/Others/shredder ? It looks like it supports cyclic structures as well.
-
@sphinxc0re shredder looks very interesting and I would be happy for us to try it out.
-
Shredder forces concurrency, which a JS runtime will likely not want, since JS objects are single-threaded.
-
There is no information on "forcing" concurrency. The README states:
-
I think @Manishearth is referring to the fact that Shredder is built on top of Arc, whereas rust-gc is built on top of Rc. There are pros and cons to both. If you don't need multithreading you're still using Arc under the hood, which will be slower than Rc because it uses atomic operations to update the reference count. It's not easy to know how much time is lost until you can measure the before and after. Likewise, if you stick with Rc, it makes it difficult to move GC'd objects across threads (which Boa may need to do at some point).
I don't know if this is strictly true; from what I understand, JS objects can move across realms, and each realm can be backed by a thread.
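To make the Rc versus Arc trade-off concrete, here is a small standard-library sketch (the function name `share_across_thread` is made up for this example). It says nothing about how Shredder or rust-gc are implemented internally; it only shows the language-level difference: `Rc` is not `Send`, so the compiler rejects moving it into another thread, while `Arc` pays for atomic count updates and can cross threads.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

/// Sums the same data once through an `Rc` on the current thread and once
/// through an `Arc` on a spawned thread.
pub fn share_across_thread() -> (usize, usize) {
    let local = Rc::new(vec![1usize, 2, 3]);
    let local_sum: usize = local.iter().sum();
    // The next line would not compile, because `Rc<T>` is not `Send`:
    // thread::spawn(move || local.len());

    let shared = Arc::new(vec![1usize, 2, 3]);
    // Cloning an `Arc` increments its counter with an atomic operation.
    let for_worker = Arc::clone(&shared);
    let worker_sum = thread::spawn(move || for_worker.iter().sum::<usize>())
        .join()
        .expect("worker thread panicked");

    (local_sum, worker_sum)
}
```

How much the atomic updates cost in practice depends on the workload, which is why measuring before and after, as suggested above, is the only reliable way to tell.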
-
Not if you want to be spec-compliant 😄
This is incorrect. The JavaScript execution model is single-threaded; all realms which may share objects (not including The specification calls this an "agent"; annoyingly, I cannot find any text in the specification that explains this clearly. Rather, the spec doesn't really work without this. I might file an issue to get this clarified. Regardless, the HTML specification explains this a bit more clearly: realms that can share objects must be in the same agent, and thus, the same event loop.
It's pretty clear why the specification is single-threaded from one observation: aside from You can have multiple agents running at once; this is what happens when you use a Worker. Workers typically have their own thread/event loop, and, crucially, they do not share garbage-collected objects with a different thread. Instead, objects are sent to them via You can, of course, choose to allow an agent to use multiple threads for efficiency and put in some painstaking effort to ensure that all the single-threaded happens-before relationships are still maintained for any data dependency. This would be an immense amount of work, and multithreaded garbage collection is the least of your worries. JavaScript does not like being put on multiple threads.
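A rough sketch of the "one heap per agent" model described above, using only the standard library. Everything here is hypothetical (`Message`, `run_worker`, the reply text), and it is not a proposal for how Boa should implement workers. The point it illustrates is that each thread keeps its own `Rc`-managed objects and only plain, copied data crosses the channel.

```rust
use std::rc::Rc;
use std::sync::mpsc;
use std::thread;

/// Plain data that can be copied between agents; it contains no GC pointers.
#[derive(Clone, Debug, PartialEq)]
pub enum Message {
    Number(f64),
    Text(String),
    List(Vec<Message>),
}

/// Sends a copy of `input` to a worker thread and returns the worker's reply.
pub fn run_worker(input: Message) -> Message {
    let (to_worker, inbox) = mpsc::channel::<Message>();
    let (to_main, results) = mpsc::channel::<Message>();

    let worker = thread::spawn(move || {
        for msg in inbox {
            // The worker's own heap object: this `Rc` never leaves the thread.
            let local = Rc::new(msg);
            let reply = Message::List(vec![
                (*local).clone(),
                Message::Text("seen by worker".to_string()),
            ]);
            if to_main.send(reply).is_err() {
                break;
            }
        }
    });

    // Only a copy crosses the thread boundary; the caller keeps `input`.
    to_worker.send(input.clone()).expect("worker hung up");
    drop(to_worker); // closing the channel ends the worker's loop

    let reply = results.recv().expect("worker sent no reply");
    worker.join().expect("worker thread panicked");
    reply
}
```

Because nothing reference-counted is shared, each side could keep using a single-threaded collector for its own objects.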
-
Current state of play
We currently use Gc to wrap objects that need to live as long as something references them. These are all JS values. To make things easier we wrap all JS values in a Gc and give that the type Value.
Why not use Rc?
JavaScript has a lot of circular references, so with Rc we could end up with memory leaks: objects in a cycle always have a reference count of 1 or more and are never freed.
For example, a constructor has a property pointing to its prototype, and the prototype object has a property pointing back to the constructor.
More info:
https://doc.rust-lang.org/book/ch15-06-reference-cycles.html
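The constructor/prototype cycle can be reproduced with plain `Rc` in a few lines. This is a deliberately leaky sketch with hypothetical names (`Object`, `link`, `cycle_leaks`); the single `link` field stands in for a `prototype` or `constructor` property.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Object {
    pub name: &'static str,
    // One optional strong link, standing in for a property on the object.
    pub link: RefCell<Option<Rc<Object>>>,
}

/// Builds a constructor <-> prototype cycle, drops both handles, and reports
/// the constructor's strong count plus whether it is still alive afterwards.
pub fn cycle_leaks() -> (usize, bool) {
    let constructor = Rc::new(Object { name: "Foo", link: RefCell::new(None) });
    let prototype = Rc::new(Object { name: "Foo.prototype", link: RefCell::new(None) });

    // constructor.prototype = prototype; prototype.constructor = constructor
    *constructor.link.borrow_mut() = Some(Rc::clone(&prototype));
    *prototype.link.borrow_mut() = Some(Rc::clone(&constructor));

    // One count from our handle, one from the prototype's back-reference.
    let count = Rc::strong_count(&constructor);

    let probe: Weak<Object> = Rc::downgrade(&constructor);
    drop(constructor);
    drop(prototype);

    // No handle is reachable from the program any more, yet each object is
    // still kept alive by the other, so the memory is never released.
    let still_alive = probe.upgrade().is_some();
    (count, still_alive)
}
```

A tracing collector handles this case because it frees whatever is unreachable from the roots, regardless of how many references the objects hold to each other.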
Environments
Environments also need to be wrapped in a Gc because multiple function objects can reference the same environment: every function declaration references its outer environment, so declaring more than one function in the same scope means they all have the same parent.
If a closure is returned, the outer function's environment needs to stay alive for as long as the returned function does.
@Razican has some ideas on how to improve Environment GC.