Possible Space Leak #340

Open
jhgarner opened this issue May 12, 2020 · 10 comments
Labels
bug Something isn't working

Comments

@jhgarner

I've been using Polysemy in a larger project and have run into some memory problems. I was able to create a super simple example at https://github.com/jhgarner/Polysemy-Testing where I compare running computations with State in MTL and Polysemy. Instead of looking at speed, I'm exclusively looking at space efficiency. While the MTL example uses ~80KB of memory, Polysemy uses over 100MB. Could this be some kind of space leak in Polysemy or is there something I'm doing wrong in the code? Is this a laziness problem?

In the repo linked above, all of the code is in the src/Lib.hs file. Currently, Polysemy is commented out and the MTL version is active. There are also .hp and .svg files in the root of that project with the memory usages for both configurations.
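(The repository's code isn't reproduced in this thread. For readers without access to it, the comparison is roughly of the following shape: the same State-heavy loop run once through mtl and once through polysemy, then heap-profiled. This is a hypothetical sketch rather than the repo's actual src/Lib.hs, and it assumes the usual polysemy language extensions are enabled.)

import Control.Monad (replicateM_)
import Control.Monad.State.Strict (execState, modify')
import Polysemy (run)
import qualified Polysemy.State as P

-- mtl version: final state after n strict increments
countMtl :: Int -> Int
countMtl n = execState (replicateM_ n (modify' (+ 1))) 0

-- polysemy version of the same loop
countSem :: Int -> Int
countSem n = fst . run . P.runState 0 $ replicateM_ n (P.modify' (+ 1))

main :: IO ()
main = print (countMtl 1000000) >> print (countSem 1000000)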

@KingoftheHomeless
Collaborator

KingoftheHomeless commented May 13, 2020

The fact that the bottleneck is Sem's Functor/Applicative/Monad actions makes me strongly suspect that the performance issue is that they fail to specialize, and that they fail to specialize because you're not on GHC 8.10.

This could also be partly down to loopbreaker not working at the moment, but I suspect that's not the major player here. GHC 8.10 should be enough to significantly improve performance on such a simple program.

Edit: Admittedly, it feels funny that the final encoding of the free monad (which polysemy uses) is that bad with no optimizations, but dictionary passing could just be that horrible.
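(For readers unfamiliar with the specialization issue mentioned above: when a function polymorphic over the effect row is not specialized at its use site, GHC passes the Member and type class dictionaries at runtime, and every bind in Sem goes through them. A hedged sketch of the usual mitigation, using a hypothetical program function and assuming DataKinds and FlexibleContexts are enabled:)

import Control.Monad (replicateM_)
import Polysemy (Member, Sem)
import Polysemy.State (State, modify')

-- A hypothetical effect-polymorphic program: every bind goes through the
-- Member/Monad dictionaries unless GHC specializes it to a concrete row.
program :: Member (State Int) r => Int -> Sem r ()
program n = replicateM_ n (modify' (+ 1))
{-# INLINABLE program #-}

-- Asking GHC for a copy specialized to a concrete effect row removes the
-- dictionary passing at that type.
{-# SPECIALISE program :: Int -> Sem '[State Int] () #-}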

@KingoftheHomeless
Collaborator

Verified the results. It is at least partly due to the absence of loopbreaker: when I used a tailor-made, explicitly loop-broken State interpreter, space usage dropped from 130 MB at the end of the program to 70 MB, although it still grows linearly. Next up, I'll try to compile GHC 8.10 myself and make sure nothing has changed since the old benchmarks Sandy did way back when. Still, @TheMatten, we really should fix loopbreaker ASAP. If you need help, I can try my best. If fixing loopbreaker turns out to be a big problem, then as a stop-gap we should explicitly loop-break all of polysemy's interpreters.
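(For context, "explicitly loop-broken" refers to the pattern that the loopbreaker package automates: the recursive self-call is routed through a NOINLINE alias, so the main binding is no longer its own loop breaker and GHC remains free to inline and specialize it at call sites. A minimal hand-written sketch of the pattern on a hypothetical recursive function, not polysemy's actual State interpreter:)

-- Recursive calls go through the alias below, so `count` itself is not
-- chosen as a loop breaker and can still be inlined/specialized.
count :: Int -> Int
count 0 = 0
count n = 1 + count' (n - 1)
{-# INLINABLE count #-}

-- NOINLINE alias that carries the recursion.
count' :: Int -> Int
count' = count
{-# NOINLINE count' #-}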

@TheMatten
Collaborator

Sorry, I'm currently busy because of school + work - you can try looking into it, or as a temporary solution we could create a separate branch with manually loop-broken interpreters.

@jhgarner
Author

I updated the stack.yaml file in the project I linked above to use GHC 8.10.1 and the current master version of this repo, and got the same results as before. Memory usage didn't seem to improve.

@TheMatten added the "bug" label on Jun 21, 2020
@tek
Member

tek commented Jun 25, 2020

I've been experiencing space leaks as well, and although some of them turned out to be laziness mistakes of my own, when I broke the problem down I arrived at this test case:

{-# LANGUAGE BangPatterns, DataKinds, FlexibleContexts #-}

-- Imports and extensions added here for completeness; they were not part of
-- the original comment.
import Control.Monad (when)
import Data.Word (Word64)
import GHC.Stats (gc, gcdetails_live_bytes, getRTSStats)
import Polysemy
import Polysemy.Final

-- Current live bytes according to the RTS (the program must be run with +RTS -T).
currentMemIO ::
  IO Word64
currentMemIO = do
  !s <- getRTSStats
  pure (gcdetails_live_bytes . gc $ s)

-- Live bytes relative to a baseline.
currentMemRelIO ::
  Word64 ->
  IO Word64
currentMemRelIO base = do
  cur <- currentMemIO
  pure (cur - base)

-- Plain IO: loop forever, printing memory growth once every million iterations.
memTestIO ::
  Word64 ->
  IO ()
memTestIO base = do
  loop (0 :: Int)
  where
    loop n = do
      when (n > 1000000) $ do
        cur <- currentMemRelIO base
        print cur
      loop (if n > 1000000 then 0 else n + 1)

io :: IO ()
io = do
  base <- currentMemIO
  memTestIO base

-- The same loop run inside Sem, with the IO actions lifted through Final IO.
memTestSem ::
  Members '[Final IO] r =>
  Word64 ->
  Sem r ()
memTestSem base = do
  loop (0 :: Int)
  where
    loop n = do
      when (n > 1000000) $ do
        cur <- embedFinal $ currentMemRelIO base
        embedFinal $ print cur
      loop (if n > 1000000 then 0 else n + 1)

sem :: IO ()
sem =
  runFinal $
  prg
  where
    prg = do
      base <- embedFinal currentMemIO
      memTestSem base
The IO variant prints a constant value for the current memory usage, while the Sem variant grows without bound.
No idea if this is a valid test case or whether it's related, but maybe it helps.
I ran this with GHC 8.6.5; I'm going to see what happens with other versions.
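(One note for anyone reproducing this test case: getRTSStats only returns data when RTS statistics are enabled, so the binary has to be built with -rtsopts and run with the -T RTS flag, e.g. ./memtest +RTS -T, where memtest stands in for whatever the compiled executable is called; otherwise getRTSStats throws an error.)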

@tek
Member

tek commented Jun 25, 2020

Same behaviour with GHC 8.10.1.

@isovector
Member

Does this still happen using IO in the effect stack instead of Final IO?

@tek
Member

tek commented Aug 21, 2021

@isovector what do you mean by "IO in the effect stack"?

@KingoftheHomeless
Collaborator

@tek He means the test case you presented rewritten to use Embed IO instead of Final IO.
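(For reference, that rewrite would look roughly like the following sketch, reusing currentMemIO and currentMemRelIO from the earlier test case; embed and runM are polysemy's Embed counterparts of embedFinal and runFinal:)

memTestSemEmbed ::
  Members '[Embed IO] r =>
  Word64 ->
  Sem r ()
memTestSemEmbed base = loop (0 :: Int)
  where
    loop n = do
      when (n > 1000000) $ do
        cur <- embed (currentMemRelIO base)
        embed (print cur)
      loop (if n > 1000000 then 0 else n + 1)

semEmbed :: IO ()
semEmbed = runM $ do
  base <- embed currentMemIO
  memTestSemEmbed base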

@tek
Member

tek commented Aug 21, 2021

Yep. It executes a bit slower, but shows the same memory growth.
