This repository has been archived by the owner on Apr 2, 2023. It is now read-only.

Why is MLton so slow, again? #437

Closed
favonia opened this issue Nov 11, 2017 · 16 comments · Fixed by #470

Comments

@favonia
Contributor

favonia commented Nov 11, 2017

During the development of immortal lines (#431), somehow MLton became unhappy again. I wonder whether it is due to the use of the new polymorphic datatypes.

@MatthewFluet
Contributor

Do you have a particular commit that triggers the slow compile?

@favonia
Contributor Author

favonia commented Nov 11, 2017

@MatthewFluet It seems the compile time increases gradually across the commits 2e0e13b, 4d46372, 578d12a, fce66db, and 63c0e98 (in chronological order). I'm not sure exactly what the cause is, or whether there is a clear trend.

@favonia
Contributor Author

favonia commented Nov 11, 2017

@MatthewFluet I tried to pin down the change that triggered the slow compile, but did not find anything conclusive.

@favonia
Contributor Author

favonia commented Nov 13, 2017

@MatthewFluet Is there something I can do to pin down the cause? I suspect it's because of the introduction of

datatype ('a, 'b) binder = DIM of 'a | TERM of ('a * 'b)

@MatthewFluet
Contributor

If the slowdown is all in a single pass, then we can usually work out a reason. But if it is just generally slower throughout the compilation, or only within the native codegen, then it is a lot harder to pin down.

I wouldn't expect the introduction of a datatype to have a significant effect; perhaps you changed the datatype from a unary tycon to a binary tycon and are worried that this has caused blowup in monomorphisation? But that would probably manifest as slowdown throughout the compilation, since monomorphisation happens very early.

What I saw previously, mentioned in MLton/mlton#196, was some huge case dispatches, which led to massive numbers of basic blocks in some functions; that seemed to be bad for both the native codegen and the C codegen.
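As a rough illustration (hypothetical code, not the actual RedPRL sources) of the kind of dispatch that caused those basic-block blowups:

datatype token = A | B | C | D  (* imagine hundreds of constructors *)

fun describe t =
  case t of
    A => "a"
  | B => "b"
  | C => "c"
  | D => "d"

(* With hundreds of constructors, each arm becomes its own basic block,
   and a handful of functions like this can swamp the codegen. *)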

@favonia
Contributor Author

favonia commented Nov 13, 2017

@MatthewFluet Thanks. Yes, I was worried about blowup in monomorphisation. Essentially, the types were changed from `something option list * abt` to `(something option list, abt) binder` and from `something option * abt` to `(something option, abt) binder` to account for more features.
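A minimal sketch of the change described above, with placeholder definitions for `something` and `abt` (the real project types differ):

type something = string
type abt = unit

datatype ('a, 'b) binder = DIM of 'a | TERM of ('a * 'b)

(* before: plain tuples *)
type old_multi  = something option list * abt
type old_single = something option * abt

(* after: the binder datatype, to account for more features *)
type new_multi  = (something option list, abt) binder
type new_single = (something option, abt) binder

Monomorphisation makes one copy of binder per distinct instantiation, so two instantiations like these become two monomorphic datatypes.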

@jonsterling
Contributor

For what it's worth, whatever configuration is triggered when we do the profiling build seems to go much faster than the ordinary build (try ./script/profile.sh). At least on my machine...

@favonia
Contributor Author

favonia commented Nov 20, 2017

Is it possible to skip almost all of the optimizations (on Travis)? I can't figure out how to do that from mlton's help message...

@MatthewFluet
Contributor

You can use -drop-pass <re> to skip optimization passes matching a regular expression. However, this doesn't always reduce compile times: skipping dead-code elimination and other simplifications can make the required passes that remain (e.g., codegen) run slower. Similarly, insufficient inlining means that every + operation becomes a full function call/return, and the resulting programs are unusably slow.

It would be a nice project to develop a good "-O0" collection of passes that strikes a good tradeoff between compile time and run time.
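For example, a hypothetical invocation that drops a single optional pass by name (the file name and the choice of pass here are only illustrations):

$ mlton -drop-pass 'polyvariance' foo.sml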

@MatthewFluet
Contributor

That's curious about profiling leading to faster builds --- usually the addition of profiling code into the program inhibits optimizations and slows things down.

@jonsterling
Contributor

> That's curious about profiling leading to faster builds --- usually the addition of profiling code into the program inhibits optimizations and slows things down.

It's very strange indeed! The build goes about 15x faster with profiling turned on. The built code itself is a bit slower of course...

@favonia
Contributor Author

favonia commented Nov 20, 2017

So... the conclusion right now is to enable profiling on travis to speed up the compilation?

@jonsterling
Contributor

BTW @MatthewFluet I tried running MLton with -drop-pass '.*' and I get a weird error in closure conversion:

         closureConvert starting
            flow analysis starting
            flow analysis raised in 0.00 + 0.00 (nan% GC)
            flow analysis raised: Fail
         closureConvert raised in 0.00 + 0.00 (nan% GC)
         closureConvert raised: Fail
      pre codegen raised in 6.78 + 3.17 (32% GC)
      pre codegen raised: Fail
   Compile SML raised in 6.78 + 3.17 (32% GC)
   Compile SML raised: Fail
MLton raised in 6.79 + 3.17 (32% GC)
MLton raised: Fail
ClosureConvert.loopDec: strange dec

Any idea what might cause that? Or maybe there is a better, more finely-tuned collection of passes I should drop.

@MatthewFluet
Contributor

-drop-pass '.*' won't quite work, because some non-optional passes are still under the control of the wrappers that drop passes. In this particular case, you've dropped the implementExceptions SXML pass, so there are unexpected exception <con> [of <ty>] declarations left in the program, which closureConvert is not prepared to deal with.
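For instance, the declarations meant by exception <con> [of <ty>] are ordinary SML exception declarations (the names below are hypothetical):

exception NotFound
exception Parse of string * int

implementExceptions normally compiles these away before closure conversion ever sees them.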

@jonsterling
Contributor

@MatthewFluet Ahh, that explains it 😄

@MatthewFluet
Contributor

Attached are the -drop-pass <name> options that disable all optional passes: drop.txt

And, here is the comparison on a simple hello-world program:

[matthew@shadow tmp]$ cat z.sml 
val () = print "Hello, world!\n"
[matthew@shadow tmp]$ mlton
MLton 20130715 (built Sat Feb 11 06:53:44 UTC 2017 on buildvm-29.phx2.fedoraproject.org)
[matthew@shadow tmp]$ /usr/bin/time mlton z.sml 
1.69user 0.36system 0:02.06elapsed 99%CPU (0avgtext+0avgdata 678168maxresident)k
0inputs+424outputs (0major+421459minor)pagefaults 0swaps
[matthew@shadow tmp]$ /usr/bin/time ./z 
Hello, world!
0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 1896maxresident)k
0inputs+0outputs (0major+81minor)pagefaults 0swaps
[matthew@shadow tmp]$ /usr/bin/time mlton $(cat drop.txt) z.sml
9.84user 0.89system 0:10.77elapsed 99%CPU (0avgtext+0avgdata 1072052maxresident)k
0inputs+5584outputs (0major+963508minor)pagefaults 0swaps
[matthew@shadow tmp]$ /usr/bin/time ./z
^C840.92user 205.97system 47:32.22elapsed 36%CPU (0avgtext+0avgdata 31646504maxresident)k
393892352inputs+0outputs (11572217major+97157205minor)pagefaults 0swaps

Without any optimizations, the program takes 5X longer to compile, runs for more than 45 min (yes, 45 min!), and uses a 35G (yes, 35G!) heap, at which point we are GC thrashing and I killed it.

MatthewFluet added a commit to MatthewFluet/sml-redprl that referenced this issue Nov 21, 2017
Closes RedPRL#437

MLton's `polyvariance` optimization duplicates small higher-order
functions.

One (unresolved, though usually not significant) issue is that
polyvariance can duplicate code local to a function, even if it
doesn't depend on the higher-orderness (see
http://www.mlton.org/pipermail/mlton-devel/2002-January/021211.html).
This seems consistent with the discussion at MLton/mlton#196.
jonsterling pushed a commit that referenced this issue Nov 21, 2017
Closes #437

MLton's `polyvariance` optimization duplicates small higher-order
functions.

One (unresolved, though usually not significant) issue is that
polyvariance can duplicate code local to a function, even if it
doesn't depend on the higher-orderness (see
http://www.mlton.org/pipermail/mlton-devel/2002-January/021211.html).
This seems consistent with the discussion at MLton/mlton#196.
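A rough sketch (hypothetical code, not the actual RedPRL sources) of the kind of duplication polyvariance performs:

(* a small higher-order function ... *)
fun twice f x = f (f x)

(* ... applied to different function arguments at different call sites *)
val a = twice (fn n => n + 1) 0
val b = twice (fn s => s ^ "!") "hi"

(* Polyvariance may give each use its own copy of `twice`, so that each
   copy can be specialized to its particular argument. The unresolved issue
   mentioned in the commit message is that code local to `twice` is
   duplicated along with it, even when it does not depend on `f`. *)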
@favonia mentioned this issue Jan 13, 2018