Odin compilation speed tips
Compilation speed is really important. If you hit build and it takes so long it pulls you out of the flow and you start scrolling twitter, that’s a problem.
So what can we do to compile Odin programs faster? Here I’ll try to explain some ways of debugging Odin compile times and ways to improve them.
TL;DR
- Use faster linkers
- There is a -show-debug-messages flag to dump things like LOC/s and total parsed lines/tokens/packages, plus a lot of debug info.
- -o:minimal and -microarch:native are essentially free
Small programs
Compilation speed of large projects is probably the most important thing when measuring compiler performance.
However in Odin, I often write many small utility “scripts” instead of using shell/batch/python.
It’s very convenient: I get access to the core libraries, and running a tool is as simple as odin run my_tool.
So let’s consider the following program. It’s just a simple hello world, but all small tools need to print to console so it’s at least a little representative of the real world.
package main

import "core:fmt"

main :: proc() {
    fmt.printfln("Hello %s %i!", "World", 123)
}

To get the compilation speed, we can run odin build with the -show-timings flag. On my machine, the result is something like this:
Total Time - 270.198 ms - 100.00%
initialization - 7.618 ms - 2.81%
parse files - 12.968 ms - 4.79%
type check - 66.492 ms - 24.60%
LLVM API Code Gen ( 54 modules ) - 82.512 ms - 30.53%
msvc-link - 100.602 ms - 37.23%

This is not too bad! But interestingly, some of my projects (with tens of thousands of lines of code, and a lot more in dependencies) don’t compile that much slower.
So can we do better?
Linker
In the example above, almost 40% of the time is spent in the linker. So let’s try a different one.
Using -linker:radlink gets it down to about 15% on my machine. If you’re on Linux you can also try mold.
In general, a faster linker can also improve compile times when the binary gets large (e.g. large dependencies, or big #load files).
More Timings
There is also a -show-more-timings flag, which lists timings for all the internal compilation stages.
In most cases it doesn’t tell you that much (it’s mostly valuable for compiler developers), and most of the time is spent in the LLVM Object Generation stage.
However, you still might want to look at the results if you need to debug why something is compiling slower than expected - sometimes you hit a slow path.
For example, I ran into a case where parsing a file with gigantic embedded arrays of integer constants, generated by the sokol shader compiler, was adding ~3 seconds to my compile time.
Always profile, especially when things go wrong.
Internal Debug Messages
There is a hidden flag called -show-debug-messages, and while it’s not intended for regular users, it’s extremely useful. It dumps a LOT of good statistics to stderr useful for debugging and profiling.
So let’s measure Hello World again:
odin build hello -show-timings -linker:radlink -show-more-timings -show-debug-messages

When you scroll a bit past the LOC/s sections (which are also extremely useful!), you’ll see something like this:
Peak Memory Size: 271.000 MiB
Total Lines - 80727
Total Tokens - 395951
Total Files - 162
Total Packages - 26
Total File Size - 2572637

This is an overview of all the things the compiler had to parse to compile your program.
But 26 packages and 80k lines of code seems like a lot for a Hello World. All of that was pulled in by core:fmt and its dependencies. Let’s see if we can do something about it.
base:runtime only
We can do something like the following to write data directly to stderr. The base:runtime package is always included by default, as it contains builtin implementations and other required features, so we can’t do much better than this.
package main

import "base:runtime"

main :: proc() {
    runtime.print_string("Hello World 123!")
}

The results look a lot nicer:
Peak Memory Size: 76.703 MiB
Total Lines - 9211
Total Tokens - 59321
Total Files - 31
Total Packages - 3
Total File Size - 290122

Total Time - 58.230 ms - 100.00%
initialization - 7.189 ms - 12.34%
parse files - 2.853 ms - 4.90%
type check - 5.993 ms - 10.29%
LLVM API Code Gen ( 31 modules ) - 14.681 ms - 25.21%
rad-link - 27.510 ms - 47.24%

Only 9k LOC, 3 packages, and 60 milliseconds to compile the entire program! That’s really nice.
But there’s a problem: we lost all of core:fmt’s nice formatting functionality. This is a big issue, because it’s very cumbersome to do all the formatting by hand.
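To make "by hand" concrete, here is a rough sketch of printing the same line when the name and the number are runtime values and only base:runtime is available. It assumes that base:runtime exposes a print_int helper next to print_string. These are internal helpers rather than a stable public API, so check the names against your compiler version.

```odin
package main

import "base:runtime"

main :: proc() {
	name := "World"
	count := 123

	// Every piece of the output needs its own call; there is no format string.
	runtime.print_string("Hello ")
	runtime.print_string(name)
	runtime.print_string(" ")
	runtime.print_int(count) // assumed helper, see the note above
	runtime.print_string("!\n")
}
```

Five calls for one short line is manageable, but it gets tedious quickly once a tool prints more than a handful of messages.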
μ-fmt experiment
This led me to write an experimental ufmt (micro-fmt) package. It does only the bare minimum, but it covers 90% of my own core:fmt use-cases.
package main

import "ufmt"

main :: proc() {
    ufmt.printfln("Hello %s %i!", "World", 123)
}

The entire implementation is <200 lines of code and depends only on base:runtime.
It also compiles in 60 milliseconds and includes only 9k LOC.
There are only tprintf, printf and printfln. The only supported format specifiers are the following:
- %s: string and cstring, no cstring16
- %i: all integer types
- %x: all integer types in hexadecimal, always zero padded
- %f: 16, 32 and 64 bit floats, with basic NaN/Inf detection
- %%: to print literal % characters
Here’s the initial ufmt version as a github gist.
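For a feel of how little code such a formatter needs, here is a minimal sketch of the general idea. It is not the actual ufmt implementation: mini_printf is a hypothetical name, it handles only %s with string, %i with int, and %%, and it assumes the print_byte and print_int helpers exist in base:runtime alongside print_string.

```odin
package main

import "base:runtime"

// Hypothetical mini formatter for illustration only (not the author's ufmt).
mini_printf :: proc(format: string, args: ..any) {
	arg_index := 0
	i := 0
	for i < len(format) {
		c := format[i]
		// Plain bytes (and a lone trailing '%') are written through unchanged.
		if c != '%' || i + 1 >= len(format) {
			runtime.print_byte(c)
			i += 1
			continue
		}

		verb := format[i + 1]
		i += 2

		if verb == '%' {
			runtime.print_byte('%')
			continue
		}
		if arg_index >= len(args) {
			runtime.print_string("%!(MISSING)")
			continue
		}

		arg := args[arg_index]
		arg_index += 1

		switch verb {
		case 's':
			if s, ok := arg.(string); ok {
				runtime.print_string(s)
			} else {
				runtime.print_string("%!(BAD TYPE)")
			}
		case 'i':
			if n, ok := arg.(int); ok {
				runtime.print_int(n)
			} else {
				runtime.print_string("%!(BAD TYPE)")
			}
		case:
			runtime.print_string("%!(UNKNOWN VERB)")
		}
	}
}

main :: proc() {
	mini_printf("Hello %s %i!\n", "World", 123)
}
```

A real package needs more than this (other integer widths, cstring, floats, hex, a tprintf that returns a string), which is presumably where the rest of the ~200 lines go.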
Of course, using this has no effect if you import a package which depends on core:fmt. This is pretty annoying: currently there is no good way to see what exactly your program is importing and why, apart from looking at the temp obj files.
On my branch of the compiler I experimented with printing a Graphviz graph of all the packages with their imports, but that’s not official.
LLVM
In general, the slowest part of the compilation pipeline is the LLVM backend. Even with all optimizations disabled it takes quite a while.
Odin already generates code for each package independently to utilize all the CPU threads (LLVM cannot be multithreaded with better granularity).
And as a general rule, LLVM scales very poorly with the amount of generated code. For example, -disable-assert and -no-bounds-check can help a tiny bit because there are slightly fewer things to generate (so I don’t think it’s worth it for debug builds).
On the other hand, -debug has a HUGE impact. Generating the PDB is no small feat, and it can make compile times take 20-80% longer in my experience.
Optimization
Obviously, it’s not a good idea to compile with -o:speed, -o:size or -o:aggressive if you want good compilation speed. But I still want debug builds that run fast!
From my tests, -o:minimal adds almost no overhead, certainly less than ~5%.
Similarly, using -microarch:native adds no overhead. I’m pretty sure it just triggers a slightly different code path in the LLVM lowering passes, but could possibly yield some runtime perf benefits.
Conclusion
GingerBill says the Odin compiler can still be a lot faster. And while there’s plenty of room for optimization, it’s not too bad in the current state.
For a simple hello world, we got nearly 5x faster compile times by compiling with the right flags and being careful about dependencies. Of course, it’s different in real, bigger projects but I hope I shed some light on the various ways you could debug these issues, and some general rules to follow.
I recommend the following command as a reasonable default for compiling lightweight tools:
odin build my_tool -linker:radlink -o:minimal -microarch:native

Thank you for reading!
Also big thanks to all my Patrons! <3
- Voodoo51
- p1xelHer0
- Coedo
- Filip Aničić
- Moritz Falk
- Lion Schitik
- Ondřej Jamriška
- Alastair Marshall