So, I ended up doing more in this release than I had intended, causing it to take longer than planned. But it was for the best. Major changes include:
- Clang support: All tests now work in both llvm-gcc and Clang. The two produce somewhat different llvm bitcode, to the degree that different methods are needed with Clang, causing it to run 1/2 as fast as llvm-gcc code. That is mainly because llvm-gcc is more explicit with what it does, while Clang uses memcpy and such, with hardcoded C size values (4 bytes for an int, etc.).
Emscripten therefore now supports optional 'C memory layout' (QUANTUM_SIZE in settings.js). For example, an array of ints of values 1,2,3 with that enabled is [1,0,0,0,2,0,0,0,3,0,0,0] (since each int is 4 bytes), and when it is disabled, [1,2,3]. The latter works fine with llvm-gcc-generated llvm bitcode. Note that things get even more complicated with structures here, which need to be aligned and so forth. Anyhow, after that effort Emscripten should now be able to support anything that C/C++ can throw at it.
- Faster compilation speed: The original goal of the release. Compilation speed is 2-3 times faster now. Still lots of room for improvement, but it isn't a major nuisance like it was.
- Proper memory management: A call stack is implemented, and static memory allocation (for global variables, etc.) is also possible. sbrk() is emulated as well, allowing dlmalloc, a popular malloc() implementation, to be emscriptened properly. In particular that lets you use a real malloc() in your emscriptened code.
The first relooper worked on most tests, but was slow and buggy. I wrote a new version almost from scratch, and it now properly processes all the test code. It isn't very fast, though, I didn't focus on that. For that reason it is off by default, which means that Emscripten will not generate native flow structures (instead it will emulate code flow using a switch in a loop, which is very slow - but trivial to generate).
- Lots more tests: There are now 37 separate tests, from small LLVM features to high-level tests like the CubeScript engine, dlmalloc, and raytracing; each test is run through both Clang and llvm-gcc, and with relooping&optimization both on and off, for a total of 148 tests. This takes 7 minutes on my slow laptop, which is starting to be significant, but it's extremely important in a project like this.
Next goals include performance of the generated code - lots, lots of low-hanging fruit there - and compiling yet more real-world code.