Aside from OpenJPEG, there is lots of new stuff in this release, including:
- Line number debugging: Emscripten can optionally add the original source file and line to the generated JavaScript (if you compiled the source using '-g'). Useful for debugging when things go wrong, especially with the new autodebugger tool, which rewrites an LLVM bitcode file to add printouts of every store to memory. Figuring out why generated code doesn't work is then as simple as running that same code in lli (the LLVM interpreter) and in JavaScript, and diff'ing the output, then seeing which original source code line is responsible.
- Line-specific CORRECT'ing: The main speed issue with Emscripten is that JavaScript and C have different semantics. For example, -5/2 in C is -2, while +5/2 is +2, whereas in JavaScript, naive division gives floating point numbers, but worse, there is no single operator that will create the same behavior as C (Math.floor on -5/2 gives -3, and Math.ceil on +5/2 gives +3). So in this example (unless we have a trick we can use, like |0 if the value is 32-bit and signed), we must check the sign of the value, and round accordingly - and that is slow. Similar things happen not just in rounding, but also with signedness and numerical overflows, and therefore Emscripten has the CORRECT_SIGNS, CORRECT_OVERFLOWS and CORRECT_ROUNDINGS options.
With line-specific correcting in the 0.9 release, you can find out which lines actually run into such problems, and tell Emscripten to generate the 100% correct code only in them. Most of the time, the slow and correct code isn't needed, so this option is very useful. I will write a wiki page soon to give more examples of how to use it to optimize the generated code (meanwhile, check out the linespecific test).
- 20% faster compilation, mainly from optimizing the analyzer pass.
- Strict mode JavaScript. The compiler will now generate strict mode JavaScript, which is simpler, less bug-prone, and in the future will allow JS engines to run it more quickly.
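
To make the rounding, signedness and overflow differences above a bit more concrete, here is a minimal JavaScript sketch. The helper names are made up for this example, and this is not the code Emscripten actually generates; it only shows the kind of extra work that "correct" code needs compared to the naive operation.

```javascript
'use strict';

// Hypothetical helpers for illustration only; not Emscripten's real output.

// C-style integer division: round toward zero by checking the sign.
// The trailing "+ 0" turns a -0 result (e.g. from -1 / 2) into 0.
function truncDivChecked(a, b) {
  var q = a / b;
  return (q < 0 ? Math.ceil(q) : Math.floor(q)) + 0;
}

// Cheaper trick, valid only when the quotient fits in a signed 32-bit
// integer: |0 converts to int32, which drops the fractional part.
function truncDivFast(a, b) {
  return (a / b) | 0;
}

// Signedness: reinterpret a signed 32-bit value as unsigned.
function toUnsigned32(x) {
  return x >>> 0;
}

// Overflow: wrap the sum to 32 bits, as a 32-bit machine add would.
function addWrapped32(a, b) {
  return (a + b) | 0;
}

console.log(Math.floor(-5 / 2));          // -3, not what C gives
console.log(truncDivChecked(-5, 2));      // -2, matches C
console.log(truncDivFast(5, 2));          // 2
console.log(toUnsigned32(-1));            // 4294967295
console.log(addWrapped32(2147483647, 1)); // -2147483648
```

The checked version is the slow path; the point of line-specific correcting is to pay for it only on the lines that really need it.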
for integer division:
a / b - a % b / b also works, don't know whether that's faster.