
Elliot Rock asked me 3 fantastic questions about my previous post. It's good information, so I think it's best to share it with everyone!
So the inline methods and variables are the main optimisation in the final version?
It's the only optimisation made. I simply added the "inline" keyword to 2 methods and all static vars.
Can you explain where and why the HaXe methods greatly improve performance please :)
haXe offers 3 main forms of performance optimisation (we're being SWF specific here):
- haXe produces bytecode/opcodes which are better optimised than those from the Adobe compilers / mxmlc. This provides small but significant increases in speed which all mount up. See the links at the foot of the page for more information.
- haXe provides the ability to inline, which I'll explain further in a moment. This often gives a fantastic increase in speed.
- haXe provides access to the new opcodes for fast memory access; the only other place to get them is through Alchemy.
I must stress that these are 3 minor benefits of haXe compared to the hundreds of others, but they are very good ones nonetheless!
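As a rough illustration of the third point, here is a minimal sketch of the fast memory API as exposed by flash.Memory on the Flash target. The class name MemorySketch and the values written are made up for illustration, and this only compiles when targeting Flash 10 or later.

```haxe
import flash.Memory;
import flash.utils.ByteArray;

// Hypothetical example class, not taken from the original post.
class MemorySketch {
    static function main() {
        var bytes = new ByteArray();
        // The player requires the selected memory to be at least 1024 bytes.
        bytes.length = 1024;
        // Make this ByteArray the target of the fast memory opcodes.
        Memory.select( bytes );
        // Write a 32-bit integer at byte address 0, then read it back.
        Memory.setI32( 0 , 550 );
        var v = Memory.getI32( 0 );
        trace( v );
    }
}
```

The Memory methods are themselves inline, so each call should compile down to a single memory opcode rather than a method call on the ByteArray.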
I know inlines are so great but haven't found an explanation on why :)
The simple answer is that it produces fewer, better optimised opcodes and removes function calls (along with the stack work each call involves).
I'm going to jump straight in to full-on detail here! From the examples in question, here is a snippet of code from the non-optimised version:
Non-Optimised
private static var WIDTH : Int = 550;
// ...
po = pointToOffset( Std.int(realVector[a*3]) , Std.int(realVector[(a*3)+1]) , WIDTH );
// ...
private static function pointToOffset( vx : Int , vy : Int , vw : Int ) : UInt {
Taking the single functional line only (the middle one), here are the opcodes produced for it:
OLabel
OGetLex(Idx(62))
OFindProp(Idx(83))
OGetProp(Idx(83))
OReg(6)
OSmallInt(3)
OOp(OpIMul)
OGetProp(Idx(17))
OToNumber
OToInt
OFindProp(Idx(83))
OGetProp(Idx(83))
OReg(6)
OSmallInt(3)
OOp(OpIMul)
OSmallInt(1)
OOp(OpIAdd)
OGetProp(Idx(17))
OToNumber
OToInt
OGetLex(Idx(62)) // get name Identical
OGetProp(Idx(47)) // get property WIDTH
OCallProperty(Idx(67),3) // call pointToOffset with arg count: 3 (from previous uncommented opcodes)
OToUInt // set result as UInt
OToInt // set above as Int
OSetReg(5) // store it in variable po
And here are the pointToOffset opcodes:
OReg(1)
OReg(2)
OReg(3)
OOp(OpIMul) // multiply arguments 2 and 3 (the top two values on the stack)
OOp(OpIAdd) // add argument 1 to the result of the previous op
ORet // return result
So it's going to get the static var WIDTH from the class, pass it and the other 2 arguments from our vector through to pointToOffset, run all the pointToOffset code, get the result, convert it and store it in variable po. *phew* quite a lot really.
Optimised
Now let's look at the optimised version. Remember, the only change here is the addition of the keyword "inline":
private static inline var WIDTH : Int = 550;
// ...
po = pointToOffset( Std.int(realVector[a*3]) , Std.int(realVector[(a*3)+1]) , WIDTH );
// ...
static inline function pointToOffset( vx : Int , vy : Int , vw : Int ) : UInt {
And the corresponding opcodes (snipped, as the rest is the same):
OToNumber
OToInt
OIntRef(Idx(4)) // value 550 on the int stack
OOp(OpIMul) // multiply the second vector value by the width (550)
OOp(OpIAdd) // add the first vector value
OToInt // set to int
OSetReg(5) // store in po
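The lookup of the static var WIDTH and the call to the function pointToOffset are both gone: the constant 550 and the function body are written directly into the calling code.
Back to a plain English explanation: we've removed some normal but heavy operations (a property lookup, a function call and a return) and replaced them with 3 simple and light operations, giving us a big speed increase.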
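To make that concrete at source level, here is a minimal sketch of what inlining amounts to. The class name InlineSketch, the sample values and the Int return type are made up for illustration, and the body of pointToOffset is an assumption reconstructed from the opcode listing (vx + vy * vw), not copied from the original source.

```haxe
// Hypothetical example class, not taken from the original post.
class InlineSketch {
    // inline static var: the compiler substitutes the literal 550 wherever WIDTH is used.
    static inline var WIDTH : Int = 550;

    // Assumed body, inferred from the opcodes: multiply vy by vw, then add vx.
    static inline function pointToOffset( vx : Int , vy : Int , vw : Int ) : Int {
        return vx + vy * vw;
    }

    static function main() {
        var x = 10;
        var y = 2;
        // What we write:
        var po = pointToOffset( x , y , WIDTH );
        // Roughly what the compiler generates once both are inlined:
        var poInlined = x + y * 550;
        // Both variables hold the same value.
        trace( po == poInlined );
    }
}
```

The two assignments compute the same value; the difference is that the inlined form needs no property lookup for WIDTH and no call into pointToOffset at runtime.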
All credit to Nicolas the creator of haxe for this, it's amazing what he's done.
Reading
- http://ncannasse.fr/blog/flash_9_optimizations
- http://ncannasse.fr/blog/haxe_swc (specifically "More Optimizations")
- http://haxe.org/ref/inline
Hope that answers the questions, and thanks Elliot.














Well that is definitely an "Ask and you shall be answered".
Wow, it is great being able to add another method to optimise programming. I am itching to spend time in HaXe. I was very fortunate to work alongside the developers of SWiSHmax2, so I miss this sort of explanation from an opcode/bytecode/compiling perspective. Now off to read those three posts (after I do some work).
Thanks Nathan and to Nicolas as well!!
Peace