<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Flat Packing, the ultimate code optimization?</title>
	<atom:link href="http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/feed/" rel="self" type="application/rss+xml" />
	<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/</link>
	<description>brain&#039;s on fire!</description>
	<lastBuildDate>Fri, 22 Apr 2011 00:44:37 +0100</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: nathan</title>
		<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/comment-page-1/#comment-120</link>
		<dc:creator>nathan</dc:creator>
		<pubDate>Thu, 24 Sep 2009 14:33:22 +0000</pubDate>
		<guid isPermaLink="false">http://webr3.org/blog/?p=147#comment-120</guid>
		<description>we have 4 scenarios here:

the way things currently work (afaik) - take the language syntax, turn it into vm opcodes (mxmlc)

the way things work better - take the language syntax, turn it into vm opcodes and optimize on the way (haxe)

the bloated way - take the language syntax, inline everything, turn it into vm opcodes

the proposed way - take the language syntax, inline everything, turn it into vm opcodes, optimize, identify repeated code and factor it into functions, optimize again

thus you&#039;re not making syntax work on a vm, you&#039;re effectively using the compiler to rewrite your application perfectly for the target vm.

follow?</description>
		<content:encoded><![CDATA[<p>we have 4 scenarios here:</p>
<p>the way things currently work (afaik) - take the language syntax, turn it into vm opcodes (mxmlc)</p>
<p>the way things work better - take the language syntax, turn it into vm opcodes and optimize on the way (haxe)</p>
<p>the bloated way - take the language syntax, inline everything, turn it into vm opcodes</p>
<p>the proposed way - take the language syntax, inline everything, turn it into vm opcodes, optimize, identify repeated code and factor it into functions, optimize again</p>
<p>thus you're not making syntax work on a vm, you're effectively using the compiler to rewrite your application perfectly for the target vm.</p>
<p>follow?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Huge</title>
		<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/comment-page-1/#comment-119</link>
		<dc:creator>Huge</dc:creator>
		<pubDate>Thu, 24 Sep 2009 14:12:58 +0000</pubDate>
		<guid isPermaLink="false">http://webr3.org/blog/?p=147#comment-119</guid>
		<description>I agree this is interesting.  I think one advantage haxe has over TAAS is that it has more information to begin with, and it has some extra type info - eg Int vs Float, and typed arrays.
One thing with functions is that they can be virtual - ie, the compiler does not know exactly which function will be called at runtime so it must leave it in as a symbolic reference.
Obviously the code bloat could be done on the client end, keeping the swfs a constant size - but cpu cache issues may make small code run faster than large code.
I think a good way of thinking about this is factorizing in mathematics.  First you multiply out the brackets, then you group like terms, and cancel some out, and then extract factors to form a new, simpler, expression.
We all lurve smarter compilers - well worth thinking about.</description>
		<content:encoded><![CDATA[<p>I agree this is interesting.  I think one advantage haxe has over TAAS is that it has more information to begin with, and it has some extra type info - eg Int vs Float, and typed arrays.<br />
One thing with functions is that they can be virtual - ie, the compiler does not know exactly which function will be called at runtime so it must leave it in as a symbolic reference.<br />
Obviously the code bloat could be done on the client end, keeping the swfs a constant size - but cpu cache issues may make small code run faster than large code.<br />
I think a good way of thinking about this is factorizing in mathematics.  First you multiply out the brackets, then you group like terms, and cancel some out, and then extract factors to form a new, simpler, expression.<br />
We all lurve smarter compilers - well worth thinking about.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dykam</title>
		<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/comment-page-1/#comment-118</link>
		<dc:creator>Dykam</dc:creator>
		<pubDate>Thu, 24 Sep 2009 14:12:18 +0000</pubDate>
		<guid isPermaLink="false">http://webr3.org/blog/?p=147#comment-118</guid>
		<description>There are two big BUTs, maybe another one related to the flash player:
1. Recursion is very, very hard to handle. It would end up either as a bunch of jumps, or with some methods still kept separate.
2. Your swf becomes big. Very big. And so does the loading time. This is a commonly known trade-off for optimization by caching. But by big, I mean very, very big. OO projects contain many, many links between functions, which causes problem 1 too.
(3.) Can flash handle such a long code run...</description>
		<content:encoded><![CDATA[<p>There are two big BUTs, maybe another one related to the flash player:<br />
1. Recursion is very, very hard to handle. It would end up either as a bunch of jumps, or with some methods still kept separate.<br />
2. Your swf becomes big. Very big. And so does the loading time. This is a commonly known trade-off for optimization by caching. But by big, I mean very, very big. OO projects contain many, many links between functions, which causes problem 1 too.<br />
(3.) Can flash handle such a long code run...</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben Garney</title>
		<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/comment-page-1/#comment-110</link>
		<dc:creator>Ben Garney</dc:creator>
		<pubDate>Thu, 24 Sep 2009 04:19:38 +0000</pubDate>
		<guid isPermaLink="false">http://webr3.org/blog/?p=147#comment-110</guid>
		<description>Inlining has diminishing returns (assuming your call performance isn&#039;t painfully bad) due to issues like memory caching, instruction decoding, etc.

What you want is to inline your inner loops as much as possible, maximizing opportunities for instruction-level parallelism, and not sweat the rest.

In addition, because AS3 is dynamic, there are a lot of situations where the compiler lacks the knowledge needed to inline, even if you are using strict typing.

TBH I think the biggest bottleneck for AS3 right now is memory - cost of allocation - and the conflict between convenient syntax and what helps the compiler.

In general, AS3 is a young language. It needs years of serious work to mature. Languages like C# and Java are a lot more mature and have addressed these problems in pretty powerful ways.</description>
		<content:encoded><![CDATA[<p>Inlining has diminishing returns (assuming your call performance isn't painfully bad) due to issues like memory caching, instruction decoding, etc.</p>
<p>What you want is to inline your inner loops as much as possible, maximizing opportunities for instruction-level parallelism, and not sweat the rest.</p>
<p>In addition, because AS3 is dynamic, there are a lot of situations where the compiler lacks the knowledge needed to inline, even if you are using strict typing.</p>
<p>TBH I think the biggest bottleneck for AS3 right now is memory - cost of allocation - and the conflict between convenient syntax and what helps the compiler.</p>
<p>In general, AS3 is a young language. It needs years of serious work to mature. Languages like C# and Java are a lot more mature and have addressed these problems in pretty powerful ways.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nathan</title>
		<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/comment-page-1/#comment-109</link>
		<dc:creator>nathan</dc:creator>
		<pubDate>Thu, 24 Sep 2009 01:50:40 +0000</pubDate>
		<guid isPermaLink="false">http://webr3.org/blog/?p=147#comment-109</guid>
		<description>I fully comprehend the code bloat issue. in all honesty i think that only 20% of the original code would cause the bloat. further, i think this approach would lead to better results: rather than optimizing what the programmer has done (which is always inherently flawed), monolith the entire app by inlining everything, then un-inline the bloat. quite sure this would lead to improved results when used with opcode reduction and optimization techniques such as taas and, in part, haxe :)</description>
		<content:encoded><![CDATA[<p>I fully comprehend the code bloat issue. in all honesty i think that only 20% of the original code would cause the bloat. further, i think this approach would lead to better results: rather than optimizing what the programmer has done (which is always inherently flawed), monolith the entire app by inlining everything, then un-inline the bloat. quite sure this would lead to improved results when used with opcode reduction and optimization techniques such as taas and, in part, haxe :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Troy Gilbert</title>
		<link>http://webr3.org/blog/optimization/flat-packing-the-ultimate-code-optimization/comment-page-1/#comment-108</link>
		<dc:creator>Troy Gilbert</dc:creator>
		<pubDate>Thu, 24 Sep 2009 01:15:34 +0000</pubDate>
		<guid isPermaLink="false">http://webr3.org/blog/?p=147#comment-108</guid>
		<description>That&#039;s definitely the *theoretically* fastest way to do things. I remember back in the day when folks would &quot;compile&quot; bitmaps down to explicit instructions instead of copying data into memory.

The problem, beyond the one you mention of interoperability between SWFs, is code bloat. Inlining everything would generate a *huge* amount of extra code. I would expect your SWF to be 10x or even 100x bigger, and not significantly faster. Remember the 80/20 rule and how it would apply: you&#039;ll bloat out 80% of your code to 10x or 100x its size without any noticeable difference.

The brilliance of what TAAS does (and any really good compiler) is doing what you describe selectively and smartly, making trade-offs that yield actual benefits as opposed to dumbly inlining everything.</description>
		<content:encoded><![CDATA[<p>That's definitely the *theoretically* fastest way to do things. I remember back in the day when folks would "compile" bitmaps down to explicit instructions instead of copying data into memory.</p>
<p>The problem, beyond the one you mention of interoperability between SWFs, is code bloat. Inlining everything would generate a *huge* amount of extra code. I would expect your SWF to be 10x or even 100x bigger, and not significantly faster. Remember the 80/20 rule and how it would apply: you'll bloat out 80% of your code to 10x or 100x its size without any noticeable difference.</p>
<p>The brilliance of what TAAS does (and any really good compiler) is doing what you describe selectively and smartly, making trade-offs that yield actual benefits as opposed to dumbly inlining everything.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

