<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>EntBlog &#187; Programming</title>
	<atom:link href="http://entland.homelinux.com/blog/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://entland.homelinux.com/blog</link>
	<description>Code, 3D, Games, Linux and much more...</description>
	<lastBuildDate>Mon, 08 Feb 2010 09:51:25 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Code Bloat Hunting</title>
		<link>http://entland.homelinux.com/blog/2010/02/06/code-bloat-hunting/</link>
		<comments>http://entland.homelinux.com/blog/2010/02/06/code-bloat-hunting/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 00:39:47 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=466</guid>
		<description><![CDATA[
I have dedicated the last few days of work to face a problem that is getting bigger and bigger. Although we have a quite modular codebase, use interfaces whenever possible to hide implementation details and apply most of the recipes recommended in classic books, our linking times  have started to be a problem in [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><img src="http://entland.homelinux.com/images/escher.jpg" alt="Image with lots of books"/></div>
<p>I have dedicated the last few days of work to a problem that keeps getting bigger. Although we have a quite modular codebase, use interfaces whenever possible to hide implementation details and apply most of the recipes recommended in <a href="http://www.amazon.com/dp/0201633620/ref=nosim?tag=ent0c-20">classic books</a>, our <strong>linking times</strong> have started to be a problem in some of our libraries. Linking times are not the only problem: our binaries are also becoming really fat.</p>
<p>After a thorough study with the help of tools like <a href="http://aras-p.info/projSizer.html">Sizer</a> and <a href="http://gameangst.com/?p=320">Symbol Sort</a> (I strongly recommend reading the articles associated with this tool, <a href="http://gameangst.com/?p=46">1</a>, <a href="http://gameangst.com/?p=226">2</a>, <a href="http://gameangst.com/?p=212">3</a>, <a href="http://gameangst.com/?p=222">4</a>, <a href="http://gameangst.com/?p=224">5</a> and <a href="http://gameangst.com/?p=246">6</a>, and passing them all on to your co-workers), we discovered a lot of code bloat generated by improper template usage. Templates are a powerful tool that can easily be misused, although I am not against using them. We use template metaprogramming in lots of places, like reflection, annotations, uri-like dependency properties, serialization and other places where it is worth it. The problem with templates is that declarations and definitions normally go in the same header file. With that structure a compiler is normally forced (although some <a href="http://gcc.gnu.org/onlinedocs/gcc/Template-Instantiation.html">compilers</a> allow customization) to generate template instances in each obj file. The redundancy is eliminated in the linking phase, which of course takes time. This is one of the faces of template bloat. The other face appears when you templatize functions when it is not strictly necessary, and you pay for it in the final size of your binaries.</p>
<p>Trying to solve all this, we applied a solution that, although not yet standard, seems to work in all the compilers we tested: exporting <a href="http://anteru.net/2008/11/19/318/">explicit template instantiations</a>. This technique applies when you have a template that is going to be instantiated for only a few known types. Those instantiations are exported from a dll. This even allows hiding big functions in the .cpp without having to show them in the header. With this solution you have the best of both worlds: short functions can be inlined by the compiler, and large functions, instead of being instantiated in each compilation unit, are referenced as an exported symbol. By the way, a <a href="http://msdn.microsoft.com/en-us/library/by56e477.aspx">special behavior</a> of Visual Studio (to me, it is a bug) prevents functions defined outside the class from being inlined when the template is exported. We had to move our definitions (which we usually keep pseudo-hidden in a .inl file) inside the class declaration.</p>
<p>We applied this strategy in several places, reducing the bloat by a considerable factor. For example, the vector library seems an ideal target because you normally instantiate it for only a few types (float, double, int), and &#8220;big&#8221; functions like Invert() for 4&#215;4 matrices can be hidden and exported from the dll. Another good candidate is std::string (instances for char and wchar_t), although, at least in MSVC, this job is already being done by the crt (not tested on other platforms).</p>
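<p>As a rough illustration of the technique, here is a minimal single-file sketch. The <em>Vector3</em> template, the file split suggested in the comments and the <em>EXPORT_API</em> macro are assumptions made up for this example, not our real code. In a real dll build the macro would expand to something like <em>__declspec(dllexport)</em> on MSVC; here it is left empty so the sketch compiles anywhere:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">// --- vector3.h (hypothetical header) ---
#include &lt;cmath&gt;

// Placeholder: empty here, typically __declspec(dllexport) when building the dll
#define EXPORT_API

template &lt;typename T&gt;
struct Vector3
{
    T x, y, z;

    // Short function: defined inside the class so the compiler can inline it
    T Dot(const Vector3&amp; v) const { return x * v.x + y * v.y + z * v.z; }

    // "Big" function: only declared here, the definition lives in the .cpp
    T Length() const;
};

// --- vector3.cpp (hypothetical source file) ---
template &lt;typename T&gt;
T Vector3&lt;T&gt;::Length() const
{
    return static_cast&lt;T&gt;(std::sqrt(static_cast&lt;double&gt;(Dot(*this))));
}

// Explicit instantiations for the few known types. Only this file
// generates code for Length(); the symbols are exported from the dll
template struct EXPORT_API Vector3&lt;float&gt;;
template struct EXPORT_API Vector3&lt;double&gt;;</pre></div></div>

<p>Client code that includes only the header can still inline <em>Dot()</em>, while a call to <em>Length()</em> on a <em>Vector3&lt;float&gt;</em> resolves to the single exported instance at link time. Any type that was not explicitly instantiated would fail to link, which is the price to pay for this approach.</p>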
<p>I hope you find this useful. Thanks for reading and welcome to all the new subscribers since my last post!</p>
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2010/02/06/code-bloat-hunting/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The Documentation Challenge</title>
		<link>http://entland.homelinux.com/blog/2009/07/28/the-documentation-challenge/</link>
		<comments>http://entland.homelinux.com/blog/2009/07/28/the-documentation-challenge/#comments</comments>
		<pubDate>Tue, 28 Jul 2009 19:58:51 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=407</guid>
		<description><![CDATA[
Documentation is usually the part of the development process where the least time and money are invested. Few resources imply a poor infrastructure for something that, as the project gets bigger and bigger, becomes a very important part of the overall process.
I want to share in this post the approach we are following for a project [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><img src="http://entland.homelinux.com/images/Documentation.jpg" alt="Image with lots of books"/></div>
<p>Documentation is usually the part of the development process where the least time and money are invested. Few resources imply a poor infrastructure for something that, as the project gets bigger and bigger, becomes a very important part of the overall process.</p>
<p>In this post I want to share the approach we are following in a project I have been involved with for over two years. This project (I will be able to give details about it very soon) is mainly a framework composed of libraries to be used by other companies. Basically, it is work by developers for developers, clearly a context where documentation becomes very important.</p>
<p><span id="more-407"></span></p>
<p>The main objectives we tried to achieve with this architecture:</p>
<ul>
<li>Documentation must be located in the code repository. This means that each development branch has its own documentation, which is integrated in the same way as the code.</li>
<li>Documentation generation must be part of the build process. We use our build pipeline for everything: compiling versions, testing versions, compiling data packages, generating installers, distributing symbols and, of course, building documentation. This means that the source of the documentation must be a very basic representation, easy to edit, that the build process later converts to pdf, doc or html. It also means that the documentation can be easily distributed to clients or even generated by the clients themselves.</li>
<li>Access to the current documentation must be integrated into the development environment. In the same way that I can review bugs in the bugtracker, check stories in the task manager and browse the source code, I must be able to read the current documentation from a web browser without having to rebuild it manually.</li>
</ul>
<p>We distinguish between two kinds of documentation: <strong>API Documentation</strong> and <strong>Technical Articles</strong>.</p>
<p><strong>API documentation</strong> comprises the documents that describe class usage: a description of each function, each parameter, each enumeration, etc. <a href="http://www.stack.nl/~dimitri/doxygen/">Doxygen</a> seems the perfect fit for this purpose. But after using it in several projects we decided against it for this one because we considered it a waste of time. Doxygen may be really useful for generating documentation for a library that is distributed without sources, but one of the pillars of our development philosophy is to always provide source code to clients. This way, they can choose how to compile it (one dll, several dlls, one lib&#8230;) and tweak the different configurations. Another good rule that we follow is that code must document itself. Doxygen would force us to write code like this:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">class</span> MemoryManager<span style="color: #008080;">:</span> <span style="color: #0000ff;">public</span> MemoryAllocator
<span style="color: #008000;">&#123;</span>
    <span style="color: #666666;">/// Allocates memory on a specified alignment boundary</span>
    <span style="color: #666666;">/// \param size Size in bytes of the block requested by the user</span>
    <span style="color: #666666;">/// \param alignment The alignment value, which must be an integer power of 2</span>
    <span style="color: #666666;">/// \return A pointer to the new allocated block of memory</span>
    <span style="color: #666666;">/// \remarks Thread-safe</span>
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> AlignedAlloc<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">size_t</span> size, <span style="color: #0000ff;">size_t</span> alignment<span style="color: #008000;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>Quite redundant, don&#8217;t you think? Obviously <em>AlignedAlloc()</em> allocates memory that is aligned on a boundary; I do not need a comment for that! The same goes for the function parameters. So we ended up writing documentation only for the parts that are really confusing or need clarification (for example, the fact that <em>AlignedAlloc()</em> is thread-safe) and using the source code itself as the documentation for both the interface and the implementation. We augment this documentation with unit tests that serve as example usage for each class. We know this decision is a controversial one, and maybe in the future we will go back to using Doxygen or something similar. But for now, it is working fine.</p>
<p>A <strong>technical article</strong> is like a manual that describes the architecture and philosophy of a specific part of the framework. Technical articles provide information at a deeper level than the mere description of classes provided by the API documentation. We usually link to technical articles from API documents.</p>
<p>Although a wiki seems an ideal candidate for writing technical articles, we rejected it because it didn&#8217;t satisfy the points enumerated above. Years ago, inspired by the <a href="http://www.gentoo.org/proj/en/gdp/">Gentoo Documentation Project</a>, I used <a href="http://www.docbook.org/">DocBook</a> for a similar purpose. I wrote about it in a previous article: <a href="http://entland.homelinux.com/blog/2006/05/02/xml-documentation/">Documentation using XML</a>. Although the results were quite satisfactory, I didn&#8217;t want to return to that solution, mainly because an XML format is hard to write, edit and maintain. I think we found a better solution: <a href="http://docutils.sourceforge.net/rst.html">reStructuredText</a>. reStructuredText is part of the DocUtils suite, and from a very simple what-you-see-is-what-you-get plaintext markup it generates quite good-looking documents. Another bonus that finally tipped the balance towards reStructuredText is that it integrates perfectly with <a href="http://trac.edgewall.org/">Trac</a>, which we have been using since the beginning. Trac is a suite that integrates a wiki, bug tracking, source code browsing and task management in a single web application. I like to see Trac as a free alternative to <a href="http://www.fogcreek.com/FogBugz/">FogBugz</a>. Trac supports both its own wiki format and reStructuredText, which allows us to integrate the documentation located in the repository into our wiki infrastructure, satisfying the last point mentioned above. Time will tell if this infrastructure will suffice.</p>
<p>Want to share how you solve the documentation issues in your project? Please do so. Comments are open. Thanks for reading.</p>
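<p>To give an idea of what a unit test used as documentation may look like, here is a small sketch. Everything in it is hypothetical and written only for this example: the reduced <em>MemoryAllocator</em> interface, the <em>AlignedFree()</em> function, the <em>MallocAllocator</em> implementation and the name of the test. It also assumes the alignment is a power of 2 not smaller than the size of a pointer:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;cstdint&gt;
#include &lt;cstdlib&gt;

// Hypothetical reduced interface, only to make the test self-contained
struct MemoryAllocator
{
    virtual ~MemoryAllocator() {}
    virtual void* AlignedAlloc(std::size_t size, std::size_t alignment) = 0;
    virtual void AlignedFree(void* ptr) = 0;
};

// Hypothetical implementation on top of malloc()
struct MallocAllocator: public MemoryAllocator
{
    virtual void* AlignedAlloc(std::size_t size, std::size_t alignment)
    {
        // Over-allocates and stores the original pointer just before the aligned block
        void* raw = std::malloc(size + alignment + sizeof(void*));
        if (raw == 0) return 0;
        std::uintptr_t start = reinterpret_cast&lt;std::uintptr_t&gt;(raw) + sizeof(void*);
        std::uintptr_t mask = static_cast&lt;std::uintptr_t&gt;(alignment) - 1;
        std::uintptr_t aligned = (start + mask) &amp; ~mask;
        reinterpret_cast&lt;void**&gt;(aligned)[-1] = raw;
        return reinterpret_cast&lt;void*&gt;(aligned);
    }

    virtual void AlignedFree(void* ptr)
    {
        if (ptr != 0) std::free(static_cast&lt;void**&gt;(ptr)[-1]);
    }
};

// The test doubles as example usage of the class
void TestAlignedAllocReturnsAlignedBlock()
{
    MallocAllocator allocator;
    void* block = allocator.AlignedAlloc(100, 64);
    assert(block != 0);
    assert(reinterpret_cast&lt;std::uintptr_t&gt;(block) % 64 == 0);
    allocator.AlignedFree(block);
}</pre></div></div>

<p>A reader who wants to know how to request and release an aligned block finds the answer in the test body, and the build fails if that answer ever becomes outdated.</p>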
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2009/07/28/the-documentation-challenge/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Compile-Time Strings</title>
		<link>http://entland.homelinux.com/blog/2009/04/28/compile-time-strings/</link>
		<comments>http://entland.homelinux.com/blog/2009/04/28/compile-time-strings/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 22:19:40 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[CodeGems]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=346</guid>
		<description><![CDATA[It would be nice if we had such a feature in the C language, wouldn&#8217;t it? The term &#8216;compile-time string&#8217; refers here to strings that are converted to unique integer identifiers at compile time. At run-time those identifiers are simple integers that can be compared and hashed very fast. In other languages, like [...]]]></description>
			<content:encoded><![CDATA[<p>It would be nice if we had such a feature in the C language, wouldn&#8217;t it? The term &#8216;compile-time string&#8217; refers here to strings that are converted to unique integer identifiers at compile time. At run-time those identifiers are simple integers that can be compared and hashed very fast. Other languages implement a similar idea; Smalltalk, for example, has the concept of Symbol. This post describes a possible implementation of this feature in C/C++.</p>
<p><span id="more-346"></span></p>
<p>Imagine, for example, a generic object factory where object instances are created using unique identifiers. The classical solution here is a header shared by all the code where all the identifiers are declared in a C enumeration. This solution creates a serious physical dependency, because adding a new identifier to the enumeration forces a recompilation of the whole project. It is also unfeasible in modular architectures where modules are isolated: in those architectures a global header is not an option.</p>
<p>One viable solution may be using strings as identifiers. But strings are heavy objects, slow to compare and prone to typing errors, because a misspelled symbol is detected at run-time instead of compile-time. Other equally insufficient solutions to this problem include <a href="http://en.wikipedia.org/wiki/FourCC">FourCC</a> and esoteric template tricks for generating a hash at compile time (don&#8217;t bother: it is not possible to solve this 100% with templates because strings cannot be used as template parameters, and in any case hashing a string is not collision-free; more information in <a href="http://www.usenet.com/newsgroups/comp.lang.c++.moderated/msg05807.html">this usenet thread</a>). Mick West proposes more solutions in his <a href="http://cowboyprogramming.com/2007/01/04/practical-hash-ids/">Practical Hash IDs</a> article.</p>
<p>What follows is an implementation that has been working nicely for me and that satisfactorily fits the requirements for simulating compile-time strings. First let me show you two examples of the usage:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">namespace</span>
<span style="color: #008000;">&#123;</span>
DECLARE_SYMBOL<span style="color: #008000;">&#40;</span>CubeMesh<span style="color: #008000;">&#41;</span>
DECLARE_SYMBOL<span style="color: #008000;">&#40;</span>SphereMesh<span style="color: #008000;">&#41;</span>
DECLARE_SYMBOL<span style="color: #008000;">&#40;</span>DuckMesh<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#125;</span>
&nbsp;
<span style="color: #0000ff;">void</span> CollectNodes<span style="color: #008000;">&#40;</span>Ptr<span style="color: #000080;">&lt;</span>Node<span style="color: #000080;">&gt;</span><span style="color: #000040;">&amp;</span> node<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    Ptr<span style="color: #000080;">&lt;</span>Mesh<span style="color: #000080;">&gt;</span> mesh0 <span style="color: #000080;">=</span> CreateObject<span style="color: #008000;">&#40;</span>S<span style="color: #008000;">&#40;</span>CubeMesh<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    node<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Add<span style="color: #008000;">&#40;</span>mesh0<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    Ptr<span style="color: #000080;">&lt;</span>Mesh<span style="color: #000080;">&gt;</span> mesh1 <span style="color: #000080;">=</span> CreateObject<span style="color: #008000;">&#40;</span>S<span style="color: #008000;">&#40;</span>SphereMesh<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    node<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Add<span style="color: #008000;">&#40;</span>mesh1<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    Ptr<span style="color: #000080;">&lt;</span>Mesh<span style="color: #000080;">&gt;</span> mesh2 <span style="color: #000080;">=</span> CreateObject<span style="color: #008000;">&#40;</span>S<span style="color: #008000;">&#40;</span>DuckMesh<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    node<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Add<span style="color: #008000;">&#40;</span>mesh2<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>


<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">namespace</span>
<span style="color: #008000;">&#123;</span>
DECLARE_SYMBOL<span style="color: #008000;">&#40;</span>FirstMessage<span style="color: #008000;">&#41;</span>
DECLARE_SYMBOL<span style="color: #008000;">&#40;</span>SecondMessage<span style="color: #008000;">&#41;</span>
DECLARE_SYMBOL<span style="color: #008000;">&#40;</span>ThirdMessage<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#125;</span>
&nbsp;
<span style="color: #0000ff;">void</span> ProcessMessage<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">const</span> Message<span style="color: #000040;">&amp;</span> msg<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>msg.<span style="color: #007788;">id</span> <span style="color: #000080;">==</span> S<span style="color: #008000;">&#40;</span>FirstMessage<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
       <span style="color: #666666;">/// ...</span>
    <span style="color: #008000;">&#125;</span>
    <span style="color: #0000ff;">else</span> <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>msg.<span style="color: #007788;">id</span> <span style="color: #000080;">==</span> S<span style="color: #008000;">&#40;</span>SecondMessage<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
       <span style="color: #666666;">/// ...</span>
    <span style="color: #008000;">&#125;</span>
    <span style="color: #0000ff;">else</span> <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>msg.<span style="color: #007788;">id</span><span style="color: #000080;">==</span> S<span style="color: #008000;">&#40;</span>ThirdMessage<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
       <span style="color: #666666;">/// ...</span>
    <span style="color: #008000;">&#125;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>A symbol represents a compile-time string. Symbols must be declared before being used. The macro for declaring a symbol hides an inline function that contains a local static variable:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #339900;">#define DECLARE_SYMBOL(id)\
    inline Symbol __GetSymbol##id() throw()\
    {\
        static size_t sym;\
        if (sym == 0)\
        {\
            sym = GetIdFromString(#id);\
        }\
        return Symbol(sym);\
    }</span></pre></div></div>

<p>The function GetIdFromString() hashes the string, stores it in an internal table and returns the table position for that string (the Symbol class is a simple wrapper around the identifier). This is done only the first time the symbol is requested; subsequent requests return the static ID. This adds a little overhead compared to using simple integers as symbols. Beware of local static initializations: they are not thread-safe. That is the reason for the manual comparison against 0. GetIdFromString() must be thread-safe for this code to work.</p>
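<p>For reference, this is one possible shape for GetIdFromString() and the Symbol wrapper. It is only a sketch under some assumptions, not the code we actually use: it swaps the hash table for a <em>std::map</em>, relies on the C++11 <em>std::mutex</em> for thread-safety, and starts numbering at 1 so that the value 0 tested by DECLARE_SYMBOL is never handed out (a symbol with ID 0 would still work, but it would be looked up again on every request):</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cstddef&gt;
#include &lt;map&gt;
#include &lt;mutex&gt;
#include &lt;string&gt;
#include &lt;utility&gt;

// Hypothetical minimal wrapper around the identifier
class Symbol
{
public:
    explicit Symbol(std::size_t id): mId(id) {}
    bool operator==(const Symbol&amp; other) const { return mId == other.mId; }
    bool operator!=(const Symbol&amp; other) const { return mId != other.mId; }

private:
    std::size_t mId;
};

// Hypothetical sketch: returns the same non-zero identifier for equal strings
inline std::size_t GetIdFromString(const char* str)
{
    static std::mutex mutex;                          // protects the table
    static std::map&lt;std::string, std::size_t&gt; table;  // string -&gt; identifier

    std::lock_guard&lt;std::mutex&gt; lock(mutex);
    std::map&lt;std::string, std::size_t&gt;::iterator it = table.find(str);
    if (it == table.end())
    {
        // New string: the identifier is its 1-based position in the table
        it = table.insert(std::make_pair(std::string(str), table.size() + 1)).first;
    }
    return it-&gt;second;
}</pre></div></div>

<p>If two threads request the same symbol for the first time simultaneously, both may call GetIdFromString(), but they receive the same identifier, so the race on the static only costs a redundant lookup.</p>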
<p>The S macro simply invokes the local function previously generated:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #339900;">#define S(id) __GetSymbol##id()</span></pre></div></div>

<p>And there you have it: compile-time strings with negligible overhead (as long as you are doing anything more than simply comparing symbols). If you need 100% efficient code, you could pre-generate a table with the symbols used by your project (searching for all DECLARE_SYMBOL blocks) and substitute each S() with a truly unique identifier generated at compile-time. That would be so easy if the preprocessor could be extended in a standard way&#8230;</p>
<p>Hope this makes sense. Thank you for reading.</p>
<p>&nbsp;</p>
<div class="hr"></div>
<ol>
<li>
<a href="http://cowboyprogramming.com/2007/01/04/practical-hash-ids/">Practical Hash IDs</a>
</li>
<li>
<a href="http://en.wikipedia.org/wiki/FourCC">FourCC</a>
</li>
<li>
<a href="http://www.usenet.com/newsgroups/comp.lang.c++.moderated/msg05807.html">Compile-time string hash generator</a>
</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2009/04/28/compile-time-strings/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Stripping comments from Shader bytecodes</title>
		<link>http://entland.homelinux.com/blog/2009/01/15/stripping-comments-from-shader-bytecodes/</link>
		<comments>http://entland.homelinux.com/blog/2009/01/15/stripping-comments-from-shader-bytecodes/#comments</comments>
		<pubDate>Thu, 15 Jan 2009 20:37:39 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[CodeGems]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=281</guid>
		<description><![CDATA[In DirectX, compiling a shader with D3DXCompileShader() returns a buffer containing the shader bytecode. Apart from the bytecode, extra content like debug and symbol table information is embedded. That extra information is added in the form of comments that can probably be eliminated, because you are already processing it at compile-time and it is [...]]]></description>
			<content:encoded><![CDATA[<p>In DirectX, compiling a shader with <strong>D3DXCompileShader()</strong> returns a buffer containing the shader bytecode. Apart from the bytecode, extra content like debug and symbol table information is embedded. That extra information is added in the form of comments that can probably be eliminated, because you are already processing it at compile-time and it is not needed at run-time when loading the shader.</p>
<p>If you can do without that information, the following code will help you save a few bytes, even halving the size of the bytecode in the best cases.</p>
<p>Although it is not documented in the DirectX SDK, this CodeGem is not an undocumented hack. The Direct3D shader code format is documented in the <a href="http://msdn.microsoft.com/en-us/library/ms800355.aspx">MSDN</a>, so it probably won&#8217;t change in future revisions of DirectX v9.0 (if there are going to be any more&#8230;)</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">D3DPtr<span style="color: #000080;">&lt;</span>ID3DXBuffer<span style="color: #000080;">&gt;</span> StripComments<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">const</span> D3DPtr<span style="color: #000080;">&lt;</span>ID3DXBuffer<span style="color: #000080;">&gt;</span><span style="color: #000040;">&amp;</span> code<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #666666;">// Calculates the new size (without comments)</span>
    <span style="color: #0000ff;">int</span><span style="color: #000040;">*</span> codeData <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span><span style="color: #0000ff;">int</span><span style="color: #000040;">*</span><span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>code<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>GetBufferPointer<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> sizeInWords <span style="color: #000080;">=</span> code<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>GetBufferSize<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #000040;">/</span> <span style="color: #0000dd;">4</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> strippedSizeInWords <span style="color: #000080;">=</span> sizeInWords<span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> sizeInWords<span style="color: #008080;">;</span> i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
        <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#40;</span>codeData<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">&amp;</span> <span style="color: #208080;">0xffff</span><span style="color: #008000;">&#41;</span> <span style="color: #000080;">==</span> D3DSIO_COMMENT<span style="color: #008000;">&#41;</span>
        <span style="color: #008000;">&#123;</span>
            <span style="color: #0000ff;">int</span> commentSize <span style="color: #000080;">=</span> codeData<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">&gt;&gt;</span> <span style="color: #0000dd;">16</span><span style="color: #008080;">;</span>
            strippedSizeInWords <span style="color: #000040;">-</span><span style="color: #000080;">=</span> <span style="color: #0000dd;">1</span> <span style="color: #000040;">+</span> commentSize<span style="color: #008080;">;</span>
            i <span style="color: #000040;">+</span><span style="color: #000080;">=</span> commentSize<span style="color: #008080;">;</span>
        <span style="color: #008000;">&#125;</span>
    <span style="color: #008000;">&#125;</span>
&nbsp;
    <span style="color: #666666;">// Creates a new buffer with the original code but omitting the comments</span>
    D3DPtr<span style="color: #000080;">&lt;</span>ID3DXBuffer<span style="color: #000080;">&gt;</span> strippedCode<span style="color: #008080;">;</span>
    V<span style="color: #008000;">&#40;</span>D3DXCreateBuffer<span style="color: #008000;">&#40;</span>strippedSizeInWords <span style="color: #000040;">*</span> <span style="color: #0000dd;">4</span>, strippedCode.<span style="color: #007788;">GetPtrForInit</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #0000ff;">int</span><span style="color: #000040;">*</span> strippedCodeData <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span><span style="color: #0000ff;">int</span><span style="color: #000040;">*</span><span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>strippedCode<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>GetBufferPointer<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">size_t</span> offset <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #0000ff;">for</span> <span style="color: #008000;">&#40;</span><span style="color: #0000ff;">unsigned</span> <span style="color: #0000ff;">int</span> i <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span> i <span style="color: #000080;">&lt;</span> sizeInWords<span style="color: #008080;">;</span> i<span style="color: #000040;">++</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
        <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span><span style="color: #008000;">&#40;</span>codeData<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000040;">&amp;</span> <span style="color: #208080;">0xffff</span><span style="color: #008000;">&#41;</span> <span style="color: #000080;">==</span> D3DSIO_COMMENT<span style="color: #008000;">&#41;</span>
        <span style="color: #008000;">&#123;</span>
            <span style="color: #0000ff;">int</span> commentSize <span style="color: #000080;">=</span> codeData<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span> <span style="color: #000080;">&gt;&gt;</span> <span style="color: #0000dd;">16</span><span style="color: #008080;">;</span>
            i <span style="color: #000040;">+</span><span style="color: #000080;">=</span> commentSize<span style="color: #008080;">;</span>
        <span style="color: #008000;">&#125;</span>
        <span style="color: #0000ff;">else</span>
        <span style="color: #008000;">&#123;</span>
            strippedCodeData<span style="color: #008000;">&#91;</span>offset<span style="color: #000040;">++</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> codeData<span style="color: #008000;">&#91;</span>i<span style="color: #008000;">&#93;</span><span style="color: #008080;">;</span>
        <span style="color: #008000;">&#125;</span>
    <span style="color: #008000;">&#125;</span>
&nbsp;
    <span style="color: #0000ff;">return</span> strippedCode<span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2009/01/15/stripping-comments-from-shader-bytecodes/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Three free productivity booster tools</title>
		<link>http://entland.homelinux.com/blog/2008/12/20/three-free-productivity-booster-tools/</link>
		<comments>http://entland.homelinux.com/blog/2008/12/20/three-free-productivity-booster-tools/#comments</comments>
		<pubDate>Sat, 20 Dec 2008 16:30:35 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=221</guid>
		<description><![CDATA[We, as programmers, see optimization opportunities everywhere, even more so when they can be applied to our main work tool, the computer. What follows is a list of three tools that will save you precious time in your daily work. To me, they have become indispensable. I hope they will become the same for you:

Launchy. [...]]]></description>
			<content:encoded><![CDATA[<p>We, as programmers, see optimization opportunities everywhere, even more so when they can be applied to our main work tool, the computer. What follows is a list of three tools that will save you precious time in your daily work. To me, they have become indispensable. I hope they will become the same for you:</p>
<ul>
<li><strong><a href="http://www.launchy.net/">Launchy</a></strong>. Launchy is a keystroke launcher that clones the behaviour of <a href="http://docs.blacktree.com/quicksilver/what_is_quicksilver">Quicksilver</a> on Mac OS. With this tool you will say goodbye to your Start menu and desktop icons. Everything is now accessible with a few keystrokes: folders, applications and even websites, all starting with a simple Alt + Space. The perfect complement for Launchy is <a href="http://www.tordex.com/startkiller/">StartKiller</a>, a tool for removing the Start button from the taskbar. To me, Launchy is as revolutionary as <a href="http://en.wikipedia.org/wiki/4DOS">4DOS</a> was back in the MS-DOS days. How much time did you save? (bonus: now that you can have a 100% clean desktop, it is time to use a decent <a href="http://night-fate.deviantart.com/art/another-world-wallpaper-VIII-93708854">wallpaper</a>&#8230;).</li>
<li><strong><a href="http://www.autohotkey.com/">AutoHotkey</a></strong>. Whatever cannot be done with a few Launchy keystrokes can most likely be programmed with an AutoHotkey macro. AutoHotkey incorporates a powerful scripting language that lets you automate almost anything: instant access to disk folders and browser tabs, activating tray programs, changing Visual Studio layouts, sending email, checking your calendar, etc.</li>
<li><strong><a href="http://zabkat.com/">xplorer²</a></strong>. xplorer² is a file manager with enough <a href="http://zabkat.com/x2facts.htm">features</a> to say goodbye to Windows Explorer (Microsoft, admit it, it is not designed for advanced file manipulation). Although there is a professional version, the free lite version is enough for me, especially its tabbed dual-pane interface. With AutoHotkey you can easily redirect Win + E to xplorer².</li>
</ul>
<p>And that closes out my small contribution to reducing wasted energy in the world <img src='http://entland.homelinux.com/blog/wp-includes/images/smilies/icon_cool.gif' alt='8-)' class='wp-smiley' /> . I am sure you can share more gems like these. Once again, thanks for reading.</p>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2008/12/20/three-free-productivity-booster-tools/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Hacker&#8217;s Delight</title>
		<link>http://entland.homelinux.com/blog/2008/11/04/hackers-delight/</link>
		<comments>http://entland.homelinux.com/blog/2008/11/04/hackers-delight/#comments</comments>
		<pubDate>Tue, 04 Nov 2008 02:56:00 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=114</guid>
		<description><![CDATA[
Hacker&#8217;s Delight
Author: Henry S. Warren, Jr.
Pages: 306
Published: 2003
You may think that I have become obsessed with books about hacking but this book is totally different from any of the others. In this book the term hacker is meant in the traditional sense (before the negative definition was popularized) of someone interested in understanding how things [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><a href="http://www.amazon.com/dp/0201914654/ref=nosim?tag=ent0c-20"><img src="http://entland.homelinux.com/images/books/HackersDelight.png" alt="Hacker's Delight book image"/></a></div>
<p><strong>Hacker&#8217;s Delight<br />
Author: Henry S. Warren, Jr.<br />
Pages: 306<br />
Published: 2003</strong></p>
<p>You may think that I have become obsessed with <a href="http://entland.homelinux.com/blog/2008/05/10/the-art-of-intrusion/">books about hacking</a>, but this book is totally different from any of the others. In this book the term hacker is meant in the traditional sense (before the negative definition was popularized) of someone interested in understanding how things work and how to solve problems efficiently. Although the term can be applied to any domain, in this book the domain is computing technology.</p>
<p>&#8216;Hacker&#8217;s Delight&#8217; is a book about bits and small programming tricks applied to machines. By &#8216;small&#8217; I mean that you won&#8217;t find a description of merge sort or radix sort here but, for example, you will learn how to determine in constant time whether an integer is a power of two.</p>
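<p>To give a taste of the kind of trick meant here, this is a minimal sketch of the classic constant-time power-of-two test. The function name IsPowerOfTwo is my own choice for illustration, not something taken from the book.</p>

```cpp
#include <stdint.h>

// Returns true when x has exactly one bit set, that is, when x is a power of two.
// x & (x - 1) clears the lowest set bit of x; for a power of two nothing is left.
// Zero is rejected explicitly because it has no bits set at all.
inline bool IsPowerOfTwo(uint32_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```

<p>No loops and no branches on the number of bits: the cost is the same for 1 as for 2147483648.</p>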
<p>In more than 300 pages, and with a mixture of pseudo-assembler and C, you will find all kinds of tricks for arithmetic bounds, counting bits, searching bits, multiplication, elementary functions, floating point, etc. Ever wondered whether the base 2 used by computers is the most efficient one? This question and many more are covered in this book.</p>
<p>Although the book is a little bit oriented towards compiler developers, every &#8216;real&#8217; programmer can get a huge benefit from reading and thoroughly understanding this book.</p>
<p>I read this book on several flights and really enjoyed this little gem of a book. I would definitely recommend having it on your bookshelf.</p>
<p><strong>Rating: 8 / 10</strong></p>
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2008/11/04/hackers-delight/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Tangential Software Usage</title>
		<link>http://entland.homelinux.com/blog/2008/09/30/tangential-software-usage/</link>
		<comments>http://entland.homelinux.com/blog/2008/09/30/tangential-software-usage/#comments</comments>
		<pubDate>Tue, 30 Sep 2008 07:30:58 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Internet]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=147</guid>
		<description><![CDATA[
As you probably know, I am working in a very small (3) team over the internet. We do not share a physical place and we have very limited resources. All the infrastructure is based on servers we have at our own home (code repository, wiki, bug tracking service, build machines, web server, backup machines, NAS servers, [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><img src="http://entland.homelinux.com/images/TangentialUsage.jpg" alt="Image for Tangential Software Usage article"/></div>
<p>As you probably know, I am working in a very small (3) team over the internet. We do not share a physical place and we have very limited resources. All the infrastructure is based on servers we have at our own homes (code repository, wiki, bug tracking service, build machines, web server, backup machines, <a href="http://entland.homelinux.com/blog/2007/07/09/building-a-nas/">NAS</a> servers, etc). As you can imagine, we try to optimize our time and bandwidth as much as possible. In this post I want to share two examples of this optimizing philosophy, with the idea of discussing them and discovering other interesting usages you may have found (if you want to share, of course).</p>
<ul>
<li>
<a href="http://twitter.com">Twitter</a>: I like to know what the rest of the team is working on. We have a weekly voice meeting, email and IM accounts, but that is not enough to know precisely what each person is working on. Twitter, a micro-blogging service you probably know, is ideal for this purpose. We have private Twitter accounts (nobody outside the team can read them) where we update our current status: developing a new package, fixing a ticket, writing documentation, meeting a client, etc. With a quick look at your Twitter account you get the status of the team.
</li>
<li>
<a href="http://www.getdropbox.com/">Dropbox</a>: I am absolutely impressed with this software. If you don&#8217;t know about it, I recommend that you have a look at its <a href="http://www.getdropbox.com/tour">tutorial</a>. The service is incredibly simple to use and it just works. We have created a Dropbox account for internal distribution and testing of our binary releases. Our build machine copies each generated distribution to a Dropbox folder. This folder is shared with our team, giving us the following advantages:
<ul>
<li>
Every member of the team has the binaries available everywhere, on every machine we want to test.
</li>
<li>
Logs for each execution are saved in the shared folder and are automatically synchronized in all the accounts. The logs give us useful information about the execution of the software that every developer can inspect.
</li>
<li>
Crashes are stored as minidumps in that same folder. And, as we distribute PDB files with our releases, everybody on the team can open any dump from any release and reproduce the exact crash conditions anywhere. For more information about symbols, read my previous article about <a href="http://entland.homelinux.com/blog/2006/07/06/setting-up-a-symbol-server/">Setting up a Symbol Server</a>.
</li>
</ul>
</li>
</ul>
<p>Do you have more interesting related ideas? Please, share them with us. </p>
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2008/09/30/tangential-software-usage/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Practical Efficient Memory Management</title>
		<link>http://entland.homelinux.com/blog/2008/08/19/practical-efficient-memory-management/</link>
		<comments>http://entland.homelinux.com/blog/2008/08/19/practical-efficient-memory-management/#comments</comments>
		<pubDate>Tue, 19 Aug 2008 01:26:39 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=112</guid>
		<description><![CDATA[
A good memory management architecture is one of those key features that can make the difference between a successful application and an unsuccessful one. For realtime applications the memory architecture becomes critical. This article shows how memory management is more than tracking where your malloc() and free() are located. Although the focus will be on [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><img src="http://entland.homelinux.com/images/memory.jpg" alt="Gamelab 2008"/></div>
<p>A good memory management architecture is one of those key features that can make the difference between a successful application and an unsuccessful one. For realtime applications the memory architecture becomes critical. This article shows how memory management is more than tracking where your malloc() and free() are located. Although the focus will be on realtime applications implemented in C, all the techniques described here can be translated to other scenarios because the concepts are language-independent.</p>
<p>The best memory management is doing no allocation at all. You should architect your software to minimize the interaction with the memory manager. In the past that was a realistic option, but in modern architectures that objective is harder to achieve. Modern requirements for realtime architectures, such as content streaming or hot loading, force us to interact frequently with the memory manager. This article describes how this can be done efficiently. The ideas provided, plus the links to other articles, will give you enough information to implement your own solution. No downloadable code is provided with this article, but if you need help implementing the ideas described here do not hesitate to contact the author.</p>
<p><span id="more-112"></span></p>
<p>The article is divided into three parts, each dedicated to a different memory management layer. From low level to high level, the first layer is the Memory Allocator. The Memory Allocator, in direct contact with the operating system, implements the memory allocation and deallocation functionality. The next layer is dedicated to Pools, or how to minimize the interaction with the Memory Allocator. The last part describes automatic memory management, or how to avoid human interaction with the memory manager.</p>
<h4>1. The Memory Allocator</h4>
<p>The Memory Allocator is a class implementation with basic memory management functionality. An interface for this class would be something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #666666;">/// MemoryAllocator. Base class for all memory allocators</span>
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">class</span> MemoryAllocator
<span style="color: #008000;">&#123;</span>
<span style="color: #0000ff;">public</span><span style="color: #008080;">:</span>
    <span style="color: #0000ff;">virtual</span> ~MemoryAllocator<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #008000;">&#123;</span><span style="color: #008000;">&#125;</span>
&nbsp;
    <span style="color: #666666;">/// \return The name of the allocator implementing this interface</span>
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">const</span> <span style="color: #0000ff;">char</span><span style="color: #000040;">*</span> GetName<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">const</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> Alloc<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">size_t</span> size<span style="color: #008000;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> Realloc<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr, <span style="color: #0000ff;">size_t</span> size<span style="color: #008000;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">void</span> Dealloc<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr<span style="color: #008000;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">/// Returns the size of a memory block allocated by this allocator</span>
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">size_t</span> SizeOf<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr<span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">const</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">/// Log statistics</span>
    <span style="color: #0000ff;">virtual</span> <span style="color: #0000ff;">void</span> DumpStats<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">const</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span></pre></div></div>

<p>The proposed interface is abstract, so a concrete implementation must be provided. You may end up removing the virtual base class in your own implementation, but keeping it is a good way to reduce compile-time dependencies. In any case, the virtual function calls will not be the bottleneck.</p>
<p>The first implementation of this interface is the obvious one: build it on the standard C functions malloc, realloc and free. The SizeOf() function can be implemented in Visual Studio with _msize. Other platforms provide similar functionality. This first implementation may be enough for you. Test it, profile it and, only if you need better performance, move on to the next step.</p>
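<p>As a rough sketch of that first implementation, the class below wraps the C runtime. The name MallocAllocator and the two counters are assumptions made for illustration. The interface is repeated so the snippet compiles on its own. Outside Visual Studio I am assuming malloc_usable_size (glibc) or malloc_size (Mac OS X) as the counterpart of _msize.</p>

```cpp
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#if defined(_MSC_VER)
    #include <malloc.h>          // _msize
#elif defined(__APPLE__)
    #include <malloc/malloc.h>   // malloc_size
#else
    #include <malloc.h>          // malloc_usable_size (glibc)
#endif

// Same interface as above, repeated to keep the sketch self-contained
class MemoryAllocator
{
public:
    virtual ~MemoryAllocator() {}
    virtual const char* GetName() const = 0;
    virtual void* Alloc(size_t size) = 0;
    virtual void* Realloc(void* ptr, size_t size) = 0;
    virtual void Dealloc(void* ptr) = 0;
    virtual size_t SizeOf(void* ptr) const = 0;
    virtual void DumpStats() const = 0;
};

// Hypothetical allocator that simply forwards to the C runtime
class MallocAllocator: public MemoryAllocator
{
public:
    MallocAllocator(): mNumAllocs(0), mNumDeallocs(0) {}

    const char* GetName() const { return "Malloc"; }

    void* Alloc(size_t size) { ++mNumAllocs; return malloc(size); }
    void* Realloc(void* ptr, size_t size) { return realloc(ptr, size); }
    void Dealloc(void* ptr) { if (ptr != 0) { ++mNumDeallocs; } free(ptr); }

    // Note: this is the usable size, which may be larger than the size requested
    size_t SizeOf(void* ptr) const
    {
#if defined(_MSC_VER)
        return _msize(ptr);
#elif defined(__APPLE__)
        return malloc_size(ptr);
#else
        return malloc_usable_size(ptr);
#endif
    }

    void DumpStats() const
    {
        printf("%s: %lu allocs, %lu deallocs\n", GetName(),
            static_cast<unsigned long>(mNumAllocs),
            static_cast<unsigned long>(mNumDeallocs));
    }

private:
    size_t mNumAllocs;
    size_t mNumDeallocs;
};
```

<p>The counters are not thread-safe. They are only there to show where statistics would be gathered.</p>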
<p>The next step starts with a piece of advice: do not implement your own allocator. Do not reinvent the wheel. Do not waste your precious time implementing something that probably others have already done.</p>
<p>What follows is a description of the properties that a good memory allocator must have. After that, links to existing implementations are provided.</p>
<p><big><strong>Properties of a good memory allocator</strong></big></p>
<p>When looking for existing memory allocator implementations you must pay attention to the following features:</p>
<ul>
<li><strong>Efficient for small block allocations</strong>: a high percentage of your allocations are for small blocks (from one byte to a few hundred bytes). Allocating these blocks must be very fast and the size overhead must be minimal. When you request N bytes from an allocator, it is normal that more than N bytes are allocated internally because of bookkeeping headers. A good small-block allocator must allocate and deallocate in O(1) time and add zero overhead. Think about it: for a small block, a common overhead of 8 bytes can be quite a lot! Is it possible to have zero bytes of overhead? Of course: by separating your small-block allocators from the general one and playing with virtual addresses, that objective is achievable.</li>
<li><strong>Efficient for multithreaded programs</strong>: unlike in the past, multithreading is no longer optional for your architecture, it is mandatory. A good memory allocator must provide smarter mechanisms than protecting each function with a lock. Using thread-local storage (TLS), the locking can be avoided.</li>
</ul>
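<p>To make the small-block property from the list above more concrete, here is a minimal sketch of a fixed-size block allocator with no per-block header. The class name FixedBlockAllocator and its interface are hypothetical. It is single-threaded and it is not what any of the allocators mentioned in this article actually do internally. The trick is that a free block stores the pointer to the next free block inside its own memory, so allocated blocks carry no bookkeeping at all.</p>

```cpp
#include <stddef.h>
#include <stdlib.h>

// Hypothetical fixed-size block allocator (single-threaded sketch)
class FixedBlockAllocator
{
public:
    FixedBlockAllocator(size_t blockSize, size_t numBlocks):
        mBlockSize(blockSize < sizeof(void*) ? sizeof(void*) : blockSize),
        mChunk(0), mFreeList(0)
    {
        // Round up so that every block can hold a correctly aligned pointer
        mBlockSize = (mBlockSize + sizeof(void*) - 1) & ~(sizeof(void*) - 1);

        mChunk = static_cast<char*>(malloc(mBlockSize * numBlocks));
        if (mChunk == 0)
        {
            return;
        }

        // Thread all the blocks into the free list, lowest address first
        for (size_t i = numBlocks; i > 0; --i)
        {
            Push(mChunk + (i - 1) * mBlockSize);
        }
    }

    ~FixedBlockAllocator() { free(mChunk); }

    // O(1): pop the head of the free list. Returns 0 when the pool is exhausted
    void* Alloc()
    {
        if (mFreeList == 0)
        {
            return 0;
        }

        void* block = mFreeList;
        mFreeList = *static_cast<void**>(block);
        return block;
    }

    // O(1): push the block back. The block must come from this allocator
    void Dealloc(void* ptr)
    {
        if (ptr != 0)
        {
            Push(ptr);
        }
    }

private:
    void Push(void* block)
    {
        *static_cast<void**>(block) = mFreeList;
        mFreeList = block;
    }

    // Not copyable: the object owns mChunk
    FixedBlockAllocator(const FixedBlockAllocator&);
    FixedBlockAllocator& operator=(const FixedBlockAllocator&);

    size_t mBlockSize;
    char* mChunk;
    void* mFreeList;
};
```

<p>A real small-block allocator would keep one such pool per size class and, to stay lock-free in the common case, one set of pools per thread.</p>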
<p>Although the implementation provided by Visual Studio has improved a lot (last version tested: Visual Studio 2005), it is not especially good at the two points described above. Small blocks carry a high overhead and the functions are protected by locks, the worst way to implement a thread-safe memory allocator.</p>
<p>Known good implementations like <strong>Doug Lea Malloc Allocator [1]</strong> are not so good in multithreaded environments. From the author himself (about thread safety): &#8220;This is not especially fast, and can be a major bottleneck. If you are using malloc in a concurrent program, consider instead using ptmalloc, which is derived from a version of this malloc. (See http://www.malloc.de).&#8221;</p>
<p>The <strong>Hoard Memory Allocator [3]</strong> is very good at the two points described before. It is specifically designed to be very fast on multiprocessor machines. It is licensed under GNU terms. If you want to use it in a commercial application, commercial licenses are offered on its website.</p>
<p>Inside the <strong>Google Performance Tools [4]</strong> there is a very efficient implementation of malloc. As described by the authors: &#8220;The fastest malloc we&#8217;ve seen&#8221;. Marketing aside, their implementation satisfies the two requirements described before.</p>
<p>You must test each implementation in your own scenario and choose the one that suits you best. Architect the memory manager in such a way that the internal allocator can be easily swapped. Your first implementation should be based on the standard malloc.</p>
<p>The allocator implementations described here have been tested in real, finished applications. There is an important point to emphasize here. Apart from the efficiency of the final application, you also want to make day-to-day development easier. During development, efficiency is not as important as robustness. The debug implementation of malloc() provided by Microsoft Visual Studio is incredibly good in this respect, detecting all kinds of memory-related errors. This can save a lot of valuable time. So, once more, being able to swap memory allocators is something you definitely want in your architecture.</p>
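<p>One possible way to get that swappable design is sketched below. Everything in it is an assumption for illustration: the interface is reduced to two functions, and SetAllocator, MallocAllocator and CountingAllocator are hypothetical names. The manager forwards every request to whichever allocator was installed, falling back to malloc when none was.</p>

```cpp
#include <stddef.h>
#include <stdlib.h>

// Reduced version of the allocator interface, to keep the sketch short
class MemoryAllocator
{
public:
    virtual ~MemoryAllocator() {}
    virtual void* Alloc(size_t size) = 0;
    virtual void Dealloc(void* ptr) = 0;
};

// Default allocator: forwards to the C runtime
class MallocAllocator: public MemoryAllocator
{
public:
    void* Alloc(size_t size) { return malloc(size); }
    void Dealloc(void* ptr) { free(ptr); }
};

// Hypothetical development allocator: counts live blocks to spot leaks
class CountingAllocator: public MemoryAllocator
{
public:
    CountingAllocator(): mLiveBlocks(0) {}

    void* Alloc(size_t size)
    {
        void* ptr = malloc(size);
        if (ptr != 0) { ++mLiveBlocks; }
        return ptr;
    }

    void Dealloc(void* ptr)
    {
        if (ptr != 0) { --mLiveBlocks; }
        free(ptr);
    }

    size_t GetLiveBlocks() const { return mLiveBlocks; }

private:
    size_t mLiveBlocks;
};

class MemoryManager
{
public:
    // Install the allocator before the first allocation: a block must be
    // released by the same allocator that created it
    static void SetAllocator(MemoryAllocator* allocator) { sAllocator = allocator; }

    static void* Alloc(size_t size) { return GetAllocator()->Alloc(size); }
    static void DeAlloc(void* ptr) { GetAllocator()->Dealloc(ptr); }

private:
    static MemoryAllocator* GetAllocator()
    {
        static MallocAllocator defaultAllocator;
        return sAllocator != 0 ? sAllocator : &defaultAllocator;
    }

    static MemoryAllocator* sAllocator;
};

MemoryAllocator* MemoryManager::sAllocator = 0;
```

<p>With this layout, switching between a fast allocator for release builds and a checking allocator for development is a single SetAllocator call at startup.</p>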
<p><big><strong>Redirecting to your allocator</strong></big></p>
<p>Having chosen an allocator the next step is redirecting all the program allocations through it. The first and most obvious solution is doing it explicitly, like the following example.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">// Allocate a block from the memory manager</span>
<span style="color: #0000ff;">void</span> <span style="color: #000040;">*</span>data <span style="color: #000080;">=</span> MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">Alloc</span><span style="color: #008000;">&#40;</span><span style="color: #0000dd;">256</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>                                    
&nbsp;
<span style="color: #666666;">// Use the block</span>
<span style="color: #666666;">// ...</span>
&nbsp;
<span style="color: #666666;">// Free the block</span>
MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">DeAlloc</span><span style="color: #008000;">&#40;</span>data<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre></div></div>

<p>And for C++ classes, the global operators new and delete must be overridden.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> operator <span style="color: #0000dd;">new</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">size_t</span> size<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">return</span> MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">Alloc</span><span style="color: #008000;">&#40;</span>size<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span>
&nbsp;
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> operator <span style="color: #0000dd;">new</span><span style="color: #008000;">&#91;</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">size_t</span> size<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    <span style="color: #0000ff;">return</span> MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">Alloc</span><span style="color: #008000;">&#40;</span>size<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span>
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">void</span> operator <span style="color: #0000dd;">delete</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">DeAlloc</span><span style="color: #008000;">&#40;</span>ptr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span>
&nbsp;
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">void</span> operator <span style="color: #0000dd;">delete</span><span style="color: #008000;">&#91;</span><span style="color: #008000;">&#93;</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr<span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>
    MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">DeAlloc</span><span style="color: #008000;">&#40;</span>ptr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span></pre></div></div>

<p>This method is insufficient, because some allocations will not be caught by it, and it is also error-prone. Users of this architecture have to remember to call these functions instead of malloc(). Apart from making the code less portable, there is a great risk of missing allocations whenever somebody forgets that rule. Furthermore, code that cannot be modified will not be redirected to your allocator. This includes all kinds of third-party libraries you may be using. For example, DirectX or OpenGL allocations will not be caught. And last but not least, this method is incompatible with third-party code that does the same thing (like overriding operator new). You will be faced with hard-to-solve linking problems here. An example of a third-party library overriding operators new and delete is MFC.</p>
<p>To overcome all these problems, platform-specific solutions are needed. Some platforms (many consoles, for example) offer clean mechanisms to catch all the memory allocations. In other, more heterogeneous platforms, like Windows PC, things are a little bit more complex. A method very similar to the one in [5] is described here for the Windows platform.</p>
<p>To catch all the memory allocations, the relevant memory functions will be &#8216;hooked&#8217;. Hooking a function consists of inserting an assembler jump instruction at the beginning of the function. This way, all the code invoking the original function will be redirected to the new implementation. In addition, the original function code must be preserved because it will still be needed. A copy of the beginning of the function, the prologue, is stored before patching it.</p>
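<p>To make the jump instruction less abstract, the sketch below builds the five bytes of a 32-bit x86 near jump (opcode 0xE9 followed by a relative displacement). The helper EncodeJump is hypothetical and is not part of any of the libraries mentioned in this article. It only fills a buffer with the bytes and does not patch any real code.</p>

```cpp
#include <stddef.h>
#include <stdint.h>

// Size in bytes of a 32-bit x86 near jump: opcode 0xE9 plus a rel32 displacement
const size_t kJumpSize = 5;

// Fills 'patch' with the bytes of "jmp target" as they must be written at
// address 'source'. Addresses are plain 32-bit integers so that the encoding
// can be checked without touching executable memory.
inline void EncodeJump(uint32_t source, uint32_t target, unsigned char patch[kJumpSize])
{
    // The displacement is relative to the instruction that follows the jump.
    // Unsigned arithmetic wraps around, which also covers backward jumps.
    const uint32_t displacement = target - (source + static_cast<uint32_t>(kJumpSize));

    patch[0] = 0xE9;

    // rel32 is stored in little-endian order
    patch[1] = static_cast<unsigned char>(displacement & 0xff);
    patch[2] = static_cast<unsigned char>((displacement >> 8) & 0xff);
    patch[3] = static_cast<unsigned char>((displacement >> 16) & 0xff);
    patch[4] = static_cast<unsigned char>((displacement >> 24) & 0xff);
}
```

<p>A real hook has more work to do around these five bytes. The prologue has to be copied in whole instructions (which is why a disassembler is needed), the target page has to be made writable (VirtualProtect on Windows) and the instruction cache should be flushed afterwards (FlushInstructionCache).</p>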
<p>Commercial libraries like <strong>Detours [6]</strong> offer the hooking mechanism described. Free alternatives like the <strong>TRET toolkit [7]</strong> or the <strong>google-perftools [4]</strong> offer the same functionality. But if you do not want to incorporate such big packages into your project you can implement your own hooking mechanism. You are going to need a disassembler to properly implement it. <strong>Libdasm [8]</strong> is a recommended package for this purpose.</p>
<p>With the hooking mechanism ready, the target functions to be hooked must now be selected. The standard C functions malloc(), calloc(), realloc() and free() must be hooked. The global operators new and delete must be hooked too. But that is not enough, because under Windows lots of allocations come directly from functions like HeapAlloc(). At least this function family must be hooked too. [9] is a good read to discover how the virtual address space of a process is populated on a Windows platform.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #666666;">// Copy prologues for original CRT memory functions</span>
OrgMalloc <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>MallocT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">malloc</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
OrgCalloc <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>CallocT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">calloc</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
OrgRealloc <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>ReallocT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">realloc</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
OrgFree <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>FreeT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">free</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
OrgMSize <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>MSizeT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span>_msize<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">// Hook CRT memory functions    </span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">0</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">malloc</span>, <span style="color: #000040;">&amp;</span>NsAlloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">calloc</span>, <span style="color: #000040;">&amp;</span>NsCalloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">realloc</span>, <span style="color: #000040;">&amp;</span>NsRealloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">3</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">free</span>, <span style="color: #000040;">&amp;</span>NsDealloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">4</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span>_msize, <span style="color: #000040;">&amp;</span>NsMSize<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">// Obtain the CRT HMODULE because we need to patch operator new/delete </span>
<span style="color: #666666;">// functions and we cannot get a pointer to them</span>
HMODULE crtModule<span style="color: #008080;">;</span>
GetModuleHandleEx<span style="color: #008000;">&#40;</span>GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS <span style="color: #000040;">|</span> 
    GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT, 
    <span style="color: #0000ff;">reinterpret_cast</span><span style="color: #000080;">&lt;</span>LPCTSTR<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span><span style="color: #0000dd;">malloc</span><span style="color: #008000;">&#41;</span>, <span style="color: #000040;">&amp;</span>crtModule<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">// operator new        </span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">5</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>crtModule, <span style="color: #FF0000;">&quot;??2@YAPAXI@Z&quot;</span>, <span style="color: #000040;">&amp;</span>NsAlloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #666666;">// operator delete</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">6</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>crtModule, <span style="color: #FF0000;">&quot;??3@YAXPAX@Z&quot;</span>, <span style="color: #000040;">&amp;</span>NsDealloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #666666;">// operator new[]</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">7</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>crtModule, <span style="color: #FF0000;">&quot;??_U@YAPAXI@Z&quot;</span>, <span style="color: #000040;">&amp;</span>NsAlloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #666666;">// operator delete[]</span>
mCrtPatches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">8</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>crtModule, <span style="color: #FF0000;">&quot;??_V@YAXPAX@Z&quot;</span>, <span style="color: #000040;">&amp;</span>NsDealloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">// Copy prologues for Windows memory functions    </span>
OrgHeapAlloc <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>HeapAllocT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span>HeapAlloc<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
OrgHeapRealloc <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>HeapReallocT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span>HeapReAlloc<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
OrgHeapFree <span style="color: #000080;">=</span> <span style="color: #0000ff;">static_cast</span><span style="color: #000080;">&lt;</span>HeapFreeT<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>CopyPrologue<span style="color: #008000;">&#40;</span><span style="color: #000040;">&amp;</span>HeapFree<span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">// Hook Windows Heap functions</span>
HMODULE hModule <span style="color: #000080;">=</span> GetModuleHandle<span style="color: #008000;">&#40;</span>NST<span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;kernel32&quot;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mW32Patches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">0</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>hModule, <span style="color: #FF0000;">&quot;HeapAlloc&quot;</span>, <span style="color: #000040;">&amp;</span>NsHeapAlloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mW32Patches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">1</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>hModule, <span style="color: #FF0000;">&quot;HeapReAlloc&quot;</span>, <span style="color: #000040;">&amp;</span>NsHeapRealloc<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
mW32Patches<span style="color: #008000;">&#91;</span><span style="color: #0000dd;">2</span><span style="color: #008000;">&#93;</span> <span style="color: #000080;">=</span> PatchFunction<span style="color: #008000;">&#40;</span>hModule, <span style="color: #FF0000;">&quot;HeapFree&quot;</span>, <span style="color: #000040;">&amp;</span>NsHeapFree<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre></div></div>

<p>Care must be taken to avoid infinite recursion. For example, the CRT memory functions are implemented on top of the Windows Heap functions, which can be a problem if your memory allocator is itself implemented with malloc and free. Another source of problems is having several versions of the CRT in the same process. This happens when distinct dynamic libraries are linked against the static library version of the CRT.</p>
<p>The recommended setup is as follows. First, link as many libraries as possible against the DLL version of the CRT. This is the CRT that the memory manager will use too. Second, hook the malloc functions of this DLL so that they are redirected to your allocator. Third, hook the HeapAlloc functions and redirect only the requests made on heaps other than the one owned by the CRT used by the memory manager. Requests coming from the main CRT are already caught by the malloc hooks.</p>
<p>This is illustrated with the following example:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
LPVOID WINAPI NsHeapAlloc<span style="color: #008000;">&#40;</span>HANDLE hHeap, DWORD dwFlags, 
    DWORD_PTR dwBytes<span style="color: #008000;">&#41;</span> 
<span style="color: #008000;">&#123;</span>        
    <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr <span style="color: #000080;">=</span> OrgHeapAlloc<span style="color: #008000;">&#40;</span>hHeap, dwFlags, dwBytes<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #0000ff;">if</span> <span style="color: #008000;">&#40;</span>hHeap <span style="color: #000040;">!</span><span style="color: #000080;">=</span> <span style="color: #0000ff;">reinterpret_cast</span><span style="color: #000080;">&lt;</span>HANDLE<span style="color: #000080;">&gt;</span><span style="color: #008000;">&#40;</span>_get_heap_handle<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span><span style="color: #008000;">&#41;</span>
    <span style="color: #008000;">&#123;</span>
        MemoryManager<span style="color: #008080;">::</span><span style="color: #007788;">Track</span><span style="color: #008000;">&#40;</span>ptr, dwBytes<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #008000;">&#125;</span>    
&nbsp;
    <span style="color: #0000ff;">return</span> ptr<span style="color: #008080;">;</span>       
<span style="color: #008000;">&#125;</span></pre></div></div>
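
<p>One way to break the recursion described above is a per-thread flag that tells a hook it is being re-entered from the tracker's own bookkeeping. The following is only a minimal sketch, assuming a C++11 compiler. ReentrancyGuard, TrackAllocation, tInsideTracker and gTrackedBytes are hypothetical names that do not belong to the code shown in this article, and the nested call merely simulates the tracker allocating memory for itself.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// Hypothetical per-thread flag: true while the tracker itself is running.
static thread_local bool tInsideTracker = false;

// Hypothetical counter standing in for the real bookkeeping.
static std::size_t gTrackedBytes = 0;

// RAII guard that marks the current thread as "inside the tracker".
class ReentrancyGuard
{
public:
    ReentrancyGuard(): mWasInside(tInsideTracker) { tInsideTracker = true; }
    ~ReentrancyGuard() { tInsideTracker = mWasInside; }

    // True when this guard is the outermost one on the current thread.
    bool IsOutermost() const { return !mWasInside; }

private:
    bool mWasInside;
};

// Called from a hook. Returns true if the request was recorded and false if
// it came from the tracker's own allocations and was therefore skipped.
bool TrackAllocation(std::size_t bytes)
{
    ReentrancyGuard guard;
    if (!guard.IsOutermost())
    {
        return false;
    }

    gTrackedBytes += bytes;

    // Simulates the tracker allocating memory for its own data structures,
    // which would re-enter the hook in a real implementation.
    bool nestedRecorded = TrackAllocation(16);
    assert(!nestedRecorded);
    (void)nestedRecorded;

    return true;
}</pre></div></div>

<p>With a guard like this, a hook can simply forward re-entrant requests to the original function without tracking them.</p>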

<p><big><strong>Memory tracking</strong></big></p>
<p>Apart from redirecting requests to your own allocator, having control of all the memory allocations allows you to implement different tracking mechanisms that can give you valuable information. Tracking should only be active in selected build configurations because it can add considerable overhead. Two examples of tracking mechanisms are briefly described:</p>
<ul>
<li><strong>Leak detector</strong>: the memory manager can keep a list of the currently allocated blocks with extra information about each one: when it was allocated, which thread requested it, in which module, the stack trace, etc. When this tracking mechanism is active, the memory manager increases each allocation by the size of a pointer (4 bytes, or 8 bytes on 64-bit platforms) to hold a pointer to this info node. When your program exits, the memory manager can report the blocks that are still allocated and are, probably, leaks. Having a leak detector active while you are building your software is an invaluable feature that will improve the quality of your architecture.
</li>
<li><strong>Memory Stats</strong>: with a tagging mechanism you can know exactly where your memory allocations are being requested. Among the tagging mechanisms, the stack-based one is the least intrusive and probably the most flexible. Whenever a new memory block is requested from the memory manager, it is attributed to the current stat block.
<p>The current stat block can be pushed / popped with macros like in the following example:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">void</span> DX9RenderSystem<span style="color: #008080;">::</span><span style="color: #007788;">Init</span><span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span>
<span style="color: #008000;">&#123;</span>  
    NS_PROFILE<span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;rendersystem&quot;</span>, Mem<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>    
&nbsp;
    mD3D <span style="color: #000080;">=</span> Direct3DCreate9<span style="color: #008000;">&#40;</span>D3D_SDK_VERSION<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>    
&nbsp;
...</pre></div></div>

<p>In the above example, all the memory allocated by the Init() function (and by all the functions called from it, including STL allocations, etc.) is accumulated in the &#8220;rendersystem&#8221; node. When the function exits, the &#8220;rendersystem&#8221; node is popped from the stack.</p>
<p>Each frame, the memory manager can collect the stats of every node and list the memory usage. For example:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">Node                            Memory         Memory <span style="color: #008000;">&#40;</span>including children<span style="color: #008000;">&#41;</span>
<span style="color: #000040;">--------------------------------------------------------------------------</span>
<span style="color: #000040;">*</span>root                  <span style="color: #0000dd;">0</span><span style="color: #000040;">%</span>      <span style="color: #0000dd;">10667320</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000dd;">56</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>    <span style="color: #0000dd;">19029395</span> <span style="color: #008000;">&#40;</span><span style="color: #0000dd;">100</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>
RenderSystem          <span style="color: #0000dd;">56</span><span style="color: #000040;">%</span>       <span style="color: #0000dd;">6381409</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000dd;">34</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>     <span style="color: #0000dd;">6386525</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000dd;">34</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>
AudioSystem           <span style="color: #0000dd;">90</span><span style="color: #000040;">%</span>       <span style="color: #0000dd;">1835510</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000dd;">10</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>     <span style="color: #0000dd;">1836646</span> <span style="color: #008000;">&#40;</span> <span style="color: #0000dd;">10</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>
reflection            <span style="color: #0000dd;">99</span><span style="color: #000040;">%</span>        <span style="color: #0000dd;">105028</span> <span style="color: #008000;">&#40;</span>  <span style="color: #0000dd;">1</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>      <span style="color: #0000dd;">106576</span> <span style="color: #008000;">&#40;</span>  <span style="color: #0000dd;">1</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>
symbols              <span style="color: #0000dd;">100</span><span style="color: #000040;">%</span>         <span style="color: #0000dd;">29300</span> <span style="color: #008000;">&#40;</span>  <span style="color: #0000dd;">0</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>       <span style="color: #0000dd;">29300</span> <span style="color: #008000;">&#40;</span>  <span style="color: #0000dd;">0</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>
factory              <span style="color: #0000dd;">100</span><span style="color: #000040;">%</span>         <span style="color: #0000dd;">10800</span> <span style="color: #008000;">&#40;</span>  <span style="color: #0000dd;">0</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>       <span style="color: #0000dd;">11352</span> <span style="color: #008000;">&#40;</span>  <span style="color: #0000dd;">0</span><span style="color: #000040;">%</span><span style="color: #008000;">&#41;</span>
<span style="color: #000040;">--------------------------------------------------------------------------</span></pre></div></div>

</li>
</ul>
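<p>The leak detector idea can be sketched in a few lines. Note that this sketch swaps the technique: instead of storing a pointer to the info node inside each block, as described above, it keeps a side table indexed by block address, which is shorter to show. LeakTracker and BlockInfo are hypothetical names, and a real tracker would also need an allocator that bypasses the hooks for its own table.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;cstdio&gt;
#include &lt;map&gt;

// Hypothetical record kept for every live block.
struct BlockInfo
{
    std::size_t size;
    const char* tag;   // e.g. module or call-site name
};

// Hypothetical leak tracker. A real one would also store the thread id and
// a stack trace for each block.
class LeakTracker
{
public:
    // Registers a block that has just been allocated.
    void OnAlloc(const void* ptr, std::size_t size, const char* tag)
    {
        BlockInfo info = { size, tag };
        mLive[ptr] = info;
    }

    // Forgets a block that has just been freed.
    void OnFree(const void* ptr)
    {
        mLive.erase(ptr);
    }

    // Prints the blocks that are still allocated and returns how many there are.
    std::size_t Report() const
    {
        for (std::map&lt;const void*, BlockInfo&gt;::const_iterator it = mLive.begin();
            it != mLive.end(); ++it)
        {
            std::printf(&quot;leak: %p, %lu bytes, tag=%s\n&quot;, it-&gt;first,
                static_cast&lt;unsigned long&gt;(it-&gt;second.size), it-&gt;second.tag);
        }
        return mLive.size();
    }

private:
    std::map&lt;const void*, BlockInfo&gt; mLive;
};</pre></div></div>

<p>Calling Report() at program exit lists whatever was never passed to OnFree().</p>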
<p>&nbsp;</p>
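<p>The stack-based tagging used for the memory stats can be sketched with a small RAII helper. This is an illustration under assumptions, not the implementation behind NS_PROFILE: MemStats and ScopedStat are hypothetical names, and the sketch is single-threaded, whereas a real manager would probably keep one stack per thread.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;map&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical stat stack: the node on top receives every tracked allocation.
class MemStats
{
public:
    MemStats() { mStack.push_back(&quot;root&quot;); }

    void Push(const std::string&amp; node) { mStack.push_back(node); }
    void Pop() { assert(mStack.size() &gt; 1); mStack.pop_back(); }

    // Called by the memory manager for each allocation.
    void OnAlloc(std::size_t bytes) { mBytes[mStack.back()] += bytes; }

    // Bytes attributed to a node so far (0 if the node was never used).
    std::size_t BytesOf(const std::string&amp; node) const
    {
        std::map&lt;std::string, std::size_t&gt;::const_iterator it = mBytes.find(node);
        return it != mBytes.end() ? it-&gt;second : 0;
    }

private:
    std::vector&lt;std::string&gt; mStack;
    std::map&lt;std::string, std::size_t&gt; mBytes;
};

// RAII helper: pushes a node on construction and pops it when the scope ends.
// A profiling macro could expand to a local variable of this kind.
class ScopedStat
{
public:
    ScopedStat(MemStats&amp; stats, const std::string&amp; node): mStats(stats)
    {
        mStats.Push(node);
    }
    ~ScopedStat() { mStats.Pop(); }

private:
    ScopedStat(const ScopedStat&amp;);
    ScopedStat&amp; operator=(const ScopedStat&amp;);

    MemStats&amp; mStats;
};</pre></div></div>

<p>Because the pop happens in the destructor, the node is removed from the stack on every exit path of the function, including early returns and exceptions.</p>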
<h4>2. Memory Pools</h4>
<p>A memory pool preallocates a number of fixed-size memory blocks from the memory allocator. The pool offers the same functionality as the memory allocator, but thanks to the fixed-size restriction it can offer several important advantages:</p>
<ul>
<li><strong>Avoiding Dynamic Memory Allocation</strong>: except for the preallocated blocks, whose memory is requested from the memory allocator when the pools are initialized, no allocation made through the Memory Pool implies a heap allocation. This results in faster operations, nearly in the same order of magnitude as stack-based allocations</li>
<li><strong>Constant time for Allocation / Deallocation</strong>: thanks to the fixed-size restriction, an optimal implementation of a Memory Pool can allocate and deallocate blocks in O(1) time</li>
<li><strong>Cache locality</strong>: all the blocks preallocated in the Memory Pool are located in a contiguous block of memory, which improves cache performance</li>
<li><strong>Minimum Memory Fragmentation</strong>: in a Memory Pool there is no external fragmentation and the internal memory fragmentation when the pool is full is insignificant (there is only internal memory fragmentation in the big block allocated at initialization time)</li>
</ul>
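<p>The O(1) behaviour comes from keeping the free blocks in a singly linked list that lives inside the blocks themselves. Here is a minimal sketch, assuming a C++11 compiler. FixedPool is a hypothetical name, not the pool class used elsewhere in this article, and the sketch is not thread-safe.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cassert&gt;
#include &lt;cstddef&gt;

// Hypothetical fixed-size pool: NumBlocks blocks of at least BlockSize bytes,
// all stored contiguously inside the pool object.
template&lt;std::size_t BlockSize, std::size_t NumBlocks&gt;
class FixedPool
{
public:
    FixedPool(): mFreeList(0), mNumFree(NumBlocks)
    {
        // Thread every block into the free list.
        for (std::size_t i = 0; i &lt; NumBlocks; ++i)
        {
            mBlocks[i].next = mFreeList;
            mFreeList = &amp;mBlocks[i];
        }
    }

    // O(1): pops the head of the free list. Returns 0 when the pool is full.
    void* Allocate()
    {
        if (mFreeList == 0) return 0;
        Block* block = mFreeList;
        mFreeList = block-&gt;next;
        --mNumFree;
        return block;
    }

    // O(1): pushes the block back on the free list.
    void Deallocate(void* ptr)
    {
        if (ptr == 0) return;
        Block* block = static_cast&lt;Block*&gt;(ptr);
        assert(block &gt;= mBlocks &amp;&amp; block &lt; mBlocks + NumBlocks);
        block-&gt;next = mFreeList;
        mFreeList = block;
        ++mNumFree;
    }

    std::size_t NumFree() const { return mNumFree; }

private:
    // A free block stores the link to the next free block in its own memory.
    union Block
    {
        Block* next;
        alignas(std::max_align_t) unsigned char data[BlockSize];
    };

    Block mBlocks[NumBlocks];
    Block* mFreeList;
    std::size_t mNumFree;
};</pre></div></div>

<p>The free list needs no extra memory because a block only stores the link while it is unused. The price is that every block is rounded up to the alignment of the union.</p>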
<p>Small objects (a few bytes) that cannot be stored on the stack should be managed by Pools. A good implementation can be found in Alexandrescu&#8217;s book [10]. To encapsulate the pool usage, class-specific operators new and delete can be defined in the following way:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #339900;">#define MANAGED_BY_POOL(Pool)\
    static void *operator new(NsSize size)\
    {\</span>
        <span style="color: #0000ff;">return</span> Pool<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Allocate<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>\
    <span style="color: #008000;">&#125;</span>\
\
    <span style="color: #0000ff;">static</span> <span style="color: #0000ff;">void</span> operator <span style="color: #0000dd;">delete</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr<span style="color: #008000;">&#41;</span>\
    <span style="color: #008000;">&#123;</span>\
        Pool<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Deallocate<span style="color: #008000;">&#40;</span>ptr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>\
    <span style="color: #008000;">&#125;</span>\
\
    <span style="color: #0000ff;">static</span> <span style="color: #0000ff;">void</span> <span style="color: #000040;">*</span>operator <span style="color: #0000dd;">new</span><span style="color: #008000;">&#40;</span>NsSize size, <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> placementPtr<span style="color: #008000;">&#41;</span>\
    <span style="color: #008000;">&#123;</span>\
        <span style="color: #0000ff;">return</span> placementPtr<span style="color: #008080;">;</span>\
    <span style="color: #008000;">&#125;</span>\
\
    <span style="color: #0000ff;">static</span> <span style="color: #0000ff;">void</span> operator <span style="color: #0000dd;">delete</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> ptr, <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span> placementPtr<span style="color: #008000;">&#41;</span>\
    <span style="color: #008000;">&#123;</span>\
        NS_UNUSED<span style="color: #008000;">&#40;</span>ptr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>\
        NS_UNUSED<span style="color: #008000;">&#40;</span>placementPtr<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>\
    <span style="color: #008000;">&#125;</span></pre></div></div>

<p>The following example implements a small struct that is transparently managed by a pool.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">struct</span> Point<span style="color: #008080;">;</span>
<span style="color: #0000ff;">extern</span> ObjectPool<span style="color: #000080;">&lt;</span>Point, <span style="color: #0000dd;">32</span><span style="color: #000080;">&gt;</span><span style="color: #000040;">*</span> gPointPool<span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">struct</span> Point
<span style="color: #008000;">&#123;</span>
    NsFloat32 x, y, z<span style="color: #008080;">;</span>
&nbsp;
    MANAGED_BY_POOL<span style="color: #008000;">&#40;</span>gPointPool<span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span></pre></div></div>

<p><big><strong>STL pools</strong></big></p>
<p>STL is the perfect place to use Memory Pools. Node-based containers (list, map, hash_map) make heavy use of dynamic memory. To avoid the costs described above, it is recommended to use Pools with STL. The memory used by an STL container is controlled by the allocator that is passed to the container when it is created. [11] is a highly recommended read that gives a good description of how to implement STL allocators. There you will learn how to code an allocator that uses a memory pool.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #0000ff;">struct</span> TrackInfo
<span style="color: #008000;">&#123;</span>
    NsSize size<span style="color: #008080;">;</span>
    MemProfilerNode<span style="color: #000040;">*</span> profilerNode<span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #0000ff;">typedef</span> std<span style="color: #008080;">::</span><span style="color: #007788;">map</span><span style="color: #000080;">&lt;</span>
    <span style="color: #0000ff;">const</span> <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span>, 
    TrackInfo, 
    std<span style="color: #008080;">::</span><span style="color: #007788;">less</span><span style="color: #000080;">&lt;</span><span style="color: #0000ff;">const</span> <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span><span style="color: #000080;">&gt;</span>, 
    PoolStlAllocator<span style="color: #000080;">&lt;</span>std<span style="color: #008080;">::</span><span style="color: #007788;">pair</span><span style="color: #000080;">&lt;</span><span style="color: #0000ff;">const</span> <span style="color: #0000ff;">void</span><span style="color: #000040;">*</span>, TrackInfo<span style="color: #000080;">&gt;</span>, <span style="color: #0000dd;">1024</span><span style="color: #000080;">&gt;</span> 
<span style="color: #000080;">&gt;</span> ExternalTrack<span style="color: #008080;">;</span>
&nbsp;
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
ExternalTrack mExternalTrack<span style="color: #008080;">;</span></pre></div></div>
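
<p>For readers who want to see what such an allocator might look like, here is a minimal sketch written against the C++11 allocator requirements. It is not the PoolStlAllocator used above: PoolAllocator and NodePool are hypothetical names, the pool is a per-type singleton that grows in chunks and never returns memory, and nothing here is thread-safe.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;">#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;list&gt;
#include &lt;new&gt;

// Hypothetical free-list pool shared by every allocator of the same type T.
template&lt;class T, std::size_t ChunkSize&gt;
class NodePool
{
public:
    static NodePool&amp; Instance()
    {
        static NodePool pool;
        return pool;
    }

    // Pops a slot from the free list, growing the pool when it is empty.
    void* Allocate()
    {
        if (mFreeList == 0) Grow();
        Slot* slot = mFreeList;
        mFreeList = slot-&gt;next;
        return slot;
    }

    // Pushes the slot back on the free list.
    void Deallocate(void* ptr)
    {
        Slot* slot = static_cast&lt;Slot*&gt;(ptr);
        slot-&gt;next = mFreeList;
        mFreeList = slot;
    }

private:
    union Slot
    {
        Slot* next;
        alignas(T) unsigned char data[sizeof(T)];
    };

    NodePool(): mFreeList(0) {}

    // Requests ChunkSize slots at once. Chunks are never freed in this sketch.
    void Grow()
    {
        Slot* chunk = static_cast&lt;Slot*&gt;(::operator new(sizeof(Slot) * ChunkSize));
        for (std::size_t i = 0; i &lt; ChunkSize; ++i)
        {
            chunk[i].next = mFreeList;
            mFreeList = &amp;chunk[i];
        }
    }

    Slot* mFreeList;
};

// Minimal allocator: single elements come from the pool, arrays from the heap.
template&lt;class T, std::size_t ChunkSize = 1024&gt;
struct PoolAllocator
{
    typedef T value_type;

    // Needed explicitly because of the non-type template parameter.
    template&lt;class U&gt; struct rebind { typedef PoolAllocator&lt;U, ChunkSize&gt; other; };

    PoolAllocator() {}
    template&lt;class U&gt; PoolAllocator(const PoolAllocator&lt;U, ChunkSize&gt;&amp;) {}

    T* allocate(std::size_t n)
    {
        if (n == 1) return static_cast&lt;T*&gt;(NodePool&lt;T, ChunkSize&gt;::Instance().Allocate());
        return static_cast&lt;T*&gt;(::operator new(n * sizeof(T)));
    }

    void deallocate(T* ptr, std::size_t n)
    {
        if (n == 1) NodePool&lt;T, ChunkSize&gt;::Instance().Deallocate(ptr);
        else ::operator delete(ptr);
    }
};

// All instances share the per-type pool, so any two allocators compare equal.
template&lt;class T, class U, std::size_t N&gt;
bool operator==(const PoolAllocator&lt;T, N&gt;&amp;, const PoolAllocator&lt;U, N&gt;&amp;) { return true; }

template&lt;class T, class U, std::size_t N&gt;
bool operator!=(const PoolAllocator&lt;T, N&gt;&amp;, const PoolAllocator&lt;U, N&gt;&amp;) { return false; }</pre></div></div>

<p>A node-based container such as std::list&lt;int, PoolAllocator&lt;int, 64&gt; &gt; rebinds the allocator to its internal node type, so every node ends up in the pool of that node type.</p>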

<h4>3. Garbage Collection and Smart Pointers</h4>
<p>The less interaction you have with the memory manager, the better. Memory management errors such as memory leaks and buffer overruns are probably among the worst problems in software engineering. That is why modern languages like Java, Python and C# try to avoid exposing memory management to programmers by using garbage collection techniques.</p>
<p>In C++ there is no automatic garbage collection, but a form of it can be implemented using object reference counters and smart pointers. This way, manual new and delete calls can be avoided, reducing the probability of memory errors. Alexandrescu&#8217;s book [12] covers all the details of implementing smart pointers.</p>
<p>To implement the reference counting mechanism, each class must derive from a base class that defines an integer counter and the public functions AddReference() and Release() to manage it. The Release() function is in charge of deleting the instance when its counter reaches zero. An intrusive implementation, where the counter is stored in the object itself, is the recommended one for reasons of efficiency.</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #0000ff;">class</span> BaseRefCounted<span style="color: #008080;">:</span> <span style="color: #0000ff;">public</span> BaseObject
<span style="color: #008000;">&#123;</span>
<span style="color: #0000ff;">public</span><span style="color: #008080;">:</span>
    <span style="color: #666666;">/// Default constructor.</span>
    BaseRefCounted<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">/// Increments reference count</span>
    <span style="color: #666666;">/// \returns Number of references after incrementing one</span>
    <span style="color: #0000ff;">int</span> AddReference<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">const</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">/// Decrements reference count, deleting the object when reaches 0</span>
    <span style="color: #666666;">/// \returns Number of references after releasing one</span>
    <span style="color: #0000ff;">int</span> Release<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">const</span><span style="color: #008080;">;</span>
&nbsp;
    <span style="color: #666666;">/// Gets current reference count for the object</span>
    <span style="color: #666666;">/// \return Object number of references</span>
    <span style="color: #0000ff;">int</span> GetNumReferences<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #0000ff;">const</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #0000ff;">protected</span><span style="color: #008080;">:</span>
    <span style="color: #666666;">/// Destructor. Base classes are abstract classes. Destructor is pure </span>
    <span style="color: #666666;">/// virtual. This destructor is declared protected to avoid deleting </span>
    <span style="color: #666666;">/// reference counted objects. Release() should be used in this case.</span>
    <span style="color: #0000ff;">virtual</span> ~BaseRefCounted<span style="color: #008000;">&#40;</span><span style="color: #008000;">&#41;</span> <span style="color: #000080;">=</span> <span style="color: #0000dd;">0</span><span style="color: #008080;">;</span>
&nbsp;
<span style="color: #0000ff;">private</span><span style="color: #008080;">:</span>
    <span style="color: #666666;">/// Base classes are non-copyable objects</span>
    <span style="color: #666666;">//@{</span>
    BaseRefCounted<span style="color: #008000;">&#40;</span><span style="color: #0000ff;">const</span> BaseRefCounted<span style="color: #000040;">&amp;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    BaseRefCounted<span style="color: #000040;">&amp;</span> operator<span style="color: #000080;">=</span><span style="color: #008000;">&#40;</span><span style="color: #0000ff;">const</span> BaseRefCounted<span style="color: #000040;">&amp;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
    <span style="color: #666666;">//@}</span>
&nbsp;
    <span style="color: #0000ff;">volatile</span> mutable <span style="color: #0000ff;">int</span> mRefCount<span style="color: #008080;">;</span>
<span style="color: #008000;">&#125;</span><span style="color: #008080;">;</span></pre></div></div>

<p>There are alternatives to reference counting (read [13] for a mark-sweep algorithm), but such a sophisticated algorithm is probably not worth the effort if your program mixes C++ code with a high-level scripting language where garbage collection is implemented natively.</p>
<p>The code becomes a lot cleaner and safer when using smart pointers:</p>

<div class="wp_syntax"><div class="code"><pre class="cpp" style="font-family:monospace;"><span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
<span style="color: #666666;">/// d0 and d1 are automatically destroyed when no code references them</span>
<span style="color: #666666;">////////////////////////////////////////////////////////////////////////////</span>
Ptr<span style="color: #000080;">&lt;</span>IStream<span style="color: #000080;">&gt;</span> d0 <span style="color: #000080;">=</span> fileSystem<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>OpenRead<span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;Dat/Stats/s0.dat&quot;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
d0<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Read<span style="color: #008000;">&#40;</span>buff, <span style="color: #0000dd;">512</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
&nbsp;
Ptr<span style="color: #000080;">&lt;</span>IStream<span style="color: #000080;">&gt;</span> d1 <span style="color: #000080;">=</span> fileSystem<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>OpenRead<span style="color: #008000;">&#40;</span><span style="color: #FF0000;">&quot;Dat/Stats/s0.dat&quot;</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span>
d1<span style="color: #000040;">-</span><span style="color: #000080;">&gt;</span>Read<span style="color: #008000;">&#40;</span>buff, <span style="color: #0000dd;">512</span><span style="color: #008000;">&#41;</span><span style="color: #008080;">;</span></pre></div></div>
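
<p>To make the mechanism concrete, here is a minimal sketch of how an intrusive smart pointer can cooperate with a reference-counted base class. It is not the engine's actual code: <code>SketchRefCounted</code>, <code>SketchPtr</code>, <code>AddReference</code>, <code>Release</code>, <code>GetNumReferences</code> and <code>Tracked</code> are hypothetical names chosen for illustration, and the counter here is a plain <code>int</code>, so the sketch is not thread-safe.</p>

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted base class. The method
// names are assumptions made for this sketch, not the article's API.
class SketchRefCounted
{
public:
    SketchRefCounted() : mRefCount(0) {}
    virtual ~SketchRefCounted() {}

    void AddReference() const { ++mRefCount; }

    // Destroys the object when the last reference goes away.
    void Release() const
    {
        assert(mRefCount > 0);
        if (--mRefCount == 0)
        {
            delete this;
        }
    }

    int GetNumReferences() const { return mRefCount; }

private:
    // Non-copyable: copying an object must not copy its counter.
    SketchRefCounted(const SketchRefCounted&);
    SketchRefCounted& operator=(const SketchRefCounted&);

    // Plain int: this sketch is single-threaded only.
    mutable int mRefCount;
};

// Hypothetical intrusive smart pointer: it only forwards to the counter
// stored inside the pointed-to object.
template <class T>
class SketchPtr
{
public:
    SketchPtr(T* ptr = 0) : mPtr(ptr) { if (mPtr) mPtr->AddReference(); }
    SketchPtr(const SketchPtr& other) : mPtr(other.mPtr) { if (mPtr) mPtr->AddReference(); }
    ~SketchPtr() { if (mPtr) mPtr->Release(); }

    SketchPtr& operator=(const SketchPtr& other)
    {
        // Add the new reference before releasing the old one so that
        // self-assignment never drops the count to zero.
        if (other.mPtr) other.mPtr->AddReference();
        if (mPtr) mPtr->Release();
        mPtr = other.mPtr;
        return *this;
    }

    T* operator->() const { return mPtr; }
    T* GetPtr() const { return mPtr; }

private:
    T* mPtr;
};

// Helper type that tracks how many instances are currently alive.
class Tracked : public SketchRefCounted
{
public:
    Tracked() { ++sAlive; }
    ~Tracked() { --sAlive; }
    static int sAlive;
};
int Tracked::sAlive = 0;

// Returns the number of live objects once every smart pointer left scope.
int LiveObjectsAfterScope()
{
    {
        SketchPtr<Tracked> a(new Tracked);
        SketchPtr<Tracked> b = a; // two references, one object
        assert(a->GetNumReferences() == 2);
        assert(Tracked::sAlive == 1);
    }
    return Tracked::sAlive;
}
```

<p>Storing the counter inside the object keeps the smart pointer the size of a raw pointer and avoids a separate allocation for the count, at the price of requiring every managed class to derive from the counted base.</p>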

<h4>4. Conclusion</h4>
<p>This article described three memory management layers: the memory allocator, the pool allocator and the garbage collector. Although memory management is an important part of the architecture, using it must be as transparent as possible to the rest of the code. This is the principle behind the layer subdivision proposed in this article.</p>
<p>Thanks for reading. Comments are open for discussing anything related to the article: bugs, suggestions, improvements, questions&#8230;<br />
&nbsp;</p>
<div class="hr"></div>
<ol>
<li>
<a href="http://g.oswego.edu/dl/html/malloc.html">Doug Lea Memory Allocator (dlmalloc)</a>
</li>
<li>
<a href="http://www.malloc.de/">Wolfram Gloger&#8217;s malloc homepage</a>
</li>
<li>
<a href="http://www.hoard.org/">The Hoard Memory Allocator</a>
</li>
<li>
<a href="http://code.google.com/p/google-perftools/">google-perftools. Fast, multi-threaded malloc() and nifty performance analysis tools</a>
</li>
<li>
<a href="http://www.gamasutra.com/view/feature/1430/monitoring_your_pcs_memory_usage_.php">Monitoring Your PC&#8217;s Memory Usage For Game Development</a>
</li>
<li>
<a href="http://research.microsoft.com/sn/detours/">Detours instrumenting library homepage</a>
</li>
<li>
<a href="http://mconover.openrce.org/">TRET: The Reverse Engineering Toolkit</a>
</li>
<li>
<a href="http://www.nologin.org/main.pl?action=codeView&#038;codeId=49">libdasm: simple x86 disassembly library</a>
</li>
<li>
<a href="http://download.microsoft.com/download/e/3/c/e3c25fea-2b53-4174-8729-29a4ec16583b/Why%20Your%20Windows%20Game%20Won%27t%20Run%20In%202,147,352,576%20Bytes.zip">Why Your Windows Game Won&#8217;t Run In 2,147,352,576 Bytes</a>
</li>
<li>
<a href="http://loki-lib.sourceforge.net/index.php?n=Idioms.SmallObject">Loki Library &#8211; Small Object Implementation</a>
</li>
<li>
<a href="http://www.tantalon.com/pete/customallocators.ppt">Pete Isensee on Custom STL Allocators</a>
</li>
<li>
<a href="http://www.amazon.com/dp/0201704315/ref=nosim?tag=ent0c-20">Modern C++ Design</a>
</li>
<li>
<a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">A garbage collector for C and C++</a>
</li>
</ol>
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2008/08/19/practical-efficient-memory-management/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
		</item>
		<item>
		<title>GameLab 2008</title>
		<link>http://entland.homelinux.com/blog/2008/06/27/gamelab-2008/</link>
		<comments>http://entland.homelinux.com/blog/2008/06/27/gamelab-2008/#comments</comments>
		<pubDate>Fri, 27 Jun 2008 00:01:21 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Videogames]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/?p=111</guid>
		<description><![CDATA[
Oviedo will hold the fourth edition of the GameLab conferences on July 10th &#8211; 11th. Undoubtedly the place to be if you want to meet lots of interesting people related to the videogame industry. This event grows bigger each time and is becoming a point of reference here in Spain.
Like the last [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><img src="http://entland.homelinux.com/images/GameLab2008.jpg" alt="Gamelab 2008"/></div>
<p>Oviedo will hold the fourth edition of the <a href="http://www.gamelab.es/">GameLab</a> conferences on July 10th &#8211; 11th. Undoubtedly the place to be if you want to meet lots of interesting people related to the videogame industry. This event grows bigger each time and is becoming a point of reference here in Spain.</p>
<p>Like <a href="http://entland.homelinux.com/blog/2007/06/26/see-you-in-gamelaboviedo/">last year</a>, I will be giving a course on advanced real-time 3D techniques on the first day.</p>
<p>Hope to see you there! <img src='http://entland.homelinux.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2008/06/27/gamelab-2008/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Teaching at Oviedo &#8211; Noesis Engine</title>
		<link>http://entland.homelinux.com/blog/2008/04/23/teaching-at-oviedo-noesis-engine/</link>
		<comments>http://entland.homelinux.com/blog/2008/04/23/teaching-at-oviedo-noesis-engine/#comments</comments>
		<pubDate>Wed, 23 Apr 2008 10:23:17 +0000</pubDate>
		<dc:creator>ent</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Videogames]]></category>
		<category><![CDATA[codepixel]]></category>
		<category><![CDATA[GameLab]]></category>
		<category><![CDATA[Noesis Engine]]></category>
		<category><![CDATA[Oviedo University]]></category>

		<guid isPermaLink="false">http://entland.homelinux.com/blog/2008/04/23/teaching-at-oviedo-noesis-engine/</guid>
		<description><![CDATA[
This weekend I finished the course I have been giving at Oviedo University. The course is about programming graphics engines for videogames. In 30 hours / 6 days I tried to explain how to architect a solid engine for real-time purposes.
This is the first time I talk about the task I have been involved [...]]]></description>
			<content:encoded><![CDATA[<div class="img-shadow"><img src="http://entland.homelinux.com/images/Oviedo2008.jpg" alt="Jesus teaching at Oviedo University"/></div>
<p>This weekend I finished the course I have been giving at Oviedo University. The course is about programming graphics engines for videogames. In 30 hours / 6 days I tried to explain how to architect a solid engine for real-time purposes.</p>
<p>This is the first time I have talked about the project I have been involved in over the last few months: <strong>Noesis Engine</strong> (a provisional name). Until now, it has been developed by a very small team and has contributed to two commercial products. A small videogame is under construction now. I expect to give more information about this in the future.</p>
<p>A link to the first session of the course: <a href="http://entland.homelinux.com/blog/wp-content/articledata/Noesis-Core.7z">Noesis &#8211; Core</a>. The document does not reveal much if you are not attending the class, but maybe you will find something interesting there (or something wrong, and we can discuss it). The first part is a global introduction to the course; the second one is about the core technology used for the rest of the course. The document is in Spanish and I have no time to translate it right now (I would be really grateful to any volunteer willing to help with this). Sorry for that.</p>
<p>Continuing with Spanish documents, I contributed to several tutorials on <a href="http://www.codepixel.com/">codepixel</a> (a mandatory daily read if you understand Spanish) about the same topic: real-time graphics. The hard part was done by <a href="http://www.derethor.net/">Javier Loureiro / derethor</a>. <a href="http://rgba.scenesp.org/iq/">Iq / RGBA</a> contributed to these documents too. The tutorials:</p>
<ul>
<li><a href="http://www.codepixel.com/content/view/5135/34/">Introduction</a></li>
<li><a href="http://www.codepixel.com/content/view/5136/34/">The Graphic Driver</a></li>
<li><a href="http://www.codepixel.com/content/view/5137/34/">The Scene Graph</a></li>
</ul>
<p>And nothing more for today. As you can see I am still alive and working really hard. <img src='http://entland.homelinux.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><strong>UPDATE:</strong> Thanks to <a href="http://www.crazypointer.blogspot.com/">Ricardo Amores</a> and <a href="http://blogdemiguel.es/">Miguel Herrero</a> for translating the PowerPoint presentation into English. It can be downloaded from: <a href="http://entland.homelinux.com/blog/wp-content/articledata/Noesis-Core-Eng.7z">Noesis &#8211; Core &#8211; Eng</a></p>
<div class="clearer"></div>
]]></content:encoded>
			<wfw:commentRss>http://entland.homelinux.com/blog/2008/04/23/teaching-at-oviedo-noesis-engine/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
