




<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Sinbadsoft</title>
	<atom:link href="http://www.sinbadsoft.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.sinbadsoft.com</link>
	<description>Software Wizards</description>
	<lastBuildDate>Mon, 13 May 2013 08:46:02 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Sorting big files using k-way merge sort</title>
		<link>http://www.sinbadsoft.com/blog/sorting-big-files-using-k-way-merge-sort/</link>
		<comments>http://www.sinbadsoft.com/blog/sorting-big-files-using-k-way-merge-sort/#comments</comments>
		<pubDate>Tue, 05 Feb 2013 20:43:36 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Code Kata]]></category>

		<guid isPermaLink="false">http://www.sinbadsoft.com/?p=1131</guid>
		<description><![CDATA[In this post I am sharing a code kata we did recently at Sinbadsoft: sorting large files with limited memory. By large files I mean files that would not fit in the available amount of RAM. A regular sort algorithm will not perform well in this case as the whole data cannot be loaded at [...]]]></description>
				<content:encoded><![CDATA[<p>In this post I am sharing a code kata we did recently at <a href="/">Sinbadsoft</a>: sorting large files with limited memory. By large files I mean files that would not fit in the available amount of RAM. A regular sort algorithm will not perform well in this case as the whole data cannot be loaded at once into memory. The solution to this classic problem is <a href="http://en.wikipedia.org/wiki/External_sorting" target="_blank">external sorting</a>. It consists of two steps: first, split the file into small chunks that fit in memory, load each chunk, sort it, and write it back to disk. Second, perform a k-way merge on all the sorted chunks to get the final result.<br />
<span id="more-1131"></span><br />
In order to simplify the problem, we limited ourselves to the following conditions:</p>
<ul>
<li>We only deal with text files as input. We sort file lines using string comparison.</li>
<li>Lines have a &#8220;reasonable&#8221; maximum length and consistent line endings.</li>
</ul>
<p>We will be using C# here but it wouldn&#8217;t be much different in Java.</p>
<h1>Split the big file in small sorted chunks</h1>
<p>The first step is quite easy: we open a stream reader on the big file and keep reading until we have enough data in our buffer to fill a file chunk. When the buffer is full, we sort the data in memory and write it to a temp file on disk. We continue until the whole input file is processed. As a result, we have K chunk files on disk. The SplitInSortedChunks method below implements this algorithm. It returns the collection of chunk file paths as a result.</p>
<pre class="brush: csharp; title: ; notranslate">
IEnumerable&lt;string&gt; SplitInSortedChunks(string filepath, long chunkSize)
{
    var buffer = new List&lt;string&gt;();
    var size = 0L;

    using (var reader = new StreamReader(filepath))
    for (string line = reader.ReadLine(); line != null; line = reader.ReadLine())
    {
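        // Skip the flush when the buffer is empty, so that a single oversized line never produces an empty chunk file
        if (buffer.Count &gt; 0 &amp;&amp; size + line.Length + 2 &gt;= chunkSize)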
        {
            size = 0L;
            yield return FlushBuffer(buffer);
        }

        size += line.Length + 2;
        buffer.Add(line);
    }

    if (buffer.Any())
    {
        yield return FlushBuffer(buffer);
    }
}
</pre>
<p>The magic number 2 used to increment the current size is the assumed length of the line ending used in the input file. It&#8217;s probably more reliable here to use Environment.NewLine.Length or to pass this value as a parameter.</p>
<p>The FlushBuffer method implementation is quite simple:</p>
<pre class="brush: csharp; title: ; notranslate">
string FlushBuffer(List&lt;string&gt; buffer)
{
    buffer.Sort(StringComparer.Ordinal);
    var chunkFilePath = Path.GetTempFileName();
    File.WriteAllLines(chunkFilePath, buffer);
    buffer.Clear();
    return chunkFilePath;
}
</pre>
<h1>K-way merge</h1>
<p>Now that we have K small files filled with sorted data, we are going to merge them in order to build our final result file. To perform the merge, we first need to open K file streams, one for each chunk file. Then, we read one line per file and take the smallest line each time, replacing it with the next line from the same chunk. We repeat this operation until we finish reading all the chunks. In order to avoid doing a lot of string comparisons at each iteration, we keep the current line values in an ordered structure.</p>
<p>The algorithm is quite straightforward. Still, we made an interesting mistake in our first implementation that I am going to share with you here.</p>
<h2>A buggy implementation</h2>
<p>Our K-way merge function takes the chunk file paths and the desired result file path as parameters. In order to save space on disk, we can use the input file path as output path; but this is not mandatory.</p>
<pre class="brush: csharp; title: ; notranslate">
private static void KwayMerge(IEnumerable&lt;string&gt; chunkFilePaths, string resultFilePath)
{
    var chunkReaders = chunkFilePaths
        .Select(path =&gt; new StreamReader(path))
        .Where(chunkReader =&gt; !chunkReader.EndOfStream)
        .ToList();

    // BUG: SortedDictionary won't cut it here
    var sortedDict = new SortedDictionary&lt;string, TextReader&gt;();
    chunkReaders.ForEach(chunkReader =&gt; sortedDict.Add(chunkReader.ReadLine(), chunkReader));

    using (var resultWriter = new StreamWriter(resultFilePath, false))       
    while (sortedDict.Any())
    {
        var line = sortedDict.Keys.First();
        var chunkReader = sortedDict[line];
        sortedDict.Remove(line);

        resultWriter.WriteLine(line);

        var nextLine = chunkReader.ReadLine();
        if (nextLine != null)
        {
            sortedDict.Add(nextLine, chunkReader);
        }
        else
        {
            chunkReader.Dispose();
        }
    }
}
</pre>
<p>This implementation worked fine until we tested it with files with duplicate lines. It crashed complaining about duplicate keys in the <a href="http://msdn.microsoft.com/en-us/library/f7fta44c.aspx" target="_blank">SortedDictionary</a>. Ouch! Notice that we would&#8217;ve had the same problem with <a href="http://msdn.microsoft.com/en-us/library/ms132319.aspx" target="_blank">SortedList</a> or <a href="http://msdn.microsoft.com/en-us/library/dd412070.aspx" target="_blank">SortedSet</a>. We need a sorted multiset structure here to fix the bug.</p>
<h2>A better implementation</h2>
<p>In order to fix the problem we have with duplicate lines, we are going to use a <a href="http://en.wikipedia.org/wiki/Priority_queue" target="_blank">priority queue</a> (based on a max <a href="http://en.wikipedia.org/wiki/Heap_(data_structure)" target="_blank">heap</a>) as a sorted multiset. Since the queue is based on a max heap, we pass it a negated ordinal comparison so that dequeuing returns the smallest line first. We have already discussed the implementation of priority queue and heap data structures in a <a href="/blog/binary-heap-heap-sort-and-priority-queue/">previous post</a> (the implementation is also on <a href="https://github.com/Sinbadsoft/Sinbadsoft.Lib.Collections" target="_blank">github</a> and available as a <a href="http://nuget.org/packages/Sinbadsoft.Lib.Collections/" target="_blank">nuget</a> package).</p>
<pre class="brush: csharp; title: ; notranslate">
void KwayMerge(IEnumerable&lt;string&gt; chunkFilePaths, string resultFilePath)
{
    var chunkReaders = chunkFilePaths
        .Select(path =&gt; new StreamReader(path))
        .Where(chunkReader =&gt; !chunkReader.EndOfStream)
        .ToList();

    var queue = new PriorityQueue&lt;string,TextReader&gt;((x,y) =&gt; -string.CompareOrdinal(x,y));
    chunkReaders.ForEach(chunkReader =&gt; queue.Enqueue(chunkReader.ReadLine(), chunkReader));
    
    using (var resultWriter = new StreamWriter(resultFilePath, false))
    while (queue.Count &gt; 0)
    {
        var smallest = queue.Dequeue();
        var line = smallest.Key;
        var chunkReader = smallest.Value;

        resultWriter.WriteLine(line);

        var nextLine = chunkReader.ReadLine();
        if (nextLine != null)
        {
            queue.Enqueue(nextLine, chunkReader);
        }
        else
        {
            chunkReader.Dispose();
        }
    }
}
</pre>
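<p>The <em>PriorityQueue</em> class above comes from an external library, so here is a minimal, self-contained sketch of the same merge logic. It is written in JavaScript so that it can be pasted into a browser console, and it merges in-memory arrays instead of chunk files. <em>MinHeap</em> and <em>kwayMerge</em> are names made up for this illustration; they are not the API of the library used above.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical binary min-heap of { line, source, position } entries, ordered by line.
// Equal lines are allowed, which is exactly what SortedDictionary could not handle.
function MinHeap() {
    this.items = [];
}

MinHeap.prototype.swap = function (i, j) {
    var tmp = this.items[i];
    this.items[i] = this.items[j];
    this.items[j] = tmp;
};

MinHeap.prototype.push = function (entry) {
    var i = this.items.length;
    this.items.push(entry);
    // Sift up while the new entry is smaller than its parent.
    while (i &gt; 0) {
        var parent = Math.floor((i - 1) / 2);
        if (this.items[parent].line &lt;= this.items[i].line) { break; }
        this.swap(parent, i);
        i = parent;
    }
};

MinHeap.prototype.pop = function () {
    var top = this.items[0];
    var last = this.items.pop();
    if (this.items.length &gt; 0) {
        this.items[0] = last;
        // Sift down until neither child is smaller than the current entry.
        var i = 0;
        while (true) {
            var smallest = i;
            var left = 2 * i + 1;
            var right = left + 1;
            if (left &lt; this.items.length &amp;&amp; this.items[left].line &lt; this.items[smallest].line) { smallest = left; }
            if (right &lt; this.items.length &amp;&amp; this.items[right].line &lt; this.items[smallest].line) { smallest = right; }
            if (smallest === i) { break; }
            this.swap(i, smallest);
            i = smallest;
        }
    }
    return top;
};

// Merges already sorted arrays of strings into one sorted array.
function kwayMerge(chunks) {
    var heap = new MinHeap();
    chunks.forEach(function (chunk, index) {
        if (chunk.length &gt; 0) {
            heap.push({ line: chunk[0], source: index, position: 0 });
        }
    });

    var result = [];
    while (heap.items.length &gt; 0) {
        var smallest = heap.pop();
        result.push(smallest.line);
        // Refill the heap from the chunk the smallest line came from.
        var chunk = chunks[smallest.source];
        var next = smallest.position + 1;
        if (next &lt; chunk.length) {
            heap.push({ line: chunk[next], source: smallest.source, position: next });
        }
    }
    return result;
}

kwayMerge([['a', 'c'], ['a', 'b']]); // ['a', 'a', 'b', 'c']
</pre>
<p>Unlike the <em>SortedDictionary</em> version, pushing the same line twice is harmless here: a heap only orders its entries and never treats them as unique keys.</p>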
<h1>Conclusion</h1>
<p>Now that we have our functions to split the input file in chunks and to do the k-way merge, we simply need to chain them in order to sort.</p>
<pre class="brush: csharp; title: ; notranslate">
void SortFile(string filepath, string resultFilePath, long chunkSize)
{
    var chunkFilePaths = SplitInSortedChunks(filepath, chunkSize);
    KwayMerge(chunkFilePaths, resultFilePath);
}
</pre>
<p>That&#8217;s it for this code kata! If you have comments or suggestions on how to improve this algorithm don&#8217;t hesitate to comment below.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/sorting-big-files-using-k-way-merge-sort/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Backbone.js by example &#8211; Part 1</title>
		<link>http://www.sinbadsoft.com/blog/backbone-js-by-example-part-1/</link>
		<comments>http://www.sinbadsoft.com/blog/backbone-js-by-example-part-1/#comments</comments>
		<pubDate>Wed, 11 Jan 2012 21:11:34 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Backbone.js]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[MVC]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=839</guid>
		<description><![CDATA[In this post we will build a concrete application step by step: a simple graphical editor (for the impatient, here is what we are going to build in about 100 lines of javascript). We will focus on basic aspects of models and views. Routing and communication with the server will be covered in the next [...]]]></description>
				<content:encoded><![CDATA[<p>In this post we will build a concrete application step by step: a simple graphical editor (for the impatient, <a href="http://nakhli.wpengine.com/wp-content/uploads/2012/01/editor.html" target="_blank">here is what we are going to build</a> in about 100 lines of javascript). We will focus on basic aspects of models and views. Routing and communication with the server will be covered in the next part of this tutorial.</p>
<p><span id="more-839"></span></p>
<h1>Before we start</h1>
<p>If you want to test and hack the code snippets presented here you&#8217;ll need to create three files: editor.js, editor.css and editor.html. The first two files, editor.js and editor.css, are empty for now. Here is what your editor.html should look like:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;!doctype html&gt;
&lt;html&gt;
 &lt;head&gt;
  &lt;link rel='stylesheet' type='text/css' href='editor.css'&gt;
  &lt;script src='http://cdnjs.cloudflare.com/ajax/libs/jquery/1.7/jquery.min.js'&gt;&lt;/script&gt;
  &lt;script src='http://cdnjs.cloudflare.com/ajax/libs/underscore.js/1.2.1/underscore-min.js'&gt;&lt;/script&gt;
  &lt;script src='http://cdnjs.cloudflare.com/ajax/libs/backbone.js/0.5.3/backbone-min.js'&gt;&lt;/script&gt;
  &lt;script src='editor.js'&gt;&lt;/script&gt;
 &lt;/head&gt;
 &lt;body&gt;
  &lt;div id='page' style='width:2000px;height:2000px;'&gt;&lt;/div&gt;
 &lt;/body&gt;
&lt;/html&gt;
</pre>
<p>I&#8217;m using <a href="http://www.cdnjs.com/" target="_blank">cdnjs</a> here, an alternative free CDN for Javascript brought by <a href="https://www.cloudflare.com/" target="_blank">cloudflare</a>. It has many libs you won&#8217;t find at Google or Microsoft CDNs.</p>
<p>Some of the code samples are also hosted on <a href="http://jsfiddle.net/nakhli/td6Eg/5/" target="_blank">jsfiddle</a> for convenience. A link is provided in each section to the corresponding jsfiddle code snippet. You can view, test and fork them right there.</p>
<p>Now that we are set up, time to code!</p>
<h1>Models</h1>
<p>Let&#8217;s start by defining a model for a simple shape: the <em>Shape</em> class.</p>
<pre class="brush: jscript; title: ; notranslate">
var Shape = Backbone.Model.extend({
    defaults: { x:50, y:50, width:150, height:150, color:'black' },
    setTopLeft: function(x,y) {
        this.set({ x:x, y:y });
    },
    setDim: function(w,h) {
        this.set({ width:w, height:h });
    }
});
</pre>
<p>Our <em>Shape</em> class extends the <em>Backbone.Model</em> class. The <em>extend</em> method takes a hash as argument in order to configure the model. In our case, we have three properties: <em>defaults</em>, <em>setTopLeft</em> and <em>setDim</em>:</p>
<ul>
<li><em>defaults</em> is a special property that backbone uses to define a set of default properties/values in the model. So by default here, a <em>Shape</em> instance will have the properties <em>x</em>, <em>y</em>, <em>width</em>, <em>height</em> and <em>color</em> defined and set to the provided default values. Notice that these model properties are encapsulated by backbone: instead of getting/setting them directly, we use the <em>get</em> and <em>set</em> methods that <em>Shape</em> inherits from the backbone model. The encapsulation allows backbone to control modification of the model properties and fire a change event on <em>set</em> method calls. This allows registering listeners for model changes.</li>
<li><em>setTopLeft</em> and <em>setDim</em> are two helper methods that use backbone&#8217;s <em>set</em> method in order to set, respectively, the shape&#8217;s top left corner and its dimensions.</li>
</ul>
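<p>To make the encapsulation argument concrete, here is a minimal sketch in plain JavaScript of how routing every write through a <em>set</em> method lets a model fire change events. <em>TinyModel</em> is a class made up for this illustration; it is not how Backbone is implemented and it leaves out most of what <em>Backbone.Model</em> does.</p>
<pre class="brush: jscript; title: ; notranslate">
// Hypothetical mini-model: copies the defaults and keeps a list of listeners per event name.
function TinyModel(defaults) {
    this.attributes = {};
    this.listeners = {};
    for (var key in defaults) {
        this.attributes[key] = defaults[key];
    }
}

TinyModel.prototype.get = function (name) {
    return this.attributes[name];
};

TinyModel.prototype.bind = function (eventName, callback) {
    this.listeners[eventName] = this.listeners[eventName] || [];
    this.listeners[eventName].push(callback);
};

TinyModel.prototype.trigger = function (eventName) {
    (this.listeners[eventName] || []).forEach(function (callback) { callback(); });
};

// Because every write goes through set, the model knows exactly what changed.
TinyModel.prototype.set = function (values) {
    var changed = false;
    for (var name in values) {
        if (this.attributes[name] !== values[name]) {
            this.attributes[name] = values[name];
            this.trigger('change:' + name);
            changed = true;
        }
    }
    if (changed) {
        this.trigger('change');
    }
};

var log = [];
var model = new TinyModel({ x: 50, width: 150 });
model.bind('change', function () { log.push('change'); });
model.bind('change:width', function () { log.push('change:width'); });

model.set({ width: 170 }); // log is now ['change:width', 'change']
model.set({ x: 100 });     // only 'change' is appended
</pre>
<p>A direct assignment such as <em>model.attributes.width = 170</em> would bypass <em>set</em> and notify nobody, which is why the properties are only meant to be accessed through <em>get</em> and <em>set</em>.</p>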
<p>Now that our model class is defined, here is an example of how we would instantiate it and bind change events to its properties (<a href="http://jsfiddle.net/nakhli/td6Eg/" target="_blank">jsfiddle</a>):</p>
<pre class="brush: jscript; gutter: true; light: false; title: ; notranslate">
var shape = new Shape();

shape.bind('change', function() { alert('changed!'); });
shape.bind('change:width', function() { alert('width changed! ' + shape.get('width')); });

shape.set({ width: 170 });
shape.setTopLeft(100, 100);
</pre>
<p>In line 3 we are registering for any property change in the model. In line 4 we are listening only to changes of the &#8216;width&#8217; property. Setting the shape&#8217;s width in line 6 invokes both callbacks. On the other hand, setting the top left corner coordinates using &#8216;setTopLeft&#8217; in line 7 will only trigger the &#8216;change&#8217; event.</p>
<h2>Binding page elements to model changes</h2>
<p>In the last section, we were able to define models and listen to their change events. Now, we will try to do something useful with these events. Let&#8217;s first define a div element in our html page:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;div class='shape' /&gt;
</pre>
<p>and let&#8217;s tie our model changes to it as follows (we&#8217;ll be using jQuery for DOM manipulation):</p>
<pre class="brush: jscript; title: ; notranslate">
shape.bind('change', function() {
    $('.shape').css({ left:       shape.get('x'),
                      top:        shape.get('y'),
                      width:      shape.get('width'),
                      height:     shape.get('height'),
                      background: shape.get('color') });
});
</pre>
<p>That was easy! Now we can modify the model and observe the dom changing in reaction to model changes. Just open firebug or your favorite browser&#8217;s javascript console and try something like:</p>
<pre class="brush: jscript; title: ; notranslate">
shape.setTopLeft(10, 10);
shape.setDim(500, 500);
</pre>
<p>and you will see the page automatically updating itself and the shape automatically adapting to the new position/size. The interesting bit here is that we are no longer manipulating the DOM directly. A &#8220;piece of code&#8221; is listening to model changes and is updating the page on model changes automatically. This snippet is also on <a href="http://jsfiddle.net/nakhli/td6Eg/1/" target="_blank">jsfiddle</a>.</p>
<p>Now it is time to use the user&#8217;s input in order to mutate the model and thus, indirectly, modify the page.</p>
<h2>Basic user input handling</h2>
<p>In this section, we will enable the user to drag the shape&#8217;s element around. In order to do that, we will listen to mouse events <em>mousedown</em>, <em>mouseup</em> and <em>mousemove</em> and update the model accordingly. Here is an example of how we can achieve this (<a href="http://jsfiddle.net/nakhli/td6Eg/2/" target="_blank">jsfiddle</a>):</p>
<pre class="brush: jscript; title: ; notranslate">
var dragging = false;

$('.shape').mousedown(function (e) {
    dragging = true;
    shape.set({ color: 'gray' });
});

$('#page').mouseup(function () {
    dragging = false;
    shape.set({ color: 'black'});
});

$('#page').mousemove(function(e) {
    if(dragging) {
        shape.setTopLeft(e.pageX, e.pageY);
    }
});
</pre>
<p>As a side note, notice that we are listening to <em>mousemove</em> and <em>mouseup</em> events on the parent <em>page</em> element and not on the shape&#8217;s div since we want the div to follow (resp. stop following) the mouse position even when it gets off (resp. is up outside of) the shape&#8217;s div borders.</p>
<p>What is worth considering here is the simplicity of the jQuery callbacks handling the user input. We are not modifying or even querying the DOM in this code. We are just listening to the user input (mouse events here) and translating it to model changes. The page magically updates itself through the model change listeners.</p>
<p>To sum up, we have separated the code handling user input and mutating the model, and the code updating the view in reaction to model change events. Looks like we implemented a model-view-controller here <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> .</p>
<p>This is a good step to separate concerns and reduce the callback spaghetti one ends up with when dealing with pages with more than a couple of jQuery callbacks. However, we can do better here as our controller and view code is still scattered across free anonymous functions here and there. We&#8217;ll work on this by defining proper view classes later.</p>
<h2>Model Collections</h2>
<p>Before going further, we have to introduce a special kind of model used in backbone: collections. A collection is simply a container that helps maintain an ordered list of model objects. In addition, it comes with built-in events for common collection operations like <em>add</em> and <em>remove</em>.</p>
<p>We will use a model collection in order to organize our previously defined <em>Shape</em> objects into a <em>Document</em>:</p>
<pre class="brush: jscript; title: ; notranslate">
var Document = Backbone.Collection.extend({ model: Shape });
</pre>
<p>That&#8217;s it! We can now instantiate the <em>Document</em> class and listen to <em>add</em> and <em>remove</em> events as follows:</p>
<pre class="brush: jscript; title: ; notranslate">
// Named doc rather than document so that it cannot clash with the global document object of the browser
var doc = new Document();

doc.bind('add', function(model) { alert('added'); });
doc.bind('remove', function(model) { alert('removed'); });

doc.add(shape); // fires add event
doc.remove(shape); // fires remove event
</pre>
<h1>Views</h1>
<p>A Backbone.js view is usually associated with a model (or a model collection). The view is responsible for two things:</p>
<ol>
<li>Rendering the model into a DOM element. It listens to model changes and updates the page accordingly.</li>
<li>Handling the events of this DOM element and updating the model.</li>
</ol>
<p>In theory, Backbone&#8217;s view is actually playing both MVC&#8217;s view and MVC&#8217;s controller roles as it is handling the user input (DOM events) and updating the model, and also listening to model events and updating the visual part. This doesn&#8217;t have a big impact in practice as you would have different methods for each operation kind.</p>
<h2>Shape View</h2>
<p>In this section we will go through the code of the view of our <em>Shape</em> model. The view manages an html element that represents the shape itself and also &#8216;control&#8217; elements that decorate the shape and that will allow the user to drag, resize, delete and change the shape&#8217;s color.</p>
<pre class="brush: jscript; gutter: true; light: false; title: ; notranslate">
var ShapeView = Backbone.View.extend({
    initialize: function() {
        this.model.bind('change', this.updateView, this);
    },
    render: function() {
        $('#page').append(this.el);
        $(this.el)
            .html('&lt;div class=&quot;shape&quot;/&gt;'
                  + '&lt;div class=&quot;control delete hide&quot;/&gt;'
                  + '&lt;div class=&quot;control change-color hide&quot;/&gt;'
                  + '&lt;div class=&quot;control resize hide&quot;/&gt;')
            .css({ position: 'absolute', padding: '10px' });
        this.updateView();
        return this;
    },
    updateView: function() {
        $(this.el).css({
            left:       this.model.get('x'),
            top:        this.model.get('y'),
            width:      this.model.get('width') - 10,
            height:     this.model.get('height') - 10 });
        this.$('.shape').css({ background: this.model.get('color') });
    },
    events: {
        'mousemove'               : 'mousemove',
        'mouseup'                 : 'mouseup',
        'mouseenter .shape'       : 'hoveringStart',
        'mouseleave'              : 'hoveringEnd',
        'mousedown .shape'        : 'draggingStart',
        'mousedown .resize'       : 'resizingStart',
        'mousedown .change-color' : 'changeColor',
        'mousedown .delete'       : 'deleting'
    },
    hoveringStart: function () {
        this.$('.control').removeClass('hide');
    },
    hoveringEnd: function () {
        this.$('.control').addClass('hide');
    },
    draggingStart: function (e) {
        this.dragging = true;
        this.initialX = e.pageX - this.model.get('x');
        this.initialY = e.pageY - this.model.get('y');
        return false; // prevents default behavior
    },
    resizingStart: function() {
        this.resizing = true;
        return false; // prevents default behavior
    },
    changeColor: function() {
        this.model.set({ color: prompt('Enter color value', this.model.get('color')) });
    },
    deleting: function() {
        this.remove();
    },
    mouseup: function () {
        this.dragging = this.resizing = false;
    },
    mousemove: function(e) {
        if (this.dragging) {
            this.model.setTopLeft(e.pageX - this.initialX, e.pageY - this.initialY);
        } else if (this.resizing) {
            this.model.setDim(e.pageX - this.model.get('x'), e.pageY - this.model.get('y'));
        }
    }
});
</pre>
<p>Don&#8217;t be afraid of the length of the code, it&#8217;s pretty simple. Here are the most important bits:</p>
<ul>
<li><em>initialize</em> is a special function that is executed on view creation. This is where you usually wire your view to the model by listening to events. In our case, the view registers itself for <em>change</em> events.</li>
<li><em>render</em> is the function that, by convention, draws the view. Backbone does not call it automatically; we call it ourselves right after creating the view. Here, the html element representing the view is initialized and added to the DOM. The view&#8217;s html element is held in the inherited backbone property <em>el</em>. On line 6 we add <em>el</em> to the page, on line 7 we set its html with the shape and control elements, and finally, on line 13 we update the view with the model properties.</li>
<li>The <em>events</em> hash (line 24) is an important part in our view configuration. It maps events to handler methods. The format is <em>{ &#8216;event selector&#8217; : &#8216;handler&#8217; }</em>. For example, the pair <em>{ &#8216;mousedown .shape&#8217; : &#8216;draggingStart&#8217; }</em> means that a mousedown event on an element with class <em>shape</em> will trigger the method <em>draggingStart</em>. The events hash defines how the user input is handled and thus defines the &#8216;controller&#8217; side of our backbone view.</li>
</ul>
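<p>To illustrate the <em>&#8216;event selector&#8217;</em> key format described above, here is a small sketch in plain JavaScript of how such a key can be split into an event name and an optional selector. <em>parseEventKey</em> and <em>resolveEvents</em> are hypothetical helpers written for this post; they are not part of Backbone&#8217;s API, and the real wiring to the DOM is left out.</p>
<pre class="brush: jscript; title: ; notranslate">
// Splits an events hash key into its event name and its (possibly empty) selector.
function parseEventKey(key) {
    var match = /^(\S+)\s*(.*)$/.exec(key);
    return { eventName: match[1], selector: match[2] };
}

// Resolves every entry of the events hash to the method it names on a view-like object.
function resolveEvents(view) {
    return Object.keys(view.events).map(function (key) {
        var parsed = parseEventKey(key);
        return {
            eventName: parsed.eventName,
            selector: parsed.selector,        // '' targets the element of the view itself
            handler: view[view.events[key]]   // looked up by name, e.g. view.draggingStart
        };
    });
}

var fakeView = {
    events: { 'mouseup': 'mouseup', 'mousedown .shape': 'draggingStart' },
    mouseup: function () { return 'up'; },
    draggingStart: function () { return 'drag'; }
};

var bindings = resolveEvents(fakeView);
// bindings[1] is { eventName: 'mousedown', selector: '.shape', handler: fakeView.draggingStart }
</pre>
<p>Entries without a selector, such as <em>&#8216;mouseup&#8217;</em> in our view, apply to the view&#8217;s own element rather than to one of its children.</p>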
<p>We have a small technical problem here though. As we did in the previous section <em>Basic user input handling</em>, for a better user experience we should be listening to <em>mousemove</em> and <em>mouseup</em> events on the parent page element and not on the shape’s div itself. The current code works but resizing might be a bit choppy if the user moves the mouse too fast. The workaround is easy though. It is implemented in the <a href="http://jsfiddle.net/nakhli/td6Eg/4/" target="_blank">jsfiddle snippet</a>.</p>
<h2>Document View</h2>
<p>Now we need to add a bit more structure to the shape views by using the document model introduced earlier. The document also has a view, which manages all the shape views as follows:</p>
<pre class="brush: jscript; title: ; notranslate">
var DocumentView =  Backbone.View.extend({
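    id: 'page',
    initialize: function() {
        // Created per instance: a hash declared in the extend() call would be shared by every DocumentView
        this.views = {};
        this.collection.bind('add', this.added, this);
        this.collection.bind('remove', this.removed, this);
    },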
    render: function() {
        return this;
    },
    added: function(m) {
        this.views[m.cid] = new ShapeView({
            model: m,
            id:'view_' + m.cid
        }).render();
    },
    removed: function(m) {
        this.views[m.cid].remove();
        delete this.views[m.cid];
    }
});
</pre>
<p>The <em>id</em> property is meant to name the DOM element the view is tied to. Note that Backbone uses <em>id</em> (together with <em>tagName</em> and <em>className</em>) only when it creates a new <em>el</em>; to attach a view to an element that already exists in the page, pass that element through the <em>el</em> property instead. Our document view never touches its own <em>el</em>, since each shape view appends itself to the existing <em>#page</em> element, so there is nothing to do in the <em>render</em> method.</p>
<p>In the <em>initialize</em> method, the view registers itself for two built-in model collection events: <em>add</em> and <em>remove</em>. On the <em>add</em> event, we create and render the view corresponding to the added shape model. We maintain a set of these views in the property <em>views</em>. When the <em>remove</em> event is fired, the view of the removed shape is looked up in the views set and removed from the page.</p>
<p>Here is the corresponding <a href="http://jsfiddle.net/nakhli/td6Eg/4/" target="_blank">jsfiddle snippet</a>.</p>
<h1>Conclusion</h1>
<p>The source of this tutorial is available on <a href="https://gist.github.com/1596813" target="_blank">github</a> and on <a href="http://jsfiddle.net/nakhli/td6Eg/5/" target="_blank">jsfiddle</a>. You can also have a look at the <a href="http://nakhli.wpengine.com/wp-content/uploads/2012/01/editor.html" target="_blank">online demo</a>.</p>
<p>In this tutorial, we&#8217;ve been through some of the aspects of Backbone.js, mainly the MVC and event driven principles it puts in place. As I mentioned in the introduction, we didn&#8217;t dive into routing and server communication. Backbone.js has, for instance, built-in support for CRUD operations. I&#8217;ll try to cover some of these features in a future post.</p>
<p>Do you have questions or suggestions? Do not hesitate, please put a comment below!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/backbone-js-by-example-part-1/feed/</wfw:commentRss>
		<slash:comments>37</slash:comments>
		</item>
		<item>
		<title>Benchmarking numeric base conversion in C#, Java and Scala</title>
		<link>http://www.sinbadsoft.com/blog/benchmarking-numeric-base-convertion-in-c-java-and-scala/</link>
		<comments>http://www.sinbadsoft.com/blog/benchmarking-numeric-base-convertion-in-c-java-and-scala/#comments</comments>
		<pubDate>Sat, 18 Jun 2011 20:33:52 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Benchmark]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Functional Programming]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Scala]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=520</guid>
		<description><![CDATA[A few days ago, I needed to encode numeric identifiers in a short and url-safe format. Something similar to what url shorteners use (e.g. SlvUp7 in http://bit.ly/SlvUp7). Encoding the ids in base 64 would work if an alternative alphabet is provided for the non url-safe symbols. But since I wanted to have only alpha-numeric characters, I chose [...]]]></description>
				<content:encoded><![CDATA[<p>A few days ago, I needed to encode numeric identifiers in a short and url-safe format. Something similar to what url shorteners use (e.g. <em>SlvUp7</em> in <a href="http://bit.ly/SlvUp7"  target="_blank">http://bit.ly/SlvUp7</a>). Encoding the ids in base 64 would work if an alternative alphabet is provided for the non url-safe symbols. But since I wanted to have only alpha-numeric characters, I chose to use base 62 instead.</p>
<p>While working on this problem, I had the idea of the coding exercise I&#8217;m sharing here: a utility for converting base 10 numerals to their base X string equivalent and back, just like <a href="http://www.cplusplus.com/reference/clibrary/cstdlib/itoa/" target="_blank">itoa</a> and <a href="http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/" target="_blank">atoi</a> in C (except that atoi doesn&#8217;t take a base parameter). In order to make things a bit more interesting, I decided to teach myself some <a href="http://www.scala-lang.org" target="_blank">Scala</a> and translate the code to a functional style. Both implementations are detailed here and available <a href="https://gist.github.com/1033234" target="_blank">online</a> under the terms of the <a href="http://www.apache.org/licenses/LICENSE-2.0.html" target="_blank">Apache version 2</a> license.</p>
<p>In the end, I couldn&#8217;t resist the temptation to do a C# vs. Scala performance benchmark. I also coded and added a Java implementation to the tests in order to distinguish the performance change due to switching from the CLR to the JVM from the one introduced by the translation to functional style and to the Scala runtime.</p>
<p><span id="more-520"></span></p>
<h1>First, Iterative style</h1>
<p>I started by implementing the C# version. It takes less than 30 lines of code:</p>
<pre class="brush: csharp; gutter: true; light: false; title: ; notranslate">
public static int Decode(string str, int baze)
{
    int result = 0;
    int place = 1;
    int length = str.Length;
    for (int i = 0; i &lt; length; ++i)
    {
        result += Value(str[length - 1 - i]) * place;
        place *= baze;
    }
    return result;
}

public static string Encode(int val, int baze)
{
    var buffer = new char[32];
    int place = 0;
    int q = val;
    do
    {
        buffer[place++] = Symbol(q % baze);
        q /= baze;
    }
    while (q &gt; 0);
    Array.Reverse(buffer, 0, place);
    return new string(buffer, 0, place);
}
</pre>
<p>The Symbol and Value methods convert a digit value to its symbol and a symbol back to its digit value, respectively. They are omitted here for the sake of clarity. An example implementation that handles bases from base 2 to base 64 is provided in the C# <a href="https://gist.github.com/1033234#file_base_x.cs" target="_blank">online sample</a>.</p>
<p>If you can afford unsafe code in your project, you can replace line 16 of the listing above with something like:</p>
<pre class="brush: csharp; title: ; notranslate">
char* buffer = stackalloc char[32];
</pre>
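<p>The string buffer is then allocated on the stack instead of the heap. This is meant to boost performance as it avoids garbage collection for the buffer. Note that the method then has to be marked unsafe, and the Array.Reverse call has to be replaced by a manual reversal since it does not accept a pointer. The performance of the unsafe version is detailed in the benchmark results below.</p>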
<p>I&#8217;m not going to list the Java code here since it is very similar to the C# one and it is also available in the <a href="https://gist.github.com/1033234#file_base_x.java" target="_blank">online gist</a>.</p>
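<p>To give an idea of what the omitted pieces could look like, here is a minimal Java sketch of the same two loops together with possible symbol and value helpers. The class name BaseX and the 64-symbol alphabet (digits, lowercase letters, uppercase letters, then + and /) are assumptions made for this illustration; the code in the gist may differ.</p>
<pre class="brush: java; title: ; notranslate">
// Illustrative sketch only: the alphabet below is an assumption, not the one from the gist.
final class BaseX {
    // Assumed alphabet: digits, lowercase letters, uppercase letters, then '+' and '/'.
    private static final String ALPHABET =
        &quot;0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+/&quot;;

    // Maps a digit value (0 to 63) to its symbol.
    static char symbol(int digit) {
        return ALPHABET.charAt(digit);
    }

    // Maps a symbol back to its digit value; rejects characters outside the alphabet.
    static int value(char c) {
        int digit = ALPHABET.indexOf(c);
        if (digit &lt; 0) throw new IllegalArgumentException(&quot;Unknown symbol: &quot; + c);
        return digit;
    }

    // Same loop as the C# Decode: walks the string from its least significant symbol.
    static int decode(String str, int base) {
        int result = 0;
        int place = 1;
        for (int i = str.length() - 1; i &gt;= 0; i--) {
            result += value(str.charAt(i)) * place;
            place *= base;
        }
        return result;
    }

    // Same loop as the C# Encode; like the original, it expects val to be non-negative.
    static String encode(int val, int base) {
        char[] buffer = new char[32];
        int place = 0;
        int q = val;
        do {
            buffer[place++] = symbol(q % base);
            q /= base;
        } while (q &gt; 0);
        // Digits were produced least significant first, so reverse them.
        return new StringBuilder(new String(buffer, 0, place)).reverse().toString();
    }
}
</pre>
<p>With this assumed alphabet, BaseX.encode(255, 16) returns &quot;ff&quot; and BaseX.decode(&quot;ff&quot;, 16) returns 255. Like the C# version, the sketch does not handle negative values and does not check that every symbol is valid for the given base.</p>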
<h1>Enter Scala</h1>
<p>I must say that I have little knowledge of functional programming at the time of this writing. Still, having some Java and Ruby background, getting started with Scala was easy and fun. After grasping the bare minimum of the Scala basics, I got back to my C# code and started translating it to the functional language. After all, all I had to do was translate two methods, each consisting basically of one plain old iterative loop, into shiny tail-recursive functions, right? <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h2>A list based implementation</h2>
<p>Most functional languages have built-in immutable singly linked lists as a fundamental data structure. Prepending, getting the first element and getting the tail of the list (i.e. a sublist with all elements but the head) are O(1) operations. In addition, since the list is immutable, it can be shared, manipulated and referenced from multiple positions without worrying about data races. Scala is no exception and provides the class <a href="http://www.scala-lang.org/api/2.8.1/scala/collection/immutable/List.html" target="_blank">List</a> as a built-in immutable structure with the :: operator and the head and tail functions (appending, on the other hand, takes O(n) time since elements are not meant to be added at the end of the list).</p>
<p>Based on this finding, my first attempt to translate the code to Scala uses lists as underlying data structures for both functions.</p>
<pre class="brush: scala; title: ; notranslate">
def encode(i: Int, baze: Int): String = {
  def process(q: Int, baze: Int, list: List[Char]): List[Char] = {
    if (q &gt; 0) process(q / baze, baze, symbol(q % baze) :: list)
    else list
  }
  if (i == 0) symbol(0).toString
  else process(i, baze, Nil).mkString
}

def decode(str: String, baze: Int): Int = {
  def process(acc: Int, place: Int, lst: List[Char]): Int = {
    if (!lst.isEmpty) process(acc + value(lst.head) * place, place * baze, lst.tail) 
    else acc
  }
  process(0, 1, str.reverse.toList)
}
</pre>
<p>In the encode function, a list is used as buffer where symbols are prepended. The list is converted to a string object when all symbols are computed. In the decode function, the input string is reversed and mapped to a List where symbols are popped one by one until the result is computed. Note that the string has to be reversed first since we cannot traverse the list backward from tail to head (remember, it&#8217;s a singly linked list).</p>
<p>The translation was simple to do and I find the resulting Scala code concise and quite elegant. However, this initial implementation, discussed in more detail below, turned out to be too slow.</p>
<h2>A more pragmatic implementation</h2>
<p>With the lessons of the first version in mind, I refactored my Scala code as follows:</p>
<pre class="brush: scala; title: ; notranslate">
def encode(i: Int, baze: Int): String = {
  def process(q: Int, place: Int, baze: Int, builder: StringBuilder): StringBuilder = {
    if (q &gt; 0) process(q / baze, place + 1, baze, builder += symbol(q % baze))
    else builder
  }
  if (i == 0) symbol(0).toString
  else process(i, 0, baze, new StringBuilder(32)).reverse.toString
}

def decode(str: String, baze: Int): Int = {
  def process(acc: Int, place: Int, str: String, index: Int): Int = {
    if (index &gt;= 0) process(acc + value(str.charAt(index)) * place, place * baze, str, index - 1)
    else acc
  }
  process(0, 1, str, str.length - 1)
}
</pre>
<p>The decode function now processes the input string directly, without converting it to a list. An additional index variable is introduced in order to keep track of the current symbol. This is less elegant than the list-based version, but far more efficient.</p>
<p>The encode function is basically the same as the list based one, except that the list was replaced by a <a href="http://www.scala-lang.org/api/2.8.1/scala/collection/mutable/StringBuilder.html" target="_blank">StringBuilder</a>.</p>
<p>Stopwatch in hand, I was finally ready to start benchmarking!</p>
<h1>Lies, Damn Lies and Benchmarks</h1>
<p>The results presented here are not meant to compare the performance of Scala and C# in general. They are specific to the context of the stated problem and the provided implementations. This benchmark, like most benchmarks, has to be taken with a grain of salt. Here are some reasons why:</p>
<ul>
<li>I only measured <em>raw speed</em>. Raw speed is nice to have but it does not imply scalability.</li>
<li>The benchmarked applications are stateless and purely sequential. Results may vary considerably with more complex and/or multi-threaded code.</li>
<li>I basically started learning Scala a few hours before writing this benchmark. Albeit simple, the Scala code probably needs some obvious performance tuning that I&#8217;m not aware of.</li>
<li>I didn&#8217;t have a decent server at hand to run the benchmark on. I performed this benchmark on my laptop: an Intel Core i7-720QM Processor (6M Cache, 1.60 GHz) with 4 GB of RAM running 64-bit Windows 7.</li>
<li>The benchmark consists of calling the encode and decode operations 10M times each with input values 2<sup>32</sup> - 1 and 23527 (a random value I picked for the test) using bases 2, 10 and 64. A more rigorous benchmark should have a wider range of input values.</li>
</ul>
<p>Despite these reservations, I still find the results of this benchmark worth considering and sharing. They at least show how C#, Java and Scala behave in this kind of context.</p>
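<p>To make the measured loop concrete, here is a rough Java sketch of the kind of timing loop described above. It is only an illustration, not the harness behind the published numbers: the class and method names are hypothetical, Integer.toString and Integer.parseInt stand in for encode and decode (so the radix tops out at 36 instead of 64), and it times the encode/decode round trip rather than each operation separately.</p>
<pre class="brush: java; title: ; notranslate">
// Hypothetical timing helper; the library calls are stand-ins for the functions under test.
final class BenchmarkSketch {
    // Runs an encode/decode round trip `iterations` times and returns the elapsed nanoseconds.
    static long time(int value, int radix, int iterations) {
        long checksum = 0; // consumed below so the JIT cannot discard the loop body
        long start = System.nanoTime();
        for (int i = 0; i &lt; iterations; i++) {
            String encoded = Integer.toString(value, radix);   // stand-in for encode
            checksum += Integer.parseInt(encoded, radix);      // stand-in for decode
        }
        long elapsed = System.nanoTime() - start;
        // Sanity check: every round trip must give back the original value.
        if (checksum != (long) value * iterations) {
            throw new IllegalStateException(&quot;round trip failed&quot;);
        }
        return elapsed;
    }
}
</pre>
<p>A typical use would be one short warm-up call, for instance BenchmarkSketch.time(23527, 10, 1000000), followed by the measured call with 10M iterations, so that the JIT has a chance to compile the loop before the clock matters.</p>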
<h2>Results</h2>
<p>No need to keep you more in suspense, here are the results:</p>
<p><iframe src="http://spreadsheets.google.com/spreadsheet/pub?key=0AmRW7LS6LVYTdEZyR1pxeF9LcVJzNWhrY1VYNFhfYkE&#038;single=true&#038;gid=0&#038;output=html" width="100%" height="415px"></iframe></p>
<p>Execution times are expressed in nanoseconds. The percentages give the performance relative to the safe C# version, which is taken as the reference. On average, compared to that reference:</p>
<ul>
<li>The C# unsafe encode is 21% faster.</li>
<li>The Scala list-based implementation is terribly slow: 101% slower for encode, 414% slower for decode.</li>
<li>The refactored Scala code is 18% faster for encode and 46% faster for decode.</li>
<li>Java is the fastest: 31% faster for encode and 51% faster for decode.</li>
</ul>
<h2>Discussion</h2>
<p>It is quite surprising to see Java and Scala largely outperforming C# (both the safe and unsafe versions) on Windows. In this benchmark, the JVM clearly does a better job than the CLR at optimizing and JIT-compiling the code on the fly. I must confess that I didn&#8217;t expect this one <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> .</p>
<p>Using lists in the Scala code is overkill, especially for the decode function: reversing the input string and mapping it to a list adds a delay that is not needed to process it. However, the cause of the overhead in the list-based encode is a bit less obvious to me. I didn&#8217;t expect a buffer to outperform the singly linked list to that extent. It&#8217;s probably the buffer&#8217;s faster appending operation and/or its better locality of reference that made the difference. I&#8217;m curious to dig further and see what went wrong here, but that will probably be the topic of another blog post.</p>
<p>Do you have an opinion or a comment about this article? Please drop a line here. Your feedback is much appreciated!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/benchmarking-numeric-base-convertion-in-c-java-and-scala/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
		<item>
		<title>You are not protected by the compiler</title>
		<link>http://www.sinbadsoft.com/blog/you-are-not-protected-by-the-compiler/</link>
		<comments>http://www.sinbadsoft.com/blog/you-are-not-protected-by-the-compiler/#comments</comments>
		<pubDate>Sat, 12 Feb 2011 21:11:35 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[Troll]]></category>
		<category><![CDATA[CPlusPlus]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=445</guid>
		<description><![CDATA[Dynamic vs static typing debates are often passionate. Probably too passionate to be totally objective. During one of these discussions I was involved in, a well respected senior engineer made an interesting statement: My lisp program crashed after hours of computation right before delivering the result due to a missing method runtime error. This [...]]]></description>
				<content:encoded><![CDATA[<p>Dynamic vs static typing debates are often passionate. Probably too passionate to be totally objective. During one of these discussions I was involved in, a well respected senior engineer made an interesting statement:</p>
<blockquote><p>My lisp program crashed after hours of computation right before delivering the result due to a missing method runtime error. This would&#8217;ve NEVER happened in a statically typed language. For instance, this would&#8217;ve never happened in C++ using the latest Microsoft C++ compiler. With all the static analysis it performs, these errors are prevented. You are protected by the compiler!</p>
</blockquote>
<p>Well, I don&#8217;t know how good the MS compiler is compared to other C++ compilers. Still, we&#8217;ll show how ridiculously easy it is to fool it and get the same missing method crash.</p>
<p><span id="more-445"></span></p>
<p>We will be using Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86, with the &#8220;enable all warnings&#8221; and &#8220;treat warnings as errors&#8221; flags activated. We will also activate the &#8220;Enable Code analysis for C/C++ on Build&#8221; option in the &#8220;Code Analysis&#8221; project properties section and use &#8220;Microsoft All Rules&#8221; as the rule set.<br />
In the code samples below, we will stick to ISO C++ without any third-party libraries, not even the standard C++ library.</p>
<h1>Case 1: Dispatch table corruption</h1>
<p>Every C++ developer has run into this problem at least once: an object gets deleted but the program somehow keeps a reference to it. Everything goes fine until you try to call a virtual method on this object: Boom! You&#8217;re most likely going to get a memory protection error followed by a crash. Of course, this will not happen during development or QA testing. It will happen in production, when your application is most needed <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> .</p>
<p>Here is a snippet that illustrates the problem.</p>
<pre class="brush: cpp; title: ; notranslate">
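struct Base
{
    virtual void Do() =0;   // pure virtual: Base is abstract
};

struct Derived : public Base
{
    virtual void Do() {}
};

int main()
{
    Derived *p = new Derived();
    delete p;       // the object is gone, but p still points to it
    p-&gt;Do();    // virtual call through a dangling pointer: undefined behavior
}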
</pre>
<p>The Derived class overrides and implements a pure virtual method declared in class Base. In main(), we create a Derived object on the heap, delete it, and then try to call the virtual method. The compiler accepts this program without a single warning. Static analysis detects nothing either. At runtime, however, it&#8217;s a different story. The pointer to the <a href="http://en.wikipedia.org/wiki/Dispatch_table" target="_blank">dispatch table</a> used for method resolution lives inside the object, so it is no longer valid once the object is deleted. Calling the virtual method amounts to reading memory that we&#8217;ve just released, which is undefined behavior and typically ends in a runtime error.</p>
<p>Memory corruption is unfortunately quite common in C++ and the compiler just can&#8217;t do anything about it. This is why C++ developers use tools like <a href="http://en.wikipedia.org/wiki/Valgrind" target="_blank">Valgrind</a> and <a href="http://en.wikipedia.org/wiki/IBM_Rational_Purify" target="_blank">Purify</a>. They also use smart pointers and/or disciplined memory usage patterns to help contain these problems. You just can&#8217;t rely on the compiler.</p>
<h1>Case 2: Virtual method call from constructor</h1>
<p>Memory corruption is not even required to get our missing method runtime error. Let&#8217;s have a look at the following snippet:</p>
<div id="_mcePaste">
<pre class="brush: cpp; title: ; notranslate">
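struct Base
{
    Base() { Do(); }            // runs before any Derived part exists
    void Do() { ReallyDo(); }   // virtual call made during Base construction
    virtual void ReallyDo() =0;
};

struct Derived : public Base
{
    virtual void ReallyDo() {}
};

int main()
{
    Derived d;   // constructing d runs Base(), which reaches the pure virtual ReallyDo()
}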
</pre>
</div>
<p>This program will crash complaining about a &#8220;pure virtual function call&#8221;. To understand what&#8217;s happening here, recall that objects are constructed top down in C++: from base classes down to derived classes. To construct the object d in main(), the Base part is built first. The Base constructor calls Do(), which calls ReallyDo(). At this point the Derived part is not built yet, so the object&#8217;s virtual table is still the one of Base, where ReallyDo() has no implementation. Therefore, the call to ReallyDo() fails.</p>
<p>To sum up, the program is apparently calling an abstract (not implemented) method and, again, the compiler and the static analysis engine didn&#8217;t see it coming.</p>
<p>But wait, does this error really happen in real life? We&#8217;ve only provided toy programs to reproduce it after all. Well, the answer is <em>Yes</em>, it does happen even in widely used and heavily tested applications. Here is a screenshot of a crash I had recently using Adobe Acrobat Reader.</p>
<p><img class="alignnone" title="Adobe Acrobat Reader Pure Virtual Function Call Error" alt="" src="http://i.imgur.com/QoG9V.png" width="305" height="209" /></p>
<h1>Conclusion</h1>
<p>The compiler cannot help you write better code or avoid bugs. This is an urban legend that, unfortunately, some developers and managers keep propagating, probably because of the false sense of security that a &quot;Build succeeded&quot; compiler message gives. If you are a manager and you want your team to write better quality code, try to invest a bit more in testing, code reviews, setting up best practices etc. You are <em>Not</em> protected by the compiler.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/you-are-not-protected-by-the-compiler/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Binary Heap, Heap Sort and Priority Queue</title>
		<link>http://www.sinbadsoft.com/blog/binary-heap-heap-sort-and-priority-queue/</link>
		<comments>http://www.sinbadsoft.com/blog/binary-heap-heap-sort-and-priority-queue/#comments</comments>
		<pubDate>Sat, 08 Jan 2011 13:29:28 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Code Kata]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=411</guid>
		<description><![CDATA[In this post I am sharing another code Kata: an implementation of a binary heap. Once we have the heap implemented, we will easily deduce a heap sort and a priority queue based on it. It takes about 100 lines of C# code. Binary Heap The heap tree is &#8220;threaded&#8221;: the tree nodes are stored in an array list [...]]]></description>
				<content:encoded><![CDATA[<p>In this post I am sharing another <a href="/blog/tag/code-kata/">code Kata</a>: an implementation of a <a href="http://en.wikipedia.org/wiki/Binary_heap" target="_blank">binary heap</a>. Once we have the heap implemented, we will easily deduce a <a href="http://en.wikipedia.org/wiki/Heapsort" target="_blank">heap sort</a> and a <a href="http://en.wikipedia.org/wiki/Priority_queue" target="_blank">priority queue</a> based on it. It takes about 100 lines of C# code.</p>
<p><span id="more-411"></span></p>
<h1>Binary Heap</h1>
<p>The heap tree is &#8220;threaded&#8221;: the tree nodes are stored in an array list provided in the constructor (line 6). For each element at index i, its left child is at index 2i+1, its right child is at index 2i+2 and its parent is at index (i−1)/2 (integer division). These relations are implemented in the LeftChild (line 82), RightChild (line 80) and Parent (line 78) methods, respectively.</p>
<p>If the provided list is not empty, it will be &#8220;heapified&#8221;. This is done using the Heapify method (line 40) and is similar to STL&#8217;s <a href="http://www.cplusplus.com/reference/algorithm/make_heap/" target="_blank">make_heap</a> operation in C++.</p>
<p>New elements can be inserted into the heap using the Insert method (line 32). A new element is first put at the end of the tree, then pulled up until the tree satisfies the heap property again. This operation is implemented in the HeapUp method (line 48).</p>
<p>The root element of the heap can be removed from the tree using the PopRoot method (line 16). The last element of the tree is then put at the root and pushed down until the tree satisfies the heap invariant. This is implemented using the HeapDown method (line 60).</p>
<pre class="brush: csharp; gutter: true; light: false; title: ; notranslate">
public class Heap&lt;T&gt;
{
  private readonly IList&lt;T&gt; _list;
  private readonly IComparer&lt;T&gt; _comparer;

  public Heap(IList&lt;T&gt; list, int count, IComparer&lt;T&gt; comparer)
  {
    _comparer = comparer;
    _list = list;
    Count = count;
    Heapify();
  }

  public int Count { get; private set; }

  public T PopRoot()
  {
    if (Count == 0) throw new InvalidOperationException(&quot;Empty heap.&quot;);
    var root = _list[0];
    SwapCells(0, Count - 1);
    Count--;
    HeapDown(0);
    return root;
  }

  public T PeekRoot()
  {
    if (Count == 0) throw new InvalidOperationException(&quot;Empty heap.&quot;);
    return _list[0];
  }

  public void Insert(T e)
  {
    if (Count &gt;= _list.Count) _list.Add(e);
    else _list[Count] = e;
    Count++;
    HeapUp(Count - 1);
  }

  private void Heapify()
  {
    for (int i = Parent(Count - 1); i &gt;= 0; i--)
    {
      HeapDown(i);
    }
  }

  private void HeapUp(int i)
  {
    T elt = _list[i];
    while (true)
    {
      int parent = Parent(i);
      if (parent &lt; 0 || _comparer.Compare(_list[parent], elt) &gt; 0) break;
      SwapCells(i, parent);
      i = parent;
    }
  }

  private void HeapDown(int i)
  {
    while (true)
    {
      int lchild = LeftChild(i);
      if (lchild &lt; 0) break;
      int rchild = RightChild(i);

      int child = rchild &lt; 0
        ? lchild
        : _comparer.Compare(_list[lchild], _list[rchild]) &gt; 0 ? lchild : rchild;

      if (_comparer.Compare(_list[child], _list[i]) &lt; 0) break;
      SwapCells(i, child);
      i = child;
    }
  }

  private int Parent(int i) { return i &lt;= 0 ? -1 : SafeIndex((i - 1) / 2); }

  private int RightChild(int i) { return SafeIndex(2 * i + 2); }

  private int LeftChild(int i) { return SafeIndex(2 * i + 1); }

  private int SafeIndex(int i) { return i &lt; Count ? i : -1; }

  private void SwapCells(int i, int j)
  {
    T temp = _list[i];
    _list[i] = _list[j];
    _list[j] = temp;
  }
}
</pre>
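<p>The index relations used by LeftChild, RightChild and Parent can be tried out in isolation. Here is a small Java sketch that does so; the class HeapIndex and its isMaxHeap helper are hypothetical names introduced for this illustration, and the check assumes a plain int array ordered so that the largest element sits at the root.</p>
<pre class="brush: java; title: ; notranslate">
// Hypothetical helper illustrating the array layout of a binary heap.
final class HeapIndex {
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    // Integer division; only meaningful for i greater than 0 (the root has no parent).
    static int parent(int i)     { return (i - 1) / 2; }

    // Returns true when no element is larger than its parent (max-heap property).
    static boolean isMaxHeap(int[] a) {
        for (int i = 1; i &lt; a.length; i++) {
            if (a[i] &gt; a[parent(i)]) return false;
        }
        return true;
    }
}
</pre>
<p>For example, the array {9, 5, 8, 1, 3, 7} satisfies the property, whereas {1, 2, 3} does not since 2 is larger than its parent 1.</p>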
<h2>Quick Complexity Analysis</h2>
<p>The complexity of both HeapUp and HeapDown is bounded by the depth of the (complete) tree. Hence, both have O(log(N)) complexity, where N is the number of elements in the heap. As a consequence, Insert and PopRoot have the same complexity: O(log(N)).</p>
<p>PeekRoot doesn&#8217;t involve any structural change and has O(1) complexity.</p>
<p>The Heapify method has O(N) complexity; a detailed analysis is presented <a href="http://en.wikipedia.org/wiki/Binary_heap#Building_a_heap" target="_blank">here</a>. Therefore, constructing a heap from a non-empty list of size count costs O(count).</p>
<h1>Heap Sort</h1>
<p>Heap sort is quite straightforward given the binary heap above. We just build a heap from the input array, which is O(N), and then pop the root element from the heap until it is empty, which is N times O(log(N)). The heap sort presented here is done in place, with O(1) space complexity and O(N log(N)) worst-case time complexity.</p>
<pre class="brush: csharp; title: ; notranslate">
public class HeapSort&lt;T&gt;
{
  public static void Sort(IList&lt;T&gt; list, IComparer&lt;T&gt; comparer)
  {
    var heap = new Heap&lt;T&gt;(list, list.Count, comparer);
    while (heap.Count &gt; 0)
    {
      heap.PopRoot();
    }
  }
}
</pre>
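<p>It may not be obvious why popping the root repeatedly leaves the list sorted: each pop swaps the current maximum into the slot that has just been freed at the end of the shrinking heap. The following Java sketch shows the same idea on a plain int array, without the Heap class. The names HeapSortSketch and siftDown are mine, and the code is a simplified illustration rather than a port of the C# above.</p>
<pre class="brush: java; title: ; notranslate">
// Simplified in-place heap sort on an int array (illustration only).
final class HeapSortSketch {
    // Sorts the array in ascending order, in place.
    static void sort(int[] a) {
        // Build a max-heap bottom-up, starting from the last node that has a child.
        for (int i = a.length / 2 - 1; i &gt;= 0; i--) siftDown(a, i, a.length);
        // Repeatedly move the root (current maximum) right behind the shrinking heap.
        for (int end = a.length - 1; end &gt; 0; end--) {
            swap(a, 0, end);
            siftDown(a, 0, end);
        }
    }

    // Pushes a[i] down until neither child within the first `size` cells is larger.
    private static void siftDown(int[] a, int i, int size) {
        while (true) {
            int child = 2 * i + 1;
            if (child &gt;= size) return;
            // Pick the larger of the two children when a right child exists.
            if (child + 1 &lt; size &amp;&amp; a[child + 1] &gt; a[child]) child++;
            if (a[child] &lt;= a[i]) return;
            swap(a, i, child);
            i = child;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
</pre>
<p>Calling HeapSortSketch.sort on {5, 1, 4, 2, 3} leaves the array as {1, 2, 3, 4, 5}.</p>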
<h1>Priority Queue</h1>
<p>In fact the heap is a priority queue <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . We just need a wrapper (delegator) object to create the heap and the heap&#8217;s list. The code is self-explanatory.</p>
<pre class="brush: csharp; title: ; notranslate">
public class PriorityQueue&lt;T&gt;
{
  private readonly Heap&lt;T&gt; _heap;

  public PriorityQueue(IComparer&lt;T&gt; comparer)
  {
    _heap = new Heap&lt;T&gt;(new List&lt;T&gt;(), 0, comparer);
  }

  public int Size { get { return _heap.Count; } }

  public T Top() { return _heap.PeekRoot(); }

  public void Push(T e) { _heap.Insert(e); }

  public T Pop() { return _heap.PopRoot(); }
}
</pre>
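<p>For comparison, the Java standard library ships a heap-backed java.util.PriorityQueue whose offer, peek and poll methods play the roles of Push, Top and Pop above. The sketch below is only a usage illustration with a hypothetical helper name. Note that the Java queue returns the smallest element first by default, hence the reversed comparator to get the largest element first.</p>
<pre class="brush: java; title: ; notranslate">
import java.util.Comparator;
import java.util.PriorityQueue;

// Usage illustration of java.util.PriorityQueue; the helper name is hypothetical.
final class PriorityQueueSketch {
    // Pushes all values, then pops them one by one and returns them in pop order.
    static int[] drainLargestFirst(int... values) {
        // Reverse the natural ordering so that the largest element is at the root.
        PriorityQueue&lt;Integer&gt; queue = new PriorityQueue&lt;&gt;(Comparator.reverseOrder());
        for (int v : values) queue.offer(v);   // Push
        int[] popped = new int[values.length];
        for (int i = 0; i &lt; popped.length; i++) {
            popped[i] = queue.poll();          // Pop (peek() would be Top)
        }
        return popped;
    }
}
</pre>
<p>With the inputs 3, 1, 4, 1, 5 the helper returns {5, 4, 3, 1, 1}.</p>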
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/binary-heap-heap-sort-and-priority-queue/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Recursive and Iterative Merge Sort Implementations</title>
		<link>http://www.sinbadsoft.com/blog/a-recursive-and-iterative-merge-sort-implementations/</link>
		<comments>http://www.sinbadsoft.com/blog/a-recursive-and-iterative-merge-sort-implementations/#comments</comments>
		<pubDate>Sat, 01 Jan 2011 15:46:40 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Code Kata]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=389</guid>
		<description><![CDATA[I find merge sort elegant and easy to implement and to understand for both iterative and recursive approaches. In this post I&#8217;ll share a quick (and probably dirty) iterative and recursive implementations of merge sort. Both versions share exactly the same merge operation. The implementation takes less than 30 lines of C#. Recursive Merge Sort Iterative [...]]]></description>
				<content:encoded><![CDATA[<p>I find merge sort elegant and easy to implement and to understand for both iterative and recursive approaches. In this post I&#8217;ll share quick (and probably dirty) iterative and recursive implementations of <a href="http://en.wikipedia.org/wiki/Mergesort" target="_blank">merge sort</a>. Both versions share exactly the same merge operation. The implementation takes less than 30 lines of C#.<br />
<span id="more-389"></span></p>
<h1>Recursive Merge Sort</h1>
<pre class="brush: csharp; title: ; notranslate">
        public static T[] Recursive(T[] array, IComparer&lt;T&gt; comparer)
        {
            Recursive(array, 0, array.Length, comparer);
            return array;
        }

        private static void Recursive(T[] array, int start, int end, IComparer&lt;T&gt; comparer)
        {
            if (end - start &lt;= 1) return;
            int middle = start + (end - start) / 2;

            Recursive(array, start, middle, comparer);
            Recursive(array, middle, end, comparer);
            Merge(array, start, middle, end, comparer);
        }
</pre>
<h1>Iterative Merge Sort</h1>
<pre class="brush: csharp; title: ; notranslate">
        public static T[] Iterative(T[] array, IComparer&lt;T&gt; comparer)
        {
            // i is the width of the sorted runs; keep doubling it until one run covers the array.
            for (int i = 1; i &lt; array.Length; i *= 2)
            {
                for (int j = i; j &lt; array.Length; j += 2 * i)
                {
                    Merge(array, j - i, j, Math.Min(j + i, array.Length), comparer);
                }
            }

            return array;
        }
</pre>
<h1>Merge Function</h1>
<p>The merge method below is used by both the recursive and the iterative versions. It merges the two provided sub-arrays T[start, middle) and T[middle, end). The result of the merge cannot be written directly into the input array; it needs to be built in a separate temporary array first. This takes (end-start) memory space, which gives a worst-case space complexity of O(n), where n is the size of the input array.</p>
<pre class="brush: csharp; title: ; notranslate">
        private static void Merge(T[] array, int start, int middle, int end, IComparer&lt;T&gt; comparer)
        {
            // Temporary buffer holding the merged run.
            T[] merge = new T[end - start];
            int l = 0, r = 0, i = 0;
            while (l &lt; middle - start &amp;&amp; r &lt; end - middle)
            {
                // On ties take the left element first, which keeps the sort stable.
                merge[i++] = comparer.Compare(array[start + l], array[middle + r]) &lt;= 0
                    ? array[start + l++]
                    : array[middle + r++];
            }

            // Copy whatever remains of either run.
            while (r &lt; end - middle) merge[i++] = array[middle + r++];

            while (l &lt; middle - start) merge[i++] = array[start + l++];

            Array.Copy(merge, 0, array, start, merge.Length);
        }
</pre>
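<p>As a cross-check of the iterative approach, here is a self-contained Java sketch of a bottom-up merge sort. It is a variation rather than a port: the names are mine, and the merge copies only the left run into a temporary array instead of buffering the whole merged range. Taking the left element on ties is what keeps the sort stable.</p>
<pre class="brush: java; title: ; notranslate">
import java.util.Arrays;
import java.util.Comparator;

// Bottom-up merge sort sketch (a variation on the C# above, not a port).
final class MergeSortSketch {
    // Merges runs of width 1, 2, 4, ... until a single run covers the whole array.
    static &lt;T&gt; void sort(T[] array, Comparator&lt;? super T&gt; comparer) {
        for (int width = 1; width &lt; array.length; width *= 2) {
            for (int middle = width; middle &lt; array.length; middle += 2 * width) {
                int end = Math.min(middle + width, array.length);
                merge(array, middle - width, middle, end, comparer);
            }
        }
    }

    // Merges array[start, middle) and array[middle, end) using a copy of the left run.
    private static &lt;T&gt; void merge(T[] array, int start, int middle, int end,
                                  Comparator&lt;? super T&gt; comparer) {
        T[] left = Arrays.copyOfRange(array, start, middle);
        int l = 0, r = middle, i = start;
        while (l &lt; left.length &amp;&amp; r &lt; end) {
            // On ties the left element goes first, so equal elements keep their order.
            array[i++] = comparer.compare(left[l], array[r]) &lt;= 0 ? left[l++] : array[r++];
        }
        while (l &lt; left.length) array[i++] = left[l++];
        // Any remaining right-hand elements are already in their final place.
    }
}
</pre>
<p>Sorting {&quot;bb&quot;, &quot;a&quot;, &quot;cc&quot;, &quot;b&quot;, &quot;aa&quot;} by string length with this sketch gives {&quot;a&quot;, &quot;b&quot;, &quot;bb&quot;, &quot;cc&quot;, &quot;aa&quot;}: strings of equal length stay in their original relative order.</p>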
<h1>Conclusion</h1>
<p>Unlike in-place sorting algorithms, merge sort needs O(n) extra space to perform the merging step. On the other hand, it is a stable sort and it can be easily modified to implement <a href="http://en.wikipedia.org/wiki/External_sorting" target="_blank">external sorting</a> for big data sets that do not fit in RAM.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/a-recursive-and-iterative-merge-sort-implementations/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Chirper: twitter clone WebApp with .NET front-end and Cassandra NoSql back-end</title>
		<link>http://www.sinbadsoft.com/blog/chirper-twitter-clone-webapp-with-net-front-end-and-cassandra-nosql-back-end/</link>
		<comments>http://www.sinbadsoft.com/blog/chirper-twitter-clone-webapp-with-net-front-end-and-cassandra-nosql-back-end/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 09:44:59 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[NoSql]]></category>
		<category><![CDATA[Asp.NET MVC]]></category>
		<category><![CDATA[Cassandra]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=318</guid>
		<description><![CDATA[Chirper is the first open-source non trivial web application example with .NET/NoSql integration. Chirper implements a simple twitter clone that, unlike twitter , uses Cassandra as its only database. It would&#8217;ve been probably cooler to code Chirper it in Ruby or in Python but, unfortunately, this is already done. Chirper&#8217;s front-end is written in C# / [...]]]></description>
				<content:encoded><![CDATA[<p>Chirper is the first open-source non-trivial web application example with .NET/NoSql integration. Chirper implements a simple twitter clone that, <a title="Cassandra at twitter" href="http://j.mp/clqbNd" target="_blank">unlike twitter</a> <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> , uses <a title="Cassandra" href="http://cassandra.apache.org/" target="_blank">Cassandra</a> as its only database. It would&#8217;ve probably been cooler to code Chirper in Ruby or in Python but, unfortunately, this has already been <a title="Twissandra" href="http://github.com/ericflo/twissandra" target="_blank">done</a>.</p>
<p>Chirper&#8217;s front-end is written in C# / Asp.net MVC 2.0. The back-end is based on Cassandra using the <a title="Aquiles Library" href="http://aquiles.codeplex.com/" target="_blank">Aquiles</a> library. The source code is <a title="Chirper source code" href="https://github.com/Sinbadsoft/Chirper" target="_blank">freely available</a> under the terms of the Apache License, Version 2.0.</p>
<p><span id="more-318"></span></p>
<h1>Installation</h1>
<p>Before going into Chirper&#8217;s design details in the next sections, you might want to play a bit with it first. For this, you will need to: (1) configure the database store with Chirper&#8217;s schema, and (2) set up Chirper in your favorite webserver that supports Asp.net.</p>
<h2>Cassandra configuration</h2>
<p>Simply edit Cassandra&#8217;s storage-conf.xml file and add a Chirper keyspace as follows:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;Keyspaces&gt;
 &lt;Keyspace Name='Chirper'&gt;
  &lt;ColumnFamily Name='Users' CompareWith='UTF8Type'/&gt;
  &lt;ColumnFamily Name='Tweets' CompareWith='UTF8Type'/&gt;
  &lt;ColumnFamily Name='Following' CompareWith='UTF8Type'/&gt;
  &lt;ColumnFamily Name='Followers' CompareWith='UTF8Type'/&gt;
  &lt;ColumnFamily Name='TimeLine' CompareWith='UTF8Type'/&gt;
  &lt;ColumnFamily Name='UserLine' CompareWith='UTF8Type'/&gt;
...
 &lt;/Keyspace&gt;
&lt;/Keyspaces&gt;
</pre>
<p>We will go through the data schema in more detail below. If you don&#8217;t already have a Cassandra node at hand, don&#8217;t worry, it&#8217;s easy to set up. Here is <a title="Setting up Cassandra on Windows." href="http://www.javageneration.com/?p=19" target="_blank">how I did it</a>.</p>
<h2>Installing the webapp</h2>
<p>Just download Chirper&#8217;s latest <a title="Chirper downloads" href="http://j.mp/bke4BR" target="_blank">binary</a> and unzip it somewhere on your hard drive. From your webserver configuration console, create a new webapp based on the unzipped Chirper folder. The exact webapp creation steps depend on your webserver. Here are the instructions for <a title="Create Webapp on IIS7" href="http://technet.microsoft.com/en-us/library/cc772042(WS.10).aspx" target="_blank">IIS7</a>.</p>
<p>The alternative is to check out the source code and run (or debug) Chirper from Visual Studio, using the development web server.</p>
<h1>Design</h1>
<h2>Front-end</h2>
<p>Chirper is still a work in progress. However, it already has the most important features: tweeting, following other users, displaying the timeline and userline, displaying the followers list etc. The front-end is really simple, so there is not much to say here. Here are the main components:</p>
<ul>
<li>Two MVC controllers are handling all the user actions: one controller for authentication and registration (similar to the one presented <a title="User authentication in Asp.NET MVC using Cassandra" href="http://www.javageneration.com/?p=128" target="_blank">here</a>) and another controller for everything else. The views are kept very simple too.</li>
<li>The model consists essentially of two classes: User and Tweet. Follower and Following are just users with a time-stamp.</li>
<li>A repository class isolates the web application logic from the back-end operations. This is really helpful if we want to mock the db store for testing.</li>
</ul>
<h2>Back-end</h2>
<p>Chirper&#8217;s schema is really straightforward and very similar to the one implemented in <a title="Twissandra" href="http://github.com/ericflo/twissandra" target="_blank">Twissandra</a>. Basically, we have two main families: Users and Tweets. Four other families join these main families in order to implement timelines and follower lists.</p>
<h3>Users family</h3>
<p>All Chirper users are recorded in this family. Rows are keyed by user name. Each column name/value represents a user&#8217;s property: Name, DisplayName, Location, Password etc.</p>
<p><a href="http://nakhli.wpengine.com/wp-content/uploads/2010/07/users_family.png"><img class="alignnone size-large wp-image-337" title="users_family" src="http://nakhli.wpengine.com/wp-content/uploads/2010/07/users_family-1024x152.png" alt="Users Family" width="614" height="91" /></a></p>
<h3>Tweets family</h3>
<p>This family holds all Chirper tweets (or chirps <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> ).  The rows are keyed by tweet id (a UUID). As for users, the columns represent the tweet properties.</p>
<p><a href="http://nakhli.wpengine.com/wp-content/uploads/2010/07/tweets_family.png"><img class="alignnone size-large wp-image-340" title="tweets_family" src="http://nakhli.wpengine.com/wp-content/uploads/2010/07/tweets_family-1024x153.png" alt="Tweets family" width="614" height="92" /></a></p>
<h3>Timeline and Userline families</h3>
<p>A timeline (resp. userline) is simply a set of time-stamp/tweet id pairs held in rows keyed by user id. Each time-stamp records when the tweet was posted. In the userline family, the tweets are those posted by the user whose id is the row key. In the timeline family, the tweets are those posted by the users that this user follows.</p>
<p><a href="http://nakhli.wpengine.com/wp-content/uploads/2010/07/user_tweets_family.png"><img class="alignnone size-large wp-image-342" title="User tweets family" src="http://nakhli.wpengine.com/wp-content/uploads/2010/07/user_tweets_family-1024x198.png" alt="User tweets family" width="470" height="91" /></a></p>
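<p>To make the layout concrete, here is a minimal in-memory sketch of one such family. It is only a model for illustration, not Chirper code: the class name, the method names and the fixed-width tick encoding of the time-stamp are all assumptions. The only thing taken from the schema is that columns are ordered by name within a row.</p>
<pre class="brush: csharp; title: ; notranslate">
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of one TimeLine/UserLine family; not Chirper code.
public class LineFamilySketch
{
  // Row key (user id) -&gt; columns sorted by name (time-stamp) -&gt; value (tweet id).
  private readonly Dictionary&lt;string, SortedDictionary&lt;string, Guid&gt;&gt; rows =
    new Dictionary&lt;string, SortedDictionary&lt;string, Guid&gt;&gt;();

  public void Add(string userId, DateTime tweetedAtUtc, Guid tweetId)
  {
    SortedDictionary&lt;string, Guid&gt; columns;
    if (!rows.TryGetValue(userId, out columns))
    {
      columns = new SortedDictionary&lt;string, Guid&gt;(StringComparer.Ordinal);
      rows[userId] = columns;
    }

    // Assumed encoding: a fixed-width tick count, so that ordinal string order
    // matches chronological order. Two tweets on the same tick would collide.
    columns[tweetedAtUtc.Ticks.ToString(&quot;D19&quot;)] = tweetId;
  }

  // Returns the tweet ids of a row, oldest first; empty if the user has no row.
  public IEnumerable&lt;Guid&gt; Read(string userId)
  {
    SortedDictionary&lt;string, Guid&gt; columns;
    return rows.TryGetValue(userId, out columns)
      ? (IEnumerable&lt;Guid&gt;)columns.Values
      : new Guid[0];
  }
}
</pre>
<p>One instance of this model would stand for the userline family and another for the timeline family. The two differ only in whose tweets end up in a given user&#8217;s row.</p>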
<h3>Following and Followers families</h3>
<p>The following (resp. followers) family maps each user to the users they follow (resp. the users who follow them).</p>
<p><a href="http://nakhli.wpengine.com/wp-content/uploads/2010/07/following_users_family.png"><img class="alignnone size-full wp-image-341" title="Following/Followers family" src="http://nakhli.wpengine.com/wp-content/uploads/2010/07/following_users_family.png" alt="User Following/Followers family" width="198" height="64" /></a></p>
<h1>Conclusion</h1>
<p>It was surprisingly easy to interface a .NET webapp with Cassandra. I had to write a few helper functions around the Aquiles connector but it did the job very well. Chirper&#8217;s schema is NOT designed to scale. It was kept very simple for the purpose of demonstration. I am currently looking for some help on Chirper&#8217;s web design, if you feel like contributing, please drop a comment here. It would also be nice to host a running instance of Chirper on the web (as <a title="Twissandra" href="http://github.com/ericflo/twissandra" target="_blank">Twissandra</a> and <a title="Retwis" href="http://retwis.antirez.com/" target="_blank">Retwis</a> do). If you can help on this please contact me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/chirper-twitter-clone-webapp-with-net-front-end-and-cassandra-nosql-back-end/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>I see Deadlocks everywhere!</title>
		<link>http://www.sinbadsoft.com/blog/i-see-deadlocks-everywhere/</link>
		<comments>http://www.sinbadsoft.com/blog/i-see-deadlocks-everywhere/#comments</comments>
		<pubDate>Sun, 13 Jun 2010 21:11:44 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[Troll]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Multi-Threading]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=247</guid>
		<description><![CDATA[The free lunch is over and we definitely have to scale by adding more and more cores. Whether we like it or not, we&#8217;ll be coding and debugging multi-threaded applications in the near future. The bad news is multi-threading is hard to deal with. Really hard. What is surprising is that even some thick reference books [...]]]></description>
				<content:encoded><![CDATA[<p>The <a title="The Free Lunch Is Over (Herb Sutter)" href="http://www.gotw.ca/publications/concurrency-ddj.htm" target="_blank">free lunch is over</a> and we definitely have to scale by adding more and more <a title="Multi-core processor" href="http://en.wikipedia.org/wiki/Multi-core_(computing)" target="_blank">cores</a>. Whether we like it or not, we&#8217;ll be coding and debugging multi-threaded applications in the near future.</p>
<p>The bad news is multi-threading is hard to deal with. Really hard. What is surprising is that even some thick reference books covering the subject mix up concepts and add to the reader&#8217;s confusion. I recently ran into such an example while looking at the multi-threading chapter of <a title="C# 3.0 Cookbook, 3rd Edition" href="http://www.amazon.com/C-3-0-Cookbook-Jay-Hilyard/dp/059651610X/" target="_blank">C# 3.0 Cookbook, 3rd Edition</a>, and then discovered a similar one in <a title="More Effective C#: 50 Specific Ways to Improve Your C#" href="http://www.amazon.com/o/ASIN/0321485890/1.-20/" target="_blank">More Effective C#: 50 Specific Ways to Improve Your C#</a>. We will go into the details in this post.</p>
<p><span id="more-247"></span><a title="C# 3.0 Cookbook, 3rd Edition" href="http://www.amazon.com/C-3-0-Cookbook-Jay-Hilyard/dp/059651610X/" target="_blank">C# 3.0 Cookbook, 3rd Edition</a> presents a collection of multi-threading anti-patterns, each followed by a discussion and a refactoring solution. We&#8217;ll have a look at the case presented on page 722 (similar code also appears in <a title="More Effective C#: 50 Specific Ways to Improve Your C#" href="http://www.amazon.com/o/ASIN/0321485890/1.-20/" target="_blank">More Effective C#: 50 Specific Ways to Improve Your C#</a>, page 86). We&#8217;ll see that the problem presentation and the proposed solution are both fishy and misleading.</p>
<h1>The phantom deadlock</h1>
<p>This is how the problem is introduced in the book:</p>
<pre class="brush: csharp; gutter: true; highlight: [5,17]; light: false; title: ; notranslate">
public class DeadLock
{
  public void Method1()
  {
    lock(this)
    {
      // Do something.
    }
  }
}

public class AnotherCls
{
  public void DoSomething()
  {
    DeadLock deadLock = new DeadLock();
    lock(deadLock)
    {
      Thread thread = new Thread(deadLock.Method1) ;
      thread.Start();
      // Do some time-consuming task here.
    }
  }
}
</pre>
<blockquote><p>[...] When Method1 is called, it locks the current DeadLock object. Unfortunately, any object that has access to the DeadLock class may also lock it. This is shown in AnotherCls.DoSomething(). [...]</p></blockquote>
<p>Indeed, exposing the lock object to clients by locking on <em>this</em> can be error-prone. But, contrary to what the class name <em>DeadLock</em> might suggest, <strong>there is no</strong> <strong>deadlock</strong> <strong>in this code!</strong> All we have here is that the thread started in line 19 and the current thread have mutually exclusive sections: the lock blocks in <em>Method1</em> (line 5) and in <em>DoSomething</em> (line 17), respectively. Both lock on the same object, the one instantiated in line 16. At worst, <em>Method1</em> waits until <em>DoSomething</em> leaves its lock block, and then proceeds.</p>
<p>A <a title="Deadlock" href="http://en.wikipedia.org/wiki/Deadlock" target="_blank">deadlock</a> is &#8220;<em>a situation wherein two or more competing actions are each waiting for the other to finish, and thus neither ever does</em>&#8221;. That is not the case here: only one thread ever waits for the other. As presented, this pattern does not produce a deadlock.</p>
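<p>For contrast, here is a minimal sketch of code that does match this definition: two threads take the same two locks in opposite order, so each ends up waiting for the lock the other one holds. All names are made up for the illustration. A plain nested <em>lock</em> would hang forever, so the sketch uses Monitor.TryEnter with a timeout to let the demo terminate and report the situation.</p>
<pre class="brush: csharp; title: ; notranslate">
using System;
using System.Threading;

// Hypothetical demo class; not taken from either book.
public static class RealDeadlockSketch
{
  private static readonly object LockA = new object();
  private static readonly object LockB = new object();

  // Signalled once the corresponding thread holds its first lock.
  private static readonly ManualResetEvent HoldsA = new ManualResetEvent(false);
  private static readonly ManualResetEvent HoldsB = new ManualResetEvent(false);

  public static void Main()
  {
    // Same two locks, opposite acquisition order.
    var first = new Thread(() =&gt; Acquire(LockA, LockB, HoldsA, HoldsB, &quot;first&quot;));
    var second = new Thread(() =&gt; Acquire(LockB, LockA, HoldsB, HoldsA, &quot;second&quot;));
    first.Start();
    second.Start();
    first.Join();
    second.Join();
  }

  private static void Acquire(object outer, object inner,
                              ManualResetEvent mine, ManualResetEvent other, string name)
  {
    lock (outer)
    {
      mine.Set();
      other.WaitOne(); // wait until the other thread holds its own first lock

      // With lock(inner) this thread would block here forever.
      if (Monitor.TryEnter(inner, TimeSpan.FromSeconds(1)))
      {
        try { Console.WriteLine(name + &quot;: got both locks&quot;); }
        finally { Monitor.Exit(inner); }
      }
      else
      {
        Console.WriteLine(name + &quot;: gave up waiting for the lock held by the other thread&quot;);
      }
    }
  }
}
</pre>
<p>At least one of the two threads has to give up before the other can make progress. That circular wait is what the book&#8217;s example lacks.</p>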
<h1>Wrong problem &#8211; wrong solution</h1>
<p>Unfortunately, things get worse with the refactoring the book suggests:</p>
<blockquote><p>[...] The DeadLock class can be rewritten, as follows to fix this problem:</p></blockquote>
<pre class="brush: csharp; gutter: true; light: false; title: ; notranslate">
public class DeadLock
{
  private object syncObj = new object( );
  public void Method1( )
  {
    lock(syncObj)
    {
      // Do something.
    }
  }
}
public class AnotherCls
{
  private object deadLockSyncObj = new object( );
  public void DoSomething( )
  {
    DeadLock deadLock = new DeadLock( );
    lock(deadLockSyncObj)
    {
      Thread thread = new Thread(deadLock.Method1) ;
      thread.Start( );
      // Do some time-consuming task here.
    }
  }
}
</pre>
<blockquote><p>[...] Now in the DeadLock class, you are locking on the internal syncObj, while the DoSomething method locks on the DeadLock class instance. <strong>This resolves the deadlock condition</strong> [...]</p></blockquote>
<p>The mutual exclusion is removed by using two different lock objects. Wow! This is impressive. In order to remove a &#8220;deadlock&#8221; between two threads, you just remove the synchronization between them and you&#8217;re done! Notice that the behavior of the code is deeply changed here. The two threads are now allowed to modify the same instance, created in line 17, concurrently and at will. The proposed solution introduces serious data race problems.</p>
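<p>The effect of locking on two different objects can be seen in a small sketch. It is not the book&#8217;s code: the class, the counter and the iteration count are assumptions chosen to make the race visible. Each thread dutifully takes a lock before touching the shared counter, but since the locks are different objects, nothing prevents the two increments from interleaving.</p>
<pre class="brush: csharp; title: ; notranslate">
using System;
using System.Threading;

// Hypothetical demo class; the counter stands in for the shared DeadLock instance.
public static class TwoLocksSketch
{
  private static readonly object LockOne = new object();
  private static readonly object LockTwo = new object();
  private static int counter;

  public static void Main()
  {
    var a = new Thread(() =&gt; Increment(LockOne));
    var b = new Thread(() =&gt; Increment(LockTwo));
    a.Start();
    b.Start();
    a.Join();
    b.Join();

    // With a single shared lock the result is always 2000000.
    // With two different locks, updates can be lost and the result may be lower.
    Console.WriteLine(&quot;counter = &quot; + counter);
  }

  private static void Increment(object gate)
  {
    for (int i = 0; i &lt; 1000000; i++)
    {
      lock (gate)
      {
        counter++; // read-modify-write, not atomic across different locks
      }
    }
  }
}
</pre>
<p>Passing the same object to both threads restores mutual exclusion, which is exactly what the original, supposedly deadlocked, code had.</p>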
<h1>Conclusion</h1>
<p>Again, multi-threading is hard and error-prone. Avoid threads as much as possible. When that is not possible, try to avoid shared state between threads. Rereading some formal documents to recall the fundamentals can also help. If you are still in trouble, it&#8217;s time for you to switch to a hyped functional language! <img src='http://www.sinbadsoft.com/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/i-see-deadlocks-everywhere/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Ruby command-line option parsing template</title>
		<link>http://www.sinbadsoft.com/blog/a-ruby-command-line-option-parsing-template/</link>
		<comments>http://www.sinbadsoft.com/blog/a-ruby-command-line-option-parsing-template/#comments</comments>
		<pubDate>Mon, 24 May 2010 12:02:31 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[Scripting]]></category>
		<category><![CDATA[optparse]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=209</guid>
		<description><![CDATA[Even utility scripts should be robust and well documented. It&#8217;s a pain to have to read a script source to figure out what it is doing and what are the possible parameters and how they are used. In this post I share a script template I use for my Ruby scripts. I&#8217;m using the command [...]]]></description>
				<content:encoded><![CDATA[<p>Even utility scripts should be robust and well documented. It&#8217;s a pain to have to read a script&#8217;s source to figure out what it does, which parameters it accepts, and how they are used. In this post I share a template I use for my Ruby scripts.<br />
<span id="more-209"></span><br />
I&#8217;m using the command-line argument parsing library <a title="optparse" href="http://ruby-doc.org/core/files/lib/optparse_rb.html" target="_blank">optparse</a> (a detailed example is <a title="optparse example" href="http://ruby-doc.org/core/classes/OptionParser.html" target="_blank">here</a>). When run with the -h flag, the script prints the following:</p>
<pre>Optparse Script Template
Usage: optparse_script_template.rb [OPTIONS]
Example
optparse_script_template.rb -e val1 -i 1000 -f 2.35 -l foo,bar,baz

    -e, --enumeration ENUM           val1 : Value1 description.
                                     val2 : Value2 description.
    -i, --integer INT                An integer.
    -f, --float [FLOAT]              A float.
    -l, --list e1,e2,e3              A list of things.

    -h, --help                       Show this help message.</pre>
<p>The code below is really straightforward. You can see that:</p>
<ul>
<li>Line 11: we&#8217;re using a common Ruby idiom, so the script doesn&#8217;t run when it is loaded from another file (or from irb).</li>
<li>Line 17: the script name is not hard-coded; the more robust File.basename($0) is used instead.</li>
<li>Line 33: an optional float value is specified using square brackets: [FLOAT].</li>
<li>Line 44: the destructive version of the parsing method, parse!, is used. It removes the parsed options from ARGV, so any leftover argument can be reported in line 48.</li>
</ul>
<pre class="brush: ruby; gutter: true; highlight: [11,17,33,44,48]; light: false; title: ; notranslate">
class OptparseScriptTemplate
  def initialize(options)
    @enum, @int, @float, @list = options[:enum], options[:int], options[:float], options[:list]
  end

  def execute()
    puts(&quot;enum=#{@enum}|int=#{@int}|float=#{@float || '&lt;value not set&gt;'}|@list=#{@list}&quot;)
  end
end

if __FILE__ == $0
  require 'optparse'

  options = {}

  ARGV.options do |opts|
    script_name = File.basename($0)
    opts.banner = 'Optparse Script Template'
    opts.define_head(&quot;Usage: #{script_name} [OPTIONS]&quot;,
                     'Example', &quot;#{script_name} -e val1 -i 1000 -f 2.35 -l foo,bar,baz&quot;)

    opts.separator('')

    # enumeration
    opts.on('-e', '--enumeration ENUM', [:val1, :val2],
            'val1 : Value1 description.',
            'val2 : Value2 description.')  { |v| options[:enum] = v }

    # integer
    opts.on('-i', '--integer INT', Integer, &quot;An integer.&quot; )  { |v| options[:int] = v }

    # float (value optional)
    opts.on('-f', '--float [FLOAT]', Float, &quot;A float.&quot; )  { |v| options[:float] = v }

    # list
    opts.on('-l', '--list e1,e2,e3', Array, &quot;A list of things.&quot; )  { |v| options[:list] = v }

    opts.separator('')

    opts.on_tail('-h', '--help', 'Show this help message.') do
      puts(opts)
      exit(0)
    end
    opts.parse!()
  end

  begin
    raise &quot;Unknown option(s) #{ARGV.join(', ')}&quot; if ARGV.any?()
    raise &quot;Enum option is missing&quot; if !options.key?(:enum)
    raise &quot;Integer option is missing&quot; if !options.key?(:int)
    raise &quot;Float option is missing&quot; if !options.key?(:float)
    raise &quot;List option is missing&quot; if !options.key?(:list)
  rescue Exception =&gt; ex
    puts &quot;#{ex.message()}. Please use -h or --help for usage.&quot;
    exit(1)
  end

  begin
    OptparseScriptTemplate.new(options).execute()
  rescue Exception =&gt; ex
    puts(&quot;Error while executing script: #{ex.message}&quot;)
    puts(&quot;Trace: #{ex.backtrace().join($/)}&quot;)
  end
end</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/a-ruby-command-line-option-parsing-template/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>User authentication in Asp.NET MVC using Cassandra and HectorSharp</title>
		<link>http://www.sinbadsoft.com/blog/user-management-in-asp-net-mvc-using-cassandra-and-hectorsharp/</link>
		<comments>http://www.sinbadsoft.com/blog/user-management-in-asp-net-mvc-using-cassandra-and-hectorsharp/#comments</comments>
		<pubDate>Sat, 15 May 2010 19:36:34 +0000</pubDate>
		<dc:creator>Chaker Nakhli</dc:creator>
				<category><![CDATA[.NET]]></category>
		<category><![CDATA[NoSql]]></category>
		<category><![CDATA[Asp.NET MVC]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[HectorSharp]]></category>

		<guid isPermaLink="false">http://www.javageneration.com/?p=128</guid>
		<description><![CDATA[In this post we will write a sample account management system for Asp.NET MVC applications using Cassandra as a back-end and HectorSharp as Cassandra&#8217;s .NET client. Before we begin, if you are not familiar with HectorSharp, here is a gentle introduction. Note also that we will be using .NET framework 3.5 and Asp.NET MVC 1.0 [...]]]></description>
				<content:encoded><![CDATA[<p>In this post we will write a sample account management system for <a title="Asp.NET MVC" href="http://www.asp.net/(S(d35rmemuuono1wvm1gsp2n45))/mvc" target="_blank">Asp.NET MVC</a> applications using <a title="Cassandra" href="http://cassandra.apache.org/" target="_blank">Cassandra</a> as a back-end and <a title="HectorSharp" href="http://hectorsharp.com/" target="_blank">HectorSharp</a> as Cassandra&#8217;s .NET client.</p>
<p><span id="more-128"></span></p>
<p>Before we begin, if you are not familiar with HectorSharp, here is a <a title="Cassandra meets Hector(Sharp)" href="http://www.javageneration.com/?p=104" target="_blank">gentle introduction</a>. Note also that we will be using .NET framework 3.5 and Asp.NET MVC 1.0 in the code fragments presented here.</p>
<h1>Interface</h1>
<p>In our application, account management and authentication are handled by a dedicated MVC controller: AccountController. As you would expect, this controller handles actions like LogOn, LogOff, Register, ChangePassword, etc. The actual work is not implemented in the controller methods; it is delegated to a separate authentication service. This separation of concerns makes the code cleaner and easier to change. Moreover, it eases controller testing, as we can inject a mocked, disconnected authentication service into the controller.</p>
<p>Let&#8217;s start by specifying the contract of our authentication service:</p>
<pre class="brush: csharp; title: ; notranslate">
using System.Web.Security;

public interface IAuthentificationService
{
  bool SignIn(string userName, string password, bool createPersistentCookie);

  void SignOut();

  MembershipCreateStatus SignUp(string userName, string password, string email);

  bool ChangePassword(string userName, string oldPassword, string newPassword);
}
</pre>
<p>This interface provides the basic operations needed to back our controller actions. Let&#8217;s keep it simple for the moment; we&#8217;ll refactor the code as new requirements emerge during development iterations. Notice that we are reusing the <a title="MembershipCreateStatus" href="http://msdn.microsoft.com/en-us/library/system.web.security.membershipcreatestatus.aspx" target="_blank">MembershipCreateStatus</a> enumeration from <a title="System.Web.Security" href="http://msdn.microsoft.com/en-us/library/system.web.security.aspx" target="_blank">System.Web.Security</a>.</p>
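<p>Because the controller only depends on this contract, a disconnected test double is easy to write. Here is a minimal sketch of one that keeps users in a dictionary instead of Cassandra. The class name and the SignedInUser property are hypothetical and exist only for this illustration; the interface is repeated so that the fragment compiles on its own.</p>
<pre class="brush: csharp; title: ; notranslate">
using System.Collections.Generic;
using System.Web.Security;

// Repeated from the listing above so that the sketch is self-contained.
public interface IAuthentificationService
{
  bool SignIn(string userName, string password, bool createPersistentCookie);
  void SignOut();
  MembershipCreateStatus SignUp(string userName, string password, string email);
  bool ChangePassword(string userName, string oldPassword, string newPassword);
}

// Hypothetical in-memory test double; it touches neither Cassandra nor cookies.
public class FakeAuthentificationService : IAuthentificationService
{
  private readonly Dictionary&lt;string, string&gt; passwords = new Dictionary&lt;string, string&gt;();

  // Lets a test assert who is signed in; null when nobody is.
  public string SignedInUser { get; private set; }

  public bool SignIn(string userName, string password, bool createPersistentCookie)
  {
    string stored;
    if (!passwords.TryGetValue(userName, out stored) || stored != password)
    {
      return false;
    }

    SignedInUser = userName;
    return true;
  }

  public void SignOut()
  {
    SignedInUser = null;
  }

  public MembershipCreateStatus SignUp(string userName, string password, string email)
  {
    if (passwords.ContainsKey(userName))
    {
      return MembershipCreateStatus.DuplicateUserName;
    }

    passwords[userName] = password;
    return MembershipCreateStatus.Success;
  }

  public bool ChangePassword(string userName, string oldPassword, string newPassword)
  {
    string stored;
    if (!passwords.TryGetValue(userName, out stored) || stored != oldPassword)
    {
      return false;
    }

    passwords[userName] = newPassword;
    return true;
  }
}
</pre>
<p>A controller test could then pass an instance of this class to the controller&#8217;s constructor and check the fake&#8217;s state after each action.</p>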
<h1>Implementation</h1>
<p>Now it is time to implement the interface using Cassandra as a store for user information. Basically, each time a new user signs up we insert their info into Cassandra, and each time a user wants to sign in we look up their info in Cassandra. Before writing code, let&#8217;s have a look at the schema we will be using. For the purpose of demonstration, we will simplify things. We will use a single column family, keyed by user name, to hold the user information: name, password and email. Here is an example of what a column family row would look like:</p>
<p><a href="http://nakhli.wpengine.com/wp-content/uploads/2010/05/user_info1.png"><img class="alignnone size-full wp-image-199" title="user_info" src="http://nakhli.wpengine.com/wp-content/uploads/2010/05/user_info1.png" alt="User Info" width="334" height="66" /></a></p>
<p>Our column family is defined in the storage-conf.xml as follows:</p>
<pre class="brush: xml; title: ; notranslate">
&lt;Keyspace Name=&quot;Chirpper&quot;&gt;
  &lt;ColumnFamily Name=&quot;Users&quot; CompareWith=&quot;UTF8Type&quot; /&gt;
&lt;/Keyspace&gt;
</pre>
<p>Based on this schema, the authentication service implementation is straightforward. We will use <a title="FormsAuthentication" href="http://msdn.microsoft.com/en-us/library/system.web.security.formsauthentication.aspx" target="_blank">FormsAuthentication</a> for maintaining the authentication cookie and focus on the membership management.</p>
<pre class="brush: csharp; title: ; notranslate">
using System.Web.Security;
using HectorSharp;

public class AuthentificationService : IAuthentificationService
{
  private static readonly ColumnPath UsersPasswordPath =
    new ColumnPath(&quot;Users&quot;, null, &quot;Password&quot;);
  private static readonly ColumnPath UsersNamePath =
    new ColumnPath(&quot;Users&quot;, null, &quot;Name&quot;);
  private static readonly ColumnPath UsersEmailPath =
    new ColumnPath(&quot;Users&quot;, null, &quot;Email&quot;);
  private readonly ICassandraClient client;

  public AuthentificationService(ICassandraClient cassandraClient)
  {
    client = cassandraClient;
  }

  private IKeyspace Keyspace { get { return client.GetKeyspace(&quot;Chirpper&quot;); } }

  public bool SignIn(string userName, string password, bool createPersistentCookie)
  {
    if (!ValidateUser(userName, password))
    {
      return false;
    }

    FormsAuthentication.SetAuthCookie(userName, createPersistentCookie);
    return true;
  }

  public void SignOut()
  {
    FormsAuthentication.SignOut();
  }

  public MembershipCreateStatus SignUp(string userName, string password, string email)
  {
    Column nameColumn;
    if (Keyspace.TryGetColumn(userName, UsersNamePath, out nameColumn))
    {
      return MembershipCreateStatus.DuplicateUserName;
    }

    Keyspace.Insert(userName, UsersNamePath, userName);
    Keyspace.Insert(userName, UsersPasswordPath, password);
    Keyspace.Insert(userName, UsersEmailPath, email);
    return MembershipCreateStatus.Success;
  }

  public bool ChangePassword(string userName, string oldPassword, string newPassword)
  {
    if (!ValidateUser(userName, oldPassword))
    {
      return false;
    }

    Keyspace.Insert(userName, UsersPasswordPath, newPassword);
    return true;
  }

  private bool ValidateUser(string userName, string password)
  {
    Column passwordColumn;
    if (Keyspace.TryGetColumn(userName, UsersPasswordPath, out passwordColumn))
    {
      return string.Equals(password, passwordColumn.Value);
    }

    return false;
  }
}</pre>
<h1>Conclusion</h1>
<p>We implemented an authentication service using Cassandra as a store for user information. This implementation is a first iteration. Chances are that we will want a richer schema (searching users by email, adding a password question, etc.) and more functionality in the service (checking password strength, hashing or encrypting stored passwords, etc.). We might also want to move the code that queries and updates the Cassandra store into a separate class. This class would act as a mediator between Cassandra and our application code (aka a repository).</p>
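<p>As an illustration of the password hashing mentioned above, here is a minimal sketch based on Rfc2898DeriveBytes (PBKDF2) from System.Security.Cryptography. The class name, the salt and hash sizes, the iteration count and the &#8220;salt:hash&#8221; storage format are assumptions made for this example. It is one possible approach, not the way the service above works.</p>
<pre class="brush: csharp; title: ; notranslate">
using System;
using System.Security.Cryptography;

// Hypothetical helper; names and parameters are assumptions for illustration.
public static class PasswordHasher
{
  private const int SaltSize = 16;      // bytes
  private const int HashSize = 32;      // bytes
  private const int Iterations = 10000; // assumed work factor, tune for your hardware

  // Returns &quot;base64(salt):base64(hash)&quot;, suitable for a single string column.
  public static string Hash(string password)
  {
    // This constructor generates a random salt of the requested size.
    var kdf = new Rfc2898DeriveBytes(password, SaltSize, Iterations);
    return Convert.ToBase64String(kdf.Salt) + &quot;:&quot; + Convert.ToBase64String(kdf.GetBytes(HashSize));
  }

  public static bool Verify(string password, string stored)
  {
    string[] parts = stored.Split(':');
    if (parts.Length != 2)
    {
      return false;
    }

    byte[] salt = Convert.FromBase64String(parts[0]);
    byte[] expected = Convert.FromBase64String(parts[1]);
    byte[] actual = new Rfc2898DeriveBytes(password, salt, Iterations).GetBytes(expected.Length);

    // Compare every byte without an early exit.
    int diff = 0;
    for (int i = 0; i &lt; expected.Length; i++)
    {
      diff |= expected[i] ^ actual[i];
    }

    return diff == 0;
  }
}
</pre>
<p>With such a helper, SignUp and ChangePassword would store PasswordHasher.Hash(password) in the Password column, and ValidateUser would call PasswordHasher.Verify on the stored value instead of comparing strings, assuming the column value is read back as a string.</p>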
<h1>Epilogue</h1>
<p>If you are curious about how the MVC controller would consume our authentication service, here is a sample implementation:</p>
<pre class="brush: csharp; title: ; notranslate">
using System;
using System.Security.Principal;
using System.Web.Mvc;
using System.Web.Security;

[HandleError]
public class AccountController : Controller
{
  private readonly AccountInputValidator validator;

  public AccountController()
    : this(null)
  {
  }

  public AccountController(IAuthentificationService authentificationService)
  {
    AuthentificationService = authentificationService
      ?? new AuthentificationService(CassandraClients.Make());
    validator = new AccountInputValidator(ModelState);
  }

  private IAuthentificationService AuthentificationService { get; set; }

  public ActionResult LogOn()
  {
    return View();
  }

  [AcceptVerbs(HttpVerbs.Post)]
  public ActionResult LogOn(string userName, string password, bool rememberMe, string returnUrl)
  {
    if (!validator.ValidateLogOn(userName, password))
    {
      return View();
    }

    if (!AuthentificationService.SignIn(userName, password, rememberMe))
    {
      ModelState.AddModelError(&quot;_FORM&quot;, &quot;The username or password provided is incorrect.&quot;);
      return View(); // redisplay the form instead of redirecting after a failed sign-in
    }

    if (!string.IsNullOrEmpty(returnUrl))
    {
      return Redirect(returnUrl);
    }

    return RedirectToAction(&quot;Index&quot;, &quot;Home&quot;);
  }

  public ActionResult LogOff()
  {
    AuthentificationService.SignOut();
    return RedirectToAction(&quot;Index&quot;, &quot;Home&quot;);
  }

  public ActionResult Register()
  {
    return View();
  }

  [AcceptVerbs(HttpVerbs.Post)]
  public ActionResult Register(string userName, string email, string password, string confirmPassword)
  {
    if (validator.ValidateRegistration(userName, email, password, confirmPassword))
    {
      var status = AuthentificationService.SignUp(userName, password, email);
      if (status == MembershipCreateStatus.Success)
      {
        return RedirectToAction(&quot;Index&quot;, &quot;Home&quot;);
      }

      ModelState.AddModelError(&quot;_FORM&quot;, AuthentificationStatus.ToString(status));
    }

    // If we got this far, something failed, redisplay form
    return View();
  }

  [Authorize]
  public ActionResult ChangePassword()
  {
    return View();
  }

  [Authorize]
  [AcceptVerbs(HttpVerbs.Post)]
  public ActionResult ChangePassword(string currentPassword, string newPassword, string confirmPassword)
  {
    if (!validator.ValidateChangePassword(currentPassword, newPassword, confirmPassword))
    {
      return View();
    }

    try
    {
      if (AuthentificationService.ChangePassword(User.Identity.Name, currentPassword, newPassword))
      {
        return View(&quot;ChangePasswordSuccess&quot;);
      }

      ModelState.AddModelError(&quot;_FORM&quot;, &quot;The current password is incorrect or the new password is invalid.&quot;);
      return View();
    }
    catch
    {
      ModelState.AddModelError(&quot;_FORM&quot;, &quot;The current password is incorrect or the new password is invalid.&quot;);
      return View();
    }
  }

  protected override void OnActionExecuting(ActionExecutingContext filterContext)
  {
    if (filterContext.HttpContext.User.Identity is WindowsIdentity)
    {
      throw new InvalidOperationException(&quot;Windows authentication is not supported.&quot;);
    }
  }
}</pre>
]]></content:encoded>
			<wfw:commentRss>http://www.sinbadsoft.com/blog/user-management-in-asp-net-mvc-using-cassandra-and-hectorsharp/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
