<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Derivante &#187; Uncategorized</title>
	<atom:link href="http://www.derivante.com/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.derivante.com</link>
	<description>to obtain or receive from a source</description>
	<lastBuildDate>Mon, 26 Apr 2010 18:44:42 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>ActiveRecord and Zend_Paginator_Adapter_Interface</title>
		<link>http://www.derivante.com/2009/10/29/activerecord-and-zend_paginator_adapter_interface/</link>
		<comments>http://www.derivante.com/2009/10/29/activerecord-and-zend_paginator_adapter_interface/#comments</comments>
		<pubDate>Thu, 29 Oct 2009 14:29:07 +0000</pubDate>
		<dc:creator>Clay vanSchalkwijk</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.derivante.com/?p=757</guid>
		<description><![CDATA[Zend has a lot of tools to help speed up the application development process.  One such tool that I found useful is Paginator.  I am using php-activerecord in my project with Zend_Framework as the backend, and tying the two together is very simple.  Paginator requires two methods: it needs to be able to pull a count [...]]]></description>
			<content:encoded><![CDATA[<p>Zend has a lot of tools to help speed up the application development process.  One such tool that I found useful is Paginator.  I am using <a href="http://www.phpactiverecord.org" target="_blank">php-activerecord</a> in my project with Zend_Framework as the backend, and tying the two together is very simple.  Paginator requires two methods: it needs to be able to pull a count to get the total, and it needs to be able to pull in a subset of the data.  Take a look at the following example:</p>
<pre class="php">&nbsp;
<span style="color: #000000; font-weight: bold;">&lt;?php</span>
<span style="color: #000000; font-weight: bold;">class</span> My_Paginator implements Zend_Paginator_Adapter_Interface <span style="color: #66cc66;">&#123;</span>
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> __construct<span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$table</span>,<span style="color: #0000ff;">$conditions</span> = <a style="text-decoration: none;" href="http://www.php.net/array"><span style="color: #000066;">array</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
	<span style="color: #66cc66;">&#123;</span>
		<span style="color: #b1b100;">if</span><span style="color: #66cc66;">&#40;</span>!<a style="text-decoration: none;" href="http://www.php.net/is_array"><span style="color: #000066;">is_array</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$conditions</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>
&nbsp;
			<span style="color: #0000ff;">$conditions</span> = <a style="text-decoration: none;" href="http://www.php.net/array"><span style="color: #000066;">array</span></a><span style="color: #66cc66;">&#40;</span> <span style="color: #0000ff;">$conditions</span> <span style="color: #66cc66;">&#41;</span>;
&nbsp;
		<span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">conditions</span> = <span style="color: #0000ff;">$conditions</span>;
&nbsp;
		<span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">table</span>	  = <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #0000ff;">$table</span>;
&nbsp;
	<span style="color: #66cc66;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> getItems<span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$offset</span>, <span style="color: #0000ff;">$itemCountPerPage</span><span style="color: #66cc66;">&#41;</span>
	<span style="color: #66cc66;">&#123;</span>
			<span style="color: #b1b100;">return</span> <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">table</span>-&gt;<span style="color: #006600;">find</span><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'all'</span>, <a style="text-decoration: none;" href="http://www.php.net/array"><span style="color: #000066;">array</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'limit'</span> =&gt; <span style="color: #0000ff;">$itemCountPerPage</span>, <span style="color: #ff0000;">'offset'</span> =&gt; <span style="color: #0000ff;">$offset</span>, <span style="color: #ff0000;">'conditions'</span> =&gt; <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">conditions</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>;
	<span style="color: #66cc66;">&#125;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> <a style="text-decoration: none;" href="http://www.php.net/count"><span style="color: #000066;">count</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>
	<span style="color: #66cc66;">&#123;</span>
		<span style="color: #b1b100;">return</span> <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">table</span>-&gt;<span style="color: #006600;">count</span><span style="color: #66cc66;">&#40;</span><a style="text-decoration: none;" href="http://www.php.net/array"><span style="color: #000066;">array</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'conditions'</span> =&gt; <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">conditions</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>;
	<span style="color: #66cc66;">&#125;</span>
<span style="color: #66cc66;">&#125;</span>
<span style="color: #000000; font-weight: bold;">?&gt;</span>
&nbsp;</pre>
<p>The two methods that Zend_Paginator_Adapter_Interface expects are count() and getItems().  The above example is a little "raw", but it should serve as a guide for extending the Paginator with its own adapter, regardless of what your database layer is.  The $conditions argument holds the conditions that are passed through to the SQL query:</p>
<pre class="php">&nbsp;
<span style="color: #0000ff;">$paginator</span> = <span style="color: #000000; font-weight: bold;">new</span> Zend_Paginator<span style="color: #66cc66;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> My_Paginator<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'User'</span>,<span style="color: #ff0000;">' active = &quot;Y&quot; '</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>;
&nbsp;</pre>
<p>We want to access the User model and pull out only the users who are active.  You can certainly put in more complicated SQL here, but for general use this covers 99% of what I want from ActiveRecord and paging: pass a model to the adapter, along with some basic conditions for the listing.</p>
<pre class="php">&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">function</span> pageUserAction<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #66cc66;">&#123;</span>
 		<span style="color: #0000ff;">$paginator</span> = <span style="color: #000000; font-weight: bold;">new</span> Zend_Paginator<span style="color: #66cc66;">&#40;</span><span style="color: #000000; font-weight: bold;">new</span> My_Paginator<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'User'</span>,<span style="color: #ff0000;">' active = &quot;Y&quot; '</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>;
		<span style="color: #0000ff;">$paginator</span>-&gt;<span style="color: #006600;">setCurrentPageNumber</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$this</span>-&gt;_getParam<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'page'</span>, <span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span>;
		<span style="color: #0000ff;">$paginator</span>-&gt;<span style="color: #006600;">setItemCountPerPage</span><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'25'</span><span style="color: #66cc66;">&#41;</span>;
		<span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">view</span>-&gt;<span style="color: #006600;">paginator</span> = <span style="color: #0000ff;">$paginator</span>;
    <span style="color: #66cc66;">&#125;</span>
&nbsp;</pre>
<p>This is basic usage of Zend_Paginator: you pass in the current page number and the number of results per page, then push the paginator out to the view.  On the view side:</p>
<pre class="php">&nbsp;
&lt;div id=<span style="color: #ff0000;">&quot;userlist&quot;</span>&gt;
<span style="color: #000000; font-weight: bold;">&lt;?php</span> <span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span><a style="text-decoration: none;" href="http://www.php.net/count"><span style="color: #000066;">count</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">paginator</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#123;</span>
	<span style="color: #b1b100;">foreach</span> <span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">paginator</span> <span style="color: #b1b100;">as</span> <span style="color: #0000ff;">$user</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#123;</span>
		<a style="text-decoration: none;" href="http://www.php.net/echo"><span style="color: #000066;">echo</span></a> <span style="color: #0000ff;">$user</span>-&gt;<span style="color: #006600;">username</span>.<span style="color: #ff0000;">&quot;&lt;br&gt;&quot;</span>;
	<span style="color: #66cc66;">&#125;</span>
<span style="color: #66cc66;">&#125;</span>
<span style="color: #000000; font-weight: bold;">?&gt;</span>
&lt;/div&gt;
&nbsp;
<span style="color: #000000; font-weight: bold;">&lt;?</span>= <span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">paginationControl</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$this</span>-&gt;<span style="color: #006600;">paginator</span>, <span style="color: #ff0000;">'Elastic'</span>, <span style="color: #ff0000;">'/common/paginator.phtml'</span><span style="color: #66cc66;">&#41;</span>; <span style="color: #000000; font-weight: bold;">?&gt;</span>
&nbsp;</pre>
<p>If anyone has any problems or questions getting the two to work together, let me know in the comments and I will do my best to answer them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.derivante.com/2009/10/29/activerecord-and-zend_paginator_adapter_interface/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Bayesian filter training with N-gram</title>
		<link>http://www.derivante.com/2009/03/31/bayesian-filter-training-with-n-gram/</link>
		<comments>http://www.derivante.com/2009/03/31/bayesian-filter-training-with-n-gram/#comments</comments>
		<pubDate>Wed, 01 Apr 2009 04:09:19 +0000</pubDate>
		<dc:creator>Clay vanSchalkwijk</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.derivante.com/?p=274</guid>
		<description><![CDATA[Bayesian filtering is based on the principle that most events are dependent and that the probability of an event occurring in the future can be inferred from the previous occurrences of that event (link). A probability value is then assigned to each word or token; the probability is based on calculations that take into account [...]]]></description>
			<content:encoded><![CDATA[<p>Bayesian filtering is based on the principle that most events are dependent and that the probability of an event occurring in the future can be inferred from the previous occurrences of that event (<a href="http://support.gfi.com/manuals/en/me12/me12manual.1.13.html">link</a>).  A probability value is then assigned to each word or token; the probability is based on calculations that take into account how often that word occurs in one category or another.  The most common application of the filter is identifying words that appear in spam versus legitimate emails.  A word by itself is often useless without the context it was used in.</p>
<p>There is a whole suite of tools that can break down content to help improve the filter, supplementing it not only with a database mapping words to categories but also with sets of <a href="http://en.wikipedia.org/wiki/N-gram">N-grams</a> derived from the text.  Several scripts are available to help with this extraction, and it adds a few more layers of depth to Bayesian filtering.  One such tool is the <a href="http://ngram.sourceforge.net/">Ngram Statistics Package (NSP)</a>, which is easy to install and run.</p>
<p><span id="more-274"></span><br />
I ran a very basic test against an older <a href="http://www.derivante.com/2009/01/26/there-and-back-again-an-ec2-mysql-cluster/">post</a> to see how it does with bigram extraction.</p>
<p># perl bin/count.pl --ngram 2 test.cnt test.txt<br />
# perl statistic.pl --ngram 2 dice test.res test.cnt</p>
<p>Sample bigrams found:</p>
<p>cloud computing, master slave, groups online, Back Again, made absolutely, very costly, extensive development, hefty bill, start ups, distribution awareness</p>
<p>Rather than only scoring the probability that individual words fit into a category (in this case, "Technology"), we can now compound the score with the probability that the bigrams above fall into that category as well.  For another layer of scoring, trigrams, 4-grams, and so on can be extracted.  In the financial sector the terminology is thick, and analysis is almost impossible without N-gram extraction: "filed for bankruptcy" and "avoided bankruptcy" could not be further apart.  With traditional filtering, the word "bankruptcy" on its own would be meaningless; without context, it does not indicate whether the article is favorable or not.  By extracting the phrases, the filter can tell the two apart and score them appropriately.</p>
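<p>To make the bankruptcy example concrete, here is a minimal Python sketch of word-level N-gram extraction.  It is not how NSP works internally; the function name <code>extract_ngrams</code>, the tokenizing regular expression, and the two sample sentences are assumptions made up for illustration.</p>

```python
import re
from collections import Counter


def extract_ngrams(text, n=2):
    """Return a Counter of lowercase word n-grams found in text."""
    # Assumption: a "word" is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


# Hypothetical sample sentences, not taken from any real press release.
filed = extract_ngrams("The company filed for bankruptcy on Monday.")
avoided = extract_ngrams("The company avoided bankruptcy on Monday.")

# Both texts contain the single word "bankruptcy", but the bigrams around
# it differ, which gives a filter something to tell the two apart.
print(sorted(set(filed) - set(avoided)))
print(sorted(set(avoided) - set(filed)))
```

<p>Each bigram can then be stored and scored as a token in its own right, alongside the single words.</p>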
<p>Paul Graham has been working on <a href="http://www.bgl.nu/bogofilter/graham.html">improving</a> the Bayesian filter to deal with spam by splitting the data into categories.  Text is classified as legitimate or spam based not only on the context of the message but also on the likelihood of tokens appearing in various parts of the message.  N-gram filtering would not work as well for spam, because the number of grammar mistakes, misspellings, and word-order changes would wipe out any benefit, and spammers are adjusting their content to beat such filters all the time.  When the source data is reliable, adding N-grams to the filter should boost categorization accuracy.</p>
<p>Integrating N-grams into traditional Bayesian filtering is very easy.  <a href="http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html">Google</a> has been using text processing for a while now.  This is a huge area of study in linguistics, language processing, and machine learning.  With so much data out there and more being collected every day, deriving context from text will allow applications to behave smarter.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.derivante.com/2009/03/31/bayesian-filter-training-with-n-gram/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Bayesian Filtering &amp; Financial Applications</title>
		<link>http://www.derivante.com/2009/03/27/bayesian-filtering-financial-applications/</link>
		<comments>http://www.derivante.com/2009/03/27/bayesian-filtering-financial-applications/#comments</comments>
		<pubDate>Fri, 27 Mar 2009 17:08:17 +0000</pubDate>
		<dc:creator>Clay vanSchalkwijk</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bayesian]]></category>
		<category><![CDATA[content analysis]]></category>
		<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://www.derivante.com/?p=257</guid>
		<description><![CDATA[A friend of mine and I recently started a new project. After kicking around several ideas we finally reached a consensus on applying software prediction to financial data. This has been pursued pretty heavily, but from a home-brew standpoint we wanted to see whether we could build software that could compete by mashing up existing data and [...]]]></description>
			<content:encoded><![CDATA[<p>A friend of mine and I recently started a new project.  After kicking around several ideas we finally reached a consensus on applying software prediction to financial data.  This has been pursued pretty heavily, but from a home-brew standpoint we wanted to see whether we could build software that could compete by mashing up existing data and technology available on the internet.</p>
<p>We intend to predict the movement of stocks based on real-time content analysis.  This requires a good deal of machine learning and historical data, and even good content analysis is not enough on its own.  Using Bayesian filtering with noise-word reduction, we plan to process historical data and assign each piece of content to one of three categories: moveup, movedown, or nomove.  To train the filters, past press releases will be fed into the filter together with the matching stock data, to track how the markets reacted to the content.  Over time, the software should be able to recognize keywords that trigger positive or negative sentiment in the market and drive the price one way or the other.  A score can be applied, much like spam scores are, and this number can be used as part of a larger overall algorithm to determine an action.</p>
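<p>As a rough Python sketch of that training step, the snippet below labels a press release by how the price moved and counts its words under that label.  The 1% threshold, the function names, and the sample headlines and prices are all assumptions for illustration; noise-word reduction is left out for brevity.</p>

```python
from collections import Counter, defaultdict


def label_movement(price_before, price_after, threshold=0.01):
    """Map a relative price change to one of the three training categories."""
    change = (price_after - price_before) / price_before
    if change >= threshold:
        return "moveup"
    if -change >= threshold:
        return "movedown"
    return "nomove"


# Word counts per category: the raw material for per-word probabilities.
word_counts = defaultdict(Counter)


def train(text, label):
    """Count every word of a press release under its movement label."""
    for word in text.lower().split():
        word_counts[label][word] += 1


# Hypothetical press releases paired with made-up before/after prices.
train("Capital reserves boosted by new funding", label_movement(10.00, 10.50))
train("Company files for bankruptcy protection", label_movement(10.00, 8.00))

print(word_counts["moveup"]["boosted"], word_counts["movedown"]["bankruptcy"])
```

<p>How long after a release the price is sampled, and how large a move counts as a move, are tuning decisions that back-testing would have to settle.</p>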
<p>Just to bring a few readers up to speed on exactly how this will be applied, take the following formula:</p>
<p><img class="aligncenter size-full wp-image-263" src="http://www.derivante.com/wp-content/uploads/2009/03/b307149835ea31ced4ae23af2ab89b05.png" alt="" width="437" height="46" /></p>
<p>Rather than training it to recognize the probability of spam, we train it to recognize the probability that the word will trigger positive stock movement:</p>
<ul>
<li><span class="texhtml"><em>p</em></span> is the probability that the content will result in positive movement.</li>
<li><span class="texhtml"><em>p</em>1</span> is the probability <span class="texhtml"><em>p</em>(<em>S</em> | <em>W</em>1)</span> that it is positive knowing it contains a first word (for example "capital");</li>
<li><span class="texhtml"><em>p</em>2</span> is the probability <span class="texhtml"><em>p</em>(<em>S</em> | <em>W</em>2)</span> that it is positive knowing it contains a second word (for example "boosted");</li>
<li><em>etc...</em></li>
</ul>
<p>The entire body of the content will be processed against a known database of words and the market reaction to the presence of those words.  The basic Bayesian filtering will need to be extended to deal with phrase recognition, but overall it is a solid, proven machine learning technology to build from.</p>
<p>This information by itself is nothing revolutionary, but combined with strong pattern analysis, such as candlestick pattern recognition and other market indicators, it can be used to build a trading platform that aims for marginal gains, which over time can add up to pretty high returns.  There is certainly a lot of potential here, but it depends heavily on the predictions being accurate, and there will be a lot of trial and error in the process.</p>
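<p>Here is a minimal Python sketch of how the per-word probabilities could be combined into one score.  It assumes the formula pictured above is the usual combining rule from Bayesian spam filtering, p = p1·p2·…·pN / (p1·p2·…·pN + (1 − p1)(1 − p2)…(1 − pN)); the function name and the sample values are made up for illustration.</p>

```python
from math import prod


def combine_probabilities(word_probabilities):
    """Combine per-word probabilities p1..pN into one overall probability p.

    Assumes every value lies strictly between 0 and 1, so the
    denominator can never be zero.
    """
    positive = prod(word_probabilities)
    negative = prod(1.0 - p for p in word_probabilities)
    return positive / (positive + negative)


# Hypothetical per-word values, e.g. for "capital" and "boosted".
score = combine_probabilities([0.8, 0.7])
print(round(score, 3))
```

<p>A word with a probability of 0.5 leaves the score unchanged, so words the filter has learned nothing about contribute nothing either way.</p>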
<p>For more reading on the concepts and components behind this idea, check out:</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Naive_Bayes_classifier">Naive Bayes Classifier</a></li>
<li><a href="http://www.leavittbrothers.com/education/candlestick_patterns/">Candlestick Patterns</a></li>
<li><a href="http://en.wikipedia.org/wiki/Candlestick_chart">Candlestick Charting</a></li>
<li><a href="http://www.paulgraham.com/better.html">Better Bayesian Filtering</a></li>
<li><a href="http://www.tdameritrade.com/tradingtools/partnertools/api_dev.html">TD Ameritrade API</a></li>
</ul>
<p>The nice part is that all the historical data is already out there on the internet, which makes back-testing and scoring very easy to do.  That is just as well, because there will need to be a lot of testing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.derivante.com/2009/03/27/bayesian-filtering-financial-applications/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Is Amazon&#8217;s EC2 right for you?</title>
		<link>http://www.derivante.com/2009/01/26/is-amazons-ec2-right-for-you/</link>
		<comments>http://www.derivante.com/2009/01/26/is-amazons-ec2-right-for-you/#comments</comments>
		<pubDate>Mon, 26 Jan 2009 20:50:11 +0000</pubDate>
		<dc:creator>Justin Leider</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[horizontal architecture]]></category>
		<category><![CDATA[horizontal database]]></category>
		<category><![CDATA[IT Infrastructure]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[Site Architecture]]></category>

		<guid isPermaLink="false">http://justinleider.com/?p=49</guid>
		<description><![CDATA[I've been asked this and similar questions quite a bit lately. But before I delve into the answer to this I want to lay the foundation and ask you a question. This one question should play a large part in your final assessment to go with EC2 or not. The question you should ask yourself [...]]]></description>
			<content:encoded><![CDATA[
<p style="margin-bottom:0;">I've been asked this and similar questions quite a bit lately. But before I delve into the answer to this I want to lay the foundation and ask you a question. This one question should play a large part in your final assessment to go with EC2 or not. The question you should ask yourself is:</p>
<p style="margin-bottom:0;"><strong>How quickly do you actually need to scale either up or down? </strong></p>
<p style="margin-bottom:0;">The answer to this will likely influence the correct solution to your problems. The following list is how I classify levels of scalability. Each one comes with its own pros and cons, but generally the quicker you need something, the more expensive it is going to be.</p>
<ul>
<li><strong>Immediate</strong> - within minutes - EC2 or other cloud computing networks</li>
<li><strong>Fast</strong> - within days to a week - Managed Hosting, Rackspace, The Planet, etc</li>
<li><strong>Average</strong> - within weeks to a month - Own your own hardware, Dell, HP, IBM, etc</li>
<li><strong>Corporate</strong> - within months/years - Good Luck</li>
</ul>
<p style="margin-bottom:0;">With this in mind, everyone hears the hype of EC2, with its scalability, fully managed hardware and virtualization but there really aren't that many people out there describing their experiences with it. When we made the decision to go with EC2 we did our research and due diligence before making the switch. There wasn't much to go on but the few articles and blog posts we did read were all positive. I guess we all got caught up in the hype here as well.</p>
<p style="margin-bottom:0;">Even after all our research it turns out that going with EC2 was one of the poorer IT decisions we have made. EC2 has turned out to be more expensive, more difficult to implement, and poorer in performance than we had expected even in our worst-case estimates. To top it all off, we didn't fully utilize the main benefit of going with EC2, which is immediate scalability. Our traffic is relatively predictable and grows or shrinks in manageable percentages, so capacity can be scaled up within days instead of minutes. We never have any massive spikes in our traffic, either up or down. Even if we did have spikes, we would be limited by our MySQL cluster.</p>
<p style="margin-bottom:0;">While we had to rethink a lot of our architecture to create a more horizontal platform instead of the traditional vertical scaling, MySQL was by far our biggest bottleneck. The source of the problem is rooted in Amazon's preset machine sizes. While they have done an adequate job of offering different types of instances, with more memory in one line and more computational power in the other, you are still limited to what they are offering. With the large database we have and the latencies between the instances and their permanent storage, we were forced to keep as much of our database as possible cached in RAM. Now this shouldn't have been too big a deal: just get a machine with a ton of RAM. Well, unfortunately Amazon's biggest instance only offered us a maximum of 15GB. Needless to say, this was not sufficient and forced us to adopt a cluster solution. This in and of itself is not ideal, especially when you should be able to run off a single box with 32GB of RAM and access to fast local disks. Instead, it took us twelve (12) m1.xlarge instances to reach the level of performance and availability we desired. The network IO latency between node and disk storage, and from node to node, only added insult to injury.</p>
<p style="margin-bottom:0;">While the speed and size of the cluster were not desirable, it worked. However, we had to completely forfeit any sort of scalability to achieve a working database. To my knowledge there is no way to quickly and easily boot up more instances of MySQL to supplement a live cluster. In order for us to add more capacity we would have to perform a rolling reboot of every machine in the cluster. It's unfortunate that databases were not designed with EC2 in mind.</p>
<p style="margin-bottom:0;">However, there are companies who are trying to tap into this pain point. We were looking very intently at a company called Continuent, which produces a MySQL cluster monitoring and management tool. Unfortunately, as of Jan 2009 the product was still in private beta and was unavailable to us. This tool would have allowed us to add nodes to the cluster on the fly without having to take it down in the process. Even with this extra tool, which wasn't cheap, you still couldn't scale down the cluster without taking it off-line. As far as I am concerned, if you are already using the largest instance available to you (an m1.xlarge or c1.xlarge), there is no way to vertically scale up a database with EC2. Instead you are forced into a less than ideal environment for hosting a horizontal architecture, which could have serious consequences for your code base and SQL queries.</p>
<p style="margin-bottom:0;">To be honest, EC2 offers a lot of benefits that are hard to come by with other solutions. EC2 is great for companies doing lots of non-real-time activities such as batch and queued processing. Companies who have a small database that can be cached in RAM and replicated easily will also benefit from EC2: just boot up a bunch of instances and go to town. However, the bottom line is that if you have fairly consistent usage patterns and your applications are performance sensitive, then there are much faster and more cost-effective ways of abstracting your hardware requirements. We at citysquares are in the process of moving off of EC2 and onto a managed hosting platform. We still enjoy the benefits of leased hardware like we had with EC2 and the ability to quickly add new hardware. Granted, more servers aren't available to us at the drop of a hat, but a couple of days' lead time to get another box up and running is more than sufficient for us. Not only that, but we also have a whole team of IT people working with us to help alleviate our burden of supporting the entire hardware/software stack. We can now focus on what we do best, which is our application.</p>
<p style="margin-bottom:0;">Keep in mind that there is no concrete answer as to whether EC2 or cloud computing in general will work for you or not. You need to determine if the capacity and latencies of the pre-determined instance sizes will meet your growing infrastructure needs. For us the bitter answer was a resounding no. We were able to spec out a solution in a fully managed hosting environment for about half the monthly cost of EC2 while increasing the performance of our application significantly.</p>
<p style="margin-bottom:0;">So, is Amazon's EC2 right for you?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.derivante.com/2009/01/26/is-amazons-ec2-right-for-you/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>