<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Derivante &#187; high availability</title>
	<atom:link href="http://www.derivante.com/tag/high-availability/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.derivante.com</link>
	<description>to obtain or receive from a source</description>
	<lastBuildDate>Mon, 26 Apr 2010 18:44:42 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Digging into HAProxy</title>
		<link>http://www.derivante.com/2008/08/13/digging-into-haproxy/</link>
		<comments>http://www.derivante.com/2008/08/13/digging-into-haproxy/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 22:59:08 +0000</pubDate>
		<dc:creator>Justin Leider</dc:creator>
				<category><![CDATA[Web Architecture]]></category>
		<category><![CDATA[Web Technology]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[free]]></category>
		<category><![CDATA[HAProxy]]></category>
		<category><![CDATA[high availability]]></category>
		<category><![CDATA[Load Balancing]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://justinleider.wordpress.com/?p=19</guid>
		<description><![CDATA[Well, it's been a few weeks since my last posting here and there is certainly a good reason for that. Every once in a while I just need to completely unplug from technology. So it only made sense for me to go away on vacation to the middle of nowhere up in Maine's great [...]]]></description>
			<content:encoded><![CDATA[<p>Well, it's been a few weeks since my last post here, and there is certainly a good reason for that. Every once in a while I just need to completely unplug from technology, so it only made sense to go away on vacation to the middle of nowhere, up in Maine's great north woods, for a couple of weeks. No computers, no cellphones, no towns, no people, just dirt logging roads, lakes, rivers, wildlife and trees. Now that I'm back and caught up, I will start posting regularly again.</p>
<p style="margin-bottom:0;">Getting back to reality: as the title states, this post focuses on the reasons for using <a title="HA Proxy -- Load Balancing " href="http://haproxy.1wt.eu/">HAProxy</a>, along with a little on <a title="Hyper-Local Search Portal" href="http://citysquares.com">CitySquares'</a> implementation of the load balancer. Let me start by quoting the description of HAProxy from its website:</p>
<blockquote>
<p style="margin-bottom:0;">“HAProxy is a free, <em><strong>very</strong></em> fast and reliable solution offering <a href="http://en.wikipedia.org/wiki/High_availability">high availability</a>, <a href="http://en.wikipedia.org/wiki/Load_balancer">load balancing</a>, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer7 processing. Supporting <strong>tens of thousands</strong> of connections is clearly realistic with todays hardware.”</p>
</blockquote>
<p style="margin-bottom:0;">While the high availability aspect of HAProxy is all well and good, everything is expected to be highly available these days. Any sort of downtime has become unacceptable, even in the middle of the night. This is especially true when relying on search-engine-driven traffic. I've noticed that search engines such as Google and Yahoo really ramp up their crawl rate in the wee hours of the morning. The crawl rate is boosted even more on weekend nights, when fewer people are searching the web and the search engines can allocate more of their resources to crawling. CitySquares has certainly been subject to DoS attacks by GoogleBot on Friday nights.</p>
<p style="margin-bottom:0;">This is where the load balancing aspect of HAProxy comes into play; it is one of the main reasons we chose it as our front-facing service. With just a couple of HAProxy servers we can maintain redundancy while having a nearly unlimited pool of Apache web servers to hand requests off to. We don't need any special front-facing load balancing hardware that would act as a single point of failure. We can also keep some money in our pocket by using a software solution. Luckily, HAProxy is open source and free to the world, licensed under the <a title="GPL v2 License Terms" href="http://www.opensource.org/licenses/gpl-2.0.php">GPL v2</a>.</p>
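<p style="margin-bottom:0;">To make that layout concrete, here is a minimal sketch of what a configuration along these lines might look like. It is not CitySquares' actual configuration: the section names, server names, addresses, ports and timeouts are all placeholders chosen for illustration, and the directive names follow recent HAProxy releases, so check them against the documentation for the version you run.</p>
<pre>
# Hypothetical example only: every name, address, port and timeout
# below is a placeholder, not a value from a real deployment.
global
    maxconn 4096

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# Public entry point: accept HTTP traffic and hand it to the pool.
frontend www
    bind *:80
    default_backend apache_pool

# Pool of Apache web servers; "check" enables health checks so a
# server that stops responding is taken out of rotation.
backend apache_pool
    balance roundrobin
    option httpchk GET /
    server web1 10.0.0.11:8080 check
    server web2 10.0.0.12:8080 check
</pre>
<p style="margin-bottom:0;">In a setup like this, growing the pool is a matter of adding another <code>server</code> line to the backend, which is what makes the "nearly unlimited pool" idea practical.</p>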
<p style="margin-bottom:0;">Not only does HAProxy handle our load balancing, but it also serves as a central access point for DNS purposes. This is certainly much better than our current DNS round robin, which is limited in its own right. Is this common sense? Probably, but I figured it was worth pointing out.</p>
<p style="margin-bottom:0;">Lastly, security is always a concern for heavily trafficked and high-profile sites. The developer behind HAProxy has been very proactive about the program's architecture and coding practices, and as a result HAProxy can claim it has never had a single known vulnerability in over five years. Since all front-facing applications are subject to attacks from so many different sources these days, having a stable and secure application is a godsend when it comes to any sort of security-related IT maintenance.</p>
<p style="margin-bottom:0;">As far as implementation goes, I suspect that we will eventually need to move the HAProxy instances onto their own dedicated servers as traffic increases. In the meantime, with EC2, we are running them alongside Apache on the same servers. This is purely a cost-saving measure, as every server instance started on EC2 means more cash out the door. As it is, HAProxy is incredibly fast and lean and really doesn't consume much in the way of system resources, either CPU load or memory.</p>
<p style="margin-bottom:0;">There are certainly other reasons for choosing HAProxy, but they are beyond the scope of this post. I encourage everyone to take a serious look at HAProxy when spec'ing out a load balancer or proxy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.derivante.com/2008/08/13/digging-into-haproxy/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>