
Building a 2-Node ESXi Cluster with Centralized Storage for $2,500

What started as a simple goal… replacing my vSphere 4.1 whitebox with something that more closely resembles a production environment… turned into a design requirement for a multi-node ESXi lab cluster that can do HA, vMotion, DRS, and most of the other good stuff.  But do it without having to resort to using nested ESXi.  And I wanted an iSCSI storage array that was fast.  Not Synology NAS "kind of" fast under certain conditions… but something with SSD-like performance, and a bunch of space.  And I didn't want to spend more than $2,500 for everything.  In other words, what I really wanted was something akin to a 2-node ESXi cluster with a SAN backing it with 10TB of "fast" disk I/O.  Almost something like a Dell VRTX, but for my home lab.  And I wanted to spend less than one-tenth of what it might cost with Enterprise-grade gear.  As I iterated through hardware configurations, checking the vSphere whitebox forums, contrasting against the HCL, and quickly running out of budget, it became clear that the only way to do this was to figure out the storage piece first.

Key Design Decision

So the question became: do I really want to bother with a "real" iSCSI storage array?  If not, I could opt for multiple local SSDs with hardware RAID and drop the dedicated storage array entirely – or make some other compromises.  Sure, I would have had more IOPS than I would have known what to do with, and yeah, maybe I could have resigned myself to living in a nested ESXi world, but no.  As it turned out, someone had already done some good heavy lifting on this topic, and had built a fast homemade iSCSI storage array with hardware RAID for $1,500.  No VAAI of course, but still… not bad for $1,500.  More than that though, it left $1,000 to put together a couple of hosts – so plenty, right?

Critical Path item – Inexpensive Storage

If the storage array as a build component was the biggest overall challenge, then raw storage was the critical path item in terms of procurement.  In order to come out with about 11TB of usable capacity, I needed 14 x 1TB drives in a RAID6, and I needed to spend less than $700.  Depending on when you read this article, that may be less of a challenge than it was in early 2014, but at the time $50 per TB on a 7200RPM drive was hard to come by.  Harder still was finding Hitachi Ultrastar 7200RPM drives with 32MB of cache – they're Enterprise-grade and not always around on eBay.  After a couple of months bidding on large lots, I eventually found 14 for $50 each.  Given no schedule constraints, I could perhaps have ended up with 2X as much storage for a reasonable cost premium – but 11TB of usable space in a RAID6 configuration exceeded my need, and kept me on pace to acquire most of the hardware on schedule.
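As a quick sanity check on the capacity and cost math, here's a minimal sketch.  The only thing I'm adding beyond the figures above is the decimal-TB-to-binary-TiB conversion, which is why 12TB of RAID6 capacity shows up as roughly 11TB usable:

```python
# RAID6 usable-capacity and cost estimate for the drive purchase above.
# Note: drives are marketed in decimal TB (10**12 bytes), while the OS
# reports usable space in binary TiB (2**40 bytes).

DRIVES = 14            # 1TB Hitachi Ultrastar drives
DRIVE_TB = 1.0         # marketed capacity per drive, in decimal TB
PRICE_PER_DRIVE = 50   # USD, from the eBay lot purchase
RAID6_PARITY = 2       # RAID6 gives up two drives' worth of capacity

raw_tb = DRIVES * DRIVE_TB
data_tb = (DRIVES - RAID6_PARITY) * DRIVE_TB        # 12.0 TB of data capacity
data_tib = data_tb * 10**12 / 2**40                 # ~10.9 TiB as seen by the OS
cost = DRIVES * PRICE_PER_DRIVE

print(f"Raw capacity:       {raw_tb:.1f} TB")
print(f"RAID6 capacity:     {data_tb:.1f} TB (~{data_tib:.1f} TiB)")
print(f"Drive cost:         ${cost}")
print(f"Cost per usable TB: ${cost / data_tb:.0f}")
```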

Controller


The LSI MegaRAID 84016E is, albeit last generation, a SAS/SATA II workhorse of a controller card.  It supports up to 16 drives and RAID levels 0, 1, 5, 6, 10, 50, and 60 at 3Gb/s per port, takes a battery backup module, offers online capacity expansion, and it's dirt cheap on eBay and nearly always available.  For my use case, it compared favorably against going the local SSD route, or using the LSI 9260-4i, which costs 4X as much.  If you're copying the build, be sure to pick up four Mini SAS (SFF-8087) Male to SATA 7-pin Female cables so that you can plug your SATA drives into the LSI 84016E.

Storage Array: Everything Else

This section could almost be called "everything else," because by now we've spent about 31% of the budget and many of the remaining decisions are mostly inconsequential.  Still, I opted for a Rosewill RSV-L4500, because I knew it would fit all 14 drives (and it does, and the design isn't bad at all – easy enough for me to work in).  I did add a spare SSD as an OS drive and Level-2 cache for PrimoCache (though the SSD was scavenged from another box).  For CPU, memory, and power supply, I went with the ASRock 970 Extreme3 ($65), an AMD FX-8320 8-core CPU (because it was on sale for about $105), 32GB of the lowest-cost DDR3-1600 ECC RAM I could find, and a Corsair HX750 750W power supply to drive all of these disks.  For network interfaces, I had a spare Intel 2-port NIC lying around that I had picked up in a lot sale a couple of years ago.
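For what it's worth, the 750W unit is mostly about having headroom for spin-up with 14 spindles attached.  Here's a rough sizing sketch – the per-drive and platform wattage figures are typical numbers I'm assuming, not measurements from this build:

```python
# Rough PSU sizing for a 14-drive storage box.
# Assumptions: ~25W peak per 3.5" 7200RPM drive at spin-up (~8W once idle),
# and ~200W budget for CPU, motherboard, RAM, and the RAID controller.

DRIVES = 14
SPINUP_W_PER_DRIVE = 25     # assumed worst-case draw during spin-up
IDLE_W_PER_DRIVE = 8        # assumed steady-state draw
PLATFORM_W = 200            # assumed CPU + board + RAM + HBA budget

peak_w = DRIVES * SPINUP_W_PER_DRIVE + PLATFORM_W
steady_w = DRIVES * IDLE_W_PER_DRIVE + PLATFORM_W

print(f"Worst-case power-on draw: ~{peak_w}W")    # ~550W without staggered spin-up
print(f"Typical steady-state:     ~{steady_w}W")  # ~312W
```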

Making the Storage Array Useful


There are a number of ways you can go with this.  Nexenta, Microsoft Storage Spaces, OpenIndiana, or FreeNAS to name a few.  But I really wanted to take a look at StarWind's iSCSI SAN software combined with PrimoCache.  Short version… you install Windows 2008 R2/2012 R2, configure the LSI controller software (MegaRAID), add StarWind's iSCSI SAN software and carve out some storage to expose to ESXi, then add PrimoCache and configure it to use as much RAM as you can spare for the Level-1 cache (26GB) and as much SSD space as you can for the Level-2 cache (120GB).  In my usage scenario, this seems to work pretty well – keeping the disks from bogging down under I/O.

vSphere ESXi Cluster

With the remaining budget, I still needed to build two vSphere ESXi boxes.  As I started looking, I found it really challenging to come up with a lower-cost (and still decent) build than the ASRock 970 Extreme3, an AMD FX-8320 3.6GHz 8-core processor, 32GB of RAM, an ATI Rage XL 8MB video card, a Logisys PS550E12BK power supply, a spare Intel Pro/1000, and a 16GB USB stick.  So I built two more machines – installing vSphere ESXi 5.1 on the USB sticks, mounting the iSCSI volumes that I had exposed from StarWind – and began building out my vCenter box and templates.

Compromises to hit budget

In order to stay around $2,500, there were a few small compromises I had to make.  The first was the hard drives… I couldn't wait for the 2TB Hitachi Ultrastar drives.  While not significant for my use case, it is nonetheless noteworthy.  Secondly, I ran out of budget to put a full 32GB of RAM in both of the hosts, so I have a total of 48GB across the two nodes instead of the 64GB I had hoped for.  Finally, for the hosts, I bought the lowest-cost cases I could find – $25 each.  Aside from scavenging a few parts I had lying around (an SSD drive, a few Intel Pro/1000 NICs), I managed to come in on budget.

Bottom Line

The project met all of my goals – a home lab, multi-node ESXi cluster with a dedicated iSCSI storage array that resembles a production environment – all on a budget of around $2,500.  I'm able to vMotion my VMs around, DRS is functioning, and Veeam Backup & Replication is working.  Better still, I can tear down and rebuild the environment pretty quickly now.  I didn't really run into any show-stoppers per se, or real problems with the build.  If there's interest, I'll post some additional information about the lab in the future.  A big thanks to Don over at The Home Server Blog for his work on building a homemade SAN on the cheap, particularly for validating that you can actually buy decent drives in large lots on eBay at a discount, as well as for the motherboard recommendation, which was critical in hitting the budget.

The Enterprise Storage Market in 2014 (storage for the rest of us)

Small breakthroughs applied in new ways can re-shape the world.  What was true for the Allied powers in 1942 – by then monitoring coded Japanese transmissions in the Pacific theater of World War II – is just as relevant in the Enterprise storage business today.  In both cases, more than a decade of incremental improvement and investment went into incubating technologies with the potential to disrupt the status quo.  In the case of the Allied powers, the outcome changed the world.  In the case of the Enterprise storage business, the potential exists to disrupt the conventional Enterprise SAN business, which has remained surprisingly static for the past 10 years.

In large and small enterprises alike, one thing that everyone can agree on is that storage chews through capital.  In some years it's a capacity problem.  In other years, it's a performance problem.  And occasionally, when it happens to be neither of those… it's a vendor holding you hostage with a business model reminiscent of mainframe computing.  You're often stuck between developers demanding a bit more of everything, leadership wanting cost reductions, and your operations team struggling with their dual mandate.

Storage, it seems… is a constant problem.

If you roll up all of your storage challenges, the root of the problem with Enterprise storage and the SAN market over the better part of the past decade is that storage has all too often looked like a golden hammer.  Running out of physical capacity this year?  Buy a module that gives you more capacity.  A performance issue, you say?  Just buy this other expensive module with faster disks.  Historically, it's been uncreative.  If that pretty much resembles your assessment of the storage market the last time you looked, you'd certainly be forgiven for having low expectations.

Fortunately, it looks like we're finally seeing disruption in Enterprise storage from a number of technologies and businesses, and this has the potential to change the face of the industry in ways that just a few years ago wouldn't even have been envisioned.

Where were we a few years ago?

When you break down your average SAN, by and large it consists of commodity hardware.  From standard Xeon processors and run-of-the-mill memory, to the same basic mix of enterprise HDDs… for the most part it's everything you'd expect to find.  Except perhaps for that thin layer of software where the storage vendor's intellectual property lives.  That intellectual property – more specifically, the reliability of that intellectual property – is why there are so few major storage vendors.  Dell, EMC, NetApp, IBM, and HP are the big players that own the market.  They own the market because they've produced reliable solutions, and as a result they're the ones we trust not to make a mess of things.  Beyond that, they've also acquired the strategic niche players that popped up over the years.  In a nutshell, trust and reliability are why enterprise storage costs what it does, and that's exactly why these companies are able to sustain their high margins… they simply have a wide moat around their products.  Or, at least, they used to.  Flash forward to today, and things are finally starting to change.

What’s changed?  Flash memory:

For years the only answer to storage has been more spinning magnetic disks (HDDs).  Fortunately, over the past few years, we've had some interesting things happen.  First is the same thing that caused laptops and tablets to outpace desktops: flash memory.  Why flash memory?  Because it's fast.  Over the past couple of years, SSDs and flash technology have finally been applied to the Enterprise storage market – a fact which is fundamentally disruptive.  As you know, any given SSD outperforms even the fastest HDD by orders of magnitude.  The fact that flash memory is fast and the fact that it's changing the Enterprise storage market might not be news to you, but if there was any doubt… here are some of my takeaways concerning SSD adoption trends from August 2013, based on the work done by Frank Berry and Cheryl Parker at IT Brand Pulse.

These changes range from the obvious…

  • As SSDs approach HDDs' dollar-per-GB cost, organizations are beginning to replace HDDs with SSDs
  • Quality is the most important HDD feature
  • Organizations are mixing disk types in their arrays in order to achieve the best cost for reliability, capacity, and performance required
  • More organizations have added SSDs to their storage array in 2013 than 2012

… to the disruptive…

  • Within 24 months, the percentage of servers accessing some type of SSD is expected to double
  • SSDs will comprise 3x the total percentage of storage within the next 24 months
  • IT is depending increasingly more on their storage vendors (to embed the right underlying NAND flash technology in a manner that balances cost, performance, and reliability)

In other words, we’re finally starting to see some real traction in flash-based memory replacing hard disk drives in the Enterprise tier, and that trend appears to be accelerating.

What’s changed?  The Cloud:

It seems that no IT article today is complete without a conversation about the Cloud.  While some have been quick to either dismiss or embrace it, the Cloud is already starting to pay dividends.  Perhaps the most obvious change is applying the Cloud as a new storage tier.  If you can take HDDs, mix in some sort of flash memory, and then add the Cloud… you could potentially have the best of all possible worlds.  Add in some intellectual property that abstracts away the complexity of dealing with these inherently different subsystems, and you get a mostly traditional-looking SAN that is fast, unlimited, and forever.  Or, at least, that's the promise from vendors like Nasuni and StorSimple (now a Microsoft asset), who have taken HDDs, SSDs, and the Cloud and delivered a fairly traditional SAN-like appliance.  These vendors have taken the next step, and inserted themselves between you and the Cloud.  Instead of you having to spin up nodes on AWS or Azure, vendors like Nasuni have taken that complexity out and baked it into their service.  On the surface, your ops team can now leverage the Cloud transparently.  Meanwhile, Nasuni has successfully inserted itself as a middleman in a transaction that is ongoing and forever.  Whether that's a good thing, I'll leave up for debate.  But it works quite well and solves most of your storage problems in a convenient package.

The Hidden Cloud?

The storage industry's first pass at integrating the Cloud has been interesting.  If not yet transformative in terms of Enterprise storage, it's definitely on everyone's radar.  What's arguably more interesting and relevant – and what has the potential to be truly transformative – is the trickle-down benefit that comes from dealing with Big Data.  In short, it's large companies solving their own problems in the Cloud, and enabling their customer base as a byproduct.  The Cloud today, much like the Space Race of the 1960s and the cryptography advancements of the 1940s, is a transformative pressure with the potential to reshape the world.

… and the most disruptive advancements are probably coming in ways you haven’t suspected.

Data durability. 

In a similar way that the Japanese assumed their ciphers were secure in 1942, IT organizations often assume RAID to be the basic building block of Enterprise storage.  As in, it goes without saying that your storage array is going to rely on RAID6, or RAID10, or what have you.  After all, when was the last time you really gave some thought to the principles behind RAID technologies?  RAID relies on classic erasure codes (a type of algorithm) for data protection, enabling you to recover from drive failures.  But we're quickly reaching a point where disk-based RAID approaches combined with classic erasure codes, like the Reed-Solomon coding in RAID6 (which can tolerate up to 2 drive failures), simply aren't enough to deal with the real-world risk inherent in large datasets spread across many physical drives.  Unsurprisingly, this is a situation we tend to find in the Cloud.  And interestingly, one type of solution to this problem grew out of the space program.

The medium of deep space is often noisy.  There's background radiation, supernovae, solar wind, and other phenomena that conspire to damage data packets in transit, rendering them undecipherable by the time they reach their destination.  When combined with situations where latency can be measured in hours or days, re-transmission is at best an inconvenience and at worst a show-stopper.  Consider the Voyager 1 probe for a moment.  With a command round trip of about 34 hours and bandwidth on the order of 1.4 kbit/s, asking Voyager to re-transmit data can frustrate the science team, or could even put the mission at risk.  In other cases, where two-way communication isn't possible, or the window for operation is narrow – say, if you had to ask Huygens to re-transmit from the surface of Titan – re-transmission is simply a non-starter.  As a result of these and other needs, the case for erasure coding was obvious.
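As a rough sanity check on those figures, here's a minimal sketch.  The ~19 billion km distance is my assumption for Voyager 1's approximate range in early 2014, and the 1 MB re-send is purely hypothetical:

```python
# Back-of-the-envelope: light time and re-transmission cost for Voyager 1.
# Assumption: Voyager 1 was roughly 19 billion km from Earth in early 2014.

SPEED_OF_LIGHT_KM_S = 299_792
DISTANCE_KM = 19e9
LINK_BITS_PER_S = 1_400          # ~1.4 kbit/s downlink

one_way_hours = DISTANCE_KM / SPEED_OF_LIGHT_KM_S / 3600
round_trip_hours = 2 * one_way_hours
print(f"One-way light time:  {one_way_hours:.1f} hours")
print(f"Command round trip:  {round_trip_hours:.1f} hours")   # ~35 hours

# Re-sending even a modest amount of data adds up quickly at this bitrate.
retransmit_mb = 1                # hypothetical 1 MB of science data
transmit_hours = retransmit_mb * 8_000_000 / LINK_BITS_PER_S / 3600
print(f"Re-sending {retransmit_mb} MB: ~{transmit_hours:.1f} hours of transmit time")
```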

When erasure coding is applied to storage, the net of it is that you're using CPU (math) to create new storage efficiencies (by storing less).  In real-world terms, when rebuild times for 2TB disks (given a certain I/O profile) are measured in hours or days, traditional RAID schemes simply aren't able to deal with the risk that a second (or third) drive fails while waiting on the rebuild.  What's the net result?  Erasure coding prevents data loss and decreases the need for storage, as well as its supporting components (fewer drives, which translate into less maintenance, lower power requirements, etc.).  The need for data durability has spurred the implementation of new erasure codes, targeting multi-level durability requirements, which can reduce storage demands and increase efficiencies.
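To make the "storing less" point concrete, here's a minimal sketch comparing raw-storage overhead and failure tolerance for plain replication versus a generic k+m erasure code – the specific schemes listed are illustrative examples, not taken from any particular vendor:

```python
# Storage overhead vs. failure tolerance for a few protection schemes.
# A k+m erasure code splits data into k fragments plus m parity fragments,
# and can tolerate the loss of any m fragments.

def overhead(k: int, m: int) -> float:
    """Raw bytes stored per byte of user data for a k+m scheme."""
    return (k + m) / k

schemes = {
    "3-way replication":        (1, 2),   # 1 data copy + 2 extra copies
    "RAID6-style RS (6+2)":     (6, 2),
    "wide erasure code (10+4)": (10, 4),
}

for name, (k, m) in schemes.items():
    print(f"{name:26s} overhead {overhead(k, m):.2f}x, tolerates {m} failures")
```

The wide code in the last row tolerates twice as many failures as RAID6 while storing barely more raw data, which is exactly the trade that makes it attractive at Cloud scale.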

Two examples of tackling the data durability problem through the use of new erasure codes include Microsoft's Local Reconstruction Codes (LRC), which are a component of Storage Spaces in Windows Server 2012 R2, and Amplidata's Bitspread technology.

Microsoft's LRC encoding can yield 27% more IOPS given the same storage overhead as RAID6, or 11% less storage overhead given the same reconstruction I/O relative to RAID6.
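As a rough illustration of where a number like that 11% can come from, here's a minimal sketch comparing storage overhead for an LRC layout against a Reed-Solomon layout.  The LRC(12,2,2) and RS(6,3) parameters are the ones Microsoft describes in its published Azure storage work – I'm assuming them here, since this article doesn't spell them out:

```python
# Storage overhead of Local Reconstruction Codes (LRC) vs. Reed-Solomon.
# Assumption: the LRC(12, 2, 2) and RS(6, 3) parameters from Microsoft's
# published Azure storage work, used purely for illustration.

def lrc_overhead(k: int, local: int, global_: int) -> float:
    """k data fragments + `local` local parities + `global_` global parities."""
    return (k + local + global_) / k

def rs_overhead(k: int, m: int) -> float:
    return (k + m) / k

lrc = lrc_overhead(12, 2, 2)   # 16 fragments for 12 of data -> ~1.33x
rs = rs_overhead(6, 3)         # 9 fragments for 6 of data   -> 1.50x

print(f"LRC(12,2,2) overhead: {lrc:.2f}x")
print(f"RS(6,3) overhead:     {rs:.2f}x")
print(f"Storage savings:      {(1 - lrc / rs) * 100:.0f}%")   # ~11%
```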

Bitspread in action

Amplidata's approach to object-based storage is built on their Bitspread erasure coding technology, which distributes data redundantly across a large number of drives.  Amplidata claims it requires 50% to 70% less storage capacity than traditional RAID.  It works by encoding (as in, mathematically transforming) data (files, pictures, etc.) at the storage nodes, using their proprietary equations.  Based on a user-definable policy, you can lose multiple drives, or entire nodes, and Bitspread can still heal the loss.  Amplidata's intellectual property is such that it can work for small or very large Exabyte-sized data sets.  The result in a failure situation is much faster recovery, with fewer total drives and less overhead than RAID6.

The Hidden Cloud: Continued…

When Microsoft was building Azure for their internal services a few years ago, they knew they needed a scale-out, highly available storage solution.  Like the rest of us, they didn't want to pay a storage tax to one of the big storage vendors for something they clearly had the technical capability to architect in-house.  What's more, for Azure to be a viable competitor to AWS, they were driven to eliminate as much unnecessary cost from their datacenters as possible.  The obvious low-hanging fruit in this scenario is storage, in the form of Windows Azure Storage (WAS).

Stepping back, Microsoft implemented the LRC erasure coding component within Storage Spaces, enabling software-defined storage within the context of their Windows Server OS – which you can use in your own datacenter to create elastic storage pools out of just about any kind of disk, just like Microsoft does in Azure.

One of the most interesting trickle-down benefits I've seen from the Cloud comes in the form of a Windows Server 2012 R2 feature known as Storage Spaces.  The pitch for Storage Spaces boils down to this… it's a physically scalable, continuously available storage platform that's more flexible than traditional NAS/file-sharing, offers performance similar to a traditional SAN, and does so at a commodity-like price point.  In other words, Storage Spaces is Cloud storage for the Enterprise without the storage tax.

The obvious question becomes, "So how is Storage Spaces, as an implementation, any different from the intellectual property of the traditional SAN vendors?  Isn't Microsoft just swapping out the vendor's intellectual property for Microsoft's Storage Spaces technology?"  Yes, that's exactly what's happening here.  The difference is that you're now just paying for the OS license plus commodity hardware, instead of paying for the privilege of buying hardware from a single storage vendor as well as their intellectual property.  In the process, you're effectively eliminating much of the storage tax premium.  You still need to buy storage enclosures, but instead of vendor-locked-in arrays, they're JBOD enclosures, like one of the many offered by DataON Storage, as well as other Storage Spaces-certified enclosures.


The net result is dramatically lower hardware costs, which Microsoft calculates (based on their Azure expenses) to be on the order of 50% lower.  As another data point, CCO was able to avoid purchasing a new $250,000 SAN and instead acquired a $50,000 JBOD/Storage Spaces solution.  Now granted, that's a Microsoft-provided case study, so your mileage may vary.  But the promise is to dramatically cut storage costs.  In a sense, Storage Spaces resembles a roll-your-own SAN type of approach, where you build out your Storage Spaces-based solution using your existing Microsoft infrastructure skill sets, to deliver a scale-out, continuously available storage platform with auto-tiering that can service your VMs, databases, and file shares.  Also keep in mind that Storage Spaces isn't limited to Hyper-V VMs, as it can export NFS mount points, which your ESXi, Xen, etc. hosts can use.

The Real World:

What attributes are desirable in an Enterprise storage solution today?

  • Never Fails.  Never Crashes. Never have to worry.
  • Intelligently moves data around to the most appropriate container (RAM, Flash-memory, Disk)
  • Reduces the need for storage silos that crop up in enterprises (auto-tiers based on demand)
  • Reduces the volume of hardware by applying new erasure coding implementations
  • Inherits the elastic and unlimited properties of the Cloud (while masking the undesirable aspects)
  • Requires less labor for management
  • Provides automated disaster recovery capabilities

As IT decision makers, we want something that eliminates the storage tax, drives competition among the big players, and most importantly, is reliable.  While flash memory, the Cloud, and new technologies are all driving the evolution of Enterprise storage, it's the trickle-down benefits that come in the form of emerging technologies that are often the most relevant.

While the U.S. was monitoring coded Japanese transmissions in the Pacific, they picked up on a target known as "objective AF."  Commander Joseph J. Rochefort and his team sent a message via secure undersea cable, instructing the U.S. base at Midway to radio an uncoded message stating that their water purification system had broken down and that they were in need of fresh water.  Shortly after planting this disinformation, the U.S. team received and deciphered a Japanese coded message which read, "AF is short on water."  The Japanese, still relying on a broken cipher, not only revealed that AF was Midway, but went on to transmit the entire battle plan along with the planned attack dates.  With the strategy exposed, U.S. Admiral Nimitz entered the battle with a complete picture of Japanese strength.  The outcome was a clear victory for the United States, and more importantly, the Battle of Midway marked the turning point in the Pacific.

Outcomes can turn on a dime.  The Allied powers' code-breaking capabilities in both the European and Pacific theaters are credited with playing pivotal roles in the outcome of World War II.  In business, as in war, breakthrough and disruptive technologies are often credited with changing the face of the world.  Such breakthroughs are often hard-fought, the result of significant investments of time and resources with unknown, and oftentimes surprising, outcomes.  Enterprise storage pales when viewed in comparison with the need to protect lives (perhaps the ultimate incubator).  But nevertheless, the Cloud is incubating new technologies and approaches to problems previously thought of as solved – like data durability – and has the potential to fundamentally change the storage landscape.

With this article, I tried to step back, look at the industry at a high level, and see where the opportunities for disruption are, where we're headed, and the kinds of challenges that a sysadmin taking a fresh look at Enterprise storage might face today.  As part of my options analysis, I dug a bit deeper into Nasuni and Microsoft Storage Spaces, and compiled that information, along with some "getting started" pricing, in the 4,800+ word Enterprise Storage Guide.  If you sign up for my newsletter here, you'll get the Enterprise Storage Guide as a free download (PDF format) in the welcome email.  No spam, ever.  I'll just send you the occasional newsletter with content similar to this article, or the guide.  And of course, you can unsubscribe at any time.
