Wednesday, August 24, 2011

vSphere 5: virtual networking revisited

One of the less touted features of vSphere 5 is the set of improvements in virtual networking. Visibility into virtual networking has traditionally been an enigma in virtual environments. This improved significantly with the Cisco partnership and the introduction of the distributed and managed switch options in vSphere 4.

With vSphere 5, VMware has started to support standard discovery protocols to allow more interaction between physical and virtual switching environments. In addition, it is now possible to 'tag' and prioritize traffic in the virtual environment through QoS support.

VMware did not stop there in beefing up the transparency of the networking stack. You can enable NetFlow on a distributed switch and send the flow data to a collector. This provides the opportunity to understand network traffic flow and bandwidth utilization between VMs located on the same host, between VMs on different hosts, and between the VMs and the physical environment.

Why is this significant? Many customers are now dealing with internal multi-tenancy issues in which virtual clusters are often the demarcation points between business units that are reluctant to 'share' resources. The visibility and prioritization allow the IT team to make policy-based decisions to logically separate the environment and then demonstrate that separation through granular reporting all the way down to traffic flow. This allows them to collapse clusters of 'parked' resources that could be better utilized.

This is the same problem that cloud providers often struggle with. It is difficult to provide visibility into traffic flow down to the VM level. This in turn leads to issues in deriving SLAs that can be measured in real terms. Interestingly, these features and the pain points they address are not well represented in the promotional material from VMware. I find this unusual, as it clearly distinguishes the vSphere 5 platform from its competitors.

If you have not looked at these features yet, it is a good idea to give them a thorough review. As your environment scales, these technologies become increasingly important in ensuring you have end-to-end visibility and manageability of your virtual environment.


Monday, August 8, 2011

VDI and storage; can't find one without the other

During a recent design engagement building a scalable VDI environment, storage quickly became a core consideration. Those of you who have designed and operated large VDI environments know that storage I/O can easily cause performance issues if not properly planned for. There were a few interesting things we learned as we worked with different SAN vendors to ensure their solution addressed some of the unique characteristics of our VDI workload.

What is unusual about virtual desktop environments is that they present two very different disk I/O conditions: operational I/O and burst I/O. Burst I/O is more common in VDI environments because operational requirements necessitate large-scale reboots of desktop operating systems, which are not typical in virtual server environments. Operational I/O can also be problematic if activities such as virus scanning are synchronized to run at the same time rather than randomized to reduce the performance hit on the VMs.
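As a rough illustration of randomizing rather than synchronizing scan times, here is a minimal Python sketch. The function name, the desktop names and the four-hour window are all assumptions made for the example; real antivirus products have their own scheduling settings, and this is not how any particular product does it. The idea is simply to give each desktop a stable offset within a maintenance window so the scans do not all start at once.

```python
import hashlib


def scan_offset_minutes(vm_name: str, window_minutes: int = 240) -> int:
    """Return a stable start offset (in minutes) for a desktop's scan.

    Hypothetical helper: hashing the VM name gives each desktop the same
    offset on every run while spreading the fleet across the window.
    """
    digest = hashlib.sha256(vm_name.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then fold it
    # into the window length.
    return int.from_bytes(digest[:8], "big") % window_minutes


if __name__ == "__main__":
    # Desktop names are invented for the example.
    for name in ("desktop-001", "desktop-002", "desktop-003"):
        print(name, "starts", scan_offset_minutes(name), "minutes into the window")
```

A hash is used instead of a random number so that the schedule is repeatable without storing anything per desktop; a stored random offset would work just as well.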

Some storage vendors have a very utilitarian view of storage services; they do not view virtual desktop workloads as any different from other virtual workloads. The limitation with SAN vendors who do not differentiate between server and desktop virtualization environments is that, in order to guarantee good throughput, you may have to consider their enterprise-class storage systems. This can be a tough sell when the workload is virtual desktop operating systems.

Other storage vendors provide mid-tier solutions and add Solid State Drives (SSDs) to deal with burst I/O. While better, this still requires you to adjust your design so that high I/O requirements are segregated onto volumes made up of SSDs. This leads to a very static design in which you may or may not make good use of the high-performance drives.

Most recently we have seen storage vendors start to build mid-tier storage systems that have some of the features of enterprise-class systems, such as Dynamic Tiering. Dynamic Tiering is the ability to move hot data (data that is in demand) to high-performance drives so that the SAN delivers great performance. This can typically be done on the fly or scheduled to happen periodically during the day. These solutions are ideal for virtual desktop environments as they do not carry the premium of enterprise-class storage systems but still deliver the features. EMC has clearly targeted the VNX line to provide features that make it ideally suited for virtual workloads. Of course, companies such as NetApp have been using Performance Acceleration Module (PAM) cards for years to deal with burst I/O. Whichever solution you select, here are a few general considerations for putting together your design.

1. Each SAN vendor has very different numbers when estimating I/Os for virtual server and virtual desktop workloads. It is best to have your own reference numbers based on internal testing. Use these to make sure the estimates provided meet your requirements.

2. Burst I/O and operational I/O are treated distinctly by most storage vendors. For example, if your numbers estimate that your environment may generate 15,000 burst I/Os and requires 4 TB of storage, the vendor may suggest 6 SSDs (6 x 2,500 I/Os each = 15,000 burst I/Os, excluding RAID considerations) plus SAS drives to meet your storage and operational requirements.

3. Ensure that your virtual desktop design incorporates the SAN environment. A good design should provide consistent performance over the lifetime of the solution (typically 3 years). This is not possible if you build a great VDI design that does not set specific requirements for storage. While your VDI environment may run great during the first year, you may see high SAN utilization lead to problems over time.

4. Separate your expected read and write I/Os. Multiply the number of writes by 4 to allow for the I/O penalty on writes. For example, if you expect 2,000 reads and 2,000 writes, multiply the writes by 4 for a total of 10,000 expected I/Os (2,000 read I/Os + 8,000 write I/Os).
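To make the arithmetic in points 2 and 4 concrete, here is a minimal Python sketch. The function names are my own, and the 2,500 I/Os per SSD and the write penalty of 4 are simply the example figures used above. Treat them as assumptions and substitute the numbers from your own testing and your vendor's RAID layout.

```python
import math


def ssd_count_for_burst(burst_iops: int, iops_per_ssd: int = 2500) -> int:
    """SSDs needed to absorb the burst, excluding RAID considerations.

    iops_per_ssd defaults to the example figure from point 2 (an assumption).
    """
    return math.ceil(burst_iops / iops_per_ssd)


def expected_backend_iops(reads: int, writes: int, write_penalty: int = 4) -> int:
    """Total expected I/Os once the write penalty from point 4 is applied."""
    return reads + writes * write_penalty


if __name__ == "__main__":
    print(ssd_count_for_burst(15_000))         # 6
    print(expected_backend_iops(2000, 2000))   # 10000
```

Rounding up in the drive count matters: a burst of 16,000 I/Os at the same per-drive figure would need 7 drives, not 6.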

Thursday, August 4, 2011

Cloud Design; the return of Hive Architecture

I have been interested in the notion of what I like to call “Hive Architecture” for a few years. It means designing software services that are the sum of their parts using virtual machines. As with a beehive, each member has a simple function, but collectively they form a complex system necessary for survival. Designing IT infrastructure in a similar manner allows you to be more Cloud friendly.

The design concept is not new: clustering has been around for years, and we have seen early versions of this based on virtualization with projects such as LiquidVM (BEA/Oracle) and JeOS (the Just Enough OS initiative).

At one point VMware was a strong proponent of the concept. I remember sitting in a VMware keynote session in which the speaker explained that the time of multi-function, general-purpose OSes was over. Well, that didn’t happen, and with the strong adoption of Windows Server 2008 R2 it seems that the general-purpose OS will be around for a while longer.

There are lessons, however, in the design principles, especially when we look at how VMware is enabling Cloud adoption. With vCloud Director, VMware is very much betting on organizations adopting Clouds that look and feel like their internal virtualization infrastructure. So how do the concept of Hive Architecture and VMware’s vision come together?

If we move forward to today's virtualization environments, the focus has been on automation to simplify management of the virtualization stack, on reducing the operational overhead of managing a cluster of VMs running traditional operating systems, and on leveraging virtualization much more heavily in the supporting network and storage infrastructure. In addition, the vCloud and vShield product lines allow virtual infrastructure to be stretched securely between separate locations. How then does Hive Architecture add any value to what today's operational environments look like?

The concept is rather simple: when we implement multi-tiered applications that have database, web, load balancing and network considerations in our virtual infrastructure, we should keep in mind that the application should be deployed as if it is a single ‘hive’, avoiding the sharing of services between non-related VMs even if it goes against our grain. Now this may not be possible due to licensing costs; consider the conundrum of running dedicated SQL services for each app, with the operational overhead that entails, vs. the simplicity of collecting databases on a centralized service.

So why consider it? Well, take the example of a company that has an internal datacenter running VMware, has stretched this to take advantage of virtual infrastructure at its cloud provider, and is now considering what makes sense to put at arm's length. Here Hive Architecture makes sense. The entire business application is made up of a logically collected bunch of VMs (the ‘hive’, if you will). The IT organization does not have to go through a large decoupling of shared services to take advantage of the cloud opportunity.

The idea occurred to me several years ago when I worked with an organization that was isolating and virtualizing services by business application for QA. It was a significant challenge to map out all the involved software, servers and supporting infrastructure. It also struck me when watching a presentation by Intel Corporation about Cloud adoption in which they detailed the amount of organization required to enable them to make use of Cloud computing. It was no small effort. While Hive Architecture is not the be-all and end-all, it is an important consideration in simplifying the move to the Cloud.