If everyone is thinking the same, someone isn't thinking

Lori MacVittie


Your Call is Important to Us at CloudCo: Please Press 1 for Product, 2 for OS, 3 for Hypervisor, or 4 for Management Troubles

When there’s a problem with a virtual network appliance installed in “the cloud”, who do you call first?

An interesting thing happened on the way to troubleshoot a problem with a cloud-deployed application – no one wanted to take up the mantle of front-line support. With all the moving parts involved, it’s easy to see why. The problem could be in any of several layers of the deployment: operating system, web server, hypervisor or the nebulous “cloud” itself. With no way to know where the problem is – the cloud has limited visibility, after all – where do you start?

Consider a deployment into ESX where the guest OS (hosting a load balancing solution) isn’t keeping accurate time within the VM. Time synchronization is a Very Important aspect of high-availability architectures. Synchronization of time across redundant pairs of load balancers (and really any infrastructure configured in HA mode) is necessary to ensure that a failover event isn’t triggered by a difference caused simply by an error in timekeeping. If a pair of HA devices is configured to fail over from one to the other based on a failure to communicate after X seconds, and their clocks are off by almost X seconds…well, you can probably guess that this can result in … a failover event. Failover events in traditional HA architectures are disruptive; the entire device (virtual or physical) basically dumps in favor of the backup, causing a loss of connectivity and requiring a quick re-convergence at the network layer. A time discrepancy can also wreak havoc with the configuration synchronization processes while the two instances flip back and forth.
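To make the failure mode concrete, here is a minimal sketch of how clock skew alone can trip a failover timer. It assumes a simplified model that is not taken from any particular product: each heartbeat carries the sender's timestamp, and the standby judges the heartbeat's age against its own local clock. The names `Heartbeat`, `should_fail_over` and `simulate` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    # Timestamp taken from the SENDER's clock, in seconds (assumed model).
    sent_at: float


def apparent_age(hb: Heartbeat, receiver_now: float) -> float:
    """Age of the heartbeat as judged by the receiver's own clock."""
    return receiver_now - hb.sent_at


def should_fail_over(hb: Heartbeat, receiver_now: float, timeout_s: float) -> bool:
    """The standby takes over if the heartbeat looks older than the timeout."""
    return apparent_age(hb, receiver_now) > timeout_s


def simulate(true_time: float, network_delay_s: float,
             receiver_skew_s: float, timeout_s: float) -> bool:
    """Return True if the standby would (wrongly or rightly) fail over.

    The sender's clock is assumed correct; the receiver's clock runs
    ahead of true time by receiver_skew_s seconds.
    """
    hb = Heartbeat(sent_at=true_time)
    receiver_now = true_time + network_delay_s + receiver_skew_s
    return should_fail_over(hb, receiver_now, timeout_s)


if __name__ == "__main__":
    # Healthy peer, clocks in sync: no failover.
    print(simulate(1000.0, network_delay_s=0.2, receiver_skew_s=0.0, timeout_s=3.0))
    # Same healthy peer, but the standby's clock is 2.9 s ahead: failover.
    print(simulate(1000.0, network_delay_s=0.2, receiver_skew_s=2.9, timeout_s=3.0))
```

In both runs the active device is perfectly healthy and the heartbeat arrives in 200 ms. The only thing that changes is the standby's clock, and that alone is enough to push the apparent age past the X-second threshold.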

So where was the time discrepancy coming from? How do you track that down and, with a lack of visibility and ultimately control of the lower layers of the cloud “stack”, who do you call to help? The OS vendor? The infrastructure vendor? The cloud computing provider? Your mom?

We’ve all experienced frustrating support calls – not just in technology but in other areas, too, such as banking and insurance – in which the pat answer is “not my department” and “please hold while I transfer you to yet another person who will disavow responsibility to help you.” The time, and in business situations, money, spent trying to troubleshoot such an issue can be a definite downer in the face of what’s purportedly an effortless, black-box deployment. This is why the black-box mentality marketed by some cloud computing providers is a ridiculous “benefit”: it assumes the abrogation of accountability on the part of IT, something that is certainly not in line with reality. Making it more difficult for those responsible within IT to troubleshoot, and leaving them with no real recourse for technical support, makes cloud computing far less appealing than marketing would have you believe with its rainbow-and-unicorn picture of how great black boxes really are.

The bottom line is that the longer it takes to troubleshoot, the more it costs. The benefits of increased responsiveness of “IT” are lost when it takes days to figure out where an issue might be. Black boxes are great in airplanes and yes, airplanes fly in clouds, but that doesn’t mean that black boxes and clouds go together. There are myriad odd little issues, like time synchronization across infrastructure components and even applications, that must be considered as we attempt to move more infrastructure into public cloud computing environments.

So choose your provider wisely, with careful attention paid to support, especially with respect to escalation and resolution procedures. You’ll need the full partnership of your provider to ferret out issues that may crop up and only a communicative, partnership-oriented provider should be chosen to ensure ultimate success. Also consider more carefully which applications you may be moving to “the cloud.” Those with complex supporting infrastructure may simply not be a good fit based on the difficulties inherent not only in managing them and their topological dependencies but also their potentially more demanding troubleshooting needs. Rackspace put it well recently when they stated, “Cloud is for everyone, not everything.”

That’s because it simply isn’t that easy to move an architecture into an environment in which you have very little or no control, let alone visibility. This is ultimately why hybrid or private cloud computing will stay dominant as long as such issues continue to exist.


More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.