Lori MacVittie


Microservices, The Scale Cube and Load Balancing

This decompositional approach to application architecture encourages developers and operations to re-evaluate scaling strategies

Microservice architectures are the new hotness, even though they aren't really all that different (in principle) from the paradigm described by SOA (which is dead, or not dead, depending on whom you ask).

One of the things this decompositional approach to application architecture does is encourage developers and operations (some might even say DevOps) to re-evaluate scaling strategies. In particular, it puts forward the notion that an application should be built to scale and that infrastructure should then assist where necessary.

It was just this notion that led me to a discussion on a particularly useful explanation of scaling strategies called "the scale cube" which is introduced and explained further in The Art of Scalability. Go ahead and open it up and bookmark it; it's a good read and I highly recommend it.

The aforementioned discussion provides an overview of the three-axis perspective on scale: x, y and z. Reading the descriptions, it became fairly apparent to me (who lives with one foot in the network and the other in the app) that layer 7 load balancing is a way to implement these strategies in some cases and to augment them in others.

X-axis scaling
X-axis scaling is essentially a typical horizontal (scale out) scaling pattern implemented using a load balancer. Simple and effective for many types of applications, this pattern has been the age-old "go to" for quickly scaling out apps that were perhaps not built to scale in the first place. Monolithic applications are almost always scaled out (x-axis) because they were not developed with other scalability models in mind, and their reliance on state (via cookies or server-side sessions stored in memory, not databases) makes other scalability models nearly impossible to deploy successfully.

X-axis scalability can easily be implemented using layer 4 (TCP) load balancing if state is not important, but more often than not requires layer 7 (HTTP) load balancing due to the need to examine headers or other variables to ensure persistence to the right application instance (think sticky sessions).
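As a rough illustration of the difference, the sketch below shows round-robin distribution across identical instances, with persistence added when a session cookie is present. It is a minimal sketch, not how any particular load balancer works: the class name, the `SESSIONID` cookie name, and the instance names are all assumptions made for the example.

```python
import itertools


def parse_session_cookie(cookie_header, name="SESSIONID"):
    """Return the value of the (hypothetical) session cookie, or None."""
    for part in cookie_header.split(";"):
        key, _, value = part.strip().partition("=")
        if key == name and value:
            return value
    return None


class StickyRoundRobin:
    """Round robin across identical instances, with cookie-based persistence."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(list(instances))
        self._sessions = {}  # session id -> instance chosen on first contact

    def pick(self, headers):
        session_id = parse_session_cookie(headers.get("Cookie", ""))
        if session_id is None:
            # No state to preserve: behaves like plain round robin,
            # which is all a layer 4 balancer would need to do.
            return next(self._cycle)
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._cycle)
        # Requires reading an HTTP header, hence layer 7.
        return self._sessions[session_id]
```

Requests without the cookie simply rotate through the instances, while every request carrying the same session id lands on the instance that first served it.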

How does this apply to microservices? Consider that instead of apps, each microservice is scaled out (along the x-axis) using an app proxy or application delivery controller (ADC). This allows operations to tune each app proxy or ADC based on the specific purpose of the microservice, improving performance by applying image optimization, compression or even caching where appropriate to the specific service. In a monolithic application, an ADC will be a better choice for this scalability model because of its ability to interpret requests and optimize responses with the benefit of context. In cloud-scale microservice architectures, an app proxy may be the better option when considering cost per service and the relatively simple delivery needs of a given service.

Y-axis scaling
Y-axis scaling is essentially a layer 7-based sharding pattern when applied to infrastructure. Y-axis scaling relies on the decomposition of applications into services. It is highly appropriate for SOA or RESTful APIs that group like functionality into a service. Examples include verb-based decomposition focused on "login" or "checkout," or noun-based decomposition with an emphasis on "customer" or "partner." The key is that there is some mechanism within each request, either in the URI or in the HTTP headers, that enables the app proxy or ADC to determine to which service the request needs to be forwarded.


Sharding can be implemented in the app itself, using a routing object to dissect the URI, or that functionality can be offloaded to the network and implemented using the data path programmability associated with an app proxy or ADC. This programmability allows operators or developers to implement targeted logic that dissects the URI and determines to which service the request should be directed. This pattern can be (and often is) implemented along with an X-axis scaling strategy for the specific service.
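The routing logic described here might look something like the following. This is a minimal sketch under stated assumptions: the `ServiceRouter` class, the choice of the first URI path segment as the service name, and the pool and instance names are all hypothetical. It picks a service from the URI (Y-axis) and then rotates across that service's clones (X-axis).

```python
from urllib.parse import urlsplit


class ServiceRouter:
    """Pick a service from the URI (Y-axis), then an instance in its pool (X-axis)."""

    def __init__(self, pools, default_pool):
        self._pools = {name: list(members) for name, members in pools.items()}
        self._default = list(default_pool)
        self._next = {}  # service name -> index of the next instance to use

    def service_for(self, uri):
        """Return the service named by the first path segment, or None."""
        segments = [s for s in urlsplit(uri).path.split("/") if s]
        if segments and segments[0] in self._pools:
            return segments[0]
        return None

    def route(self, uri):
        service = self.service_for(uri)
        pool = self._pools[service] if service is not None else self._default
        index = self._next.get(service, 0)
        self._next[service] = (index + 1) % len(pool)  # round robin per pool
        return pool[index]


router = ServiceRouter(
    {"login": ["login-1", "login-2"], "checkout": ["checkout-1"]},
    default_pool=["web-1"],
)
```

With this configuration, `/login` and `/login/reset` alternate between the two login instances, `/checkout/pay` goes to the checkout pool, and anything unrecognized falls through to the default pool.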

The combination of Y- and X-axis scaling is increasingly a good choice for bifurcated networks, which split "core" networking from "app" networking. The core network usually provides a highly capable load balancing service managed by the network team, while the app network includes app proxies or virtual ADCs that are managed by operations or developers.

While this pattern can be implemented on monolithic applications, particularly monolithic web applications that rely on URI-based interactions, care must be taken with respect to state. That is, one cannot simply route to service B for "checkout" when it depends on session-level data that may already be stored in service A or C. Shared-nothing application architectures, in which each instance keeps its session state to itself, do not lend themselves well to sharding strategies based on application function or content type. Rather, such applications should be scaled using a more traditional approach. Shared-session application architectures, however, are very well suited to this type of scalability strategy because the application state is shared across instances, and all services will have access to the necessary data.
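The state problem can be shown in a few lines. In the sketch below, which is illustrative only, `SharedSessionStore` is an in-memory stand-in for whatever external session store an application might use, and the `Service` class and its method names are hypothetical.

```python
class SharedSessionStore:
    """In-memory stand-in for an external session store (an assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.setdefault(session_id, {})


class Service:
    def __init__(self, name, store=None):
        self.name = name
        # Without a shared store, each service keeps sessions in its own memory.
        self.store = store if store is not None else SharedSessionStore()

    def add_to_cart(self, session_id, item):
        self.store.get(session_id).setdefault("cart", []).append(item)

    def checkout(self, session_id):
        return list(self.store.get(session_id).get("cart", []))


# Shared session: "checkout" can see what "cart" stored.
shared = SharedSessionStore()
cart, checkout = Service("cart", shared), Service("checkout", shared)
cart.add_to_cart("s1", "book")
shared_result = checkout.checkout("s1")

# No sharing: the same request routed to another service finds nothing.
lonely_cart, lonely_checkout = Service("cart"), Service("checkout")
lonely_cart.add_to_cart("s1", "book")
isolated_result = lonely_checkout.checkout("s1")
```

Here `shared_result` contains the item added through the cart service, while `isolated_result` is empty, which is exactly the failure that routing by function would cause when state is held per instance.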

Z-axis scaling
Z-axis scaling is a cross between X and Y scaling strategies, using a data sharding-based approach. This strategy is commonly used to shard databases, but can also be used to scale monolithic applications based on some user characteristic.

Z-axis scaling is like X-axis scaling in that it relies on cloning of application instances. The difference is that some other component, like an app proxy or ADC, is responsible for distributing requests based on some other information, like the data being requested or the user identity. This works as long as the data is accessible to the app proxy or the ADC (increasingly, intermediaries have the ability to reach out and query databases or directories to obtain additional information).


When using Z-axis scaling each server runs an identical copy of the code. In this respect, it's similar to X-axis scaling. The big difference is that each server is responsible for only a subset of the data. Some component of the system is responsible for routing each request to the appropriate server. One commonly used routing criteria is an attribute of the request such as the primary key of the entity being accessed. Another common routing criteria is the customer type. For example, an application might provide paying customers with a higher SLA than free customers by routing their requests to a different set of servers with more capacity.
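Routing on a primary key can be sketched as a stable hash of the key, taken modulo the number of servers. This is one simple way to do it rather than the method the scale cube prescribes; the function names and server names are hypothetical. A cryptographic digest is used only because Python's built-in `hash()` for strings varies between processes, which would break routing across restarts.

```python
import hashlib


def shard_for(primary_key, shard_count):
    """Map a key to a shard index that is stable across processes."""
    digest = hashlib.sha256(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def route_by_key(primary_key, servers):
    """Every server runs the same code; the key decides which one owns the data."""
    return servers[shard_for(primary_key, len(servers))]


servers = ["shard-0", "shard-1", "shard-2", "shard-3"]
owner = route_by_key("customer-42", servers)
```

One caveat with plain modulo hashing is that changing the number of servers remaps most keys, so a scheme such as consistent hashing is often preferred when shards are added or removed frequently.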

This pattern is also useful for premium access policies, where certain users or customers are afforded higher performance guarantees. These instances may be further augmented with additional services or scaled out faster to improve performance. Only certain customers are allowed to access these "gold" instances, and such determinations might be made based on API key, cookie value, or membership in a specific group as determined by a database or directory lookup.
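A "gold" routing decision of this kind might be sketched as follows. Everything named here is an assumption for illustration: the `X-API-Key` header, the pool names, and the dictionary standing in for a database or directory lookup.

```python
STANDARD_POOL = ["std-1", "std-2"]
GOLD_POOL = ["gold-1", "gold-2", "gold-3"]  # hypothetical higher-capacity instances

# Stand-in for a database or directory that the intermediary can query.
DIRECTORY = {"key-alice": "gold", "key-bob": "free"}


def lookup_tier(api_key):
    """Return the customer tier for an API key, or None if unknown."""
    return DIRECTORY.get(api_key)


def pool_for(headers, lookup=lookup_tier):
    """Choose the server pool based on who is asking, not what is asked."""
    api_key = headers.get("X-API-Key")
    tier = lookup(api_key) if api_key else None
    # Unknown or missing keys fall back to the standard pool.
    return GOLD_POOL if tier == "gold" else STANDARD_POOL
```

Once the pool is chosen, requests within it can still be distributed with an ordinary X-axis strategy such as round robin.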

The point is that the scaling strategies associated with application architecture can be duplicated and/or augmented by the use of an app proxy or ADC. It is almost always the case that such an intermediary will be necessary to scale an application. That's because the reality is that it's just as bad to let network logic (routing) seep into business logic as it is to let business logic seep into the presentation (GUI) layer.

Keep your logics separate, and use the tools available to act on the scaling strategy best suited for your application or service.


Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.
