It's 4:30 in the afternoon. You are the CTO of a hot new startup and your killer mobile app went live at 6:00 AM that morning. Users are flocking to your app, but a couple of hours ago the calls, texts, and emails started coming in. Your servers just can't keep up. You rush to provision new servers from the Cloud provider who promised you unlimited scalability, but the messages - now angry - keep coming.
It's going to be a long night.
Hope is not a Strategy
About a year ago, I was speaking with the CIO of a mid-size company. She was betting the business on an innovative mobile strategy and had spent a lot of energy on developing a flagship mobile app with a luxurious UX. She was forecasting aggressive adoption rates, with target numbers in the hundreds of thousands in the first 90 days. Ignoring (for the moment) where these numbers were coming from, I asked her how she was going to scale to support this amount of traffic (her entire data center fit in a maintenance closet).
With a nervous laugh she replied, "We're deploying this on the Cloud. We'll just stand up a load-balancing router and provision new capacity. And then we'll just hope for the best."
Sigh.
The sad part is she is not alone. "Hope" seems to be the strategy of a lot of folks when it comes to scalability. But it doesn't have to be that way. There are a number of proven, design-time methods for ensuring your app is scalable.
Here are the top five.
1. Layer and Specialize
Don't bundle everything into your web container. Layer your back-end and split things out.
Serve up multimedia content from specialized media servers. These scale independently from your core and can be deployed geographically (to reduce latency). Best of all, there are plenty of off-the-shelf options, so you don't spend valuable time replicating this functionality in your solution.
Partition back-end services into their own web containers. Where latency isn't an issue, having a service run in its own web container allows the service to scale independently of the core.
When you have third-party services, wrap them in an asset of your own behind a service interface. These assets can then be deployed in their own containers and scaled independently of the core.
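To make that last point concrete, here's a minimal Java sketch of the wrap-and-scale idea. Everything in it is hypothetical for illustration: VendorGeoClient stands in for a vendor SDK, and GeocodingService is the interface your core would depend on.

```java
// Stand-in for a hypothetical vendor SDK class, stubbed so the sketch compiles.
class VendorGeoClient {
    double[] geocode(String address) {
        return new double[] { 0.0, 0.0 };  // pretend lookup
    }
}

// Your own service interface: the core depends on this, never on the vendor SDK.
interface GeocodingService {
    double[] lookup(String address);
}

// Adapter that wraps the third-party client behind your interface. It can be
// deployed in its own web container and scaled independently of the core.
class VendorGeocodingAdapter implements GeocodingService {
    private final VendorGeoClient client = new VendorGeoClient();

    @Override
    public double[] lookup(String address) {
        // Translate between your contract and the vendor's calls in one place.
        return client.geocode(address);
    }
}
```

If the vendor changes its API, or you outgrow it, the blast radius is one adapter rather than your whole back-end.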
2. Request Scope/100% Stateless
Sessions are meaningless with mobile apps. Native apps define the context and maintain the state. Except for persisted state, there is no reason for the back-end to manage any state. If the back-end needs state, it is passed as part of a request. The response provides all the information the native app needs to update its local state.
Any state on the back-end that spans requests and does not have a front-end representation gets persisted. Liberal use of caching will reduce the performance hit. There may be added latency, but the tradeoff for scalability is (almost always) worth it.
With your back-end stateless and your front-end managing local state, there is no reason to hope when adding capacity. Add as many servers as you need and the app will scale.
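As a sketch of what that can look like, here's a fully stateless endpoint, assuming Spring Web MVC on the back-end. The CartRequest/CartResponse types and the pricing logic are made up for the example; the point is that the request carries all the state the server needs, and the response carries everything the app needs to update its local state.

```java
import java.util.List;

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CartController {

    // The request carries all the state the server needs for this call.
    record CartRequest(String userId, List<String> itemIds) {}

    // The response carries everything the app needs to update its local state.
    record CartResponse(String userId, List<String> itemIds, double total) {}

    @PostMapping("/cart/price")
    public CartResponse price(@RequestBody CartRequest request) {
        double total = request.itemIds().size() * 9.99;  // placeholder pricing
        // No HttpSession, no server-side conversation state: any server
        // behind the load balancer can handle the next request.
        return new CartResponse(request.userId(), request.itemIds(), total);
    }
}
```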
3. Non-Blocking Requests
When HTTP is used between the native app and the back-end core, and between the various layers of the back-end, app performance can take a significant hit if parts of the application block while waiting for responses to these requests.
The answer is to use asynchronous requests with separate response handlers that are called whenever a response is received. This allows the developer to fire off a request and continue processing while the app is waiting for a response.
When a response is received, it is processed asynchronously by a separate handler. From the native app's perspective, initiating a request and processing the response are two independent events.
Check out this great tutorial on how this is done with Objective-C and Spring.
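On the Java side, here's a minimal sketch of the same idea using the JDK's built-in HttpClient (Java 11 and later); the endpoint URL is a placeholder.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class NonBlockingCall {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/api/orders"))  // placeholder endpoint
                .GET()
                .build();

        // Fire off the request and keep processing; the handler below runs
        // whenever the response actually arrives.
        CompletableFuture<Void> pending = client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenAccept(response -> System.out.println("Handled later: " + response.statusCode()));

        System.out.println("Request sent, doing other work...");
        pending.join();  // only here so this demo doesn't exit before the response arrives
    }
}
```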
4. Replicated Singletons
I know what you're thinking. "Replicated Singletons!? Really?"
Singletons have state. If that state is localized to a container (e.g., the pool of connection objects to a database), then it's perfectly okay to have multiple copies of the Singleton. As long as the state is localized, it doesn't matter what the state of the Singleton on another container happens to be.
The challenge is to design a Singleton with localized state. This isn't that hard (Factories and many other types of Singletons use localized state), but the payoff is high.
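Here's a plain-Java sketch of that connection-pool flavor of Singleton. The JDBC URL and pool size are placeholders, and a real pool would add validation, timeouts, and shutdown handling.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Each container gets its own copy of this Singleton. The state it holds
// (a pool of database connections) is local to the container, so it never
// needs to be shared or synchronized across servers.
public final class ConnectionPool {

    private static final ConnectionPool INSTANCE = new ConnectionPool(10);  // placeholder size

    private final BlockingQueue<Connection> pool;

    private ConnectionPool(int size) {
        pool = new ArrayBlockingQueue<>(size);
        try {
            for (int i = 0; i < size; i++) {
                // Placeholder JDBC URL; point this at your own database.
                pool.add(DriverManager.getConnection("jdbc:postgresql://localhost/appdb"));
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Could not initialize connection pool", e);
        }
    }

    public static ConnectionPool getInstance() {
        return INSTANCE;
    }

    public Connection borrow() throws InterruptedException {
        return pool.take();      // blocks until a connection is free
    }

    public void release(Connection connection) {
        pool.offer(connection);  // return the connection for reuse
    }
}
```

Replicate this across as many containers as you need; since none of its state has to be shared, adding servers doesn't require any coordination.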
5. Get the Most Out of What You Have
The single best way to scale your app is to never need more capacity. Books have been written on how to improve app performance, but focusing on just a handful of things can make a huge difference.
- Design Matters
Good design can prevent performance problems from crashing your app. But great design ensures there's never a problem in the first place.
- Tune Your Container
A full GC can stop your app in its tracks. Too few connections and there aren't enough to service the load; too many and you waste resources. Is your container loaded up with things you don't need?
Spend some time tuning and your app will hum.
- Know Your Environment
Are your physical servers oversubscribed with too many VMs? Are the libraries you depend on underperforming? What's your CPU utilization?
The last thing you want to do is add capacity when the capacity you already have isn't being used.
- Cache Everything (Except for What You Can't)
Servers are loaded with memory. Use it.
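Even a deliberately simple read-through cache goes a long way. The sketch below uses nothing but the JDK; a production setup would add eviction and expiry, which libraries such as Caffeine or Ehcache provide.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Compute a value at most once per key, then serve it from memory.
public class ReadThroughCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // The first call for a key hits the loader; every call after that hits memory.
        return cache.computeIfAbsent(key, loader);
    }
}

// Usage, where loadProfileFromDatabase stands in for whatever expensive call
// you want to stop repeating:
//   ReadThroughCache<String, Profile> profiles =
//       new ReadThroughCache<>(id -> loadProfileFromDatabase(id));
//   Profile p = profiles.get("42");
```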
Epilog
Sometimes hope is good enough.
It turns out my CIO friend never had a scaling problem. The app worked just fine under load. Of course, the app only ever had to handle 4,500 users.
-Mike
@mobilebizguru