Your favorite band is coming to town. You’ve never attended one of their live shows in person. You log into the online ticketing system early to be ready to click the “Buy” button as soon as tickets go on sale. You watch the countdown timer hit zero and then...nothing. The system fails and you aren’t able to get your tickets! Have you ever dealt with a frustrating scenario like this?
The issue is that many web applications are designed around a smooth, consistent flow of traffic. When a system is architected to handle a steady load, but that load suddenly spikes to 10x or 100x the normal amount, the system fails. Spiky traffic is a tough problem to solve, and accommodating it requires an innovative architectural solution.
This is the type of problem we faced with a client who utilizes a reservation system for some of their events. The system would be inundated as soon as it was announced that the reservations were open. Handling this intense volume of incoming requests would require:
- a system that could absorb a sudden spike of many times its normal load without falling over
- a user interface that stayed fast and responsive during the rush
- a solution that remained efficient and cost-effective, rather than paying for peak capacity around the clock
Usually, in a situation like this, the bottleneck becomes the underlying hardware that hosts the software the user is interacting with. You can spend a lot of money to scale horizontally (more instances) or vertically (bigger instances), but the downside of this approach is that you have to stay scaled out to handle the maximum possible traffic pattern at all times. Even if your system is only under high load for 5 minutes every 24 hours, it must be scaled high enough to handle that spike. This approach was not acceptable to us, so we looked for a different path.
Our solution? To go serverless. If the bottleneck for these systems is the hardware, why not get rid of the need to manage that hardware and its implementation details? Serverless architecture is a bit of a misnomer. There are still servers somewhere handling your hardware needs, but you no longer need to worry about them. The concept is very similar to the cloud: there are data centers full of servers all around the globe that make up the cloud, yet it is presented as an abstraction that can be used by people who do not need to understand the inner workings behind the scenes. Now that we had an idea of the type of architecture we wanted to use, we needed to choose the particular tools and technologies to get us there.
As stated above, we knew that our front-end solution needed to handle a sudden spike in traffic without batting an eye. We also wanted the user interface to be snappy and responsive. This led us to look for a client-side framework that could rely on static pages and heavy caching while putting as much of the workload as possible on the client instead of a backend server. This setup would help distribute the load and avoid a single point of failure. The solution we chose was React with Next.js. Next.js bills itself as the React Framework for Production, and after testing it out in a harsh environment, we have to agree. The framework adds structure to a React web app and supports a hybrid approach of both static and server-rendered pages. That hybrid model was the main reason we chose it, but we discovered along the way that it offers quite a few additional niceties that help in building a polished web application.
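To make the hybrid model concrete, here is a minimal sketch of a statically generated Next.js page; pages that must be rendered per request would export getServerSideProps instead. The page name and data below are placeholders, not the client's actual code.

```tsx
// pages/index.tsx -- statically generated at build time and served from the CDN cache.
// The event name and copy here are illustrative placeholders.
import type { GetStaticProps, NextPage } from 'next';

type HomeProps = { eventName: string };

const Home: NextPage<HomeProps> = ({ eventName }) => (
  <main>
    <h1>Reservations for {eventName}</h1>
    <a href="/reserve">Reserve a spot</a>
  </main>
);

// Runs once at build time, so every visitor gets a cached static page
// instead of hitting a backend server on each request.
export const getStaticProps: GetStaticProps<HomeProps> = async () => ({
  props: { eventName: 'Sample Event' },
});

export default Home;
```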
While that answered the question of which path we were going to take on the frontend, we still needed to decide on a hosting solution, a backend solution to handle server-side requests, and a datastore. In the end, we chose AWS Amplify and its connected services because we were already familiar with Amazon Web Services. It proved to be a single ecosystem with a service for each of our remaining needs.
Full list of AWS services used in the solution:
- AWS Amplify
- API Gateway
- AppSync
- CloudFront
- DynamoDB
- Lambda (and Lambda@Edge)
- S3
By linking and layering these services together, we had a fully serverless system that allowed us to worry only about the business requirements of our solution and ignore the implementation details of the infrastructure altogether. In a traditional solution, there would be a need to specify a server size or instance type for the underlying hardware. Then, when a limitation of that hardware is hit, there is a need to choose a larger size or type and migrate over to the scaled-up instances. With the services mentioned above, AWS handles those needs for you by automatically scaling to fit the resource requirements of your workload on demand. If you run into a limitation, it is an arbitrary throttle (put in place by Amazon to protect against rogue processes or bad code causing unwanted billing) that can be raised by submitting a support request to AWS. Along with scaling up as needed, these services also scale down (or even off), which can save on costs. Instead of being scaled out to max capacity at all times, you only pay for the resources you need. Let's now look at each of these services in more detail.
AWS Amplify was the heart of this solution. It both serves as a hosting solution and provides a "backend" environment that offers an easy-to-manage way of connecting multiple services together. The hosting works by connecting a git repository and specifying a branch inside that repository. When that branch receives a new push, Amplify automatically builds your app using the specified settings and deploys the static pages of the app out to its CloudFront distribution nodes. These nodes act as a CDN to deliver your static pages blazingly fast. Your server-rendered pages are served by Lambda functions that Amplify pushes out to distributed nodes on Lambda@Edge. We'll talk more about these later. The backend environment serves as a single point of reference for the following services to connect to.
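As a rough illustration of how the app and the Amplify backend environment connect, here is a sketch of wiring the CLI-generated configuration into Next.js, assuming the v5-style aws-amplify client library and the usual aws-exports file; paths and versions may differ from the real project.

```tsx
// pages/_app.tsx -- wire the generated Amplify backend config into the Next.js app.
// Assumes the Amplify CLI has generated src/aws-exports.js for this backend environment.
import type { AppProps } from 'next/app';
import { Amplify } from 'aws-amplify';
import awsExports from '../src/aws-exports';

// `ssr: true` lets server-rendered pages and API routes share the same configuration.
Amplify.configure({ ...awsExports, ssr: true });

export default function App({ Component, pageProps }: AppProps) {
  return <Component {...pageProps} />;
}
```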
API Gateway is aptly named. It serves as a fully managed gateway for API requests. It then forwards those requests to other services using configured settings, such as caching and custom headers. We used API Gateway to serve our REST API requests that did not belong to our GraphQL API. For our purposes, Gateway acted as a proxy and simply passed our requests through to Lambda functions along with their parameters, and then returned the output of the Lambdas. You can think of these requests and their accompanying Lambdas as our backend layer that handled our business logic needs. We had a function that would allow us to retrieve the current time from a trusted source and another function that we used to process new reservations.
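For a sense of how thin this proxy layer is, here is a hedged sketch of what a trusted-time Lambda behind an API Gateway proxy integration might look like; the article does not show the real function's time source or response shape, so the server clock and field name below are stand-ins.

```ts
// current-time.ts -- Lambda invoked through an API Gateway proxy integration.
// The real function pulled time from a trusted source; the server clock stands in here.
import type { APIGatewayProxyHandler } from 'aws-lambda';

export const handler: APIGatewayProxyHandler = async () => {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    // API Gateway passes this body straight back to the caller.
    body: JSON.stringify({ now: new Date().toISOString() }),
  };
};
```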
AppSync is a fully managed GraphQL API. This service makes it extremely simple to set up an autoscaling GraphQL API that can be connected to multiple data sources such as DynamoDB, Lambda, etc. We chose to use DynamoDB as our data store because of its ability to perform at scale. We used the AppSync GraphQL API to store, update, and retrieve our data. We also made use of the provided GraphQL subscriptions to listen for changing data in real-time inside our Next.js application and update the user interface instantly.
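As one possible shape of the real-time piece, the sketch below subscribes to new reservations from a Next.js hook using the v5-style Amplify API client; the onCreateReservation subscription and Reservation fields are assumptions standing in for whatever the Amplify CLI generated for the actual schema.

```ts
// hooks/useReservations.ts -- live-updating list via an AppSync GraphQL subscription.
// `onCreateReservation` is a placeholder for the CLI-generated subscription document.
import { useEffect, useState } from 'react';
import { API, graphqlOperation } from 'aws-amplify';

const onCreateReservation = /* GraphQL */ `
  subscription OnCreateReservation {
    onCreateReservation {
      id
      name
      createdAt
    }
  }
`;

type Reservation = { id: string; name: string; createdAt: string };

export function useReservations() {
  const [reservations, setReservations] = useState<Reservation[]>([]);

  useEffect(() => {
    // For subscription documents, API.graphql returns an Observable we can subscribe to.
    const sub = (API.graphql(graphqlOperation(onCreateReservation)) as any).subscribe({
      next: ({ value }: any) =>
        setReservations((prev) => [...prev, value.data.onCreateReservation]),
      error: (err: unknown) => console.error('subscription error', err),
    });
    return () => sub.unsubscribe();
  }, []);

  return reservations;
}
```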
CloudFront is Amazon’s global Content Delivery Network (CDN). It empowered our application by using its distribution network to serve our static pages out of cached S3 buckets and our server-rendered pages from Lambda functions deployed to Lambda@Edge.
DynamoDB is a fast, flexible, and fully managed NoSQL key-value database. It was built to be highly performant at any scale. Using this datastore allowed us to quickly insert new reservations as they were made with very low latency.
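Our writes flowed through the AppSync API described above, but a direct write with the AWS SDK v3 document client shows the underlying DynamoDB call; the table name and item shape here are assumptions for illustration.

```ts
// save-reservation.ts -- direct DynamoDB write using the AWS SDK v3 document client.
// Table name and item shape are assumptions, not the client's actual schema.
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, PutCommand } from '@aws-sdk/lib-dynamodb';

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export async function saveReservation(reservation: { id: string; name: string }) {
  // Single-digit-millisecond writes are what let the table keep up with the spike.
  await ddb.send(
    new PutCommand({
      TableName: process.env.RESERVATIONS_TABLE ?? 'Reservations',
      Item: { ...reservation, createdAt: new Date().toISOString() },
    })
  );
}
```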
Lambda is a serverless compute service that provides an environment to execute your code without worrying about servers, clusters, or instances. These babies were our workhorses behind the scenes. Each time a request would come through for a new reservation we would:
- check the current time against our trusted source to confirm the reservation window was open
- validate the details of the incoming reservation request
- write the new reservation to our datastore
- return the result so the user's browser could update immediately
These Lambdas encapsulated all of our backend code that dealt with the business logic of our application. They are highly scalable because they can run concurrently, allowing the handling of thousands of requests per second. They are also efficient because if another request comes in while a Lambda instance is still warm, it will pick up the new request instead of spinning down and back up. Lastly, they are cost-effective because you only pay for the compute time the Lambdas are actually running, billed in sub-second increments.
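Putting those steps together, here is a hedged sketch of what the reservation-processing Lambda could look like, reusing the hypothetical saveReservation helper from the DynamoDB sketch above; the opening time, validation rules, and direct table write are illustrative assumptions rather than the client's actual business logic (the real writes may well have gone through AppSync instead).

```ts
// process-reservation.ts -- sketch of the reservation-processing Lambda behind API Gateway.
// Opening time, validation, and the saveReservation helper are illustrative assumptions.
import type { APIGatewayProxyHandler } from 'aws-lambda';
import { randomUUID } from 'crypto';
import { saveReservation } from './save-reservation';

const RESERVATIONS_OPEN_AT = Date.parse(process.env.OPEN_AT ?? '2024-01-01T17:00:00Z');

export const handler: APIGatewayProxyHandler = async (event) => {
  // 1. Refuse requests that arrive before the reservation window opens.
  if (Date.now() < RESERVATIONS_OPEN_AT) {
    return { statusCode: 403, body: JSON.stringify({ message: 'Reservations are not open yet' }) };
  }

  // 2. Validate the incoming request body.
  const { name } = JSON.parse(event.body ?? '{}');
  if (!name) {
    return { statusCode: 400, body: JSON.stringify({ message: 'A name is required' }) };
  }

  // 3. Persist the reservation and return the result to the caller.
  const reservation = { id: randomUUID(), name };
  await saveReservation(reservation);
  return { statusCode: 200, body: JSON.stringify(reservation) };
};
```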
Thankfully, after a few months of research, implementation, and testing, we were able to deliver a system that could handle a spiky traffic pattern with ease. This system was also efficient and cost-effective, which met the original needs that led us to look for a new architectural solution in the first place. We enjoyed working on this project as it allowed us to explore new technologies to solve an old problem. In the end, we were happy to be able to work on such a fun project, the client was happy to have a robust solution, and the client’s users were happy to be able to make their reservations using a system that did not let them down.