Mike Ensor, VP Global Cloud Practice at Nortal, March 22, 2019
A multi-tenant system runs a single instance of software on infrastructure shared among multiple clients. This type of system grew in popularity as a way to get the most out of static infrastructure resources; over time, however, multi-tenant systems tend to become monoliths, adding complexity, inherent coupling, and scaling concerns.
Single-tenant systems allocate one stack of software and infrastructure for each client. Historically, SaaS enterprises have been strongly hesitant to deploy single-tenant systems because of their operational and maintenance complexity and cost. In the past, scaling a single-tenant system’s resources was an infrastructure issue rather than a software issue, which put the challenge beyond the reach of typical in-house development teams. The result was resource inefficiency, as developers were forced to focus on operational activities. However, this concern has been addressed by recent cloud technologies and dev-ops strategies (e.g., automating, managing, and exposing operations appropriately with a container orchestrator such as Kubernetes on Google Cloud Platform [GCP]).
Overall, the tools offered by cloud and dev-ops foundations, like continuous integration and delivery (CI/CD), open up the option to pursue what I’m going to call “single- as multi-tenant” systems: a service that is multi-tenant from the consumer’s perspective but actually implemented as separate single-tenant apps. This allows enterprises to automate scaling, decouple interdependent applications, and reduce operations and overall complexity.
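One way to picture a “single- as multi-tenant” service is a thin routing layer that presents one public entry point and forwards each request to the client’s own stack. The following is a minimal sketch under stated assumptions, not a prescribed implementation: the class names, the internal URLs, and the convention that the tenant is the first label of the request host are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantStack:
    """Location of one client's dedicated single-tenant deployment."""
    tenant_id: str
    base_url: str  # hypothetical internal address of that client's stack


class TenantRouter:
    """Presents one public entry point and forwards to per-client stacks."""

    def __init__(self) -> None:
        self._stacks: dict[str, TenantStack] = {}

    def register(self, stack: TenantStack) -> None:
        self._stacks[stack.tenant_id] = stack

    def resolve(self, host: str, path: str) -> str:
        # Assumption: the tenant is identified by the first label of the
        # request host, e.g. "acme.example.com" -> "acme".
        tenant_id = host.split(".", 1)[0].lower()
        stack = self._stacks.get(tenant_id)
        if stack is None:
            raise LookupError(f"no stack registered for tenant {tenant_id!r}")
        return stack.base_url.rstrip("/") + "/" + path.lstrip("/")


router = TenantRouter()
router.register(TenantStack("acme", "http://acme.internal.example"))
router.register(TenantStack("globex", "http://globex.internal.example"))

# The consumer sees a single service; the request lands on acme's own stack.
target = router.resolve("acme.example.com", "/orders/42")
```

In practice this role would more likely be played by an ingress or service mesh than by application code; the sketch only illustrates the mapping from one shared front door to many isolated stacks.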
Multi-tenant systems have different scaling characteristics than single-tenant solutions. Most multi-tenant solutions scale the highest-impact resource vertically, and databases are often the largest constraint. In order to maintain performance, databases are scaled vertically until that is no longer enough, at which point expensive solutions are added to scale the system horizontally. This often changes the characteristics of the application from a CA (consistent and available) system to either a CP (consistent and partition-tolerant) or an AP (available and partition-tolerant) one. Changing where a solution sits within the CAP (consistency, availability, and partition tolerance) theorem after it has clients can be a large risk and can produce difficult-to-address bugs. Applications in a multi-tenant solution also scale synchronously as client demand rises, resulting in resource contention and cost inefficiencies. Clients of different sizes start to compete for resources, and software needs to be created to adjust the balance.
Single-tenant solutions can scale asynchronously from the perspective of the product. Larger clients scale independently from smaller clients. The complexities of scaling a single-tenant system’s resources become an infrastructure challenge rather than a software challenge. With the introduction of tools like Infrastructure-as-Code, infrastructure can be created, scaled, and destroyed with tools built by the community using declarative formats.
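To make the declarative idea concrete, here is a small sketch that renders a desired-state document for each client’s stack, sized independently per client. The field names and sizing tiers are hypothetical and do not follow the schema of any particular Infrastructure-as-Code tool; a real setup would feed equivalent values into whichever tool the team has chosen.

```python
import json

# Hypothetical sizing tiers; the numbers are illustrative only.
SIZE_TIERS = {
    "small": {"min_replicas": 1, "max_replicas": 3, "db_storage_gb": 20},
    "large": {"min_replicas": 4, "max_replicas": 20, "db_storage_gb": 500},
}


def render_stack(tenant_id: str, size: str) -> dict:
    """Return the desired state for one client's isolated stack."""
    tier = SIZE_TIERS[size]
    return {
        "tenant": tenant_id,
        "namespace": f"tenant-{tenant_id}",
        "app": {
            "min_replicas": tier["min_replicas"],
            "max_replicas": tier["max_replicas"],
        },
        "database": {"storage_gb": tier["db_storage_gb"]},
    }


# A larger client and a smaller client get differently sized stacks
# from the same template, so each scales on its own schedule.
stacks = [render_stack("acme", "large"), render_stack("initech", "small")]
document = json.dumps(stacks, indent=2, sort_keys=True)
```

Because the output is plain data, it can be versioned, reviewed, and re-applied, which is the property that makes creating and destroying whole client stacks repeatable.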
Creating a stack representing a single client instance, defining infrastructure that can elastically scale, building in security, and automating the entire lifecycle are all now possible. As stated in a previous blog, containers and Kubernetes are the keys to completing the last mile of the software development lifecycle (SDLC). New advancements in service meshes have provided the automation required to stitch together clusters and allow a team to deliver a series of single-tenant clusters that masquerade as a multi-tenant system. Package tools such as Helm provide lifecycle management, allowing automation to deploy new versions of the application to all clusters simultaneously or in a rolling fashion.
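The choice between upgrading every cluster at once and rolling through them can be sketched as a planning step that groups clusters into waves and builds one Helm command per cluster. This is an illustrative sketch only: the release name, chart name, version, and kube-context names are hypothetical, and the commands are built as argument lists rather than executed.

```python
from typing import Iterator


def waves(clusters: list[str], wave_size: int) -> Iterator[list[str]]:
    """Split the cluster list into consecutive batches of at most wave_size."""
    if wave_size < 1:
        raise ValueError("wave_size must be at least 1")
    for start in range(0, len(clusters), wave_size):
        yield clusters[start:start + wave_size]


def helm_upgrade_command(context: str, release: str, chart: str, version: str) -> list[str]:
    """Build (not run) a helm upgrade command targeting one cluster."""
    return [
        "helm", "upgrade", "--install", release, chart,
        "--version", version,
        "--kube-context", context,  # selects the client's cluster
        "--wait",                   # block until the release is ready
    ]


# Hypothetical kube-context names, one per single-tenant cluster.
clusters = ["tenant-acme", "tenant-globex", "tenant-initech"]

# Rolling method: two clusters per wave, so a bad release stops early.
plan = [
    [helm_upgrade_command(c, "shop", "example/shop", "1.4.0") for c in wave]
    for wave in waves(clusters, wave_size=2)
]

# Simultaneous method: a single wave containing every cluster.
all_at_once = list(waves(clusters, wave_size=len(clusters)))
```

A driver would run each wave, check health, and only then move on to the next; stopping after a failed wave is what limits the blast radius compared with a single multi-tenant deployment.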
Systems written with multi-tenant capabilities inherently couple clients together through feature development, deployments, availability and many other areas. Features are designed to be used by all clients, but sales often needs custom features added in order to keep existing clients or win new ones. Features customized within a multi-tenant solution can be designed through extension patterns. However, multi-tenant systems typically start without client customizations in mind, meaning that feature design falls back on a permission model. Permission models can quickly become monolithic and a nightmare to manage.
Multi-tenant systems are also rigid. Since all clients in multi-tenancy are contained within one application, the risks are higher and the margin for error is smaller. Deployments become a high-stakes concern and need to be planned with more care. Mistakes in deployments or unplanned outages negatively impact all clients in a multi-tenant system. Beyond deployment and feature development, system resources and stability inherently couple clients together. Imagine one client running a back-office job that takes up all the available CPU and RAM: the other clients are negatively affected by the sudden drop in resources and start to lose the value of the solution. Software needs to be added in order to manage resource contention, ultimately increasing solution complexity.
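As an illustration of the extra software that resource contention forces into a multi-tenant system, here is a minimal sketch of a per-client quota that stops one client’s job from taking the whole pool. The class, the notion of abstract “units”, and the numbers are hypothetical; real systems usually need considerably more (fair queuing, priorities, preemption), which is exactly the added complexity described above.

```python
class TenantQuota:
    """Caps how many resource units each tenant may hold at once."""

    def __init__(self, total_units: int, per_tenant_cap: int) -> None:
        self.total_units = total_units
        self.per_tenant_cap = per_tenant_cap
        self._in_use: dict[str, int] = {}

    def acquire(self, tenant_id: str, units: int) -> bool:
        """Reserve units for a tenant; return False if a limit would be exceeded."""
        used_by_tenant = self._in_use.get(tenant_id, 0)
        used_overall = sum(self._in_use.values())
        if used_by_tenant + units > self.per_tenant_cap:
            return False  # this tenant would exceed its own cap
        if used_overall + units > self.total_units:
            return False  # the shared pool is exhausted
        self._in_use[tenant_id] = used_by_tenant + units
        return True

    def release(self, tenant_id: str, units: int) -> None:
        """Return units to the pool, never dropping below zero."""
        self._in_use[tenant_id] = max(0, self._in_use.get(tenant_id, 0) - units)


quota = TenantQuota(total_units=8, per_tenant_cap=4)
greedy_job = quota.acquire("acme", 8)    # refused: would take the whole pool
capped_job = quota.acquire("acme", 4)    # accepted: within acme's cap
other_job = quota.acquire("globex", 4)   # accepted: capacity is still left
```

In a single-tenant deployment this bookkeeping is unnecessary, because each client’s job can only consume the resources of its own stack.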
Different strategies, such as rolling client deployments, can be applied to a single-tenant system to help reduce the coupling of clients. With both the permission and extension patterns, custom features can be deployed into a single-tenant cluster without impacting other clients, thereby reducing coupling. The overall packaged binaries will be tailored to the individual client, thus reducing the risk of shared or incorrect features for a client. It is still recommended to keep feature customizations to a minimum in order to keep the solution operable.
Software in a multi-tenant solution requires special attention to ensure tenants are segregated. This creates additional complexity woven throughout the entire solution stack, resulting in challenges that do not exist in a single-tenant solution. These challenges are added only to ensure feature and data segregation. In order to ensure segregation, database tables are often created with additional columns containing foreign keys, complex constraints, foreign-key tables and other data modeling techniques. Queries for relatively simple segments of data can become complicated and convoluted. Performance tuning is required to ensure complex multi-tenant queries are efficient, as there are additional joins that put more strain on the database. As previously mentioned, database contention is another challenge inherent within multi-tenant systems.
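The difference in data modeling can be seen in a small sketch using Python’s built-in sqlite3 module. The tables, columns, and sample rows are hypothetical. In the shared schema every table carries a tenant identifier that must appear in keys, joins, and filters; in the per-client database the same question is a plain join.

```python
import sqlite3

# Shared multi-tenant schema: tenant_id is part of every key, and every
# query has to filter and join on it to keep clients segregated.
shared = sqlite3.connect(":memory:")
shared.executescript("""
    CREATE TABLE customers (tenant_id TEXT, id INTEGER, name TEXT,
                            PRIMARY KEY (tenant_id, id));
    CREATE TABLE orders (tenant_id TEXT, id INTEGER, customer_id INTEGER,
                         total REAL,
                         PRIMARY KEY (tenant_id, id),
                         FOREIGN KEY (tenant_id, customer_id)
                             REFERENCES customers (tenant_id, id));
    INSERT INTO customers VALUES ('acme', 1, 'Ann'), ('globex', 1, 'Bob');
    INSERT INTO orders VALUES ('acme', 1, 1, 10.0), ('globex', 1, 1, 99.0);
""")
shared_rows = shared.execute(
    """SELECT c.name, o.total
       FROM orders AS o
       JOIN customers AS c
         ON c.tenant_id = o.tenant_id AND c.id = o.customer_id
       WHERE o.tenant_id = ?""",
    ("acme",),
).fetchall()

# Single-tenant schema: one database per client, so no tenant column,
# no composite keys, and no tenant filter to forget.
acme = sqlite3.connect(":memory:")
acme.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers (id),
                         total REAL);
    INSERT INTO customers VALUES (1, 'Ann');
    INSERT INTO orders VALUES (1, 1, 10.0);
""")
single_rows = acme.execute(
    "SELECT c.name, o.total FROM orders AS o "
    "JOIN customers AS c ON c.id = o.customer_id"
).fetchall()
```

Both queries return the same data for the one client; the shared version simply has more places where a missing tenant condition would leak another client’s rows.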
Complexity in a multi-tenant solution manifests itself when debugging solutions. Observability tools, such as logging and tracing, require the client be included in all events stored. Logging and tracing data becomes a concern in a multi-tenant solution for several reasons:
Single-tenant solutions grow at the pace of the user, whereas the aggregate data in a multi-tenant solution will outgrow a database more quickly than any of the individual systems would alone. While redaction and special access may still be required in a single-tenant solution, often only the misbehaving clients’ clusters are examined, thus reducing your security and compliance risks.
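The requirement that every stored event in a multi-tenant solution identify the client can be sketched with Python’s standard logging module. The logger name, the `tenant_id` field, and the log format are hypothetical choices made for this example.

```python
import io
import logging

# Capture output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s tenant=%(tenant_id)s %(message)s")
)

base = logging.getLogger("billing")
base.setLevel(logging.INFO)
base.addHandler(handler)
base.propagate = False  # keep the example's output out of the root logger


def logger_for(tenant_id: str) -> logging.LoggerAdapter:
    """Return a logger that stamps every record with the given tenant."""
    return logging.LoggerAdapter(base, {"tenant_id": tenant_id})


logger_for("acme").info("invoice generated")
logger_for("globex").warning("payment retry")
output = stream.getvalue()
```

Every call site in a shared application has to go through something like `logger_for`, and anyone reading the combined stream sees all clients at once. With one cluster per client the stream is already scoped to a single client, so only the cluster under investigation needs to be opened.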
The benefits of single-tenant systems for SaaS enterprises and businesses are clear, from scalability to reduced complexity and rigidity. Now that cloud-native development has matured, CI/CD pipelines have improved, and Infrastructure-as-Code can build exact replicas of environments, multiple single-tenant deployments can be engineered to operate as a multi-tenant system. But a word of caution! Despite all the advantages, serving a single-tenant solution as a multi-tenant solution is not simple. The solution requires automation for the whole lifecycle of a client, deployments, upgrades, reliability and more. Operations teams need to work closely with application development to ensure the solution does not drift too far from a common core of features. Additionally, software development should work with the operations team to build SRE principles into the software in order to reduce the observability challenges that come from running many single-tenant deployments. Strong CI/CD pipelines coupled with regularly practiced deployments reduce these risks.