The Holy Grail of Web Scale

With customers demanding access to goods and services where and when they want them, and servers deluged with concurrent requests amid rising expectations of responsiveness, organizations of all types are focused on building “web-scale” systems.

Initially built by companies such as Facebook and Google because their sheer size demanded it, web-scale technology has become the mantra for companies of all sizes. A web-scale architecture delivers systems capable of providing the resiliency, scale and performance that enterprises need to keep customers happy today.

But across the technology stack, the database has emerged as the toughest layer to build at web-scale.

This challenge stems partly from technical complexity and partly from the very nature of databases themselves: the need for data consistency and preservation of ACID properties (atomicity, consistency, isolation and durability) even as data is replicated across multiple servers, possibly in different data centers. So databases, especially relational databases, have remained the Achilles’ heel of web-scale technology stacks. The database affects the application, and the application is the customer’s experience of your online services.

In my 10 years in venture, I’ve been fortunate enough to take advantage of this Achilles’ heel by investing in technologies adjacent to and around it: SpringSource (application servers), MuleSoft (enterprise service buses), DataStax (the Cassandra NoSQL database), Redis Labs (the Redis NoSQL database), Hazelcast (in-memory computing fabric), Akka (microservices platform for Java and Scala), Iron.io (AWS Lambda-style microservices architecture) and others. All of these either directly or indirectly alleviate the pain caused by the database Achilles’ heel, without addressing the pain itself.

The Googles and Facebooks that first built web-scale technology solved the traditional (relational) database challenge by building an abstraction layer. They built what some of them called a “data access layer” that sits between the application and database tiers.

That abstraction layer, even though it adds a “bump in the wire,” improves uptime and performance because it breaks the 1:1 tie that has historically existed between apps and databases. This separation, with its discrete layer of technology, is now an established best practice for scaling systems and applications because it simplifies app development and delivers web-scale benefits to the database: improved resilience, scale and performance.

An abstraction layer for the database fits right in with the trend of other technologies that have been “abstracted” or “virtualized.” Server virtualization, software-defined networking and load balancers at the web tier similarly break a 1:1 tie and deliver web-scale benefits in uptime and performance.

At the database tier, the abstraction layer provides key benefits that overcome some of the database’s shortcomings. Database load-balancing software, for example, transparently enables failover, scale-out and faster throughput.
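
To make the idea concrete, here is a minimal Python sketch of the routing decision such a layer might make. It is an illustration under stated assumptions, not how any particular product works: the names `Backend` and `Router`, the rule that anything starting with SELECT counts as a read, and the "promote the first healthy replica" failover policy are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Backend:
    """One database server as seen by the routing layer (hypothetical)."""
    name: str
    healthy: bool = True


@dataclass
class Router:
    """Chooses a backend per statement: reads scale out, writes fail over."""
    primary: Backend
    replicas: list
    _counter: count = field(default_factory=count, repr=False)

    def _healthy_replicas(self):
        return [r for r in self.replicas if r.healthy]

    def route(self, sql: str) -> str:
        # Assumption: a statement beginning with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        replicas = self._healthy_replicas()
        if is_read and replicas:
            # Scale-out: spread reads round-robin over healthy replicas.
            return replicas[next(self._counter) % len(replicas)].name
        if not self.primary.healthy:
            if not replicas:
                raise RuntimeError("no healthy backend available")
            # Failover: promote the first healthy replica to primary.
            self.primary = replicas[0]
            self.replicas.remove(self.primary)
        return self.primary.name


router = Router(Backend("db1"), [Backend("db2"), Backend("db3")])
print(router.route("INSERT INTO orders VALUES (1)"))  # goes to the primary
print(router.route("SELECT * FROM orders"))           # goes to a replica
```

The application only ever asks the router where to send a statement; which server is primary, and whether one has failed, stays hidden behind that one call. Real products also have to handle transactions, replication lag and health checks, which this sketch ignores.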

That “transparent” characteristic is crucial — gaining all these capabilities with no changes to the application or database is the holy grail.

When the “data access layer” abstraction was first developed by the early Internet companies, developers had to change their apps to be coded to the specifics of the abstraction layer, rather than the database. But a transparent network-level proxy that sits between the application and the database does not require such re-coding and re-architecting, which is why it is truly the holy grail.
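
For readers who want to see what “transparent” means in practice, the following is a rough Python sketch of a network-level proxy, assuming plain TCP between application and database. The function names (`start_proxy`, `_connect_first`, `_pipe`) and the "try backends in order" failover rule are hypothetical choices for illustration. A real product would also understand the database wire protocol, which this sketch does not.

```python
import asyncio


async def _pipe(reader, writer):
    """Copy bytes one way until the sending side closes."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def _connect_first(backends):
    """Failover at connect time: use the first backend that accepts."""
    for host, port in backends:
        try:
            return await asyncio.open_connection(host, port)
        except OSError:
            continue  # this backend is down, try the next one
    raise ConnectionError("no database backend reachable")


async def start_proxy(listen_host, listen_port, backends):
    """Listen where the app expects the database and relay bytes unchanged."""

    async def handle(client_reader, client_writer):
        try:
            db_reader, db_writer = await _connect_first(backends)
        except ConnectionError:
            client_writer.close()
            return
        # Relay in both directions; the bytes themselves are never altered.
        await asyncio.gather(
            _pipe(client_reader, db_writer),
            _pipe(db_reader, client_writer),
            return_exceptions=True,
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

The application's only change is configuration: it points its existing database driver at the proxy's address instead of the database's. No application code is rewritten, which is the contrast with the hand-built data access layers described above.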

Here at Bain Capital Ventures’ infrastructure software practice — which is where we deploy more than half our venture capital — we see the drive toward abstraction at the database tier on many fronts:

  • alternative databases
  • database vendors creating proxies
  • web load balancer companies creating SQL versions of their products
  • startups creating purpose-built systems

All of them recognize the value of the abstraction approach. The first web-scale companies got this right, and these best practices are helping the other 99 percent of organizations, all those that don’t have hundreds of engineers and millions of dollars to build their own abstraction layer.

Four years ago, we invested in ScaleBase, a startup building purpose-built database abstraction software. Those technology assets are now owned by ScaleArc (we led ScaleArc’s most recent funding round). This technology addresses the database Achilles’ heel.

Gartner recently highlighted this trend toward the abstraction architecture. The research firm added “SQL load balancing” to its latest IT Service Continuity hype cycle. In that report, Gartner recommends IT shops look for software that:

  • supports multiple databases
  • runs in the cloud and on-premises equally well
  • doesn’t compromise security

When it comes to uptime, the database is often a system’s weakest link because it is the hardest part of the technology stack to build at web scale. We can’t all be Facebooks and Googles, solving this challenge with our own internal engineering teams. Instead, database abstraction software gives organizations the web-scale capabilities they need: resilience, scale and performance.