What Google Is

Editor’s note: Benjy Weinberger is the engineering site lead for foursquare’s San Francisco office. He previously worked on infrastructure and revenue engineering at Twitter, and before that on search and ad engineering at Google for eight years.

No, really, what is Google? TechCrunch co-editor Alexia Tsotsis recently posted an interesting piece about Google’s focus, or rather the perceived lack of it. Google has its fingers in so many pies that there are quite a few angles from which to consider the above question.

The title of Alexia’s post says it all: “Remember When Google Was a Search Engine?” For consumers, Google is, or at least used to be, a search company. On the other hand, for investors, and cynics, Google is an ad network. That is, after all, where the money comes from.

But, as a former Googler and unabashed fan of the company (take this as both full disclosure and a disclaimer), I have a different perspective. For me, Google is, and always has been, a systems company.

Systems First

Most startups begin by focusing on the product: user experience, design, features, marketing and so on. These companies rely primarily on hosted or off-the-shelf systems infrastructure, and focus their engineering resources on the front-end elements, the things that make their company unique.

But some of these startups enjoy massive growth, and their traffic increases to the point where they can no longer scale with general-purpose systems. This is an important inflection point in a company’s life: you either hire a bunch of engineers with systems experience to develop the custom technology you need to scale, or you sell the company and let someone else worry about it.

Google, however, had a very different technology trajectory: it did systems first. This isn’t really that surprising. The front-end user experience in a search engine, at least back in 1998, was dirt simple: an HTML form with a single input box and a ‘Search’ button.

The tricky parts of search were crawling the web, indexing the content and retrieving relevant results very quickly. These problems required an ability to run complex computations in parallel on large numbers of computers, while being resilient to failure of any one of them. In other words, web search is fundamentally a distributed systems problem, as well as, more obviously, an Information Retrieval (IR) problem.
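To make the shape of that problem concrete, here is a toy sketch in Python. It is not Google's code or anything resembling its real infrastructure: the corpus, the shard layout, the simulated crash and every function name are assumptions made up for illustration, and threads stand in for separate machines. It indexes shards of documents in parallel, reruns a shard whose worker fails, and merges the partial results into one searchable index.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus, split into shards as if spread across machines.
SHARDS = {
    0: {"doc1": "google is a search company"},
    1: {"doc2": "google is an ad network"},
    2: {"doc3": "google is a systems company"},
}

# Shards whose worker "crashes" on the first attempt (simulated failure).
_flaky = {1}


def index_shard(shard_id):
    """Build a partial inverted index (word -> doc ids) for one shard."""
    if shard_id in _flaky:
        _flaky.discard(shard_id)
        raise RuntimeError(f"worker for shard {shard_id} died")
    partial = defaultdict(set)
    for doc_id, text in SHARDS[shard_id].items():
        for word in text.split():
            partial[word].add(doc_id)
    return partial


def index_with_retry(shard_id, attempts=3):
    """Rerun a failed shard so one dead worker does not sink the whole job."""
    for attempt in range(attempts):
        try:
            return index_shard(shard_id)
        except RuntimeError:
            if attempt == attempts - 1:
                raise


def build_index():
    """Index all shards in parallel, then merge the partial indexes."""
    merged = defaultdict(set)
    with ThreadPoolExecutor(max_workers=3) as pool:
        for partial in pool.map(index_with_retry, SHARDS):
            for word, doc_ids in partial.items():
                merged[word] |= doc_ids
    return merged


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    postings = [index.get(word, set()) for word in query.split()]
    return sorted(set.intersection(*postings)) if postings else []
```

Calling search(build_index(), "google company") returns the two documents that contain both words, even though the worker for one shard failed on its first attempt. The real problem differs in scale and in kind (ranking, freshness, machines that fail in far messier ways), but the split, retry and merge pattern is the part that makes it a systems problem.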

As a result, Google focused on systems from day one. It hired the best and the brightest, such as the now-renowned Jeff Dean and Sanjay Ghemawat, legendary Bell Labs pioneers Rob Pike and Ken Thompson, and many other incredibly talented systems engineers, both famous and anonymous (note: I don’t count myself in that number. I was just lucky to get to work with these folks).

The outcome was that distributed systems are a core part of Google’s DNA, even more so than search.

The Google Iceberg

Once Google had its formidable systems in place, many applications suggested themselves, applications that in some cases only Google was able to build. Most of what consumers see of Google, from search to Gmail to ads to Google Docs to book scanning to YouTube, is the one-tenth of the iceberg that sticks out of the water.

What connects these seemingly disparate products is the submerged nine-tenths: Google’s planet-scale distributed systems. Even seemingly left-field projects, such as the self-driving car, benefit from Google’s unrivaled data-crunching ability.

There are other companies with world-class systems proficiency, such as Amazon, Yahoo! and Microsoft. But Google casts an unusually long shadow over the rest of Silicon Valley. The bulk of the technologies that power so many startups out there, from distributed filesystems to MapReduce to NoSQL databases, were primarily invented at Google. And the company has served as such a wellspring of talent for startups that its technical influence has spread wide, even though Google itself is a meager contributor to the open-source world (*).

Trimming from the Middle

Of course not everything Google does is driven by a technology-first attitude. Android and Google+, for example, address strategic threats to Google’s core business, and Google obviously has to pursue them. But the technology behind even the less successful of these is first-rate.

While Google’s product karma is hit-and-miss, the company’s systems prowess gives both management and employees confidence that they can solve hard problems no one else can tackle, including moonshot problems such as augmented reality glasses and self-driving cars. Whether Google should be tackling these problems is a matter of opinion, but doing so is ingrained in the company.

Between these two extremes, however, are the middle-ground projects, and it’s these, neither strategic nor epic, that Larry Page is trying to pare down as CEO. If Google doesn’t need it, and Google isn’t uniquely positioned to do it, then why do it?

What binds all the different Google efforts together, then, is not an overarching plan, but an underlying technology platform. This may not form a coherent vision, but great things will continue to come from it. As well as no small number of duds.

(*) Note: Huge credit goes to Yahoo!, Facebook, Twitter and other companies for creating open-source versions of these technologies, both for their own use and for the benefit of the community at large. Google publishes many papers on these technologies, but keeps its own implementations proprietary (its technology stack is too tightly integrated to open-source just parts of it), requiring the open-source community to re-implement them from the published papers.