Researchers at USC have stumbled on a huge change in how Google architects its search services. The result? Reduced lag in serving search queries, especially in more far-flung regions (as in, far from Google’s own data centres).
The insight into Mountain View’s pipes stems from other research the team was doing to develop a new method for tracking and mapping servers, identifying when they are in the same data center and estimating where that data center is. The method also identifies the relationships between servers and clients, and — as luck would have it — the team happened to be using it when Google made its big move. Unless of course Mountain View makes such massive shifts regularly (which seems unlikely).
According to the findings, over the past 10 months Google has “dramatically” increased the number of sites around the world from which it serves client search queries, by 600 percent no less. (The animated GIF at the top of this post depicts the ramp-up, with black circles marking Google data centres and red triangles marking third-party sites now being utilised by Google to relay search traffic.)
The researchers note:
From October 2012 to late July 2013, the number of locations serving Google’s search infrastructure increased from a little less than 200 to a little more than 1400, and the number of ISPs grew from just over 100 to more than 850.
The USC team says Google has made this change by repurposing existing infrastructure: utilising client networks it was already relying on to host content such as YouTube videos, and reusing them to relay (and crucially speed up) user requests and responses for search and ads.
“Google already delivered YouTube videos from within these client networks,” said USC PhD student Matt Calder, lead author of the study, commenting in a statement. “But they’ve abruptly expanded the way they use the networks, turning their content-hosting infrastructure into a search infrastructure as well.”
Previously, search queries went directly to a Google data centre, an architecture that could introduce lag depending on how far from the data centre the query originated. Under the new architecture, searches go to a regional network first and are then relayed on to Google’s data centres. While that might sound more long-winded, it actually has the opposite effect: the persistent connection between the regional node and Google’s data centres keeps speeds up and helps mitigate the effect of lost data packets.
The researchers explain:
Data connections typically need to “warm up” to get to their top speed – the continuous connection between the client network and the Google data center eliminates some of that warming up lag time. In addition, content is split up into tiny packets to be sent over the Internet – and some of the delay that you may experience is due to the occasional loss of some of those packets. By designating the client network as a middleman, lost packets can be spotted and replaced much more quickly.
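The researchers’ description maps onto what networking folk call a split-connection design: the “warm up” (TCP slow start) happens only on the short client-to-frontend leg, while the frontend keeps a persistent, already-warm connection back to the data centre. A toy latency model in Python sketches the effect. All numbers, the window-doubling behaviour, and the function names are illustrative assumptions for this post, not measurements from the study:

```python
# Toy model: a cold connection "warms up" via slow start, so delivering
# `segments` packets takes several round trips; a pre-warmed persistent
# connection is assumed to deliver everything in a single round trip.

def slow_start_rtts(segments, init_cwnd=10):
    """Round trips to deliver `segments` packets if the window doubles each RTT."""
    rtts, sent, cwnd = 0, 0, init_cwnd
    while sent < segments:
        sent += cwnd
        cwnd *= 2
        rtts += 1
    return rtts

def direct_ms(rtt_ms, segments):
    # Handshake plus slow start, all over the long client-to-data-centre path.
    return rtt_ms * (1 + slow_start_rtts(segments))

def via_frontend_ms(client_rtt_ms, backhaul_rtt_ms, segments):
    # Slow start runs only on the short client leg; the frontend's
    # connection to the data centre is persistent and already warm.
    return client_rtt_ms * (1 + slow_start_rtts(segments)) + backhaul_rtt_ms

# Hypothetical example: a 100 ms direct path vs a 10 ms local frontend.
print(direct_ms(100, 60))            # 100 * (1 + 3) = 400 ms
print(via_frontend_ms(10, 100, 60))  # 10 * (1 + 3) + 100 = 140 ms
```

The same short client leg is what lets lost packets be “spotted and replaced much more quickly”: retransmissions only have to cross the few milliseconds to the frontend, not the whole path back to the data centre.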
Google’s new search architecture resembles that of content delivery networks (CDNs), such as Akamai and Limelight Networks, which video services use to reduce lag when streaming content.
How much lag is Google’s new world order for search eliminating? Report author Ethan Katz-Bassett told TechCrunch that’s difficult to assess at this point (the team is doing ongoing work to quantify the performance implications of the change), and said lag reduction will also necessarily vary “a lot” by region. But he described one example where search latency looks to have decreased by around a fifth.
“To eyeball results from one machine in New Zealand, it used to get served from Sydney, and now it is directed to a frontend in NZ. As a result, it looks like the latency dropped by about 20%,” he said.
“The high level implication is that many regions around the world that were previously somewhat underserved should receive faster performance,” he added. “For example, of the networks we see using these new servers, 50% were 1600+km away from their old server on Google’s network. Now, half of them are within 50km of their new server in the local ISP.”
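A back-of-the-envelope calculation shows why that distance drop matters. Assuming signals travel through optical fibre at roughly two-thirds the speed of light (about 200,000 km/s), and ignoring routing detours and queuing, the minimum round-trip propagation time for the distances Katz-Bassett quotes works out to:

```python
# Light in optical fibre travels at roughly 200,000 km/s (~2/3 of c),
# i.e. about 200 km per millisecond.
FIBRE_KM_PER_MS = 200.0

def min_rtt_ms(distance_km):
    """Minimum round-trip propagation time over fibre, straight-line path."""
    return 2 * distance_km / FIBRE_KM_PER_MS

print(min_rtt_ms(1600))  # 16.0 ms for the old 1600 km-away servers
print(min_rtt_ms(50))    # 0.5 ms for a new frontend within 50 km
```

That is only the physics floor for the first hop, not end-to-end search latency, but it illustrates how much headroom moving the frontend into the local ISP frees up.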
The new infrastructure looks to be a win not just for users (getting faster results) and for Google (delivering more ads), but also for ISPs — because it should lower their operational costs since they are now serving more local traffic. And if Google is leaning more heavily on their infrastructure, it’s possible Mountain View is paying them more too.
Rather than the shift being about Google future-proofing for expected global growth in search queries, Katz-Bassett’s view is this is about helping to serve existing users around the world better. “On its own, it doesn’t necessarily aid capacity, but is probably mainly useful for improving performance,” he said when asked.
Why has Google made this change now? Again, hard to say (Google isn’t commenting on the research). Katz-Bassett speculates that engineering and technical challenges prevented it from routing search traffic this way before; that seems more likely than a lack of business partnerships, at least, since the study notes that Google is ‘mostly’ utilising existing client networks, such as Time Warner Cable, for the new search topology.
That, and the work of prioritising this change over other possible performance improvements, said Katz-Bassett.
“It does introduce some challenges: how should the system decide which server to direct a particular client to to get the best performance? In the past, Google controlled the whole path as soon as a request hit a frontend. Now that most of the frontend locations are outside Google’s network, the frontends have to relay it over the public Internet (towards Google data centers), so I imagine the conditions vary more (congestion, available bandwidth, etc), and it is a very large system to manage,” he added.
The USC team presented their findings at the SIGCOMM Internet Measurement Conference in Spain yesterday.