Menlo Park-based Cuil will launch later this evening with an index of 120 billion web pages, arguably making it the most comprehensive search engine on the web. (Google doesn’t disclose the size of its index, although it claims to know about a trillion unique web pages.) (Update: see our very early testing here.) The company has also dropped one of the “l’s” from its name – previously it was “Cuill.” Either way, it’s pronounced “cool.”
The super-stealth search project was founded by highly respected search experts. Husband-and-wife team Tom Costello (CEO) and Anna Patterson (VP Engineering) were joined by Russell Power. Patterson and Power are ex-Google employees, and the company has been the subject of intense speculation over the last couple of years.
Much of the secret sauce of Cuil is in the way they index the web and handle actual queries by users. Both are costly to scale, and Cuil claims to have found a way to massively reduce those costs. That allows them to run the search engine far more cheaply, even at Google scale should it ever reach that point. By some estimates, Google spends a billion dollars a year to run the back-end infrastructure of its search business.
Cuil also claims to have better search results than Google and others based on how they index websites. They do not simply catalog keywords on a site and then rank the site based on its importance. They also work to understand how words are related (France – cheese – wine, for example), to return more relevant results to users. This is a semantic approach to search, but very different from Powerset’s natural language approach. Powerset uses artificial intelligence to try to understand what sentences on a website actually mean. Cuil, by comparison, simply tries to properly categorize and file a web page, even if the category name doesn’t appear on the site.
That means users search the same way they always have, but Cuil will try to return better results via refinements in an “explore by category” module to the right of results. A search for dogs, for example, will return category results for “water dogs,” “crossbreed,” “cocker spaniel,” etc. Some of these related terms do not include the term “dog.”
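Cuil hasn’t disclosed how it actually builds these categories, so the following is only a toy sketch of the general idea of relating terms by how often they appear together. The corpus, the function names and the co-occurrence counting are all our own assumptions for illustration, not Cuil’s method.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each "page" is reduced to the set of terms found on it.
PAGES = [
    {"dog", "cocker spaniel", "crossbreed"},
    {"dog", "water dogs", "cocker spaniel"},
    {"dog", "crossbreed", "water dogs"},
    {"dog", "leash"},
    {"france", "cheese", "wine"},
    {"france", "wine"},
]


def build_cooccurrence(pages):
    """Count how many pages each pair of distinct terms shares."""
    counts = defaultdict(Counter)
    for terms in pages:
        for a in terms:
            for b in terms:
                if a != b:
                    counts[a][b] += 1
    return counts


def related_terms(counts, query, limit=3):
    """Return up to `limit` terms that most often share a page with `query`.

    Ties are broken alphabetically so the result is deterministic.
    """
    neighbours = counts.get(query, Counter())
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:limit]]


if __name__ == "__main__":
    counts = build_cooccurrence(PAGES)
    print(related_terms(counts, "dog"))
    print(related_terms(counts, "france"))
```

In this sketch a query for “dog” surfaces suggestions such as “cocker spaniel” and “crossbreed,” neither of which contains the word “dog,” which is the behavior described above. A real engine would need far more than raw page-level counts.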
Cuil is experimenting with a new type of search interface as well. Results are shown in three columns and contain an image and more summary text than existing search engines. In addition to refinement by category, Cuil will recommend related searches via tabs across the top of search results. A search for New York, for example, also has tabbed results for recommended refinements like New York Times, New York City, New York Yankees, etc.
Cuil also says that they will put user privacy at the top of their business objectives. User IP addresses are not recorded on their servers, they say, and cookies are not used to associate a computer with queries. The data is simply discarded as it is created. That means user data cannot be turned over to others, whether it’s via blind stupidity or lawsuits.
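We don’t know how Cuil implements this internally. As a rough illustration of the policy it describes, here is a minimal sketch in which identifying fields are dropped before anything is stored. The field names and the `anonymized_record` helper are hypothetical.

```python
# Hypothetical field names: only these survive into whatever gets stored.
ALLOWED_FIELDS = ("query", "timestamp")


def anonymized_record(request: dict) -> dict:
    """Keep only allow-listed fields, so the IP address and cookie are never stored."""
    return {field: request[field] for field in ALLOWED_FIELDS if field in request}


if __name__ == "__main__":
    incoming = {
        "query": "new york",
        "timestamp": 1217203200,
        "ip": "203.0.113.7",      # documentation-range address, for illustration
        "cookie": "session=abc",  # made-up value
    }
    print(anonymized_record(incoming))
```

An allow-list is used here rather than a block-list so that any new identifying field added to a request later is excluded by default.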
Cuil has raised $33 million over two rounds of financing from Greylock, Madrone Capital Partners and Tugboat Ventures.