
One of the next frontiers of search is taking all of the unstructured data spread helter-skelter across the Web and treating it as if it were sitting in a nice, structured database. It is easier to get answers out of a database where everything is neatly labeled, stamped, and categorized. As the sheer volume of stuff on the Web keeps growing, keyword search keeps getting closer to its breaking point. Adding structure to the Web is one way to make sense of all that data, and Google is starting to tackle the problem with a Google Labs project called Google Squared, which Marissa Mayer mentioned earlier today at the company’s Searchology briefing.
Google Squared extracts data from Web pages and presents it in search results as squares in an online spreadsheet. Michael was at the event and got a personal demo (see video below). From Michael’s Searchology notes:
Google Squared is launching later this month in labs. Google Squared returns search results in a spreadsheet format. It structures the unstructured data on web pages. So a search for Small Dogs returns results with names, description, size, weight, origin, etc., in columns and rows.
Google is looking for data structures on the web that imply facts, and then grabbing them for Squared results. “It takes an incredible amount of compute power to create one of those squares,” she says.
This type of technology has obvious applications for many types of targeted searches, including product search, health search, scientific searches, you name it. There are dozens of semantic search startups trying to impose structure on the Web to perform similar tricks. Another high-profile search startup which is launching on Monday, Wolfram Alpha, takes a slightly different approach in that it simply ingests massive amounts of information into its own databases where it can query it to its heart’s delight. Already there is a bit of a rivalry between Google and Wolfram because getting back structured results is a major new direction for search.
Wolfram does a pretty good job parsing the information in its own databases, but those databases will never match what is available on the Web. Wolfram’s databases currently store only 10 terabytes of information, a tiny fraction of what is on the Web. (I will be posting my impressions of Wolfram’s search engine soon.) Google Squared is an early attempt to take the messy data which exists on the Web and place it into simple tables. It is still very experimental and isn’t always on target, but you can see where this is going. Turning the Web into a giant database will crush any attempt to segregate the “best” information into a separate database so that it can be processed and searched more deeply.
In the video demo below, a search for “camera” sorts the results into different columns by image, description, manufacturer, resolution, etc. You can refine results by clicking on a particular column such as manufacturer. A search for “rollercoasters” sorts results by name, image, description, height, length, and number of inversions. But sometimes it gets confused. A search for “spaceships” turns up a Corvette and a missile carrier. It is going to be a while before this makes it out of Google Labs.
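To make the spreadsheet idea concrete, here is a minimal Python sketch of what happens once facts have already been pulled into rows: columns are derived from the rows, a column can be used to refine the results, and missing values are tolerated. This is only an illustration, not how Google Squared works internally. The hard part, extracting the facts from Web pages, is left out entirely, and the row values, column names, and function names below are all hypothetical.

```python
# Hypothetical rows, loosely modeled on the "rollercoasters" square in the demo.
# The values are invented for illustration and are not real measurements.
ROWS = [
    {"name": "Coaster A", "height_ft": 205, "length_ft": 5100, "inversions": 0},
    {"name": "Coaster B", "height_ft": 150, "length_ft": 3900, "inversions": 4},
    {"name": "Coaster C", "height_ft": None, "length_ft": 4400, "inversions": 2},
]


def columns(rows):
    """Return the column names in first-seen order."""
    seen = []
    for row in rows:
        for key in row:
            if key not in seen:
                seen.append(key)
    return seen


def refine(rows, column, predicate):
    """Keep rows whose value in `column` is known and satisfies `predicate`."""
    return [r for r in rows if r.get(column) is not None and predicate(r[column])]


def sort_by(rows, column, descending=False):
    """Sort rows by one column, placing rows with a missing value last."""
    known = [r for r in rows if r.get(column) is not None]
    unknown = [r for r in rows if r.get(column) is None]
    return sorted(known, key=lambda r: r[column], reverse=descending) + unknown


def render(rows):
    """Render rows as a plain-text grid, showing '?' for unknown cells."""
    cols = columns(rows)
    lines = [" | ".join(cols)]
    for r in rows:
        lines.append(" | ".join("?" if r.get(c) is None else str(r[c]) for c in cols))
    return "\n".join(lines)


if __name__ == "__main__":
    # Refine to coasters with at least two inversions, tallest first.
    looping = refine(ROWS, "inversions", lambda v: v >= 2)
    print(render(sort_by(looping, "height_ft", descending=True)))
```

The unknown height for one row is deliberate: a table built from messy pages will have gaps, so the sketch keeps such rows and sorts them last rather than dropping them.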






If Google could pull this off it could crush Wolfram, but as MG pointed out in his recent article on Google News, Google’s algorithms aren’t always that impressive. It’s exceedingly difficult to create an algorithm like this, if it is even possible. I’m putting my money on Wolfram; we’re not yet at the stage where we can structure data without human curation.
When you square the edges of a square you get…
http://mathworld.wolfram.com/Octagon.html
And when you enter the Octagon…
http://www.imdb.com/title/tt0081259/
You reach only one conclusion:
Yes. This will be a battle.
Chuck Norris vs. Ninjas.
Place your bets.
Patent is there for very same issues. If they launch a good product without protecting IP, big people will rip it.
Hauser, I agree. Let’s just wait and see what Wolfram will come up with. It is premature for Google to be talking like this, given that they haven’t included the time-stamp of documents as an add-on to PageRank to make recent documents rank higher (i.e., tensorised PageRank) as a 3D dataset. The current PageRank only works on a 2D dataset (inbound and outbound). Stephen Wolfram’s background is in Cosmology, and tensor modeling is something that Physicists/Cosmologists have been doing for almost 80 years or so. To the best of my understanding, Google is still new to tensors. Wolfram might not be using tensors in his current product, but I wouldn’t be surprised if he explores them in the near future for product enhancement or perhaps uses them for something new.
google is just behaving like the monopolist it is – attempting to thwart innovation and to steal IP from the smaller guy. SHAME!
STOP GOOGLE NOW!
You cannot stop competition. You’re contradicting yourself by trying to squander innovation. Put your head in a paper bag, please.
Seems to me that Google Labs is doing its usual good job of anticipating a market for its customers. Innovation is good for all.
like all giants, google will fall one day.
Actually Google has actively struggled with how page updates should affect rank. Some sites are more usable if they’re fresher – IE, news & entertainment sites – whereas some encyclopedic or reference sites are very authoritative but not very likely to change over time.
I thought his background was elementary particles, but that is not really important. Google is not good at performing calculations and making deductions. Tensors may not be important for the average person, but statistics and differential equations certainly are. You often want to compare different sets of statistics and perform statistical tests.
If you base a robot on anything except stepping motors you are in the realm of differential equations. Ask a question, “how do you perform a mundane task” and you are into differential equations. Google can’t handle this data at all except to output it in a file. Mathematica code handles this extremely well.
Wolfram alpha still has a long way to go but it has more promise of developing into AI than does Google.
I don’t agree with Hauser. Wolfram Alpha databases are combed over by experts in the field. It will be data which you can reference as a reputable source, in addition to providing query results which have no available answers yet.
Google Squared will only show you data from what other people have already published to their websites, reputable or not. It cannot answer questions which haven’t already been answered somewhere by someone.
For example, type the following query into Google: solve (5x)^2 – 3 = 0. You will not get an answer, only results which are lexically similar to your query, such as “2x^2-5x+3=0”. This is something that Wolfram Alpha would easily be able to answer.
http://www.google.com/help/calculator.html
That is what I call a slap in the face.
It does NOT solve equations.
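For reference, the equation in the example query above, (5x)^2 – 3 = 0, has a short closed-form answer. The steps below are worked by hand and are not taken from the output of either product:

```latex
\begin{align*}
(5x)^2 - 3 &= 0 \\
25x^2 &= 3 \\
x^2 &= \tfrac{3}{25} \\
x &= \pm\tfrac{\sqrt{3}}{5} \approx \pm 0.3464
\end{align*}
```

Producing this requires symbolic manipulation of the equation, which is a different task from evaluating a numeric expression or matching the query text against indexed pages.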
With such a small database and limited resources at hand, Wolfram would never be able to compete with the search giant, but hats off to Mr. Wolfram: his innovation has forced Google to work on Google Squared, which will definitely serve users in a more organized way.
totally agree with Hauser.
has potential to be pretty useful once all the bugs and accuracy issues are worked out.
I don’t think there is anything you can’t easily find using Google as is. That’s why it’s so hard to find or create a new tool for search. I can’t see the video so I don’t know about Google Squared, but it sounds interesting.
adon…
surely you jest….
my god, youngsters….
take a look at any site that has a few levels of forms… you’ll discover that what you can easily find in a few secs, you can’t bring up on a google search….
while google is wide… it’s not deep…
tom…
relax…
I’m by no means saying Google is perfect. I’m just saying that the barriers to entry are huge because for most people it is pretty easy to find websites that contain the information you are looking for in a few clicks using Google.
Example from the video; search ‘roller coasters’ on Google and the first result is a database where you can search for roller coasters, or search ‘Cedar point roller coasters’ and the first result is cedar point’s website listing all their roller coasters and their stats.
while Google is not perfect…it is pretty damn good…
Searching for static information (such as a list of roller coasters, the history of rollercoasters, etc.) is substantially different from searching through multiple sets of data AND analyzing said data.
For instance, I can easily find (via Google, MSN, Yahoo, etc.) sets of statistics for 2008. But it’s difficult to do statistical analysis on that data and even more difficult to do statistical analysis on that data as it compares to historical data.
You will be crushed!!!!
Ok, can see the video now. That looks really promising. Awesome.
Wolfram alpha is vaporware!
CNN link with interesting points and links to MIT reviews for WA..
http://scitech.blogs.cnn.com/2009/05/08/wolframalpha-a-new-way-to-find-data-online/
Google squared seems to be a great tool to improve searches on the web. I want to try it!
Unfortunately for Wolfram, good enough is good enough for a majority of the consumers who are searching today. Google is fast and has the brand, and the other tools (WonderWheel, et al.) they have been releasing show that there will be no Google killer in the near future. I am not a fanboy and would enjoy watching someone kick their @ss.
Replace Google with Yahoo and Wolfram with Google in your comment. It can happen… maybe it won’t be Wolfram but Google would not be what they are today if Larry Page and Sergey Brin had this kind of attitude. Even with Twitter and the iPhone dominating tech news (if not the real consumer market), today is really not that different from when Yahoo! was king. Really.
Well said. Once you believe you are king, you will be off the throne soon enough
That wouldn’t apply when you actually know you are king, however…
Wolfram Alpha reminds me of the bigfoot story on techcrunch a while back
http://www.techcrunch.com/2008/08/15/bigfoot-discovery-unveiled-in-palo-alto/
Same thing huge PR.. of the next best thing after google… with what at the end. An ice cooler with a costume…
the way this nova spivack douche has hyped up wolfram alpha, the way they have been doing it, they created so many expectations it will come crashing down on them… and I think they are afraid of the genie they let out….
I agree Wolfram Alpha is Vaporware… like Duke Nuke’em Forever – that turned out to be NEVER.
Lets rename that search engine
DUKE WOLFRAM NUKEM ALPHA NEVER….
Google rocks they’re smoking them….
Wrt “the way this nova spivack douche has hyped up wolfram alpha”, poor Stephen Wolfram!
The last person anyone wants to hype their platform — particularly one that’s got aspirations to be a reputable reference source — is Spivack (or as one of his own Twine users Chesterfield once called him “SPINVACK”).
Spivack hyped up his own platform and failed to actually deliver to his users and disappointed them so much, users who had previously shown goodwill towards Twine wrote these articles:
* TWINE: A VISION LOST
http://xosfaere.wordpress.com/2009/02/27/twine-a-vision-lost/
* TWINE SUCCESS CONTESTED
http://www.semanticsincorporated.com/2009/02/twines-success-contested-whats-the-right-pr-approach-for-semantic-web-ventures.html
The same Spivack whose management guru grandfather, Peter Drucker, offers us these wisdoms:
* The most important thing in communication is hearing what isn’t said.
* The aim of marketing is to know and understand the customer so well the product or service fits him and sells itself.
Let’s check whether Spivack himself has “walked the talk” of these wisdoms.
He’s
(1.) Marketed Twine as “We organize that s***”.
(2.) Consistently closed public and private channels for member-users to provide feedback: Beta Feedback, User Feedback, Power Users, Product Community and most recently Twine Lounge (which a key user who knows called “a nursery” for its uselessness) have all been closed or deactivated.
(3.) Decimated users’ content and contributions and not listened to their legitimate concerns over feature improvements, spam contributing to traffic count, reciprocal respect towards users — particularly the ones who’ve provided such brilliant suggestions and quality content, etc.
During his GRID08 presentation he talked about creating a Global Brain that will enable the “increase of collective intelligence” by means of us each acting as mirrors on each other’s insights and links.
Yet when Twine’s user-members reflect and channel back to his team any constructive criticism, instead of committing to delivering improvements to them RN decides to close the user feedback channel (in other words, smash their mirrors in) — the closure of the Lounge despite users’ objections being the most recent example of RN creating yet another deadwood twine and their User Communication problems.
Spivack is also the person who kept insisting that Google isn’t into semantic search (please see all his previous interviews). This despite Twain posting an original and objective article on Twine with plenty of useful links, which substantiated that Google was building semtech teams, interested in SemWeb and implementing semantic structures in its search algorithms.
She wrote this article over 6 MONTHS before ReadWriteWeb spotted Google’s move into the space this January 2009:
* http://www.readwriteweb.com/archives/google_semantic_data.php
Now, Greg Boutin and David Provost wanted to know where that article on Google disappeared to (Please see their blog comments here: http://semanticbusiness.blogspot.com/2008/11/im-tired-of-googles-shell-game.html):
“There was a great Twine post on Google’s involvement in the space, which “disappeared” a few days after it was posted.” (Greg Boutin, July 2008)
I say conspiracy
Here’s the answer for you all.
Twain wrote that post, using publicly available sources in the interests of making sense of the Semantic space and Google’s strategy which was under most radars.
Nova Spivack, CEO of Radar Networks, insisted that Google was not into SemWeb and would not be offering semantic features. In almost all of his interviews in 2008 he said that searching on Google was like “trying to find a needle in a haystack” and that all Google did was use statistical algorithms and not any semantics.
This in spite of Twain’s well-researched article which proved his insights to be wide of the mark.
Spivack is responsible for that post not being democratically available to collectively contextualize whether Wolfram Alpha or True Knowledge or Powerset or any other SemWeb offering is a “Google killer”.
How can any SemWeb offering be this before it’s even allowed to launch and stand up on its own merits, objectively?
Moreover, if we start off with the wrong assumptions — e.g., Google is not into semantics — we risk arriving at the wrong conclusions. NO company is a genuine paradigm shifter or potential “Google killer” by virtue of them incorporating semantics and visual knowledge representation alone.
Don’t believe anyone’s spin or hype. Seek out the substance from the smoke / snake oil salesmen / Emperor’s New Clothes stuff because it matters — if we really want to know what the future of the Internet (on+off browser) will be like for servicing our needs and understanding what we’re looking for.
Has anyone seriously, systematically and objectively analyzed Google’s semantic strategy since 2005? Has anyone looked at the visualization innovations happening with Google Finance, Google Earth and Google Draw?!
Or are we following insubstantial breadcrumbs?
Spivack is responsible for the deletion of Twain’s post which pointed to Google already being advanced in its semtech moves.
So…….poor Stephen Wolfram. Let’s hope his computational knowledge engine gets a chance to prove itself and find its natural audience without anyone’s hype, I say.
Unfortunately, Nova’s radar is not exactly optimally calibrated. His strategies for Twine don’t seem to hit the spot much either (please see TWINE: A VISION LOST and how his user-members have been consistently disappointed and exasperated by the team, particularly its Customer Relations and Marketing).
As for Wolfram Alpha, for a more reasoned testing of the system instead of spin-hype please read this article:
* http://www.spiegel.de/international/zeitgeist/0,1518,624065,00.html
*******************************************
NOTE TO NOVA: You’re a Tibetan Buddhist so should be schooled in karma.
In your eradication and desecration of Twain’s content, you violated the most basic principles of respect for the property of the digitally dead. It means you’ll attract BAD karma for an eternity.
Plus your actions of closing legitimate user channels demonstrate the oppression of democratic opinion and the lack of commitment to core users, who have done nothing but be supportive and collaborative.
Lastly, your deletion of Twain’s avatar from Twine is worse than what the Chinese government supposedly did to the Dalai Lama’s image.
Twain has never claimed to be either a Buddhist or a Semantic space expert and yet she’s more aware than you.
Twain regrets entrusting her content to Twine for stewardship. She won’t make that mistake again.
She also won’t be relying on your calls on the Semantic space, “Google killers,” online democracy, collective intelligence, the Global Brain or how to treat user-members, plus more.
Your radar’s off.
We don’t even know 1% of Wolfram Alpha or Squared yet. They are already ‘crushing’ each other? Impact of too many video games!
Google Squared is so cool. Looks very much like our vision at Cazoodle.
At Cazoodle, our goal is to enable “data-aware” search, by understanding the structure of Web data. Try our current products in apartment rentals, local events, and online shopping.
http://www.cazoodle.com
Gee, that’s really fascinating stuff about Cazoodle. Please post more details on http://who-gives-a-rats-ass.com.
Yeah right… even a caveman can replicate your website. Everybody can do FIND-SEARCH.
“Crush Wolfram Alpha”? There is nothing to crush. You are a moron squared, Eric.
Is it just me or is TechCrunch getting more desperate every day for headlines? I have to admit they come up with some catchy headlines, yet most of the time when I read the articles I’m pretty disappointed. To compare and judge two products requires extensive reviews and a certain time on the market. To say that Google will “crush” Wolfram Alpha is at this point totally stupid.
It’s definitely NOT JUST YOU.
Ad $$$ must be in free fall, right now… as is the quality of the posts.
Oh, Erick. Wasn’t it just in the last couple of weeks that you were telling us how Wolfram Alpha was dead because of Google Public Data?
But now they’re really, really dead because of Google Squared? Is this a new, separate Google product, or has Google already had to re-brand Google Public Data after its disappointing, rushed debut where they tried to hijack the attention that Wolfram Alpha was getting? (Err, I mean, just coincidentally jumped into Wolfram’s spotlight all because somebody at Google was having a baby, hur hur.)
Instead of acting like a fanboy, try investigating why Google seems to be flailing about trying to respond to Wolfram Alpha. Right now you’re just coming off like a shitty excuse for a journalist.
I second the notion that Google Public Data was a half-baked product with a botched launch that, if anything, revealed quite a bit of insecurity. Really, a grade-schooler could have programmed those charts from Public Data; it’s something I’d expect to see come out of a weekend mashup camp, not the all-powerful Google. Is it me or have they not only lost their best executives but their best engineers?
ouch burn +squared.
Google Squared isn’t even competing in the Wolfram space. Wolfram will be great at scientific and academic queries, while GS will be better at general interest and e-commerce queries that your everyday user would actually perform. For instance, Google Squared could help you find a digital camera with a certain number of megapixels or a certain focal length (Froogle integration, anyone?), while I’d go to Wolfram Alpha to figure out when Halley’s Comet will pass Earth next. They’re two different products for two different audiences, with very little overlap.
Finally, the only comment that made sense!
I agree, Joel. When people search they are either looking to discover broad information about a topic or to find a specific data point. The concept of Google Squared is good for comparative queries such as for consumer purchases. However, I question adoption of both services by the everyday user. Search should be a simple and seamless experience.
Does this surprise anyone?
It was just a matter of time before Google came up with something like this.
I am actually a little bit surprised considering the ever increasing monopoly threats to Google
This is cool but jeez, isn’t it a bit early to predict who will win? I’d like to see a few players in the structured data space over the next few years to get some healthy competition and innovation. Don’t kill Wolfram|Alpha before it’s even left the gate =)
I am thinking about what google cannot do. @Holden Google is definitely going to attract more antitrust regulations against them.
the problem with google squared is how they can earn money if everybody can sort the search results the way they want …
Hmm.. maybe companies should start thinking about releasing innovative products that will actually affect google
Presumably, they’ll also expose the data for use by gSpreadsheets & other services; so e.g. a search for ‘london restaurants’, ‘online backup’, etc. results in an instant side-by-side comparison chart. Imagine what this opens up… "Did you mean to search for ‘phones with HD video capture’?"
Vodex, it is going to be quite amazing and Google has the index to actually make this plausible unlike wolfram alpha. I am pumped
http://www.spiegel.de/international/zeitgeist/0,1518,druck-624065,00.html# check that out
Wolfram Alpha is a joke. Google Squared looks pretty cool if it works.
Numbers are replacing the text. I see this as search engines finding numbers with the relevant text. Functions: Text (independent variable) and number (dependent variable).
Hah!: "How about real time?" "I’m sorry?" "It’s just a joke" – video, 5:02.
This has nothing to do with real time
one good locator could crush wolfram.
Eric thats some of the best commentary i have heard regarding the evolution of search.
“neatly labeled, stamped, and categorized”. you took the words right out of my mouth.
G does not want to be a major gateway destination when approx. 65% of income comes from other people’s sites. you will never see this functionality on their home page. common sense, logic and organization do not work well with G’s core revenue model, the adsense syndicate network.
CategoryLocator.com – select yourself
Mr. Locator, can I have a drink with you when I visit LA next time? I admire your persistence and that’s a true character of being an entrepreneur. Good luck with the launching of your product.
i appreciate your consideration and insight. since i have been at TC i have become stronger at articulating my position.
now more than ever i know for a fact that there is no startup in existence that can compete with the MyLocator “promise or potential.”
CharacterLocator.com – personality matters
Sup, dawg…
I want to meet you. I am interested in doing character studies of people who are completely delusional.
Re: "monopoly threats to Google" – heh – "Google the government" might take on a whole new meaning.
Real time: I do not know how often Google crawls the web or whether it uses some other method, but it is surely caching FriendFeed pretty fast, since when I google my name it shows my FriendFeed comments made today. I think real time is not too far off.