Google has Google Trends, Twitter has trending topics, and now so does Wikipedia. Pete Skomoroch, a Senior Research Scientist at LinkedIn and blogger at Data Wrangling, built a trending topics page for Wikipedia. The homepage ranks the top 25 Wikipedia articles with the most pageviews over the past 30 days, as well as the fastest-rising articles in the past 24 hours.
Some of the most popular Wikipedia articles in the past month include ones on the Perseids meteor shower, Danish physicist Hans Christian Ørsted, director John Hughes, and G.I. Joe: The Rise Of Cobra. These are quite different from the types of search trends you would find on Google Trends or the real-time trending topics on Twitter. Even the trending topics over the past 24 hours (District 9, Woodstock Festival, Usain Bolt, Gina Carano) are quite different from the hot searches on Google. And, no, I have no idea why Perseids was the top trending topic last month; it is usually visible in the summer.
You can search for any topic, and you will get a chart showing pageview trends, along with the actual article placed in an iframe below the chart. It’s as good a way as any to explore Wikipedia. The site is built on Cloudera’s version of Hadoop.
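For readers curious how the two homepage rankings could be computed, here is a minimal Python sketch. It is not the site's actual code, which runs on Hadoop; it works in memory and assumes a hypothetical record format of `(article, day, views)` tuples. The function names, the record format, and the definition of "rising" as the day-over-day change in views are all assumptions made for illustration.

```python
from collections import Counter


def top_articles(daily_views, n=25):
    """Rank articles by total pageviews across all records supplied.

    daily_views: iterable of (article, day, views) tuples, where day is an
    integer day index (a hypothetical format, not the site's real schema).
    """
    totals = Counter()
    for article, _day, views in daily_views:
        totals[article] += views
    return [article for article, _total in totals.most_common(n)]


def fastest_rising(daily_views, today, n=25):
    """Rank articles by the increase in views from the previous day to today.

    "Rising" is assumed here to mean the raw difference in daily views;
    the real site may use a different trend measure.
    """
    today_views = Counter()
    prev_views = Counter()
    for article, day, views in daily_views:
        if day == today:
            today_views[article] += views
        elif day == today - 1:
            prev_views[article] += views
    # Counter returns 0 for articles with no views on the previous day.
    rise = {a: today_views[a] - prev_views[a] for a in today_views}
    return sorted(rise, key=lambda a: rise[a], reverse=True)[:n]
```

Feeding `top_articles` 30 days of records would approximate the monthly list, while `fastest_rising` compares only the two most recent days. On a cluster, the same per-article sums would more naturally be expressed as a map step that emits `(article, views)` pairs and a reduce step that adds them up.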
Wikipedia is a Wikimedia Foundation project to build free encyclopedias in all languages of the world. Virtually anyone with Internet access is free to contribute neutral, cited information. As of March 2008, Wikipedia is offered in 250 languages.
Google provides search and advertising services, which together aim to organize and monetize the world’s information. In addition to its dominant search engine, it offers a plethora of online tools and platforms including: Gmail, Maps, YouTube, and Google+, the company’s extension into the social space. Most of its Web-based products are free, funded by Google’s highly integrated online advertising platforms AdWords and AdSense. Google promotes the idea that advertising should be highly targeted and relevant to users thus providing...
Hadoop is a free Java software framework that supports distributed applications running on large clusters of commodity computers that process huge amounts of data. It is a top-level Apache project and was originally developed to support distribution for Nutch. Hadoop consists of an open source implementation of Google’s published computing infrastructure, specifically MapReduce and the Google File System (GFS).