Next Year's Headline: Microsoft fails to do anything significant with Powerset

There is one glaring detail that everyone who reported on the Microsoft-Powerset acquisition has failed to mention: Powerset is a Unix based company. All of its core components – its applications, its packaging system, and its continuous integration system – were designed and written to run on a Unix platform.

As a large software company, Microsoft is a big supporter of its own craftsmanship. Unsurprisingly, all of Microsoft’s products, including Live search, run on Windows servers. Thus we come to the source of the Microsoft-Powerset incompatibility. Porting software from Unix to Windows is not impossible, but there is no denying that it is a hellish task. Powerset has developed a large repository of code over the last three years, and naturally it has opted to use many open source projects in its development stack, all of which would have to be ported as well. A large chunk of the software that it uses is already cross-platform, but the pieces that are not present a real problem.

When Powerset launched, developer Kevin Clark gave a nice shout-out to the open source technologies that Powerset relies on. The state of each technology’s compatibility with Windows is given below.

Windows Compatibility of the open source technologies Powerset uses

Hadoop/HBase: Requires Cygwin to run on Windows.
Ruby: Runs slowly on Windows. A good opportunity for Microsoft to put IronRuby to use?
Ruby on Rails: Will run anywhere Ruby will run. IronRuby reached the Rails singularity recently.
Merb: Same situation as Rails. No word yet on IronRuby compatibility.
God: Does not run on Windows, period, nor is Windows support planned. However, the good samaritan who wrote God happens to work for Powerset. Maybe they could beg him to make a Windows version.
Mongrel: Fully supported on Windows.
MooTools: JavaScript. Its battle is in the browser, not the OS.
Erlang: Good Windows support!
YAWS: Erlang-based, so it also works on Windows.
Memcached: There is an unofficial port for Windows that should not be used in production environments.

To sum it up, anything requiring Cygwin is not going to be capable of operating in a high-traffic production environment. The official Ruby interpreter on Windows is too slow, and IronRuby is too new and not sufficiently field-tested to be placed behind Microsoft’s search in any capacity. Using God is not an option, so they are going to have to find some other way to manage their processes on Windows. Replacing Memcached will not be difficult. Finally, Erlang appears to be in good standing on Windows, but the transition will still be rough.

Getting Ruby into an optimal state is important if Microsoft wants to leverage Powerset’s existing IP. According to an interview on the Ruby on Rails podcast, “all parts of the organization use Ruby”, including all of their natural language parsing development and at least 60% of their internal tools.

The most likely scenario is that Microsoft will ditch the majority of Powerset’s code and task the team with porting the core technology and implementing the necessary tie-ins to Windows Live search – an assignment that the Powerset engineers may take on resentfully, depending on which platform they feel is superior. Powerset’s core is realistically the semantic search technology it picked up from PARC and has expanded upon. In other words, Microsoft really wants the PARC technology, but it bought Powerset “for their people” because Powerset’s engineers are the only ones who really understand the PARC stuff.

The claim by Microsoft that aspects of Powerset’s semantic search will show up in Live search by the end of the year is balderdash that Microsoft is broadcasting to its shareholders to help justify its 100 million dollar purchase. Switching onto the Windows platform is no small task, and switching to an unfamiliar closed platform magnifies the complexity. However, it is rumored that Microsoft runs some BSD servers internally, in which case the Powerset integration might actually be possible. In the end, this acquisition feels like an R&D experiment for Microsoft. Semantic search is still a gray area of computer science, and Microsoft is taking a chance at winning the lottery by picking up Powerset.

  • Stephan

    I don’t think Microsoft will throw out code they paid some coin for just because the operating system is not their own.

  • Batchu

    I don’t agree with that. Did Microsoft throw away the Hotmail source code? I am sure Hotmail was not written in Microsoft technologies either. Even if Microsoft wants to throw away Powerset’s code, they will do it after integrating it with their search over a period of time.

  • Rob Olson

    @Batchu – Microsoft purchased Hotmail in 1998. Hotmail ran on a mixture of Solaris and FreeBSD systems until late 2001, when it was reported that it had finally been moved to Windows NT. That was around 3 years. I don’t think it was until 2005 that Hotmail was rewritten from scratch using Microsoft technologies.

    Hotmail was a standalone system for the most part. Microsoft plans to integrate Powerset’s NL search into their Live search in at least some aspect by the end of the year. For that to be feasible Powerset will continue running on their Unix platform for the foreseeable future.

  • Jim

    Microsoft has used Linux in the past on both current and legacy projects, and I doubt they are going to re-develop Powerset (at least right off the bat) just to conform to their standards.

    @rob – Microsoft can keep Powerset on its current platform while maintaining a high level of integration via low-level APIs.

    On a dreamier note: maybe, just maybe (we can all hope), this will trigger a Visual Ruby. :D

  • Rob Olson

    @Jim How would a “Visual Ruby” be different from their current efforts with IronRuby?

  • Jesse Ezell

    Keep in mind that Microsoft doesn’t have to rewrite a single bit to use PowerSet functionality. The Ruby on Rails part doesn’t mean jack, since Microsoft will be rewriting the UI to match the rest of the properties anyway. That’s an easy task. The complex code lives behind the UI and they don’t have to rewrite that any time soon. All they have to do is pop in a webservice or two on those unix servers and they can access everything PowerSet has to offer quite easily from Windows servers.

    Memcached is just distributed caching tech, wouldn’t be a big deal to run a bunch of unix servers to handle caching regardless of the platform, since the communication with the cache happens with simple network calls. There are plenty of memcached .NET client implementations that can talk to unix memcached servers.

    As for rewriting, IronRuby would be sufficient. Already, it is significantly faster than the latest release of Ruby for many things, and they haven’t really done a lot of perf tuning. Most likely, just diverting resources to work on IronRuby perf / stability would be a much faster solution than trying to rewrite anything. Microsoft loves using beta technology on their servers (large amounts of traffic were running on Windows 2008 under Virtual Server while it was still beta), so it’s silly to think they wouldn’t try something like this. In fact, porting any of the code from Ruby would just be pointless, since it would provide no real value.

  • GirlPhobia

    I completely agree that Powerset must be assimilated before Microsoft will consider it to be theirs. That probably means throwing away whatever tangible stuff they’ve built and redoing it all in C#, ASP.NET, Silverlight, etc. However, Microsoft is pretty experienced at pissing away money to convert open source web sites to proprietary ones over several years. Remember that when they bought Hotmail it was all running on BSD Unix servers. So who knows, maybe we’ll see Powerset on some time in 2012.
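
A point raised in the comments above is that the cache could stay on Unix while Windows servers talk to it with simple network calls. The sketch below illustrates why that is plausible: memcached speaks a plain-text protocol over TCP, so a client on any operating system only needs a socket. This is a minimal illustration, not Powerset's or Microsoft's code. The function names are made up for this example, and the host and port passed to `fetch` are whatever a hypothetical cache server happens to use.

```python
import socket


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol 'set' request for one key."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Build a memcached text-protocol 'get' request for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes):
    """Parse a single-key 'get' reply; return the value, or None on a miss."""
    if data == b"END\r\n":
        return None
    header, _, rest = data.partition(b"\r\n")
    parts = header.split()
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply from server")
    size = int(parts[3])  # the header carries the payload length in bytes
    return rest[:size]


def fetch(host: str, port: int, key: str, timeout: float = 2.0):
    """Fetch one key from a memcached server (host and port are assumptions).

    Sketch only: a payload that itself ends in b"END\\r\\n" at a chunk
    boundary could end the read loop early.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get(key))
        reply = b""
        while not reply.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            reply += chunk
    return parse_get_response(reply)
```

Nothing in this exchange depends on the operating system at either end, which is the commenter's argument: the client side could just as well be one of the existing .NET memcached clients running on a Windows server.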
