Deconstructing Twitter

Today we get the latest installment in the ongoing, iterative back-filling explanation of Twitter and XMPP. Dave Winer’s intuition that FriendFeed is one of several (four) recipients of the XMPP data stream turns out to be correct. In a post this afternoon, Biz Stone confirms Summize and adds Twittervision and Zappos as the others. With its current subscriber base, FriendFeed makes only small demands, but its rapid growth in recent weeks suggests that will become more of a problem as time goes by.

Stone suggests Summize has “greatly improved on Twitter’s Track feature” with its extended filtering capability. More accurately, it has improved on a search function that Twitter does not have. Track’s real-time display of keyword “hits,” however, enables conversations that Summize only simulates. Track, when combined with XMPP over IM (Gtalk/Gchat in my case), allows instant replies as hits present themselves, whereas Summize has no direct input to Twitter and instead calls the Web application for replies to search results.

A workaround for that limitation would be to allow the Summize stream to be syndicated to other services, something that appears to be working in some fashion via Twhirl’s implementation. Clicking on the Spyglass icon produces an input window to query Summize’s server. But you have to enter terms manually, and the results are neither delivered nor updated in real time, while the possibility of a conversation still ephemerally exists.

It’s difficult to tell, in this query-by-blog-post communication with the mother ship, just what the deals are with the four favored sippers from the firehose. It may just be a design decision on Summize’s part as to how to “greatly improve” on Track, but the other possibility, suggested by earlier statements on the subject, is that Summize is at least somewhat constrained in how it can pass along the stream to other parties.

Specifically, is the constraint (if it exists) that the more ongoing requests there are for standing searches (Track), the more stress is placed on Twitter’s core? We know the average interval between a message being posted on Twitter and its surfacing in Summize is somewhere around 30 to 60 seconds. It’s hard to say, without testing a service that is no longer available, what the interval between posting and appearing on Twitter is, but a followed post sometimes shows up in as little as 10 seconds.

A reasonable assumption can be made that Summize is recording the XMPP stream and then searching it, separating any additional load from Twitter’s servers. Therefore, there is no reason other than business considerations why Summize could not provide access to its search services to other parties. But Twitter continues to offer language that can only be read as an evasion of the fundamental issue.

“While the XMPP feed of the full Twitter Public Timeline is an amazing resource, drinking from the fire hose is not the best way to quench a thirst. With continued updates and refinement, our API will support most scenarios in a way that preserves overall system performance.”


Yes, Track rocks. But doing what Track does is not the best way to do what Track does. We will continue to try and figure out how to provide fixes and improvements to Twitter without crashing our system. Even though we have no additional demands on our system if Track (or super-fast search) is offloaded to another party, we’re not going to do that because we want to reduce the Track capability from its current amazing status to one that we control through our API where we can limit the business case to “most” scenarios, in other words, scenarios that are less than amazing.
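The offloading argument above can be made concrete with a small sketch. It assumes, as the post does, that a third party records the stream once and then serves both standing searches (Track-style pushes) and on-demand searches from its own copy. Every name here (`Message`, `RecordedStream`, `track`, `record`, `search`) is hypothetical and invented for illustration; none of it reflects how Twitter or Summize actually work.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    user: str
    text: str


@dataclass
class RecordedStream:
    """Hypothetical third-party copy of the public stream."""

    # Every message received from the upstream feed, in arrival order.
    log: List[Message] = field(default_factory=list)
    # keyword -> callbacks to push each hit to (the Track-like part).
    standing: Dict[str, List[Callable[[Message], None]]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def track(self, keyword: str, on_hit: Callable[[Message], None]) -> None:
        """Register a standing search; hits are pushed as they arrive."""
        self.standing[keyword.lower()].append(on_hit)

    def record(self, message: Message) -> None:
        """Store one upstream message and push it to matching standing searches."""
        self.log.append(message)
        words = set(message.text.lower().split())
        for keyword, callbacks in self.standing.items():
            if keyword in words:
                for callback in callbacks:
                    callback(message)

    def search(self, keyword: str) -> List[Message]:
        """On-demand search over the recorded copy only."""
        keyword = keyword.lower()
        return [m for m in self.log if keyword in m.text.lower().split()]


# Usage: one upstream delivery per message, however many searches follow.
stream = RecordedStream()
hits: List[Message] = []
stream.track("xmpp", hits.append)

for incoming in [
    Message("alice", "Testing XMPP over IM"),
    Message("bob", "Lunch time"),
    Message("carol", "xmpp track is back"),
]:
    stream.record(incoming)

print([m.user for m in hits])
print([m.user for m in stream.search("lunch")])
```

In this toy version, adding more standing searches or more queries only adds work inside `RecordedStream`; the upstream source still delivers each message once. Whether that holds for the real services is exactly the question the post is asking.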

Despite Twitter’s best efforts to restrict access to their data, a third-party developer has created a service called which implements Track using Summize data. Regardless of the effort Twitter makes in controlling this data, there is so much user demand for these features that developers will always find a way around it. Twitter would be better served by opening this data up rather than attempting to control it and fighting both the developer and user communities.