The Archive Team, the same group who once saved around 900 GB of Geocities content before Yahoo shut it down, later releasing it in torrent format, is now focusing on archiving content from another Yahoo property preparing to hit the chopping block: Upcoming.org. The collective’s involvement in saving the index of the events database follows an impassioned plea from creator Andy Baio, who recently explained how Yahoo’s security made it difficult to back up the site’s content by simply scraping pages.
His post on Friday – well worth the read for the personal insight on what it was like to watch Yahoo slowly destroy his startup following the acquisition – has him calling the choice to sell to Yahoo “a horrible mistake,” and Yahoo “a particularly horrible steward for the community” Upcoming.org had built.
Baio doesn’t mince words in his post, saying that the team arrived at Yahoo hopeful, but soon discovered that the tech company wouldn’t live up to its promises. He wrote:
It wasn’t clear how dysfunctional the rest of Yahoo was until we’d settled in, and there was no indication how horrible they’d soon become in the years to follow. This was long before they gave up dissidents to the Chinese government, closed Geocities, weaponized their patents, “sunsetted” Delicious, and a number of other awful decisions.
The founder, also known for having built Playfic and Supercut, and helping to build Kickstarter, among other things, refers to Yahoo’s decision to shutter Upcoming.org as “Yahoo’s typical f***-off-and-die style,” noting that the company offered only eleven days of notice and no way to back up past events.
He then detailed some of the technical challenges involved with attempting to archive from Upcoming, saying that even though its events and venues use auto-incremented IDs, Yahoo security measures only allow for scraping a few pages using curl or httrack before Yahoo starts serving up blank responses.
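To make the problem concrete, here is a minimal Python sketch of walking auto-incremented IDs while treating an empty body as a sign of throttling. It is an illustration only, not Baio's or the Archive Team's actual approach: the URL pattern, the `crawl_ids` function and its parameters are all hypothetical, and nothing in Baio's post suggests that simply backing off would get past Yahoo's measures.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical URL pattern; the real Upcoming.org paths are not given in the post.
URL_TEMPLATE = "http://upcoming.example/event/{id}/"


def crawl_ids(
    ids: Iterable[int],
    fetch: Callable[[str], str],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Dict[int, str], List[int]]:
    """Fetch one page per ID, retrying with exponential backoff on blank bodies.

    `fetch` is any caller-supplied function that takes a URL and returns the
    response body as text. Returns (pages keyed by ID, IDs that stayed blank).
    """
    pages: Dict[int, str] = {}
    failed: List[int] = []
    for event_id in ids:
        url = URL_TEMPLATE.format(id=event_id)
        for attempt in range(max_retries + 1):
            body = fetch(url)
            if body.strip():
                # Non-empty response: keep it and move on to the next ID.
                pages[event_id] = body
                break
            if attempt < max_retries:
                # Blank response: wait 1x, 2x, 4x ... base_delay, then retry.
                sleep(base_delay * 2 ** attempt)
        else:
            # Every attempt came back blank; record the ID for a later pass.
            failed.append(event_id)
    return pages, failed
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client, so the IDs in `failed` can be handed to a different machine or retried later.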
Baio then asks the community for help with getting a dump in any form.
As it turns out, the community had already stepped in. Baio tells us that he heard about the closure from friends, and then posted about it on the afternoon of the 19th. But after checking out this GitHub repository, he realized that the Archive Team had already started rescue work of its own.
And now, the Archive Team needs your help, too, if you’re at all inclined.
The team, led by computer historian Jason Scott, has experience pulling down content from not only Geocities, but also other sites and services like Friendster, MobileMe, and Fortune City. It’s also currently working on Posterous, Formspring, plus Gamespy, 1up, UGO, and IGN, to name a few.
To help with Upcoming.org’s archival efforts, interested participants can download and install ArchiveTeam Warrior for Windows, Mac or Linux. The software is a virtual appliance that lets anyone archive websites, then back up the saved content to Archive Team servers.
After you install the application, just choose the “Upcoming” project to begin the backup process, or choose “ArchiveTeam’s Choice,” which allows the team to adjust the software’s focus on demand. You can track progress of the Upcoming.org archival process here.
Here’s how the software works, an explanation courtesy of Baio himself:
Baio says that the Archive Team’s software also addresses the technical challenges he had previously encountered, noting that “they’re very thorough.”
And he adds that while he has no plans to ever relaunch Upcoming, he does want to put the saved content back online. “I’m hoping to set up a permanent archive for posterity and some visualizations of the site’s activity over the last decade,” Baio explains.
If you have spare cycles, donating them to the archive is a lot more beneficial than bitcoin mining these days, that’s for sure.