
The following guest post was written by Dan Birdwhistell, founder of people directory Bigsight (reviewed here) and creator of Hacking Facebook, a website that teaches developers how to pull user data out of Facebook.
There’s one thing about Facebook that most people still seem to have wrong: that it’s a walled garden. Quite the contrary: the Platform allows for full data portability and has since its inception.
The problem is that this knowledge is buried deep within the FB documentation, a place few developers have wandered. For whatever strange reason, legal documents are like amusement parks for me, so I’m now fairly well acquainted with the ins and outs of porting data (and users) out of FB. So that’s what this whole post is about: To show you how it’s done.
Background
Once we got our heads around the Platform back in October 2007, we hacked together FriendCSV as a demonstration. This is an app that allows you to export your full social graph (and all friend data) to your hard drive, all in accordance with FB policies. After people got comfortable with this, we took it a step further by allowing users to instantly port their own personal data into bigsight to create a new profile and account. Test out our importer here.
Why Facebook and the Platform are important
We believe FB is architecting the next version of the web. This is a bold claim, no doubt, but here’s the thinking:
The result is a web based on users and not content, with an individual’s FB ID ultimately serving as his chief tour guide, passport, and keymaster (but not like Vinz Clortho) around the rest of the web. If I am right, FB will become king – not as a social network, but as the architect, owner, and manager of the next version of the web. The point: you need to know how FB works and how you can leverage the Platform to grow your site or business. So here we go…
Understanding how FB Data is structured
Before you go messing around in the pool house, you’ll need to get your head around how everything is structured. It’s best to first focus entirely on non-user data, given that these are the permanent structures users “claim”. Each of these elements has a unique ID, and entry fields typically use auto-complete to ensure data alignment.
So exactly how much data can you export?
Stated simply, you can touch basically everything but a user’s contact information. So here’s the list, including how the data is structured in its output. We’ll address friend lists and data in a moment.
| Data Element | Export Format |
| --- | --- |
| UID | Permanent |
| First name | Free form (ff) |
| Last name | ff |
| About me | ff |
| Activities | ff |
| Birthday | Day, Month, Year (1900-2008) |
| Books | ff |
| Colleges | Up to five: name, type, degree, concentration, grad year |
| Hometown | “City, State” or “City, Country” if outside the US |
| High school | Up to two: name, grad year |
| Interests | ff |
| “interest sex” | Male or female |
| “interest meeting” | Friendship, Dating, Relationship, or Networking |
| Location | “City, State” or “City, Country” if outside the US |
| Movies | ff |
| Music | ff |
| # of notes | # |
| # of wall posts | # |
| Networks | Up to four: Region, High School, College, Work |
| Photo albums | All pictures + tags, titles, etc. |
| Pictures | Misc. pictures + tags, etc. |
| Political affiliation | Party name |
| Profile pictures | 50×50, 50×150, 100×300, or 200×600 |
| Profile update time | Date, time |
| Quotes | ff |
| Relationship status | Single, in a relationship, engaged, married, it’s complicated, open relationship |
| Sex | Male or female |
| ID of significant other | UID |
| Status message | ff + date/time |
| Timezone | # offset from GMT: “-6” for Nashville, for instance |
| TV shows | ff |
| Work history | Up to 15 companies: name, position, description, location, duration |
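To make the table a little more concrete, here is a rough sketch of how a handful of those fields might be held in memory and written out as CSV, in the spirit of FriendCSV. The `Profile` struct, the `profiles_to_csv` helper, and the sample people are all made up for illustration; this is not FriendCSV's actual code, and it only uses Ruby's standard `csv` library.

```ruby
require "csv"

# Hypothetical in-memory shape for an exported profile. The field names
# follow the table above, but this Struct is not part of any Facebook library.
Profile = Struct.new(:uid, :first_name, :last_name, :hometown, :timezone)

# Build a CSV string with one header row and one row per profile.
def profiles_to_csv(profiles)
  CSV.generate do |csv|
    csv << Profile.members.map(&:to_s)
    profiles.each { |profile| csv << profile.to_a }
  end
end

# Made-up sample data, only here to exercise the helper.
friends = [
  Profile.new(1001, "Ada", "Lovelace", "Nashville, TN", -6),
  Profile.new(1002, "Alan", "Turing", "London, United Kingdom", 0)
]

puts profiles_to_csv(friends)
```

Note that values such as “City, State” contain a comma, which is why the sketch leans on the CSV library for quoting rather than joining strings by hand.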
In addition to these core profile elements, you can also make calls for and then export huge amounts of data through:
Now about friend lists: As you’ll see when you use FriendCSV, you can not only access all of the above for a single user, but you can also access the same data from their friends. Pretty crazy, right? This means that by touching one user you can instantly touch thousands more. But hold on now…time to talk Privacy.
Understanding FB Privacy, Terms of Service, and Platform Documentation
There are five key documents that come into play re: data portability on FB. Taken alone, each is hard enough to understand – taken together, it’s downright labyrinthine. As a developer, though, there are really only four things you need to know:
“If you, your friends, or members of your network use any third-party applications developed using the Facebook Platform, those Platform Applications may access and share certain information about you with others in accordance with your privacy settings…
…in addition, third party developers…may also have access to your personal information (excluding your contact information) if you permit Platform Applications to access your data.”
“You may retain copies of Exportable Facebook Properties for such period of time (if any) as the Applicable Facebook User for such Exportable Facebook Properties may approve, if (and only if) such Applicable Facebook user expressly approves your doing so pursuant to an affirmative “opt-in” after receiving a prominent disclosure of a) the uses you intend to make of such Exportable Facebook Properties, b) the duration for which you will retain copies of such Exportable Facebook Properties, and c) any terms and conditions governing your use of such Exportable Facebook Properties (a “Full Disclosure Opt-In”).”
This is a bit wordy, so we’ll translate: If you outline which data you’ll use, how you’ll use it, for how long, what other terms the User might be subject to, and get User consent, then you can keep and use profile information for as long as you want.
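One way to keep yourself honest here is to store the opt-in alongside the data. The sketch below is only an illustration of that idea: the `OptIn` struct and its field names are our own invention (nothing here comes from FB), and it simply checks that all three disclosures were recorded and that today still falls inside the approved retention period.

```ruby
require "date"

# Hypothetical record of a user's "Full Disclosure Opt-In".
# None of these names come from Facebook; they mirror the three disclosures
# in the quoted policy: uses, duration, and governing terms.
OptIn = Struct.new(:uid, :uses, :retention_days, :terms, :approved_on) do
  # All three disclosures must be present and the user must have approved.
  def complete?
    !uses.to_s.empty? && retention_days.to_i > 0 && !terms.to_s.empty? && !approved_on.nil?
  end

  # Stored copies may be kept only inside the approved retention window.
  def may_retain?(today = Date.today)
    complete? && today >= approved_on && today < approved_on + retention_days
  end
end

opt_in = OptIn.new(1001, "Create a bigsight profile", 365, "Site terms of use", Date.new(2008, 6, 1))
puts opt_in.may_retain?(Date.new(2008, 12, 1)) # inside the 365-day window
puts opt_in.may_retain?(Date.new(2009, 7, 1))  # after the window has closed
```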
So the main lesson here is that you shouldn’t be afraid of the various policies and documents because they are outlined to help you rather than restrict you. But again… a note about friends’ data. FB has been incredibly aggressive in policing how developers are accessing and using these data, and rightfully so. Last week they shut down the Top Friends app for allowing too much data access and earlier this year they canned Google Friend Connect because it didn’t operate in accordance with their policies.
I’ll say again that they were right to do this. When thinking through how to port users, you should be mindful not just that FB might shut you down, but that a secondary friend who doesn’t opt in to your site probably should be left alone. More than likely, he doesn’t want what you’re selling. Of course, there are ways around this if you want to brute force it, but we’ll just keep that to ourselves. So let’s keep going…
Setting up the Application(s) and managing the exports
Your importer can be inside FB as part of an application or it can exist as a standalone. We do it both ways. With FriendCSV, users install the app and we then direct them to their new profile as an add-on; meanwhile, out in the ether, we have a dedicated portal at http://fb.bigsight.org that directs users to FB for initial authentication, but then kicks them right back to our web app. If you already own a great app with lots of traffic, start there. If not, it’s probably best to set up your porter out on the web. Exporting the key data for a single user doesn’t take too long, so you can typically create a new page/account for them instantly. However, if you plan on exporting an element like friends lists (careful, hoss) or photos, you’ll need to batch up FQL requests when possible and also be open to allowing some processes to happen in the background.
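As a rough illustration of what batching FQL requests can look like, the sketch below groups UIDs into chunks and builds one query per chunk instead of one per friend. The helper name and the batch size of 50 are our own assumptions (not a documented limit), and the code only builds the query strings; you would still hand each one to whatever FQL call your client library provides.

```ruby
# Assumption: 50 UIDs per query is an arbitrary batch size, not a documented limit.
BATCH_SIZE = 50

# Build one FQL query per batch of UIDs instead of one query per friend.
# The helper name and default field list are illustrative only.
def batched_profile_queries(uids, fields = %w[uid first_name last_name], batch_size = BATCH_SIZE)
  uids.each_slice(batch_size).map do |batch|
    # Integer() raises on anything that is not a numeric ID, so stray input
    # never ends up inside the query text.
    id_list = batch.map { |uid| Integer(uid) }.join(", ")
    "SELECT #{fields.join(', ')} FROM user WHERE uid IN (#{id_list})"
  end
end

queries = batched_profile_queries((1..120).to_a)
puts queries.length
puts queries.last
```

With 120 UIDs and a batch size of 50, this yields three queries, which are natural units to push onto a background job queue.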
The FB API is “REST-like,” which means it can be used by anything that handles standard HTTP requests. Libraries exist for PHP, Java, Ruby, and other languages that make the API easier to use. The following example code is for Ruby on Rails and the Facebooker library, as that’s what we use at bigsight. No matter which language you choose, writing FB applications to extract data is surprisingly easy. One line of code will tell your application to authenticate with FB. Simply add “ensure_authenticated_to_facebook” to your Rails controller and it will send your user to the FB login page if needed, and return them to your application. From that point on you have full access to the FB user and all exportable data. Here’s one example of how to extract educational history:
def gather_schools
  # Create a local copy of the Facebook user
  @user = User.create(:name => @fb_user.name, :fb_uid => @fb_user.uid)

  # Load the user's schools
  @fb_user.education_history.each do |fb_school|
    School.create(:name => fb_school.name, :user_id => @user.id)
  end
end
For a full view of the FQL queries, check out this page in the documentation.
Integrating FB Data into an Existing Third Party Site
Now that you know what the data look like and how to access them, you need to think through a few things to figure out how to integrate it all with your site or widget. These are the questions to ask:
Basically, get creative. It’s almost silly how many cool things can be done here.
Conclusion
Like I said above, we believe that FB is on the path to doing something amazing with the web, and we believe that everyone in the industry needs to know how to not just adapt to it, but also thrive from (and alongside) it. It should be an interesting summer re: the web as Facebook Connect launches and more and more people begin leveraging this and the Platform for utility rather than blind user engagement.
Our opinion is that while FB Connect will offer some amazing functionality with regard to quick user integration and syncing, it likely won’t be as powerful as the Platform in terms of data access. Either way, these developments will not only change how users interact with third party sites, but they will also raise the bar for user experience as individuals accustomed to the FB UI will begin to demand increased alignment. Soon we’ll likely see businesses start to build sites on the back of FB rather than a) going out on their own or b) doing what could prove to be complicated integration. Additionally, we’ll probably also find resolutions to a few ongoing discussions and questions such as who owns a friend list and how what FB calls “dynamic privacy” actually works out in the wild.
It’s all pretty interesting stuff to think through and incredibly fun to see it all come together so quickly. Creative destruction all around, you know. Lots of warriors in the arena. ARE YOU NOT ENTERTAINED?