OpenProvider

Friday, May 7, 2010

Intro

Airplane rides are always a good time for reflection—there aren’t really any distractions (unless you’re unlucky enough to be sitting next to a crying baby) and the white noise of the engines provides for a nearly perfect brainstorming environment.

The main thing on my mind last weekend was the current state of the social web—how it’s broken and how best to “fix” it.

Facebook

One of the main problems people have with the system as it is today is that the state of your information is in flux: there is no way to tell whether the service you’re using today will have the same privacy policy tomorrow. Facebook is especially guilty of this alarming trend, and others are likely to follow. For an illustrative timeline of Facebook’s privacy policy, check out the post from the EFF.

The F8 conference two weeks ago unveiled Facebook’s intention to own everything you do on the Internet, from the movies you watch on IMDB to the music you listen to on Pandora. Understandably, a lot of people involved in the tech industry have been deleting their Facebook accounts left and right. There is a lot of misunderstanding regarding why this is happening, and no, it isn’t because of the lack of privacy. The reason is that Facebook was sold to us on the promise that we would just be sharing our information with our friends and family, and not with advertisers and “partners”.

The Plan

For all the reasons mentioned, in addition to many others, there has been an enormous push over the past few days to change the way we think about social networking. One project that’s gotten a lot of traction is Diaspora, a soon-to-be open source project being championed by four computer science students at NYU. No one has any real idea of what it is or how it will work, but I’m sure that we’ll be hearing more about it soon.

I do think what they’re doing is awesome, and I hope it works. I just wanted to get my thoughts down on paper before I forgot about them (as I frequently do). What follows is my plan for a distributed social protocol.

First of all, I’m going to go over the “must haves”—things that the ideal social network should have, whether it be distributed or centralized.

1. Accessibility (or Virality)
2. Privacy
3. Low barrier to entry
4. Portability

Accessibility and Virality

The first step in using a social network appropriately is finding your friends, and there’s no way to do this if all the searchable information is hidden. Complying with the privacy requirement (#2 above) makes this extra difficult.

Well, there’s a way to solve it at the expense of losing “fuzzy search”, and that solution is hashing. A hash function maps a piece of data of any size to a short, fixed-size value; a cryptographic hash function is additionally designed to be impractical to reverse. E.g.

>>> from hashlib import sha1
>>> sha1("[email protected]".encode("utf-8")).hexdigest()
c4080eb3969b7c95c5e38d563e15bd2407e35153

That value is not practically reversible. There is no known function that will take c4080eb3969b7c95c5e38d563e15bd2407e35153 as a parameter and return [email protected]. Hashing email addresses in this manner hides them from casual harvesting by spammers while still allowing people to search for them. One caveat: anyone who already holds a list of candidate addresses can hash each one and compare, so the hash only protects addresses that are hard to guess.

“How can you search for something that you can’t decode?”, you ask. Well, the cool thing is that when people search for you, the search query is itself encoded in the same way, so if the encoded search query is equivalent to the encoded email address, then there’s a match! For more info on hashing, check out Wikipedia on the subject.
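The matching step can be sketched roughly as follows. This is a minimal illustration, assuming SHA-1 over a trimmed, lowercased address; the normalization rule, the names `hash_email`, `find_user`, and `directory`, and the example addresses are all assumptions made for the sketch, not part of any agreed protocol.

```python
import hashlib


def hash_email(email):
    """Return the SHA-1 hex digest of a normalized email address."""
    # Assumption: trim and lowercase first, so that "Alice@Example.org "
    # and "alice@example.org" produce the same hash.
    normalized = email.strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


# Hypothetical directory: only hashes are stored, never raw addresses.
directory = {hash_email("alice@example.org"), hash_email("bob@example.org")}


def find_user(query_email):
    """Hash the search query the same way and test for an exact match."""
    return hash_email(query_email) in directory


print(find_user("Alice@Example.org"))  # True: same address after normalization
print(find_user("alice@example.com"))  # False: a near miss hashes to something unrelated
```

The second lookup shows the cost mentioned earlier: a query that is off by even one character yields a completely different hash, so only exact matches are possible.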

This leads into my proposal that there be a global identity database that is accessible and downloadable by anyone (addresses #3 above). It will store the encoded email (as described), plus two other pieces of information necessary to complete the puzzle.

email (hashed, as described above)

provider

So what’s a “provider”, you may ask? A provider is a social web service, like Facebook or Twitter, that controls a user’s information.

If the maintainers of the central node start turning to the dark side, everyone can and will abandon them, since others already have a copy of the data. And the provider field is in itself worthless to a spammer.
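To make the idea of a downloadable identity database a little more concrete, here is a rough sketch. It models only the two fields named above (hashed email and provider); the record layout, the JSON dump format, the provider URLs, and every function name are assumptions for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class IdentityRecord:
    email_hash: str  # SHA-1 hex digest of the address, never the address itself
    provider: str    # hypothetical provider base URL


def make_record(email, provider):
    """Build a record from a raw address; only the hash is kept."""
    digest = hashlib.sha1(email.strip().lower().encode("utf-8")).hexdigest()
    return IdentityRecord(email_hash=digest, provider=provider)


def dump_database(records):
    """Serialize the whole database so anyone can download and mirror it."""
    return json.dumps([asdict(r) for r in records], sort_keys=True)


def load_database(blob):
    """Rebuild a lookup table (hash -> provider) from a downloaded dump."""
    return {row["email_hash"]: row["provider"] for row in json.loads(blob)}


records = [
    make_record("alice@example.org", "https://provider-a.example"),
    make_record("bob@example.org", "https://provider-b.example"),
]
mirror = load_database(dump_database(records))

# Anyone holding a mirror can resolve an address to its provider.
bob_hash = make_record("bob@example.org", "").email_hash
print(mirror[bob_hash])  # https://provider-b.example
```

Because the dump contains nothing but hashes and provider names, a mirror can take over if the central node misbehaves, and nobody has to trust a single maintainer.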

Connections

A connection is a one-way link between two users, regardless of the provider. So if I have an account with provider A and my friend has an account with provider B, I will still be able to connect with him. In essence, all profiles function like public Twitter profiles. Private profiles sound like a good idea but I’m at a total loss as to how they might be implemented.
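A one-way connection could be modeled along these lines. This is only a sketch: identifying users by their email hash is an assumption, as are the `Connection` type and the helper names. Note that the provider does not appear in the link at all, which is what makes it work across providers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    follower: str  # email hash of the user who made the connection
    followee: str  # email hash of the user being followed


def following(connections, user):
    """Everyone that `user` has connected to."""
    return {c.followee for c in connections if c.follower == user}


def followers(connections, user):
    """Everyone who has connected to `user`."""
    return {c.follower for c in connections if c.followee == user}


# Placeholder identifiers stand in for real email hashes.
connections = {Connection("hash-of-me", "hash-of-friend")}

print(following(connections, "hash-of-me"))  # {'hash-of-friend'}
print(followers(connections, "hash-of-me"))  # set(): the link is one-way
```

A mutual "friendship" in this model is simply two connections, one in each direction.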

A user can only have an account with one provider at a time.

After I make a connection with someone, that connection is stored in my browser. Whenever I log into my provider, it pulls the follower info from the browser and updates my profile with the appropriate information. For browsers that aren’t compatible, you can simply store all the information in a flat file on your computer and upload it to your new provider whenever you decide to switch. Providers themselves can store this info too, and they should allow export and import of follower data. If they don’t, the users can leave.
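The flat-file fallback might look something like this. The JSON layout, the `version` field, the file name, and the function names are all assumptions; the point is only that the export is a small, provider-independent file.

```python
import json
import os
import tempfile


def export_connections(followees, path):
    """Write the set of followed users (as email hashes) to a flat file."""
    with open(path, "w", encoding="utf-8") as fh:
        # Hypothetical format: a version tag plus a sorted list of hashes.
        json.dump({"version": 1, "following": sorted(followees)}, fh)


def import_connections(path):
    """Read a previously exported file back into a set."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return set(data["following"])


# Placeholder identifiers stand in for real email hashes.
following = {"hash-of-friend-1", "hash-of-friend-2"}
path = os.path.join(tempfile.mkdtemp(), "connections.json")

export_connections(following, path)    # done at the old provider
restored = import_connections(path)    # done at the new provider
print(restored == following)  # True
```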

The great thing about this system is that it prevents any chance of a bait-and-switch. Users will already have access to their connection data and can just start using another provider in the unlikely chance that it does happen.

Privacy

So all we’ve got now are connections. Now what?

Connections link profiles to profiles. It’s up to each provider what information to expose to the world at large, but every provider follows the same basic protocol. So no matter what, if I go to

http://www.example.org/profile/?id=1234567
&format=json
&key=6b7894e44e440f83e3fa4be3a6d98961c70d6f2f

I’ll get back something like

{"profile": {"first_name": "Dan", "last_name": "Loewenherz" }}

The key is a hash of my email address ([email protected]) plus a cryptographic salt. The salt to use is something that the community as a whole could agree upon. Using an access key like the one above ensures that only people who know my email address can access my profile information. Nifty, huh?
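Here is one way the key could be derived and checked. Everything specific in this sketch is an assumption: the salt value, appending the salt after the address, the use of SHA-1, and the function names. The URL shape follows the example above.

```python
import hashlib
import hmac
from urllib.parse import urlencode

COMMUNITY_SALT = "example-shared-salt"  # hypothetical community-agreed value


def access_key(email, salt=COMMUNITY_SALT):
    """Derive the access key as SHA-1(normalized email + salt)."""
    normalized = email.strip().lower()
    return hashlib.sha1((normalized + salt).encode("utf-8")).hexdigest()


def profile_url(base, profile_id, email):
    """Build the profile request URL in the shape shown above."""
    query = urlencode(
        {"id": profile_id, "format": "json", "key": access_key(email)}
    )
    return base + "?" + query


def key_is_valid(presented_key, owner_email):
    """Provider-side check: recompute the key and compare in constant time."""
    return hmac.compare_digest(presented_key, access_key(owner_email))


url = profile_url("http://www.example.org/profile/", 1234567, "alice@example.org")
print(url)
print(key_is_valid(access_key("alice@example.org"), "alice@example.org"))    # True
print(key_is_valid(access_key("mallory@example.org"), "alice@example.org"))  # False
```

Since the salt is public by design, the key is exactly as secret as the email address itself, which matches the stated goal: anyone who knows the address can compute the key, and nobody else can.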

“So that all sounds cool, but hold on a second,” you say. “Didn’t you just say that it’s a huge problem that certain providers control our profile info?”

Well, the answer is yes, I did. Kind of. But I left something out. Today, if you leave Facebook, you lose all of your connections to your friends. The connections you’ve spent time “making” online are really what’s valuable—not the personal information you post to your profile. Let’s not kid ourselves, Facebook can guess most of what we think is private just from our connections (see Project Gaydar for a real example).

The power to control the connections is the only power that really matters. And that’s already been addressed. If a provider chooses to not let you export profile info, then that’s their prerogative. And it’s also the user’s prerogative to leave them.

Conclusion

Social networking is still in its infancy, and it shows. Even though most of what I went over in this post was extremely “high-level” and abstract, I think I make some valid points, and I hope it leads to a more detailed discussion of how something like this might work. I think there’s something to it.