Find a friend

Via Greg Linden's excellent blog post, I ran across this Google paper (Evaluating Similarity Measures - A Large Scale Study in the Orkut social network [PDF]) detailing how Orkut generates 'related communities'.
The actual findings of the paper do not hold many surprises. Cosine distance works as it should, and it doesn't seem to penalize large communities much. By 'penalizing', I mean that if community A or B is really large, there is less chance of them showing up as 'related' even though they may have Orkut members in common.
The paper makes some interesting observations on how much weight to give to whether a user actually joins a community after clicking through to it from the 'related communities' section. In retrospect, this is pretty obvious - users are shy of joining communities which may 'embarrass' them somehow (since all users can see which communities a user is part of).
I'm surprised that Google didn't go the extra step and try to find 'related people' using a similar measure. It would be fun to get a list of people you might be interested in talking to, where 'potential interest' is determined by common communities, shared friends, degrees of separation, etc. A true 'find a friend' service. In fact, I remember Orkut's newsletter having something like this long ago - I wonder why they scrapped it.
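To make the 'penalizing' point concrete, here is a rough sketch of a cosine-style measure over membership sets. It assumes each community is just a set of member IDs and scores a pair as overlap divided by the square root of the product of their sizes. The function names and the toy data are mine, not the paper's, and this is not a claim about how Orkut actually computes it.

```python
from math import sqrt

def cosine_similarity(members_a, members_b):
    """Cosine similarity of two communities treated as binary membership sets."""
    if not members_a or not members_b:
        return 0.0
    overlap = len(members_a & members_b)
    # The denominator grows with community size, which is what 'penalizes' big communities.
    return overlap / sqrt(len(members_a) * len(members_b))

def related_communities(target, communities, top_n=3):
    """Rank the other communities by similarity to `target`, dropping zero scores."""
    scores = [
        (name, cosine_similarity(communities[target], members))
        for name, members in communities.items()
        if name != target
    ]
    scores = [(name, score) for name, score in scores if score > 0]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]

# Hypothetical toy data: community name -> set of member IDs.
communities = {
    "python": {1, 2, 3, 4},
    "linux": {2, 3, 4, 5},
    "cricket": {4, 6, 7, 8, 9, 10, 11, 12, 13},
}

for name, score in related_communities("python", communities):
    print(name, round(score, 3))
```

In the toy data, 'cricket' shares a member with 'python' but is large, so it scores well below 'linux'.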
On a tangential note, I once used the Del.icio.us APIs to try and build a recommendation system ('If you liked this page A, you may also like this page B since other folks who liked A - liked B too'). Unlike Orkut, del.icio.us has too much of a geek/early-adopter leaning. Maybe they will get a healthy influx of normal people due to their Yahoo acquisition.
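For the curious, the 'folks who liked A also liked B' idea boils down to counting co-occurrences. The sketch below assumes you already have each user's bookmarks as a set of pages. It does not call the del.icio.us API, and the `recommend` function and sample data are made up for illustration.

```python
from collections import Counter

def recommend(page, bookmarks, top_n=3):
    """Suggest pages most often bookmarked alongside `page`.

    `bookmarks` maps a user name to the set of pages that user saved.
    """
    counts = Counter()
    for pages in bookmarks.values():
        if page in pages:
            # Every other page saved by this user gets one vote.
            counts.update(p for p in pages if p != page)
    # Sort by vote count (descending), then by name so ties are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:top_n]]

# Hypothetical sample data.
bookmarks = {
    "alice": {"A", "B", "C"},
    "bob": {"A", "B"},
    "carol": {"A", "D"},
    "dave": {"C", "D"},
}

print(recommend("A", bookmarks))
```

Raw counts like this favor pages that are popular with everyone, so a real system would probably normalize the scores, for example with the cosine measure from the Orkut paper.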
Btw, if any of you find this stuff interesting, do check out George Karypis' paper - 'Evaluation of Item-based Top-N Recommendation Algorithms'.
P.S. Orkut must be the only Google service to run on ASP.NET (Writely doesn't count as it is an acquisition). However, I'm not sure whether such a donut-starved server is a good advertisement for Scott Guthrie's team :-)
P.P.S. Before someone points it out, yes, I'm a Microsoft employee running a blog powered by Google's Blogger which is hosted on a Linux/Apache webhost. Why? 'Coz a friend already had some space which I could share easily. And WordPress, the only other option I investigated seriously, had one thing I didn't like - it didn't generate static HTML pages. I know other blog engines do but I was too lazy to go use one of them. So shoot me :-)
Going off-topic, I found RACOFI - a system to find related music pretty fascinating (http://www.daniel-lemire.com/fr/abstracts/COLA2003.html)
I am curious about Orkut's friend-linking algorithm. It's amazing that they find a possible link between two profiles via each other's friends. In theory it is an NP-complete problem. Do you have any idea how Orkut does that?
thanks
parbati
Thanx,
Pradeep...

