Update: Digg Recommendation Engine Confirmed For This Week

    Michael Arrington


    Monday, June 30th, 2008

    Digg has released some materials about its new Recommendation Engine, which we wrote about last night, and says that it will be released this week. Two overview videos are below, including an interview with Digg Lead Scientist Anton Kast. We’ve also included the text of a white paper on the Recommendation Engine.


    Digg Recommendation Engine from Kevin Rose on Vimeo.


    Anton Talks About The Digg Recommendation Engine from Kevin Rose on Vimeo.


    The Digg Recommendation Engine
    People love Digg because it’s a place to discover and share great content from around
    the Web. The Digg homepage always has the most popular stories, but many Digg
    users find their content in the Upcoming section, which gets over 15,000 new stories a
    day. To help users filter this enormous amount of content, we have created a new
    feature: The Digg Recommendation Engine.

    When you Digg a story, you tell the Recommendation Engine two things: that you
    recommend the story to other users and, less obviously, that the users who Dugg the
    story before you are good at finding content. The Recommendation Engine keeps track
    of users who Dugg particular stories before you did, and it recommends to you the stories
    they Dugg. The more content you Digg, the smarter the Recommendation Engine
    becomes.

    Finding Diggers Like You
    The Digg Recommendation Engine uses your Digg history over the last thirty days to
    make Recommendations. (You can see the number of items you have Dugg over the
    last month on the right-hand side of the Recommended view.) Every time you Digg a
    story, the Engine matches you with other Diggers who Dugg the same story, and keeps
    track of all your Diggs in common with them.
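
    As a rough illustration of this bookkeeping (not Digg's actual code), the sketch below
    keeps an in-memory count of shared Diggs. The names `DiggLog`, `record_digg`, and
    `common_counts` are hypothetical, the counts are assumed to be symmetric, and the
    thirty-day window is left out for brevity.

```python
from collections import defaultdict


class DiggLog:
    """Toy in-memory store (hypothetical; not Digg's actual data model)."""

    def __init__(self):
        # story -> users who Dugg it, in the order they Dugg it
        self.diggers_by_story = defaultdict(list)
        # user -> other user -> number of stories both have Dugg
        self.common_counts = defaultdict(lambda: defaultdict(int))

    def record_digg(self, user, story):
        """Match `user` with everyone who already Dugg `story`."""
        earlier = self.diggers_by_story[story]
        if user in earlier:
            return  # ignore a repeated Digg of the same story
        for other in earlier:
            # Assumption: shared Diggs are counted in both directions.
            self.common_counts[user][other] += 1
            self.common_counts[other][user] += 1
        earlier.append(user)


log = DiggLog()
log.record_digg("alice", "story-1")
log.record_digg("bob", "story-1")
log.record_digg("alice", "story-2")
log.record_digg("bob", "story-2")
log.record_digg("carol", "story-2")
print(dict(log.common_counts["bob"]))  # {'alice': 2, 'carol': 1}
```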

    When it’s time to calculate your Recommendations, the Engine draws from this pool of
    matched Diggers. For each matched Digger, it computes a correlation coefficient
    between you and them. It then picks a cutoff for this correlation coefficient, and the
    Diggers who make the cut are called “Diggers Like You.”
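
    The selection step might look something like the sketch below. It assumes the
    coefficients have already been computed and that the cutoff is inclusive; the function
    name and the example values are made up for illustration.

```python
def diggers_like_you(coefficients, cutoff):
    """Return (user, coefficient) pairs at or above `cutoff`, most compatible first."""
    selected = [(user, c) for user, c in coefficients.items() if c >= cutoff]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical coefficients for three matched Diggers.
coefficients = {"bob": 0.40, "carol": 0.05, "dave": 0.22}
print(diggers_like_you(coefficients, cutoff=0.20))
# [('bob', 0.4), ('dave', 0.22)]
```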

    It’s easy to understand how the correlations are calculated. For each user with whom
    you Dugg something in common, the Engine determines how many stories the two of
    you Dugg in common, and divides that number by the total number of distinct stories
    that either of you Dugg. The ratio is a correlation coefficient, a number between zero
    and one (zero if you and the other user never agreed; one if you always did). Such a
    ratio is sometimes called a “Jaccard coefficient.”
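
    In code, the ratio described above can be sketched with Python sets. The function name
    `jaccard` and the sample stories are illustrative only; returning zero when neither
    user has Dugg anything is an assumption.

```python
def jaccard(your_diggs, their_diggs):
    """Stories Dugg by both users, divided by stories Dugg by either user."""
    yours, theirs = set(your_diggs), set(their_diggs)
    either = yours | theirs
    if not either:
        return 0.0  # assumption: no Diggs at all counts as no agreement
    return len(yours & theirs) / len(either)


you = {"a", "b", "c", "d"}
them = {"c", "d", "e"}
print(jaccard(you, them))  # 2 shared / 5 distinct stories = 0.4
```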

    This scheme automatically accounts for the overall level of Digging activity. If another
    user Diggs a lot, they have to agree with you on many stories to become a Digger Like
    You. If another user Diggs rarely, then a small amount of agreement can suffice.
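
    A small worked example, using made-up numbers, shows the effect: a heavy Digger who
    shares ten stories with you can still score lower than a light Digger who shares five.

```python
def jaccard(a, b):
    """Shared stories divided by all distinct stories Dugg by either user."""
    return len(a & b) / len(a | b)


you = set(range(20))                            # you Dugg stories 0..19
heavy = set(range(10)) | set(range(100, 290))   # 200 Diggs, 10 shared with you
light = set(range(5))                           # 5 Diggs, all shared with you

print(round(jaccard(you, heavy), 3))  # 10 / 210 -> 0.048
print(round(jaccard(you, light), 3))  # 5 / 20 -> 0.25
```
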
    From Diggers Like You to Recommendations
    Once the Engine has determined your Diggers Like You, your Recommendations consist
    of stories that your Diggers Like You have already Dugg, minus the stories you already
    Dugg or Buried. There are some extra steps, like the diversity rules and the
    promotability constraint described below, but this is the basic idea.
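
    The basic idea can be sketched as a set difference. This is a simplified illustration
    that ignores the diversity rules and the promotability constraint; `recommend` and
    the sample data are hypothetical.

```python
def recommend(dugg_by_user, you, diggers_like_you, buried=frozenset()):
    """Stories your Diggers Like You Dugg, minus stories you already Dugg or Buried."""
    seen = set(dugg_by_user.get(you, ())) | set(buried)
    candidates = set()
    for digger in diggers_like_you:
        candidates |= set(dugg_by_user.get(digger, ()))
    return candidates - seen


dugg_by_user = {
    "you": {"s1", "s2"},
    "bob": {"s1", "s3", "s4"},
    "carol": {"s2", "s4", "s5"},
}
print(sorted(recommend(dugg_by_user, "you", ["bob", "carol"], buried={"s5"})))
# ['s3', 's4']
```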

    Recommendations are always displayed together with your Diggers Like You and their
    compatibility percentages. These percentages are just correlation coefficients. You may
    notice that a highly compatible user gives you fewer Recommendations than a less
    compatible user. This is because, although you agree more closely with the more
    compatible user, that user has not Dugg as many stories, so there are fewer to
    recommend.

    The Recommendations you get from any particular user will come from topics (such as
    Technology or World News) where you have a shared Digging history. We figure that
    two users may have similar interests in a subject like ‘playable web games’, but one
    person might be into politics while the other follows celebrity gossip. So we actually
    compute correlations, determine Diggers Like You, and generate Recommendations
    independently in several collections of topics.

    Promotable Stories
    Since the Recommendation Engine works only with Upcoming stories, all the stories you
    get from the Recommendation Engine are “promotable”, meaning that they are recent
    enough to be eligible for the Digg homepage but haven’t appeared there yet. This
    means that whenever you Digg one of your Recommendations, you are helping select
    stories for the front page of Digg!

    Diversity
    As with stories on the homepage, we want your Recommendations to be diverse: a
    balanced number of stories, not all on the same topic, and not all Dugg by the same
    people.

    To make sure that your Recommendations are diverse, the Engine imposes limits that
    keep things from getting too focused. It makes sure that no one Digger Like You
    determines too many of your stories. It attempts to make your Recommendations reflect
    the spectrum of topics that you’ve Dugg in the past, and it adjusts the compatibility cutoff
    for Diggers Like You so you don’t get too many or too few stories.
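
    The white paper does not say how these limits are enforced. One simple way to keep a
    single Digger Like You from dominating, shown purely as an assumption, is to take at
    most a fixed number of new stories from each Digger; `cap_per_digger` and
    `max_per_digger` are hypothetical names.

```python
def cap_per_digger(stories_by_digger, max_per_digger):
    """Collect stories, taking at most `max_per_digger` new ones from each Digger."""
    picked = []
    for digger, stories in stories_by_digger.items():
        taken = 0
        for story in stories:
            if taken >= max_per_digger:
                break
            if story not in picked:  # skip stories another Digger already supplied
                picked.append(story)
                taken += 1
    return picked


stories_by_digger = {
    "bob": ["s1", "s2", "s3", "s4"],
    "carol": ["s2", "s5"],
}
print(cap_per_digger(stories_by_digger, max_per_digger=2))
# ['s1', 's2', 's5']
```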

    The Engine also limits the influence of any single one of your Diggs. For instance, if you
    are Digg number 1,000 on a popular story, you will have 999 similar users from that one
    Digg alone, and those users are not necessarily more compatible with you than the two
    or three who may have Dugg a less popular story you also liked. The Engine limits the
    total pool of users you can get from a single Digg to balance things out.
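
    How the pool is trimmed is not specified. The sketch below assumes the simplest
    possible rule, keeping only the Diggers who Dugg the story most recently before you;
    the function name and the cap of 50 are invented for the example.

```python
def matched_pool(earlier_diggers, max_pool):
    """Users matched through one Digg, capped at `max_pool`.

    Assumption: keep the Diggers who Dugg most recently before you.
    `earlier_diggers` is ordered from first Digg to last.
    """
    if max_pool <= 0:
        return []
    return earlier_diggers[-max_pool:]


# You are Digg number 1,000, so 999 users Dugg the story before you.
earlier = [f"user{i}" for i in range(1, 1000)]
pool = matched_pool(earlier, max_pool=50)
print(len(pool), pool[0], pool[-1])  # 50 user950 user999
```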

    We hope you enjoy using the Recommendation Engine and look forward to helping you
    uncover even more great stories on Digg!
    Digg on!
    Anton Kast – Lead Scientist, Digg
