There are still a lot of questions about this alleged Yahoo Voices data breach — including whether there was a reason behind the breach in the first place — but Yahoo has now officially confirmed that the data did in fact come from its servers, and that “approximately” 400,000 email addresses and passwords have been leaked in plain text online. Meanwhile, security specialists are now parsing the data and one has created a script to check if your email address (which doesn’t have to be a @yahoo.com address) is among those exposed.
In a statement in which it apologizes for the attack, Yahoo tells us that the data came from an older file from the Yahoo! Contributor Network (which it picked up via its Associated Content acquisition). But it also noted that fewer than five percent of the Yahoo accounts in the file had valid passwords, and that it is now working to fix the vulnerability that led to the disclosure. Note that it didn’t say the vulnerability is fixed yet.
Here is the full statement:
At Yahoo! we take security very seriously and invest heavily in protective measures to ensure the security of our users and their data across all our products. We confirm that an older file from Yahoo! Contributor Network (previously Associated Content) containing approximately 400,000 Yahoo! and other company users names and passwords was stolen yesterday, July 11. Of these, less than 5% of the Yahoo! accounts had valid passwords. We are fixing the vulnerability that led to the disclosure of this data, changing the passwords of the affected Yahoo! users and notifying the companies whose users accounts may have been compromised. We apologize to affected users. We encourage users to change their passwords on a regular basis and also familiarize themselves with our online safety tips at security.yahoo.com.
Meanwhile, Sucuri, the company that created the script mentioned above, has also started to analyze the breached list. It identified the most common email domains and the most common passwords in the list, and also analyzed password length. Looking at it, it’s yet another reminder of how important it is to be a bit more cryptic in how you secure yourself online:
According to Sucuri’s analysis, Yahoo is the most popular domain in the email list, but it’s by no means the only one.
It notes that 135,599 emails came from yahoo.com, a further 106,185 from gmail.com, 54,393 from hotmail.com, 24,677 from aol.com, 8,422 from comcast.net and 6,282 from msn.com. Note that these numbers are not exactly the same as those being tallied by others, but they are close. Daniel Cid, Sucuri’s CTO, also noted that there were multiple passwords from government accounts.
Other parts of the leaked data are true to form: “123456” was used as the password for 1,666 of the accounts; “password” was used for 780 of them. Other frequent ones were common first names like Maggie and Michael, as well as other number variations (123123). Seven characters is the most common length of passwords.
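For readers curious how tallies like these are produced, here is a minimal Python sketch. It is not Sucuri's script, whose internals we have not seen. It assumes a hypothetical layout of one `email:password` record per line, and the `summarize` function and the sample records are made up for illustration rather than taken from the leaked file.

```python
from collections import Counter


def summarize(lines, sep=":"):
    """Tally email domains, passwords, and password lengths.

    Assumes each record looks like 'email<sep>password'; this layout is
    an assumption for illustration, not the confirmed dump format.
    """
    domains, passwords, lengths = Counter(), Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line or sep not in line:
            continue  # skip blank or malformed records
        # Split on the first separator only, so passwords may contain it.
        email, password = line.split(sep, 1)
        if "@" not in email:
            continue  # not an email address, ignore the record
        domain = email.rsplit("@", 1)[1].lower()
        domains[domain] += 1
        passwords[password] += 1
        lengths[len(password)] += 1
    return domains, passwords, lengths


# Made-up sample records, not taken from the leaked file.
sample = [
    "alice@yahoo.com:123456",
    "bob@gmail.com:password",
    "carol@yahoo.com:123456",
    "dave@hotmail.com:maggie1",
    "not a record",
]

domains, passwords, lengths = summarize(sample)
print(domains.most_common(3))    # most frequent email domains
print(passwords.most_common(3))  # most frequent passwords
print(lengths.most_common(3))    # most frequent password lengths
```

Small differences in how malformed records are skipped or how domains are normalized are one plausible reason why separate tallies of the same list end up close but not identical.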
The fact that Yahoo says that it is still working on fixing the vulnerability should raise some alarm bells. Sony had a massive problem last year involving a data breach of user profiles on the PlayStation Network, which took some time to finally patch up, but not before much embarrassment and more leaks for the company.
I don’t think this is the can of worms that Erick was referring to when Yahoo bought Associated Content in 2010, but it’s a pretty fitting example of one nonetheless.