Your Not-So-Confidential SSN
Study shows that your SSN could be predicted
August 2009
Alessandro Acquisti doesn’t think Social Security numbers make good passwords. And he thinks certain organizations that use them as such, to “authenticate” a person’s identity, need to stop. Why? Because with patience, luck and a little technological savvy, identity thieves can use your place and date of birth to predict your Social Security number (SSN), according to recent research from two Carnegie Mellon University scholars. This means the nine-digit numbers that so many people and institutions use to secure private financial and health information are “predictable,” says Acquisti, the professor of information technology and public policy at Carnegie Mellon who published the study.
The research results “demonstrate once and for all” that SSNs are inadequate to protect private data, Acquisti says. And the risk is especially great for children and young adults. Working with postdoctoral researcher Ralph Gross, Acquisti analyzed a set of 500,000 SSNs belonging to deceased individuals. (After a person dies, the SSN becomes publicly accessible through the Social Security Administration’s Death Master File, along with the person’s date and location of birth, the two factors the researchers relied upon to predict an SSN’s numeric composition.)
Through a computer-assisted system of trial and error based on statistical patterns observed in the Death Master File sample, the researchers identified, with a single attempt, the first five SSN digits for 44 percent of those born between 1989 and 2003. For 8.5 percent of these records, they determined the complete SSN in fewer than 1,000 attempts. For people born between 1973 and 1988, they had less luck, identifying the first five digits for 7 percent of the sample in a single try.
That’s a potential recipe for identity theft, Acquisti says. In order to perpetrate financial, medical and other types of fraud in the United States, one often needs little more than a person’s name, date of birth and SSN. Of these data requirements, the SSN is typically thought to be the most private.
But advances in computing, and the deluge of personal data available in the public sphere (not to mention the sheer number of missing hard drives, thumb drives and other corporate data losses) have made “predicting” SSNs easier than ever.
“We are not the first ones to say it,” Acquisti says of his concerns about the privacy of SSNs. Previous researchers have sounded the same alarm. Even the Social Security Administration (SSA) has “cautioned the private sector, including educational, financial and health care institutions, against using the SSN as a personal identifier” for decades, according to a statement the agency released in response to the study. “We can’t pretend anymore that SSNs can be kept secret,” Peter Swire, a law professor at The Ohio State University, told The Washington Post regarding Acquisti’s research. “This report puts a nail in that coffin.”
Privacy experts contacted by Identity Theft 911 weren’t surprised by Acquisti’s results. After all, the SSA’s own Web site explains the underpinnings of its numeric allocation process, says Linda Foley, founder of the Identity Theft Resource Center. She also agrees with Acquisti’s wish for the private sector to stop using SSNs as “authenticators” to verify individuals’ identities.
“That’s something we’ve been saying for a long time but I don’t think it’s going to happen,” Foley says. “What we can hope for at this point is they find a more random way of [assigning] Social Security numbers.”
For reasons “unrelated to this report,” according to the SSA’s written statement, the agency will begin randomizing all future SSNs next year.
“The public should not be alarmed by [Acquisti’s] report because there is no foolproof method for predicting a person’s Social Security Number,” the statement reads. “The method by which Social Security assigns numbers has been a matter of public record for years. The suggestion that Mr. Acquisti has cracked a code for predicting an SSN is a dramatic exaggeration.”
While they may not have cracked codes, the Carnegie Mellon researchers have exposed vulnerabilities in a tangible way.
Anatomy of a Social Security Number
Ever wonder why your SSN is divided into three parts? The first three digits, the “area number,” are determined according to the ZIP code of the mailing address listed on a Social Security application. (Prior to 1972, the area corresponded to the state in which the card was issued.) The next two digits are the “group number,” which ranges from 01 to 99 and is issued in a precise but nonconsecutive order. The last four digits, or the “serial number,” run consecutively from 0001 to 9999 within each group number.
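The three-part layout described above can be illustrated with a minimal Python sketch. The names `SSNParts` and `split_ssn` are hypothetical, chosen only for this illustration, and the sample number is a made-up placeholder. The sketch checks only the group and serial ranges mentioned in this article. It does not reflect every rule the SSA applies.

```python
from typing import NamedTuple


class SSNParts(NamedTuple):
    area: str    # first three digits, the "area number"
    group: str   # middle two digits, the "group number"
    serial: str  # last four digits, the "serial number"


def split_ssn(ssn: str) -> SSNParts:
    """Split a nine-digit SSN string into its area, group and serial parts."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected nine digits, optionally written AAA-GG-SSSS")
    parts = SSNParts(digits[:3], digits[3:5], digits[5:])
    # Ranges described in the article: groups 01 to 99, serials 0001 to 9999.
    if parts.group == "00" or parts.serial == "0000":
        raise ValueError("group 00 and serial 0000 fall outside the described ranges")
    return parts


# Placeholder number, used here only to show the layout.
print(split_ssn("123-45-6789"))
# SSNParts(area='123', group='45', serial='6789')
```

Keeping the parts as strings rather than integers preserves leading zeros, which matter in group numbers such as 01.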
And if that seems like a mind-boggling roadmap for SSN “prediction”—that’s what computers are for.
Relying on patterns they discovered in their analysis of the Death Master File, Acquisti’s team found that people born in lower-population states had numbers that could be predicted more easily. The same is true for people born after 1989, when the SSA’s “Enumeration at Birth” initiative integrated the Social Security application into the certification of birth process. This made for a tighter correlation between the date of birth and SSN application, allowing the researchers to predict more precisely the range of potential SSNs attached to an individual.
The subset of babies born in Delaware in 1996 illustrates the ramifications of this increased risk. For this group, researchers needed 10 or fewer tries to predict all nine digits for 1 out of 20 SSNs.
Deceased people and their survivors have little to fear from controlled studies by university researchers using number-guessing algorithms. But in the hands of crafty cyber-criminals, such formulas could become dangerous, turning online credit application sites into testing grounds for valid SSNs, Acquisti warns.
Ondrej Krehel, a digital forensic examiner with tech security consulting firm Stroz Friedberg, agrees. “All it takes is finding a system that could be exploited,” he says. “And sooner or later that system will be found.”
To replicate Acquisti’s process for predicting SSNs, attackers would first need to replicate the researchers’ sophisticated algorithms, and then apply them to data including a person’s birth location and date. Identity thieves could purchase such information from data brokers, or look it up on voter registration lists, the Carnegie Mellon researchers suggest. Even social networking sites like Facebook, Twitter and Myspace, where people freely share personal details through blogs, biographies and “quizzes” for all to see, could be used by scammers.
“Unexpected Privacy Consequences”
SSNs were never meant to be used by the private sector. They were created after the Social Security Act of 1935 as part of a system designed to keep track of an individual’s Social Security payments over the course of a lifetime. The federal government first became concerned about potential private sector applications in the 1970s. An SSA task force recommended back in 1971 that the agency take a “cautious and conservative” position toward SSN use, and avoid promoting the number as an identifier, according to the agency’s Web site. In 1977, the Carter administration warned that the Social Security card should not be used as a national identity document.
In the Internet age, the SSN allocation system is especially fraught with peril, Acquisti says. His team conducted a second test that relied on demographic data taken from the social networking profiles of 621 students at a “North American University,” and used the patterns established by the Death Master File analysis to predict students’ SSNs.
The predictions were only slightly less accurate than those in the test of the larger Death Master File sample, a gap the researchers attribute to inaccurate or deliberately misleading self-reported data on the social networking sites. In a single attempt, researchers determined the first five digits for 6.3 percent of the sample, composed mostly of students born in populous states before 1989. Nearly one-third of the predictions that matched the target’s first five digits fell within “fewer than 1,000 integers” of the target’s true SSN.
And if the considerable gap in efficiency between predicting the first five digits of an SSN and the entire number, which is needed to commit fraud, seems like some consolation, there’s an additional wrinkle to consider: some business documents redact all but an SSN’s last four digits, potentially giving identity thieves the most difficult piece of the numerical puzzle.
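The arithmetic behind that gap can be sketched in a few lines of Python. This is a rough upper bound only: the helper name `candidates` is hypothetical, and the count treats every digit string as possible, including values such as serial 0000 that fall outside the ranges described above.

```python
AREA_DIGITS, GROUP_DIGITS, SERIAL_DIGITS = 3, 2, 4


def candidates(unknown_digits: int) -> int:
    """Upper bound on the number of digit strings of the given length."""
    return 10 ** unknown_digits


# Nothing known: all nine digits are open.
nothing_known = candidates(AREA_DIGITS + GROUP_DIGITS + SERIAL_DIGITS)

# First five digits known: only the four-digit serial number is open.
first_five_known = candidates(SERIAL_DIGITS)

# Last four digits known (for example, from a partly redacted document):
# only the area and group numbers are open.
last_four_known = candidates(AREA_DIGITS + GROUP_DIGITS)

print(nothing_known, first_five_known, last_four_known)
# 1000000000 10000 100000
```

Combining a correct first-five prediction with last four digits taken from a redacted document would leave no digits open, which is why the redaction practice matters.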
Authentication Alternatives?
Many experts have offered different solutions to protect consumers and businesses from identity theft involving SSN fraud. Privacy Rights Clearinghouse Director Beth Givens, who has long warned against the use of SSNs as authenticators, says that before credit card issuers, banks or mortgage lenders approve credit applications, they must check “multiple data elements” (a past address, for example) against what’s listed on a credit report. Guidelines recently issued by the Federal Trade Commission, known as “Red Flags” rules, give businesses criteria for determining other potential warning signs of identity theft.
But it remains to be seen whether businesses will faithfully follow the new guidelines. Krehel points out that businesses which rely solely on SSNs for authentication will be bigger targets for fraud. “Whoever is the weakest in the chain is most likely to get exploited.”
Acquisti encourages businesses to incorporate technology-driven authentication practices. One idea is to create digital certificates that could be used to verify the identity of a credit applicant. These electronic files are issued by trusted third parties, which use them to electronically verify identities (Verisign is one popular certificate authority). Krehel points out that two-factor authentication, which relies on something you know (a password) and something you have (unique software or hardware), could also be implemented.
For now, time will tell whether cyber-criminals can discover an effective way to exploit the SSN-issuance process. While Linda Foley of the Identity Theft Resource Center found Acquisti’s results “interesting,” she questions the decision to make his findings and methodology public. “They have the right to publish what they’re going to publish,” Foley says, but she would have preferred “a more selective way to try to get the information across to the appropriate people.”
Acquisti says he shared his findings with government agencies before he went public, and that the decision to share the findings so widely had to do with the scope of the problem, which involves “many different disjointed parties,” especially in the private sector.
“We have to reach a broader community,” he says. “We did remove sensitive information from the version which was published to make it much harder to replicate.” Acquisti believes one of the unintended consequences of companies strengthening data security measures may be forcing identity thieves to find other systems of attack—like SSN prediction. “The challenge here is that we don’t want to say ‘this is completely theoretical so there is no risk’ or say ‘be afraid, run for the hills because everything is compromised,’” Acquisti says. “The reality is in-between.”
©2003-2010 Identity Theft 911, LLC. All rights reserved.