In 2018, California investigators closed a case that had haunted the state for four decades. The Golden State Killer, responsible for at least 13 murders and 50 rapes, was identified not through traditional police work but through a genealogy website. Investigators uploaded crime scene DNA to GEDmatch, a publicly accessible database, found a partial match to a distant relative, and worked backward through family trees to a former police officer named Joseph James DeAngelo. A DNA sample collected from his car door confirmed the match.1 He pleaded guilty to 13 counts of first-degree murder in 2020. The victims waited 40 years.
Kirk Bloodsworth waited nine years. Convicted of a child’s rape and murder in 1985, he became the first American sentenced to death to be exonerated by DNA evidence in 1993. He did not commit the crime. The actual perpetrator, whose DNA was in the evidence, was identified and convicted years later.2 Both stories involve the same technology. One caught a killer who had escaped justice for decades. The other freed a man who should never have been imprisoned in the first place.
What We Already Have, and What We Do Not
The United States already operates the world’s largest forensic DNA database. The FBI’s Combined DNA Index System, or CODIS, contains more than 19.2 million offender profiles, 6.1 million arrestee profiles, and 1.4 million forensic profiles from crime scenes as of November 2025. Since its creation, CODIS has aided more than 758,000 investigations.3 Indiana alone reports that more than 50 percent of its unsolved cases that generated a DNA profile resulted in a CODIS hit.4
It is important to be precise about what this database is and is not. CODIS is built primarily from people who have been arrested or convicted. It operates under specific federal statutory authority, with access controlled by law enforcement protocols and the DNA Identification Act. Consumer databases such as 23andMe and AncestryDNA are built from voluntary submissions, governed by terms of service rather than statute, and accessible to law enforcement only through legal process. A legally mandated universal registry would be categorically different from both: mandatory in scope, governed by statute, and requiring a new constitutional and legislative framework that does not currently exist. The argument for building it deliberately and under democratic authorization is precisely that these distinctions matter, and that the current patchwork of overlapping systems provides neither the full benefit nor the full protection that a purpose-designed registry could offer.
Meanwhile, the consumer genetics industry has built a parallel archive. More than 26 million people have submitted DNA to testing services. A landmark 2018 study in Science, analyzing a database of approximately 1.28 million individuals, found that roughly 60 percent of searches for people of European descent returned a third-cousin or closer match, and projected that a database covering only about 2 percent of a target population would be enough to reach nearly anyone in it.5 An increasing share of all Americans are now reachable through a relative’s voluntary submission. The privacy boundary is eroding regardless of what legislatures do. The question is whether that erosion occurs with or without enforceable rules.
The Case for Universal Collection
Crime deterrence. The premise that universal registration deters crime rests on rational-choice theory: if a potential offender knows that any biological trace left at a scene connects directly to a known identity, the calculus of commission changes. The empirical evidence is suggestive, though not conclusive. A peer-reviewed 2017 study by Jennifer Doleac in the American Economic Journal: Applied Economics found that states that expanded their DNA databases to include convicted felony offenders saw statistically significant reductions in violent crime rates relative to control states, even after controlling for broader policing changes and community factors.6 The deterrence argument is plausible and supported by the best available evidence, but it remains an inference rather than a settled fact.
Solving more crimes, faster, and more equitably. The distinction between genetic genealogy and direct DNA matching is not merely technical. It is the difference between solving a case immediately and solving it eventually, if at all.
A direct CODIS match occurs when a crime scene sample is compared against the database and finds an exact profile. If the perpetrator is already in CODIS, the match is instantaneous. If not, the case may go cold indefinitely. The Golden State Killer investigation illustrated the alternative: genetic genealogy, which uses consumer databases to identify distant relatives of the unknown perpetrator, then narrows the family tree through traditional investigation. This approach is labor-intensive, requiring weeks or months of genealogical research, and depends on whether enough relatives have voluntarily submitted consumer DNA. A universal registry would collapse that distinction. The perpetrator’s own profile would be in the database. The match would be direct and immediate.
The implications for unsolved cases are significant. DNA evidence is available in fewer than 10 percent of all criminal cases, largely because many crime scenes do not contain testable biological material. But when it is present and the perpetrator is not in the database, the case frequently goes cold not for lack of evidence but for lack of a match.7 A universal registry would convert a substantial portion of those cold cases into solvable ones.
Because CODIS is built primarily from the arrested and convicted, its population mirrors the racial disparities documented in the criminal justice system. The most rigorous published study of CODIS racial composition, a 2020 California Law Review article by Erin Murphy and Jun Tong, found that Black Americans make up approximately 13 percent of the U.S. population but contribute an estimated 34 percent of all profiles in the national database, a ratio of roughly 2.6 to 1. White Americans make up 62 percent of the population and 49 percent of the database. Non-Black people of color, comprising roughly 25 percent of the population, account for only about 13 percent of the database.8
A universal registry eliminates that disparity by definition. Scholars arguing for universal collection have framed this as an equity imperative: a database that imposes genetic surveillance selectively on communities of color is not race-neutral, but one that applies equally to everyone at least removes the structural bias at the point of collection.9
That equity argument carries an important caveat. Universality equalizes exposure in the database; it does not automatically remedy racially disparate policing or prosecutorial discretion. Reform of the database is necessary but not sufficient. It must be paired with the enforceable query controls described in the legal framework section below.
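The disparity figures cited from Murphy and Tong reduce to simple arithmetic. As a minimal sketch, the snippet below divides each group's share of the database by its share of the population, using only the percentages quoted in the text. The variable and function names are illustrative and do not come from the study.

```python
# Shares in percent, as quoted in the text from Murphy and Tong (2020).
POPULATION_SHARE = {"Black": 13, "White": 62, "Non-Black people of color": 25}
DATABASE_SHARE = {"Black": 34, "White": 49, "Non-Black people of color": 13}


def representation_ratio(database_pct: float, population_pct: float) -> float:
    """Above 1 means over-representation in the database; below 1, under-representation."""
    return database_pct / population_pct


# Hypothetical helper output: one rounded ratio per group.
ratios = {
    group: round(representation_ratio(DATABASE_SHARE[group], POPULATION_SHARE[group]), 2)
    for group in POPULATION_SHARE
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio}")
```

Under a registry that enrolls everyone, each group's database share would equal its population share, so every ratio would be 1 by construction. That is the sense in which universality removes the disparity at the point of collection and nowhere else.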
Exonerating the wrongly convicted. Since 1989, DNA testing has contributed to more than 600 exonerations in the United States, according to the National Registry of Exonerations.10 The Innocence Project has directly participated in more than 250 exonerations through DNA and other evidence, including nearly 40 cases in which the exoneree had been sentenced to death, according to the National Registry of Exonerations.11 In an analysis of 375 DNA exonerations, the Innocence Project found that more than 11 percent of the exonerees had pleaded guilty to crimes they did not commit, often coerced by the threat of harsher punishment.12 A universal registry accelerates this corrective function in both directions: it identifies the actual perpetrator when a wrongly convicted person is in prison, and it clears a suspect during investigation before charges are ever filed. The database that catches criminals also frees innocents. These are not competing outcomes. They are the same mechanism.
The convictions side of the ledger deserves acknowledgment as well. While comprehensive national statistics on convictions specifically secured by DNA evidence are not systematically tracked, the more than 758,000 investigations CODIS has aided represent cases in which a biological sample was matched to a known profile, providing the evidentiary foundation for arrest, prosecution, and in a significant subset, conviction. DNA is now the gold standard of forensic identification, considered more reliable than eyewitness identification and most other physical evidence types. Its prominence has even produced what prosecutors describe as the CSI effect, in which jurors expect DNA evidence in cases where none was ever available. A universal registry would strengthen that evidentiary foundation across the board, while simultaneously providing the exculpatory counterweight when the match points elsewhere.
DNA as a superior identity credential. The Social Security number was designed in 1935 to track retirement accounts, not to serve as the universal identity credential it has become. It is a nine-digit number that can be stolen, sold, and counterfeited. DNA is its opposite: present in nearly every cell of your body from birth to death and, when profiled at carefully chosen markers, unique to you, with the probability of an accidental match estimated at one in several quadrillion.
It is worth explaining how that uniqueness is established. Your DNA is a long double-stranded molecule built from four nucleotide bases: adenine, thymine, cytosine, and guanine, abbreviated A, T, C, and G. The sequence of these bases along the chromosome encodes the instructions for building and running the human body. Most of that sequence is identical across all humans. What differs are specific locations scattered across the genome where a short sequence of bases repeats end-to-end, like a stutter in the genetic text. These locations are called Short Tandem Repeats, or STRs. At any given STR location, the number of times the sequence repeats varies from person to person: one individual might have seven repeats at a particular locus, another might have twelve. CODIS profiles examine 20 of these STR locations simultaneously. The number of possible combinations across 20 loci produces a profile space so vast that the probability of any two unrelated people sharing the same profile at all 20 sites is estimated at less than one in a quadrillion. Crucially, STR locations reside in non-coding regions of the genome, the stretches sometimes called junk DNA, which do not directly encode proteins or reveal health information. A CODIS STR profile says nothing about disease risk, ancestry, or behavioral traits. It is, by design, a numerical identifier and nothing more.
One edge case deserves direct attention: identical twins share the same STR profile, which could in theory produce two suspects from a single crime scene sample. However, forensic science is establishing that even twins diverge epigenetically over time in ways that may allow differentiation.13,14 These are known limitations a well-designed registry must account for.
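The "one in a quadrillion" figure follows from multiplying per-locus genotype frequencies across independent loci, often called the product rule. The sketch below illustrates that arithmetic only. The allele frequency of 0.1 is a hypothetical placeholder rather than real population data, and the calculation assumes Hardy-Weinberg proportions and independence between loci, which forensic practice adjusts for.

```python
from math import prod
from typing import Iterable, Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg genotype frequency at one locus.

    Homozygous (same allele twice): p * p.
    Heterozygous (two different alleles): 2 * p * q.
    """
    return p * p if q is None else 2 * p * q


def random_match_probability(per_locus_frequencies: Iterable[float]) -> float:
    """Product rule: multiply genotype frequencies across independent loci."""
    return prod(per_locus_frequencies)


# Hypothetical profile: heterozygous at every locus, each allele at frequency 0.1.
NUM_LOCI = 20  # the number of STR loci in a CODIS profile, per the text
per_locus = genotype_frequency(0.1, 0.1)  # about 0.02 at each locus
rmp = random_match_probability([per_locus] * NUM_LOCI)

ONE_IN_A_QUADRILLION = 1e-15
print(f"Random match probability: {rmp:.3e}")
print(f"Rarer than one in a quadrillion: {rmp < ONE_IN_A_QUADRILLION}")
```

Even with these deliberately simple inputs, twenty independent loci push the match probability far below one in a quadrillion. The same multiplication also shows why identical twins are an edge case: their genotypes are not independent draws, so the product rule does not apply to them.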
Cornell Tech researchers have demonstrated identity verification via portable DNA sequencing in minutes, requiring only 60 to 300 cross-references.15 A DNA-anchored identity layer could underpin voter verification, electronic signatures on legal documents, and digital identity systems resistant to the fraud and impersonation that plague current infrastructure.16
Medical breakthroughs and hereditary disease prevention. Traditional newborn screening currently tests for approximately 35 to 60 conditions. Genome-wide sequencing of newborn blood spots could detect predispositions to thousands of additional genetic diseases, many preventable or treatable if caught early.17 The BeginNGS program published results in December 2024 in the American Journal of Human Genetics showing a 97 percent reduction in false positives compared to unfiltered genomic screening approaches.18 Genomics England’s Generation Study is sequencing 100,000 newborns; the BabySeq project has expanded to seven U.S. sites.19
The medical case is strong, but it comes with genuine ethical complexity. In a mandatory registry, collection is the law rather than a matter of parental consent, which removes one layer of the debate but intensifies another: what is done with the data, and who controls access to which portion of it. The dual-use architecture described in the legal framework section below addresses this directly. The identification layer is strictly separated from any broader genomic health data. Parents may separately elect a full genomic diagnosis for clinical purposes, with those results staying in the health layer and never crossing into the identification registry.
Bioethicists writing in the European Journal of Human Genetics identify a remaining tension: the child enrolled at birth has no voice in the decision.20 The legislative response should include a mandatory review right at legal majority, not to exit the identification registry, which is a legal obligation, but to audit what data is held, correct errors, and understand its uses. The Early Check program’s research demonstrates that families support genomic screening when they receive genuine transparency about data use.21 That transparency must be built into the statute, not left to agency discretion.
The Case Against: Serious Risks That Demand Serious Answers
DNA identifies you very well, but it reveals far more than your identity. Unlike a license plate read or a call detail record, a DNA sample carries information about health predispositions, ancestry, familial relationships, and potentially behavioral traits. The physical sample contains the entire genome. Even if only STR markers are initially profiled for identification, the underlying sample is a permanent biological archive. Legislation prohibiting secondary uses is only as durable as the political will to enforce it, and the history of CODIS is a case study in exactly this kind of mission drift. When Congress established the database in 1994, it was limited to convicted sex offenders and perpetrators of violent state crimes. By 1998 it expanded to federal criminals; by 2000 to a broader set of federal offenses including burglary and kidnapping. The DNA Fingerprint Act of 2005 extended collection to all federal arrestees regardless of charge. In 2020, the Trump administration added immigration detainees, expanding the pool from roughly 7,000 to a projected 743,000 new profiles per year; by 2025, more than 1.5 million noncitizen profiles had been added, the vast majority from immigration encounters with no criminal charge attached.22 Senator Wyden described this as a 5,000 percent expansion of DNA collection from noncitizens.23
The consumer ancestry database story is equally instructive. When GEDmatch launched as a genealogy tool, users uploaded DNA to find relatives, not to be searched by police. After investigators used it to identify the Golden State Killer in 2018, law enforcement made it the de facto national genealogy database for criminal investigations.
Forensic genealogists were later found to have searched profiles of users who had explicitly opted out of law enforcement sharing, skirting the platform’s own privacy rules.24 A Florida court issued a warrant compelling GEDmatch to open its entire database of 1.3 million profiles to a single detective, the first time a court had approved a search of that scope, regardless of individual consent settings.25 GEDmatch’s founder said the original Golden State Killer search was conducted without his knowledge, and the terms of service were updated retroactively. This is not purpose creep. It is purpose erasure.
The risks of a universal DNA registry are not purely abstract. They have been imagined, and imagined well. In 1997, writer and director Andrew Niccol released Gattaca, produced by Danny DeVito, Michael Shamberg, and Stacey Sher through Jersey Films, a dystopian science fiction film set in a near future where genetic profiles determine social status, employment, and life prospects. Citizens conceived naturally without genetic selection are classified as “In-Valids” and relegated to menial labor, while “Valids”, those engineered for genetic superiority, occupy the upper strata of society. The film’s title is spelled entirely from the letters of the four DNA nucleotides: G (guanine), A (adenine), T (thymine), C (cytosine). Released three years before the Human Genome Project announced its first working draft, Gattaca anticipated with striking accuracy the questions now confronting legislatures and courts: What happens when biological data becomes the primary lens through which institutions classify people? Who controls the registry, and who is harmed by it? The film was not a hit in 1997. It has become increasingly difficult to dismiss.26
The presumption of innocence and principled objection. The Supreme Court addressed DNA collection from arrestees in Maryland v. King (2013), upholding the practice 5-4 as a booking procedure analogous to fingerprinting.27 But Justice Antonin Scalia’s dissent identified the core problem: “Make no mistake about it: As an entirely predictable consequence of today’s decision, your DNA can be taken and entered into a national DNA database if you are ever arrested, rightly or wrongly, and for whatever reason.”28 A universal registry extends the logic further: it presumes the state has a legitimate interest in every citizen’s biological identity from birth, without any predicate of suspicion. That is a meaningful constitutional step beyond anything the Court has sanctioned.
Some objections are not merely strategic but philosophical: religious and cultural traditions that treat the body as inviolable, or that regard mandatory biological registration as incompatible with individual dignity, deserve acknowledgment rather than dismissal.
Data breach: the one you cannot patch. In 2023, 23andMe suffered a breach affecting approximately 6.9 million users, including genetic ancestry and relative-matching data.29 The company filed for bankruptcy in 2025.30 A compromised password can be changed. A compromised DNA profile cannot, and the exposure is hereditary: your children and grandchildren share your genetic markers. A national registry that suffers a significant breach does not produce a recoverable situation. It produces a permanent one.
Genetic discrimination and incomplete legal protections. The Genetic Information Nondiscrimination Act of 2008, known as GINA, prohibits health insurers and employers from using genetic information against individuals.31 It does not cover life insurance, disability insurance, long-term care insurance, immigration enforcement, or the criminal justice system. A universal database expands the surface area of genetic exposure across every one of those unprotected domains. Without statutory extension of GINA’s protections to cover all downstream uses, the registry becomes a tool of actuarial discrimination as readily as a tool of public safety.
What the Legal Framework Must Look Like
The policy question is not whether this data will exist. Consumer databases already make a majority of Americans of European descent identifiable through a relative’s submission. The FBI is expanding CODIS by hundreds of thousands of profiles per year through immigration enforcement. Newborn genomic sequencing programs are enrolling at seven U.S. sites. The question is whether these systems converge under enforceable rules or continue to expand under the patchwork of vendor contracts and incomplete statutes that currently governs them.
The precedent from Carpenter v. United States (2018) is instructive: the Supreme Court required a warrant for historical cell-site records because of their comprehensiveness and the intimacy of what they reveal. DNA is more comprehensive and more intimate than any location record. The Carpenter logic applied to genetic data points toward a warrant requirement for all investigative access.
Warrant requirement for all investigative queries. Access to the registry for criminal investigative purposes requires a warrant supported by probable cause, specifying the target, the offense, and the temporal scope. Law enforcement cannot browse the database. A judge must authorize each search, as with medical records, financial records, and comprehensive location data since Carpenter.
Symmetric access: prosecution and defense alike. If a prosecutor can subpoena registry data to build a case, defense counsel must have equal subpoena power to query the registry for alibi or alternative-suspect evidence. The framework must specify chain-of-custody protocols for defense-initiated samples, privacy protections for third-party relatives appearing in familial search results, and procedural standards preventing speculative fishing expeditions by either party. Defense access must be guaranteed, not discretionary.
Mandatory registration with a dual-use data architecture. In a legally mandated registry, collection is the law, not a matter of consent. The critical design question is therefore not whether data is collected but how it is partitioned and used. The proposal rests on a strict dual-use architecture. The identification layer, the 20 STR loci used in CODIS and comparable systems, is collected universally and governs law enforcement access under the warrant framework above. It contains no health information and is governed exclusively by judicial oversight. A separate health and research layer, derived from broader genomic sequencing, is held under independent governance and accessible only to healthcare providers and, with appropriate anonymization, to population researchers. For research purposes, the identifying STR profile is separated from the genomic dataset cryptographically, so that reconnecting the anonymized research data to an individual requires a key released only under a separate judicial order. Parents may separately elect, at birth, to have a full genomic diagnosis performed for personal clinical use, with results disclosed to the family and their physicians. That clinical record stays in the health layer and never crosses into the identification registry. This architecture preserves the population-scale research and medical benefits of universal genomic data while ensuring that the identification registry is used for nothing beyond what a warrant authorizes.
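One way to picture the separation between the two layers is keyed pseudonymization: research records are indexed by a value that can be recomputed only with a secret linkage key held outside both layers. The sketch below is an illustration under stated assumptions, not a specification. The record types, the function names, and the idea that the key is escrowed and released by court order are all hypothetical. It also does not address a separate risk, that rich genomic data can identify a person from its content alone.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ResearchRecord:
    """Hypothetical health/research-layer record: no name, no STR profile."""
    pseudonym: str
    genomic_summary: str


def pseudonym_for(person_id: str, linkage_key: bytes) -> str:
    """Derive a stable pseudonym with HMAC-SHA256; infeasible to recompute without the key."""
    return hmac.new(linkage_key, person_id.encode("utf-8"), hashlib.sha256).hexdigest()


def to_research_record(person_id: str, genomic_summary: str, linkage_key: bytes) -> ResearchRecord:
    """Strip the identifier and index the record by its pseudonym instead."""
    return ResearchRecord(pseudonym_for(person_id, linkage_key), genomic_summary)


def relink(record: ResearchRecord, candidate_ids: Iterable[str], linkage_key: bytes) -> Optional[str]:
    """Re-identification step, assumed to run only after a court releases the key."""
    for person_id in candidate_ids:
        if hmac.compare_digest(pseudonym_for(person_id, linkage_key), record.pseudonym):
            return person_id
    return None


# Hypothetical usage: the key would live with an escrow authority, not the research body.
escrowed_key = secrets.token_bytes(32)
record = to_research_record("person-001", "variant summary (placeholder)", escrowed_key)
registry_ids = ["person-000", "person-001", "person-002"]

print(relink(record, registry_ids, escrowed_key))             # succeeds with the escrowed key
print(relink(record, registry_ids, secrets.token_bytes(32)))  # fails with any other key
```

The design choice worth noting is that the research body never needs the key to do its work, so a breach of the research layer alone exposes pseudonyms rather than identities.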
Security under FISMA High standards with independent audit. The registry must be categorized as a high-impact system under FIPS 199, designated a High Value Asset, and governed by the full NIST SP 800-53 control catalog.32 FISMA mandates annual security reviews; the registry should additionally require mandatory third-party assessment every two years, with public reporting of findings. Breach notification must be mandatory within 72 hours. Penalties for unauthorized access must include criminal liability. Congress should appropriate dedicated funding in the authorizing legislation, with a cost estimate developed by CBO in advance of enactment.
Extended GINA protections covering all downstream uses. Genetic information collected in the registry must be excluded from use by life insurers, disability insurers, long-term care insurers, immigration enforcement, and any other entity not covered by current GINA protections, enforced by private right of action and meaningful civil penalties.
Independent oversight with mandatory sunset review. The authorizing legislation must establish an independent oversight board with subpoena power, public reporting requirements, and a congressional reauthorization review every five years. Sunset provisions force the democratic accountability that has been absent from every prior expansion of DNA collection authority.
The Path Forward
No model bill for a universal DNA registry currently exists in Congress, and the political conditions for enacting one do not exist today. What can exist, and what this piece argues for, is a serious legislative debate conducted before the infrastructure assembles itself further by accident. The appropriate venues are the Senate Judiciary Committee’s Subcommittee on Privacy, Technology and the Law and the House Energy and Commerce Committee’s Health Subcommittee. Stakeholder consultations should include law enforcement, the Innocence Project, the ACLU, bioethicists, genomic medicine researchers, and representatives of communities that currently bear a disproportionate share of the existing database’s surveillance burden.
The cases that opened this article, a killer identified after four decades, an innocent man freed after nine years in prison, are not exceptional. They represent a pattern that a universal registry would accelerate on both ends: faster identification of perpetrators and faster exoneration of the wrongly accused. The medical case is the same: children diagnosed only after irreversible damage occurs because a genome that could have predicted their condition was never examined. The tools exist. The data to make them work is accumulating with or without democratic authorization. What is missing is the governance framework that would make using them legitimate.
One more thing must be said plainly. The genie is already out of the bottle. More than 26 million people have voluntarily submitted their DNA to consumer testing services. CODIS has grown to more than 25 million offender and arrestee profiles. Immigration enforcement has added 1.5 million noncitizen profiles without public debate. Forensic genealogy has demonstrated that roughly 60 percent of Americans of European descent are already identifiable through a relative’s database entry. There is no legislative path back to a world of meaningful genetic privacy for most Americans. That ship has sailed, the data is out, and no combination of regulations will recall it.
What remains in our control is how the data that exists, and the registry that is being assembled whether we authorize it or not, is governed. The choice is not between a universal registry and genetic privacy. It is between a registry built deliberately, under democratic authorization, with judicial oversight and symmetric access, and the one we are currently constructing by accident, under vendor contracts, immigration orders, and consumer terms of service, with none of those things.
The genie does not go back in the bottle. We can still decide what it is allowed to do.
The author has no financial relationship with any DNA testing vendor, genomics company, or law enforcement agency.