Is 75% of the Human Genome Junk DNA?
By the rude bridge that arched the flood,
Their flag to April’s breeze unfurled,
Here once the embattled farmers stood,
And fired the shot heard round the world.
–Ralph Waldo Emerson, Concord Hymn
Emerson referred to the Battles of Lexington and Concord, the first skirmishes of the Revolutionary War, as the “shot heard round the world.”
While not as loud as the gunfire that triggered the Revolutionary War, a recent article published in Genome Biology and Evolution by evolutionary biologist Dan Graur has garnered a lot of attention,1 serving as the latest salvo in the junk DNA wars—a conflict between genomics scientists and evolutionary biologists about the amount of functional DNA sequences in the human genome.
Clearly, this conflict has important scientific ramifications, as researchers strive to understand the human genome and seek to identify the genetic basis for diseases. The functional content of the human genome also has significant implications for creation-evolution skirmishes. If most of the human genome turns out to be junk after all, then the case for a Creator potentially suffers collateral damage.
According to Graur, no more than 25% of the human genome is functional—a much lower percentage than reported by the ENCODE Consortium. Released in September 2012, phase II results of the ENCODE project indicated that 80% of the human genome is functional, with the expectation that the percentage of functional DNA in the genome would rise toward 100% when phase III of the project reached completion.
If true, Graur’s claim would represent a serious blow to the validity of the ENCODE project conclusions and devastate the RTB human origins creation model. Intelligent design proponents and creationists (like me) have heralded the results of the ENCODE project as critical in our response to the junk DNA challenge.
Junk DNA and the Creation vs. Evolution Battle
Evolutionary biologists have long considered the presence of junk DNA in genomes as one of the most potent pieces of evidence for biological evolution. Skeptics ask, “Why would a Creator purposely introduce identical nonfunctional DNA sequences at the same locations in the genomes of different, though seemingly related, organisms?”
When the draft sequence was first published in 2000, researchers thought only around 2–5% of the human genome consisted of functional sequences, with the rest being junk. Numerous skeptics and evolutionary biologists claim that such a vast amount of junk DNA in the human genome is compelling evidence for evolution and the most potent challenge against intelligent design/creationism.
But these arguments evaporate in the wake of the ENCODE project. If valid, the ENCODE results would radically alter our view of the human genome. No longer could the human genome be regarded as a wasteland of junk; rather, the human genome would have to be recognized as an elegantly designed system that displays sophistication far beyond what most evolutionary biologists ever imagined.
The findings of the ENCODE project have been criticized by some evolutionary biologists who have cited several technical problems with the study design and the interpretation of the results. (See articles listed under “Resources to Go Deeper” for a detailed description of these complaints and my responses.) But ultimately, their criticisms appear to be motivated by an overarching concern: if the ENCODE results stand, then it means key features of the evolutionary paradigm can’t be correct.
Calculating the Percentage of Functional DNA in the Human Genome
Graur (perhaps the foremost critic of the ENCODE project) has tried to discredit the ENCODE findings by demonstrating that they are incompatible with evolutionary theory. Toward this end, he has developed a mathematical model to calculate the percentage of functional DNA in the human genome based on mutational load—the amount of deleterious mutations harbored by the human genome.
Graur argues that junk DNA functions as a “sponge” absorbing deleterious mutations, thereby protecting functional regions of the genome. Considering this buffering effect, Graur wanted to know how much junk DNA must exist in the human genome to buffer against the loss of fitness—which would result from deleterious mutations in functional DNA—so that a constant population size can be maintained.
Historically, the replacement level fertility rates for human beings have been two to three children per couple. Based on Graur’s modeling, this fertility rate requires 85–90% of the human genome to be composed of junk DNA in order to absorb deleterious mutations—ensuring a constant population size, with the upper limit of functional DNA capped at 25%.
Graur also calculated a fertility rate of 15 children per couple, at minimum, to maintain a constant population size, assuming 80% of the human genome is functional. According to Graur’s calculations, if 100% of the human genome displayed function, the minimum replacement level fertility rate would have to be 24 children per couple.
He argues that both conclusions are unreasonable. On this basis, therefore, he concludes that the ENCODE results cannot be correct.
Response to Graur
So, has Graur’s work invalidated the ENCODE project results? Hardly. Here are four reasons why I’m skeptical.
1. Graur’s estimate of the functional content of the human genome is based on mathematical modeling, not experimental results.
An adage I heard repeatedly in graduate school applies: “Theories guide, experiments decide.” Though the ENCODE project results theoretically don’t make sense in light of the evolutionary paradigm, that is not a reason to consider them invalid. A growing number of studies provide independent experimental validation of the ENCODE conclusions. (Go here and here for two recent examples.)
To question experimental results because they don’t align with a theory’s predictions is a “Bizarro World” approach to science. Experimental results and observations determine a theory’s validity, not the other way around. Yet when it comes to the ENCODE project, its conclusions seem to be weighed based on their conformity to evolutionary theory. Simply put, ENCODE skeptics are doing science backwards.
While Graur and other evolutionary biologists argue that the ENCODE results don’t make sense from an evolutionary standpoint, I would argue as a biochemist that the high percentage of functional regions in the human genome makes perfect sense. The ENCODE project determined that a significant fraction of the human genome is transcribed. They also measured high levels of protein binding.
ENCODE skeptics argue that this biochemical activity is merely biochemical noise. But this assertion does not make sense because (1) biochemical noise costs energy and (2) random interactions between proteins and the genome would be harmful to the organism.
Transcription is an energy- and resource-intensive process. To believe that most transcripts are merely biochemical noise would be untenable. Such a view ignores cellular energetics. Transcribing a large percentage of the genome when most of the transcripts serve no useful function would routinely waste a significant amount of the organism’s energy and material stores. If such an inefficient practice existed, surely natural selection would eliminate it and streamline transcription to produce transcripts that contribute to the organism’s fitness.
Apart from energetics considerations, this argument ignores the fact that random protein binding would make a dire mess of genome operations. Without minimizing these disruptive interactions, biochemical processes in the cell would grind to a halt. It is reasonable to think that the same considerations would apply to transcription factor binding with DNA.
2. Graur’s model employs some questionable assumptions.
Graur uses an unrealistically high rate for deleterious mutations in his calculations.
Graur determined the deleterious mutation rate using protein-coding genes. These DNA sequences are highly sensitive to mutations. In contrast, other regions of the genome that display function—such as those that (1) dictate the three-dimensional structure of chromosomes, (2) serve as transcription factors, and (3) aid as histone binding sites—are much more tolerant to mutations. Ignoring these sequences in the modeling work artificially increases the amount of required junk DNA to maintain a constant population size.
3. The way Graur determines if DNA sequence elements are functional is questionable.
Graur uses the selected-effect definition of function. According to this definition, a DNA sequence is only functional if it is undergoing negative selection. In other words, sequences in genomes can be deemed functional only if they evolved under evolutionary processes to perform a particular function. Once evolved, these sequences, if they are functional, will resist evolutionary change (due to natural selection) because any alteration would compromise the function of the sequence and endanger the organism. If deleterious, the sequence variations would be eliminated from the population due to the reduced survivability and reproductive success of organisms possessing those variants. Hence, functional sequences are those under the effects of selection.
In contrast, the ENCODE project employed a causal definition of function. Accordingly, function is ascribed to sequences that play some observationally or experimentally determined role in genome structure and/or function.
The ENCODE project focused on experimentally determining which sequences in the human genome displayed biochemical activity using assays that measured
- binding of transcription factors to DNA,
- histone binding to DNA,
- DNA binding by modified histones,
- DNA methylation, and
- three-dimensional interactions between enhancer sequences and genes.
In other words, if a sequence is involved in any of these processes—all of which play well-established roles in gene regulation—then the sequences must have functional utility. That is, if sequenceQperforms functionG, then sequenceQis functional.
So why does Graur insist on a selected-effect definition of function? For no other reason than a causal definition ignores the evolutionary framework when determining function. He insists that function be defined exclusively within the context of the evolutionary paradigm. In other words, his preference for defining function has more to do with philosophical concerns than scientific ones—and with a deep-seated commitment to the evolutionary paradigm.
As a biochemist, I am troubled by the selected-effect definition of function because it is theory-dependent. In science, cause-and-effect relationships (which include biological and biochemical function) need to be established experimentally and observationally, independent of any particular theory. Once these relationships are determined, they can then be used to evaluate the theories at hand. Do the theories predict (or at least accommodate) the established cause-and-effect relationships, or not?
Using a theory-dependent approach poses the very real danger that experimentally determined cause-and-effect relationships (or, in this case, biological functions) will be discarded if they don’t fit the theory. And, again, it should be the other way around. A theory should be discarded, or at least reevaluated, if its predictions don’t match these relationships.
What difference does it make which definition of function Graur uses in his model? A big difference. The selected-effect definition is more restrictive than the causal-role definition. This restrictiveness translates into overlooked function and increases the replacement level fertility rate.
4. Buffering against deleterious mutations is a function.
As part of his model, Graur argues that junk DNA is necessary in the human genome to buffer against deleterious mutations. By adopting this view, Graur has inadvertently identified function for junk DNA. In fact, he is not the first to argue along these lines. Biologist Claudiu Bandea has posited that high levels of junk DNA can make genomes resistant to the deleterious effects of transposon insertion events in the genome. If insertion events are random, then the offending DNA is much more likely to insert itself into “junk DNA” regions instead of coding and regulatory sequences, thus protecting information-harboring regions of the genome.
If the last decade of work in genomics has taught us anything, it is this: we are in our infancy when it comes to understanding the human genome. The more we learn about this amazingly complex biochemical system, the more elegant and sophisticated it becomes. Through this process of discovery, we continue to identify functional regions of the genome—DNA sequences long thought to be “junk.”
In short, the criticisms of the ENCODE project reflect a deep-seated commitment to the evolutionary paradigm and, bluntly, are at war with the experimental facts.
Bottom line: if the ENCODE results stand, it means that key aspects of the evolutionary paradigm can’t be correct.
Resources to Go Deeper
- “ENCODE Junks Evolutionary Concept” by Fazale Rana (article)
- “Responding to ENCODE ‘Skeptics’” by Fazale Rana (article)
- “Do ENCODE Skeptics Protest Too Much? Part 1 (of 3)” by Fazale Rana (article)
- “Do ENCODE Skeptics Protest Too Much? Part 2 (of 3)” by Fazale Rana (article)
- “Do ENCODE Skeptics Protest Too Much? Part 3 (of 3)” by Fazale Rana (article)
- “Do Scientists Accept the Results of the ENCODE Project?” by Fazale Rana (article)
- “Protein-Binding Sites ENCODEd into the Design of the Human Genome” by Fazale Rana (article)
- “Is Most of Our DNA Garbage?” by Fazale Rana (podcast)
- “Does the Evolutionary Paradigm Stymie Scientific Advance?” by Fazale Rana (article)
- “New Research Suggests Two Overlooked Functions of Junk DNA” by Fazale Rana (article)
- Who Was Adam? A Creation Model Approach to the Origin of Humanity by Fazale Rana with Hugh Ross (book)
- Dan Graur, “An Upper Limit on the Functional Fraction of the Human Genome,” Genome Biology and Evolution 9 (July 2017): 1880–85, doi:10.1093/gbe/evx121.