Several recently published scientific papers strongly critique the results of the ENCODE Project’s second phase—which reveal that at least 80 percent of the human genome is functional. In this installment of a three-part response to those critiques, I examine two of the most significant challenges: (1) The ENCODE project used a faulty definition of function; and (2) The results of the ENCODE project are absurd in light of the evolutionary paradigm.
The Lady doth protest too much, methinks.
Queen Gertrude
Hamlet, Act III, scene ii
Today when someone “protests too much,” it can create a suspicion that the protested viewpoint or opinion may actually be true, or at least have legitimate merit. In Shakespeare’s day, however, the word “protest” was used in the same way that we use the word “promise.” So when Queen Gertrude observed that “the Lady doth protest too much, methinks,” she was really saying that “the Lady” was promising more than she could deliver.
A number of scientists are arguing that the ENCODE Project promises far too much to creationists and intelligent design proponents. The project was much lauded by creationists and the Intelligent Design Movement for its September 2012 announcement that, at minimum, 80 percent of the human genome consists of functional elements,1 and both groups pointed to this result as a challenge to one of the best arguments for an evolutionary origin of humanity.2 However, a careful examination of the critiques of ENCODE suggests that these skeptics are so passionate in their criticism that the opposite may be true: namely, the ENCODE Project’s conclusion is actually justified.
In part 1 of this series, I listed what I think to be the most significant criticisms published by ENCODE skeptics. Here in part 2, I will describe and respond to two of the most serious objections: (1) The ENCODE Project used a faulty definition of function; and (2) the project results are absurd in light of the evolutionary paradigm.
How Do Biologists Define Function?
Defining function in biological systems is far from straightforward. In many respects, the definition depends on philosophical considerations as much as anything else. In a published critique of ENCODE, University of Houston biology professor Dan Graur and his coauthors suggest defining functional elements in genomes as either selected effect or causal role. In their critique, researchers Deng-Ke Niu and Li Jiang highlight three ways to define biological function: (1) selected effect, (2) sequences correlated with disease, and (3) essential sequences determined by knockout experiments. As part of his assessment of ENCODE, biochemist W. Ford Doolittle identified three distinct definitions of function: selected effect, regions deemed essential as determined by ablation (knockout experiments), and the mere existence of sequences in the genome.
In all three cases, the research teams argue that the ENCODE Project employed a faulty definition for biological function. These skeptics maintain that if the ENCODE scientists had used an appropriate definition, then they would have discovered that only 5 to 10 percent of the human genome is functional, not 80 percent.
Yet, the ENCODE skeptics can’t agree on the best way to define biochemical utility or to determine which genetic sequences are functional. Moreover, the three research teams disagree on which definition the ENCODE team employed when assigning function to the human genome’s sequences.
Definitions of Function
For the purposes of this article, it is worth briefly examining the different definitions for biochemical function that are in play.
- Selected effect: According to Doolittle, “the functions of a trait or feature are all and only those effects of its presence for which it was under positive natural selection in the (recent) past for which it is under (at least) purifying selection now. They are why the trait or feature is there today and possibly why it was originally formed.”3 In other words, sequences in genomes can be deemed functional only if they arose through evolutionary processes to perform a particular function. Once evolved, these sequences, if they are functional, will resist evolutionary change (due to the effects of natural selection) because any alteration would compromise the function of the sequence and endanger the organism. If deleterious, the sequence variations would be eliminated from the population due to the reduced survivability and reproductive success of organisms possessing those variants. Hence, functional sequences are those under the effects of selection.
- Sequences associated with diseases: Niu and Jiang point out that one way to determine function is if variations in the sequence are associated with a disease. The idea is that if a sequence alteration results in a genetic disorder, then the sequence must have some utility.
- Essential sequences determined by ablation: Genetic ablation, or knockout experiments, could be useful for identifying functional sequences in genomes. According to this thought, if an organism can tolerate the disabling or removal of a particular DNA sequence within its genome, then this sequence must not be functional. Conversely, if deactivation or elimination of a specific DNA sequence leads to the organism’s death, then that sequence must be essential and, therefore, functional.
- Existence of sequences in genomes: Doolittle points out that the mere presence of a sequence or process associated with genomes could be taken as evidence for their function. In other words, if the DNA sequence is found in the genome, it must be there for some reason. Doolittle explains, “Because a region is transcribed, its transcript must have some fitness benefit, however remote.”4
- Causal definition: According to Graur’s team, “for a trait, Q, to have a ‘causal role’ function, G, it is necessary and sufficient that Q performs G.”5 In other words, the causal definition ascribes function to sequences that play some observationally or experimentally determined role in genome structure and/or function.
Graur and his team prefer the selected effect definition. They write that “only sequences that can be shown to be under selection can be claimed with any degrees of confidence to be functional.”6 Doolittle also prefers this definition. These ENCODE skeptics argue that the selected effect definition is the only one that fits naturally into the context of the evolutionary paradigm. Graur’s group believes “most biologists use the selected effect concept of function, following the Dobzhanskyan dictum according to which biological sense can only be derived from evolutionary context.”7
But Graur’s team and Doolittle readily acknowledge that it can be difficult to determine which sequences in a genome are under selection. Niu and Jiang appear sympathetic to the selected effect definition, too, but they point out that some functional regions of genomes aren’t under selection and yet remain critical. That is, Niu and Jiang believe the selected effect definition underdetermines functional regions of the genome. They prefer to define functional regions through ablation. This definition is not on the radar screen for Graur’s team; Doolittle dismisses it outright because he sees it as being equivalent, in essence, to a causal definition. Based on how I understand their arguments, Niu and Jiang would disagree with Doolittle. They would argue that ablation serves as a proxy for natural selection.
How Did the ENCODE Project Define Function?
All of these critics reacted strongly to the way the ENCODE Project assigned function to sequences in the human genome. Graur and his team accuse the ENCODE researchers of adopting a “strong version” of causal function. Doolittle, however, argues that ENCODE used the “mere existence” definition of function. Niu and Jiang don’t specify how they believe the ENCODE Project defined biological function, but from reading their paper, I get the impression they would most likely agree with Graur’s group.
How, then, did the ENCODE Project define function? I don’t think Doolittle is correct in his assessment. It seems to me that the ENCODE Project did more than assign function to sequences based on their mere existence in the human genome. I would maintain that the ENCODE Project employed a causal definition of function. The ENCODE Project focused on experimentally determining which sequences in the human genome displayed biochemical activity using assays that measured:
- transcription,
- binding of transcription factors to DNA,
- histone binding to DNA,
- DNA binding by modified histones,
- DNA methylation, and
- three-dimensional interactions between enhancer sequences and genes.
The implied assumption is that if a sequence is involved in any of these processes, all of which play well-established roles in gene regulation, then that sequence must have functional utility. To use Graur’s lingo: sequence Q performs function G; therefore, sequence Q is functional.
Is There Anything Wrong with the Way the ENCODE Project Defined Function?
So what’s wrong with the causal definition of function? From my vantage point: nothing. Biochemists typically determine function using this definition. Even Doolittle acknowledges this point. He states,