Greetings, to borrow a term from a friend of ours.
Given the scope of this debate, my opening is going to be a bit lengthy (though hopefully not too rambling), and is going to cover a lot of ground that most of you will be familiar with. I don't intend simply to present evidence that evolution is a fact, although plenty of appropriate evidence will be presented herein. Because of the circumstances that instigated this discussion, it's my intention to clearly define what evolution is and what evolutionary theory actually postulates, as opposed to the caricatures of evolution we're used to seeing from certain quarters, and to show that what evolutionary theory postulates has, by and large, been observed occurring.
The best place to start is, I suspect, to distinguish between evolution and the theory of evolution.
Evolution is a process of population resampling. Specifically, it's a process of gene flow at all levels of the biosphere. In the technical literature, this is often reduced to its most basic component, namely variation in the frequencies of alleles, where an allele is a specific iteration of a given gene, and where a gene can broadly be defined as any stretch or sequence of DNA of specific interest*. The theory of evolution is the over-arching explanatory framework encompassing all the facts, laws and hypotheses pertaining to evolution, the observed process. It's a fully predictive and quantitative theory, and far and away the best supported theory in all of science. It always amazes me that it's talked about as a theory in crisis, despite being better supported than the theories that underpin the technological world. I know physicists who'd sell their grandmothers for the same sort of support for their fields.
Returning to evolution as a process of gene flow, we can look at alleles in different species† that code for the same protein. Here, I'll use an example provided by my friend Calilasseia, presented to expose a particularly pernicious fallacy, the fallacy of one true sequence, elucidated HERE.
The important portion of the text for our purpose here is the section detailing the genetic sequence that codes for the production of insulin in humans and lowland gorillas respectively. I reproduce it here with Cali's kind permission:
Calilasseia wrote: [1] Human insulin gene on Chromosome 11, which is as follows:
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc att gtg gaa caa tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
which codes for the following protein sequence (using the standard single letter mnemonics for individual amino acids, which I have colour coded to match the colour coding in this diagram of the insulin synthesis pathway in humans):
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT
RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR
GIVEQCCTSICSLYQLENYCN
Now, I refer everyone to this data, which is the coding sequence for insulin in the Lowland Gorilla (differences are highlighted in red):
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gcg gcc ttt gtg aac caa cac ctg tgc ggc tcc cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc atc gtg gaa cag tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
this codes for the protein sequence:
MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPKT
RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR
GIVEQCCTSICSLYQLENYCN
which so happens to be the same precursor protein.
This highlights how a single gene can come in different variations. Each of these individual variations is defined as an allele. Same gene, different version, same outcome. This will be important later, and we'll be returning to this.
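For anyone who would rather check this than take it on trust, here is a minimal sketch in Python. It translates the two sequences exactly as quoted above using the standard genetic code and lists the codons that differ. The names (CODON_TABLE, translate, codon_differences) are purely illustrative and not from Cali's post, and the sketch assumes the sequences were transcribed here without error.

```python
# Standard genetic code, laid out in the conventional t/c/a/g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding sequences copied from the quoted post above.
HUMAN = """
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gca gcc ttt gtg aac caa cac ctg tgc ggc tca cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc att gtg gaa caa tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
"""
GORILLA = """
atg gcc ctg tgg atg cgc ctc ctg ccc ctg ctg gcg ctg ctg gcc ctc tgg gga cct gac
cca gcc gcg gcc ttt gtg aac caa cac ctg tgc ggc tcc cac ctg gtg gaa gct ctc tac
cta gtg tgc ggg gaa cga ggc ttc ttc tac aca ccc aag acc cgc cgg gag gca gag gac
ctg cag gtg ggg cag gtg gag ctg ggc ggg ggc cct ggt gca ggc agc ctg cag ccc ttg
gcc ctg gag ggg tcc ctg cag aag cgt ggc atc gtg gaa cag tgc tgt acc agc atc tgc
tcc ctc tac cag ctg gag aac tac tgc aac tag
"""


def translate(dna):
    """Translate whitespace-separated codons, stopping at the first stop codon."""
    protein = []
    for codon in dna.split():
        amino = CODON_TABLE[codon]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


def codon_differences(seq_a, seq_b):
    """List (codon index, codon_a, codon_b, differing positions) for each mismatch."""
    diffs = []
    for index, (ca, cb) in enumerate(zip(seq_a.split(), seq_b.split())):
        if ca != cb:
            positions = [p for p in range(3) if ca[p] != cb[p]]
            diffs.append((index, ca, cb, positions))
    return diffs


human_protein = translate(HUMAN)
gorilla_protein = translate(GORILLA)
differences = codon_differences(HUMAN, GORILLA)

print("Same protein:", human_protein == gorilla_protein)
for index, ca, cb, positions in differences:
    # Position 2 is the third letter of the codon (counting from zero).
    print(index, ca, cb, CODON_TABLE[ca], CODON_TABLE[cb], positions)
```

Running it should report that the two proteins are identical and that every differing codon differs only in its third letter.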
All of the above can be independently verified simply by following the links given to the specific genomes of the organisms in question, which are held in an online database containing a plethora of sequenced genes and genomes.
An important point to note here is that ALL of the genomes recorded in that database represent a single organism's sequence and, in most cases, those sequences will vary even within a species. No two humans have exactly the same genome (even identical twins), or DNA-based paternity tests would be worthless. To map the entire human genome with complete accuracy, for example, would involve extracting the DNA of every living human and running the same sequencing process for all of them, and even then you would only get a kind of RMS of the genome, because there simply isn't a single human genome. It's not improbable beyond the limit of probabilistic resources that some human somewhere carries exactly the same insulin-coding allele as the lowland gorilla. This will also have some import later when dealing with macroevolution.
It's also worth noting precisely what the difference is in those highlighted examples. Each grouping of three letters is called a triplet codon (or triplet or codon). The letters themselves are place-holders for the nucleobases they represent, namely adenine (a), cytosine (c), guanine (g) and thymine (t) respectively. This is the genetic code, and by this, I mean this substitution of the chemicals for their initial letters, and the language we've formulated to describe it (more accurately, it's a cipher). The code is not the chemicals themselves. As detailed in my signature, DNA is a code in precisely the same way that London is a map. Much confusion arises when the map is falsely conflated with the terrain.
These bases always pair in the same way. Wherever (a) appears on one side of the helix, (t) will always appear opposite. The same is true of (c) and (g). These are the Watson-Crick base pairs, and they work by specific hydrogen bonds, which hold the two strands of the helix together.
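The pairing rule is simple enough to write down directly. This is a minimal sketch, with PAIRS and opposite_strand being illustrative names of my own choosing; it just reads off, position by position, which base sits opposite each letter of a strand.

```python
# Watson-Crick pairing: a with t, c with g.
PAIRS = {"a": "t", "t": "a", "c": "g", "g": "c"}


def opposite_strand(strand):
    """Return the base found opposite each position of the given strand."""
    return "".join(PAIRS[base] for base in strand)


print(opposite_strand("atggcc"))
```

Applying the function twice returns the original strand, which is the sense in which either strand fully determines the other.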
You'll note that each difference in the above comparison occurs in the third letter in the codon, which also has extremely important implications, both in fidelity, and in how some specific mutations impact the genome. We'll return to those shortly.
In the very simplest scenario, we have two sexually reproducing organisms who copulate and give birth to offspring. A certain monk cloistered in the city of Brno, in what is now the Czech Republic (a beautiful old city which I had the pleasure of visiting some years ago) elucidated for us the principles of inheritance with some meticulous experiments in pea plants. His name was Gregor Mendel, and he provided the piece that Darwin was missing, namely how inheritance actually works. It's unclear whether Darwin encountered Mendel's work, but Mendel was certainly familiar with Darwin's.
In any event, what Mendel demonstrated was that the popular view, that traits from parents were blended in offspring, a view that Darwin held to, was wrong. This could all have turned out quite differently, not least because, when Mendel went to university in Vienna, his primary studies were in physics.
Mendel returned to the abbey at Brno after university, and this was where he conducted his famous pea experiments. He obtained some 30 or so varieties of pea that showed discontinuous characteristics, which is to say that the offspring of any given pairing exhibited traits that were not in the parent varieties. After much careful experimentation, including blending of 'child' varieties, he determined that some traits could skip generations, only to appear later in future generations [1]. By this mechanism, Mendel determined what are properly termed 'laws of evolution'. Specifically, the laws of segregation and independent assortment. The law of segregation tells us that each offspring receives two alleles for a given trait, one (only) from each of its parents, which has some interesting consequences when the same allele is inherited from both parents, a topic I will be exploring later in this post. The law of independent assortment tells us that alleles for different characteristics are passed on independently of one another.
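The law of segregation, and the way a trait can skip a generation, can be sketched in a few lines of Python. This is an illustrative toy rather than Mendel's own data: the allele labels 'A' (assumed dominant) and 'a' (assumed recessive) and the function names are assumptions made for the example.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Enumerate the equally likely offspring genotypes: one allele from each parent."""
    return Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))


def phenotype(genotype):
    """Upper-case allele is assumed dominant: one copy is enough to show the trait."""
    return "dominant" if any(allele.isupper() for allele in genotype) else "recessive"


def phenotype_counts(offspring):
    """Tally phenotypes across a Counter of genotypes."""
    counts = Counter()
    for genotype, number in offspring.items():
        counts[phenotype(genotype)] += number
    return counts


f1 = cross("AA", "aa")  # two pure-breeding parent varieties
f2 = cross("Aa", "Aa")  # crossing two of their offspring

print(f1, phenotype_counts(f1))
print(f2, phenotype_counts(f2))
```

In the first generation every offspring carries one copy of each allele and shows only the dominant trait. In the second, the recessive trait reappears in one of the four equally likely combinations, which is the skipped generation described above.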
One of my favourite questions for those who deny the reality of evolution is this:
Do you look more like your father or your mother?
I hope that some of the above highlights why I think this is a particularly revealing question, as it goes to the very heart of inheritance, which is one of the major mechanisms for evolution. As such, this question is the first piece of evidence I present that evolution is indeed a fact. I'll extend the question for the purpose of this discussion and ask directly:
Bernhard, do you look more like your father or your mother? Do people tell you that you have your mother's eyes, or your father's nose? The simple fact is that, in one way or another, you will exhibit features of both, while your siblings, if you have any, will exhibit different features of each, or even the same features, but with tiny variations that allow people to distinguish between you. I assume, if you have siblings, that you don't all look exactly alike. That's evolution, right in front of you.
In a nutshell, every time that two organisms reproduce, their alleles are mixed, and the result is a unique blend of genetic material, even when the same organisms produce multiple offspring. This mixing in and of itself constitutes a variation in the frequency of alleles in a population, and is properly termed evolution. This is observed and incontrovertible.
Moving on, we should touch on the mechanisms of evolution. The first is mutation, which is a permanent change in the DNA sequence. Mutations can be broadly classified in two ways: germline (or hereditary) mutations, and somatic mutations. Germline mutations occur in all (or almost all) of the body's cells and are inherited (or occur during meiosis), while somatic mutations are acquired during one's life and occur only in some cells. These are not passed on, unless they occur in germline cells (eggs and sperm). These are termed de novo (new) mutations [2]. Somatic mutations are generally the result of copying errors in DNA replication during cell division, or occur via radiation or chemical mutagens, although it's generally less than straightforward to determine precisely when a de novo mutation occurred. De novo genes represent the 'new information' that some insist doesn't arise [3]. Any de novo mutation that occurs in more than roughly 1% of a population of organisms is termed a 'polymorphism'. These are generally harmless (although they can be a factor in propensity to certain disorders) and they account for such things as hair and eye colour, etc. These broad classes are further subdivided, but that's beyond the scope of this discussion for the time being.
In the gene sequences compared above, as already mentioned, the differences between the genes coding for the production of insulin in humans and lowland gorillas respectively appear in the third letter in each of those codons. This is important, because the third base of a triplet is frequently redundant: most amino acids are coded for by several codons that differ only in that position, which means that, if a mutation occurs there, it will often have no impact on how the gene expresses (though not always; atg, for instance, is the only codon for methionine). We can readily see this in the fact that, despite those differences, the insulin precursor protein produced is identical in both species.
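To get a feel for how far this third-letter redundancy goes, here is a rough sketch that walks the standard genetic code and tallies, for each position in the codon, what fraction of single-base changes leave the amino acid untouched. The function name silent_fraction is my own, and stop codons are left out as starting points, which is an arbitrary choice for the illustration.

```python
# Standard genetic code, laid out in the conventional t/c/a/g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def silent_fraction(position):
    """Fraction of single-base changes at `position` (0, 1 or 2) that keep the
    same amino acid, taken over all codons that are not stop codons."""
    silent = total = 0
    for codon, amino in CODON_TABLE.items():
        if amino == "*":
            continue
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            silent += CODON_TABLE[mutant] == amino
    return silent / total


for position in range(3):
    print("letter", position + 1, round(silent_fraction(position), 3))
```

The tally should show that changes in the third letter are silent far more often than changes in the first or second, while still falling short of always.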
Mutations can happen in slightly different ways as well. For example, we could have the sequence
This sequence can not only suffer a direct mutation in any of those individual bases, but it can also suffer from an insertion or a deletion, which in turn will 'frameshift' the codons. A deletion might leave it looking like
or an insertion might leave it like
etc., either of which, of course, changes the protein, as the first two bases of each subsequent codon are affected. Frame-shifts are often involved in genetic disorders. Some of the mutations that cause cystic fibrosis, for instance, are frame-shifts.
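Frame-shifts are easier to see than to describe, so here is a small sketch. The starting sequence is simply the first six codons of the human insulin sequence quoted earlier, chosen purely for illustration, and the particular deletion, insertion and substitution are hypothetical examples rather than known mutations. The helper names are likewise assumptions.

```python
# Standard genetic code, laid out in the conventional t/c/a/g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Read an unspaced sequence three letters at a time from the first base,
    ignoring any incomplete codon at the end and stopping at a stop codon."""
    protein = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        amino = CODON_TABLE[dna[start:start + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


original = "atggccctgtggatgcgc"                   # atg gcc ctg tgg atg cgc
substituted = original[:5] + "a" + original[6:]   # third letter of codon 2: gcc -> gca
deleted = original[:3] + original[4:]             # one base removed after codon 1
inserted = original[:3] + "a" + original[3:]      # one base added after codon 1

for label, sequence in [("original", original), ("substitution", substituted),
                        ("deletion", deleted), ("insertion", inserted)]:
    print(label, sequence, translate(sequence))
```

The third-letter substitution leaves the protein as it was, while the single-base deletion and insertion each scramble every codon downstream of the change.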
The second mechanism is selection. Many are of the opinion that natural selection is the key driving force in evolution, although opinions are divided on this, and in fact some are of the opinion that separating mechanisms is folly in itself, a point I will return to shortly. You'll also note that I initially said simply 'selection'. This will also come up again soon.
To treat selection properly, and specifically to treat the 'random/non-random' dichotomy, it's important to first look at the levels at which selection operates.
Natural selection has to be looked at in two ways to be fully appreciated. The first is from the perspective of the population, at which level the effects of selection are seen. At this level, NS is most definitely not random, where 'random' means specifically 'statistically independent', because it can be probabilistically quantified. At this level, we see that, on average, advantageous alleles are selected for, in the form of being passed on to future generations with a statistical weighting. We also see that, on average, deleterious alleles are selected against, in the form of not being passed on to future generations, again with a statistical weighting.
The second way to look at NS is from the perspective of the individual organism, at which level selection actually operates. From this perspective, NS is random. The particular selection pressure that an individual organism will succumb to or indeed evade, is statistically independent, thus random. The organism with an allele that allows it to evade a particular selection pressure has statistical significance, but the means of checking out without issue are many and diverse, and which particular pressure said individual will fall prey to (pardon the pun) can only be treated in the broadest of terms (and indeed even an organism carrying an advantageous allele that confers a specific advantage can still fall prey to the selection pressure that the allele confers the advantage against, while an organism not carrying the allele evades it).
It's also worth a brief digression on what is meant by advantageous and deleterious here, I think, because this is often misunderstood. Whether or not a given allele or trait confers an advantage or disadvantage is primarily a function of environment. Very few mutations are deleterious in and of themselves; most are deleterious only in the context of the environment in which they appear (I should note for completeness that the genome constitutes part of the environment for a gene, and in the case of those few alleles that are inherently deleterious, this is the environment we would be looking at). So, for example, a mutation that conferred a disadvantage in swimming will not be strongly deleterious for, say, a camel, but it might be a problem for a tuna. Yes, that's a little glib, but it illustrates the point well enough. To make the point more explicit, though, let's return to the law of segregation and its implications for some mutations.
There is one particular allele that is prevalent in humans, particularly Africans. I'm talking, of course, about the sickle gene. This is a nasty little beastie, which results in a single amino acid difference in hemoglobin, but thankfully it only expresses under certain conditions. If you inherit the sickle allele from only one of your parents, it remains recessive, and no anaemia will result. However, if you inherit a copy from each of your parents, anaemia is the result (this is a loose treatment of it, of course, and it isn't necessarily the case that anaemia results from having two copies, but it certainly appears with a significant statistical weighting). Under a naïve treatment of natural selection, that should be sufficient for the allele to be weeded out of the population, except for two things. The first is that anaemia only kills before reproductive age in a small percentage of the population. The second, and this is the bit that's important here, is that a single copy of the allele confers a distinct advantage in terms of resisting malaria. These two facts are the reason that natural selection hasn't eliminated it, and they also illustrate just what the connection is between alleles, selection and environment [4].
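The way an allele like this can persist is easy to sketch with the textbook one-gene selection model. The fitness costs below are invented for illustration and are not measured values: s is the assumed cost of carrying no copies where malaria is present, and t is the assumed cost of carrying two copies. The function names are also my own.

```python
def next_frequency(q, s, t):
    """Advance the sickle-allele frequency q by one generation of selection.

    Assumed relative fitnesses: no copies 1 - s, one copy 1, two copies 1 - t.
    Mating is assumed random, so genotypes appear in proportions p*p, 2*p*q, q*q.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * q + q * q * (1 - t)) / mean_fitness


def long_run_frequency(q, s, t, generations=5000):
    """Iterate the model for many generations and return the final frequency."""
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


# With malaria present (s > 0) the allele settles at an intermediate frequency,
# which in this model is s / (s + t), wherever it starts from.
print(long_run_frequency(0.01, s=0.1, t=0.8))
print(long_run_frequency(0.50, s=0.1, t=0.8))

# With no malaria (s = 0) the same allele dwindles towards zero.
print(long_run_frequency(0.10, s=0.0, t=0.8))
```

The same allele is maintained in one environment and slowly removed in another, with nothing changed except the environment.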
The third mechanism is genetic drift, which occurs via the random†† mixing of alleles during reproduction. Broadly speaking, in any population of size x, there are 2x copies of every gene (remember that you get one copy from each of your parents). Of those, there will be a percentage that are of a specific type, or allele. In succeeding generations, that percentage will vary randomly about a mean. The effects of drift are most strongly felt where populations are small, as these random variations will tend to cancel out in larger populations. Where drift results in an allele going extinct, that allele will stay extinct unless it arises again via mutation. Where drift results in a given allele appearing in the entire population, it is said to have fixed. I'll be coming back to this again shortly. I should note that, even without selection, evolution will occur in a population as long as there is mutation and drift (indeed, genetic drift on its own IS evolution, as I detailed above, because it constitutes a variation in allele frequency) [5].
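Drift is straightforward to simulate. The sketch below uses the simplest textbook set-up (often called the Wright-Fisher model), which is my choice for illustration and not something taken from the reference: each generation, all 2x gene copies are drawn at random from the previous generation's pool, with no selection and no mutation. The function name and parameters are assumptions.

```python
import random


def drift(population_size, start_frequency, generations, seed=None):
    """Track one allele's frequency under pure drift.

    Each generation, every one of the 2 * population_size gene copies is drawn
    independently, and is the tracked allele with probability equal to that
    allele's frequency in the previous generation.
    """
    rng = random.Random(seed)
    copies = 2 * population_size
    count = round(start_frequency * copies)
    history = [count / copies]
    for _ in range(generations):
        frequency = count / copies
        count = sum(rng.random() < frequency for _ in range(copies))
        history.append(count / copies)
    return history


small = drift(population_size=10, start_frequency=0.5, generations=2000, seed=1)
large = drift(population_size=2000, start_frequency=0.5, generations=50, seed=1)

print("small population, final frequency:", small[-1])
print("large population, range over 50 generations:", min(large), max(large))
```

In the small population the allele should end up either lost (frequency 0) or fixed (frequency 1), and once there it stays there, since there is no mutation in this model to bring the other variant back. In the large population the frequency typically wanders only a little over the same sort of timescale.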
As I mentioned above, there are advocates of both the position that natural selection is the primary driver of evolution, and that genetic drift is the primary driver of evolution (this is known as the neutralist-selectionist debate, largely driven by the work of Motoo Kimura, whose neutral theory of molecular evolution asserted that molecular evolution is driven by drift, while phenotypic evolution (large scale morphological change) is driven by natural selection). My own position, garnered over much discussion with an extremely knowledgeable palaeobotanist friend of ours, is that separating them is a mistake, as both are facets of a single mechanism, namely population resampling.
Much of what we've been dealing with so far has been what we would call 'microevolution' which, as I understand it, my opponent doesn't dispute. What he does dispute is 'macroevolution' (although I'm not entirely sure he understands what we mean by it in the literature), and I'm going to give the remainder of my opening over to that.
In the very simplest of terms, microevolution is variation in frequencies of alleles that occurs below species level, or within a population. Much of genetic drift, for example, would constitute microevolution (although there are some exceptions). Macroevolution, on the other hand, is variation in the frequencies of alleles at or above species level. This gives us one crystal clear example of a macroevolutionary process, namely fixation via genetic drift and selection. This is where all variants of a gene have been eliminated from a population except one. The variant alleles are now extinct, as detailed above, and the single remaining allele is fixed. Because this occurs at species level, it is properly termed a macroevolutionary process.
One nice observed example of this is detailed in a paper by Lee et al studying population data in Drosophila melanogaster and looking for instances of genetic hitchhiking, which is where an allele is strongly selected for and goes to fixation, and other nearby alleles 'hitch a ride', and basically survive by association [6].
We also have extinction as a properly macroevolutionary process, because it is a variation in the frequencies of alleles at species level, in which, bluntly, all the alleles in a species go from some to none. So another neat little question about observations of macroevolutionary processes is 'when was the last time you saw a live [insert extinct animal of choice]?' (I usually go with the thylacine or the dodo, not least because their extinctions were both observed).
To extend this example a little, we can also talk about a hypothetical (I did search for an example of this in the literature, but couldn't find one in the time I allocated to research for this post): a speciation and an extinction in one, namely an extinction event in the middle of a ring species.
Where you have a single species, that species can be a string of subspecies (loosely), each of which is interfertile with its immediate neighbours (and often several neighbours beyond), but the ends of which cannot interbreed with each other. This is a ring species. An obvious example is the Ensatina salamanders around California's Central Valley [7]. If an event, such as, say, a meteorite strike, wipes out a portion of the middle of the ring species, such that the closest surviving neighbours can no longer interbreed, you have a speciation event, because the nearest neighbours are now separate species, even though members of those same subspecies belonged to the same species before the event. This is an extinction that caused a speciation, and therefore a macroevolutionary event.
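The logic of that hypothetical can be sketched as a toy model. Everything here is an assumption made for illustration: subspecies are numbered by their position along the arc of the ring, two subspecies are taken to be interfertile if they sit no more than `reach` positions apart, and a species is taken to be any set of subspecies linked by an unbroken chain of interfertile neighbours. None of the numbers describe real salamanders.

```python
def species_groups(populations, reach):
    """Group subspecies into species.

    `populations` holds positions along the arc of the ring. Subspecies at most
    `reach` positions apart are assumed interfertile, and gene flow is assumed
    to pass along any unbroken chain of interfertile neighbours.
    """
    groups = []
    for position in sorted(populations):
        if groups and position - groups[-1][-1] <= reach:
            groups[-1].append(position)   # gene flow reaches this subspecies
        else:
            groups.append([position])     # gap too wide: a separate species
    return groups


ring = range(10)
survivors = [p for p in ring if p not in (4, 5, 6)]  # the hypothetical strike

print(species_groups(ring, reach=2))
print(species_groups(survivors, reach=2))
```

Before the strike the ends of the chain cannot interbreed directly, yet gene flow still links them into a single species. Once the middle is removed, the gap is wider than any pairing can bridge and the same subspecies fall into two species.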
I should also note here that there are various instances of alleles that are shared between multiple species. Indeed, when we say that chimps and humans share a massively high percentage of their DNA, we're actually saying that there are many, many instances in which humans and chimps carry exactly the same alleles (and indeed multiple shared allelic variations of individual genes across populations). The study of these genes is properly termed an area of macroevolutionary study. This is, of course, what I was alluding to earlier when talking about some human somewhere carrying exactly the same insulin-coding allele as that shown above for the lowland gorilla.
And finally, to round this opening off, I'm going to return to one of my favourite examples of speciation, namely a replicated speciation event via hybridisation in Heliconius butterflies [8]. In this paper, the authors noted that one species, Heliconius heurippa, had wing markings intermediate between those of two other species, Heliconius melpomene and Heliconius cydno. Since I feel this is the final slam-dunk, and because I've typed over 3,500 words in this post, I'm simply going to cite the reference and hand you to my good friend, the fearsome Blue Butterfly, and his wonderful and enthusiastic presentation of the paper:
There is no more thunderous prescient of doom than the flutter of tiny wings – hackenslash.
TL;DR version: Evolution is a fact.
______________________________________________________________________________________________
* Gene is a term that has varying definitions across different disciplines in evolutionary biology, generally context-sensitive. For example, in population mechanics, a gene can be something even as loose as 'is a descendant of Henry VIII' or 'can survive a bolide impact'. More broadly, a gene is any unit of heredity. For clarity, unless otherwise specified, I will be referring to a stretch of DNA of specific interest.
† All uses of the word 'species' will, unless otherwise specified, refer to the biological species concept (BSC), which is defined as a population of organisms throughout which gene flow naturally occurs at a given time.
†† Random here means 'statistically independent', which is to say that any result is as probable as any other.
Refs:
[1] Versuche über Pflanzenhybriden – Mendel 1866 (translated into English by Druery and Bateson 1901 – Journal of the Royal Horticultural Society).
[2] http://ghr.nlm.nih.gov/handbook/mutatio ... nemutation
[3] For a detailed treatment of 'information' and the various canards surrounding it, see: hackenslash on information
[4] http://www.cdc.gov/malaria/about/biolog ... _cell.html
[5] http://www.sciencedirect.com/science/ar ... 2211008827
[6] Differential Strengths of Positive Selection Revealed by Hitchhiking Effects at Small Physical Scales in Drosophila melanogaster – Lee et al 2014
[7] Incipient species formation in salamanders of the Ensatina complex – Wake 1997
[8] Speciation By Hybridisation In Heliconius Butterflies – Mavarez et al 2006 (full paper downloadable; link in Cali's linked post).