It is an exciting time to be working on Heliconius butterflies, as new sequencing technologies allow us to get closer and closer to the genes underlying colour pattern variation. We’re going to start featuring new Heliconius papers on the blog, to highlight new findings and put them into context. In this post, we’re going to look at a new paper in Genome Research that describes the region of the genome that produces red patterns in Heliconius erato and reveals how this region varies between H. erato and its mimic, Heliconius melpomene [1].
Red bands and rays are one of the key features of Heliconius wing patterns and occur repeatedly in mimics across South America. For example, Heliconius melpomene and Heliconius erato mimic each other in many different countries, often sharing either bands or rays (see picture above [2]). For over a century, researchers have been trying to understand how this mimicry could evolve. One of the major parts of this puzzle is to discover the genes that produce the red patterns and how they vary between species.
It has been known for many years that these features are controlled by a single region of the genome, up to about one megabase long, known as the B locus in H. melpomene and the D locus in H. erato [3]. In recent years, we have been making great progress in describing and understanding this region. In 2010, the whole region was sequenced using bacterial artificial chromosomes (BACs) in H. melpomene and H. erato [4,5]; the region is around 700 kilobases long in H. melpomene and 1 megabase long in H. erato, and contains around 20 genes, which are not only the same genes in both species but are also conserved in the same order.
We now know that the optix gene, which is found within the B/D region, is expressed on the wing wherever red pattern elements are expressed across many different Heliconius species, and that genetic variations in optix are strongly associated with variations in red patterns in multiple species [6]. We also know that this gene has evolved to produce red patterns separately in H. melpomene and H. erato, converging on the same phenotype but differing genetically [2].
We now really want to know what’s happening across the whole B/D region. Are there any other genes strongly associated with red band patterns? What genetic variations in H. melpomene and H. erato exist in the wild? To answer these questions, we need to sequence the whole B/D region in many butterflies from each species. Unfortunately, until recently it has not been possible to do this. BAC sequencing of one copy of the whole B/D region was very laborious and expensive; it would be very difficult to sequence tens of butterflies this way. A short 800 bp fragment of optix was sequenced in many butterflies using polymerase chain reactions (PCRs) to associate optix with red patterns, but over 1000 similar experiments would be required to cover the whole region.
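The "over 1000" figure follows from simple arithmetic. Here is a rough sketch in Python, assuming non-overlapping 800 bp fragments and the approximate region sizes quoted earlier in this post. The function name is only illustrative, and real PCR tiling would need overlapping fragments, so these counts are lower bounds.

```python
import math

FRAGMENT_BP = 800  # length of the optix fragment sequenced by PCR

# Approximate B/D region sizes quoted earlier in this post
region_bp = {"H. melpomene": 700_000, "H. erato": 1_000_000}

def fragments_needed(region_length_bp, fragment_bp=FRAGMENT_BP):
    """Minimum number of non-overlapping fragments needed to tile a region."""
    return math.ceil(region_length_bp / fragment_bp)

for species, length in region_bp.items():
    print(species, fragments_needed(length))
# H. melpomene 875
# H. erato 1250
```

That is per butterfly: the count would be multiplied by the number of individuals sampled.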
With next generation sequencing, we can now sequence whole genomes of many butterflies and use these sequences to study particular regions of the genome like B/D. The new paper in Genome Research by Megan Supple and colleagues reports whole-genome short-read sequencing of 45 H. erato butterflies from four hybrid zones across South America. By taking many butterflies with rayed patterns and red banded (‘postman’) patterns, and looking for variations in the D locus sequences, we can see what’s going on across the whole region.
Next generation sequencing data is still fairly new and, as in many scientific projects, researchers often learn how to analyse their data as the project goes along. As Megan says, “I was just handed a large pile of data. At that point, I had no idea how to go about analyzing next-gen sequences. It was quite a learning process.”
It is not yet possible to sequence whole chromosomes in one go; current technologies can only produce short reads a few hundred bases long. The raw sequence data for the 45 H. erato genomes consists of hundreds of millions of 100 base pair (bp) sequence reads. The H. melpomene genome has been assembled, but the H. erato genome has not, so the first step of the analysis was to align the 100 bp reads to the existing BAC sequence of the H. erato D region. By comparing reads from different butterflies at the same locations, variations in the D locus could be identified. This is done with tools like the Genome Analysis Toolkit (GATK), a suite of software written to analyse human genome data sets such as the 1000 Genomes Project data. Because sequence data takes the same form whether it comes from a butterfly or a human, we can use the same software to study the evolution of mimicry.
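To give a feel for the idea of comparing aligned reads between butterflies, here is a toy sketch in Python. It is not how GATK works (real variant callers model sequencing error, mapping quality, diploid genotypes and much more), and the reads, names and functions are all made up for illustration. It assumes reads have already been aligned, so each read comes with the reference position where it starts.

```python
from collections import Counter, defaultdict

def pileup(reads):
    """Count the bases observed at each reference position.

    `reads` is a list of (start, sequence) pairs, where `start` is the
    0-based reference position that the first base of the read aligns to.
    Insertions and deletions are ignored in this toy version.
    """
    counts = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            counts[start + offset][base] += 1
    return counts

def consensus(reads):
    """Most common base at each covered position."""
    return {pos: c.most_common(1)[0][0] for pos, c in pileup(reads).items()}

def variable_sites(individuals):
    """Positions covered in every individual where the consensus bases differ."""
    calls = {name: consensus(reads) for name, reads in individuals.items()}
    shared = set.intersection(*(set(c) for c in calls.values()))
    return sorted(p for p in shared if len({c[p] for c in calls.values()}) > 1)

# Made-up reads aligned to an 8 bp stretch of reference
individuals = {
    "butterfly_1": [(0, "ACGTA"), (3, "TACGT")],
    "butterfly_2": [(0, "ACGTA"), (3, "TATGT")],
}
print(variable_sites(individuals))  # [5]
```

The two butterflies agree everywhere except position 5, where one carries a C and the other a T.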
Megan found the same strong association with optix shown in previous papers, but also found a very strong association in a 65 kb long region upstream of optix, which does not appear to contain any genes. This strongly suggests that the region contains regulatory sequences, which turn the genes involved in red patterning on and off. In the H. melpomene genome paper, this same region was shown to vary considerably between H. melpomene subspecies [7].
Megan again: “What surprised me the most was that analyzing H. erato and H. melpomene separately identified almost the same exact region. I did not expect that the boundaries identified in H. melpomene would almost perfectly coincide with the boundaries in H. erato. That indicates that we have hit the boundary of something important – we believe that those boundaries are close to important functional regions that are under strong divergent selection.”
So are these regions actually the same in both species? Apparently not. In H. erato, there are 76 single base variations with one variant perfectly associated with the postman pattern and the other associated with rays; in H. melpomene, there are 430 such variations. When the sequences of the whole regions are compared, the sequences from H. melpomene individuals with rayed patterns do not group together with sequences from H. erato rayed individuals, and H. melpomene postman individuals do not group with H. erato postman individuals.
This strongly indicates that, although the H. melpomene and H. erato butterflies look very similar, the genetic sequences producing the patterns are not the same, and H. melpomene has evolved the pattern independently from H. erato. Megan Supple: “It was really compelling to see how perfectly the sequences in this region cluster by pattern within each species, but always keep the two mimics separate.”
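The "perfectly associated" variations counted above are sites where every postman butterfly carries one variant and every rayed butterfly carries the other. A minimal sketch of that test in Python, using made-up data and hypothetical names, and treating each butterfly as a single sequence (real butterflies are diploid, so the actual analysis works with genotypes rather than single bases):

```python
def perfectly_associated_sites(sequences, patterns):
    """Sites where each pattern group is fixed for a different base.

    `sequences` maps an individual's name to its bases at a set of
    variable sites; `patterns` maps the same names to "postman" or "rayed".
    """
    groups = {}
    for name, seq in sequences.items():
        groups.setdefault(patterns[name], []).append(seq)
    postman, rayed = groups["postman"], groups["rayed"]

    hits = []
    for i in range(len(postman[0])):
        postman_bases = {seq[i] for seq in postman}
        rayed_bases = {seq[i] for seq in rayed}
        fixed_within = len(postman_bases) == 1 and len(rayed_bases) == 1
        if fixed_within and postman_bases != rayed_bases:
            hits.append(i)
    return hits

# Made-up data: four butterflies, four variable sites
sequences = {"p1": "ACGT", "p2": "ACGA", "r1": "TCGT", "r2": "TCAA"}
patterns = {"p1": "postman", "p2": "postman", "r1": "rayed", "r2": "rayed"}
print(perfectly_associated_sites(sequences, patterns))  # [0]
```

Only site 0 qualifies: the other sites either do not differ between the groups or vary within one of them.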
Now that the architecture of the B/D region is clear, and the full sequences and variations are available, we can search for the functional variants that produce the different red patterns. This will take some time, as there are 65,000 bases to search through, and confirming the function of any candidate sequences will require some difficult experimental work. But this new paper brings us very close to finally identifying these variants, and to providing a mechanistic explanation for the evolution of red pattern mimicry in Heliconius.
1. Supple M.A., Hines H.M., Dasmahapatra K.K., Lewis J.J., Nielsen D.M., Lavoie C., Ray D.A., Salazar C., McMillan W.O., Counterman B.A. (2013). Genomic architecture of adaptive colour pattern divergence and convergence in Heliconius butterflies. Genome Research, doi:10.1101/gr.150615.112.
2. Hines H.M., Counterman B.A., Papa R., de Moura P.A., Cardoso M.Z., Linares M., Mallet J., Reed R.D., Jiggins C.D., Kronforst M.R., McMillan W.O. (2011). Wing patterning gene redefines the mimetic history of Heliconius butterflies. Proceedings of the National Academy of Sciences 108(49):19666-19671.
3. Sheppard P.M., Turner J.R.G., Brown K.S., Benson W.W., Singer M.C. (1985). Genetics and the Evolution of Muellerian Mimicry in Heliconius Butterflies. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 308(1137):443-610.
4. Baxter S.W., Nadeau N.J., Maroja L.S., Wilkinson P., Counterman B.A., Dawson A., Beltran M., Perez-Espona S., Chamberlain N., Ferguson L., Clark R., Davidson C., Glithero R., Mallet J., McMillan W.O., Kronforst M., Joron M., ffrench-Constant R.H., Jiggins C.D. (2010). Genomic Hotspots for Adaptation: The Population Genetics of Muellerian Mimicry in the Heliconius melpomene Clade. PLoS Genetics 6(2):e1000794.
5. Counterman B.A., Araujo-Perez F., Hines H.M., Baxter S.W., Morrison C.M., Lindstrom D.P., Papa R., Ferguson L., Joron M., ffrench-Constant R.H., Smith C.P., Nielsen D.M., Chen R., Jiggins C.D., Reed R.D., Halder G., Mallet J., McMillan W.O. (2010). Genomic Hotspots for Adaptation: The Population Genetics of Muellerian Mimicry in Heliconius erato. PLoS Genetics 6(2):e1000796.
6. Reed R.D., Papa R., Martin A., Hines H.M., Counterman B.A., Pardo-Diaz C., Jiggins C.D., Chamberlain N.L., Kronforst M.R., Chen R., Halder G., Nijhout H.F., McMillan W.O. (2011). optix Drives the Repeated Convergent Evolution of Butterfly Wing Pattern Mimicry. Science 333:1137-1141.
7. Heliconius Genome Consortium (2012). Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature 487:94-98.