This post is written in response to a journal club blog post by the Ross-Ibarra lab on our recent paper in Genome Research, where we used RAD data to study parallel hybrid zones in Heliconius erato and H. melpomene. First, we’d like to say many thanks for your comments and discussion! Here are responses to a few of the main points:
We compared two methods of analysing the data in our paper: first, de novo assembly of the sequences, and second, alignment to the reference genome. You criticised this comparison as a ‘little unfair’, and we largely agree. We started out by using two ‘off-the-shelf’ solutions to this problem: the program Stacks for de novo assembly, and a well-established pipeline that we use in the lab for analysis of resequencing data, based on the read-mapper Stampy and the genotype caller GATK. What we found after this first round of analysis was that, much to our surprise, the read-mapping approach worked really well even when the reference genome was quite distant from the study species (i.e. when mapping H. erato reads to the H. melpomene genome). We therefore pursued this approach for the rest of the paper and did not invest more time in the Stacks-based approach. As highlighted in the post, this isn’t a very fair comparison, as there is currently no way of adding the paired-end data and calling SNPs in Stacks, so Stacks is effectively throwing away at least half the data. It should be possible to get around this, but it would require quite a bit of compute time and script-writing (we previously did something similar in an analysis of data for the diamondback moth; Baxter et al., PLoS One 6, e19315). We decided not to do this because the read-mapping approach worked well for both species. We nevertheless report both analyses throughout the paper, as the results reflect what can be achieved with relatively straightforward application of existing programs. We fully acknowledge that the de novo assembly approach could be taken much further, and we recommend that readers consider both approaches when analysing similar data sets.
An interesting question is whether these populations are differentiated at loci other than those involved in colour pattern. We suggested that some differentiated SNPs that were not associated with wing pattern traits might represent ecological differentiation, perhaps through adaptation to altitude. There is a major problem here in that colour pattern and ecology are strongly correlated, with one wing pattern form typically found at higher altitudes in each zone. However, we did have some power to separate these effects, because the correlation between colour pattern and altitude in the samples wasn’t perfect (some of the phenotypically “pure” individuals were sampled from closer to the centre of the hybrid zone), so we did try to test for associations with altitude. We didn’t find evidence for genomic regions uniquely associated with altitude, but some loci showed high Fst without any colour pattern association, and these might reflect ecological differences. However, as you correctly pointed out, this could also be due to small sample sizes or to colour pattern phenotypes we didn’t measure. Future work will need much larger sample sizes, but in general we were surprised and pleased by how much information could be extracted from these relatively small samples (especially after some of the individuals dropped out because they turned out to belong to a cryptic species!).
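As a rough illustration of what we mean by loci with "high Fst but not colour pattern associations", here is a minimal Python sketch. It is not the code or the estimator used in the paper: it assumes Hudson's per-SNP Fst estimator computed from allele frequencies, and the function names, SNP records, thresholds and p-values are all hypothetical.

```python
def hudson_fst(p1, n1, p2, n2):
    """Per-SNP Hudson Fst from allele frequencies p1, p2 and the number of
    sampled chromosomes n1, n2 in each population (n1, n2 must be > 1).
    Returns None when the SNP is monomorphic across both samples."""
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    if den == 0:
        return None
    return num / den


def candidate_ecological_loci(snps, fst_cutoff=0.5, assoc_alpha=0.05):
    """Return IDs of SNPs with high Fst but no significant colour pattern
    association. Both cutoffs are arbitrary, illustrative choices."""
    hits = []
    for snp in snps:
        fst = hudson_fst(snp["p1"], snp["n1"], snp["p2"], snp["n2"])
        if fst is not None and fst >= fst_cutoff and snp["colour_p"] > assoc_alpha:
            hits.append(snp["id"])
    return hits


# Hypothetical SNPs: allele frequencies in the two wing pattern races and a
# p-value from a colour pattern association test.
example_snps = [
    {"id": "snpA", "p1": 1.0, "n1": 20, "p2": 0.0, "n2": 20, "colour_p": 1e-6},
    {"id": "snpB", "p1": 0.9, "n1": 20, "p2": 0.1, "n2": 20, "colour_p": 0.40},
    {"id": "snpC", "p1": 0.5, "n1": 20, "p2": 0.5, "n2": 20, "colour_p": 0.80},
]
candidates = candidate_ecological_loci(example_snps)
```

With samples this small the sampling-error terms in the numerator are not negligible, which is one reason a handful of high-Fst SNPs on their own are weak evidence for ecological differentiation.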
In response to your comment about divergence times, we completely agree that our data are not well designed to estimate relative divergence times for the two species (mainly because we carried out separate phylogenetic analyses in the two species). Our discussion of co-divergence was mainly focussed on the order of divergence of the co-mimetic subspecies in the phylogeographic tree, which did strike us as surprisingly similar in the two species. We are not entirely sure how to interpret that, though, and as you state, it may not tell us much about the history of Müllerian mimicry.
On a similar note, it is indeed surprising how genetically similar these races are to one another, despite their huge differences in appearance. If populations are compared from further afield, such as Brazil, there is more genetic differentiation across larger distances, but we think this is due to isolation-by-distance rather than being associated with wing patterning. John Turner suggested in 1979 that there were ‘Contrasted modes of evolution in the same genome’ (Turner PNAS 76:1924-1928; it’s a great paper, worth reading!). Everything we have discovered recently supports this assertion: the wing pattern races differ at little apart from a few major wing patterning loci, which change abruptly across narrow hybrid zones. These loci therefore show very different patterns of evolution compared to the rest of the genome, which shows more typical geographic structuring.
Thanks for your observation about the divergent region on Chromosome 13 of Ecuadorian H. melpomene. You may be correct that this could be due to the homologue of Ro in H. melpomene. However, we don’t see any divergence outliers in H. melpomene on the scaffold that has the strongest Ro associations in H. erato, so the idea that we have found the Ro homologue in H. melpomene remains quite tentative at this stage. Again, you are right to highlight the small sample sizes here, which will have reduced our power to detect associations in this population.
In summary, the major findings of our paper were that 1) mapping of short-read RAD data to a reference genome worked remarkably well, even at ~15% divergence; 2) RAD data provide a great deal of power to detect large-effect loci in hybrid zones, which is extremely promising for other species that lack the long history of traditional mapping studies available in Heliconius; and 3) we found a new cryptic species in our samples! However, our sample sizes were not large enough to reliably identify small-effect loci or to separate highly correlated factors such as wing pattern and altitude, and these interesting questions will only be resolved with larger studies.
Chris and Nicola