Research to date

Understanding the hybrids

We test how hybridisation and polyploid history shape genomes and traits using exemplar systems. One of our first projects investigated the population structure of rice landraces from Vietnam and placed them within global Asian rice diversity. With local partners, we found that Vietnamese rice landraces clustered separately from elite materials. We also identified an isolated subpopulation of indica landraces around the Red River Delta with genome-wide japonica introgression. We then found support for lineage-specific signatures of selection and mapped the japonica introgressions to candidate genes, showing that these landraces are not only distinctive but also carry variation relevant to functional traits and breeding.

In parallel, we established work on admixture between the Mesoamerican and Andean gene pools in common bean in Colombia and neighbouring countries. This region appears to be both highly diverse and a point of contact between the gene pools arising from the two domestication events of the species. We assembled a panel including admixed and pure gene-pool accessions and mapped QTL for domestication-associated traits, notably photoperiod insensitivity and determinacy. We then mapped introgressed regions in the panel and tested whether introgression explained trait variation that was confounded with population structure. This work interpreted domestication from a gene-flow perspective rather than through QTL mapping alone. We also quantified tolerance to water deprivation and identified variation in drought-response strategies, together with markers associated with these strategies that support marker-assisted selection in breeding populations.
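The test of whether introgression explains trait variation beyond population structure can be illustrated with a partial correlation. This is a minimal sketch under stated assumptions, not the analysis used in the project: the function names, the single ancestry covariate, and the toy values are all hypothetical, and a real analysis would typically use a mixed model with a kinship matrix.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(trait, introgression, ancestry):
    """Correlation of trait and introgression after removing the linear
    effect of ancestry (a stand-in for population structure) from both."""
    r_ti = pearson(trait, introgression)
    r_ta = pearson(trait, ancestry)
    r_ia = pearson(introgression, ancestry)
    return (r_ti - r_ta * r_ia) / sqrt((1 - r_ta ** 2) * (1 - r_ia ** 2))


# Hypothetical toy panel: ancestry is 0 or 1 for the two gene pools,
# introgression is presence (1) or absence (0) of an introgressed segment,
# and the trait is built as 2 * ancestry + introgression.
ancestry = [0, 0, 0, 0, 1, 1, 1, 1]
introgression = [0, 1, 0, 1, 0, 1, 0, 1]
trait = [0, 1, 0, 1, 2, 3, 2, 3]

raw = pearson(trait, introgression)
adjusted = partial_correlation(trait, introgression, ancestry)
```

In this toy panel the raw correlation is modest because gene-pool membership dominates the trait, whereas the adjusted value is close to 1 once that structure is accounted for.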

Genome evolution in complex crop systems

We have also mapped introgression in banana and linked it to fruit-related traits using genome-wide association analyses in a panel spanning variable ploidy, including predominantly triploid cultivars, diploid wild accessions, and tetraploid breeding lines. We confirmed that cultivated accessions formed clonal groups corresponding to named varieties shaped by long-term farmer selection and maintained through clonal propagation. We then mapped homoeologous exchanges between subgenomes by developing a novel read-depth-based metric and carried out association analyses for plant architecture and yield-related traits.
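As an illustration of how read depth can point to homoeologous exchanges, the sketch below compares depth between paired windows on two subgenomes. It is not the metric developed in this work, whose definition is not given here: the function names, the fixed tolerance, and the toy depths are hypothetical, and the example assumes depths have already been summarised per homoeologous window pair.

```python
from statistics import median


def depth_ratio(depth_a, depth_b):
    """Fraction of total depth assigned to subgenome A for each window pair.

    Returns None for window pairs with no coverage on either subgenome.
    """
    ratios = []
    for a, b in zip(depth_a, depth_b):
        total = a + b
        ratios.append(a / total if total > 0 else None)
    return ratios


def flag_exchanges(depth_a, depth_b, tolerance=0.15):
    """Flag window pairs whose A fraction departs from the sample median.

    The median is used as the baseline, so the expected ratio does not have
    to be specified in advance for a given ploidy or genome composition.
    """
    ratios = depth_ratio(depth_a, depth_b)
    observed = [r for r in ratios if r is not None]
    baseline = median(observed)
    return [r is not None and abs(r - baseline) > tolerance for r in ratios]


# Hypothetical depths for five window pairs in an accession with two A copies
# and one B copy; the fourth window has lost all B-subgenome coverage.
depth_a = [40, 42, 38, 60, 41]
depth_b = [20, 21, 19, 0, 20]
flags = flag_exchanges(depth_a, depth_b)
```

Here only the fourth window is flagged, consistent with a segment in which the B copy has been replaced by an additional A copy.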

More recently, in collaboration with a UK biotech company, we established macropropagation and infection-assay protocols and evaluated Fusarium tolerance using representative accessions from those clonal groups. We constructed a banana pan-NLRome and combined NLR presence/absence variation with disease response to prioritise candidate disease-response loci for editing. We focus on disease tolerance because immune-related genes, such as host NLRs, often occur in rapidly evolving, copy-variable genomic regions in both hosts and pathogens.
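One simple way to combine presence/absence calls with disease response is a per-gene Fisher's exact test. The sketch below is an assumption-laden illustration rather than the prioritisation used in the project: it assumes disease response is scored as a binary tolerant/susceptible call, the gene names are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


def rank_nlr_candidates(presence, tolerant):
    """Rank genes by association between presence and tolerance.

    presence maps a gene name to one boolean per accession;
    tolerant holds one boolean per accession in the same order.
    """
    results = []
    for gene, calls in presence.items():
        a = sum(1 for p, t in zip(calls, tolerant) if p and t)
        b = sum(1 for p, t in zip(calls, tolerant) if p and not t)
        c = sum(1 for p, t in zip(calls, tolerant) if not p and t)
        d = sum(1 for p, t in zip(calls, tolerant) if not p and not t)
        results.append((gene, fisher_exact_two_sided(a, b, c, d)))
    return sorted(results, key=lambda item: item[1])


# Hypothetical calls for ten accessions: five tolerant, five susceptible.
tolerant = [True] * 5 + [False] * 5
presence = {
    "NLR_hypothetical_1": [True] * 5 + [False] * 5,
    "NLR_hypothetical_2": [True, False] * 5,
}
ranked = rank_nlr_candidates(presence, tolerant)
```

With many NLR genes and few accessions, p-values from such a test are best treated as a ranking aid; clonal relatedness among accessions would also need to be accounted for before drawing conclusions.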

Because most edible banana cultivars are clonally propagated, their genomes change primarily through the accumulation of somatic variation, making banana a tractable system for relating somatic genomic change to phenotypic effects. This work also extends to epigenetic variation and to the analysis of non-true-to-type somatic variants arising during gene-editing pipelines, with the aim of understanding somatic mosaicism and scale-up constraints in precision breeding.

We have also placed greater emphasis on polyploids, because long-read sequencing now allows us to tackle their technical challenges. Copy number variants, which arise from chromosomal rearrangements and introgression, can alter gene dosage and contribute to phenotypic variation. We are developing end-to-end long-read workflows to test this in banana, and we have also completed and published haplotype-resolved polyploid genomes from both PacBio and ONT reads. More broadly, our work helps fill technical gaps in polyploid genomics, as many community tools are still primarily optimised for diploid organisms.

Accelerating crop improvement

A central part of the lab’s work is to translate genomic insight into breeder-ready resources. With a British forage breeding company, we are characterising a diverse panel of white clover and using GWAS to test associations with yield across fertilisation regimes and temperature treatments, alongside landscape-genomic analyses of climatic adaptation, to identify suitable breeding materials. These analyses also allow us to test whether GWAS using pangenome graphs outperforms single-reference approaches. This project contributes both to breeding and to our broader interest in subgenome composition and hybrid complexity.

With a British vegetable breeding company, we are characterising a herb diversity panel and conducting multi-trial GWAS to deliver markers for marker-assisted selection in breeding populations. This project has also served as an example of transferability to new crops and commercial settings, highlighting that domestication history and population structure can be more decisive than phylogenetic distance for robust association mapping.

Over the past two years, we have also expanded into phenotype prediction in breeding. Our work in a large EU breeding consortium, Legume Generation, supports marker-assisted selection and genomic prediction in legumes. We curate and manage raw trial data under FAIR principles, establish standards and ontologies, and perform multi-site, multi-season analyses to generate breeder-ready phenotypes. We also collaborate with breeders to evaluate imputation accuracy and to benchmark genomic prediction models, including mixed-model and machine-learning approaches, for sparse testing across environments. Together, these efforts point to a long-term path in which genomics supports decision-making in breeding and reduces the burden of trials through better predictions.