Arabidopsis thaliana in flower in a greenhouse at INRA Versailles-Grignon. © INRA, NICOLAS Bertrand

Evolution of repeated sequences in thale cress genome decoded

Repetitive DNA sequences represent a significant percentage of a plant’s genetic make-up and play a major role in changes to its structure. Scientists at INRA Versailles-Grignon have decoded certain evolutionary features of repetitive DNA sequences in Arabidopsis thaliana: their distribution along chromosomes and their composition are heterogeneous, and their divergence over time has been accompanied by long-term changes to the mechanisms that control their epigenetic regulation. This breakthrough in plant genetics is reported in the 23 June 2014 edition of Nature Communications.

Updated on 11/18/2015
Published on 06/20/2014

The genomes of living things contain varying amounts of what are known as ‘repeating’ DNA sequences, since they are present in large numbers of copies. These repeating sequences are mobile elements that can cause mutations if they are not repressed by the host. Currently, very little is known about these sequences and how they are regulated.

Under the microscope: ongoing genome reduction

Arabidopsis thaliana is a member of the Brassicaceae family. Approximately 25% of its genome, which was sequenced at the beginning of the 21st century, is made up of repeating sequences, a far smaller share than in its ancestors and relatives, which makes it a genome undergoing reduction.

It is an ideal object of study for researchers at INRA Versailles-Grignon seeking to explore the evolution of repeated sequences in plants: its genome contains few young repeats and is likely to contain a larger share of ancient ones. The researchers determined the age of the repeats using a data analysis method usually applied in studies of animal genomes. These results were then confirmed by comparing the divergence of repeats in A. thaliana with that of repeats in close relatives, including A. lyrata, Capsella rubella and Brassica rapa.
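The article does not name the dating method. As a purely illustrative sketch, one common way to put an approximate age on a repeat copy is to measure how far it has diverged from a consensus sequence for its family, correct that distance for multiple substitutions at the same site (here with the Jukes-Cantor model), and divide by a substitution rate. The function names, the toy sequences and the default rate of 7e-9 substitutions per site per year are assumptions made for this example, not values taken from the study.

```python
import math


def observed_divergence(copy, consensus):
    """Fraction of aligned, ungapped positions where the copy differs from the consensus."""
    if len(copy) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(copy.upper(), consensus.upper())
        if a != "-" and b != "-"  # skip alignment gaps
    ]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p):
    """Correct an observed divergence p for multiple substitutions at the same site."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - (4.0 / 3.0) * p)


def estimate_age_years(copy, consensus, rate_per_site_per_year=7e-9):
    """Approximate age of a repeat copy; the default rate is an assumed, illustrative value."""
    p = observed_divergence(copy, consensus)
    return jukes_cantor_distance(p) / rate_per_site_per_year


if __name__ == "__main__":
    # Hypothetical toy alignment: one difference over ten positions.
    age = estimate_age_years("ACGTACGTAC", "ACGTACGTAA")
    print(f"estimated age: {age:.3g} years")
```

Under this simple model, a more divergent copy is read as an older insertion. Real analyses also have to account for alignment quality and for mutation rates that vary along the genome.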

Heterogeneous distribution and varying composition of repeats along chromosomes

The scientific team has revealed that the genome of A. thaliana is primarily composed of ancient repeats derived from fragments likely inserted into an ancestor between 15 and 20 million years ago, and that the distribution of repeats along chromosomes is very heterogeneous. The majority of young repeats are found at a distance from genes, while ancient repeats are more frequently found in gene-rich regions.

In plants, the genome is protected against invasion by repeats by molecules (sRNA, or small RNA) which steer the modification of DNA (methylation of certain cytosines). These are called epigenetic modifications because they are transmitted to descendants but do not affect the DNA sequence.

Researchers noticed that in A. thaliana, nearly all young repeats are targeted by these sRNAs, as are half of all ancient repeats, even though the latter have been present in the genome for millions of years. Young or old, the repeated sequences targeted by sRNA are heavily methylated, which suggests that sRNA-guided methylation of repeats can persist for tens of millions of years, and that the divergence of repeats over time is reflected within the population of small RNA.

Methylcytosines are prone to spontaneous deamination, which changes this DNA base into another one, thymine. Scientists observed that ancient repeats contain fewer cytosines and more thymines than young ones as a result of methylcytosine deamination. This suggests that they are able to shield themselves from sRNA-guided methylation and thus remain in the genome of A. thaliana. In addition to the heterogeneous distribution of young and ancient repeats along chromosomes, the composition of the genome sequence also varies: proximity to genes determines how rich a region is in thymine.
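The effect of deamination on base composition can be illustrated with a small sketch. The code below is not the researchers' analysis. It is a toy example with hypothetical function names and made-up sequences, showing how converting methylated cytosines to thymines lowers the cytosine fraction and raises the thymine fraction of a sequence.

```python
def base_fractions(seq):
    """Return the fraction of each of A, C, G and T among the unambiguous bases of seq."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: n / total for base, n in counts.items()}


def deaminate(seq, methylated_positions):
    """Convert cytosines at the given 0-based positions to thymines.

    Positions that do not hold a C are left unchanged, since only
    (methyl)cytosines can be deaminated to thymine.
    """
    bases = list(seq.upper())
    for i in methylated_positions:
        if bases[i] == "C":
            bases[i] = "T"
    return "".join(bases)


if __name__ == "__main__":
    young = "ACGCCGTACG"                   # hypothetical young repeat
    ancient = deaminate(young, [1, 3, 4])  # same repeat after three deamination events
    print(young, base_fractions(young))
    print(ancient, base_fractions(ancient))
```

Comparing the two dictionaries shows the same shift the article describes for ancient repeats: fewer cytosines and more thymines.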

Genes which are close to repeats are expressed at low levels; the team also demonstrated that, overall, the repeats contained in A. thaliana have little impact on the level of expression. This suggests that selective pressure against repeated elements is negatively correlated with the expression levels of neighbouring genes.

An ancient inheritance with long-term consequences

This research, conducted using an innovative methodology, and the results it produced are a first in the field of plant biology, and particularly in studies of A. thaliana. Scientific exploration of the different classes of repeated sequences based on age helps to better understand the evolution of plant genomes and the epigenetic changes to which they are subject. It also provides new perspectives in studies of genome composition in relation to epigenomic evolution: the evolution of repeated sequences has very long-term consequences for the biology and evolution of the A. thaliana genome.

Press Relations:
INRA News Office (+33 1 42 75 91 86)
Associated Division(s):
Plant Biology and Breeding

Florian Maumus and Hadi Quesneville, Ancestral repeats have shaped epigenome and genome composition for millions of years in Arabidopsis thaliana, Nature Communications, 23 June 2014, DOI:10.1038/ncomms5104