princeps, which have lost the regulatory ‘ATC’ domain, or the loss of the ‘HTH’ domain of birA, the ‘PNPase C’ domain of rne and the ‘DEAD box A’ of dead in the case of M. endobia. Additionally, many other genes have been shortened due to frameshifts or the presence of premature stop codons, in comparison with their orthologs in free-living relatives (e.g. sspB, rplQ, rplO and aroC in T. princeps; thiC, ybgI, yacG, ygbQ, ftsL, ftsY and tilS in M. endobia). In some cases, the shortening removes some non-essential protein domains completely (e.g., engA, rpoA and rpoD in T. princeps; secA, aceF, yebA and metG in M. endobia). The loss of the ‘anticodon
binding domain of tRNA’ and ‘putative tRNA binding domain’ of metG, encoding methionyl-tRNA synthetase, is common to other endosymbionts with reduced genomes. Finally, even though both genomes have an unusually high G + C content compared with most bacterial endosymbionts, at least M. endobia seems to be subject to the AT mutational bias typical of bacterial genomes [27, 28]. This conclusion is drawn from the analysis of the nucleotide composition of genes, pseudogenes and IGRs (Table 1), as well as the preferential use of AT-rich codons (Additional file 2), including a high incidence of the TAA
stop codon (56.44%). Since both genomes seem to rely on the DNA replication and repair machinery of M. endobia (see next section), both could be expected to undergo a similar trend towards an increase in AT content. However, this trend is undetectable in T. princeps, where the G + C content of pseudogenes and IGRs does not differ from that of the genes (Table 1). The differences in G + C content between the two genomes could be due to a higher ancestral G + C content plus a slower
evolutionary rate for T. princeps, due to its extreme genome reduction, and the biology of the system (i.e., a lower replication rate, since each T. princeps cell retains several M. endobia cells). In fact, the codon usage bias (Additional file 2) and differences in the amino acid composition between the two endosymbiont proteomes (Figure 2) reflect their differences in G + C content. Thus, T. princeps proteins are rich in amino acids encoded by GC-rich codons (Ala, Arg, Leu, Gly, Val and Ser represent 56.82% of the total, whereas Phe and Trp are scarce), while M. endobia has a weaker amino acid composition bias (Additional file 2).

Figure 2 Amino acid content profiles for T. princeps and M. endobia proteomes. Amino acids are ranked from left to right according to the GC-richness of the corresponding codons (see Additional file 2).

T. princeps genome comparison

The genome alignment of both T. princeps strains showed a high degree of identity at the sequence level (99.98%, with 138,903 bp identical), which is consistent with their evolutionary proximity and extreme genome reduction.
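
As an illustration of how a sequence-level identity figure of this kind can be computed, the following minimal Python sketch counts identical columns in a pairwise alignment. It is not the procedure used in this study: the function name `percent_identity`, the toy sequences and the choice of denominator (all alignment columns, gaps included) are assumptions made for the example. The last lines back-calculate an approximate alignment length from the two reported figures; because 99.98% is a rounded value, the result is only an estimate.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Assumption: the denominator is the total number of alignment columns,
    gaps included. Other conventions (e.g. ungapped columns only) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap facing a gap is not counted as a match
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical sequences, not data from the study).
toy_a = "ATGGCGT-ACGT"
toy_b = "ATGGCGTTACGA"
toy_identity = percent_identity(toy_a, toy_b)

# Back-calculation from the figures reported in the text.
identical_bp = 138_903        # identical positions reported
reported_identity = 99.98     # percent identity reported (rounded)
implied_columns = round(identical_bp / (reported_identity / 100.0))
implied_differences = implied_columns - identical_bp

print(f"toy identity: {toy_identity:.2f}%")
print(f"implied alignment length: ~{implied_columns} columns")
print(f"implied differing columns: ~{implied_differences}")
```

Under these assumptions, the reported values imply that only a few tens of alignment columns differ between the two strains.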