Slide 1: Medical Academy named after S.I. Georgievsky of Vernadsky CFU
Phylogenetic disorder of genetic system
Submitted to: Anna Zhukova
Submitted by: Anuj
Slide 2: Phylogenetic disorder of genetic system
Genes with common profiles of presence and absence in disparate genomes tend to function in the same pathway. By mapping all human genes into about 1000 clusters of genes with similar patterns of conservation across eukaryotic phylogeny, we determined that sets of genes associated with particular diseases have similar phylogenetic profiles.
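As a rough illustration of comparing presence/absence profiles, the sketch below counts the genomes in which two profiles disagree and picks the closest partner of a gene. The gene names, the six-genome profiles, and the helper names `hamming` and `closest_partner` are invented for this example and do not come from the study.

```python
# Toy presence/absence profiles: 1 = ortholog detected in that genome, 0 = not detected.
# Gene names and values are hypothetical, for illustration only.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0],
    "geneC": [1, 0, 1, 1, 0, 1],
}


def hamming(p, q):
    """Number of genomes in which the two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


def closest_partner(gene, profiles):
    """Return the other gene whose profile disagrees with `gene` in the fewest genomes."""
    others = (g for g in profiles if g != gene)
    return min(others, key=lambda g: hamming(profiles[gene], profiles[g]))
```

Here geneA and geneB share an identical profile, so under the idea stated on this slide they would be candidates for acting in the same pathway.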
Slide 3
The hundreds of eukaryotic genomes now sequenced allow the tracking of the evolution of human genes, and the analysis of patterns of their conservation across eukaryotic clades. Phylogenetic profiling describes the relative sequence conservation or divergence of orthologous proteins across a set of reference genomes.
Slide 4: Different classes of functional gene groups have distinct coevolution patterns
The TCA cycle is an extreme example of a well‐studied, highly annotated molecular pathway that overlaps significantly with the phylogenetic profile classification of human genes. To systematically query the overlap between our phylogenetic profiling of human genes and many other analyses of human molecular pathways,
Slide 5: Systematic identification of genes that coevolve with known pathways and diseases
In the mapping of genes classified by HPO groups or by MSigDB groups to phylogenetic clusters, we noted that some of the same genes correlated with distinct diseases and distinct molecular signature gene groups. For example, a set of 4–6 nuclearly encoded mitochondrial proteins constitutes the overlap with MSigDB groups such as KEGG oxidative phosphorylation and HPO terms such as abnormal cerebrospinal fluid,
Slide 6
Many molecular pathways map to the same phylogenetic clusters as genes associated with specific human diseases.
Focusing on proteins coevolved with the microphthalmia‐associated transcription factor (MITF), we identified the Notch pathway suppressor of hairless (RBP‐Jk/SuH) transcription factor, and showed that RBP‐Jk functions as an MITF cofactor.
Slide 7: Phylogenetic profiling identifies a new MITF‐associated factor
While phylogenetic profiling could be used to seek the particular diseases with the strongest phylogenetic profile overlap, we could also ask whether particular known disease components have phylogenetic profiles similar to those of any other genes. Proteins with the same profile are much more likely to act in the same pathway. As an example, we used phylogenetic profiling to investigate the role of MITF, the master regulator
Slide 8
By analyzing the conservation of human proteins across 87 species, we sorted proteins into clusters of coevolution. Some clusters are enriched for genes assigned to particular human diseases or molecular pathways; the other genes in the same cluster may function in related pathways and diseases.
Slide 9: Phylogenetic profile analysis of gene sets with similar disease phenotypes
Phylogenetic profile analysis has previously been a powerful tool for the study of human Bardet‐Biedl syndrome and mitochondrial diseases. Just as phylogenetic profiling could detect significant overlap with about 20% of the molecular signatures gene groups, we sought to detect a similar fraction of the smaller set of genes currently annotated as variant in human genetic diseases. Even though only a subset of human disease loci have been identified at this intermediate stage in human genetic analysis,
Slide 10: Phylogenetic profiling identifies ccdc105 as a meiosis‐specific chromatin localization gene
Proteins that constitute components of specialized multiprotein complexes are also expected to have similar phylogenetic profiles. As a test for the use of phylogenetic profiles to generate candidate components of such protein complexes, we analyzed proteins of the synaptonemal complex.
Slide 11
Many genes that were thought to map to different diseases have in fact coevolved and map to the same phylogenetic clusters.
Slide 12: Materials and methods
Species database generation
Protein‐coding sequences for human genes were downloaded using BioMart version 0.7 from the Ensembl project (release 60). Ensembl includes both automatic annotation, in which transcripts are determined and annotated genome‐wide by automated bioinformatic methods, and manual curation.
Slide 13: Calculation of the list of most correlated genes
The Pearson correlation coefficient (R) was calculated using the NPP matrix to generate a correlation matrix. High correlation can be the result of coevolution or a by‐product of homology between gene sequences; the latter case corresponds only to paralogous genes. To remove phylogenetic profile correlation scores that resulted from homology between the sequences of two human genes Gi and Gj, we assigned
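A minimal sketch of this step, assuming the NPP matrix is available as a mapping from gene name to a list of per-species values. The function names, the `homologous_pairs` argument, and the choice to mark excluded pairs with `None` are assumptions made for illustration; the slide text does not state which value was assigned to homologous pairs.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (sx * sy)


def correlation_matrix(npp, homologous_pairs=frozenset()):
    """Map every ordered gene pair (gi, gj) to its profile correlation.

    ASSUMPTION: pairs listed in `homologous_pairs` (frozensets of two gene
    names) are marked None so that later steps can skip them.
    """
    corr = {}
    for gi in npp:
        for gj in npp:
            if gi == gj:
                continue
            if frozenset((gi, gj)) in homologous_pairs:
                corr[gi, gj] = None
            else:
                corr[gi, gj] = pearson(npp[gi], npp[gj])
    return corr
```

Keeping excluded pairs in the mapping, instead of dropping them, makes it easy to check later which correlations were masked because of sequence homology.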
Slide 14: Calculation of Co10 scores
To test whether sets of functionally annotated genes are significantly coevolved, we calculated a Coevolution (Co10) score.
For each gene, we determined the 10 non‐homologous genes (the 10 nearest neighbors) that are most phylogenetically correlated with it (List10; see Materials and methods). We also tested 20, 50, and 100 nearest neighbors, and this analysis yielded similar results (data not shown).
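The nearest-neighbor list can be sketched as follows, assuming correlations are stored as a mapping from ordered gene pairs to a value, with `None` for pairs excluded as homologous. The exact Co10 formula is not given on this slide, so `within_set_links` is only an assumed stand-in that counts how many nearest-neighbor links stay inside a gene set; both function names are hypothetical.

```python
def nearest_neighbors(corr, gene, k=10):
    """Return the k genes most correlated with `gene`, skipping masked (None) pairs.

    `corr` maps (gene_i, gene_j) to a correlation value or None.
    Ties are broken alphabetically so the result is deterministic.
    """
    scored = [(r, other) for (g, other), r in corr.items()
              if g == gene and r is not None]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [other for _, other in scored[:k]]


def within_set_links(corr, gene_set, k=10):
    """ASSUMED stand-in for a coevolution score, not the Co10 definition itself:
    count (gene, neighbor) links where both genes belong to gene_set."""
    members = set(gene_set)
    return sum(1 for g in members
               for nb in nearest_neighbors(corr, g, k)
               if nb in members)
```

Changing `k` to 20, 50, or 100 would mirror the alternative neighborhood sizes mentioned above.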
Slide 15: Generation of binary phylogenetic profile and NPP with different organism sets
To test the effect of the number of species on the performance of phylogenetic profiling, we resampled our data using 75, 50, or 25% of our original species list. To keep a similar phylogenetic representation of the organisms used, we chose organisms from across the entire eukaryotic tree.
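One way to keep the whole tree represented while subsampling is to draw the same fraction from every clade. The sketch below assumes species are grouped by clade in a dictionary; the grouping, the function names, and the rule of keeping at least one species per clade are assumptions for illustration, not the procedure used in the study.

```python
import random


def resample_species(species_by_clade, fraction, seed=0):
    """Keep roughly `fraction` of the species in each clade (at least one per clade).

    ASSUMPTION: sampling per clade is used here as a simple way to preserve
    representation across the tree.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    kept = []
    for clade, species in sorted(species_by_clade.items()):
        n_keep = max(1, round(len(species) * fraction))
        kept.extend(rng.sample(sorted(species), n_keep))
    return kept


def subset_profile(profile, all_species, kept_species):
    """Restrict one gene's profile (ordered like all_species) to the kept species."""
    index = {s: i for i, s in enumerate(all_species)}
    return [profile[index[s]] for s in kept_species]
```

Calling `resample_species` with 0.75, 0.5, and 0.25 gives the three reduced species lists, and `subset_profile` rebuilds each gene's profile on the reduced list before correlations are recomputed.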
Slide 16: Generation of coevolved gene clusters
For each protein A, we ranked the 50 genes most correlated with it, using Pearson's correlation coefficient (R) on the NPP matrix. The protein most correlated with A received a rank score of 50, and the following ones received scores of 49, 48, …, 1, so that the 50th protein got a rank score of one. All other genes got a rank score of zero. Since the rankings are asymmetric (i.e., the rank of A to B is not necessarily identical to the rank of B to A), a ranking score between two genes (rankscoreAB) was calculated.
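The rank-score assignment can be sketched as below. The slide does not say how the two directional ranks are combined into a single pairwise score, so `symmetric_rank_score` simply averages them; that averaging, like the function names, is an assumption made for illustration.

```python
def rank_scores(corr_to_a, top_n=50):
    """Turn correlations with protein A into rank scores.

    `corr_to_a` maps each other gene to its correlation with A. The most
    correlated gene gets top_n, the next top_n - 1, and so on down to 1;
    genes outside the top_n get 0. Ties are broken alphabetically.
    """
    ordered = sorted(corr_to_a, key=lambda g: (-corr_to_a[g], g))
    scores = {g: 0 for g in corr_to_a}
    for position, g in enumerate(ordered[:top_n]):
        scores[g] = top_n - position
    return scores


def symmetric_rank_score(score_a_to_b, score_b_to_a):
    """ASSUMPTION: combine the two directional rank scores by averaging."""
    return (score_a_to_b + score_b_to_a) / 2
```

A symmetric pairwise score of this kind can then serve as the similarity measure for clustering genes.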
Slide 17
High load can lead to a small population size, which in turn increases the accumulation of mutation load, culminating in extinction via mutational meltdown.
Slide 18: MSigDB and HPO database
The Molecular Signature Database (MSigDB v3.0) contains 6800 gene sets collected from various sources such as online pathway databases (KEGG, BioCarta), Gene Ontology (GO groups), publications in PubMed, and genes that share cis‐regulatory motifs or are coexpressed. We used the 6594 sets with fewer than 500 genes.
Slide 19: Plasmids
pcDNA3‐MITF and pGL4.11‐TRPM1 promoter luciferase were described in previous publications. pSG5‐RBP‐Jk was kindly provided by Dr E Manet (INSERM U758, Unité de Virologie humaine, Lyon, France).
Slide 20: Cell cultures, transfections, and luciferase reporter assays
Human WM3526, WM3682, and WM3314 melanoma cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum. Cells were transfected with jetPEI™ for plasmids or HiPerFect (QIAGEN) for the siRNAs targeting MITF (40 nM) or RBP‐Jk (10 nM) according to the manufacturer's instructions.