File: mcc/src/client/GeneticsPlot/GeneticsPlot.tsx (28 additions, 4 deletions)
@@ -108,10 +108,34 @@ export function GeneticsPlot() {
 <>
 <ErrorBoundary>
 <div style={{paddingBottom: 20, maxWidth: 1000}}>
-Population structure analysis using PCA is a helpful way to summarize the genetic relationships among animals in the MCC. The PCA results can be thought of as a simple type of genetic clustering - animals with more similar principal component loadings are more genetically similar. A more precise description of the relationship between two animals is provided by kinship coefficients – these are quantitative measures of relatedness that can be calculated by comparing two genomes, and interpreted using genealogical language, such as ‘parent-child’, ‘uncle-nephew’, ‘first cousins’, etc.
-</div>
-<div style={{paddingBottom: 20, maxWidth: 1000}}>
-Whole genome sequencing was performed on each animal and genotypes were called with GATK haplotype caller. Principal components analysis was performed with GCTA (https://yanglab.westlake.edu.cn/software/gcta/#PCA) and kinship coefficients were calculated with KING (https://www.kingrelatedness.com/). Analyses were performed by Ric del Rosario (Broad Institute).
+Over the past few years, the MCC team has been working on extracting, sequencing and analyzing DNA from
+marmosets across the participating breeding centers. While we have deposited the raw sequence data for
+578 marmosets on NCBI's Sequence Read Archive (SRA), we are excited to report that the MCC portal now
+houses a call set with single nucleotide variants and short indels for over 800 individuals.
+<p/>
+The MCC genomic database is extensive, with each individual being genotyped at millions of variants
+across the genome. One way to summarize such a large dataset is Principal Component Analysis
+(PCA). PCA is a technique used across disciplines (from astronomy to genomics) that reduces the
+information in a multi-dimensional dataset to (fewer) principal components (PC) that retain overall
+trends and patterns in the original data. Biologically, this could mean merging together two variants
+that are always inherited together into just one PC, making the data easier to analyze while maintaining
+its most important patterns. See the **Visualization with PCA** tab below.
+<p/>
+Although PCA is useful for broad-scale comparisons, it is not very useful when trying to distinguish
+whether two individuals are siblings or first cousins, for instance. For that, we have better statistics
+that can describe the genetic relatedness between two individuals. We estimated genetic relatedness for
+all pairs of individuals for which we have whole-genome data, and made these available under the
+**Kinship** tab. There you will find the inferred relationships between pairs of individuals as well as
+the calculated kinship coefficient, which is a quantitative measure of genetic relatedness (see
+<a href="https://en.wikipedia.org/wiki/Coefficient_of_relationship#Kinship_coefficient">here</a> for more details).
+<p/>
+It is possible to explore the full MCC database of variants with a graphical interface by accessing the
+**Genome Browser** tab. There you can, for example, visualize all the variants present in your gene of
+interest by typing its name in the search bar.
+<p/>
+The genetic analyses described here were performed by Karina Ray (ONPRC), Murillo Rodrigues (ONPRC), and
+Ric del Rosario (Broad Institute). Please contact us at <a href="mailto:mcc@ohsu.edu">mcc@ohsu.edu</a> with any
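The text in this hunk mentions inferred relationships alongside the kinship coefficient. As a rough illustration of how one can be derived from the other, here is a minimal sketch that assumes the commonly cited KING-style cutoffs (powers of two between the expected values 0.5, 0.25, 0.125 and 0.0625). The function name `inferDegree` and the `RelationshipDegree` type are hypothetical and are not part of the MCC code base; the portal may apply different rules.

```typescript
// Hypothetical helper, not taken from the MCC repository.
// Assumption: KING-style cutoffs at 2^-1.5, 2^-2.5, 2^-3.5 and 2^-4.5.
type RelationshipDegree =
  | "duplicate/MZ twin"
  | "1st-degree" // e.g. parent-child, full siblings (expected kinship 0.25)
  | "2nd-degree" // e.g. half-siblings, uncle-nephew (expected kinship 0.125)
  | "3rd-degree" // e.g. first cousins (expected kinship 0.0625)
  | "unrelated";

function inferDegree(kinship: number): RelationshipDegree {
  if (kinship > Math.pow(2, -1.5)) return "duplicate/MZ twin"; // > ~0.354
  if (kinship > Math.pow(2, -2.5)) return "1st-degree"; // > ~0.177
  if (kinship > Math.pow(2, -3.5)) return "2nd-degree"; // > ~0.0884
  if (kinship > Math.pow(2, -4.5)) return "3rd-degree"; // > ~0.0442
  return "unrelated";
}
```

Under these assumed cutoffs, a pair with a kinship coefficient of 0.25 would be labeled 1st-degree and a pair at 0.0625 would be labeled 3rd-degree. The degree alone does not distinguish, say, parent-child from full siblings.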