This blog records the progress of my thesis at XJTU.
Ideas
Prof. Guo suggested that osteoarthritis might be a good research topic.
This article conducts a meta-analysis of GWAS. I might not do a meta-analysis myself, only a GWAS, so the goal might be "Learning Approach to Osteoarthritis Analysis".
For the theoretical basis, I searched for definitions of meta-analysis and possible learning approaches in GWAS.
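As a reference point for the definition of meta-analysis, here is a minimal sketch of one common variant, the fixed-effect inverse-variance method, which pools per-cohort effect sizes for a single variant. The function name `fixed_effect_meta` and the effect sizes are hypothetical illustrations, not values from the article.

```python
import math


def fixed_effect_meta(betas, ses):
    """Pool per-study effect sizes with inverse-variance weights."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se ** 2) for se in ses]  # w_k = 1 / SE_k^2
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical log odds ratios and standard errors for one SNV in three cohorts.
beta, se, z, p = fixed_effect_meta([0.10, 0.14, 0.08], [0.04, 0.05, 0.03])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why large biobank cohorts carry most of the weight in this kind of analysis.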
After discussion with Prof. Guo, there are three main problems I might focus on:
- Given the GEO data, analyze possible gene loci.
- Given the metabolism data, find possible pathways that can be regulated by drugs.
- Given the patient samples, predict the risk.
Hence I shall first look through the data and decide which one to pick.
Data collection
GWAS research
- Select 177517/826690 individuals for analysis
2 EA groups and 11 European
- Found 11897 SNVs
p < 1.3 × 10^-8
- Phenotype conditional analysis identifies 223 independent associations
- 87/96 loci replicated
- Phenotype-independent conditional analysis shows 100 associations
- The lead SNV of each of the 100 associations above is selected.
6 coding SNVs
59 SNVs residing within transcripts
35 intergenic SNVs
- Update risk SNVs for different tissues
- 6 rare SNVs are detected (discovered in Iceland)
- Risk SNVs are also related to EA groups, with evidence in several phenotypes
- Polygenic risk scores are related to some of the phenotypes
- 3 female-only SNVs discovered.
rs116112221 is interestingly located
- Meta-analysis shows another risk variant
- 60/100 SNVs related to phenotype
40 weight-bearing only
4 non-weight-bearing only
42 both may contribute to pathology
- Some SNVs are associated more strongly with joint replacement than with osteoarthritis pathology itself, especially in connection with pain
- Identify 637 genes that could potentially be effector genes
- Identify 77 genes with higher potential based on various criteria
4 supported by missense SNVs
48 previously reported
30 newly discovered
- The 77 genes mentioned above are distributed in 6 groups
  - Skeletal development (63/77)
  - Joint degeneration
  - Neuronal function
  - Muscle function
  - Immune response
  - Adipogenesis
- 205/637 genes are potential drug targets
71/205 genes are linked to licensed drugs
- 20/77 genes can be candidates
7 newly discovered
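The polygenic risk scores mentioned in the list above are, in their simplest form, a weighted sum of risk-allele dosages. The sketch below assumes additive effects and uses hypothetical SNV identifiers and effect sizes; it is not the scoring procedure of the GWAS itself.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2) for one individual."""
    score = 0.0
    for snv, beta in weights.items():
        # SNVs missing from the genotype are skipped rather than imputed.
        if snv in dosages:
            score += beta * dosages[snv]
    return score


# Hypothetical per-SNV effect sizes (log odds ratios); the IDs are placeholders.
weights = {"snv_a": 0.12, "snv_b": -0.05, "snv_c": 0.30}

# Hypothetical genotypes: risk-allele counts per individual.
cohort = {
    "person_1": {"snv_a": 2, "snv_b": 0, "snv_c": 1},
    "person_2": {"snv_a": 0, "snv_b": 2, "snv_c": 0},
}

scores = {pid: polygenic_risk_score(g, weights) for pid, g in cohort.items()}
print(scores)
```

In practice the weights would come from GWAS summary statistics, and the raw scores are usually standardized within the cohort before being compared.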
Metabolomics research
32 studies contain metabolomic data
Tissue | Biospecimen | Phenotypes | Number of studies |
---|---|---|---|
Joint | Synovial fluid (SF) | OA, RA | 13 |
Serum | Serum | OA, RA | 19 |
Phenotypes | Biospecimen | Methodology | Sample size | References |
---|---|---|---|---|
Knee OA, RA, and postmortem controls | SF | UPLC Q-TOF MS | OA 5, RA 3, Controls 5. | Carlson A et al. [16] |
Early and later knee OA and controls (all postmortem) | SF | UPLC Q-TOF MS | Early OA 55; late OA 17; controls 7. | Carlson A et al. [18] |
Knee OA and cadaveric controls | SF | 1H NMR and GC-MS | Knee OA 55; controls 13. | Mickiewicz B et al. [13] |
Knee OA, RA, postmortem controls | SF | ESI-MS/MS | Early OA 17; late OA 13; RA 18; controls 9. | Kosinska M et al. [14] |
Knee OA vs. controls | SF | GC-TOF/MS | OA 49; controls 21. | Zheng K et al. [17] |
Knee OA, gout, calcium pyrophosphate disease (CPPD), spondylarthritis, septic arthritis, and RA | SF | 1H NMR | OA 15; gout 18; CPPD 11; septic arthritis 4; RA 4; reactive arthritis 3; Crohn's disease 2; ankylosing spondylitis 1; psoriasis arthritis 1. | Hügle T et al. [22] |
Reactive arthritis and undifferentiated spondyloarthropathy; RA, and OA | SF | 1H NMR | OA 21; RA 25; and reactive arthritis 30. | Muhammed H et al. [23] |
Knee OA severity | SF | GC/TOF MS | OA 15. | Kim S et al. [12] |
Knee and hip OA | SF | 1H NMR | Hip 12; knee 12. | Akhbari P et al. [19] |
Classification of OA | SF | | Hip and knee OA 80. | Zhang W. et al. |
Knee OA vs. controls and other forms of arthritis | Serum | GC-TOF MS/UPLC-QTOF MS | OA 27; RA 27; AS 27; gout 33, and controls 60. | Jiang M et al. [25] |
OA, RA, and FM | Bloodspot | IRMS | OA 12; RA 15; FM 14. | Hackshaw KV et al. [26] |
Knee OA vs. controls | Plasma | GC/Q-TOF-MS | OA 12; controls 29. | Huang Z et al. [27] |
OA vs. controls | Serum | LC/MS | Knee and hip OA 70; controls 82. | Tootsi K. et al. [28] |
OA vs. controls | Serum | 1H NMR | OA 1556; controls 2125. | Meessen, J. et al. [29] |
OA vs. controls | Serum | UPLC-TQ-MS | OA 32 and controls 35 in discovery cohort; OA 30 and controls 30 in replication cohort. | Chen R. et al. [30] |
Obesity and non-obesity knee OA vs. controls | Serum | LC/Q-TOF/MS/MS | Obesity knee OA 14; non-obesity knee OA 14, and controls 15. | Senol O et al. [31] |
Knee OA and risk for TKR | Plasma and serum | HPLC-MS/MS | Knee OA 64 and control 45 in the discovery cohort; knee OA 72 and controls 76 in the replication cohort; 158 subjects in the longitudinal study. | Zhang W. et al. [8] |
Knee cartilage volume loss over 2 years | Serum | HPLC-MS/MS | Knee OA 139. | Zhai G et al. [33] |
Drug response in knee OA | Serum | HPLC-MS/MS | Knee OA 158. | Zhai G et al. [34] |
Knee OA | Plasma | HPLC-MS/MS | Knee OA 64 and controls 45 in the discovery cohort; knee OA 72 and controls 76 in the replication cohort. | Zhang W. et al. [35] |
Knee OA | Serum | HPLC-MS/MS | Knee OA 123 and controls 299 in the discovery cohort; knee OA 76 and controls 100 in the replication cohort. | Zhai G. et al. [36] |
Knee OA progression in 5 years | Serum | HPLC-MS/MS | Knee OA progressor 234; nonprogressor 322. | Zhai G et al. [39] |
Joint tissues
- OA patient and control groups may carry background bias due to the source of the samples
- 58/1233 metabolites varied in Carlson A et al.
  Involved pathways:
  - NO production
  - Chondroitin sulfate degradation
  - Arg and Pro metabolism

  Their research also includes RA in SF; however, no dissimilarities were found
- 188/9903 metabolites varied in Carlson A et al. in a larger group
  Involved pathways:
  - Extracellular matrix component metabolism
  - AA, fatty acid and lipid metabolism
  - Inflammation
  - Energy metabolism
  - Vitamin metabolism

  Cluster results:
  - Increased inflammation
  - Oxidative stress
  - Structural deterioration
- Energy demand varies in Mickiewicz B et al.
- Sphingomyelin (SM) and ceramide were the most abundant among samples in Kosinska M et al.
- Three main molecules were found to differ in a study from Zheng K et al. that was validated in a replication cohort. They even differ between OA and RA
  - Glutamine
  - 1,5-anhydroglucitol
  - Gluconic lactone
- Another two studies report consistency between OA and RA, but they are limited by group size.
- Sampling different joints from the same patient helps eliminate possible differences between individuals in Xu Z et al.
  Involved pathways:
  - Phenylalanine metabolism
  - Taurine and hypotaurine metabolism
  - Arg and Pro metabolism
- 68/469 metabolites were found to differ in Yang G. et al.
- 28/114 metabolites differ between early and late radiographic OA in Kim S et al.
- Knee OA and hip OA could differ according to Akhbari P et al.
- Metabolic syndrome might be used to cluster the patients, according to Guangju Z et al.
Serum
- Jiang M et al. control for sex in their analysis. 6/30 metabolites are considered differential, with an AUC of 0.91. They also found differences between OA and RA
- Huang Z et al. studied 12 knee OA patients and 20 healthy controls and identified three metabolites – succinic acid, xanthurenic acid, and tryptophan.
- Tootsi K et al. studied 70 knee and hip OA patients and 82 controls and found that glycine and arginine were independently associated with OA radiographic severity.
- Meessen J et al. studied a total of 227 metabolites assessed on an NMR platform in 2125 controls and 1556 OA cases. Optimal group size?
- Chen R et al. focus on amino acid differences in the population and found several related to OA
- Senol O et al. and Zhang W et al. focus on phenotypes such as obesity and diabetes
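The AUC quoted for Jiang M et al. above is a ranking metric: the probability that a randomly chosen case receives a higher score than a randomly chosen control. A minimal sketch of that definition follows, with hypothetical labels and classifier scores; it says nothing about how the cited study computed its value.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = OA case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the sample size, which is fine for cohorts of the size listed in the table but would be replaced by a rank-based formula for larger data.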
Analysis
- Evolutionary learning
- Differential correlation network
- Meta Analysis?
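To make the differential correlation network idea concrete, here is a rough sketch: compute the pairwise Pearson correlation of features within cases and within controls, and keep the pairs whose correlation changes by more than a threshold. The metabolite names, values and the plain difference threshold are assumptions for illustration; a real analysis would more likely compare Fisher z-transformed correlations with a significance test.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long lists (fails on constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def differential_edges(cases, controls, threshold=0.5):
    """Return (feature_a, feature_b, r_case - r_control) for changed pairs.

    cases / controls map a feature name to its list of per-sample values.
    """
    edges = []
    for a, b in combinations(sorted(cases), 2):
        r_case = pearson(cases[a], cases[b])
        r_ctrl = pearson(controls[a], controls[b])
        if abs(r_case - r_ctrl) > threshold:
            edges.append((a, b, r_case - r_ctrl))
    return edges


# Hypothetical metabolite levels; m1..m3 are placeholder names.
cases = {"m1": [1, 2, 3, 4], "m2": [2, 4, 6, 8], "m3": [1, 3, 2, 4]}
controls = {"m1": [1, 2, 3, 4], "m2": [8, 6, 4, 2], "m3": [1, 3, 2, 4]}

edges = differential_edges(cases, controls)
for a, b, delta in edges:
    print(f"{a} -- {b}: change in correlation = {delta:+.2f}")
```

Here only the pairs involving `m2` become edges, because `m2` flips from positively to negatively correlated with the other features between the two groups.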
Workplace
Use UKB to apply the final multimodal model for risk evaluation
Similar research has been performed as a GWAS for osteoarthritis
Motivation
Shift
However, after discussion with Prof. Guo, my plan to set up that three-source network was rejected, so I need to come up with a new strategy.
After screening the following articles and integrating Prof. Guo's advice, a new setup is to be made:
- What is Few-Shot Learning? Methods & Applications in 2022
- Low Data Drug Discovery with One-Shot Learning (ACS Central Science)
- A Comprehensive Survey on Graph Neural Networks (IEEE Xplore)
Stage 1 plan: Simple GNN
Stage 2 plan: Analyze the SNP interactions generated by the GNN
Stage 3 plan: Fuse phenotypes into the prediction
Stage 4 plan: Annotate the SNPs to genes, find possible pathways involved
Stage 5 plan: Introduce metabolites to the GNN to increase accuracy (?)
GNN Development
Naive GNN
Starting from the TIID data I retrieved last year, I began developing a simple GNN based on the research from Design Space for Graph Neural Networks. The main guideline is
$$
x_i' = \mathrm{Agg} \left( \left\{ \mathrm{Act} \left( \mathrm{Dropout} \left( \mathrm{BN} \left( x_j W + b \right) \right) \right), j \in \mathcal{N}(i) \right\} \right)
$$
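The layer rule above can be sketched in plain Python to make the order of operations explicit. This is an illustration under assumptions, not the trained model: Agg is taken as a sum, Act as ReLU, BN uses batch statistics without learned scale and shift, and the tiny graph, features and weights are made up.

```python
import math
import random


def gnn_layer(x, W, b, neighbors, dropout=0.1, training=True, rng=None, eps=1e-5):
    """One layer: x_i' = sum over j in N(i) of ReLU(Dropout(BN(x_j W + b)))."""
    rng = rng or random.Random(0)
    n, d_out = len(x), len(b)
    # Linear transform of every node: h_j = x_j W + b
    h = [[sum(xj[k] * W[k][c] for k in range(len(xj))) + b[c] for c in range(d_out)]
         for xj in x]
    # BN: normalize each output channel over the nodes.
    for c in range(d_out):
        mean = sum(row[c] for row in h) / n
        var = sum((row[c] - mean) ** 2 for row in h) / n
        for row in h:
            row[c] = (row[c] - mean) / math.sqrt(var + eps)
    # Dropout with inverted scaling, active only in training mode.
    if training and dropout > 0.0:
        keep = 1.0 - dropout
        h = [[v / keep if rng.random() < keep else 0.0 for v in row] for row in h]
    # Act: ReLU
    h = [[max(0.0, v) for v in row] for row in h]
    # Agg: sum the transformed messages of each node's neighbors.
    return [[sum(h[j][c] for j in neighbors[i]) for c in range(d_out)]
            for i in range(n)]


# Hypothetical path graph 0 - 1 - 2 with 2-dimensional node features.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only
b = [0.0, 0.0]
neighbors = {0: [1], 1: [0, 2], 2: [1]}

out = gnn_layer(x, W, b, neighbors, training=False)
print(out)
```

A node without neighbors receives an all-zero vector here; adding self-loops to `neighbors` is one common way to keep a node's own features in the update.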
To test the network, I apply 3 conGNN layers to process the graph and a pooling layer to read out the graph or simply eliminate the unrelated nodes. The detailed structure is
- GNN Layer with 128 channels 0.1 dropout
- GNN Layer with 64 channels 0.1 dropout
- GNN Layer with 32 channels 0.1 dropout
- Pooling layer by summing
- Dense Layer to make prediction
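To get a feel for the size of this stack, the sketch below traces the feature width through each stage and counts trainable parameters. The input width of 16 and the single output unit are assumptions (the post does not state them), and only the weight matrix and bias of each layer are counted, not any learned batch-norm parameters.

```python
def trace_architecture(in_dim, hidden=(128, 64, 32), out_dim=1):
    """Return (name, input width, output width, parameter count) per stage."""
    stages = []
    width = in_dim
    for idx, h in enumerate(hidden, start=1):
        # One weight matrix W (width x h) plus one bias vector b (h).
        stages.append((f"gnn_{idx}", width, h, width * h + h))
        width = h
    # Sum pooling collapses the node axis and has no trainable parameters.
    stages.append(("sum_pool", width, width, 0))
    stages.append(("dense", width, out_dim, width * out_dim + out_dim))
    return stages


# in_dim=16 is a hypothetical node feature width.
stages = trace_architecture(in_dim=16)
total = sum(params for _, _, _, params in stages)

for name, w_in, w_out, params in stages:
    print(f"{name:9s} {w_in:4d} -> {w_out:4d}  params={params}")
print("total parameters:", total)
```

Most of the parameters sit in the first two GNN layers, so the input feature width is the main lever on model size.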
The currently trained model shows good convergence of the training loss, but, limited by the data quality, accuracy is still relatively low. Thus I might use the osteoarthritis genotype data from UKB to make further progress.