More than 330 common variants loci have been identified associated with systemic lupus erythematosus (SLE), but their causal mechanisms are rarely well-explained. Many GWAS downstream analysis ideas were proposed, including fine-mapping, colocalization, Mendelian randomization causal inference, and multi-omics integrated analysis. But it is easy to get confused as there are many statisics hypotheses and caveat before using these tools. In this article, I will present my GWAS downstream analysis pipeline using a SLE signal as example.

Table 1. Number of cases and controls in 7 datasets.

SNP array WGS
Dataset HK GZ AH JN Thai HK JN
n_cases 1,604 1,014 877 512 835 1,532 195
n_ctrls 3,324 4,122 1,712 994 2,995 1,1357 369

The number of cases and controls of each dataset were shown in Table 1. Genome-wide association studies were frist conducted to each dataset separately, following a meta-analysis of 7 cohorts. As shown in Fig 1, a locus around GADPH-IFFO1-CHD4 in Chr12 was identified signficiantly associated with SLE. THe top SNP rs56848735 is located in the 5’UTR region of IFFO1 gene. We can see there are many variants have high LD with the top SNP rs56848735. Traditional fine-mapping algorithms like FINEMAP and SuSiE performs bad at this situation as there are many SNPs with high LD and low P values. Functionally-informed fine-mapping (PolyFun) was utilized to identify the candidate causal variants. The results from PolyFun shows there are two SNPs rs56848735 and rs1639123 contains highest posterior inclusion probability (PIP). THe rs1639123 is located in the intronic region that close to the 5’UTR of CHD4 gene.


LocusZoom.png
Fig 1. LocusZoom of identified GAPDH-IFFO1-CHD4 singal.

To further identify the genes that affected by this locus, we downloaded the whole-blood eQTL data from GTEx and eQTLGen databases. Table 2 shows the eQTL P-values of rs56848735 and rs1639123 for genes in this region. CHD4 gene is significant in both GTEx and eQTLGen databases for both candidate causal variants.


Table 2. eQTL P-values of rs56848735 and rs1639123 for genes around this locus.

CD27 TAPBPL VAMP1 ZNF384 GPR162 CHD4 LPAR5
rs56848735 GTEx - - - - - 1.4e-5 -
eQTLGen 5.3e-7 2.8e-8 3.2e-33 3.2e-10 5.6e-9 4.5e-24 5.6e-200
rs1639123 GTEx - - - - - 6.2e-7 -
eQTLGen 8.8e-7 6.7e-16 5.8e-12 9.4e-9 1.6e-5 8.8e-28 2.4e-211

We next performed colocalization study between eQTL and GWAS summary statistics in this region. CHD4 gene is colocalized with GWAS signal in both GTEx and eQTLGen databases. The locus compare plots show the GWAS and eQTL signals are consistent in the cluster of high LD SNPs. However, it looks have some LD inconsistence beteween GWAS and eQTL, because the GTEx and eQTLGen data are mainly based on European population, but our GWAS analysis is based on East Aisn population. The LD situation will be much better if we use a East Asian whole-blood eQTL database.


Colocalization.png
Fig 2. Colocalization and locus compare between GWAS and eQTL summary statistics.

The colocalization analysis shows this locus possibly regulate the expression of CHD4 gene. To further validate this hypothesis, we downloaded the a multiome dataset from 10X genomics. It is a scATAC-seq and scRNA-seq dataset that contains 3,000 cells from PBMC. The scATAC-seq data shows there are open chromatin peaks in both candidate causal variants. It means they are located in the regulatory region, and transcription factor could bind to these two peaks region and regulate gene expression. To identify the gene regulated by them, we tested the association between gene expression ~ peaks fragment by running hurdle negative binomial regression using Open4Gene. The result shows CHD4’s expression is significantly associated with the scATAC peak fragment counts of rs1639123. It suggests the rs1639123 is possibly the causal variant in this locus, and it regulates the expression of CHD4 gene.


Open4Gene.png
Fig 3. scATAC peaks in two candidate causal variants.

We are collecting and merging larger multiome dataset to further replicate our finding here. The manuscript is under preparating, and will be available soon.