Web supplementary information for "Mapping eQTL networks with mixed graphical Markov models" by I. Tur, A. Roverato and R. Castelo, Genetics, 198:1377-1383, 2014
In this web supplementary information we provide R scripts and data to help reproducing the results contained in our paper "Mapping eQTL networks with mixed graphical models", published by the journal Genetics. To run the code from the figures you would have first to run the code at the bottom with the whole analysis which can take a long time. You can spare that step by loading the following resulting objects:cross.Brem.RData | Brem and Kruglyak (2005) data as R/qtl cross object. |
param.RData | Parameter object derived from March 2014 annotations. |
eqtlnet.q0.fdr.nrr.sel.RData | Final resulting object from the analysis. |
RBPids.txt | RNA-binding protein ORF identifiers downloaded from Biomart. |
These are the scripts for the figures.
Figure 1: | Figure1.R |
Figure 2: | Figure2.R |
Figure 3: | Figure3.R |
Figure 4: | The PDF file qpgraphComparison.pdf contains a vignette describing all the steps to obtain the performance comparison between qpgraph and qtlhot (CMSTs) shown in in this figure. A tar ball with the code and all the intermediate resulting files can be found in this link. |
Figure 5: | Figure5.R |
Figure 6: | Figure6.R |
In the paper we use the currently under development version of qpgraph which will become release in the coming month of October, 2014. Using this version of the software and the data we estimated the eQTL network with the following function calls. These functions are still undocumented but they will be in the forthcoming release of Bioconductor:
> library(qtl) > library(qpgraph) > library(snow) > library(rlecuyer) > library(Rmpi) > library(BiocParallel) > load("cross.Brem.RData") ## load objects 'cross.Brem' and 'pMap' > param <- eQTLnetworkEstimationParam(cross.Brem, physicalMap=pMap, + geneAnnotation="TxDb.Scerevisiae.UCSC.sacCer3.sgdGene") > eqtlnet.q0 <- eQTLnetworkEstimate(param, ~ marker + gene, + verbose=TRUE, + BPPARAM=SnowParam(workers=10, type="MPI")) > eqtlnet.q0 eQTLnetwork object: Organism: Saccharomyces cerevisiae Genome: sacCer3 Gene annotation: UCSC Table sgdGene Input size: 1857 markers 6141 genes Model formula: ~marker + gene > eqtlnet.q0.fdr <- eQTLnetworkEstimate(param, + estimate=eqtlnet.q0, + p.value=0.01, method="fdr") > eqtlnet.q0.fdr eQTLnetwork object: Organism: Saccharomyces cerevisiae Genome: sacCer3 Gene annotation: UCSC Table sgdGene Input size: 1857 markers 6141 genes Model formula: ~marker + gene G^(0): 7998 vertices and 2295829 edges corresponding to 92710 eQTL and 2203119 gene-gene associations meeting a fdr-adjusted p-value < 0.01 and involving 6141 genes and 1857 eQTLs > eqtlnet.q0.fdr.nrr <- eQTLnetworkEstimate(param, + ~ marker + gene | gene(q=25,50,75,100), + estimate=eqtlnet.q0.fdr, verbose=TRUE, + BPPARAM=SnowParam(workers=10, type="MPI")) > eqtlnet.q0.fdr.nrr <- eQTLnetworkEstimate(param, + estimate=eqtlnet.q0.fdr.nrr, epsilon=0.1) > eqtlnet.q0.fdr.nrr eQTLnetwork object: Organism: Saccharomyces cerevisiae Genome: sacCer3 Gene annotation: UCSC Table sgdGene Input size: 1857 markers 6141 genes Model formula: ~marker | gene(q = 0, 25, 50, 75, 100) G^(0,25,50,75,100): 7998 vertices and 5297 edges corresponding to 3498 eQTL and 1799 gene-gene associations meeting a fdr-adjusted p-value < 0.01, a non-rejection rate epsilon < 0.10 and involving 6141 genes and 1857 eQTLs > ## number of genes involved in at least one eQTL > eqtls <- alleQTL(eqtlnet.q0.fdr.nrr) > length(unique(eqtls$gene)) [1] 450 > ## median number of eQTLs per gene > median(sapply(split(eqtls$QTL, eqtls$gene), length)) [1] 6 > ## fraction of genes with > 10 eQTLs on the same chromosome > eQTLxChr <- split(eqtls, eqtls$chrom) > eQTLxGenexChr <- lapply(eQTLxChr, function(x) split(x$QTL, x$gene)) > neQTLxGenexChr <- unlist(lapply(eQTLxGenexChr, sapply, length), + use.names=FALSE) > mean(neQTLxGenexChr > 10) [1] 0.2737307 > eqtlnet.q0.fdr.nrr.sel <- eQTLnetworkEstimate(param, + estimate=eqtlnet.q0.fdr.nrr, alpha=0.05) > eqtlnet.q0.fdr.nrr.sel eQTLnetwork object: Organism: Saccharomyces cerevisiae Genome: sacCer3 Gene annotation: UCSC Table sgdGene Input size: 1857 markers 6141 genes Model formula: ~marker | gene(q = 0, 25, 50, 75, 100) G^(0,25,50,75,100,*): 1781 vertices and 2300 edges corresponding to 500 eQTL and 1799 gene-gene associations meeting a fdr-adjusted p-value < 0.01, a non-rejection rate epsilon < 0.10, a forward eQTL selection significance level alpha < 0.05 and involving 1391 genes and 389 loci > ## number of genes involved in at least one eQTL > seleqtls <- alleQTL(eqtlnet.q0.fdr.nrr.sel) > length(unique(seleqtls$gene)) [1] 450 > ## distribution of the number of eQTLs per gene > table(sapply(split(seleqtls$QTL, seleqtls$gene), length)) 1 2 3 402 46 2 > ## number of genes without any eQTL >length(geneNames(eqtlnet.q0.fdr.nrr.sel))-length(unique(seleqtls$gene)) [1] 941
R session information
The code in the previous R scripts was run with the following configuration of R:
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF8 LC_COLLATE=en_US.UTF8
[5] LC_MONETARY=en_US.UTF8 LC_MESSAGES=en_US.UTF8
[7] LC_PAPER=en_US.UTF8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF8 LC_IDENTIFICATION=C
attached base packages:
[1] qpgraph_1.21.25 vimcom_0.9-93 setwidth_1.0-3
[4] colorout_1.0-2
other attached packages:
[1] qpgraph_1.99.2 vimcom_0.9-93 setwidth_1.0-3
[4] colorout_1.0-2
loaded via a namespace (and not attached):
[1] annotate_1.43.5 AnnotationDbi_1.27.10
[3] BatchJobs_1.3 BBmisc_1.7
[5] Biobase_2.25.0 BiocGenerics_0.11.5
[7] BiocParallel_0.99.19 biomaRt_2.21.1
[9] Biostrings_2.33.14 bitops_1.0-6
[11] brew_1.0-6 checkmate_1.4
[13] codetools_0.2-9 DBI_0.3.0
[15] digest_0.6.4 fail_1.2
[17] foreach_1.4.2 genefilter_1.47.6
[19] GenomeInfoDb_1.1.19 GenomicAlignments_1.1.29
[21] GenomicFeatures_1.17.14 GenomicRanges_1.17.40
[23] GGBase_3.27.3 graph_1.43.0
[25] grid_3.1.0 IRanges_1.99.28
[27] iterators_1.0.7 lattice_0.20-29
[29] Matrix_1.1-4 mvtnorm_1.0-0
[31] parallel_3.1.0 qtl_1.33-7
[33] RCurl_1.95-4.3 Rgraphviz_2.9.1
[35] Rsamtools_1.17.33 RSQLite_0.11.4
[37] rtracklayer_1.25.16 S4Vectors_0.2.4
[39] sendmailR_1.1-2 snpStats_1.15.4
[41] splines_3.1.0 stats4_3.1.0
[43] stringr_0.6.2 survival_2.37-7
[45] tools_3.1.0 XML_3.98-1.1
[47] xtable_1.7-4 XVector_0.5.8
[49] zlibbioc_1.11.1