This package provides a number of plot wrappers that create common
bioinformatic analyses visualizations and work with results from our
other packages, such as kimma
and SEARchways
.
As such, you will need to install and load several packages to access
example data.
This pipeline should be completed in R version 4.5.0 (2025-04-11) or newer. Please also download the following packages.
# CRAN packages
install.packages(c("tidyverse","patchwork"))
# Bioconductor packages
## These are dependencies for kimma that are beneficial to download separately
install.packages("BiocManager")
BiocManager::install(c("limma", "edgeR"))
# GitHub packages
install.packages("devtools")
devtools::install_github("BIGslu/kimma")
devtools::install_github("BIGslu/BIGpicture")
devtools::install_github("BIGslu/RNAetc")
devtools::install_github("BIGslu/SEARchways")
And load them into your current R session.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(edgeR)
## Loading required package: limma
library(limma)
library(kimma)
library(BIGpicture)
library(SEARchways)
library(patchwork)
To ensure reproducibility with this document, set the random seed value.
set.seed(651)
Example data were obtained from media controls and human rhinovirus (HRV) infected human plasmacytoid dendritic cells. You can learn more about these data in
Dill-McFarland KA, Schwartz JT, Zhao H, Shao B, Fulkerson PC, Altman MC, Gill MA. 2022. Eosinophil-mediated suppression and Anti-IL-5 enhancement of plasmacytoid dendritic cell interferon responses in asthma. J Allergy Clin Immunol. Epub ahead of print. doi: 10.1016/j.jaci.2022.03.025. — GitHub
Specifically, this vignette uses bulk human RNA-seq processed from fastq to counts using SEAsnake and counts to voom using edgeR/limma. This results in voom-normalized, log2 counts per million (CPM) gene expression and associated sample and gene metadata. In total, 6 donors and 1000 random genes were selected for this vignette. Actual donor identifiers have been altered for privacy.
Unnormalized RNAseq counts and metadata.
example.dat <- kimma::example.dat
names(example.dat)
## [1] "counts" "samples" "genes"
Voom normalized RNAseq counts and metadata.
example.voom <- kimma::example.voom
names(example.voom)
## [1] "genes" "targets" "E" "weights" "design"
kmFit model results.
example.model <- BIGpicture::example.model
Create an additional kmFit model for plotting.
example.model2 <- kimma::kmFit(
dat = example.voom,
model = "~virus*asthma + median_cv_coverage + (1|ptID)",
run_lme = TRUE, use_weights = TRUE, metrics = TRUE)
## lme/lmerel model: expression~virus*asthma+median_cv_coverage+(1|ptID)
## Input: 12 libraries from 6 unique patients
## Model: 12 libraries
## Complete: 1000 genes
## Failed: 0 genes
Create example enrichment results.
enrich <- SEARchways::flexEnrich(
gene_list = list("HRV1"=names(SEARchways::example.gene.list$HRV1),
"HRV2"=names(SEARchways::example.gene.list$HRV2)),
ID = "ENSEMBL", species = "human",
collection = "H")
gsea <- SEARchways::BIGsea(
gene_list = SEARchways::example.gene.list,
ID = "ENSEMBL", species = "human",
collection = "C2", subcollection = "CP")
## Running HRV1
## Running HRV2
## Warning in preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, : There are ties in the preranked stats (1% of the list).
## The order of those tied genes will be arbitrary, which may produce unexpected results.
Used in RNAseq data cleaning counts to voom to visualize the association of expression on the X with variance on the Y.
plot_mv(dat = example.dat, design = "~ virus")
plot_pca
: PCA all in oneCalculate, annotate, and plot a PCA representing gene expression. By
default, we assume unique library IDs are contained in the column
libID
. Change this with the libraryID
parameter.
When running on unnormalized data, transform the counts to log counts per million.
plot_pca(dat = example.dat,
vars = "virus",
scale = TRUE,
transform_logCPM = TRUE)
## Joining with `by = join_by(libID)`
## $virus
When running on normalized data, don’t transform the data (default).
plot_pca(dat = example.voom,
vars = "virus",
scale = TRUE)
## Joining with `by = join_by(libID)`
## $virus
Automatically identify outliers in PC space. See
outlier_group
parameter to define outliers within groups
independently.
plot_pca(dat = example.voom,
vars = "outlier", outlier_sd = 3,
scale = TRUE)
## Joining with `by = join_by(libID)`
## $outlier
Plot multiple variables and visualize together with
patchwork
.
plot_pca(dat = example.voom,
vars = c("virus","asthma","median_cv_coverage", "norm.factors"),
scale = TRUE) %>%
patchwork::wrap_plots(nrow=2)
## Joining with `by = join_by(libID)`
Other parameters
PCx
and PCy
outlier_group
libraryID
Coming soon: plot_pca2
plot_fit2
: Compare fit metricsThese functions use the output from kimma::kmFit
.
Compare model fits between 2 models. Note that it is important to
choose comparable models - this means the same number of samples and
changing only 1 model attribute at a time e.g. adding a covariate,
adding weights, adding a random effect, etc. Be careful if you have
missing data. For example, if you have ~ variable
vs
~ variable + covariate
but the covariate
has
some missing data, changes in fit could be due to the
covariate
itself and/or the smaller sample size since
samples missing any model variable are removed during
modeling.
Here, we compare a model with and without kinship correction. The kmFit result has two model results (lme and lmerel), so you need only 1 model result input. Note the results printed in the console. These summarize genes better fit by one model or another. An absolute change in AIC > 7 is significant, change from 2-7 is moderate, and < 2 is non-significant. These cutoffs are also shown as red lines in the plot.
names(example.model)
## [1] "lme" "lmerel" "lme.contrast" "lmerel.contrast"
## [5] "lme.fit" "lmerel.fit"
plot_fit2(model_result = example.model,
x="lme", y="lmerel", metrics = "AIC")
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Nonsignif (< 2) AIC 1000 0.2471154 0.3143377
If you’re models are in two separate objects:
plot_fit2(model_result = example.model, x="lme",
model_result_y = example.model2, y="lme",
metrics = "AIC")
## Summary: AIC and BIC
## Best fit
## 1 Moderate (2-7) lme_virus_asthma_median_cv_coverage_virus:asthma_(1 | ptID)
## 2 Nonsignif (< 2)
## 3 Signif (> 7) lme_virus_asthma_median_cv_coverage_virus:asthma_(1 | ptID)
## Metric Total genes Mean delta Stdev delta
## 1 AIC 727 4.439370 1.3281120
## 2 AIC 110 1.258108 0.5218337
## 3 AIC 163 9.782923 2.6771227
Note that the model name is not very readable. You can change the labels.
plot_fit2(model_result = example.model, x="lme",
x_label = "no covariate",
model_result_y = example.model2, y="lme",
y_label = "+medianCV",
metrics = "AIC")
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Moderate (2-7) +medianCV AIC 727 4.439370 1.3281120
## 2 Nonsignif (< 2) AIC 110 1.258108 0.5218337
## 3 Signif (> 7) +medianCV AIC 163 9.782923 2.6771227
Plot a different metric. Choose anything in
example.model$lme.fit
or similar. Note that the cutoffs
only apply to AIC and BIC
plot_fit2(model_result = example.model,
x="lme", y="lmerel",
metrics = c("BIC","Rsq"))
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Nonsignif (< 2) BIC 1000 0.2471154 0.3143377
## Summary: Other metrics
## Best fit Metric Total genes Mean delta Stdev delta
## 1 lme Rsq 273 0.01979372 0.02558724
## 2 lmerel Rsq 466 0.01729597 0.01973940
## 3 none Rsq 261 0.00000000 0.00000000
Add dots to see outlying genes.
plot_fit2(model_result = example.model, x="lme",
x_label = "no covariate",
model_result_y = example.model2, y="lme",
y_label = "+medianCV",
metrics = "AIC",
outliers = TRUE)
## Joining with `by = join_by(name)`
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Moderate (2-7) +medianCV AIC 727 4.439370 1.3281120
## 2 Nonsignif (< 2) AIC 110 1.258108 0.5218337
## 3 Signif (> 7) +medianCV AIC 163 9.782923 2.6771227
Label genes with the largest changes in fit.
plot_fit2(model_result = example.model, x="lme",
x_label = "no covariate",
model_result_y = example.model2, y="lme",
y_label = "+medianCV",
metrics = "AIC",
label = 3)
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Moderate (2-7) +medianCV AIC 727 4.439370 1.3281120
## 2 Nonsignif (< 2) AIC 110 1.258108 0.5218337
## 3 Signif (> 7) +medianCV AIC 163 9.782923 2.6771227
Change the labels to another gene identifier.
plot_fit2(model_result = example.model, x="lme",
x_label = "no covariate",
model_result_y = example.model2, y="lme",
y_label = "+medianCV",
metrics = "AIC",
label = 3,
genes = example.voom$genes, genes_label = "hgnc_symbol")
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Moderate (2-7) +medianCV AIC 727 4.439370 1.3281120
## 2 Nonsignif (< 2) AIC 110 1.258108 0.5218337
## 3 Signif (> 7) +medianCV AIC 163 9.782923 2.6771227
Only plot a subset of genes.
plot_fit2(model_result = example.model, x="lme",
x_label = "no covariate",
model_result_y = example.model2, y="lme",
y_label = "+medianCV",
metrics = "AIC",
subset_genes = example.voom$genes$geneName[1:100])
## Summary: AIC and BIC
## Best fit Metric Total genes Mean delta Stdev delta
## 1 Moderate (2-7) +medianCV AIC 71 4.398087 1.2628561
## 2 Nonsignif (< 2) AIC 12 1.380337 0.5232679
## 3 Signif (> 7) +medianCV AIC 17 9.942274 2.3146413
plot_volcano
: Volcano plotPlot all variables from one model.
plot_volcano(model_result = example.model, model = "lme")
Select variable(s) of interest.
plot_volcano(model_result = example.model, model = "lme",
variables = "virus")
Plot uncorrected P-values instead of FDR.
plot_volcano(model_result = example.model, model = "lme",
variables = "virus",
y="pval")
Color significant genes based on FDR and/or estimate.
plot_volcano(model_result = example.model, model = "lme",
variables = "virus",
x_cutoff = 1, y_cutoff = 0.05)
Label most significant genes.
plot_volcano(model_result = example.model, model = "lme",
variables = "virus",
x_cutoff = 1, y_cutoff = 0.05,
label = 3)
Change the gene labels to a different gene indentifier.
plot_volcano(model_result = example.model, model = "lme",
variables = "virus",
x_cutoff = 1, y_cutoff = 0.05,
label = 3,
genes = example.voom$genes,
genes_label = "hgnc_symbol")
plot_venn_genes
: Venn plot of DEGsFunctions the same as plot_upset_genes
.
Plot all variables at multiple FDR cutoffs.
plot_venn_genes(model_result = example.model,
models = "lme")$venn %>%
patchwork::wrap_plots(ncol=2)
Select specific FDR cutoff(s).
plot_venn_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05)
## $venn
## $venn$`0.05`
Select variable(s) of interest.
plot_venn_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05,
variables = c("asthma","virus"))
## $venn
## $venn$`0.05`
If you’re using a contrasts model, specify contrasts instead.
plot_venn_genes(model_result = example.model,
models = "lme.contrast",
fdr_cutoff = 0.05,
contrasts = c("HRV healthy - none healthy",
"HRV asthma - none asthma"))
## $venn
## $venn$`0.05`
Include the random effect and/or intercept if the model has one.
plot_venn_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05,
random = TRUE, intercept = TRUE)
## $venn
## $venn$`0.05`
Save the genes in each part of the Venn’s overlap for downstream analyses.
venn_result <- plot_venn_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05,
return_genes = TRUE)
venn_result$venn
## $`0.05`
venn_result$gene$`0.05` %>% head
## gene virus:asthma asthma virus
## 1 ENSG00000002587 <NA> <NA> Y
## 2 ENSG00000013810 <NA> Y Y
## 3 ENSG00000023287 <NA> <NA> Y
## 4 ENSG00000034510 <NA> <NA> Y
## 5 ENSG00000050130 <NA> <NA> Y
## 6 ENSG00000054118 <NA> <NA> Y
Compare DEGs in two different models.
plot_venn_genes(model_result = list("noCov"=example.model,
"medianCV"=example.model2),
models = "lme",
variables = "virus",
fdr_cutoff = 0.05)
## $venn
## $venn$`0.05`
Note that labels might get cutoff if they are too long, so you may
need to expand the plot limits. A good way to do this is to first find
the current limits by adding a theme with xy axes and then setting
lims
on the single Venn result. For example, here we expand
the y limits (even though the labels are not cutoff).
plot_venn_genes(model_result = list("noCov"=example.model,
"medianCV"=example.model2),
models = c("lme","lme"),
variables = "virus",
fdr_cutoff = 0.05)$venn$`0.05` +
theme_bw()
plot_venn_genes(model_result = list("noCov"=example.model,
"medianCV"=example.model2),
models = c("lme","lme"),
variables = "virus",
fdr_cutoff = 0.05)$venn$`0.05` +
lims(y=c(-1,2))
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
plot_upset_genes
: Upset plot of DEGsFunctions the same as plot_venn_genes
.
Plot all variables at multiple FDR cutoffs.
plot_upset_genes(model_result = example.model,
models = "lme")$upset %>%
patchwork::wrap_plots(ncol=2)
Select specific FDR cutoff(s).
plot_upset_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05)
## $upset
## $upset$`0.05`
Select variable(s) of interest.
plot_upset_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05,
variables = c("asthma","virus"))
## $upset
## $upset$`0.05`
If you’re using a contrasts model, specify contrasts instead.
plot_upset_genes(model_result = example.model,
models = "lme.contrast",
fdr_cutoff = 0.5,
contrasts = c("HRV healthy - none healthy",
"HRV asthma - none asthma"))
## $upset
## $upset$`0.5`
Include the random effect and/or intercept if the model has one.
plot_upset_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05,
random = TRUE, intercept = TRUE)
## $upset
## $upset$`0.05`
Save the genes in each part of the Venn’s overlap for downstream analyses.
upset_result <- plot_upset_genes(model_result = example.model,
models = "lme",
fdr_cutoff = 0.05,
return_genes = TRUE)
upset_result$upset
## $`0.05`
upset_result$gene$`0.05` %>% head
## # A tibble: 6 × 4
## gene virus asthma `virus:asthma`
## <chr> <chr> <chr> <chr>
## 1 ENSG00000002587 Y <NA> <NA>
## 2 ENSG00000005884 <NA> Y <NA>
## 3 ENSG00000006712 <NA> Y <NA>
## 4 ENSG00000013810 Y Y <NA>
## 5 ENSG00000018610 <NA> Y <NA>
## 6 ENSG00000023287 Y <NA> <NA>
Compare DEGs in two different models.
plot_upset_genes(model_result = list("noCov"=example.model,
"medianCV"=example.model2),
models = "lme",
variables = "virus",
fdr_cutoff = 0.05)
## $upset
## $upset$`0.05`
plot_genes
: Expression and model result summary for 1
geneA single wrapper function to get a table of model result and boxplots associated with model variables for one gene at a time.
plot_genes(dat = example.voom,
fdr = example.model$lme,
subset_genes = "ENSG00000173950",
variables = "virus")
## $ENSG00000173950
Plot multiple variables including interaction terms, which also add the main model terms associated with the interaction (e.g. if A:B, also plots A and B)
plot_genes(dat = example.voom,
fdr = example.model$lme,
subset_genes = "ENSG00000173950",
variables = "virus:asthma")
## $ENSG00000173950
Color boxplots by a variable of interest. This can be a variable already being plotted or a new variable.
plot_genes(dat = example.voom,
fdr = example.model$lme,
subset_genes = "ENSG00000173950",
variables = c("virus","asthma"),
colorID = "virus")
## $ENSG00000173950
plot_genes(dat = example.voom,
fdr = example.model$lme,
subset_genes = "ENSG00000173950",
variables = c("virus","asthma"),
colorID = "ptID")
## $ENSG00000173950
Plot many genes quickly using multiple processors. Not run in this document to save space.
deg <- example.model$lme %>% filter(FDR<0.001) %>% pull(gene)
length(deg)
plot.ls <- plot_genes(dat = example.voom,
fdr = example.model$lme,
subset_genes = deg,
variables = "virus",
processors = 6)
#Save each plot
#Change settings as needed
outdir <- "~/Desktop/"
fig.width <- 8
fig.height <- 5
for(gene in names(plot.ls)){
f <- paste0(outdir, gene, ".png")
ggsave(filename = f, plot = plot.ls[[gene]],
width = fig.width, height = fig.height)
}
Other parameters
edgeR
or
voom
object with counts
, meta
,
and optionally genes
libraryID
and geneID
to change themThese functions use the outputs from
SEARchways::BIGprofiler
and
SEARchways::flexEnrich
.
plot_enrich2
: Lollipop plotThe default plot for plot_enrich2
is what we call a
lollipop plot, and it only shows results with FDR < 0.2.
plot_enrich2(df = enrich)
plot_enrich2
: DotplotChange the chart_style
to a dotplot
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
chart_style = "dot")
Use an outline to denote results below an FDR cutoff. Note this only applies to dotplots.
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
chart_style = "dot",
dot_sig_cutoff = 0.2)
plot_enrich2
: BarplotChange the chart_style
to a barplot.
Note that barplots can only be created with a single group of enrichment results.
plot_enrich2(df = enrich %>%
dplyr::filter(group=="HRV2"),
fdr_cutoff = 0.3,
chart_style = "bar")
plot_enrich2
: Shared parametersYou can change the FDR cutoff and/or FDR colors groups.
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
fdr_binned_colors = c(0.2, 0.3, 1))
For set the FDR coloring to a log color gradient.
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
fdr_continuous = TRUE)
Order the gene sets / pathways. By default, they are ordered by fdr. You can also order by “hclust” of gene membership, “overlap_size”, “gs_size”, “ratio”, “fdr”, or “input”.
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
y_grouping_method = "gs_size")
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
y_grouping_method = "hclust")
#etc...
Add total gene set sizes.
plot_enrich2(df = enrich,
fdr_cutoff = 0.3,
include_gssize = TRUE)
Other parameters
SEARchways
results with _col
parameters
(pathway_col
, gene_col
,
ratio_col
, fdr_col
, gssize_col
,
dot_groupcol
)
plot_enrichgrid
: Add gene level info to
plot_enrich2
plot#Modify pathway names for a prettier plot
enrich <- enrich %>%
mutate(pathway = gsub("HALLMARK_","",pathway),
pathway = gsub("_","\n",pathway))
plot_enrichgrid(df = enrich,
df_group = "HRV2")
plot_enrichgrid(df = enrich,
df_group = "HRV2",
chart_style = "lollipop")
Order genes in the heatmap including “hclust”, “prevalence”, “input”, “alphabetical”, or “geneset”.
plot_enrichgrid(df = enrich,
df_group = "HRV2",
x_grouping_method = "prevalence")
Color genes by a binary prevalence cutoff instead of a heatmap.
plot_enrichgrid(df = enrich,
df_group = "HRV2",
prevalence_color = "cutoff",
prevalence_cutoff = 0.5)
Only plot specific genes of interest.
plot_enrichgrid(df = enrich,
df_group = "HRV2",
custom_genelist = c("ENSG00000023228", "ENSG00000158864"))
Quickly get the equivalent plot without the gene grid without having
to convert to plot_enrich2
.
plot_enrichgrid(df = enrich,
df_group = "HRV2",
include_grid = FALSE)
Other parameters
plot_enrich2
Shared
parameters for info on fdr_cutoff
,
fdr_binned_colors
, y_grouping_method
,
pathway_col
, gene_col
, ratio_col
,
fdr_col
, gssize_col
chart_style
only supports “bar” or “lollipop”
for this functionplot_gsea
: Lollipop plotPlot a lollipop plot of GSEA normalized enrichment scores (NES). Will
be superseded by plot_fit2
.
Here, we use a ridiculous FDR cutoff, because there are actually no significant results in the example data.
plot_gsea(gsea = gsea,
fdr_cutoff = 0.8, fdr_colors = c(0.5))
Get some genes of interest. These can be ENSEMBL or HGNC. Here, we pull some genes from gene sets in MSigDB, but you may wish to plot DEGs, GSEA leading edge, etc.
db <- msigdbr::msigdbr(species = "Homo sapiens", collection = "H")
genes <- db %>%
filter(gs_name %in% c("HALLMARK_OXIDATIVE_PHOSPHORYLATION",
"HALLMARK_HYPOXIA"))%>%
pull(ensembl_gene) %>% unique()
genes <- genes[c(1:20,250:270)]
#Add some results from enrichment so plot is more interesting
genes <- unique(c(genes, "ENSG00000079739", "ENSG00000105220", "ENSG00000114023",
"ENSG00000122863", "ENSG00000136521", "ENSG00000154174",
"ENSG00000158864", "ENSG00000023228"))
map_string
: Map genes to STRING databaseMap genes of interest to the STRING network database. You can update the STRING version as needed. By default, connections are retained with combined score > 400 (e.g. medium confidence).
map <- map_string(genes, version = 12.0)
names(map)
## [1] "map" "subgraph"
head(map$map)
## gene STRING_id
## 1 ENSG00000144476 9606.ENSP00000272928
## 2 ENSG00000148926 9606.ENSP00000436607
## 3 ENSG00000170425 9606.ENSP00000304501
## 4 ENSG00000162433 9606.ENSP00000378743
## 5 ENSG00000131016 9606.ENSP00000384537
## 6 ENSG00000149925 9606.ENSP00000496166
Change the combined scored threshold. For example, to “strong” confidence connections.
map <- map_string(genes, version = 12.0,
score_threshold = 700)
Other parameters
species = 'mouse'
plot_string
: Plot a STRING mapNow you can plot the STRING network map.
plot_string(map = map)
Control the layout if you don’t like the automatically chosen one.
See options in igraph::layout_with_
plot_string(map = map,
layout = "mds")
plot_string(map = map,
layout = "kk")
Remove orphans (nodes with no edge connections).
plot_string(map = map,
edge_min = 1)
Plot just orphans (nodes with no edge connections).
plot_string(map = map,
edge_max = 0)
Plot just the largest cluster. The large “lgl” layout is usually best for these.
plot_string(map = map, layout = "lgl",
main_cluster_only = TRUE)
Map enrichment or GSEA results from SEARchways
to your
network.
enrich.select <- enrich %>%
filter(group=="HRV2")
plot_string(map = map, layout = "lgl",
main_cluster_only = TRUE,
enrichment = enrich.select)
Change the cutoffs of what enrichment results are included.
plot_string(map = map, layout = "lgl",
main_cluster_only = TRUE,
enrichment = enrich.select,
overlap = 1, fdr_cutoff = 0.5)
Only included genes in selected (by cutoffs) enrichment pathways.
plot_string(map = map, layout = "lgl",
main_cluster_only = TRUE,
enrichment = enrich.select,
overlap = 1, fdr_cutoff = 0.5,
enriched_only = TRUE)
Change the aesthetics of the plot.
plot_string(map = map, layout = "lgl",
main_cluster_only = TRUE,
enrichment = enrich.select,
text_size = 1,
node_size = 2,
colors = c("#CC6677", "#44AA99", "grey"))
get_hm_clust
: Extract genes in
ComplexHeatmap
clustersCreate a heatmap to apply the function to.
library(ComplexHeatmap)
## Loading required package: grid
## ========================================
## ComplexHeatmap version 2.24.0
## Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
## Github page: https://github.com/jokergoo/ComplexHeatmap
## Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
##
## If you use it in published research, please cite either one:
## - Gu, Z. Complex Heatmap Visualization. iMeta 2022.
## - Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional
## genomic data. Bioinformatics 2016.
##
##
## The new InteractiveComplexHeatmap package can directly export static
## complex heatmaps into an interactive Shiny app with zero effort. Have a try!
##
## This message can be suppressed by:
## suppressPackageStartupMessages(library(ComplexHeatmap))
## ========================================
hm.dat <- example.model$lme %>%
select(gene, variable, pval) %>%
pivot_wider(names_from = variable, values_from = pval) %>%
column_to_rownames("gene") %>% as.matrix()
# The km determine cutoffs for clusters
example.hm <- Heatmap(hm.dat, row_km=3, column_km=2)
example.hm <- draw(example.hm)
Extract row (gene usually) clusters.
get_hm_clust(dat = hm.dat,
hm = example.hm, dimension = "row") %>%
head()
## row row_within_cluster cluster
## 581 ENSG00000115568 1 cluster1
## 582 ENSG00000111642 2 cluster1
## 583 ENSG00000061794 3 cluster1
## 584 ENSG00000120875 4 cluster1
## 585 ENSG00000167325 5 cluster1
## 586 ENSG00000164300 6 cluster1
Extract column (variable or sample usually) clusters.
get_hm_clust(dat = hm.dat,
hm = example.hm, dimension = "col")
## col col_within_cluster cluster
## 4 (1 | ptID) 1 cluster1
## 1 virus:asthma 1 cluster2
## 2 asthma 2 cluster2
## 3 virus 3 cluster2
plot_enrich
: Lollipop plotThere is also the older plot_enrich
that only supports
lollipop plots and will be deprecated once plot_enrich2
can
handle GSEA results.
plot_enrich(enrich = enrich)
plot_fit
: Compare fit metricsThere is also the original plot_fit
, which has a subset
of the plot_fit2
parameters. Each point is a gene colored
by the best fit model (delta AIC > or < 0). This plot is less
preferred as it is difficult to see small changes in AIC and many genes
overlap.
plot_fit(model_result = example.model, x="lme",
model_result_y = example.model, y="lmerel")
## Summary
## Best fit Metric Total genes Mean delta
## 1 lme_virus_asthma_virus:asthma_(1 | ptID) AIC 273 0.3958556
## 2 lmerel_virus_asthma_virus:asthma_(1 | ptID) AIC 466 0.2983838
## 3 none AIC 261 0.0000000
## Stdev delta
## 1 0.4044870
## 2 0.2582185
## 3 0.0000000
sessionInfo()
## R version 4.5.0 (2025-04-11)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.4.1
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Los_Angeles
## tzcode source: internal
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ComplexHeatmap_2.24.0 patchwork_1.3.0 SEARchways_1.5.0
## [4] BIGpicture_1.2.0 kimma_1.6.3 edgeR_4.6.1
## [7] limma_3.64.0 lubridate_1.9.4 forcats_1.0.0
## [10] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.4
## [13] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
## [16] ggplot2_3.5.2 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 shape_1.4.6.1 rstudioapi_0.17.1
## [4] jsonlite_2.0.0 magrittr_2.0.3 farver_2.1.2
## [7] rmarkdown_2.29 GlobalOptions_0.1.2 fs_1.6.6
## [10] vctrs_0.6.5 memoise_2.0.1 rstatix_0.7.2
## [13] htmltools_0.5.8.1 plotrix_3.8-4 broom_1.0.8
## [16] Formula_1.2-5 sass_0.4.10 KernSmooth_2.23-26
## [19] bslib_0.9.0 gsubfn_0.7 plyr_1.8.9
## [22] cachem_1.1.0 igraph_2.1.4 lifecycle_1.0.4
## [25] ggnetwork_0.5.13 iterators_1.0.14 pkgconfig_2.0.3
## [28] Matrix_1.7-3 R6_2.6.1 fastmap_1.2.0
## [31] clue_0.3-66 digest_0.6.37 colorspace_2.1-1
## [34] S4Vectors_0.46.0 chron_2.3-62 RSQLite_2.3.9
## [37] ggpubr_0.6.0 labeling_0.4.3 timechange_0.3.0
## [40] httr_1.4.7 polyclip_1.10-7 abind_1.4-8
## [43] compiler_4.5.0 bit64_4.6.0-1 withr_3.0.2
## [46] doParallel_1.0.17 backports_1.5.0 BiocParallel_1.42.0
## [49] carData_3.0-5 viridis_0.6.5 DBI_1.2.3
## [52] ggupset_0.4.1 ggforce_0.4.2 gplots_3.2.0
## [55] ggsignif_0.6.4 MASS_7.3-65 rjson_0.2.23
## [58] gtools_3.9.5 caTools_1.18.3 tools_4.5.0
## [61] scatterpie_0.2.4 msigdbr_10.0.2 glue_1.8.0
## [64] cluster_2.1.8.1 ggvenn_0.1.10 fgsea_1.34.0
## [67] generics_0.1.3 gtable_0.3.6 textshape_1.7.5
## [70] tzdb_0.5.0 data.table_1.17.0 hms_1.1.3
## [73] tidygraph_1.3.1 car_3.1-3 utf8_1.2.4
## [76] BiocGenerics_0.54.0 ggrepel_0.9.6 foreach_1.5.2
## [79] pillar_1.10.2 yulab.utils_0.2.0 babelgene_22.9
## [82] circlize_0.4.16 tweenr_2.0.3 lattice_0.22-7
## [85] bit_4.6.0 tidyselect_1.2.1 locfit_1.5-9.12
## [88] knitr_1.50 gridExtra_2.3 IRanges_2.42.0
## [91] sqldf_0.4-11 stats4_4.5.0 xfun_0.52
## [94] graphlayouts_1.2.2 statmod_1.5.0 matrixStats_1.5.0
## [97] proto_1.0.0 stringi_1.8.7 ggfun_0.1.8
## [100] yaml_2.3.10 evaluate_1.0.3 codetools_0.2-20
## [103] ggraph_2.2.1 hash_2.2.6.3 cli_3.6.5
## [106] STRINGdb_2.20.0 jquerylib_0.1.4 Rcpp_1.0.14
## [109] msigdbdf_24.1.1 png_0.1-8 parallel_4.5.0
## [112] assertthat_0.2.1 blob_1.2.4 bitops_1.0-9
## [115] viridisLite_0.4.2 scales_1.4.0 crayon_1.5.3
## [118] GetoptLong_1.0.5 rlang_1.1.6 cowplot_1.1.3
## [121] fastmatch_1.1-6