ANOVERENCE¶
Please not that it is currently not recommended to run ANOVERENCE due to inconsitencies with the original implementation that we were not able to clarify with the original author
ANOVERENCE ([Kueffner2012]) employs the \(\eta^2\) metric, a nonlinear correlation coefficient derived from an analysis of variance (ANOVA) ([Cohen1973]). It is one of the few methods that make direct use of experiment metadata, like perturbations, knockouts and overexpressions.
Running ANOVERENCE¶
ANOVERENCE needs a minimum of three input files:
-i, --infile
: An expression matrix (genes are columns, samples are rows) without headers.-g, --genes
: A file containing gene names that correspond to columns in the expression matrix.-e, --features
: A file that contains experiment metadata.
Here is an example matrix containing expression data for five genes in ten samples:
0.4254475 0.0178292 0.9079888 0.4482474 0.1723238
0.4424002 0.0505248 0.8693676 0.4458513 0.1733112
1.0568470 0.2084539 0.4674478 0.5050774 0.2448833
1.1172264 0.0030010 0.3176543 0.3872039 0.2537921
0.9710677 0.0010565 0.3546514 0.4745322 0.2077183
1.1393856 0.1220468 0.4024654 0.3484362 0.1686139
1.0648694 0.1405077 0.4817628 0.4748571 0.1826433
0.8761173 0.0738140 1.0582917 0.7303661 0.0536562
1.2059661 0.1534070 0.7608608 0.6558457 0.1577311
1.0006755 0.0789863 0.8036309 0.8389751 0.0883061
In the genes files, we provide the column headers for the expression matrix in order:
G1
G2
G3
G4
G5
The metadata file contains eight columns plus one row for each sample. If a
column is not applicable, provide NA
as input. Note that this file has
headers:
Experiment Perturbations PerturbationLevels Treatment DeletedGenes OverexpressedGenes Time Repeat
1 NA NA NA NA NA NA 1
1 NA NA NA NA NA NA 2
2 NA NA NA NA NA NA 1
3 NA NA NA NA NA NA 1
3 NA NA NA NA NA NA 2
4 NA NA NA NA NA NA 1
4 NA NA NA NA NA NA 2
5 NA NA NA NA NA NA 1
5 NA NA NA NA NA NA 2
5 NA NA NA NA NA NA 3
Further we need to provide a -w, --weight
, typically an integral value between
10 and 1000 that controls how much more weight we give to perturbation experiments that involve the genes that are tested. Once we have those four parameters, we are
ready to run anoverence
:
anoverence -i exr_mat.tsv -g genes.txt -e meta.tsv -w 500
As output we receive a lower triangular matrix of interaction scores:
0.288087
0.388856 0.405731
0.459865 0.276648 0.336653
0.432748 0.374432 0.397973 0.403535
Running ANOVERENCE for a subset of genes¶
Often we have only a small number of genes of interest. We can instruct
ANOVERENCE
to only calculate interactions involving those genes by
providing a -t, --targets
file containing these gene names:
G3
G4
And running it with the -t, --targets
options:
anoverence -i expr_mat.tsv -g genes.txt -e meta.tsv -w 500 -t targets.txt
In this case we will receive an edge list as output:
G3 G1 0.388856
G4 G1 0.459865
G3 G2 0.405731
G4 G2 0.276648
G4 G3 0.336653
G3 G5 0.397973
G4 G5 0.403535