Getting network statistics¶
Centrality¶
How scores are used in seidr stats
¶
We typically use scores as measures of similarity in seidr
workflows. This means that higher is better. As an example in this network:
A B 1
A C 0.5
the edge A<->B is stronger than A<->C. In centrality metrics we often use weights as either similarity, or distance. When we e.g., calculate betweenness centrality, we want to know the shortest path from A to B, therefore weights are usually interpreted as distances here, and therefor lower is better.
By default seidr
assumes that weights are similarities and handles them as such. When sensible, it will use \(\frac{1}{w}\). If your data represents a distance, you must use the flag --weight-is-distance
, otherwise your outcome will be wrong. If you set this flag, seidr will use \(\frac{1}{w}\) for calculations where it assumes the weight indicates a similarity (i.e. the behaviour is inverse). See metrics below where the similarity [S] and distance [D] metrics are indicated.
Metrics¶
seidr
can calculate a limited number of network centrality statistics on SeidrFiles
.
On any SeidrFile
you can run:
seidr stats seidrfile.sf
to calculate the network centrality statistics. By default all metrics that can be calculated will be. Use the -m,--metrics
option to control this (see above as to the meaning of [S] and [D]):
For nodes
PR
- PageRank [S]CLO
- Closeness centrality [D]BTW
- Betweenness centrality [D]STR
- Strength (weighted degree) centrality [S]EV
- Eigenvector centrality [S]KTZ
- Katz centrality [S]LPC
- Laplacian centrality [S]
For edges
SEC
- Spanning edge centrality [D]EBC
- Edge betweenness centrality [D]
To select only few of these run e.g.:
stats stats -m BTW,CLO seidrfile.sf
Approximate vs exact¶
By default, seidr
uses approximations where it can to compute centrality statistics. It will sample -n,--nsamples
nodes to do so. If not specified, that number is 10% of nodes. If your network is small, you can turn on exact metrics with -e,--exact
.
Viewing stats¶
You can view node level statistics with:
seidr view --centrality seidrfile.sf
Edge level statistics are stored as edge attributes. You can add tags to see which attributes correspond to which stat:
seidr view -a seidrfile.sf