
Find recommended overlap threshold for EnrichMap

Usage

mem_find_overlap(
  mem,
  overlap_range = c(0.1, 0.99),
  max_cutoff = 0.4,
  adjust = -0.01,
  debug = FALSE,
  ...
)

Arguments

mem

list output from multiEnrichMap()

overlap_range

numeric vector of length 2 giving the range of Jaccard overlap values to test, default c(0.1, 0.99), evaluated in steps of 0.01.

max_cutoff

numeric value between 0 and 1, to define the maximum fraction of nodes in the largest connected component, compared to the total number of non-singlet nodes.

adjust

numeric value added to the final overlap; the default -0.01 uses the overlap one step before the one with the max O score.

debug

logical indicating whether to return full debug data, which is used internally to determine the best overlap cutoff to use.

...

additional arguments are passed to mem2emap().

Value

numeric value with recommended Jaccard overlap coefficient.

Details

This function implements a straightforward approach to determine a reasonable Jaccard overlap threshold for EnrichMap data. The approach is still open to improvement as more experience is gained from using it on varied datasets.

The method finds the overlap threshold at which the largest connected component contains no more than the max_cutoff fraction of the whole network. This fraction is defined as the number of nodes in the largest connected component, divided by the total number of non-singlet nodes. When all non-singlet nodes are connected, this fraction equals 1.
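The fraction described above can be illustrated with a minimal base R sketch. This is not the code used by mem_find_overlap(); the helper names largest_component_fraction() and scan_overlap(), the toy Jaccard matrix, the choice to return the lowest qualifying threshold, and the value 0 returned for a network with no edges are all assumptions made for illustration. The adjust step is omitted.

```r
# Fraction of non-singlet nodes that fall in the largest connected component.
# `adj` is a symmetric logical adjacency matrix (hypothetical input).
largest_component_fraction <- function(adj) {
  diag(adj) <- FALSE                      # ignore self-overlap
  non_singlet <- which(rowSums(adj) > 0)  # nodes with at least one edge
  if (length(non_singlet) == 0) return(0) # assumption: no edges gives 0
  label <- rep(NA_integer_, nrow(adj))
  current <- 0L
  for (start in non_singlet) {
    if (!is.na(label[start])) next
    current <- current + 1L
    label[start] <- current
    queue <- start
    # breadth-first search labels every node reachable from `start`
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      nbrs <- which(adj[node, ] & is.na(label))
      label[nbrs] <- current
      queue <- c(queue, nbrs)
    }
  }
  max(table(label[non_singlet])) / length(non_singlet)
}

# Scan thresholds and return the lowest one whose fraction is <= max_cutoff.
# Taking the lowest qualifying threshold is an assumption of this sketch.
scan_overlap <- function(jaccard, overlap_range = c(0.1, 0.99),
                         max_cutoff = 0.4, step = 0.01) {
  thresholds <- seq(overlap_range[1], overlap_range[2], by = step)
  fractions <- vapply(thresholds, function(th) {
    largest_component_fraction(jaccard >= th)
  }, numeric(1))
  ok <- which(fractions <= max_cutoff)
  if (length(ok) == 0) return(NA_real_)
  thresholds[ok[1]]
}

# Toy data: two tight triplets (1-2-3 and 4-5-6) joined by one weak edge (3-4).
jaccard <- diag(6)
jaccard[cbind(c(1, 2, 4, 5), c(2, 3, 5, 6))] <- 0.805
jaccard[3, 4] <- 0.205
jaccard <- pmax(jaccard, t(jaccard))  # make the matrix symmetric

print(largest_component_fraction(jaccard >= 0.1))
print(scan_overlap(jaccard, max_cutoff = 0.5))
```

In this toy network the weak edge joins everything into one component at low thresholds, so the fraction is 1 until the threshold exceeds the weak edge's Jaccard value. Because the denominator counts only non-singlet nodes, the fraction is not guaranteed to decrease steadily as the threshold rises, which is one reason a real implementation may need more care than this sketch.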

We found empirically that max_cutoff=0.4, the point at which the largest connected component contains no more than 40% of all non-singlet nodes, is a reasonably good place to start.