Choose interesting annotation colnames from a data.frame
Source: R/jam_choose_annotation_colnames.R
Choose interesting annotation colnames from a data.frame
Usage
choose_annotation_colnames(
  df,
  min_reps = 2,
  min_values = 2,
  max_values = Inf,
  keep_numeric = FALSE,
  simplify = TRUE,
  max_colnames = 20,
  ...
)
Arguments
- df
 data.frame with annotations that could be interesting to display at the top or side of a heatmap.
- min_reps
 numeric minimum number of replicates required for a column to be considered interesting. For example, min_reps=3 requires at least one value in a column to be repeated at least 3 times for that column to be interesting. This filter is intended to remove columns whose values are all unique, such as row identifiers.
- min_values
 numeric minimum number of unique values required for a column to be considered interesting.
- max_values
 numeric maximum number of unique values allowed for a column to be considered interesting. With too many values the annotation stops being informative, and the color key becomes unwieldy with too many labels.
- keep_numeric
 logical indicating whether to keep columns with numeric values. When keep_numeric == TRUE, numeric columns are kept regardless of the rules above.
- simplify
 logical indicating whether to filter out columns whose data already matches another column with 1:1 cardinality. This step requires platjam::cardinality() until that function is moved into the jamba package.
- max_colnames
 numeric maximum number of colnames to return. Note that columns are not sorted for priority, so they are returned in the order they appear in df after applying the relevant criteria.
- ...
 additional arguments are ignored.
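To make the interplay of min_reps, min_values, max_values, and keep_numeric concrete, the following is a minimal base R sketch of the filters as described above. It is not the jamses implementation: the helper name sketch_choose_colnames() is hypothetical, the simplify and max_colnames steps are omitted, and it assumes numeric columns are dropped outright when keep_numeric is FALSE.

```r
# Hypothetical helper, not part of jamses: illustrates the documented filters only.
sketch_choose_colnames <- function(df,
   min_reps=2,
   min_values=2,
   max_values=Inf,
   keep_numeric=FALSE)
{
   keep <- vapply(df, function(x) {
      # Assumption: numeric columns bypass the other rules entirely
      if (is.numeric(x)) {
         return(keep_numeric)
      }
      counts <- table(x)
      n_unique <- length(counts)
      # the column needs an acceptable number of unique values,
      # and at least one value repeated min_reps or more times
      n_unique > 0 &&
         n_unique >= min_values &&
         n_unique <= max_values &&
         max(counts) >= min_reps
   }, logical(1))
   if (!any(keep)) {
      return(NULL)
   }
   names(df)[keep]
}

# Small illustrative data.frame (made up for this sketch)
sketch_df <- data.frame(
   group=c("a", "a", "a", "b", "c"),
   id=paste0("id_", 1:5),
   num=c(5, 3, 1, 4, 2),
   const=rep("x", 5))
sketch_result <- sketch_choose_colnames(sketch_df)
sketch_result
```

In this sketch "id" fails min_reps because every value is unique, "const" fails min_values because it has only one value, and "num" is skipped unless keep_numeric=TRUE.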
Value
character vector of colnames in df that meet the criteria.
If no colnames meet the criteria, this function returns NULL.
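The simplify step described under Arguments removes columns that map 1:1 onto a column already selected. The sketch below approximates that idea in base R without platjam::cardinality(). The helpers is_one_to_one() and drop_redundant() are hypothetical names, and keeping the first column of each redundant pair is an assumption rather than documented behavior.

```r
# Hypothetical helpers, not part of jamses or platjam.

# Two vectors are 1:1 when the number of distinct (x, y) pairs equals
# both the number of distinct x values and the number of distinct y values.
is_one_to_one <- function(x, y) {
   n_pairs <- nrow(unique(data.frame(x=x, y=y)))
   n_pairs == length(unique(x)) && n_pairs == length(unique(y))
}

# Walk the candidate columns in order and keep a column only when it
# is not 1:1 with a column that was already kept (assumed tie-break).
drop_redundant <- function(df, cols) {
   kept <- character(0)
   for (col in cols) {
      redundant <- any(vapply(kept, function(k) {
         is_one_to_one(df[[k]], df[[col]])
      }, logical(1)))
      if (!redundant) {
         kept <- c(kept, col)
      }
   }
   kept
}

# Illustrative data: "class" carries the same grouping as "group"
cardinality_df <- data.frame(
   group=c("a", "a", "b", "c", "c"),
   class=c("A", "A", "B", "C", "C"),
   batch=c("x", "y", "x", "y", "y"))
simplified <- drop_redundant(cardinality_df, c("group", "class", "batch"))
simplified
```

Here "class" is dropped because each of its values pairs with exactly one value of "group", while "batch" is kept because it partitions the rows differently.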
See also
Other jamses utilities: 
contrast2comp_dev(),
fold_to_log2fold(),
intercalate(),
list2im_opt(),
log2fold_to_fold(),
make_block_arrow_polygon(),
mark_stat_hits(),
matrix_normalize(),
point_handedness(),
point_slope_intercept(),
shortest_unique_abbreviation(),
shrinkDataFrame(),
shrink_df(),
shrink_matrix(),
sort_samples(),
strsplitOrdered(),
sub_split_vector(),
update_function_params(),
update_list_elements()
Examples
df <- data.frame(
   threereps=paste0("threereps_", letters[c(1,1,1,3,5,7,7)]),
   time=paste0("time_", letters[c(1:7)]),
   tworeps=paste0("tworeps_", letters[c(12,12,14,14,15,15,16)]),
   num=sample(1:7),
   class=paste0("class_", LETTERS[c(1,1,1,3,5,7,7)]),
   blah=rep("blah", 7),
   maxvalues=c("one", "two", "three", "four", "five", "six", "six"))
df
#>     threereps   time   tworeps num   class blah maxvalues
#> 1 threereps_a time_a tworeps_l   5 class_A blah       one
#> 2 threereps_a time_b tworeps_l   4 class_A blah       two
#> 3 threereps_a time_c tworeps_n   7 class_A blah     three
#> 4 threereps_c time_d tworeps_n   2 class_C blah      four
#> 5 threereps_e time_e tworeps_o   3 class_E blah      five
#> 6 threereps_g time_f tworeps_o   1 class_G blah       six
#> 7 threereps_g time_g tworeps_p   6 class_G blah       six
choose_annotation_colnames(df)
#> [1] "threereps" "tworeps"   "maxvalues"
df[,choose_annotation_colnames(df)]
#>     threereps   tworeps maxvalues
#> 1 threereps_a tworeps_l       one
#> 2 threereps_a tworeps_l       two
#> 3 threereps_a tworeps_n     three
#> 4 threereps_c tworeps_n      four
#> 5 threereps_e tworeps_o      five
#> 6 threereps_g tworeps_o       six
#> 7 threereps_g tworeps_p       six
choose_annotation_colnames(df, max_values=5)
#> [1] "threereps" "tworeps"  
df[,choose_annotation_colnames(df, max_values=5)]
#>     threereps   tworeps
#> 1 threereps_a tworeps_l
#> 2 threereps_a tworeps_l
#> 3 threereps_a tworeps_n
#> 4 threereps_c tworeps_n
#> 5 threereps_e tworeps_o
#> 6 threereps_g tworeps_o
#> 7 threereps_g tworeps_p
choose_annotation_colnames(df, simplify=FALSE)
#>   threereps     tworeps       class   maxvalues 
#> "threereps"   "tworeps"     "class" "maxvalues" 
df[,choose_annotation_colnames(df, simplify=FALSE)]
#>     threereps   tworeps   class maxvalues
#> 1 threereps_a tworeps_l class_A       one
#> 2 threereps_a tworeps_l class_A       two
#> 3 threereps_a tworeps_n class_A     three
#> 4 threereps_c tworeps_n class_C      four
#> 5 threereps_e tworeps_o class_E      five
#> 6 threereps_g tworeps_o class_G       six
#> 7 threereps_g tworeps_p class_G       six
choose_annotation_colnames(df, min_reps=3)
#> [1] "threereps"
choose_annotation_colnames(df, min_reps=1)
#> [1] "threereps" "time"      "tworeps"   "maxvalues"
choose_annotation_colnames(df, keep_numeric=TRUE)
#> [1] "threereps" "tworeps"   "num"       "maxvalues"
choose_annotation_colnames(df, min_reps=1)
#> [1] "threereps" "time"      "tworeps"   "maxvalues"
choose_annotation_colnames(df, min_reps=1, keep_numeric=TRUE)
#> [1] "threereps" "time"      "tworeps"   "num"       "maxvalues"