Identify the Best Parameters For Your Dataset

Usage

find_best_params(
  x,
  genelist,
  bins_count_range = c(5, 10, 20, 40),
  gene_count_range = c(10, 20, 40, 80),
  bootstrap_iterations = 200,
  BPPARAM = BiocParallel::SerialParam(),
  ...
)

Arguments

x

The object to create `BlaseData` from

genelist

The list of genes to use, ordered from best to worst

bins_count_range

The n_bins values to try

gene_count_range

The n_genes values to try
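
The example output further down shows one result row per pairing of a bin count with a gene count. As a rough illustration only (this is base R, not the package's internal code, and `param_grid` is a hypothetical name), the default ranges would imply a grid like this:

```r
# Default ranges copied from the Usage section
bins_count_range <- c(5, 10, 20, 40)
gene_count_range <- c(10, 20, 40, 80)

# Hypothetical helper object: one row per (gene_count, bin_count) pairing.
# This is an assumption about how the ranges combine, not package code.
param_grid <- expand.grid(
    gene_count = gene_count_range,
    bin_count = bins_count_range
)

# Number of parameter combinations implied by the two ranges
nrow(param_grid)
```

Under that assumption the number of attempts is the product of the two range lengths, so widening either range increases the amount of work accordingly.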

bootstrap_iterations

Iterations for bootstrapping when calculating confident mappings.

BPPARAM

The BiocParallel configuration. Defaults to SerialParam.

...

Parameters passed on to child functions; see as.BlaseData()

Value

A data frame of the results, including the following columns:

  • bin_count: The bin count for this attempt

  • gene_count: The top n genes to use for this attempt

  • min_convexity: The worst convexity for these parameters

  • mean_convexity: The mean convexity for these parameters

  • confident_mapping_pct: The percentage of bins that were confidently mapped to themselves for these parameters. If this value is low, few or no results are likely to be confidently mapped in real use.
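
One way to read the returned data frame is to sort it so the most promising combinations come first. The sketch below uses made-up values in a hypothetical `results` data frame that only mimics the documented columns, and it assumes that a higher confident_mapping_pct and a higher min_convexity are preferable. The scale of the values here is illustrative, not taken from the package.

```r
# Hypothetical results with the documented column names (values are made up)
results <- data.frame(
    bin_count = c(5, 5, 10, 10),
    gene_count = c(20, 40, 20, 40),
    min_convexity = c(0.10, 0.35, 0.05, 0.30),
    mean_convexity = c(0.40, 0.60, 0.30, 0.55),
    confident_mapping_pct = c(60, 90, 40, 90)
)

# Sort by confident mapping first, then by worst-case convexity,
# both in descending order (assumption: higher is better for both)
ranked <- results[order(
    -results$confident_mapping_pct,
    -results$min_convexity
), ]

# The first row is the top candidate under this ranking
head(ranked, 1)
```

Ranking on min_convexity rather than mean_convexity favours combinations whose worst case is still acceptable. Which criterion matters most will depend on the dataset.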

See also

plot_find_best_params_results() for plotting the results of this function.

Examples

# Toy data: 100 genes x 70 cells
ncells <- 70
ngenes <- 100
counts_matrix <- matrix(
    c(seq_len(3500) / 10, seq_len(3500) / 5),
    ncol = ncells,
    nrow = ngenes
)
sce <- SingleCellExperiment::SingleCellExperiment(assays = list(
    normcounts = counts_matrix, logcounts = log(counts_matrix)
))
colnames(sce) <- paste0("cell", seq_len(ncells))
rownames(sce) <- paste0("gene", seq_len(ngenes))
sce$cell_type <- c(
    rep("celltype_1", ncells / 2),
    rep("celltype_2", ncells / 2)
)

# Assign each cell a pseudotime value and use every gene as a candidate
sce$pseudotime <- seq_len(ncells) - 1
genelist <- rownames(sce)

# Finding the best params for the BlaseData
best_params <- find_best_params(
    sce, genelist,
    bins_count_range = c(2, 3),
    gene_count_range = c(20, 50),
    pseudotime_slot = "pseudotime",
    split_by = "pseudotime_range"
)
best_params
#>   column_label bin_count gene_count min_convexity mean_convexity
#> 1            1         2         20             0              0
#> 2            2         2         50             0              0
#> 3            1         3         20             0              0
#> 4            2         3         50             0              0
#>   confident_mapping_pct
#> 1                     0
#> 2                     0
#> 3                     0
#> 4                     0
plot_find_best_params_results(best_params)