This function sets the configuration for `archR`.

archRSetConfig(
  innerChunkSize = 500,
  kMin = 2,
  kMax = 20,
  modSelType = "stability",
  tol = 10^-3,
  bound = 10^-8,
  cvFolds = 5,
  parallelize = FALSE,
  nCoresUse = NA,
  nIterationsUse = 500,
  alphaBase = 0,
  alphaPow = 1,
  minSeqs = 25,
  checkpointing = TRUE,
  flags = list(debugFlag = FALSE, timeFlag = FALSE, verboseFlag = TRUE,
    plotVerboseFlag = FALSE)
)

Arguments

innerChunkSize

Numeric. Specify the size of the inner chunks of sequences.

kMin

Numeric. Specify the minimum of the range of values to be tested for number of NMF basis vectors.

kMax

Numeric. Specify the maximum of the range of values to be tested for number of NMF basis vectors.

modSelType

Character. Specify the model selection strategy to be used. Default is 'stability'. The other option is 'cv', short for cross-validation. Warning: The cross-validation approach can be more time-consuming and computationally expensive than the stability-based approach.

tol

Numeric. Specify the tolerance value as criterion for choosing the most appropriate number of NMF factors. Default is 1e-03.

bound

Numeric. Specify the lower bound value as criterion for choosing the most appropriate number of NMF factors. Default is 1e-08.

cvFolds

Numeric. Specify the number of cross-validation folds used for model selection. Only used when modSelType is set to 'cv'. Default value is 5.
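
As a sketch of how the cross-validation arguments fit together, the snippet below builds the argument list in base R and calls archRSetConfig only if archR is installed. The names cvArgs and cvConfig are illustrative, and the iteration count of 500 is simply the upper end suggested on this page, not a tuned value.

```r
# Arguments for a cross-validation based configuration (illustrative values)
cvArgs <- list(
    modSelType = "cv",      # switch from the default 'stability'
    cvFolds = 5,            # only used when modSelType is 'cv'
    nIterationsUse = 500    # cross-validation may need more iterations
)

# Call archRSetConfig only when archR is available in this R session
cvConfig <- NULL
if (requireNamespace("archR", quietly = TRUE)) {
    cvConfig <- do.call(archR::archRSetConfig, cvArgs)
}
```

All arguments not listed in cvArgs keep the defaults shown in the usage above.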

parallelize

Logical. Specify whether to parallelize the procedure. Note that running archR serially can be time-consuming. Consider parallelizing with at least 2 or 4 cores. If Slurm is available, archR's graphical user interface, accessed with run_archR_UI, lets you provide all input data, set the archR configuration, and run archR directly by submitting and monitoring Slurm jobs through the user interface.

nCoresUse

The number of cores to be used when `parallelize` is set to TRUE. If `parallelize` is FALSE, nCoresUse is ignored.
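
One possible way to choose a value for nCoresUse is to query the machine with parallel::detectCores() from base R. This is only a sketch: the policy of leaving one core free and capping at 4 is an assumption made for illustration, not an archR recommendation, and parArgs is a hypothetical name.

```r
# detectCores() can return NA on some platforms, so fall back to 1
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Illustrative policy: leave one core free, use at most 4, never fewer than 1
nCores <- max(1L, min(4L, available - 1L))

# Only request parallel execution when more than one core will be used
parArgs <- list(parallelize = nCores > 1, nCoresUse = nCores)
```

The resulting parallelize and nCoresUse values can then be passed to archRSetConfig along with any other arguments.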

nIterationsUse

Numeric. Specify the number of bootstrapped iterations to be performed with NMF. The default shown in the usage above is 500. When using cross-validation, more than 100 (up to 500) iterations may be needed.

alphaBase, alphaPow

Specify the base value and the power for computing 'alpha' when performing model selection for NMF: alpha = alphaBase^alphaPow. Alpha specifies the regularization for NMF. Defaults: 0 and 1, respectively. Warning: these arguments are currently not used (reserved for future use).
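
The formula alpha = alphaBase^alphaPow can be checked directly in R. The second pair of values below is hypothetical and is shown only to illustrate the arithmetic.

```r
# Defaults from the usage above: 0^1 gives alpha = 0 (no regularization)
alphaBase <- 0
alphaPow <- 1
alpha <- alphaBase^alphaPow

# Hypothetical non-default values: 10^-2 gives alpha = 0.01
alphaExample <- 10^-2
```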

minSeqs

Numeric. Specify the minimum number of sequences, such that any cluster/chunk of size less than or equal to it will not be further processed/clustered.

checkpointing

Logical. Specify whether to write intermediate checkpoints to disk as RDS files. Checkpoints and the final result are saved to disk only if the oDir argument is set in archR. When oDir is not provided or is NULL, this setting is ignored. Default is TRUE.

flags

List with four Logical elements, as detailed below.

debugFlag

Whether debug information for the run is printed

verboseFlag

Whether verbose information for the run is printed

plotVerboseFlag

Whether verbose plotting is performed for the run

timeFlag

Whether timing information is printed for the run
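
A minimal sketch of building the flags list and checking that it has the four expected Logical elements before passing it on. The check is an illustration in base R. Whether archRSetConfig performs its own validation is not stated here, and the names expected and flagsOk are hypothetical.

```r
# The four flags, with the default values from the usage above
flags <- list(
    debugFlag = FALSE,
    timeFlag = FALSE,
    verboseFlag = TRUE,
    plotVerboseFlag = FALSE
)

# Check that exactly the four named elements are present
# and that each is a single Logical value
expected <- c("debugFlag", "timeFlag", "verboseFlag", "plotVerboseFlag")
flagsOk <- setequal(names(flags), expected) &&
    all(vapply(flags, function(f) is.logical(f) && length(f) == 1,
               logical(1)))
```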

Value

A list with all archR parameters set.

Examples

# Set archR configuration
archRconfig <- archR::archRSetConfig(
    parallelize = TRUE,
    nCoresUse = 2,
    nIterationsUse = 100,
    kMin = 1,
    kMax = 20,
    modSelType = "stability",
    tol = 10^-4,
    bound = 10^-8,
    innerChunkSize = 100,
    flags = list(debugFlag = TRUE, timeFlag = TRUE, verboseFlag = TRUE,
        plotVerboseFlag = FALSE)
)