This function, for each sample, will independently compute counts for each tile per cell and then infer gene activity scores.
addGeneScoreMatrix(
  input = NULL,
  genes = getGenes(input),
  geneModel = "exp(-abs(x)/5000) + exp(-1)",
  matrixName = "GeneScoreMatrix",
  extendUpstream = c(1000, 1e+05),
  extendDownstream = c(1000, 1e+05),
  geneUpstream = 5000,
  geneDownstream = 0,
  useGeneBoundaries = TRUE,
  useTSS = FALSE,
  extendTSS = FALSE,
  tileSize = 500,
  ceiling = 4,
  geneScaleFactor = 5,
  scaleTo = 10000,
  excludeChr = c("chrY", "chrM"),
  blacklist = getBlacklist(input),
  threads = getArchRThreads(),
  parallelParam = NULL,
  subThreading = TRUE,
  force = FALSE,
  logFile = createLogFile("addGeneScoreMatrix")
)
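A string giving a "gene model function" used for weighting peaks for gene score calculation. This string should be a function of x, as in the default "exp(-abs(x)/5000) + exp(-1)", where x is the stranded distance from the transcription start site of the gene.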
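To make the shape of the default gene model concrete, the string can be evaluated for a few distances in base R. This is a minimal sketch: the distances are made-up values, and evaluating the string with `eval(parse(...))` is an assumption for illustration, not a description of what ArchR does internally.

```r
# Default gene model string, copied from the usage above.
geneModel <- "exp(-abs(x)/5000) + exp(-1)"

# Hypothetical distances (bp) between a tile and a gene.
x <- c(0, 1000, 5000, 100000)

# Evaluate the string with x bound to the distances above.
# (Illustrative only; ArchR's internal evaluation is not shown here.)
weights <- eval(parse(text = geneModel), envir = list(x = x))

round(weights, 4)
# 1.3679 1.1866 0.7358 0.3679
```

The weight decays with distance and levels off at exp(-1), so distant tiles still contribute a small, non-zero amount.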
The name to be used for storage of the gene activity score matrix in the provided 'input'.
The minimum and maximum number of basepairs upstream of the transcription start site to consider for gene activity score calculation.
The minimum and maximum number of basepairs downstream of the transcription start site or transcription termination site (based on 'useTSS') to consider for gene activity score calculation.
An integer describing the number of bp upstream of the gene by which to extend the gene body. This effectively makes the gene body larger, because there are proximal peaks that should be weighted equally to the gene body. This parameter is used if 'useTSS=FALSE'.
An integer describing the number of bp downstream of the gene by which to extend the gene body. This effectively makes the gene body larger, because there are proximal peaks that should be weighted equally to the gene body. This parameter is used if 'useTSS=FALSE'.
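The gene body extension described above can be sketched with a plain data frame. The gene coordinates are invented, and the strand handling (upstream means lower coordinates on "+" and higher coordinates on "-") is an assumption made for this illustration rather than ArchR's own code.

```r
# Hypothetical genes on one chromosome (coordinates in bp).
genes <- data.frame(
  symbol = c("geneA", "geneB"),
  start  = c(10000, 50000),
  end    = c(20000, 58000),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

geneUpstream   <- 5000  # documented default
geneDownstream <- 0     # documented default

# Extend each gene body, treating "upstream" relative to the strand.
plus <- genes$strand == "+"
extended <- genes
extended$start <- ifelse(plus, genes$start - geneUpstream, genes$start - geneDownstream)
extended$end   <- ifelse(plus, genes$end + geneDownstream, genes$end + geneUpstream)

# Keep coordinates on the chromosome.
extended$start <- pmax(extended$start, 1)

extended
```

With the defaults, geneA (plus strand) gains 5000 bp before its start, while geneB (minus strand) gains 5000 bp after its end.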
A boolean value indicating whether gene boundaries should be employed during gene activity score calculation. Gene boundaries refers to the process of preventing tiles from contributing to the gene score of a given gene if there is a second gene's transcription start site between the tile and the gene of interest.
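The gene boundary rule can be illustrated with a small base R helper. This is a sketch under simplifying assumptions: `tileMayContribute` is a hypothetical function, the TSS positions are invented, and the minimum and maximum extension windows are ignored.

```r
# Hypothetical TSS positions (bp) for three genes on one chromosome.
tss <- c(geneA = 20000, geneB = 60000, geneC = 90000)

# TRUE if no other gene's TSS lies strictly between the tile and the
# TSS of the gene of interest.
tileMayContribute <- function(tilePos, gene, tss) {
  lo <- min(tilePos, tss[[gene]])
  hi <- max(tilePos, tss[[gene]])
  others <- tss[names(tss) != gene]
  !any(others > lo & others < hi)
}

tileMayContribute(50000, "geneB", tss)  # TRUE
tileMayContribute(50000, "geneC", tss)  # FALSE: geneB's TSS at 60000 intervenes
```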
A boolean describing whether to build the gene model based on the gene TSS (TRUE) or the gene body (FALSE).
A boolean describing whether to extend the gene TSS. By default useTSS uses the 1bp TSS while this parameter enables the extension of this region with 'geneUpstream' and 'geneDownstream' respectively.
The size of the tiles used for binning counts prior to gene activity score calculation.
The maximum counts per tile allowed. This is used to prevent large biases in tile counts.
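Tile binning and the per-tile ceiling can be sketched in a few lines of base R. The insertion positions are invented, the 1-based tile indexing is an assumed convention, and `maxCount` is a hypothetical name standing in for the 'ceiling' argument (chosen to avoid masking base R's `ceiling()` function).

```r
tileSize <- 500  # documented default
maxCount <- 4    # stands in for the documented default of 'ceiling'

# Hypothetical insertion positions (bp) for one cell on one chromosome.
positions <- c(12, 130, 140, 250, 499, 480, 501, 1200, 1250)

# Assign each position to a 1-based tile index (assumed convention).
tileIdx <- floor((positions - 1) / tileSize) + 1

# Count insertions per tile, then cap each tile at the ceiling.
counts <- tabulate(tileIdx)
capped <- pmin(counts, maxCount)

counts  # 6 1 2
capped  # 4 1 2
```

The first tile holds six insertions but contributes only four after capping, which limits the influence of a single heavily covered tile.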
A numeric scaling factor to weight genes based on the inverse of their length, i.e. (Scale Factor)/(Gene Length). The weight is scaled from 1 to the scale factor: the smallest genes receive a weight equal to the scale factor, while extremely large genes receive a weight closer to 1. This scaling helps with the relative gene score value.
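One way to map inverse gene length onto the range from 1 to the scale factor is a min-max rescale, sketched below. The gene lengths are invented and the exact rescaling formula is an assumption for illustration; ArchR's own computation may differ in detail.

```r
geneScaleFactor <- 5  # documented default

# Hypothetical gene lengths (bp).
geneLength <- c(short = 1000, medium = 20000, long = 1000000)

# Inverse length, rescaled linearly onto [1, geneScaleFactor]
# (assumed mapping, not taken from the ArchR source).
inv <- 1 / geneLength
geneWeight <- 1 + (inv - min(inv)) / (max(inv) - min(inv)) * (geneScaleFactor - 1)

round(geneWeight, 4)
```

Under this mapping the shortest gene gets a weight of 5 and the longest gets a weight of 1, matching the behavior described above.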
Each column in the calculated gene score matrix will be normalized to a column sum designated by 'scaleTo'.
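The column normalization can be sketched on a toy matrix. The score values are invented, and a dense base R matrix is used here only for brevity.

```r
scaleTo <- 10000  # documented default

# Hypothetical raw gene scores: rows are genes, columns are cells.
scores <- matrix(
  c(2, 6, 2,
    0, 5, 15),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2"))
)

# Divide each column by its sum, then scale to the target column sum.
normalized <- sweep(scores, 2, colSums(scores), "/") * scaleTo

colSums(normalized)  # both columns sum to 10000
```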
A character vector containing the names of chromosomes to be excluded from this analysis.
The number of threads to be used for parallel computing.
A list of parameters to be passed for BiocParallel/batchtools parallel computing.
A boolean determining whether to use additional threads within each multi-threaded subprocess, where possible, if the number of threads is greater than the number of input samples.
A boolean value indicating whether to force the matrix indicated by 'matrixName' to be overwritten if it already exists in the given 'input'.
The path to a file to be used for logging ArchR output.
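A call might look like the following sketch. It assumes the ArchR package is installed and that `proj` is an existing ArchRProject created elsewhere; both `proj` and `gsArgs` are hypothetical names, and the call is skipped when either prerequisite is missing. The return value is not described in this section, so it is simply stored in `result`.

```r
# Arguments to override; all other arguments keep their documented defaults.
gsArgs <- list(
  matrixName = "GeneScoreMatrix",
  tileSize   = 500,
  useTSS     = FALSE,
  force      = TRUE   # overwrite an existing matrix of the same name
)

# `proj` is a hypothetical ArchRProject; nothing runs without it.
if (requireNamespace("ArchR", quietly = TRUE) && exists("proj")) {
  result <- do.call(ArchR::addGeneScoreMatrix, c(list(input = proj), gsArgs))
}
```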