multivelo.recover_dynamics_chrom

multivelo.recover_dynamics_chrom(adata_rna, adata_atac=None, gene_list=None, max_iter=5, init_mode='invert', device='cpu', neural_net=False, adam=False, adam_lr=None, adam_beta1=None, adam_beta2=None, batch_size=None, model_to_run=None, plot=False, parallel=True, n_jobs=None, save_plot=False, plot_dir=None, rna_only=False, fit=True, fit_decoupling=True, extra_color_key=None, embedding='X_umap', n_anchors=500, k_dist=1, thresh_multiplier=1.0, weight_c=0.6, outlier=99.8, n_pcs=30, n_neighbors=30, fig_size=(8, 6), point_size=7, partial=None, direction=None, rescale_u=None, alpha=None, beta=None, gamma=None, t_sw=None)

Multi-omic dynamics recovery.

This function optimizes the joint chromatin and RNA model parameters in ODE solutions.

Parameters:
  • adata_rna (AnnData) – RNA anndata object. Required fields: Mu, Ms, and connectivities.

  • adata_atac (AnnData (default: None)) – ATAC anndata object. Required fields: Mc.

  • gene_list (str, list of str (default: highly variable genes)) – Genes to use for model fitting.

  • max_iter (int (default: 5)) – Iterations to run for parameter optimization.

  • init_mode (str (default: ‘invert’)) – Initialization method for switch times. ‘invert’: initial RNA switch time will be computed with scVelo time inversion method. ‘grid’: grid search the best set of switch times. ‘simple’: simply initialize switch times to be 5, 10, and 15.

  • device (str (default: ‘cpu’)) – The device that PyTorch tensor calculations will be run on, for example ‘cpu’ or a CUDA device. Only used in Adam or neural network mode.

  • neural_net (bool (default: False)) – Whether to run time predictions with a neural network or not. Shortens runtime at the expense of accuracy. If False, uses the usual method of assigning each data point to an anchor time point as outlined in the Multivelo paper.

  • adam (bool (default: False)) – Whether MSE minimization is handled by the Adam algorithm or not. When set to the default of False, function uses Nelder-Mead instead.

  • adam_lr (float (default: None)) – The learning rate to use with the Adam algorithm. If adam is False, this value is ignored.

  • adam_beta1 (float (default: None)) – The beta1 parameter for the Adam algorithm. If adam is False, this value is ignored.

  • adam_beta2 (float (default: None)) – The beta2 parameter for the Adam algorithm. If adam is False, this value is ignored.

  • batch_size (int (default: None)) – Number of cells to use per MSE evaluation when running the Adam algorithm. Minibatch training speeds up fitting. Ignored if adam is False.

  • model_to_run (int or list of int (default: None)) – User-specified models for each gene. Possible values are 1 and 2. If None, the model for each gene will be inferred based on expression patterns. If more than one value is given, the best model will be chosen based on the loss of fit.

  • plot (bool or None (default: False)) – Whether to interactively plot the 3D gene portraits. Ignored if parallel is True.

  • parallel (bool (default: True)) – Whether to fit genes in a parallel fashion (recommended).

  • n_jobs (int (default: available threads)) – Number of parallel jobs.

  • save_plot (bool (default: False)) – Whether to save the fitted gene portrait figures as files. This will take some disk space.

  • plot_dir (str (default: plots for multiome and rna_plots for RNA-only)) – Directory to save the plots.

  • rna_only (bool (default: False)) – Whether to only use RNA for fitting (RNA velocity).

  • fit (bool (default: True)) – Whether to fit the models. If False, only pre-determination and initialization will be run.

  • fit_decoupling (bool (default: True)) – Whether to fit decoupling phase (Model 1 vs Model 2 distinction).

  • n_anchors (int (default: 500)) – Number of anchor time-points to generate as a representation of the trajectory.

  • k_dist (int (default: 1)) – Number of anchors to use to determine a cell’s gene time. If more than 1, time will be averaged.

  • thresh_multiplier (float (default: 1.0)) – Multiplier for the heuristic threshold of partial versus complete trajectory pre-determination.

  • weight_c (float (default: 0.6)) – Weighting of scaled chromatin distances when performing 3D residual calculation.

  • outlier (float (default: 99.8)) – The percentile to mark as outlier that will be excluded when fitting the model.

  • n_pcs (int (default: 30)) – Number of principal components used to compute the neighbors for distance smoothing. This can be different from the one used for expression smoothing.

  • n_neighbors (int (default: 30)) – Number of nearest neighbors for distance smoothing. This can be different from the one used for expression smoothing.

  • fig_size (tuple (default: (8,6))) – Size of each figure when saved.

  • point_size (float (default: 7)) – Marker point size for plotting.

  • extra_color_key (str (default: None)) – Extra color key used for plotting. Common choices are leiden, celltype, etc. The colors for each category must be present in one of the AnnData objects, and can be pre-computed with the scanpy.pl.scatter function.

  • embedding (str (default: X_umap)) – 2D coordinates of the low-dimensional embedding of cells.

  • partial (bool or list of bool (default: None)) – User specified trajectory completeness for each gene.

  • direction (str or list of str (default: None)) – User specified trajectory directionality for each gene.

  • rescale_u (float or list of float (default: None)) – Known scaling factors for unspliced. Can be computed from scVelo fit_scaling values as rescale_u = fit_scaling / std(u) * std(s).

  • alpha (float or list of float (default: None)) – Known transcription rates. Can be computed from scVelo fit_alpha values as alpha = fit_alpha * fit_alignment_scaling.

  • beta (float or list of float (default: None)) – Known splicing rates. Can be computed from scVelo fit_beta values as beta = fit_beta * fit_alignment_scaling.

  • gamma (float or list of float (default: None)) – Known degradation rates. Can be computed from scVelo fit_gamma values as gamma = fit_gamma * fit_alignment_scaling.

  • t_sw (float or list of float (default: None)) – Known RNA switch time. Can be computed from scVelo fit_t_ values as t_sw = fit_t_ / fit_alignment_scaling.
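
The conversions from scVelo fit values given for rescale_u, alpha, beta, gamma, and t_sw can be sketched for a single gene as below. This is an illustration only: the helper from_scvelo_fit and the KnownParams container are hypothetical names, not part of MultiVelo or scVelo, and the use of the population standard deviation for std(u) and std(s) is an assumption, since the formula does not say which estimator or which layer is meant.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Sequence


@dataclass
class KnownParams:
    """Per-gene values for rescale_u, alpha, beta, gamma and t_sw."""
    rescale_u: float
    alpha: float
    beta: float
    gamma: float
    t_sw: float


def from_scvelo_fit(
    fit_scaling: float,
    fit_alpha: float,
    fit_beta: float,
    fit_gamma: float,
    fit_t_: float,
    fit_alignment_scaling: float,
    u: Sequence[float],
    s: Sequence[float],
) -> KnownParams:
    """Apply the documented formulas to one gene's scVelo fit values.

    u and s are that gene's unspliced and spliced values across cells.
    """
    # Assumption: population standard deviation; the docs only write std(u), std(s).
    rescale_u = fit_scaling / pstdev(u) * pstdev(s)
    return KnownParams(
        rescale_u=rescale_u,
        alpha=fit_alpha * fit_alignment_scaling,
        beta=fit_beta * fit_alignment_scaling,
        gamma=fit_gamma * fit_alignment_scaling,
        t_sw=fit_t_ / fit_alignment_scaling,
    )


# Made-up numbers for one gene, purely for illustration.
params = from_scvelo_fit(
    fit_scaling=0.5,
    fit_alpha=2.0,
    fit_beta=1.0,
    fit_gamma=0.5,
    fit_t_=10.0,
    fit_alignment_scaling=2.0,
    u=[1.0, 3.0],
    s=[2.0, 6.0],
)
```

For several genes, the same arithmetic would be applied per gene and the results passed as lists, in the same order as gene_list.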

Returns:

  • fit_alpha_c, fit_alpha, fit_beta, fit_gamma (.var) – inferred chromatin opening, transcription, splicing, and degradation (nuclear export) rates

  • fit_t_sw1, fit_t_sw2, fit_t_sw3 (.var) – inferred switching time points

  • fit_rescale_c, fit_rescale_u (.var) – inferred scaling factors for chromatin and unspliced counts

  • fit_scale_cc (.var) – inferred scaling value for chromatin closing rate compared to opening rate

  • fit_alignment_scaling (.var) – ratio used to realign observed time range to 0-20

  • fit_c0, fit_u0, fit_s0 (.var) – initial expression values at earliest observed time

  • fit_model (.var) – inferred gene model

  • fit_direction (.var) – inferred gene direction

  • fit_loss (.var) – loss of model fit

  • fit_likelihood (.var) – likelihood of model fit

  • fit_likelihood_c (.var) – likelihood of chromatin fit

  • fit_anchor_c, fit_anchor_u, fit_anchor_s (.varm) – anchor expressions

  • fit_anchor_c_sw, fit_anchor_u_sw, fit_anchor_s_sw (.varm) – switch time-point expressions

  • fit_anchor_c_velo, fit_anchor_u_velo, fit_anchor_s_velo (.varm) – velocities of anchors

  • fit_anchor_min_idx (.var) – first anchor mapped to observations

  • fit_anchor_max_idx (.var) – last anchor mapped to observations

  • fit_anchor_velo_min_idx (.var) – first velocity anchor mapped to observations

  • fit_anchor_velo_max_idx (.var) – last velocity anchor mapped to observations

  • fit_t (.layers) – inferred gene time

  • fit_state (.layers) – inferred state assignments

  • velo_s, velo_u, velo_chrom (.layers) – velocities in spliced, unspliced, and chromatin space

  • velo_s_genes, velo_u_genes, velo_chrom_genes (.var) – velocity genes

  • velo_s_params, velo_u_params, velo_chrom_params (.var) – fitting arguments used

  • ATAC (.layers) – KNN smoothed chromatin accessibilities copied from adata_atac
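
A typical call and a quick look at the fitted values might be sketched as follows. The wrapper names run_multiome_recovery and summarize_fit are hypothetical helpers introduced here for illustration. The sketch assumes that the function returns an AnnData object carrying the fields listed above and that its .var behaves like a pandas DataFrame; only keyword arguments from the signature are used.

```python
def run_multiome_recovery(adata_rna, adata_atac, genes=None, n_jobs=None):
    """Fit the joint chromatin/RNA model with mostly default settings."""
    # Imported lazily so this sketch can be loaded without the package installed.
    import multivelo as mv

    # Assumption: the call returns an AnnData holding the fields under "Returns".
    adata_result = mv.recover_dynamics_chrom(
        adata_rna,           # needs Mu, Ms and connectivities
        adata_atac,          # needs Mc
        gene_list=genes,     # None falls back to highly variable genes
        max_iter=5,
        init_mode="invert",
        parallel=True,
        n_jobs=n_jobs,       # None uses the available threads
        save_plot=False,
        rna_only=False,
        fit=True,
        n_anchors=500,
    )
    return adata_result


def summarize_fit(
    adata_result,
    columns=(
        "fit_alpha_c",
        "fit_alpha",
        "fit_beta",
        "fit_gamma",
        "fit_model",
        "fit_likelihood",
    ),
):
    """Return the requested per-gene fit columns of .var that are present."""
    present = [c for c in columns if c in adata_result.var.columns]
    return adata_result.var[present]
```

Selecting only the columns that exist keeps the summary usable for runs where some outputs are absent, for example when fit=False and only pre-determination and initialization were run.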