aind_ephys_utils.ops.reduce module

Dimensionality reduction operations.

This module contains xarray-native reduction helpers (e.g., PCA) used by the .ephys.reduce accessor.

aind_ephys_utils.ops.reduce.reduce(da: DataArray, *, method: str, dim: str | None = 'unit', n_components: int | None = 5, stack: Tuple[str, ...] | None = ('trial', 'time'), unstack: bool = True, return_dataset: bool = True, window: Tuple[float, float] | Sequence[Tuple[float, float]] | None = None, window_apply: str = 'fit_only', orthogonalize: str = 'none', orthogonalize_across: str = 'none', trial_dim: str = 'trial', time_dim: str = 'time', trial_average: bool = True, labels: str | DataArray | Sequence[str | DataArray] | None = None, targets: DataArray | None = None, rank: int | None = None, regularization: float | None = None, cv: int | None = None, gpfa_options: Dict[str, object] | None = None) → DataArray | Dataset

Reduce data dimensionality in an xarray-friendly way.
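
As a rough illustration of the default path implied by the signature above (PCA across 'unit', with ('trial', 'time') stacked into samples and unstacked afterwards, keeping 5 components), the following NumPy-only sketch shows the stack, reduce, unstack pattern. It is not the library's implementation: the helper name pca_stack_unstack, the SVD-based PCA, and the random input are assumptions made for this example.

```python
import numpy as np


def pca_stack_unstack(data, n_components=5):
    """PCA across the unit axis of a (trial, time, unit) array.

    Hypothetical helper: stacks (trial, time) into one sample axis,
    reduces across unit, then unstacks the samples again.
    """
    n_trial, n_time, n_unit = data.shape
    # Stack: every (trial, time) pair becomes one sample row.
    samples = data.reshape(n_trial * n_time, n_unit)
    # Center each unit before the decomposition.
    centered = samples - samples.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    weights = vt[:n_components]  # (component, unit)
    explained = (s**2 / np.sum(s**2))[:n_components]  # variance ratios
    # Project, then unstack samples back to (trial, time, component).
    proj = (centered @ weights.T).reshape(n_trial, n_time, n_components)
    return proj, weights, explained


rng = np.random.default_rng(0)
data = rng.normal(size=(20, 50, 12))  # 20 trials, 50 time bins, 12 units
proj, weights, explained = pca_stack_unstack(data)
print(proj.shape, weights.shape)
```

Going by the signature, the corresponding library call would presumably be reduce(da, method="pca") on a DataArray with trial, time, and unit dimensions; the three arrays returned here loosely correspond to the projections, weights, and explained variance described under return_dataset.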

Parameters:

da – Input DataArray.
method – Reduction method (e.g. “pca”, “gpfa”, “dpca”, “coding_direction”, “logistic”, “lda”, “rrr”).
dim – Dimension to reduce across for methods that operate on a single axis.
n_components – Number of components to keep (PCA, dPCA).
stack – Dimensions to stack into a single sample axis before reduction (default: ('trial', 'time')).
unstack – If True, unstack stacked dims in the output.
return_dataset – If True, return a Dataset with projections and weights (and explained variance for PCA). For dPCA and supervised methods, a Dataset is always returned.
window – Optional (tmin, tmax) window, or a sequence of such windows, used for fitting supervised methods.
window_apply – “fit_only” (default) fits on the window but projects all samples; “fit_and_project” fits and projects only within the window.
orthogonalize – How to orthogonalize supervised components: “none”, “qr”, or “svd”.
orthogonalize_across – How to orthogonalize across multiple windows/labels: “none”, “windows”, “labels”, or “all”.
trial_dim – Trial dimension name.
time_dim – Time dimension name.
trial_average – If True (default), average across trials before dPCA marginalization. If trial_dim is absent, input is treated as already averaged and must include label dims directly (e.g., choice).
labels – Coordinate name(s), or DataArray(s) of labels, used for dPCA and supervised methods (coding direction, logistic, lda). Coordinate names must exist in da.coords.
targets – Target matrix for reduced-rank regression.
rank – Rank for reduced-rank regression.
regularization – Regularization strength for supervised methods.
cv – Cross-validation folds for supervised methods.
gpfa_options – Optional dictionary of GPFA configuration overrides, e.g. {"max_iters": 200, "freq_ll": 5, "fast_mode": True, "gp_param_update_every": 5, "random_state": 0}.
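
The difference between the two window_apply modes can be sketched with a simple coding direction, taken here as the unit-norm difference of class means inside a time window. This is a minimal NumPy illustration under stated assumptions, not the library's code: the helper coding_direction, the binary labels, and the synthetic signal on unit 0 are all hypothetical.

```python
import numpy as np


def coding_direction(data, labels, times, window):
    """Unit-norm difference of class means, fit inside a time window.

    Hypothetical helper. data: (trial, time, unit); labels: (trial,)
    holding 0 or 1; times: (time,) bin centers.
    """
    tmin, tmax = window
    in_window = (times >= tmin) & (times <= tmax)
    # Fit: average over the window, then over the trials of each class.
    windowed = data[:, in_window, :].mean(axis=1)  # (trial, unit)
    diff = windowed[labels == 1].mean(axis=0) - windowed[labels == 0].mean(axis=0)
    return diff / np.linalg.norm(diff), in_window


rng = np.random.default_rng(0)
times = np.linspace(-0.5, 1.0, 31)
labels = np.repeat([0, 1], 10)
data = rng.normal(size=(20, 31, 8))
# Synthetic signal: class-1 trials gain activity on unit 0 from t = 0 on.
data[labels == 1, :, 0] += 3.0 * (times >= 0)

w, in_window = coding_direction(data, labels, times, window=(0.2, 0.8))
proj_all = data @ w                   # like "fit_only": project every time bin
proj_win = data[:, in_window, :] @ w  # like "fit_and_project": window bins only
```

In both cases the weights come from the window alone; the modes differ only in which samples are projected, so proj_win is exactly the windowed slice of proj_all.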