aind_ephys_utils.ops package

Module contents

Pure, testable operations used by .ephys accessors.

Functions in this package operate on xarray objects and should avoid side effects.

aind_ephys_utils.ops.align(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], *, events: DataArray | Sequence[float] | ndarray, window: Tuple[float, float], to: str | None = None, dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Align data to an event and extract a time window around it.

Parameters:
  • data – Input data. Supports xarray.DataArray (dense or ragged spikes), dense NumPy arrays, and ragged spike lists.

  • events – Event times used for alignment. Can be an xarray.DataArray with an event coordinate, or array-like event times.

  • window – Alignment window (tmin, tmax) relative to each event.

  • to – Event label to align to when events is xarray-backed.

  • dims – Dimension names for dense NumPy input. Required when data is a dense NumPy array.

  • coords – Optional coordinate mapping used when converting dense NumPy input to xarray.DataArray.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" returns xarray for xarray input and NumPy/list output for NumPy or ragged-list input.

Returns:

Aligned output restricted to window. Continuous or binned input returns dense aligned data whose time axis is expressed relative to the event. Ragged session spikes return ragged spikes split into trials, one per event.

Return type:

xr.DataArray or object

Notes

to is optional for array-like/ndarray events inputs. For xarray events objects, to is required to select an event label.
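The alignment step for ragged spikes can be sketched in plain NumPy. This is a minimal illustration of the idea rather than the package's implementation: the helper name align_spikes is hypothetical, and the half-open interval used at the window edges is an assumption, since the boundary convention is not stated above.

```python
import numpy as np

def align_spikes(spike_times, events, window):
    """Hypothetical helper: one array of event-relative spike times per event."""
    tmin, tmax = window
    spike_times = np.sort(np.asarray(spike_times, dtype=float))
    trials = []
    for t0 in np.asarray(events, dtype=float):
        # Assumption: keep spikes in the half-open interval [t0 + tmin, t0 + tmax).
        lo = np.searchsorted(spike_times, t0 + tmin, side="left")
        hi = np.searchsorted(spike_times, t0 + tmax, side="left")
        # Shift so that the event sits at time zero.
        trials.append(spike_times[lo:hi] - t0)
    return trials

spikes = [0.1, 0.9, 1.2, 2.05, 2.6, 4.0]  # session-level spike times (s)
aligned = align_spikes(spikes, events=[1.0, 2.0], window=(-0.5, 1.0))
```

Each entry of aligned holds the spikes of one trial, so the result stays ragged: trials may contain different numbers of spikes.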

aind_ephys_utils.ops.baseline(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], *, window: Tuple[float, float], dim: str = 'time', mode: str = 'subtract', dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Apply baseline correction over a window.

Parameters:
  • data – Input DataArray or NumPy-like data.

  • window – (tmin, tmax) baseline window.

  • dim – Dimension to baseline-correct.

  • mode – Baseline mode ("subtract", "divide", "zscore").

  • dims – Optional dimension names used when data is a dense NumPy array.

  • coords – Optional coordinate mapping used when constructing a DataArray from dense NumPy input.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" mirrors the input style.

Returns:

Baseline-corrected data in the selected output representation.

Return type:

xr.DataArray or object
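As a rough illustration of the three modes, the sketch below applies baseline correction to a dense array whose last axis is time. The helper baseline_correct is hypothetical, and the exact formulas (baseline mean for "subtract" and "divide", baseline mean and standard deviation for "zscore") are assumptions about what the modes most likely mean, not a statement of the package's behavior.

```python
import numpy as np

def baseline_correct(values, times, window, mode="subtract"):
    """Hypothetical helper: baseline-correct along the last (time) axis."""
    tmin, tmax = window
    # Assumption: the baseline window is half-open, [tmin, tmax).
    mask = (times >= tmin) & (times < tmax)
    base = values[..., mask]
    mean = base.mean(axis=-1, keepdims=True)
    if mode == "subtract":
        return values - mean
    if mode == "divide":
        return values / mean
    if mode == "zscore":
        return (values - mean) / base.std(axis=-1, keepdims=True)
    raise ValueError(f"unknown mode: {mode!r}")

times = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])
rates = np.array([[2.0, 4.0, 9.0, 12.0, 6.0]])  # (trial, time)
corrected = baseline_correct(rates, times, window=(-0.2, 0.0))
ratio = baseline_correct(rates, times, window=(-0.2, 0.0), mode="divide")
```

The statistics are computed only from samples inside the window but applied to every sample, which is what distinguishes baseline correction from normalizing over the whole axis.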

aind_ephys_utils.ops.bin(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], dt: float, window: Tuple[float, float] | None = None, output: str = 'rate', time_unit: str = 's', dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Bin ragged spikes into a dense representation.

Parameters:
  • data – Ragged spikes as a DataArray, a list, or an object-dtype NumPy array.

  • dt – Bin width in seconds.

  • window – Optional (tmin, tmax) window to bin.

  • output – Output type, typically "rate" or "count".

  • time_unit – Unit for time values, recorded in attrs.

  • dims – Optional dimension names used when data is a dense NumPy array. This is ignored for ragged list inputs.

  • coords – Optional coordinate mapping used when constructing a DataArray from dense NumPy input.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" mirrors the input style (xarray in/xarray out, list/NumPy ragged in/list or NumPy out).

Returns:

Binned dense output. For return_type="numpy", returns a NumPy array with the same dense shape as the xarray result values.

Return type:

xr.DataArray or object
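The relationship between the "count" and "rate" outputs can be sketched with np.histogram. The helper bin_spikes is hypothetical, and dividing counts by dt to obtain a rate is an assumption about how "rate" is defined.

```python
import numpy as np

def bin_spikes(trials, dt, window, output="rate"):
    """Hypothetical helper: ragged per-trial spike times -> dense (trial, bin) array."""
    tmin, tmax = window
    n_bins = int(round((tmax - tmin) / dt))
    edges = tmin + dt * np.arange(n_bins + 1)
    counts = np.stack([np.histogram(t, bins=edges)[0] for t in trials])
    # Assumption: rate is spike count divided by bin width (spikes per second).
    return counts / dt if output == "rate" else counts

trials = [[-0.4, 0.1, 0.15, 0.7], [0.3]]  # event-relative spike times (s)
counts = bin_spikes(trials, dt=0.5, window=(-0.5, 1.0), output="count")
rates = bin_spikes(trials, dt=0.5, window=(-0.5, 1.0))
```

Every trial is binned on the same edges, which is what turns the ragged input into a rectangular array.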

aind_ephys_utils.ops.normalize(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], *, dim: str | Tuple[str, ...], method: str = 'zscore', dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Normalize data across one or more dimensions.

Parameters:
  • data – Input DataArray or NumPy-like data.

  • dim – Dimension(s) to normalize across.

  • method – Normalization method (e.g. "zscore", "minmax", "robust").

  • dims – Optional dimension names used when data is a dense NumPy array.

  • coords – Optional coordinate mapping used when constructing a DataArray from dense NumPy input.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" mirrors the input style.

Returns:

Normalized data in the selected output representation.

Return type:

xr.DataArray or object
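For reference, the two most common methods can be sketched on a dense NumPy array as follows. The helper normalize_dense is hypothetical and uses the textbook definitions of z-scoring and min-max scaling; "robust" is left out because its definition is not given above.

```python
import numpy as np

def normalize_dense(values, axis, method="zscore"):
    """Hypothetical helper: normalize a dense array along one axis."""
    if method == "zscore":
        mean = values.mean(axis=axis, keepdims=True)
        std = values.std(axis=axis, keepdims=True)
        return (values - mean) / std
    if method == "minmax":
        lo = values.min(axis=axis, keepdims=True)
        hi = values.max(axis=axis, keepdims=True)
        return (values - lo) / (hi - lo)
    raise ValueError(f"unknown method: {method!r}")

rates = np.array([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])  # (unit, time)
z = normalize_dense(rates, axis=1)
mm = normalize_dense(rates, axis=1, method="minmax")
```

Both units end up on the same scale even though their raw rates differ tenfold, which is the usual reason to normalize per unit before pooling.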

aind_ephys_utils.ops.pseudopop(das: Sequence[DataArray], *, group_by: str | Sequence[str], session_ids: Sequence[object] | None = None, unit_dim: str = 'unit') → DataArray

Build a pseudopopulation by PSTH-averaging and concatenating sessions.

Parameters:
  • das – Session DataArrays to combine.

  • group_by – Trial coord(s) used to compute per-condition PSTHs in each session.

  • session_ids – Optional per-session identifiers. If omitted, IDs are s0, s1, …

  • unit_dim – Unit dimension name used for concatenation.
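The two steps named in the summary, per-condition PSTH averaging followed by concatenation along the unit dimension, can be sketched in NumPy. The helper condition_psth, the (trial, unit, time) layout, and the synthetic data are all assumptions made for illustration; the sketch also assumes every session contains the same conditions and shares a time axis.

```python
import numpy as np

def condition_psth(data, labels, conditions):
    """Hypothetical helper: (trial, unit, time) -> (condition, unit, time)."""
    labels = np.asarray(labels)
    return np.stack([data[labels == c].mean(axis=0) for c in conditions])

rng = np.random.default_rng(0)
conditions = ["left", "right"]

# Two synthetic sessions with different trial counts and unit counts.
s0 = rng.normal(size=(6, 3, 10))
l0 = ["left", "right"] * 3
s1 = rng.normal(size=(4, 5, 10))
l1 = ["left", "left", "right", "right"]

per_session = [condition_psth(s0, l0, conditions), condition_psth(s1, l1, conditions)]
# Trial counts no longer matter after averaging, so units can be stacked.
pseudo = np.concatenate(per_session, axis=1)  # (condition, unit, time)

# Track which session each unit came from, using the default-style IDs s0, s1.
unit_session = np.repeat(["s0", "s1"], [s0.shape[1], s1.shape[1]])
```

Averaging within condition first is what makes the concatenation possible: sessions rarely share trials, but they can share conditions.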

aind_ephys_utils.ops.psth(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], *, dim: str = 'trial', method: str = 'mean', group_by: str | Sequence[str] | None = None, keep_trials: bool = False, dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Reduce across trials to compute a PSTH-style summary.

Parameters:
  • data – Input DataArray or NumPy-like data (binned or continuous).

  • dim – Dimension to reduce across.

  • method – Reduction method (e.g. "mean", "median").

  • group_by – Optional coord name(s) to group along dim before reducing.

  • keep_trials – If True, keep per-trial data along with the summary.

  • dims – Optional dimension names used when data is a dense NumPy array.

  • coords – Optional coordinate mapping used when constructing a DataArray from dense NumPy input.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" mirrors the input style.

Returns:

PSTH summary (or summary plus trials when keep_trials=True) in the selected output representation.

Return type:

xr.DataArray or object
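A minimal NumPy sketch of the reduction, assuming a dense (trial, time) array and a per-trial label array named choice (both hypothetical), shows how the method and group_by options change the result:

```python
import numpy as np

binned = np.array([
    [0.0, 2.0, 4.0],
    [2.0, 2.0, 0.0],
    [4.0, 8.0, 2.0],
])  # (trial, time)

# Reduce across the trial axis with two different methods.
psth_mean = binned.mean(axis=0)
psth_median = np.median(binned, axis=0)

# Grouped reduction: one PSTH per value of a trial-level label.
choice = np.array(["left", "right", "left"])
grouped = {str(c): binned[choice == c].mean(axis=0) for c in np.unique(choice)}
```

Here the mean and median disagree in the second time bin because one trial has an outlying value, which is the typical reason to prefer a median.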

aind_ephys_utils.ops.reduce(da: DataArray, *, method: str, dim: str | None = 'unit', n_components: int | None = 5, stack: Tuple[str, ...] | None = ('trial', 'time'), unstack: bool = True, return_dataset: bool = True, window: Tuple[float, float] | Sequence[Tuple[float, float]] | None = None, window_apply: str = 'fit_only', orthogonalize: str = 'none', orthogonalize_across: str = 'none', trial_dim: str = 'trial', time_dim: str = 'time', trial_average: bool = True, labels: str | DataArray | Sequence[str | DataArray] | None = None, targets: DataArray | None = None, rank: int | None = None, regularization: float | None = None, cv: int | None = None, gpfa_options: Dict[str, object] | None = None) → DataArray | Dataset

Reduce data dimensionality in an xarray-friendly way.

Parameters:
  • da – Input DataArray.

  • method – Reduction method (e.g. "pca", "gpfa", "dpca", "coding_direction", "logistic", "lda", "rrr").

  • dim – Dimension to reduce across for methods that operate on a single axis.

  • n_components – Number of components to keep (PCA, dPCA).

  • stack – Dims to stack before reduction (default: ("trial", "time")).

  • unstack – If True, unstack stacked dims in the output.

  • return_dataset – If True, return a Dataset with projections and weights (and explained variance for PCA). For dPCA and supervised methods, a Dataset is always returned.

  • window – Optional (tmin, tmax) window, or a sequence of such windows, used for fitting supervised methods.

  • window_apply – "fit_only" (default) fits on the window but projects all samples; "fit_and_project" fits and projects only within the window.

  • orthogonalize – How to orthogonalize supervised components: "none", "qr", or "svd".

  • orthogonalize_across – How to orthogonalize across multiple windows/labels: "none", "windows", "labels", or "all".

  • trial_dim – Trial dimension name.

  • time_dim – Time dimension name.

  • trial_average – If True (default), average across trials before dPCA marginalization. If trial_dim is absent, input is treated as already averaged and must include label dims directly (e.g., choice).

  • labels – Coordinate name(s) used for dPCA and supervised methods ("coding_direction", "logistic", "lda"). Must exist in da.coords.

  • targets – Target matrix for reduced-rank regression.

  • rank – Rank for reduced-rank regression.

  • regularization – Regularization strength for supervised methods.

  • cv – Cross-validation folds for supervised methods.

  • gpfa_options – Optional dictionary of GPFA configuration overrides, e.g. {"max_iters": 200, "freq_ll": 5, "fast_mode": True, "gp_param_update_every": 5, "random_state": 0}.
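To make the stack, reduce, and unstack flow concrete, here is a minimal PCA sketch using NumPy's SVD. It assumes a dense (trial, time, unit) array of synthetic data and is only an illustration of the stack=("trial", "time"), dim="unit" case; it does not reproduce the package's implementation or any of the supervised methods.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(20, 15, 8))  # synthetic (trial, time, unit)
n_trial, n_time, n_unit = data.shape
n_components = 5

# Stack trial and time into one sample axis: (trial * time, unit).
X = data.reshape(n_trial * n_time, n_unit)
X = X - X.mean(axis=0)  # center each unit

# PCA via SVD: rows of Vt are unit-space components.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
weights = Vt[:n_components]            # (component, unit)
explained = S**2 / np.sum(S**2)        # explained variance ratio per component

# Project, then unstack back to (trial, time, component).
projections = (X @ weights.T).reshape(n_trial, n_time, n_components)
```

The three arrays correspond to the kinds of quantities the return_dataset description mentions for PCA: projections, weights, and explained variance.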

aind_ephys_utils.ops.restrict(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], *, window: Tuple[float, float], dim: str = 'time', dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Restrict input data to a time window.

Parameters:
  • data – Input DataArray/NumPy array or ragged spikes.

  • window – (tmin, tmax) interval to keep.

  • dim – Time dimension name for dense data.

  • dims – Optional dimension names used when data is a dense NumPy array.

  • coords – Optional coordinate mapping used when constructing a DataArray from dense NumPy input.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" mirrors the input style.

Returns:

Cropped output with data restricted to window in the selected representation.

Return type:

xr.DataArray or object
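The difference between cropping dense and ragged input can be sketched as follows. The variable names are hypothetical, and treating the window as a closed interval is an assumption, since the boundary convention is not stated above.

```python
import numpy as np

tmin, tmax = (0.0, 1.0)

# Dense case: select the time samples whose coordinate falls in the window.
times = np.array([-0.5, 0.0, 0.5, 1.0, 1.5])
values = np.arange(10.0).reshape(2, 5)  # (trial, time)
keep = (times >= tmin) & (times <= tmax)  # assumption: closed interval
dense_cropped = values[:, keep]
times_cropped = times[keep]

# Ragged case: filter each trial's spike times independently.
spikes = [np.array([-0.2, 0.3, 0.9]), np.array([1.2])]
ragged_cropped = [s[(s >= tmin) & (s <= tmax)] for s in spikes]
```

Dense input loses samples along the time axis, while ragged input keeps one entry per trial even when a trial ends up empty.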

aind_ephys_utils.ops.smooth(data: DataArray | ndarray | Sequence[object] | Sequence[Sequence[object]], *, dim: str = 'time', method: str = 'boxcar', sigma: float | None = None, window: float | None = None, boundary: str = 'reflect', dims: Sequence[str] | None = None, coords: Dict[str, object] | None = None, return_type: str = 'auto') → DataArray | object

Smooth a signal along a dimension.

Parameters:
  • data – Input DataArray or NumPy-like data.

  • dim – Dimension to smooth (defaults to time).

  • method – Smoothing method (e.g. "boxcar", "gaussian").

  • sigma – Gaussian sigma in seconds (optional).

  • window – Window size in seconds (optional).

  • boundary – Boundary handling mode.

  • dims – Optional dimension names used when data is a dense NumPy array.

  • coords – Optional coordinate mapping used when constructing a DataArray from dense NumPy input.

  • return_type – Output type policy: "auto", "xarray", or "numpy". "auto" mirrors the input style.

Returns:

Smoothed data in the selected output representation.

Return type:

xr.DataArray or object
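A boxcar smoother with reflected boundaries can be sketched in a few lines of NumPy. The helper boxcar_smooth is hypothetical; converting the window from seconds to an odd number of samples and using np.pad's "reflect" mode are assumptions, and the package's "reflect" boundary may follow a different convention (for example, one that repeats the edge sample).

```python
import numpy as np

def boxcar_smooth(signal, dt, window):
    """Hypothetical helper: moving average over `window` seconds, sampled every `dt`."""
    n = max(1, int(round(window / dt)))
    if n % 2 == 0:
        n += 1  # an odd kernel length keeps the output centered on each sample
    half = n // 2
    # Pad by mirroring around the edge samples so the output keeps its length.
    padded = np.pad(signal, half, mode="reflect")
    kernel = np.ones(n) / n
    return np.convolve(padded, kernel, mode="valid")

signal = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
smoothed = boxcar_smooth(signal, dt=0.01, window=0.03)
```

The single spike of height 3 is spread evenly over three samples, so the area under the signal is preserved.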