aind_ephys_utils.metrics.connectivity module¶
CCG-based connectivity detection: pre-filtering and surrogate testing.
A higher-level workflow built on the cross-correlogram primitives in
aind_ephys_utils.metrics.ccg:
build_shape_prefilter()— Bourgon+ 2010 independent (latency + FWHM) pre-filter that shrinks the multiple-testing burden before surrogates are run.run_surrogates()— derangement surrogate driver returning the per-pair surrogate peak distribution.run_two_stage_mc()— screen-then-refine Monte Carlo significance testing over the pre-filtered pair set.
The surrogate functions drive the numba CCG kernel via
ccg_trial_surrogates, so they require the numba optional extra at
run time (pip install aind-ephys-utils[numba]).
- aind_ephys_utils.metrics.connectivity.build_shape_prefilter(*, n_units: int, cross_pairs: ndarray, peak_lag: ndarray, peak_fwhm: ndarray, fwhm_range_s: tuple[float, float] | None, latency_range_s: tuple[float, float] | None) tuple[ndarray, ndarray, ndarray, ndarray, ndarray]¶
Build the latency + FWHM independent pre-filter for pairs.
Bourgon+ 2010: pre-filtering on covariates that are approximately null-independent of the test statistic (peak bin location for latency; noise-driven FWHM for FWHM) is statistically sound and shrinks the multiple-testing burden. Prominence is not part of this — it correlates with peak height under the null and would inflate FDR if used as a pre-filter; apply it post-hoc instead.
- Parameters:
n_units –
N— used to allocate the(N, N)mask arrays.cross_pairs – Iterable of
(i, j)pair tuples withi < j.peak_lag –
(N, N)peak-lag-in-seconds array.peak_fwhm –
(N, N)peak-FWHM-in-seconds array,inffor un-measured.fwhm_range_s –
(low, high)FWHM bounds in seconds, orNoneto disable.latency_range_s –
(low, high)|lag|bounds in seconds, orNoneto disable.
- Returns:
upper_mask, lat_ok, fwhm_ok, prefilter_mask ((N, N) bool arrays) –
upper_maskmarks the upper triangle of supplied pairs;lat_ok/fwhm_okare per-pair shape masks (all-Truewhen the corresponding range isNone);prefilter_maskis their conjunction.prefilter_pairs ((K, 2) int64 array) – The subset of
cross_pairsthat survived the pre-filter.
- aind_ephys_utils.metrics.connectivity.run_surrogates(trial_segs: TrialSegments, n_trials: int, n_units: int, n_surr: int, pairs: ndarray, rng: Generator, *, bin_size: float, max_lag: float, normalize: str, label: str = '') ndarray¶
Run derangement surrogates, return a
(n_surr, n_pairs)array.Each surrogate re-pairs trials with a derangement (no fixed points), recomputes the CCG for every pair in
pairs, and reduces it to the per-pair peak (max over lag bins). Output rows are inpairsorder and arefloat32(surrogate corrcoef peaks live in roughly[-1, 1], so float32 precision is ample and halves memory).- Parameters:
trial_segs – Trial-clipped spike segments (see
clip_spikes_to_trials).n_trials – Number of trials (the derangement length).
n_units – Number of units; kept for signature symmetry — the output shape is per-pair, so callers project to
(N, N)themselves.n_surr – Number of surrogate draws.
pairs –
(n_pairs, 2)unit-index pairs to evaluate.rng – NumPy random generator.
bin_size – CCG kernel parameters.
max_lag – CCG kernel parameters.
normalize – CCG kernel parameters.
label – Prefix for progress log messages.
- Returns:
(n_surr, n_pairs)float32 surrogate peak values.- Return type:
np.ndarray
- aind_ephys_utils.metrics.connectivity.run_two_stage_mc(*, trial_segs: TrialSegments, n_trials: int, n_units: int, raw_peak_val: ndarray, prefilter_pairs: ndarray, prefilter_mask: ndarray, n_screen: int, n_total: int, bin_size: float, max_lag: float, normalize: str, rng: Generator, fdr_alpha: float) tuple[ndarray, ndarray, int, int, ndarray]¶
Two-stage Monte Carlo significance testing on pre-filtered pairs.
Stage 1 runs
n_screenderangement surrogates over every pre-filtered pair, building coarse p-values. Pairs at the1/(n_screen + 1)floor become candidates for stage 2, which runsmax(n_total - n_screen, 0)additional surrogates so candidate p-values are computed againstn_totalexchangeable surrogates in total (stage-1 and stage-2 draws come from the same null and are concatenated; nothing is wasted). The achievable minimum p-value for a candidate is1/(n_total + 1).Pairs outside the pre-filter never have MC computed and get
p = 1, so a Benjamini-Hochberg pass downstream cannot declare them significant.- Parameters:
trial_segs – Trial-clipped spike segments.
n_trials – Trial and unit counts.
n_units – Trial and unit counts.
raw_peak_val –
(N, N)observed peak heights (the test statistic).prefilter_pairs – Output of
build_shape_prefilter()— the pair list to test.prefilter_mask –
(N, N)bool mask matchingprefilter_pairs(accepted for API symmetry; the function operates onprefilter_pairs).n_screen – Stage-1 surrogate count.
n_total – Target total surrogate count for refined candidates (stage 1 + stage 2). Stage 2 runs
max(n_total - n_screen, 0)extra surrogates; ifn_total <= n_screenthe refinement is a no-op.bin_size – CCG kernel parameters.
max_lag – CCG kernel parameters.
normalize – CCG kernel parameters.
rng – CCG kernel parameters.
fdr_alpha – Used for the surrogate threshold percentile returned for plotting.
- Returns:
p_values ((N, N) array) – Final p-values; held at 1.0 outside the pre-filter.
surr_threshold ((N, N) array) – Per-pair
(1 - fdr_alpha)percentile of surrogate peaks.n_candidates (int) – Number of pairs that hit the stage-1 floor (= stage-2 size).
n_surr_total (int) –
n_screenif there were no candidates / no extra surrogates, elsen_total.candidate_pairs ((n_candidates, 2) int64 array) – The pairs sent through stage 2 (empty when none).