API Reference
oobss
oobss public API.
__all__
module-attribute
__all__ = ['AuxIVA', 'ILRMA', 'OnlineAuxIVA', 'OnlineILRMA', 'OnlineISNMF', 'BatchRequest', 'StreamRequest', 'OnlineFrameRequest', 'SeparationOutput', 'SeparatorState', 'StreamingSeparatorState', 'load_yaml', 'save_yaml', 'JsonlLogger', 'log_steps_jsonl']
AuxIVA
Bases: BaseIterativeSeparator
Base class for auxiliary-function-based independent vector analysis.
Procedure
input: {x_{f,t}}_{f=1..F, t=1..T}, iterations I
initialize W_f = I_M
for i = 1..I:
    y_{f,t} <- W_f x_{f,t}
    compute r_{k,t} and phi(r_{k,t}) (Gauss or Laplace)
    V_{k,f} <- (1/T) * sum_t phi(r_{k,t}) x_{f,t} x_{f,t}^H
    update w_{k,f} by IP1/IP2 for each k
return y_{f,t}
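The procedure above can be sketched in plain NumPy. This is an illustrative sketch, not the oobss implementation: the function name `auxiva_ip1_sketch` is hypothetical, the Laplace weight is assumed to be `phi(r) = 1 / r`, and the demixing rows are updated one source at a time with the output refreshed before each update.

```python
import numpy as np


def auxiva_ip1_sketch(X, n_iter=10, eps=1e-12):
    """Illustrative AuxIVA with a Laplace source model and IP1 updates.

    X: complex array of shape (n_frame, n_freq, n_mic).
    Returns (Y, W): Y has the shape of X, W has shape (n_freq, n_mic, n_mic).
    """
    T, F, M = X.shape
    # Row k of W[f] holds w_{k,f}^H; start from the identity.
    W = np.tile(np.eye(M, dtype=complex), (F, 1, 1))
    e = np.eye(M, dtype=complex)
    for _ in range(n_iter):
        for k in range(M):
            # y_{f,t} <- W_f x_{f,t}
            Y = np.einsum("fkm,tfm->tfk", W, X)
            # r_{k,t}: norm over frequency; assumed Laplace weight phi(r) = 1 / r.
            r = np.sqrt(np.sum(np.abs(Y[:, :, k]) ** 2, axis=1))
            phi = 1.0 / np.maximum(r, eps)
            # V_{k,f} <- (1/T) * sum_t phi(r_{k,t}) x_{f,t} x_{f,t}^H
            V = np.einsum("t,tfm,tfn->fmn", phi, X, X.conj()) / T
            # IP1: w <- (W_f V)^{-1} e_k, then normalise so that w^H V w = 1.
            rhs = np.broadcast_to(e[:, k], (F, M))[..., None]
            w = np.linalg.solve(W @ V, rhs)[..., 0]
            scale = np.einsum("fm,fmn,fn->f", w.conj(), V, w).real
            w = w / np.sqrt(np.maximum(scale, eps))[:, None]
            W[:, k, :] = w.conj()
    Y = np.einsum("fkm,tfm->tfk", W, X)
    return Y, W
```

The input layout matches the frame-first `(n_frame, n_freq, n_mic)` convention used by the examples in this section.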
Update Equations
Indices are \(k=1,\dots,K\) (source), \(m=1,\dots,M\) (channel), \(f=1,\dots,F\) (frequency), and \(t=1,\dots,T\) (time frame). In the determined case, \(K=M\).
Notation:
The weighted covariance for source \(k\) is:
IP1 update for the \(k\)-th demixing row:
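For reference, a standard form of the weighted covariance and the IP1 update from the AuxIVA literature [1], written to match the procedure listing, is sketched below. The convention \(y_{k,f,t} = \mathbf{w}_{k,f}^{\mathsf{H}} \mathbf{x}_{f,t}\), the unit vector \(\mathbf{e}_k\) (the \(k\)-th column of the identity), and the exact normalization are assumptions here rather than statements about the oobss source.

```latex
\begin{aligned}
V_{k,f} &= \frac{1}{T} \sum_{t=1}^{T} \varphi(r_{k,t})\, \mathbf{x}_{f,t} \mathbf{x}_{f,t}^{\mathsf{H}}, \\
\mathbf{w}_{k,f} &\leftarrow \left( W_f V_{k,f} \right)^{-1} \mathbf{e}_k, \\
\mathbf{w}_{k,f} &\leftarrow \frac{\mathbf{w}_{k,f}}{\sqrt{\mathbf{w}_{k,f}^{\mathsf{H}} V_{k,f}\, \mathbf{w}_{k,f}}}.
\end{aligned}
```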
Common projection-back post-processing
AuxIVA/ILRMA and their online variants use the same projection-back rule. For a reference microphone \(m_{\mathrm{ref}}\):
The shared implementation lives in oobss.separators.utils.projection_back.
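One common closed form of projection back scales each separated source by the corresponding entry of the inverse demixing matrix at the reference microphone. The sketch below illustrates that form only; `projection_back_sketch` is a hypothetical name, and the signature and behaviour of the library function may differ.

```python
import numpy as np


def projection_back_sketch(Y, W, ref_mic=0):
    """Rescale separated spectra to their images at one reference microphone.

    Y: (n_frame, n_freq, n_src) separated spectra.
    W: (n_freq, n_src, n_mic) demixing matrices, with y = W x per frequency.
    """
    A = np.linalg.inv(W)        # estimated mixing matrices, (n_freq, n_mic, n_src)
    scale = A[:, ref_mic, :]    # (n_freq, n_src): row m_ref of W^{-1}
    return Y * scale[None, :, :]
```

A useful sanity check for this form is that the rescaled sources sum to the reference-microphone observation, since the sum over \(k\) of \([W_f^{-1}]_{m_{\mathrm{ref}},k}\, y_{k,f,t}\) equals \([W_f^{-1} W_f \mathbf{x}_{f,t}]_{m_{\mathrm{ref}}}\).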
Attributes:
| Name | Type | Description |
|---|---|---|
| observations | ndarray of shape (n_frame, n_freq, n_src) | |
| spatial | SpatialUpdateStrategy | Demixing-matrix strategy (e.g., IP1/IP2). |
| source | SourceModelStrategy | Source-model strategy (e.g., Gauss/Laplace). |
| covariance | CovarianceUpdateStrategy | Weighted covariance strategy. |
| estimated | ndarray of shape (n_frame, n_freq, n_src) | |
| source_model | ndarray of shape (n_frame, n_freq, n_src) | |
| demix_filter | ndarray of shape (n_freq, n_src, n_src) | |
| loss | list[float] | |
Examples:
Basic TF-domain usage:
import numpy as np
from scipy.signal import ShortTimeFFT, get_window
from oobss import AuxIVA
fs = 16000
fft_size = 2048
hop_size = 512
win = get_window("hann", fft_size, fftbins=True)
stft = ShortTimeFFT(win=win, hop=hop_size, fs=fs)
# mixture_time: (n_samples, n_mic)
mixture_time = np.random.randn(fs * 2, 2)
# channel-first STFT: (n_mic, n_freq, n_frame)
X_cft = stft.stft(mixture_time.T)
# AuxIVA input must be frame-first: (n_frame, n_freq, n_mic)
X_tfm = X_cft.transpose(2, 1, 0)
model = AuxIVA(X_tfm)
out = model.fit_transform_tf(X_tfm, n_iter=30)
Y_tfm = out.estimate_tf
if Y_tfm is None:
    raise ValueError("AuxIVA did not return TF estimates.")
# Reconstruct separated waveforms: (n_src, n_samples)
y_time = np.real(stft.istft(Y_tfm.transpose(2, 1, 0)))
Strategy plug-and-play (fix source model, swap spatial update):
from oobss.separators.strategies import (
    BatchCovarianceStrategy,
    GaussSourceStrategy,
    IP2SpatialStrategy,
)
model = AuxIVA(
    X_tfm,
    source=GaussSourceStrategy(),  # fixed source model
    covariance=BatchCovarianceStrategy(),  # fixed covariance update
    spatial=IP2SpatialStrategy(),  # swapped demixing update
)
model.run(20)
Y_tfm = model.get_estimate()
References
[1] N. Ono, "Stable and fast update rules for independent vector analysis based on auxiliary function technique," in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 189-192, Oct. 2011, doi: 10.1109/ASPAA.2011.6082320.
Source code in src/oobss/separators/auxiva.py
__call__
__call__(*args, **kwargs)
Alias for forward, providing a torch-like call style.
Source code in src/oobss/separators/core/base.py
__init__
__init__(observations, *, spatial=None, source=None, covariance=None, reconstruction_strategy=None)
Initialize parameters in AuxIVA.
Source code in src/oobss/separators/auxiva.py
bind_mixture_tf
bind_mixture_tf(mixture_tf)
Bind a TF-domain mixture and reset internal iterative state.
Source code in src/oobss/separators/auxiva.py
bind_mixture_time
bind_mixture_time(mixture_time, sample_rate)
Bind time-domain input before iterative updates.
Source code in src/oobss/separators/core/base.py
calc_loss
calc_loss()
Calculate loss function value.
Source code in src/oobss/separators/auxiva.py
calc_source_model
calc_source_model()
Calculate source model.
Returns:
| Type | Description |
|---|---|
| ndarray of shape (n_frame, n_freq, n_src) | |
Source code in src/oobss/separators/auxiva.py
fit_transform_tf
fit_transform_tf(mixture_tf, *, n_iter=0, request=None)
Bind TF input, run iterations, and return TF-domain estimate.
Source code in src/oobss/separators/core/base.py
fit_transform_time
fit_transform_time(mixture_time, *, n_iter=0, request=None)
Bind time-domain input if supported, then run iterations.
Source code in src/oobss/separators/core/base.py
forward
forward(mixture, *, n_iter=0, request=None, is_time_input=None)
Run batch separation from TF-domain or time-domain input.
Source code in src/oobss/separators/core/base.py
get_estimate
get_estimate()
Return current TF-domain estimate.
Source code in src/oobss/separators/auxiva.py
init_demix
init_demix()
Initialize demixing matrix.
Source code in src/oobss/separators/auxiva.py
reset
reset()
Reset internal state (override in subclasses when needed).
Source code in src/oobss/separators/core/base.py
run
run(n_iter)
Execute n_iter update steps and return final estimate.
Source code in src/oobss/separators/core/base.py
step
step()
Update parameters by one step.
Source code in src/oobss/separators/auxiva.py
BatchRequest
dataclass
Execution options for batch separators.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reference_mic | int | Reference microphone index for scale restoration/evaluation. | 0 |
| sample_rate | int \| None | Sampling rate in Hz when time-domain input is supplied. | None |
| metadata | dict[str, Any] | Additional method-specific options. | dict() |
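The `dict()` default in the table reads most naturally as a per-instance empty dictionary. The stand-in below mirrors only the three documented fields to show that behaviour; `BatchRequestSketch` is a hypothetical class written for illustration and is not the library's `BatchRequest`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class BatchRequestSketch:
    """Hypothetical stand-in mirroring the documented BatchRequest fields."""

    reference_mic: int = 0
    sample_rate: Optional[int] = None
    # default_factory gives every instance its own dictionary.
    metadata: Dict[str, Any] = field(default_factory=dict)


request = BatchRequestSketch(sample_rate=16000)
other = BatchRequestSketch()
other.metadata["note"] = "kept separate per instance"
```

With the real class, such an object would be passed through the `request` keyword shown in the `fit_transform_tf`, `fit_transform_time`, and `forward` signatures.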
Source code in src/oobss/separators/core/io_models.py
ILRMA
Bases: BaseIterativeSeparator
Base class for independent low-rank matrix analysis.
Procedure
input: {x_{f,t}}_{f=1..F, t=1..T}, NMF rank L, iterations I
initialize W_f = I_M, b_{k,f,\ell}, c_{k,\ell,t}
for i = 1..I:
    y_{f,t} <- W_f x_{f,t}
    for each source k:
        r_{k,f,t} <- sum_\ell b_{k,f,\ell} c_{k,\ell,t}
        update b_{k,f,\ell}, c_{k,\ell,t} by MU using |y_{k,f,t}|^2
        V_{k,f} <- (1/T) * sum_t x_{f,t} x_{f,t}^H / r_{k,f,t}
        update w_{k,f} by IP1/IP2
return y_{f,t}
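The MU step of the procedure can be sketched for a single source as an Itakura-Saito NMF update with square-root exponents, the form used in the ILRMA paper [1]. The helper names and the per-source layout `B: (n_freq, n_basis)`, `C: (n_basis, n_frame)` are assumptions for this sketch and do not reflect the array layout used inside oobss.

```python
import numpy as np


def is_nmf_mu_step(P, B, C, eps=1e-12):
    """One MU step for power spectrogram P (F, T), basis B (F, L), activation C (L, T)."""
    R = B @ C + eps  # source variance model r_{f,t} = sum_l b_{f,l} c_{l,t}
    B = B * np.sqrt(((P / R**2) @ C.T) / ((1.0 / R) @ C.T + eps))
    R = B @ C + eps  # refresh the model before updating the activations
    C = C * np.sqrt((B.T @ (P / R**2)) / (B.T @ (1.0 / R) + eps))
    return np.maximum(B, eps), np.maximum(C, eps)


def is_divergence(P, R):
    """Itakura-Saito divergence between P and the model R."""
    ratio = P / R
    return float(np.sum(ratio - np.log(ratio) - 1.0))


rng = np.random.default_rng(0)
P = rng.random((6, 30)) + 0.1  # stand-in for |y_{k,f,t}|^2 of one source
B = rng.random((6, 3)) + 0.1
C = rng.random((3, 30)) + 0.1
costs = [is_divergence(P, B @ C)]
for _ in range(20):
    B, C = is_nmf_mu_step(P, B, C)
    costs.append(is_divergence(P, B @ C))
```

Tracking `costs` is a cheap way to confirm that the factor updates do not increase the divergence from one iteration to the next.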
Update Equations
Indices are \(k=1,\dots,K\) (source), \(m=1,\dots,M\) (channel), \(f=1,\dots,F\) (frequency), \(t=1,\dots,T\) (time frame), and \(\ell=1,\dots,L\) (NMF basis index). In the determined case, \(K=M\).
Notation:
Source variance model with NMF basis entries \(b_{k,f,\ell}\) and activation entries \(c_{k,\ell,t}\):
For each source \(k\), the multiplicative updates are:
Weighted covariance:
IP1 update for source \(k\):
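For reference, the standard ILRMA updates from [1], written in the notation of the procedure listing, are sketched below. The convention \(y_{k,f,t} = \mathbf{w}_{k,f}^{\mathsf{H}} \mathbf{x}_{f,t}\), the unit vector \(\mathbf{e}_k\), and the update ordering are assumptions here; the oobss source may differ in details such as flooring constants.

```latex
\begin{aligned}
r_{k,f,t} &= \sum_{\ell=1}^{L} b_{k,f,\ell}\, c_{k,\ell,t}, \\
b_{k,f,\ell} &\leftarrow b_{k,f,\ell}
  \sqrt{\frac{\sum_{t} |y_{k,f,t}|^{2}\, r_{k,f,t}^{-2}\, c_{k,\ell,t}}
             {\sum_{t} r_{k,f,t}^{-1}\, c_{k,\ell,t}}}, \\
c_{k,\ell,t} &\leftarrow c_{k,\ell,t}
  \sqrt{\frac{\sum_{f} |y_{k,f,t}|^{2}\, r_{k,f,t}^{-2}\, b_{k,f,\ell}}
             {\sum_{f} r_{k,f,t}^{-1}\, b_{k,f,\ell}}}, \\
V_{k,f} &= \frac{1}{T} \sum_{t=1}^{T} \frac{1}{r_{k,f,t}}\, \mathbf{x}_{f,t} \mathbf{x}_{f,t}^{\mathsf{H}}, \\
\mathbf{w}_{k,f} &\leftarrow \left( W_f V_{k,f} \right)^{-1} \mathbf{e}_k, \qquad
\mathbf{w}_{k,f} \leftarrow \frac{\mathbf{w}_{k,f}}{\sqrt{\mathbf{w}_{k,f}^{\mathsf{H}} V_{k,f}\, \mathbf{w}_{k,f}}}.
\end{aligned}
```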
Common projection-back post-processing
AuxIVA/ILRMA and their online variants use the same projection-back rule. For a reference microphone \(m_{\mathrm{ref}}\):
The shared implementation lives in oobss.separators.utils.projection_back.
Attributes:
| Name | Type | Description |
|---|---|---|
| observations | ndarray of shape (n_frame, n_freq, n_src) | |
| spatial | SpatialUpdateStrategy | Demixing-matrix strategy (e.g., IP1/IP2). |
| source | SourceModelStrategy | Source-model strategy (typically ILRMA NMF MU). |
| covariance | CovarianceUpdateStrategy | Weighted covariance strategy. |
| estimated | ndarray of shape (n_frame, n_freq, n_src) | |
| source_model | ndarray of shape (n_frame, n_freq, n_src) | |
| demix_filter | ndarray of shape (n_freq, n_src, n_src) | |
| loss | list[float] | |
Examples:
Basic TF-domain usage:
import numpy as np
from scipy.signal import ShortTimeFFT, get_window
from oobss import ILRMA
fs = 16000
fft_size = 2048
hop_size = 512
win = get_window("hann", fft_size, fftbins=True)
stft = ShortTimeFFT(win=win, hop=hop_size, fs=fs)
mixture_time = np.random.randn(fs * 2, 2) # (n_samples, n_mic)
X_tfm = stft.stft(mixture_time.T).transpose(2, 1, 0) # (T, F, M)
model = ILRMA(X_tfm, n_basis=8, random_state=0)
out = model.fit_transform_tf(X_tfm, n_iter=50)
Y_tfm = out.estimate_tf
if Y_tfm is None:
    raise ValueError("ILRMA did not return TF estimates.")
y_time = np.real(stft.istft(Y_tfm.transpose(2, 1, 0)))
Warm-start NMF factors and demixing:
# Initial factors: basis0=(n_src, n_freq, n_basis),
# activ0=(n_src, n_frame, n_basis)
basis0 = np.abs(np.random.randn(2, X_tfm.shape[1], 8)) + 1e-6
activ0 = np.abs(np.random.randn(2, X_tfm.shape[0], 8)) + 1e-6
model = ILRMA(
    X_tfm,
    n_basis=8,
    basis0=basis0,
    activ0=activ0,
    random_state=0,
)
model.run(30)
Y_tfm = model.get_estimate()
References
[1] D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, "Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization," IEEE/ACM Trans. Audio, Speech, and Language Processing, vol. 24, no. 9, pp. 1622-1637, Sep. 2016, doi: 10.1109/TASLP.2016.2577880.
Source code in src/oobss/separators/ilrma.py
__call__
__call__(*args, **kwargs)
Alias for forward, providing a torch-like call style.
Source code in src/oobss/separators/core/base.py
__init__
__init__(observations, *, n_basis=10, basis0=None, activ0=None, random_state=None, rng=None, spatial=None, source=None, covariance=None, reconstruction_strategy=None)
Initialize parameters in ILRMA.
Source code in src/oobss/separators/ilrma.py
bind_mixture_tf
bind_mixture_tf(mixture_tf)
Bind a TF-domain mixture and reset internal iterative state.
Source code in src/oobss/separators/ilrma.py
bind_mixture_time
bind_mixture_time(mixture_time, sample_rate)
Bind time-domain input before iterative updates.
Source code in src/oobss/separators/core/base.py
calc_loss
calc_loss(axis=None)
Calculate loss function value of ILRMA.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| axis | int or None | | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
Source code in src/oobss/separators/ilrma.py
calc_source_model
calc_source_model(B, A, y_power)
Calculate source model. By overriding this method, other source models (e.g., Student's t as in ILRMA-T, generalized Kullback-Leibler divergence, or IDLMA) can be applied.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| B | ndarray of shape (n_freq, n_basis) | Basis matrix | required |
| A | ndarray of shape (n_frame, n_basis) | Activation matrix | required |
| y_power | ndarray of shape (n_frame, n_freq) | Power spectrograms of estimated source | required |
Returns:
| Type | Description |
|---|---|
| tuple[ndarray, ndarray] | Updated |
Source code in src/oobss/separators/ilrma.py
fit_transform_tf
fit_transform_tf(mixture_tf, *, n_iter=0, request=None)
Bind TF input, run iterations, and return TF-domain estimate.
Source code in src/oobss/separators/core/base.py
fit_transform_time
fit_transform_time(mixture_time, *, n_iter=0, request=None)
Bind time-domain input if supported, then run iterations.
Source code in src/oobss/separators/core/base.py
forward
forward(mixture, *, n_iter=0, request=None, is_time_input=None)
Run batch separation from TF-domain or time-domain input.
Source code in src/oobss/separators/core/base.py
get_estimate
get_estimate()
Return current TF-domain estimate.
Source code in src/oobss/separators/ilrma.py
init_activ
init_activ()
Initialize activation matrix.
Source code in src/oobss/separators/ilrma.py
init_basis
init_basis()
Initialize basis matrix.
Source code in src/oobss/separators/ilrma.py
init_demix
init_demix()
Initialize demixing matrix.
Source code in src/oobss/separators/ilrma.py
init_source_model
init_source_model()
Initialize source variance model R with shape (T, F, N).
Source code in src/oobss/separators/ilrma.py
reset
reset()
Reset internal state (override in subclasses when needed).
Source code in src/oobss/separators/core/base.py
run
run(n_iter)
Execute n_iter update steps and return final estimate.
Source code in src/oobss/separators/core/base.py
step
step()
Update parameters by one step.
Source code in src/oobss/separators/ilrma.py
JsonlLogger
Append JSON-serializable records to a JSON Lines file.
Source code in src/oobss/logging_utils.py
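The JSON Lines convention stores one JSON document per line, so appending a record is a single write. The sketch below shows that convention with the standard library only; `append_jsonl` and `read_jsonl` are hypothetical helpers and do not reproduce the `JsonlLogger` interface, which is not shown on this page.

```python
import json
import tempfile
from pathlib import Path


def append_jsonl(path, record):
    """Append one JSON-serializable record as a single line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Read every record back, one JSON document per line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


log_path = Path(tempfile.mkdtemp()) / "steps.jsonl"
append_jsonl(log_path, {"step": 0, "loss": 1.25})
append_jsonl(log_path, {"step": 1, "loss": 0.75})
records = read_jsonl(log_path)
```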
OnlineAuxIVA
Bases: BaseStreamingSeparator
Online auxiliary-function-based independent vector analysis.
Procedure
input: frame sequence {x_{f,t}}_{f=1..F, t=1..T}
initialize W_{f,0} = I_M and V_{k,f,0}
for each frame t:
    repeat inner_iter times:
        y_{f,t} <- W_{f,t} x_{f,t}
        compute r_{k,t} and phi(r_{k,t}) (Gauss)
        V_{k,f,t} <- (1-alpha) * phi(r_{k,t}) x_{f,t} x_{f,t}^H
                     + alpha * V_{k,f,t-1}
        update w_{k,f,t} by IP1
    projection-back by reference microphone
    emit separated frame
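One frame of the procedure can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the oobss code: the function name is hypothetical, the Gauss weight is taken as the inverse of the frame's mean power over frequency, the array layouts are chosen for the sketch, and the projection-back step is left out.

```python
import numpy as np


def online_auxiva_frame_sketch(x, W, V, forget=0.9, inner_iter=2, eps=1e-12):
    """Process one frame (illustrative).

    x: (n_freq, n_mic) complex frame.
    W: (n_freq, n_mic, n_mic) demixing matrices; row k holds w_k^H.
    V: (n_mic, n_freq, n_mic, n_mic) covariances from the previous frame.
    Returns (y, W_new, V_new) with y of shape (n_freq, n_mic).
    """
    F, M = x.shape
    V_prev = V
    W = W.copy()  # leave the caller's state untouched
    xx = np.einsum("fm,fn->fmn", x, x.conj())  # x x^H per frequency
    e = np.eye(M, dtype=complex)
    for _ in range(inner_iter):
        V = np.empty_like(V_prev)
        for k in range(M):
            y = np.einsum("fkm,fm->fk", W, x)
            # Assumed Gauss weight: phi = 1 / (mean power of source k in this frame).
            phi = 1.0 / max(float(np.mean(np.abs(y[:, k]) ** 2)), eps)
            # V_t <- (1 - alpha) * phi * x x^H + alpha * V_{t-1}
            V[k] = (1.0 - forget) * phi * xx + forget * V_prev[k]
            # IP1 update of row k followed by normalisation.
            rhs = np.broadcast_to(e[:, k], (F, M))[..., None]
            w = np.linalg.solve(W @ V[k], rhs)[..., 0]
            scale = np.einsum("fm,fmn,fn->f", w.conj(), V[k], w).real
            W[:, k, :] = (w / np.sqrt(np.maximum(scale, eps))[:, None]).conj()
    y = np.einsum("fkm,fm->fk", W, x)
    return y, W, V
```

Every inner iteration rebuilds the frame's covariance from the previous frame's value, so repeating the inner loop refines the estimate for frame t without compounding the forgetting factor.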
Update Equations
Indices are \(k=1,\dots,K\) (source), \(m=1,\dots,M\) (channel), \(f=1,\dots,F\) (frequency), and \(t=1,\dots,T\) (time frame). In the determined case, \(K=M\).
Per-frame demixing:
The default source model is AuxIVA Gauss:
Online covariance recursion:
Demixing row update (IP1):
Common projection-back post-processing:
This is shared with batch AuxIVA/ILRMA via oobss.separators.utils.projection_back.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n_mic | int | Number of microphones / separated sources. | required |
| n_freq | int | Number of frequency bins. | required |
| ref_mic | int | Reference microphone for projection-back reconstruction. | 0 |
| forget | float | Forgetting factor in covariance smoothing. | 0.9 |
| inner_iter | int | Number of per-frame inner updates. | 30 |
| eps | float | Numerical stability constant. | 1e-12 |
| cov_scale | float | Initial diagonal covariance scale. | 1e-06 |
| spatial | SpatialUpdateStrategy \| None | Strategy used to update demixing filters. | None |
| source | SourceModelStrategy \| None | Strategy used to compute source model per frame. | None |
| covariance | CovarianceUpdateStrategy \| None | Strategy used to update weighted covariance matrices. | None |
| reconstruction_strategy | ReconstructionStrategy \| None | Strategy used to reconstruct output spectra. | None |
Examples:
Process a full stream at once:
import numpy as np
from scipy.signal import ShortTimeFFT, get_window
from oobss import OnlineAuxIVA, StreamRequest
fs = 16000
fft_size = 2048
hop_size = 512
win = get_window("hann", fft_size, fftbins=True)
stft = ShortTimeFFT(win=win, hop=hop_size, fs=fs)
mixture_time = np.random.randn(fs * 2, 2) # (n_samples, n_mic)
# channel-first STFT: (n_mic, n_freq, n_frame)
X_cft = stft.stft(mixture_time.T)
# online input: (n_freq, n_mic, n_frame)
X_fmt = X_cft.transpose(1, 0, 2)
model = OnlineAuxIVA(
    n_mic=2,
    n_freq=X_fmt.shape[0],
    ref_mic=0,
    forget=0.99,
    inner_iter=5,
)
out = model.process_stream_tf(
    X_fmt,
    request=StreamRequest(frame_axis=2, reference_mic=0),
)
Y_fmt = out.estimate_tf
if Y_fmt is None:
    raise ValueError("OnlineAuxIVA did not return TF estimates.")
# inverse STFT expects channel-first axes
y_time = np.real(stft.istft(Y_fmt, f_axis=0, t_axis=2)).T
Frame-by-frame update with explicit state carry:
from oobss import StreamingSeparatorState
state: StreamingSeparatorState | None = None
outputs = []
for t in range(X_fmt.shape[2]):
    frame = X_fmt[:, :, t]  # (n_freq, n_mic)
    y_frame, state = model.forward_streaming(frame, state=state)
    outputs.append(y_frame)
Y_fmt = np.stack(outputs, axis=2)
References
[1] T. Taniguchi, N. Ono, A. Kawamura, and S. Sagayama, "An auxiliary-function approach to online independent vector analysis for real-time blind source separation," in Proc. Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), pp. 107-111, May 2014, doi: 10.1109/HSCMA.2014.6843261.
Source code in src/oobss/separators/online_auxiva.py
__call__
__call__(*args, **kwargs)
Alias for forward, providing a torch-like call style.
Source code in src/oobss/separators/core/base.py
fit
fit(spectrogram)
Process a full spectrogram with shape (n_freq, n_frames, n_mic).
Source code in src/oobss/separators/online_auxiva.py
forward
forward(stream_tf, *, request=None)
Torch-like forward alias for full streaming input.
Source code in src/oobss/separators/core/base.py
forward_streaming
forward_streaming(frame, *, state=None, request=None)
Process one frame and return (separated_frame, updated_state).
Source code in src/oobss/separators/core/base.py
get_state
get_state()
Return a typed snapshot of the current online state.
Source code in src/oobss/separators/online_auxiva.py
partial_fit
partial_fit(x, *, reference_mic=None)
Update model with one frame and return separated spectra.
Source code in src/oobss/separators/online_auxiva.py
process_frame
process_frame(frame, request=None)
Process one TF frame and return separated frame.
Source code in src/oobss/separators/online_auxiva.py
process_stream
process_stream(stream, *, frame_axis=-1, request=None)
Process all frames in stream and stack outputs on the last axis.
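The frame-axis convention described here can be pictured with a generic helper: move the frame axis to the front, process each slice, and stack the results on the last axis. The function below is a hypothetical sketch of that pattern, not the library's `process_stream`.

```python
import numpy as np


def process_stream_sketch(stream, process_frame, frame_axis=-1):
    """Apply process_frame to each slice along frame_axis; stack results on the last axis."""
    frames = np.moveaxis(stream, frame_axis, 0)  # (n_frame, ...) view
    outputs = [process_frame(frame) for frame in frames]
    return np.stack(outputs, axis=-1)
```

Because outputs are always stacked last, an input with frames on axis 0 comes back with frames on the final axis.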
Source code in src/oobss/separators/core/base.py
process_stream_tf
process_stream_tf(stream_tf, *, request=None)
Process all frames in stream_tf using a typed stream request.
Source code in src/oobss/separators/core/base.py
reset
reset()
Reset online state to its initial values.
Source code in src/oobss/separators/online_auxiva.py
set_state
set_state(state)
Restore online state from a StreamingSeparatorState.
Source code in src/oobss/separators/online_auxiva.py
OnlineFrameRequest
dataclass
Per-frame runtime options for streaming separators.
Source code in src/oobss/separators/core/strategy_models.py
OnlineILRMA
Bases: BaseStreamingSeparator
Online independent low-rank matrix analysis.
Procedure
input: frame sequence {x_{f,t}}_{f=1..F, t=1..T}
initialize W_{f,0} = I_M, V_{k,f,0}
initialize source-wise NMF parameters b_{k,f,\ell}, c_{k,\ell,t}
for each frame t:
    repeat inner_iter times:
        y_{f,t} <- W_{f,t} x_{f,t}
        for each source k:
            update c_{k,\ell,t}, sufficient stats, and b_{k,f,\ell}
            r_{k,f,t} <- sum_\ell b_{k,f,\ell} c_{k,\ell,t}
            V_{k,f,t} <- (1-alpha) * x_{f,t} x_{f,t}^H / r_{k,f,t}
                         + alpha * V_{k,f,t-1}
            update w_{k,f,t} by IP1
    projection-back by reference microphone
    emit separated frame
Update Equations
Indices are \(k=1,\dots,K\) (source), \(m=1,\dots,M\) (channel), \(f=1,\dots,F\) (frequency), \(t=1,\dots,T\) (time frame), and \(\ell=1,\dots,L\) (NMF basis index). In the determined case, \(K=M\).
Demixing:
For each source \(k\), define:
The online MU update for \(C\) is:
Sufficient statistics and basis update:
Source variance and covariance update:
Demixing and common projection back:
This is shared with batch AuxIVA/ILRMA via oobss.separators.utils.projection_back.
Examples:
Process a stream with online ILRMA:
import numpy as np
from scipy.signal import ShortTimeFFT, get_window
from oobss import OnlineILRMA, StreamRequest
fs = 16000
fft_size = 2048
hop_size = 512
win = get_window("hann", fft_size, fftbins=True)
stft = ShortTimeFFT(win=win, hop=hop_size, fs=fs)
mixture_time = np.random.randn(fs * 2, 2) # (n_samples, n_mic)
X_fmt = stft.stft(mixture_time.T).transpose(1, 0, 2) # (F, M, T)
model = OnlineILRMA(
    n_mic=2,
    n_freq=X_fmt.shape[0],
    n_bases=8,
    ref_mic=0,
    beta=1,
    forget=0.99,
    inner_iter=5,
    random_state=0,
)
out = model.process_stream_tf(
    X_fmt,
    request=StreamRequest(frame_axis=2, reference_mic=0),
)
Y_fmt = out.estimate_tf
if Y_fmt is None:
    raise ValueError("OnlineILRMA did not return TF estimates.")
y_time = np.real(stft.istft(Y_fmt, f_axis=0, t_axis=2)).T
Plug-and-play NMF updater while keeping spatial update fixed:
from oobss.separators.strategies import MultiplicativeNMFStrategy
model = OnlineILRMA(
    n_mic=2,
    n_freq=X_fmt.shape[0],
    n_bases=8,
    nmf=MultiplicativeNMFStrategy(),
)
References
[1] T. Nakashima and N. Ono, "Online independent low-rank matrix analysis as a lightweight and trainable model for real-time multichannel music source separation," in Proc. AAAI 2026 Workshop on Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), Jan. 2026.
Source code in src/oobss/separators/online_ilrma.py
__call__
__call__(*args, **kwargs)
Alias for forward, providing a torch-like call style.
Source code in src/oobss/separators/core/base.py
fit
fit(spectrogram)
Process a full spectrogram sequentially.
Source code in src/oobss/separators/online_ilrma.py
forward
forward(stream_tf, *, request=None)
Torch-like forward alias for full streaming input.
Source code in src/oobss/separators/core/base.py
forward_streaming
forward_streaming(frame, *, state=None, request=None)
Process one frame and return (separated_frame, updated_state).
Source code in src/oobss/separators/core/base.py
get_state
get_state()
Return current separator state snapshot.
Source code in src/oobss/separators/core/base.py
partial_fit
partial_fit(x, *, reference_mic=None)
Update model with one frame and return separated spectra.
Source code in src/oobss/separators/online_ilrma.py
process_frame
process_frame(frame, request=None)
Process one TF frame and return separated frame.
Source code in src/oobss/separators/online_ilrma.py
process_stream
process_stream(stream, *, frame_axis=-1, request=None)
Process all frames in stream and stack outputs on the last axis.
Source code in src/oobss/separators/core/base.py
process_stream_tf
process_stream_tf(stream_tf, *, request=None)
Process all frames in stream_tf using a typed stream request.
Source code in src/oobss/separators/core/base.py
reset
reset()
Reset online state and NMF statistics to initial values.
Source code in src/oobss/separators/online_ilrma.py
set_state
set_state(state)
Restore separator state from a snapshot.
Source code in src/oobss/separators/core/base.py
OnlineISNMF
Bases: BaseStreamingSeparator
Online Itakura-Saito NMF with source-wise ratio-mask reconstruction.
Procedure
input: STFT frame sequence {x_t}_{t=1..T}, x_t in C^F
initialize NMF basis W, statistics A/B
for each frame x_t:
    v_t <- |x_t|^2
    update activation h_t and basis statistics by online MU
    if update timing reached: refresh W from A/B with forgetting
    compute source power p_{s,f} by component-to-source assignment
    compute ratio mask m_{s,f} = p_{s,f} / sum_s p_{s,f}
    emit separated sources y_{s,f} = m_{s,f} x_f
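The last three steps of the procedure (source power aggregation, ratio mask, masking) can be sketched for one frame as below. The function name and the explicit `assignment` array, which maps each NMF component to a source index, are assumptions for illustration; how oobss assigns components to sources is not described on this page.

```python
import numpy as np


def ratio_mask_separate(x, W, h, assignment, n_sources, eps=1e-12):
    """Separate one frame with a ratio mask (illustrative).

    x: (n_freq,) complex frame; W: (n_freq, n_components) nonnegative basis;
    h: (n_components,) nonnegative activations;
    assignment: (n_components,) integer source index per component.
    Returns (y, mask), both of shape (n_sources, n_freq).
    """
    comp_power = W * h[None, :]               # per-component power, (F, K)
    onehot = np.eye(n_sources)[assignment]    # (K, S) component-to-source map
    p = (comp_power @ onehot).T               # p_{s,f}, shape (S, F)
    mask = p / np.maximum(p.sum(axis=0, keepdims=True), eps)
    return mask * x[None, :], mask
```

Since the masks sum to one in every frequency bin, the separated sources add back up to the input frame.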
Update Equations
With \(m = Wh + \varepsilon\), the online NMF updates are:
Basis refresh (every \(\beta\) frames):
Source-wise power aggregation and Wiener-style ratio mask:
Examples:
Separate one-channel STFT stream into two sources:
import numpy as np
from scipy.signal import ShortTimeFFT, get_window
from oobss import OnlineISNMF, StreamRequest
fs = 16000
fft_size = 1024
hop_size = 256
win = get_window("hann", fft_size, fftbins=True)
stft = ShortTimeFFT(win=win, hop=hop_size, fs=fs)
mixture_time = np.random.randn(fs * 2, 1) # mono mixture
X_ft = stft.stft(mixture_time[:, 0]) # (n_freq, n_frame)
model = OnlineISNMF(
    n_components=16,
    n_features=X_ft.shape[0],
    n_sources=2,
    beta=2,
    forget=0.99,
    inner_iter=10,
    random_state=0,
)
out = model.process_stream_tf(
    X_ft,
    request=StreamRequest(frame_axis=1, n_sources=2),
)
Y_sft = out.estimate_tf  # (n_sources, n_freq, n_frame)
if Y_sft is None:
    raise ValueError("OnlineISNMF did not return TF estimates.")
Request masks instead of separated spectra:
out = model.process_stream_tf(
    X_ft,
    request=StreamRequest(frame_axis=1, n_sources=2, return_mask=True),
)
mask = out.estimate_tf  # (n_sources, n_freq, n_frame)
Source code in src/oobss/separators/online_isnmf.py
__call__ ¶
__call__(*args, **kwargs)
Alias for :meth:forward to provide a torch-like call style.
Source code in src/oobss/separators/core/base.py
forward ¶
forward(stream_tf, *, request=None)
Torch-like forward alias for full streaming input.
Source code in src/oobss/separators/core/base.py
forward_streaming ¶
forward_streaming(frame, *, state=None, request=None)
Process one frame and return (separated_frame, updated_state).
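The calling pattern implied by the `(separated_frame, updated_state)` return value can be sketched with a toy stand-in. `ToyStreamingSeparator` below is hypothetical and only mimics the documented signature; its dictionary state and dummy "separation" are assumptions for illustration, not the behavior of any oobss separator.

```python
import numpy as np


class ToyStreamingSeparator:
    """Hypothetical stand-in that mimics the documented call signature."""

    def forward_streaming(self, frame, *, state=None, request=None):
        # The toy state is just a frame counter; a real state is richer.
        count = 0 if state is None else state["frame_index"]
        separated = np.stack([0.5 * frame, 0.5 * frame])  # two dummy "sources"
        return separated, {"frame_index": count + 1}


separator = ToyStreamingSeparator()
stream = np.ones((4, 3))  # (n_freq, n_frame)
state = None
outputs = []
for t in range(stream.shape[1]):
    # Thread the returned state into the next call.
    frame_out, state = separator.forward_streaming(stream[:, t], state=state)
    outputs.append(frame_out)
separated = np.stack(outputs, axis=-1)  # (n_sources, n_freq, n_frame)
```

Passing the state explicitly keeps the per-frame call free of hidden side effects, which makes it straightforward to checkpoint or replay a stream.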
Source code in src/oobss/separators/core/base.py
get_state ¶
get_state()
Return current separator state snapshot.
Source code in src/oobss/separators/core/base.py
partial_fit ¶
partial_fit(v)
Update online NMF model from one power spectrum frame.
Source code in src/oobss/separators/online_isnmf.py
process_frame ¶
process_frame(frame, request=None)
Process one frame using default or explicitly provided settings.
Source code in src/oobss/separators/online_isnmf.py
process_stream ¶
process_stream(stream, *, frame_axis=-1, request=None)
Process all frames in stream and stack outputs on the last axis.
Source code in src/oobss/separators/core/base.py
process_stream_tf ¶
process_stream_tf(stream_tf, *, request=None)
Process all frames in stream_tf using a typed stream request.
Source code in src/oobss/separators/core/base.py
reset ¶
reset()
Reset online NMF parameters and sufficient statistics.
Source code in src/oobss/separators/online_isnmf.py
separate_frame ¶
separate_frame(x, *, n_sources, component_to_source=None)
Separate one complex STFT frame via source-wise ratio masking.
Source code in src/oobss/separators/online_isnmf.py
set_state ¶
set_state(state)
Restore separator state from a snapshot.
Source code in src/oobss/separators/core/base.py
source_power ¶
source_power(h, *, n_sources, component_to_source=None)
Compute source-wise power model from component activation.
Source code in src/oobss/separators/online_isnmf.py
SeparationOutput
dataclass
¶
Unified separation result container.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| estimate_time | ndarray \| None | Optional separated signals in time domain. | None |
| estimate_tf | ndarray \| None | Optional separated signals in TF domain. | None |
| mask | ndarray \| None | Optional source mask values. | None |
| demix_filter | ndarray \| None | Optional demixing filter/matrix. | None |
| permutation | ndarray \| None | Optional source permutation indices. | None |
| state | SeparatorState \| StreamingSeparatorState \| None | Optional internal runtime state snapshot. | None |
| metadata | dict[str, Any] | Free-form method metadata for logging/benchmarking. | dict() |
Source code in src/oobss/separators/core/io_models.py
SeparatorState
dataclass
¶
Mutable runtime state used across iterative or online updates.
Source code in src/oobss/separators/core/io_models.py
StreamRequest
dataclass
¶
Execution options for streaming separators.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| frame_axis | int | Axis in input tensor that corresponds to frame/time index. | -1 |
| reference_mic | int \| None | Optional reference microphone index. | None |
| n_sources | int \| None | Optional source count override for methods that require it. | None |
| component_to_source | ndarray \| None | Optional NMF component-to-source assignment. | None |
| return_mask | bool | If | False |
| metadata | dict[str, Any] | Additional method-specific options. | dict() |
Source code in src/oobss/separators/core/io_models.py
StreamingSeparatorState
dataclass
¶
Typed runtime state for streaming BSS-style separators.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_model | ndarray \| None | Optional source model state snapshot at the current frame. | None |
| demix_filter | ndarray \| None | Optional demixing filter snapshot. | None |
| mix_filter | ndarray \| None | Optional mixing filter snapshot (typically inverse of | None |
| frame_index | int | Number of processed frames associated with this state. | 0 |
| metadata | dict[str, Any] | Additional algorithm-specific state values. | dict() |
Source code in src/oobss/separators/core/io_models.py
load_yaml ¶
load_yaml(path, *, overrides=None)
Load a YAML file into a dictionary, with optional dotlist overrides.
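As a rough illustration of how dotlist overrides can be applied to a loaded configuration, the sketch below merges `key.subkey=value` strings into a nested dictionary. The override syntax, the helper name `apply_dotlist`, and the value parsing via `ast.literal_eval` are assumptions; the actual format accepted by `load_yaml` is not specified here.

```python
import ast
import copy


def apply_dotlist(config, overrides):
    """Return a copy of config with "a.b=value" overrides applied (hypothetical helper)."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        try:
            value = ast.literal_eval(raw_value)  # numbers, lists, True/False
        except (ValueError, SyntaxError):
            value = raw_value  # fall back to a plain string
        node = result
        *parents, leaf = dotted_key.split(".")
        for key in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


base = {"stft": {"fft_size": 2048, "hop_size": 512}, "method": "auxiva"}
merged = apply_dotlist(base, ["stft.hop_size=256", "method=ilrma"])
```

Working on a deep copy leaves the loaded configuration untouched, so the same base file can be reused across several override sets.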
Source code in src/oobss/configs.py
log_steps_jsonl ¶
log_steps_jsonl(path, steps)
Write many step dictionaries to JSONL.
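JSONL stores one JSON object per line, so a list of step dictionaries can be written and read back with the standard library alone. The helper `write_steps_jsonl` and the step keys below are hypothetical and only illustrate the file format.

```python
import json
import tempfile
from pathlib import Path


def write_steps_jsonl(path, steps):
    """Write one JSON object per line (hypothetical helper)."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for step in steps:
            fh.write(json.dumps(step) + "\n")


steps = [{"iter": 0, "loss": 12.5}, {"iter": 1, "loss": 9.75}]
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "steps.jsonl"
    write_steps_jsonl(log_path, steps)
    # Each line parses independently, so logs can be streamed or tailed.
    lines = log_path.read_text(encoding="utf-8").splitlines()
    loaded = [json.loads(line) for line in lines]
```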
Source code in src/oobss/logging_utils.py
save_yaml ¶
save_yaml(path, data)
Write dictionary data to a YAML file.
Source code in src/oobss/configs.py
Benchmark¶
oobss.benchmark ¶
Benchmark orchestration APIs.
__all__
module-attribute
¶
__all__ = ['ExperimentEngine', 'MethodRunnerRegistry', 'default_method_runner_registry', 'validate_builtin_method_params']
ExperimentEngine
dataclass
¶
Orchestrate planning and execution of benchmark experiments.
Source code in src/oobss/benchmark/engine.py
build_tasks ¶
build_tasks(recipe, methods, *, method_grid=None)
Build tasks from recipe and method definitions.
Source code in src/oobss/benchmark/engine.py
run ¶
run(*, recipe, methods, output_root, workers, overwrite, save_framewise, summary_precision, save_audio, method_grid=None, runner_registry=None)
Execute planned experiments and persist outputs.
Source code in src/oobss/benchmark/engine.py
MethodRunnerRegistry
dataclass
¶
Registry that resolves method IDs to runner instances.
Source code in src/oobss/benchmark/methods.py
default_method_runner_registry ¶
default_method_runner_registry()
Create the built-in method runner registry.
Source code in src/oobss/benchmark/methods.py
validate_builtin_method_params ¶
validate_builtin_method_params(method_type, params)
Validate params for built-in method types.
Unknown method types are ignored so external plugin types remain supported.
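The "validate built-ins, ignore everything else" policy can be sketched as below. The schema `BUILTIN_PARAMS`, the parameter names in it, and the name-only check are all assumptions for illustration; what the real validator checks for each built-in type is not stated here.

```python
# Hypothetical schema: allowed parameter names per built-in method type.
BUILTIN_PARAMS = {
    "auxiva": {"n_iter", "ref_mic"},
    "online_isnmf": {"n_components", "beta", "forget"},
}


def validate_params(method_type, params):
    """Raise ValueError on unexpected params for known types only."""
    allowed = BUILTIN_PARAMS.get(method_type)
    if allowed is None:
        return  # unknown (plugin) type: nothing to check
    unknown = sorted(set(params) - allowed)
    if unknown:
        raise ValueError(f"Unknown params for {method_type!r}: {unknown}")


validate_params("auxiva", {"n_iter": 50})      # known type, valid params
validate_params("my_plugin", {"anything": 1})  # unknown type is ignored
try:
    validate_params("auxiva", {"n_itr": 50})   # typo in a known type
    rejected = False
except ValueError:
    rejected = True
```

Returning early for unknown types is what lets externally registered runners define their own parameters without touching the built-in schema.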
Source code in src/oobss/benchmark/methods.py
Dataloaders¶
oobss.dataloaders ¶
Dataset adapter and loader APIs.
__all__
module-attribute
¶
__all__ = ['AdapterFactory', 'BaseDatasetAdapter', 'DatasetLoader', 'TorchrirDynamicDatasetAdapter', 'TrackAudio', 'TrackHandle', 'build_torchrir_dynamic_adapter', 'create_loader', 'loader_registry', 'TorchrirDynamicDataset', 'build_torchrir_dynamic_dataloader']
BaseDatasetAdapter ¶
Bases: ABC
Abstract dataset adapter used by experiment pipelines.
Source code in src/oobss/dataloaders/base.py
discover
abstractmethod
¶
discover(*, include=None, sample_limit=None)
Discover available track handles.
Source code in src/oobss/dataloaders/base.py
load
abstractmethod
¶
load(handle, *, duration_sec=None)
Load one track represented by handle.
Source code in src/oobss/dataloaders/base.py
stem_names
abstractmethod
¶
stem_names()
Return ordered stem labels used in reporting outputs.
Source code in src/oobss/dataloaders/base.py
TorchrirDynamicDataset ¶
Dataset with __len__ / __getitem__ for dynamic torchrir scenes.
Source code in src/oobss/dataloaders/torchrir_dynamic.py
TorchrirDynamicDatasetAdapter
dataclass
¶
Bases: BaseDatasetAdapter
Adapter for dynamic torchrir scene directories.
Source code in src/oobss/dataloaders/base.py
discover ¶
discover(*, include=None, sample_limit=None)
Return discovered dynamic-scene handles sorted by scene ID.
Source code in src/oobss/dataloaders/base.py
load ¶
load(handle, *, duration_sec=None)
Load one dynamic scene and return canonical track audio.
Source code in src/oobss/dataloaders/base.py
stem_names ¶
stem_names()
Return ordered source names from the first discovered scene.
Source code in src/oobss/dataloaders/base.py
TrackAudio
dataclass
¶
Container holding stems, mixture, and metadata for one track.
Attributes:
| Name | Type | Description |
|---|---|---|
| track_id | str | Unique identifier of the track in the dataset. |
| path | Path | Backing track path, if available. |
| stems | ndarray | Time-domain stem signals with shape |
| mix | ndarray | Time-domain mixture with shape |
| sample_rate | int | Sampling rate in Hz. |
Source code in src/oobss/dataloaders/base.py
TrackHandle
dataclass
¶
Opaque reference to a track discovered by a dataset loader.
The payload is loader-specific metadata required to retrieve the actual audio for the track. It must remain pickle-safe because tasks can run in subprocess workers.
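One way to check the pickle-safety requirement is a round trip through `pickle`, as in the sketch below. The helper `is_pickle_safe` and both example payloads (including the scene path) are hypothetical; the point is only that plain data survives the trip while objects such as lambdas do not.

```python
import pickle


def is_pickle_safe(payload):
    """Return True if payload survives a pickle round trip (hypothetical helper)."""
    try:
        pickle.loads(pickle.dumps(payload))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain strings and numbers cross process boundaries without trouble.
good_payload = {"scene_dir": "scenes/scene_0001", "n_sources": 2}
# A lambda cannot be pickled, so it must not be stored in a payload.
bad_payload = {"loader": lambda path: path}

ok = is_pickle_safe(good_payload)
bad = is_pickle_safe(bad_payload)
```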
Source code in src/oobss/dataloaders/base.py
build_torchrir_dynamic_adapter ¶
build_torchrir_dynamic_adapter(dataset_cfg)
Create :class:TorchrirDynamicDatasetAdapter from dataset configuration.
Source code in src/oobss/dataloaders/base.py
build_torchrir_dynamic_dataloader ¶
build_torchrir_dynamic_dataloader(*, root, return_type='torch', include=None, sample_limit=None, duration_sec=None, dtype=None, include_metadata=False, batch_size=1, shuffle=False, num_workers=0, pin_memory=False, drop_last=False, collate_fn=None)
Build a torch.utils.data.DataLoader for dynamic torchrir scenes.
Source code in src/oobss/dataloaders/torchrir_dynamic.py
create_loader ¶
create_loader(dataset_cfg, *, registry_overrides=None)
Instantiate a dataset loader from dataset_cfg.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_cfg | dict[str, Any] | Dataset configuration. Must include | required |
| registry_overrides | Mapping[str, AdapterFactory] \| None | Optional injected adapter factories for tests or external integrations. | None |
Source code in src/oobss/dataloaders/base.py
loader_registry ¶
loader_registry(overrides=None)
Return registry mapping loader type names to factories.
Source code in src/oobss/dataloaders/base.py
Evaluation¶
oobss.evaluation ¶
Evaluation utilities for separation outputs.
__all__
module-attribute
¶
__all__ = ['Framing', 'align_lengths', 'calc_si_sdr_framewise', 'framewise_si_sdr_summary', 'summarize_framewise_si_sdr', 'si_bss_eval', 'MetricsBundle', 'compute_metrics', 'normalize_framewise_metrics']
Framing ¶
Iterator for overlapping window slices along the last axis.
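The idea of iterating over overlapping windows can be sketched with a small generator of slices. The helper `frame_slices` is hypothetical, and dropping a trailing partial window is an assumption; how `Framing` treats the end of the signal is not stated here.

```python
def frame_slices(n_samples, window, hop):
    """Yield slices for full-length overlapping windows (hypothetical helper)."""
    start = 0
    while start + window <= n_samples:
        yield slice(start, start + window)
        start += hop  # hop < window gives overlapping frames


signal = list(range(10))
frames = [signal[s] for s in frame_slices(len(signal), window=4, hop=2)]
```

For a NumPy array `x`, the same slices apply along the last axis as `x[..., s]`.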
Source code in src/oobss/evaluation/framewise.py
MetricsBundle
dataclass
¶
Source code in src/oobss/evaluation/metrics.py
align_lengths ¶
align_lengths(arrays)
Trim all arrays to the shortest shared length along the last axis.
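Trimming to the shortest shared length amounts to one slice per array, as in this minimal sketch. The helper name `trim_to_shortest` and the example shapes are assumptions, not the library's code.

```python
import numpy as np


def trim_to_shortest(arrays):
    """Trim every array to the shortest last-axis length (hypothetical helper)."""
    n = min(a.shape[-1] for a in arrays)
    # The ellipsis keeps all leading axes and slices only the last one.
    return [a[..., :n] for a in arrays]


ref = np.zeros((2, 16000))
est = np.zeros((2, 15990))  # e.g. a few samples lost in STFT/iSTFT
ref_t, est_t = trim_to_shortest([ref, est])
```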
Source code in src/oobss/evaluation/framewise.py
calc_si_sdr_framewise ¶
calc_si_sdr_framewise(ref, est, window, hop, *, scaling=True, compute_permutation=True)
Compute frame-wise SI-SDR with sliding windows.
Source code in src/oobss/evaluation/framewise.py
compute_metrics ¶
compute_metrics(reference, estimate, mixture, sample_rate, *, filter_length, frame_cfg, compute_permutation=True, permutation_strategy=None)
Compute batch and optional frame-wise separation metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reference | ndarray | Reference signals with shape | required |
| estimate | ndarray | Estimated signals with shape | required |
| mixture | ndarray | Mixture baseline with shape | required |
| compute_permutation | bool | Whether to allow permutation solving in | True |
Source code in src/oobss/evaluation/metrics.py
framewise_si_sdr_summary ¶
framewise_si_sdr_summary(ref, est, *, mixture=None, window, hop, scaling=True, compute_permutation=True)
Compute frame-wise SI-SDR and optional mixture baseline.
Source code in src/oobss/evaluation/framewise.py
normalize_framewise_metrics ¶
normalize_framewise_metrics(framewise)
Normalize frame-wise metric keys while keeping backward compatibility.
The framewise SI-SDR utilities return canonical keys such as
mean_si_sdr / mean_si_sdr_imp. Some downstream code expects alias
keys with *_channels suffix. This function guarantees both key styles
are available and fills missing aggregate keys from raw frame matrices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| framewise | dict[str, ndarray] \| None | Framewise metric dictionary, typically produced by :func: | required |
Returns:
| Type | Description |
|---|---|
| dict[str, ndarray] \| None | A normalized dictionary where numeric values are converted to numpy arrays and the following aliases are guaranteed when source keys exist: - |
Source code in src/oobss/evaluation/metrics.py
si_bss_eval ¶
si_bss_eval(reference_signals, estimated_signals, scaling=True)
Compute SI-SDR family metrics and permutation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reference_signals | ndarray | Reference matrix shaped | required |
| estimated_signals | ndarray | Estimated matrix shaped | required |
| scaling | bool | If | True |
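For orientation, the commonly used SI-SDR definition is \(10\log_{10}\left(\|\alpha s\|^2 / \|\alpha s - \hat{s}\|^2\right)\) with \(\alpha = \hat{s}^\top s / \|s\|^2\), where \(s\) is the reference and \(\hat{s}\) the estimate. The NumPy sketch below implements that definition for a single pair of 1-D signals. The helper `si_sdr` and its `eps` guard are assumptions, and the sketch does not reproduce the permutation search or the exact semantics of the `scaling` flag.

```python
import numpy as np


def si_sdr(reference, estimate, eps=1e-12):
    """SI-SDR in dB for 1-D signals, using the standard definition."""
    # Optimal scale alpha projects the estimate onto the reference.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    return 10.0 * np.log10((np.sum(target**2) + eps) / (np.sum(residual**2) + eps))


rng = np.random.default_rng(0)
s = rng.standard_normal(8000)
noise = rng.standard_normal(8000)
value = si_sdr(s, s + 0.1 * noise)
# Rescaling the estimate leaves the metric unchanged (scale invariance).
scaled = si_sdr(s, 3.0 * (s + 0.1 * noise))
```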
Source code in src/oobss/evaluation/si_sdr.py
summarize_framewise_si_sdr ¶
summarize_framewise_si_sdr(ref, est, fs, *, window_sec=5.0, hop_sec=None, mixture=None, scaling=True, compute_permutation=True)
Return frame-wise SI-SDR and aggregate statistics.
Source code in src/oobss/evaluation/framewise.py
Postprocess¶
oobss.postprocess ¶
Post-processing utilities.
__all__
module-attribute
¶
__all__ = ['ParameterEstimationResult', 'PerReferenceSeparationResult', 'gaussian_source_model_weight', 'mixing_matrix_from_demixing_for_reference', 'separate_with_reference']
ParameterEstimationResult
dataclass
¶
Estimated parameters and raw demixed spectra for one method.
Source code in src/oobss/postprocess/separation.py
PerReferenceSeparationResult
dataclass
¶
Per-reference separation outputs.
Source code in src/oobss/postprocess/separation.py
gaussian_source_model_weight ¶
gaussian_source_model_weight(demixed_tfm)
Create Gaussian source-model weights from demixed spectra.
This utility computes frame/source power and broadcasts it across the frequency axis so the output can be used as a simple Gaussian source model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| demixed_tfm | ndarray | Demixed STFT with shape | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Broadcast source model weights with shape |
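The "compute frame/source power, then broadcast across frequency" idea can be sketched as follows. Several details are assumptions made for the example: the `(n_frame, n_freq, n_src)` layout, averaging (rather than summing) over frequency, and the `eps` floor. The helper `gaussian_weight_sketch` is hypothetical and is not the library function.

```python
import numpy as np


def gaussian_weight_sketch(demixed_tfm, eps=1e-10):
    """Frame/source power broadcast over frequency (assumed layout below)."""
    # Assumed layout: (n_frame, n_freq, n_src).
    power = np.mean(np.abs(demixed_tfm) ** 2, axis=1, keepdims=True)
    power = np.maximum(power, eps)  # floor (assumption) keeps weights positive
    # Repeat the per-frame, per-source power at every frequency bin.
    return np.broadcast_to(power, demixed_tfm.shape).copy()


rng = np.random.default_rng(0)
Y = rng.standard_normal((6, 4, 2)) + 1j * rng.standard_normal((6, 4, 2))
weights = gaussian_weight_sketch(Y)
```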
Source code in src/oobss/postprocess/separation.py
mixing_matrix_from_demixing_for_reference ¶
mixing_matrix_from_demixing_for_reference(demixing_matrix, *, ref_mic)
Return reference-normalized mixing matrices from demixing matrices.
The function first inverts each demixing matrix, then applies projection-back scaling such that the selected reference microphone row becomes source-consistent for each source.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| demixing_matrix | ndarray | Demixing matrix array. Supported shapes are: - | required |
| ref_mic | int | Reference microphone index used for projection-back normalization. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Reference-normalized mixing matrix with the same leading dimensionality as |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
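The invert-then-scale idea behind projection back can be sketched in NumPy. With \(A_f = W_f^{-1}\), the image of source \(k\) at the reference microphone is \(A_f[m_{\mathrm{ref}}, k]\, y_{k,f,t}\). The array layouts and the helper `projection_back_sketch` are assumptions for illustration and may differ from what the library function returns.

```python
import numpy as np


def projection_back_sketch(demix, demixed, ref_mic):
    """Scale demixed sources to their images at one reference microphone.

    demix: (n_freq, n_src, n_chan) demixing matrices W_f (assumed layout).
    demixed: (n_freq, n_src, n_frame) demixed spectra y = W x (assumed layout).
    """
    mixing = np.linalg.inv(demix)        # A_f = W_f^{-1}, (n_freq, n_chan, n_src)
    scale = mixing[:, ref_mic, :]        # A_f[ref_mic, k] for each source k
    return scale[:, :, None] * demixed   # (n_freq, n_src, n_frame)


rng = np.random.default_rng(0)
n_freq, n_src, n_frame = 3, 2, 5
W = rng.standard_normal((n_freq, n_src, n_src)) + 1j * rng.standard_normal((n_freq, n_src, n_src))
X = rng.standard_normal((n_freq, n_src, n_frame)) + 1j * rng.standard_normal((n_freq, n_src, n_frame))
Y = W @ X  # demixed spectra
images = projection_back_sketch(W, Y, ref_mic=0)
```

A quick sanity check for this scaling is that the source images sum to the observation at the reference microphone, since \(x = A y\).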
Source code in src/oobss/postprocess/separation.py
separate_with_reference ¶
separate_with_reference(params, *, ref_mic, n_samples)
Apply reference-wise projection-back and reconstruct time signals.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | ParameterEstimationResult | Parameter estimation bundle that includes demixed STFT, demixing matrix, and an STFT object implementing | required |
| ref_mic | int | Reference microphone index for projection-back normalization. | required |
| n_samples | int | Number of time-domain samples to keep after iSTFT. The reconstructed output is cropped to this length. | required |
Returns:
| Type | Description |
|---|---|
| PerReferenceSeparationResult | Result containing: - |
Raises:
| Type | Description |
|---|---|
| ValueError | If demixing and demixed STFT shapes are unsupported or inconsistent. |
Source code in src/oobss/postprocess/separation.py
Signal¶
oobss.signal ¶
Signal processing utilities.
STFTPlan
dataclass
¶
STFT configuration shared across benchmark runners.
Source code in src/oobss/signal/stft.py
build_stft ¶
build_stft(plan, sample_rate)
Build a :class:scipy.signal.ShortTimeFFT instance from plan.
Source code in src/oobss/signal/stft.py
Visualization¶
oobss.visualization ¶
Visualization utilities for inspection and reporting.
plot_nmf_factors ¶
plot_nmf_factors(x, basis, activations, *, vmin=None, vmax=None)
Plot NMF factors and reconstructed spectrogram in a compact grid.
Source code in src/oobss/visualization/spectrogram.py
save_channel_spectrograms ¶
save_channel_spectrograms(spec, name, outdir, *, vmin=-40.0, vmax=20.0)
Save one spectrogram image per channel from a frame-first spectrogram.
Source code in src/oobss/visualization/spectrogram.py