Delay

This section covers several types of delay: basic delay, feedback delay, slapback delay, ping-pong delay, and multi-tap delay.

BasicDelay

class diffFx_pytorch.processors.delay.BasicDelay(sample_rate=44100, param_range=None)[source]

Bases: ProcessorsBase

Differentiable implementation of a single-tap delay line.

This processor implements a basic digital delay line using frequency-domain processing for precise, artifact-free time delays. It creates a single echo of the input signal with controllable delay time and mix level.

Implementation is based on:

The delay is implemented in the frequency domain using the time-shift property:

\[Y(\omega) = X(\omega)e^{-j\omega\tau}\]

where:

X(ω) is the input spectrum
Y(ω) is the delayed spectrum
τ is the delay time in seconds
Phase is unwrapped to ensure continuous delay response

Processing Chain:

Zero-pad input for delay buffer
Convert to frequency domain
Calculate phase shift (z^-N term)
Apply phase shift to spectrum
Convert back to time domain
Mix processed signal with original
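
The processing chain above can be sketched in plain Python. This is a minimal illustration, not the library's PyTorch code: it assumes an integer delay in samples, uses a naive DFT instead of an FFT, and the helper names (`dft`, `idft`, `fft_delay`) are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def fft_delay(x, delay_samples, mix):
    """Delay x by an integer number of samples via a phase shift, then mix."""
    # 1. Zero-pad so the delayed tail does not wrap around circularly.
    padded = list(x) + [0.0] * delay_samples
    n = len(padded)
    # 2. Convert to the frequency domain.
    X = dft(padded)
    # 3-4. Multiply bin k by exp(-j * (2*pi*k/n) * delay_samples).
    Y = [X[k] * cmath.exp(-2j * math.pi * k * delay_samples / n) for k in range(n)]
    # 5. Convert back to the time domain and trim to the input length.
    wet = idft(Y)[:len(x)]
    # 6. Linear crossfade between the dry and wet signals.
    return [(1.0 - mix) * d + mix * w for d, w in zip(x, wet)]


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
out = fft_delay(impulse, delay_samples=3, mix=0.5)
print([round(abs(v), 6) for v in out])
```

With an impulse as input, half of the energy stays at sample 0 (dry path) and half appears at sample 3 (wet path).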

Parameters:

sample_rate (int) – Audio sample rate in Hz
param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:

delay_ms: Echo delay time

Range: 0.1 to 1000.0 milliseconds
Controls time offset between original and delayed signal
Minimum value ensures stable processing
Maximum value set for practical buffer sizes

mix: Wet/dry mix ratio

Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only delayed signal
Linear crossfade between original and delayed signals

Note

Uses FFT-based delay for precise time shifting
Phase unwrapping prevents discontinuities in delay
Automatic padding handles all delay times
Particularly effective for:
- Creating simple echoes
- Adding space to dry signals
- Basic time-based effects

Examples

Basic DSP Usage:

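>>> import torch
>>> from diffFx_pytorch.processors.delay import BasicDelay
>>> # Example input with shape (batch, channels, samples)
>>> input_audio = torch.randn(4, 1, 44100)
>>> # Create a basic delay
>>> delay = BasicDelay(sample_rate=44100)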
>>> # Process audio
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 500.0,  # Half-second delay
...     'mix': 0.5          # Equal mix of dry and wet
... })

Neural Network Control:

>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class DelayController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 2),  # 2 parameters: delay and mix
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = DelayController(input_size=16)
>>> batch_size = input_audio.shape[0]  # one feature vector per audio example
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A simple delay line that creates a single echo of the input signal after a specified time delay. It provides control over the delay time and mix ratio between the dry (original) and wet (delayed) signals, offering basic time-based effects without feedback.

_register_default_parameters()[source]

Register delay time and mix parameters.

Sets up two parameters:

delay_ms: Delay time in milliseconds (0.1 to 1000.0)
mix: Wet/dry mix ratio (0.0 to 1.0)

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the delay line.

Parameters:

x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
- 'delay_ms': Normalized delay time (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each delay parameter can be given as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.
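
To illustrate how normalized values relate to DSP values, here is a minimal sketch that assumes a linear mapping from [0, 1] onto each parameter's range. The mapping the library actually uses is not stated here, and `denormalize` is a hypothetical helper, not part of the API.

```python
# Ranges copied from the BasicDelay "Parameters Details" listing.
PARAM_RANGES = {"delay_ms": (0.1, 1000.0), "mix": (0.0, 1.0)}


def denormalize(norm_params, ranges=PARAM_RANGES):
    """Map values in [0, 1] to DSP units, assuming a linear mapping."""
    dsp = {}
    for name, value in norm_params.items():
        low, high = ranges[name]
        dsp[name] = low + value * (high - low)
    return dsp


dsp = denormalize({"delay_ms": 0.5, "mix": 0.25})
print(dsp)
```

Under this assumption, a normalized delay of 0.5 lands in the middle of the 0.1 to 1000.0 ms range.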

Returns:

Processed audio tensor of same shape as input

Return type:

torch.Tensor

BasicFeedbackDelay

class diffFx_pytorch.processors.delay.BasicFeedbackDelay(sample_rate=44100, param_range=None)[source]

Bases: ProcessorsBase

Differentiable implementation of a feedback delay line.

This processor implements a delay line with feedback and feedforward paths, creating multiple decaying echoes. The implementation uses frequency-domain processing and a feedback-feedforward structure for flexible echo patterns.

Implementation is based on:

The transfer function of the system is from [1]:

\[H(z) = \frac{z^{-N} + g_{ff} - g_{fb}}{z^{-N} - g_{fb}}\]

where:

z^(-N) represents the delay of N samples
g_ff is the feedforward gain
g_fb is the feedback gain
System stability is ensured by limiting |g_fb| < 1
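
The role of the gain limit can be checked numerically. The sketch below evaluates H(z) exactly as written above at equally spaced points on the unit circle. Since |z^-N| = 1 there, the denominator magnitude is at least 1 - |g_fb|, so the division is always well defined. The function name and the choice of frequency grid are assumptions made for illustration.

```python
import cmath
import math


def feedback_response(num_bins, delay_samples, fb_gain, ff_gain):
    """Evaluate H(z) at num_bins equally spaced points on the unit circle.

    Returns the complex responses and the smallest denominator magnitude.
    """
    response = []
    min_denominator = float("inf")
    for k in range(num_bins):
        omega = 2.0 * math.pi * k / num_bins              # radians per sample
        z_delay = cmath.exp(-1j * omega * delay_samples)   # z^{-N} at z = e^{j*omega}
        denominator = z_delay - fb_gain
        min_denominator = min(min_denominator, abs(denominator))
        response.append((z_delay + ff_gain - fb_gain) / denominator)
    return response, min_denominator


H, min_den = feedback_response(num_bins=64, delay_samples=5, fb_gain=0.7, ff_gain=0.8)
print(abs(H[0]))  # response magnitude at DC
print(min_den)    # stays at or above 1 - fb_gain, up to rounding
```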

Processing Chain:

Zero-pad input for delay buffer
Convert to frequency domain
Calculate phase shift (z^-N term)
Apply transfer function H(z)
Convert back to time domain
Mix processed signal with original

Parameters:

sample_rate (int) – Audio sample rate in Hz
param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:

delay_ms: Echo delay time

Range: 0.1 to 1000.0 milliseconds
Controls time between successive echoes
Determines rhythmic pattern of echoes

mix: Wet/dry mix ratio

Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only processed signal

fb_gain: Feedback gain

Range: 0.0 to 0.99
Controls decay rate of echoes
Higher values create longer decay times
Capped at 0.99 so that |g_fb| < 1 and the echoes decay

ff_gain: Feedforward gain

Range: 0.0 to 0.99
Controls level of direct delayed signal
Shapes initial echo response
Independent of feedback path

Note

Uses FFT-based delay for precise time shifting
Phase unwrapping prevents discontinuities
Automatic padding handles all delay times
Particularly effective for:
- Creating rhythmic echo patterns
- Adding depth and space
- Building complex delay textures
System stability is maintained by gain limits

Examples

Basic DSP Usage:

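>>> import torch
>>> from diffFx_pytorch.processors.delay import BasicFeedbackDelay
>>> # Example input with shape (batch, channels, samples)
>>> input_audio = torch.randn(4, 1, 44100)
>>> # Create a feedback delay
>>> delay = BasicFeedbackDelay(sample_rate=44100)
>>> # Process with rhythmic echoes
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 250.0,  # Eighth note at 120 BPM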
...     'mix': 0.5,         # Equal mix
...     'fb_gain': 0.7,     # Moderate feedback
...     'ff_gain': 0.8      # Strong initial echo
... })
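
Tempo-synced delay times follow from simple arithmetic: one quarter-note beat lasts 60000 / BPM milliseconds. A minimal sketch, where `note_to_ms` is a hypothetical helper and not part of the library:

```python
def note_to_ms(bpm, beats=1.0):
    """Duration in milliseconds of a note lasting `beats` quarter-note beats."""
    return 60000.0 / bpm * beats


quarter = note_to_ms(120)        # 500.0
eighth = note_to_ms(120, 0.5)    # 250.0
print(quarter, eighth)
```

The result can be passed directly as the `delay_ms` value in `dsp_params`.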

Neural Network Control:

>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class FeedbackDelayController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 4),  # 4 parameters
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = FeedbackDelayController(input_size=16)
>>> batch_size = input_audio.shape[0]  # one feature vector per audio example
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A delay effect that includes a feedback path, allowing the delayed signal to be fed back into the input. This creates multiple, gradually decaying echoes. It provides controls for delay time, feedback and feedforward gain, and wet/dry mix, enabling everything from subtle space to rhythmic echo patterns.

_register_default_parameters()[source]

Register delay, mix, and gain parameters.

Sets up four parameters:

delay_ms: Delay time in milliseconds (0.1 to 1000.0)
mix: Wet/dry mix ratio (0.0 to 1.0)
fb_gain: Feedback gain (0.0 to 0.99)
ff_gain: Feedforward gain (0.0 to 0.99)

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the feedback delay line.

Parameters:

x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
- 'delay_ms': Normalized base delay time (0 to 1)
- 'fb_gain': Amount of signal fed back through the delay line (0 to 1)
- 'ff_gain': Feedforward gain (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each feedback delay parameter can be given as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed audio tensor of same shape as input

Return type:

torch.Tensor

SlapbackDelay

class diffFx_pytorch.processors.delay.SlapbackDelay(sample_rate=44100, param_range=None)[source]

Bases: BasicDelay

Differentiable implementation of a slapback delay effect.

The implementation is based on:

This processor extends BasicDelay to create a specialized short delay effect that emulates the distinctive “doubling” sound popularized in 1950s recordings. The delay time range is specifically restricted to create the characteristic slapback effect.

The processor uses the same frequency-domain implementation as BasicDelay:

\[Y(\omega) = X(\omega)e^{-j\omega\tau}\]

where τ is restricted to 40-120ms for the slapback effect.

Delay Time Ranges:

40-80ms: Tight doubling effect
80-120ms: Subtle ambience

These ranges are chosen based on psychoacoustic research and historical usage in classic recordings.
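
For orientation, the delay range translates into sample counts as follows. This is plain arithmetic at the default sample rate; `ms_to_samples` is a hypothetical helper, not part of the library.

```python
def ms_to_samples(delay_ms, sample_rate=44100):
    """Convert a delay time in milliseconds to a (possibly fractional) sample count."""
    return delay_ms * sample_rate / 1000.0


samples = {delay_ms: ms_to_samples(delay_ms) for delay_ms in (40.0, 80.0, 120.0)}
for delay_ms, count in samples.items():
    print(delay_ms, "ms ->", count, "samples")
```

At 44100 Hz the slapback range therefore spans 1764 to 5292 samples.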

Parameters:

sample_rate (int) – Audio sample rate in Hz
param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:

delay_ms: Slapback delay time

Range: 40.0 to 120.0 milliseconds
Shorter range than BasicDelay for specific effect
40-80ms: Creates tight doubling
80-120ms: Adds natural space

mix: Wet/dry mix ratio

Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only delayed signal
Typical settings: 0.3-0.5 for classic sound

Note

Inherits all processing methods from BasicDelay
Only modifies parameter ranges for specialized use
Particularly effective on:
- Vocals (creates natural doubling)
- Electric guitar (adds depth)
- Snare drums (enhances attack)
No feedback to maintain clarity of effect

Examples

Basic DSP Usage:

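>>> import torch
>>> from diffFx_pytorch.processors.delay import SlapbackDelay
>>> # Example input with shape (batch, channels, samples)
>>> input_audio = torch.randn(4, 1, 44100)
>>> # Create a slapback delay
>>> delay = SlapbackDelay(sample_rate=44100)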
>>> # Process with classic settings
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 60.0,  # Tight doubling effect
...     'mix': 0.4         # Subtle enhancement
... })

Neural Network Control:

>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class SlapbackController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 2),  # 2 parameters: delay and mix
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = SlapbackController(input_size=16)
>>> batch_size = input_audio.shape[0]  # one feature vector per audio example
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A specialized short delay effect that emulates the distinctive “doubling” sound of vintage tape delays. It uses very short delay times (40-120 ms) with no feedback, creating a tight, distinctive echo that was popular in early rock and roll recordings.

_register_default_parameters()[source]

Register parameters with slapback-specific ranges.

Modifies the delay time range from BasicDelay to:

delay_ms: 40.0 to 120.0 ms (slapback range)
mix: 0.0 to 1.0 (unchanged from BasicDelay)

Note

These ranges are specifically chosen for the characteristic slapback doubling effect.

PingPongDelay

class diffFx_pytorch.processors.delay.PingPongDelay(sample_rate=44100, param_range=None)[source]

Bases: ProcessorsBase

Differentiable implementation of a stereo ping-pong delay effect.

This processor implements a stereo delay effect where echoes alternate between left and right channels, creating a “ping-pong” spatial pattern. The implementation uses a cross-coupled feedback structure in the frequency domain for precise timing and smooth transitions.

Implementation is based on:

The system is described by coupled transfer functions:

\[\begin{aligned}
H_{11}(z) &= \frac{1}{1 - b_1 b_2 z^{-2N}} \\
H_{12}(z) &= \frac{b_1 z^{-N}}{1 - b_1 b_2 z^{-2N}} \\
H_{21}(z) &= \frac{b_2 z^{-N}}{1 - b_1 b_2 z^{-2N}} \\
H_{22}(z) &= \frac{b_1 b_2 z^{-2N}}{1 - b_1 b_2 z^{-2N}}
\end{aligned}\]

where:

z^(-N) represents the base delay
b1, b2 are feedback gains for each channel
System stability ensured by |b1*b2| < 1
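
The echo pattern implied by these transfer functions can be made explicit by expanding the shared denominator as a geometric series, 1 / (1 - r z^-2N) = sum over k of r^k z^-2Nk with r = b1*b2. The sketch below builds truncated impulse responses this way. It is an illustration in plain Python with hypothetical names, and it does not say which input or output channel each index refers to, since that mapping is not specified above.

```python
def ping_pong_impulse_responses(delay_samples, b1, b2, length):
    """Expand the four transfer functions into truncated impulse responses."""
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    r = b1 * b2  # loop gain; |r| < 1 makes the series converge

    # H11: echoes at 0, 2N, 4N, ... with gains r^k.
    h11 = [0.0] * length
    k = 0
    while 2 * delay_samples * k < length:
        h11[2 * delay_samples * k] = r ** k
        k += 1

    def shift_scale(h, gain, shift):
        """Apply gain * z^{-shift} to impulse response h."""
        out = [0.0] * length
        for n in range(shift, length):
            out[n] = gain * h[n - shift]
        return out

    h12 = shift_scale(h11, b1, delay_samples)      # echoes at N, 3N, 5N, ...
    h21 = shift_scale(h11, b2, delay_samples)      # echoes at N, 3N, 5N, ...
    h22 = shift_scale(h11, r, 2 * delay_samples)   # echoes at 2N, 4N, ...
    return h11, h12, h21, h22


h11, h12, h21, h22 = ping_pong_impulse_responses(delay_samples=2, b1=0.5, b2=0.5, length=10)
print(h11)
print(h12)
```

The cross terms H12 and H21 produce echoes at odd multiples of N, while H11 and H22 produce echoes at even multiples, which is what gives the alternating pattern.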

Processing Chain:

Zero-pad stereo input for delay buffer
Convert to frequency domain
Calculate cross-coupled transfer functions
Apply transfers to each channel
Convert back to time domain
Mix processed signal with original

Parameters:

sample_rate (int) – Audio sample rate in Hz
param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:

delay_ms: Base delay time

Range: 0.1 to 3000.0 milliseconds
Controls time between alternating echoes
Each bounce takes this amount of time

feedback_ch1: Left channel feedback gain

Range: 0.0 to 0.99
Controls decay of left-to-right echoes
Higher values create longer decay times

feedback_ch2: Right channel feedback gain

Range: 0.0 to 0.99
Controls decay of right-to-left echoes
Can differ from ch1 for asymmetric patterns

mix: Wet/dry mix ratio

Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only processed signal

Note

Uses FFT-based delay for precise time shifting
Phase unwrapping prevents discontinuities
Automatic padding handles all delay times
Particularly effective for:
- Creating rhythmic spatial patterns
- Adding stereo width and movement
- Building complex stereo textures
System stability is maintained by gain limits

Examples

Basic DSP Usage:

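>>> import torch
>>> from diffFx_pytorch.processors.delay import PingPongDelay
>>> # Example stereo input with shape (batch, 2, samples)
>>> input_audio = torch.randn(4, 2, 44100)
>>> # Create a ping-pong delay
>>> delay = PingPongDelay(sample_rate=44100)
>>> # Process with rhythmic spatial echoes
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 250.0,     # Eighth note at 120 BPM
...     'feedback_ch1': 0.7,   # Left to right decay
...     'feedback_ch2': 0.7,   # Right to left decay
...     'mix': 0.5             # Equal mix of dry and wet
... })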

Neural Network Control:

>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class PingPongController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 4),  # 4 parameters
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = PingPongController(input_size=16)
>>> batch_size = input_audio.shape[0]  # one feature vector per audio example
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A stereo delay effect where the echoes alternate between left and right channels, creating a “ping-pong” effect across the stereo field. Each echo bounces from one channel to the other while decreasing in amplitude, producing a wide spatial effect with rhythmic possibilities.

_register_default_parameters()[source]

Register delay time, feedback, and mix parameters.

Sets up four parameters:

delay_ms: Base delay time (0.1 to 3000.0 ms)
feedback_ch1: Left channel feedback (0.0 to 0.99)
feedback_ch2: Right channel feedback (0.0 to 0.99)
mix: Wet/dry mix ratio (0.0 to 1.0)

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the ping-pong delay.

Parameters:

x (torch.Tensor) – Input audio tensor. Shape: (batch, 2, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
- 'delay_ms': Normalized base delay time (0 to 1)
- 'feedback_ch1': Left channel feedback (0 to 1)
- 'feedback_ch2': Right channel feedback (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each ping-pong parameter can be given as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed stereo audio tensor of same shape as input. Shape: (batch, 2, samples)

Return type:

torch.Tensor

Raises:

AssertionError – If input is not stereo (2 channels)

MultiTapsDelay

class diffFx_pytorch.processors.delay.MultiTapsDelay(sample_rate, param_range=None, num_taps=4)[source]

Bases: ProcessorsBase

Differentiable implementation of a multi-tap delay effect.

This processor implements a parallel delay structure with multiple taps, where each tap represents an independent echo with its own delay time and gain. The implementation uses frequency-domain processing for precise timing control and efficient computation.

Implementation is based on:

The transfer function is a sum of delayed signals:

\[H(\omega) = \sum_{i=0}^{N-1} g_i e^{-j\omega\tau_i}\]

where:

N is the number of taps
g_i is the gain of tap i
τ_i is the delay time of tap i
Phase is unwrapped for each tap
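
As a check on the formula, the sketch below builds H on a small frequency grid and inverts it to recover the impulse response, which should contain each tap's gain at that tap's delay. It assumes integer delays in samples and uses a naive inverse DFT. The function names and tap values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def multitap_response(n_fft, tap_delays, tap_gains):
    """H[k] = sum_i g_i * exp(-j * (2*pi*k/n_fft) * d_i) for integer delays d_i."""
    return [
        sum(g * cmath.exp(-2j * math.pi * k * d / n_fft)
            for d, g in zip(tap_delays, tap_gains))
        for k in range(n_fft)
    ]


def impulse_response(H):
    """Naive O(n^2) inverse DFT of H, keeping the real part."""
    n = len(H)
    return [
        sum(H[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


taps = [1, 3, 6]         # tap delays in samples (illustrative values)
gains = [0.8, 0.6, 0.4]  # tap gains
h = impulse_response(multitap_response(n_fft=8, tap_delays=taps, tap_gains=gains))
print([round(abs(v), 6) for v in h])
```

Because the structure is purely feedforward, the impulse response is finite: it is nonzero only at the tap positions.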

Processing Chain:

Zero-pad input for maximum delay buffer
Convert to frequency domain
Calculate phase shifts for each tap
Apply gains and sum delayed signals
Convert back to time domain
Mix processed signal with original

Parameters:

sample_rate (int) – Audio sample rate in Hz
num_taps (int) – Number of independent delay taps. Defaults to 4.
param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:

For each tap i (where i ranges from 0 to num_taps-1):

i_tap_delays_ms: Delay time for tap i

Range: 50.0 to 500.0 milliseconds
Controls timing of each echo
Independent control per tap
Can create complex rhythmic patterns

i_tap_gains: Gain for tap i

Range: 0.0 to 1.0
Controls amplitude of each echo
Allows creation of complex patterns
Can be used for amplitude envelopes

mix: Overall wet/dry mix ratio

Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only processed signal
Controls overall effect intensity

Note

Uses FFT-based delay for precise time shifting
Phase unwrapping prevents discontinuities
Automatic padding handles all delay times
Particularly effective for:
- Creating complex rhythmic patterns
- Building custom echo sequences
- Designing unique delay textures
Each tap can be independently controlled
System is stable for all parameter values

Examples

Basic DSP Usage:

>>> import torch
>>> from diffFx_pytorch.processors.delay import MultiTapsDelay
>>> # Example input with shape (batch, channels, samples)
>>> input_audio = torch.randn(4, 1, 44100)
>>> # Create a 4-tap delay
>>> delay = MultiTapsDelay(sample_rate=44100, num_taps=4)
>>> # Process with rhythmic pattern
>>> params = {
...     '0_tap_delays_ms': 125.0,  # Sixteenth note at 120 BPM
...     '0_tap_gains': 0.8,
...     '1_tap_delays_ms': 250.0,  # Eighth note
...     '1_tap_gains': 0.6,
...     '2_tap_delays_ms': 375.0,  # Dotted eighth
...     '2_tap_gains': 0.4,
...     '3_tap_delays_ms': 500.0,  # Quarter note
...     '3_tap_gains': 0.2,
...     'mix': 0.5
... }
>>> output = delay(input_audio, dsp_params=params)

Neural Network Control:

>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class MultiTapController(nn.Module):
...     def __init__(self, input_size, num_taps):
...         super().__init__()
...         num_params = 2 * num_taps + 1  # delays, gains, and mix
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, num_params),
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = MultiTapController(input_size=16, num_taps=4)
>>> batch_size = input_audio.shape[0]  # one feature vector per audio example
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A complex delay effect that creates multiple delayed copies (taps) of the input signal, each with independent timing and level controls. This allows for creation of complex rhythmic patterns by precisely controlling the timing and gain of each echo.

__init__(sample_rate, param_range=None, num_taps=4)[source]

Initialize the multi-tap delay processor.

Parameters:

sample_rate – Audio sample rate in Hz
param_range – Optional parameter definitions to override defaults
num_taps – Number of independent delay taps. Defaults to 4.

_register_default_parameters()[source]

Register parameters for all taps and mix.

Creates parameters for each tap:

i_tap_delays_ms: Delay time (50.0 to 500.0 ms)
i_tap_gains: Tap gain (0.0 to 1.0)

Plus overall mix parameter.

Total parameters = 2 * num_taps + 1
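
The naming scheme and parameter count can be illustrated with a short sketch. The key names follow the `i_tap_delays_ms` / `i_tap_gains` pattern documented above. The ordering of the keys and the helper name `multitap_param_names` are assumptions, not part of the library.

```python
def multitap_param_names(num_taps):
    """List the parameter keys for a delay with num_taps taps, plus 'mix'."""
    names = []
    for i in range(num_taps):
        names.append(f"{i}_tap_delays_ms")  # delay time of tap i
        names.append(f"{i}_tap_gains")      # gain of tap i
    names.append("mix")                     # overall wet/dry mix
    return names


names = multitap_param_names(4)
print(len(names))  # 9, i.e. 2 * num_taps + 1
print(names)
```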

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the multi-tap delay.

Parameters:

x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
- '{i}_tap_delays_ms': Normalized delay time for tap i (0 to 1)
- '{i}_tap_gains': Gain for tap i (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each multi-tap parameter can be given as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed audio tensor of same shape as input

Return type:

torch.Tensor