Delay

This section covers several types of delay: basic delay, feedback delay, slapback delay, ping-pong delay, and multi-tap delay.

BasicDelay

class diffFx_pytorch.processors.delay.BasicDelay(sample_rate=44100, param_range=None)[source]

Bases: ProcessorsBase

Differentiable implementation of a single-tap delay line.

This processor implements a basic digital delay line using frequency-domain processing for precise, artifact-free time delays. It creates a single echo of the input signal with controllable delay time and mix level.

Implementation is based on:

The delay is implemented in the frequency domain using the time-shift property:

\[Y(\omega) = X(\omega)e^{-j\omega\tau}\]
where:
  • X(ω) is the input spectrum

  • Y(ω) is the delayed spectrum

  • τ is the delay time in seconds

  • Phase is unwrapped to ensure continuous delay response

Processing Chain:
  1. Zero-pad input for delay buffer

  2. Convert to frequency domain

  3. Calculate phase shift (z^-N term)

  4. Apply phase shift to spectrum

  5. Convert back to time domain

  6. Mix processed signal with original
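
The six steps above can be sketched as follows. This is a minimal illustration with NumPy standing in for PyTorch, not the library's implementation: the function name `fft_delay`, its arguments, and the choice to pad by the delay length are assumptions. Because the delay enters only through the smooth phase term e^{-jωτ}, the same computation written with differentiable tensor operations admits gradients with respect to the delay time.

```python
import numpy as np

def fft_delay(x, delay_ms, mix, sample_rate=44100):
    """Delay a 1-D signal via the time-shift property (illustrative sketch)."""
    n = len(x)
    delay_samples = delay_ms * 1e-3 * sample_rate
    # 1. Zero-pad so the shifted signal cannot wrap around (the FFT shift is circular).
    n_fft = n + int(np.ceil(delay_samples))
    # 2. Convert to the frequency domain (rfft zero-pads to n_fft).
    X = np.fft.rfft(x, n=n_fft)
    # 3. Phase shift omega * tau, with omega in radians per sample.
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft)
    # 4. Apply Y = X * exp(-j * omega * tau).
    Y = X * np.exp(-1j * omega * delay_samples)
    # 5. Back to the time domain, trimmed to the input length.
    wet = np.fft.irfft(Y, n=n_fft)[:n]
    # 6. Linear wet/dry mix.
    return (1.0 - mix) * x + mix * wet

# A unit impulse delayed by 10 ms at a 1 kHz sample rate moves by 10 samples.
impulse = np.zeros(64)
impulse[0] = 1.0
out = fft_delay(impulse, delay_ms=10.0, mix=1.0, sample_rate=1000)
```

With `mix=1.0` the output contains only the delayed impulse; intermediate `mix` values keep a scaled copy of the original at sample 0.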

Parameters:
  • sample_rate (int) – Audio sample rate in Hz

  • param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:
delay_ms: Echo delay time
  • Range: 0.1 to 1000.0 milliseconds

  • Controls time offset between original and delayed signal

  • Minimum value ensures stable processing

  • Maximum value set for practical buffer sizes

mix: Wet/dry mix ratio
  • Range: 0.0 to 1.0

  • 0.0: Only original signal

  • 1.0: Only delayed signal

  • Linear crossfade between original and delayed signals

Note

  • Uses FFT-based delay for precise time shifting

  • Phase unwrapping prevents discontinuities in delay

  • Automatic padding handles all delay times

  • Particularly effective for:
    • Creating simple echoes

    • Adding space to dry signals

    • Basic time-based effects

Examples

Basic DSP Usage:
>>> from diffFx_pytorch.processors.delay import BasicDelay
>>> # Create a basic delay
>>> delay = BasicDelay(sample_rate=44100)
>>> # Process audio
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 500.0,  # Half-second delay
...     'mix': 0.5          # Equal mix of dry and wet
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class DelayController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 2),  # 2 parameters: delay and mix
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = DelayController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A simple delay line that creates a single echo of the input signal after a specified time delay. It provides control over the delay time and mix ratio between the dry (original) and wet (delayed) signals, offering basic time-based effects without feedback.

_register_default_parameters()[source]

Register delay time and mix parameters.

Sets up two parameters:
  • delay_ms: Delay time in milliseconds (0.1 to 1000.0)

  • mix: Wet/dry mix ratio (0.0 to 1.0)
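
The relationship between normalized values and the ranges registered above can be sketched as follows. The source does not state the mapping curve, so this sketch assumes a plain linear mapping; `denormalize` and `BASIC_DELAY_RANGES` are hypothetical names used only for illustration.

```python
def denormalize(norm_value, min_val, max_val):
    """Map a normalized value in [0, 1] linearly onto [min_val, max_val]."""
    if not 0.0 <= norm_value <= 1.0:
        raise ValueError("norm_value must lie in [0, 1]")
    return min_val + norm_value * (max_val - min_val)

# Ranges as documented for BasicDelay.
BASIC_DELAY_RANGES = {"delay_ms": (0.1, 1000.0), "mix": (0.0, 1.0)}

# Assumed linear mapping: 0.5 lands halfway through each range.
norm_params = {"delay_ms": 0.5, "mix": 0.25}
dsp_params = {
    name: denormalize(value, *BASIC_DELAY_RANGES[name])
    for name, value in norm_params.items()
}
```

Under this assumption a sigmoid output of 0.5 corresponds to a delay of about 500 ms, which is why a network ending in `nn.Sigmoid()` can drive the processor without knowing the physical units.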

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the delay line.

Parameters:
  • x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)

  • norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
    • 'delay_ms': Delay time, normalized to 0 to 1 over the delay_ms range

    • 'mix': Wet/dry balance (0 to 1)

    Each value should be a tensor of shape (batch_size,).

  • dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each delay parameter can be given as:
    • float/int: Single value applied to entire batch

    • 0D tensor: Single value applied to entire batch

    • 1D tensor: Batch of values matching input batch size

    Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed audio tensor of same shape as input

Return type:

torch.Tensor
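
The three accepted forms of a `dsp_params` value (Python scalar, 0D tensor, 1D tensor) all end up as one value per batch item. A minimal sketch of that expansion, using NumPy arrays in place of tensors and ignoring device placement; `expand_to_batch` is a hypothetical helper, not part of the library:

```python
import numpy as np

def expand_to_batch(value, batch_size):
    """Return `value` as a float array of shape (batch_size,)."""
    arr = np.asarray(value, dtype=np.float64)
    if arr.ndim == 0:
        # Python scalar or 0-D array: repeat the value for every batch item.
        return np.full(batch_size, float(arr))
    if arr.ndim == 1 and arr.shape[0] == batch_size:
        # Already one value per batch item.
        return arr
    raise ValueError(f"expected a scalar or shape ({batch_size},), got {arr.shape}")

batch_size = 3
scalar = expand_to_batch(500.0, batch_size)
zero_d = expand_to_batch(np.array(0.5), batch_size)
per_item = expand_to_batch([100.0, 200.0, 300.0], batch_size)
```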

BasicFeedbackDelay

class diffFx_pytorch.processors.delay.BasicFeedbackDelay(sample_rate=44100, param_range=None)[source]

Bases: ProcessorsBase

Differentiable implementation of a feedback delay line.

This processor implements a delay line with feedback and feedforward paths, creating multiple decaying echoes. The implementation uses frequency-domain processing and a feedback-feedforward structure for flexible echo patterns.

Implementation is based on:

The transfer function of the system is from [1]:

\[H(z) = \frac{1 + (g_{ff} - g_{fb})z^{-N}}{1 - g_{fb}z^{-N}}\]
where:
  • z^(-N) represents the delay of N samples

  • g_ff is the feedforward gain

  • g_fb is the feedback gain

  • System stability is ensured by limiting |g_fb| < 1

Processing Chain:
  1. Zero-pad input for delay buffer

  2. Convert to frequency domain

  3. Calculate phase shift (z^-N term)

  4. Apply transfer function H(z)

  5. Convert back to time domain

  6. Mix processed signal with original
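
Steps 2 to 5 can be sketched as follows, with NumPy standing in for PyTorch. The sketch assumes the transfer function is read in its causal form, H(z) = (1 + (g_ff - g_fb) z^-N) / (1 - g_fb z^-N), in which every term is expressed through the delay z^-N; `feedback_delay_response` is a hypothetical name. Evaluating a recursive filter on an FFT grid wraps its tail around the buffer, so the buffer must be long enough for the echoes to decay.

```python
import numpy as np

def feedback_delay_response(n_fft, delay_samples, fb_gain, ff_gain):
    """Frequency response of a feedback/feedforward delay on an rFFT grid."""
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft)
    z_delay = np.exp(-1j * omega * delay_samples)  # z^-N on the unit circle
    return (1.0 + (ff_gain - fb_gain) * z_delay) / (1.0 - fb_gain * z_delay)

n_fft, N = 256, 8
fb_gain, ff_gain = 0.5, 0.8

# Impulse response: multiply the impulse spectrum by H and transform back.
x = np.zeros(n_fft)
x[0] = 1.0
H = feedback_delay_response(n_fft, N, fb_gain, ff_gain)
h = np.fft.irfft(np.fft.rfft(x) * H, n=n_fft)

# Under this form the direct signal has unit gain, the first echo has gain
# ff_gain, and each later echo is scaled by a further factor of fb_gain.
echo_gains = [h[k * N] for k in range(4)]
```

With these settings the echoes at N, 2N, and 3N have gains close to 0.8, 0.4, and 0.2.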

Parameters:
  • sample_rate (int) – Audio sample rate in Hz

  • param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:
delay_ms: Echo delay time
  • Range: 0.1 to 1000.0 milliseconds

  • Controls time between successive echoes

  • Determines rhythmic pattern of echoes

mix: Wet/dry mix ratio
  • Range: 0.0 to 1.0

  • 0.0: Only original signal

  • 1.0: Only processed signal

fb_gain: Feedback gain
  • Range: 0.0 to 0.99

  • Controls decay rate of echoes

  • Higher values create longer decay times

  • Clamped to ±0.99 for stability

ff_gain: Feedforward gain
  • Range: 0.0 to 0.99

  • Controls level of direct delayed signal

  • Shapes initial echo response

  • Independent of feedback path
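
The link between fb_gain and decay time can be made concrete with a short calculation. This sketch assumes each successive echo is scaled by fb_gain relative to the previous one and uses a -60 dB floor as an arbitrary audibility threshold; `echoes_until_db` is a hypothetical helper.

```python
import math

def echoes_until_db(fb_gain, floor_db=-60.0):
    """Number of feedback passes before an echo falls to or below floor_db."""
    if not 0.0 < fb_gain < 1.0:
        raise ValueError("fb_gain must lie strictly between 0 and 1")
    floor_linear = 10.0 ** (floor_db / 20.0)
    # Smallest integer n with fb_gain ** n <= floor_linear.
    return math.ceil(math.log(floor_linear) / math.log(fb_gain))

n_echoes = echoes_until_db(0.7)
decay_ms = n_echoes * 250.0  # total decay time for a 250 ms delay
```

Raising fb_gain from 0.7 to 0.9 more than triples the number of echoes above the floor, which is the "longer decay times" behaviour described for this parameter.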

Note

  • Uses FFT-based delay for precise time shifting

  • Phase unwrapping prevents discontinuities

  • Automatic padding handles all delay times

  • Particularly effective for:
    • Creating rhythmic echo patterns

    • Adding depth and space

    • Building complex delay textures

  • System stability is maintained by gain limits

Examples

Basic DSP Usage:
>>> from diffFx_pytorch.processors.delay import BasicFeedbackDelay
>>> # Create a feedback delay
>>> delay = BasicFeedbackDelay(sample_rate=44100)
>>> # Process with rhythmic echoes
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 250.0,  # Eighth note at 120 BPM
...     'mix': 0.5,         # Equal mix
...     'fb_gain': 0.7,     # Moderate feedback
...     'ff_gain': 0.8      # Strong initial echo
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class FeedbackDelayController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 4),  # 4 parameters
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = FeedbackDelayController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A delay effect that includes a feedback path, allowing the delayed signal to be fed back into the input. This creates multiple, gradually decaying echoes. Features controls for delay time, feedback amount, and wet/dry mix, enabling creation of everything from subtle space to rhythmic echo patterns.
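
For rhythmic echo patterns, the delay time is usually derived from the tempo. A small sketch of that arithmetic, where one beat is a quarter note; `note_to_delay_ms` is a hypothetical helper, not part of the library:

```python
def note_to_delay_ms(bpm, beats=1.0):
    """Duration in milliseconds of a note lasting `beats` quarter-note beats."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    return 60000.0 / bpm * beats

quarter = note_to_delay_ms(120)                    # one beat
eighth = note_to_delay_ms(120, beats=0.5)          # half a beat
dotted_eighth = note_to_delay_ms(120, beats=0.75)  # three quarters of a beat
```

The resulting value can be passed as `delay_ms` as long as it falls inside the documented 0.1 to 1000.0 ms range.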

_register_default_parameters()[source]

Register delay, mix, and gain parameters.

Sets up four parameters:
  • delay_ms: Delay time in milliseconds (0.1 to 1000.0)

  • mix: Wet/dry mix ratio (0.0 to 1.0)

  • fb_gain: Feedback gain (0.0 to 0.99)

  • ff_gain: Feedforward gain (0.0 to 0.99)

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the feedback delay line.

Parameters:
  • x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)

  • norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
    • 'delay_ms': Base delay time, normalized to 0 to 1 over the delay_ms range

    • 'fb_gain': Amount of signal fed back through the delay line (0 to 1)

    • 'ff_gain': Feedforward gain (0 to 1)

    • 'mix': Wet/dry balance (0 to 1)

    Each value should be a tensor of shape (batch_size,).

  • dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each feedback delay parameter can be given as:
    • float/int: Single value applied to entire batch

    • 0D tensor: Single value applied to entire batch

    • 1D tensor: Batch of values matching input batch size

    Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed audio tensor of same shape as input

Return type:

torch.Tensor

SlapbackDelay

class diffFx_pytorch.processors.delay.SlapbackDelay(sample_rate=44100, param_range=None)[source]

Bases: BasicDelay

Differentiable implementation of a slapback delay effect.

The implementation is based on:

This processor extends BasicDelay to create a specialized short delay effect that emulates the distinctive “doubling” sound popularized in 1950s recordings. The delay time range is specifically restricted to create the characteristic slapback effect.

The processor uses the same frequency-domain implementation as BasicDelay:

\[Y(\omega) = X(\omega)e^{-j\omega\tau}\]

where τ is restricted to 40-120ms for the slapback effect.

Delay Time Ranges:
  • 40-80ms: Tight doubling effect

  • 80-120ms: Subtle ambience

These ranges are chosen based on psychoacoustic research and historical usage in classic recordings.
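
In samples, the slapback range is short compared with the general-purpose delay. A small sketch of the conversion at the default sample rate; `ms_to_samples` and `SLAPBACK_RANGE_MS` are hypothetical names used only for illustration:

```python
def ms_to_samples(delay_ms, sample_rate=44100):
    """Convert a delay time in milliseconds to a (possibly fractional) sample count."""
    return delay_ms * sample_rate / 1000.0

SLAPBACK_RANGE_MS = (40.0, 120.0)  # range documented for SlapbackDelay

shortest = ms_to_samples(SLAPBACK_RANGE_MS[0])
longest = ms_to_samples(SLAPBACK_RANGE_MS[1])
```

At 44.1 kHz the range spans 1764 to 5292 samples, so the zero-padding needed for the frequency-domain shift stays small.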

Parameters:
  • sample_rate (int) – Audio sample rate in Hz

  • param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:
delay_ms: Slapback delay time
  • Range: 40.0 to 120.0 milliseconds

  • Shorter range than BasicDelay for specific effect

  • 40-80ms: Creates tight doubling

  • 80-120ms: Adds natural space

mix: Wet/dry mix ratio
  • Range: 0.0 to 1.0

  • 0.0: Only original signal

  • 1.0: Only delayed signal

  • Typical settings: 0.3-0.5 for classic sound

Note

  • Inherits all processing methods from BasicDelay

  • Only modifies parameter ranges for specialized use

  • Particularly effective on:
    • Vocals (creates natural doubling)

    • Electric guitar (adds depth)

    • Snare drums (enhances attack)

  • No feedback to maintain clarity of effect

Examples

Basic DSP Usage:
>>> from diffFx_pytorch.processors.delay import SlapbackDelay
>>> # Create a slapback delay
>>> delay = SlapbackDelay(sample_rate=44100)
>>> # Process with classic settings
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 60.0,  # Tight doubling effect
...     'mix': 0.4         # Subtle enhancement
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class SlapbackController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 2),  # 2 parameters: delay and mix
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = SlapbackController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A specialized short delay effect that emulates the distinctive “doubling” sound of vintage tape delays. Uses very short delay times (40-120 ms) and no feedback, creating a tight, distinctive echo that was popular in early rock and roll recordings.

_register_default_parameters()[source]

Register parameters with slapback-specific ranges.

Modifies the delay time range from BasicDelay to:
  • delay_ms: 40.0 to 120.0 ms (slapback range)

  • mix: 0.0 to 1.0 (unchanged from BasicDelay)

Note

These ranges are specifically chosen for the characteristic slapback doubling effect.

PingPongDelay

class diffFx_pytorch.processors.delay.PingPongDelay(sample_rate=44100, param_range=None)[source]

Bases: ProcessorsBase

Differentiable implementation of a stereo ping-pong delay effect.

This processor implements a stereo delay effect where echoes alternate between left and right channels, creating a “ping-pong” spatial pattern. The implementation uses a cross-coupled feedback structure in the frequency domain for precise timing and smooth transitions.

Implementation is based on:

The system is described by coupled transfer functions:

\[
\begin{aligned}
H_{11}(z) &= \frac{1}{1 - b_1 b_2 z^{-2N}} \\
H_{12}(z) &= \frac{b_1 z^{-N}}{1 - b_1 b_2 z^{-2N}} \\
H_{21}(z) &= \frac{b_2 z^{-N}}{1 - b_1 b_2 z^{-2N}} \\
H_{22}(z) &= \frac{b_1 b_2 z^{-2N}}{1 - b_1 b_2 z^{-2N}}
\end{aligned}
\]
where:
  • z^(-N) represents the base delay

  • b1, b2 are feedback gains for each channel

  • System stability ensured by |b1*b2| < 1

Processing Chain:
  1. Zero-pad stereo input for delay buffer

  2. Convert to frequency domain

  3. Calculate cross-coupled transfer functions

  4. Apply transfers to each channel

  5. Convert back to time domain

  6. Mix processed signal with original
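
Steps 2 to 5 can be sketched with NumPy standing in for PyTorch. The four responses follow the coupled transfer functions as documented; how they are routed to the two outputs is an assumption of this sketch (Y1 = H11·X1 + H12·X2 and Y2 = H21·X1 + H22·X2), and `ping_pong_responses` is a hypothetical name. As with any recursive filter evaluated on an FFT grid, the echo tail wraps around unless the buffer is long enough for it to decay.

```python
import numpy as np

def ping_pong_responses(n_fft, delay_samples, b1, b2):
    """Evaluate the four coupled transfer functions on an rFFT grid."""
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft)
    z_n = np.exp(-1j * omega * delay_samples)  # z^-N on the unit circle
    denom = 1.0 - b1 * b2 * z_n**2             # shared cross-feedback loop
    h11 = 1.0 / denom
    h12 = b1 * z_n / denom
    h21 = b2 * z_n / denom
    h22 = b1 * b2 * z_n**2 / denom
    return h11, h12, h21, h22

n_fft, N = 128, 4
b1, b2 = 0.5, 0.5

# An impulse on channel 1 only, silence on channel 2.
ch1 = np.zeros(n_fft)
ch1[0] = 1.0
ch2 = np.zeros(n_fft)
X1, X2 = np.fft.rfft(ch1), np.fft.rfft(ch2)

h11, h12, h21, h22 = ping_pong_responses(n_fft, N, b1, b2)
# Assumed routing of the four responses to the two output channels.
y1 = np.fft.irfft(h11 * X1 + h12 * X2, n=n_fft)
y2 = np.fft.irfft(h21 * X1 + h22 * X2, n=n_fft)
```

Under this routing the impulse reappears on channel 1 at 0, 2N, 4N, ... and on channel 2 at N, 3N, 5N, ..., which is the alternating pattern the effect is named for.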

Parameters:
  • sample_rate (int) – Audio sample rate in Hz

  • param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:
delay_ms: Base delay time
  • Range: 0.1 to 3000.0 milliseconds

  • Controls time between alternating echoes

  • Each bounce takes this amount of time

feedback_ch1: Left channel feedback gain
  • Range: 0.0 to 0.99

  • Controls decay of left-to-right echoes

  • Higher values create longer decay times

feedback_ch2: Right channel feedback gain
  • Range: 0.0 to 0.99

  • Controls decay of right-to-left echoes

  • Can differ from ch1 for asymmetric patterns

mix: Wet/dry mix ratio
  • Range: 0.0 to 1.0

  • 0.0: Only original signal

  • 1.0: Only processed signal

Note

  • Uses FFT-based delay for precise time shifting

  • Phase unwrapping prevents discontinuities

  • Automatic padding handles all delay times

  • Particularly effective for:
    • Creating rhythmic spatial patterns

    • Adding stereo width and movement

    • Building complex stereo textures

  • System stability is maintained by gain limits

Examples

Basic DSP Usage:
>>> from diffFx_pytorch.processors.delay import PingPongDelay
>>> # Create a ping-pong delay
>>> delay = PingPongDelay(sample_rate=44100)
>>> # Process with rhythmic spatial echoes
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 250.0,     # Eighth note at 120 BPM
...     'feedback_ch1': 0.7,   # Left to right decay
...     'feedback_ch2': 0.7,   # Right to left decay
...     'mix': 0.5            # Equal mix of dry and wet
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class PingPongController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 4),  # 4 parameters
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = PingPongController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A stereo delay effect where the echoes alternate between left and right channels, creating a “ping-pong” effect across the stereo field. Each echo bounces from one channel to the other while decreasing in amplitude, producing a wide spatial effect with rhythmic possibilities.

_register_default_parameters()[source]

Register delay time, feedback, and mix parameters.

Sets up four parameters:
  • delay_ms: Base delay time (0.1 to 3000.0 ms)

  • feedback_ch1: Left channel feedback (0.0 to 0.99)

  • feedback_ch2: Right channel feedback (0.0 to 0.99)

  • mix: Wet/dry mix ratio (0.0 to 1.0)

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the ping-pong delay.

Parameters:
  • x (torch.Tensor) – Input audio tensor. Shape: (batch, 2, samples)

  • norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
    • 'delay_ms': Base delay time, normalized to 0 to 1 over the delay_ms range

    • 'feedback_ch1': Left channel feedback (0 to 1)

    • 'feedback_ch2': Right channel feedback (0 to 1)

    • 'mix': Wet/dry balance (0 to 1)

    Each value should be a tensor of shape (batch_size,).

  • dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each ping-pong parameter can be given as:
    • float/int: Single value applied to entire batch

    • 0D tensor: Single value applied to entire batch

    • 1D tensor: Batch of values matching input batch size

    Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed stereo audio tensor of same shape as input. Shape: (batch, 2, samples)

Return type:

torch.Tensor

Raises:

AssertionError – If input is not stereo (2 channels)

MultiTapsDelay

class diffFx_pytorch.processors.delay.MultiTapsDelay(sample_rate, param_range=None, num_taps=4)[source]

Bases: ProcessorsBase

Differentiable implementation of a multi-tap delay effect.

This processor implements a parallel delay structure with multiple taps, where each tap represents an independent echo with its own delay time and gain. The implementation uses frequency-domain processing for precise timing control and efficient computation.

Implementation is based on:

The transfer function is a sum of delayed signals:

\[H(\omega) = \sum_{i=0}^{N-1} g_i e^{-j\omega\tau_i}\]
where:
  • N is the number of taps

  • g_i is the gain of tap i

  • τ_i is the delay time of tap i

  • Phase is unwrapped for each tap

Processing Chain:
  1. Zero-pad input for maximum delay buffer

  2. Convert to frequency domain

  3. Calculate phase shifts for each tap

  4. Apply gains and sum delayed signals

  5. Convert back to time domain

  6. Mix processed signal with original
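
Steps 2 to 5 can be sketched as follows, with NumPy standing in for PyTorch. The function name `multi_tap_response` and the use of sample-domain delays are assumptions of this sketch, not the library's API.

```python
import numpy as np

def multi_tap_response(n_fft, delays_samples, gains):
    """Sum of gain-weighted phase shifts: H = sum_i g_i * exp(-j * omega * tau_i)."""
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft)
    H = np.zeros(omega.shape, dtype=complex)
    for tau, g in zip(delays_samples, gains):
        H += g * np.exp(-1j * omega * tau)
    return H

n_fft = 64
x = np.zeros(n_fft)
x[0] = 1.0  # unit impulse, so the output is the tap pattern itself

H = multi_tap_response(n_fft, delays_samples=[5, 10, 20], gains=[0.8, 0.6, 0.4])
wet = np.fft.irfft(np.fft.rfft(x) * H, n=n_fft)
```

Because the structure is purely feedforward, the response contains exactly one echo per tap and no recursive tail.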

Parameters:
  • sample_rate (int) – Audio sample rate in Hz

  • num_taps (int) – Number of independent delay taps. Defaults to 4.

  • param_range (Dict[str, EffectParam], optional) – Parameter ranges.

Parameters Details:
For each tap i (where i ranges from 0 to num_taps-1; the literal index replaces i in the parameter name, e.g. 0_tap_delays_ms):
i_tap_delays_ms: Delay time for tap i
  • Range: 50.0 to 500.0 milliseconds

  • Controls timing of each echo

  • Independent control per tap

  • Can create complex rhythmic patterns

i_tap_gains: Gain for tap i
  • Range: 0.0 to 1.0

  • Controls amplitude of each echo

  • Allows creation of complex patterns

  • Can be used for amplitude envelopes

mix: Overall wet/dry mix ratio
  • Range: 0.0 to 1.0

  • 0.0: Only original signal

  • 1.0: Only processed signal

  • Controls overall effect intensity

Note

  • Uses FFT-based delay for precise time shifting

  • Phase unwrapping prevents discontinuities

  • Automatic padding handles all delay times

  • Particularly effective for:
    • Creating complex rhythmic patterns

    • Building custom echo sequences

    • Designing unique delay textures

  • Each tap can be independently controlled

  • System is stable for all parameter values

Examples

Basic DSP Usage:
>>> from diffFx_pytorch.processors.delay import MultiTapsDelay
>>> # Create a 4-tap delay
>>> delay = MultiTapsDelay(sample_rate=44100, num_taps=4)
>>> # Process with rhythmic pattern
>>> params = {
...     '0_tap_delays_ms': 125.0,  # Sixteenth note at 120 BPM
...     '0_tap_gains': 0.8,
...     '1_tap_delays_ms': 250.0,  # Eighth note
...     '1_tap_gains': 0.6,
...     '2_tap_delays_ms': 375.0,  # Dotted eighth
...     '2_tap_gains': 0.4,
...     '3_tap_delays_ms': 500.0,  # Quarter note
...     '3_tap_gains': 0.2,
...     'mix': 0.5
... }
>>> output = delay(input_audio, dsp_params=params)
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # Simple parameter prediction
>>> class MultiTapController(nn.Module):
...     def __init__(self, input_size, num_taps):
...         super().__init__()
...         num_params = 2 * num_taps + 1  # delays, gains, and mix
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, num_params),
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = MultiTapController(input_size=16, num_taps=4)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)

A multi-tap delay effect that creates multiple delayed copies (taps) of the input signal, each with independent timing and level controls. This allows for creation of complex rhythmic patterns by precisely controlling the timing and gain of each echo.

__init__(sample_rate, param_range=None, num_taps=4)[source]

Initialize the multi-tap delay processor.

Parameters:
  • sample_rate – Audio sample rate in Hz

  • param_range – Optional parameter definitions to override defaults

  • num_taps – Number of independent delay taps. Defaults to 4.

_register_default_parameters()[source]

Register parameters for all taps and mix.

Creates parameters for each tap:
  • i_tap_delays_ms: Delay time (50.0 to 500.0 ms)

  • i_tap_gains: Tap gain (0.0 to 1.0)

Plus overall mix parameter.

Total parameters = 2 * num_taps + 1
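
The naming scheme and the parameter count can be checked with a few lines. The key names follow the pattern used in the examples for this class; the ordering of the list and the helper name `multi_tap_param_names` are assumptions of this sketch.

```python
def multi_tap_param_names(num_taps):
    """List the parameter keys for a multi-tap delay with `num_taps` taps."""
    names = []
    for i in range(num_taps):
        names.append(f"{i}_tap_delays_ms")  # delay time of tap i
        names.append(f"{i}_tap_gains")      # gain of tap i
    names.append("mix")                     # single overall wet/dry mix
    return names

names = multi_tap_param_names(4)
total = len(names)  # 2 * num_taps + 1
```

A controller network for this processor therefore needs `2 * num_taps + 1` outputs, matching the `num_params` computed in the neural network example.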

process(x, norm_params=None, dsp_params=None)[source]

Process input signal through the multi-tap delay.

Parameters:
  • x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)

  • norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1). Must contain the following keys:
    • '{i}_tap_delays_ms': Delay time for each tap, normalized to 0 to 1 over the tap delay range

    • '{i}_tap_gains': Gain for each tap (0 to 1)

    • 'mix': Wet/dry balance (0 to 1)

    Each value should be a tensor of shape (batch_size,).

  • dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters. Each multi-tap parameter can be given as:
    • float/int: Single value applied to entire batch

    • 0D tensor: Single value applied to entire batch

    • 1D tensor: Batch of values matching input batch size

    Parameters are automatically expanded to match the batch size and moved to the input device if necessary. If provided, norm_params must be None.

Returns:

Processed audio tensor of same shape as input

Return type:

torch.Tensor