Differentiable implementation of a single-tap delay line.
This processor implements a basic digital delay line using frequency-domain processing
for precise, artifact-free time delays. It creates a single echo of the input signal
with controllable delay time and mix level.
The delay is implemented in the frequency domain using the time-shift property:
\[Y(\omega) = X(\omega)e^{-j\omega\tau}\]
where:
X(ω) is the input spectrum
Y(ω) is the delayed spectrum
τ is the delay time in seconds
Phase is unwrapped to ensure continuous delay response
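The time-shift property above can be sketched in a few lines of PyTorch. This is a minimal illustration rather than the processor's actual code: the helper name `fft_delay`, the amount of zero-padding, and the scalar `delay_s` argument are assumptions made for the example.

```python
import math
import torch


def fft_delay(x: torch.Tensor, delay_s: float, sample_rate: int) -> torch.Tensor:
    """Delay the last axis of x by delay_s seconds using Y(w) = X(w) * exp(-j*w*tau)."""
    n = x.shape[-1]
    # Zero-pad by the delay length so the shifted signal does not wrap around
    # (the padding amount is an assumption of this sketch).
    n_fft = n + math.ceil(delay_s * sample_rate)
    X = torch.fft.rfft(x, n=n_fft)
    # Angular frequency of each FFT bin in rad/s
    omega = 2 * math.pi * torch.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    Y = X * torch.exp(-1j * omega * delay_s)
    # Back to the time domain, trimmed to the input length
    return torch.fft.irfft(Y, n=n_fft)[..., :n]


# A unit impulse delayed by 10 ms at 1 kHz moves from sample 0 to sample 10
x = torch.zeros(1, 1, 64)
x[..., 0] = 1.0
y = fft_delay(x, delay_s=0.01, sample_rate=1000)
```

Because the phase term accepts any real `delay_s`, the same expression also covers delays that fall between two samples.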
delay_ms: Delay time in milliseconds
Controls the time offset between the original and delayed signals
Minimum value ensures stable processing
Maximum value is set for practical buffer sizes
mix: Wet/dry mix ratio
Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only delayed signal
Linear crossfade between original and delayed signals
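The linear crossfade described for `mix` can be written as a single broadcasted expression. The helper name `wet_dry_mix` and the reshaping of `mix` to `(batch, 1, 1)` are assumptions of this sketch.

```python
import torch


def wet_dry_mix(dry: torch.Tensor, wet: torch.Tensor, mix: torch.Tensor) -> torch.Tensor:
    """Linear crossfade: mix=0 returns dry, mix=1 returns wet."""
    # mix has shape (batch,); reshape so it broadcasts over channels and samples
    mix = mix.view(-1, 1, 1)
    return (1.0 - mix) * dry + mix * wet


dry = torch.ones(3, 1, 4)
wet = torch.zeros(3, 1, 4)
out = wet_dry_mix(dry, wet, torch.tensor([0.0, 0.5, 1.0]))
```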
Note
Uses FFT-based delay for precise time shifting
Phase unwrapping prevents discontinuities in delay
Automatic padding handles all delay times
Particularly effective for:
Creating simple echoes
Adding space to dry signals
Basic time-based effects
Examples
Basic DSP Usage:
>>> # Create a basic delay
>>> delay = BasicDelay(sample_rate=44100)
>>> # Process audio
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 500.0,  # Half-second delay
...     'mix': 0.5  # Equal mix of dry and wet
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # 1. Simple parameter prediction
>>> class DelayController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 2),  # 2 parameters: delay and mix
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = DelayController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)
A simple delay line that creates a single echo of the input signal after a specified time delay.
It provides control over the delay time and mix ratio between the dry (original) and wet (delayed)
signals, offering basic time-based effects without feedback.
Parameters:
x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1)
Must contain the following keys:
- 'delay_ms': Normalized delay time (0 to 1), mapped to the delay range in milliseconds
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters.
Can specify delay parameters as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters will be automatically expanded to match batch size
and moved to input device if necessary.
If provided, norm_params must be None.
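The parameter description above asks for a dictionary of per-key tensors, whereas a controller network typically emits one `(batch_size, num_params)` tensor. One way to bridge the two is sketched below. The helper `split_norm_params` is hypothetical, and the column order (`delay_ms` first, then `mix`) is an assumption, since this section does not specify it.

```python
import torch


def split_norm_params(raw: torch.Tensor, keys: list) -> dict:
    """Map each column of a (batch, len(keys)) tensor to its parameter name."""
    if raw.dim() != 2 or raw.shape[1] != len(keys):
        raise ValueError(f"expected shape (batch, {len(keys)}), got {tuple(raw.shape)}")
    # Each value ends up with shape (batch,), as the parameter docs require
    return {key: raw[:, i] for i, key in enumerate(keys)}


# Stand-in for a controller output that is already squashed to [0, 1]
raw = torch.sigmoid(torch.randn(8, 2))
norm_params = split_norm_params(raw, ["delay_ms", "mix"])
```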
Differentiable implementation of a feedback delay line.
This processor implements a delay line with feedback and feedforward paths, creating
multiple decaying echoes. The implementation uses frequency-domain processing and
a feedback-feedforward structure for flexible echo patterns.
>>> # Create a feedback delay
>>> delay = BasicFeedbackDelay(sample_rate=44100)
>>> # Process with rhythmic echoes
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 250.0,  # Eighth note at 120 BPM
...     'mix': 0.5,  # Equal mix
...     'fb_gain': 0.7,  # Moderate feedback
...     'ff_gain': 0.8  # Strong initial echo
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # 1. Simple parameter prediction
>>> class FeedbackDelayController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 4),  # 4 parameters
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = FeedbackDelayController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)
A delay effect that includes a feedback path, allowing the delayed signal to be fed back into
the input. This creates multiple, gradually decaying echoes. Features controls for delay time,
feedback amount, and wet/dry mix, enabling creation of everything from subtle space to rhythmic
echo patterns.
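A feedback-feedforward delay can be evaluated in the frequency domain as a closed-form comb filter. The sketch below assumes the transfer function H(ω) = ff_gain · e^{-jωτ} / (1 - fb_gain · e^{-jωτ}), which is one common structure and not necessarily the exact one this processor uses. The function name, the FFT length of twice the input, and the wet-only output are likewise assumptions.

```python
import math
import torch


def feedback_delay(x, delay_s, fb_gain, ff_gain, sample_rate):
    """Wet signal of a feedback comb filter, computed with one FFT round trip."""
    n = x.shape[-1]
    # The echo tail is infinite, so padding only limits wrap-around; 2*n is an assumption
    n_fft = 2 * n
    X = torch.fft.rfft(x, n=n_fft)
    omega = 2 * math.pi * torch.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    D = torch.exp(-1j * omega * delay_s)  # pure delay of delay_s seconds
    # Requires |fb_gain| < 1 so the denominator never reaches zero
    H = ff_gain * D / (1.0 - fb_gain * D)
    return torch.fft.irfft(X * H, n=n_fft)[..., :n]


# Impulse response: echoes at 10, 20, 30 samples with gains 1, 0.5, 0.25
x = torch.zeros(1, 1, 512)
x[..., 0] = 1.0
y = feedback_delay(x, delay_s=0.01, fb_gain=0.5, ff_gain=1.0, sample_rate=1000)
```

Each successive echo is scaled by `fb_gain`, which is why values below 1 give the gradually decaying repeats described above.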
Process input signal through the feedback delay line.
Parameters:
x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1)
Must contain the following keys:
- 'delay_ms': Normalized base delay time (0 to 1), mapped to the delay range in milliseconds
- 'fb_gain': Amount of signal fed back through the delay line (0 to 1)
- 'ff_gain': Feedforward gain (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters.
Can specify feedback delay parameters as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters will be automatically expanded to match batch size
and moved to input device if necessary.
If provided, norm_params must be None.
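The expansion rules for `dsp_params` values (float, 0D tensor, or 1D tensor) can be illustrated with a small helper. This is a sketch of the documented behavior, not the library's internal routine; the name `expand_param` and the float32 dtype are assumptions.

```python
import torch


def expand_param(value, batch_size: int, device) -> torch.Tensor:
    """Return a (batch_size,) tensor on `device` from a float, 0D or 1D input."""
    t = torch.as_tensor(value, dtype=torch.float32, device=device)
    if t.dim() == 0:
        # Single value: repeat it for every item in the batch
        return t.expand(batch_size)
    if t.dim() == 1 and t.shape[0] == batch_size:
        # Already one value per batch item
        return t
    raise ValueError(f"cannot expand shape {tuple(t.shape)} to batch size {batch_size}")


from_float = expand_param(250.0, 4, "cpu")
from_0d = expand_param(torch.tensor(0.7), 4, "cpu")
from_1d = expand_param(torch.tensor([0.1, 0.2, 0.3, 0.4]), 4, "cpu")
```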
Differentiable implementation of a slapback delay effect.
This processor extends BasicDelay to create a specialized short delay effect
that emulates the distinctive “doubling” sound popularized in 1950s recordings.
The delay time range is specifically restricted to create the characteristic
slapback effect.
The processor uses the same frequency-domain implementation as BasicDelay:
\[Y(\omega) = X(\omega)e^{-j\omega\tau}\]
where τ is restricted to 40-120ms for the slapback effect.
Delay Time Ranges:
40-80ms: Tight doubling effect
80-120ms: Subtle ambience
These ranges are chosen based on psychoacoustic research
and historical usage in classic recordings.
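To show how a normalized control value might land in these ranges, the sketch below maps [0, 1] onto 40-120 ms. The linear mapping and the helper name `denormalize` are assumptions; this section does not state how the processor performs the conversion.

```python
import torch


def denormalize(norm: torch.Tensor, low: float, high: float) -> torch.Tensor:
    """Linearly map values in [0, 1] to [low, high]."""
    return low + norm * (high - low)


# 0.0 maps to 40 ms, 0.25 to 60 ms (tight doubling), 1.0 to 120 ms
delay_ms = denormalize(torch.tensor([0.0, 0.25, 1.0]), 40.0, 120.0)
```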
Only modifies parameter ranges for specialized use
Particularly effective on:
Vocals (creates natural doubling)
Electric guitar (adds depth)
Snare drums (enhances attack)
No feedback to maintain clarity of effect
Examples
Basic DSP Usage:
>>> # Create a slapback delay
>>> delay = SlapbackDelay(sample_rate=44100)
>>> # Process with classic settings
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 60.0,  # Tight doubling effect
...     'mix': 0.4  # Subtle enhancement
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # 1. Simple parameter prediction
>>> class SlapbackController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 2),  # 2 parameters: delay and mix
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = SlapbackController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)
A specialized short delay effect that emulates the distinctive “doubling” sound of vintage tape
delays. Uses very short delay times (40-120 ms) with no feedback, creating
a tight, distinctive echo that was popular in early rock and roll recordings.
Differentiable implementation of a stereo ping-pong delay effect.
This processor implements a stereo delay effect where echoes alternate between
left and right channels, creating a “ping-pong” spatial pattern. The implementation
uses a cross-coupled feedback structure in the frequency domain for precise timing
and smooth transitions.
The system is described by coupled transfer functions:
>>> # Create a ping-pong delay
>>> delay = PingPongDelay(sample_rate=44100)
>>> # Process with rhythmic spatial echoes
>>> output = delay(input_audio, dsp_params={
...     'delay_ms': 250.0,  # Eighth note at 120 BPM
...     'feedback_ch1': 0.7,  # Left to right decay
...     'feedback_ch2': 0.7,  # Right to left decay
...     'mix': 0.5  # Equal mix of dry and wet
... })
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # 1. Simple parameter prediction
>>> class PingPongController(nn.Module):
...     def __init__(self, input_size):
...         super().__init__()
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, 4),  # 4 parameters
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = PingPongController(input_size=16)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)
A stereo delay effect where the echoes alternate between left and right channels, creating
a “ping-pong” effect across the stereo field. Each echo bounces from one channel to the other
while decreasing in amplitude, producing a wide spatial effect with rhythmic possibilities.
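The bouncing behavior can be reproduced with a cross-coupled pair of delays solved in closed form. The topology below is an assumption made for illustration (each channel's delay is fed by its own input plus the other channel's output); the processor's actual coupled transfer functions are not given in this section. The function name and the FFT length are also hypothetical, and only the wet signal is returned.

```python
import math
import torch


def ping_pong_delay(x, delay_s, fb_ch1, fb_ch2, sample_rate):
    """Cross-coupled stereo delay for x of shape (batch, 2, samples)."""
    n = x.shape[-1]
    n_fft = 2 * n  # padding length is an assumption; it only limits wrap-around
    X = torch.fft.rfft(x, n=n_fft)
    XL, XR = X[:, 0], X[:, 1]
    omega = 2 * math.pi * torch.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    D = torch.exp(-1j * omega * delay_s)
    # Solving yL = D*(xL + fb_ch2*yR) and yR = D*(xR + fb_ch1*yL) for yL, yR
    den = 1.0 - fb_ch1 * fb_ch2 * D * D
    YL = (D * XL + fb_ch2 * D * D * XR) / den
    YR = (D * XR + fb_ch1 * D * D * XL) / den
    Y = torch.stack([YL, YR], dim=1)
    return torch.fft.irfft(Y, n=n_fft)[..., :n]


# Impulse on the left channel only: echoes alternate left, right, left, right
x = torch.zeros(1, 2, 512)
x[0, 0, 0] = 1.0
y = ping_pong_delay(x, delay_s=0.01, fb_ch1=0.5, fb_ch2=0.8, sample_rate=1000)
```

With these settings the left channel carries echoes at 10 and 30 samples and the right channel at 20 and 40 samples, each one quieter than the last.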
Parameters:
x (torch.Tensor) – Input audio tensor. Shape: (batch, 2, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1)
Must contain the following keys:
- 'delay_ms': Normalized base delay time (0 to 1), mapped to the delay range in milliseconds
- 'feedback_ch1': Left channel feedback (0 to 1)
- 'feedback_ch2': Right channel feedback (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters.
Can specify ping-pong parameters as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters will be automatically expanded to match batch size
and moved to input device if necessary.
If provided, norm_params must be None.
Returns:
Processed stereo audio tensor of same shape as input. Shape: (batch, 2, samples)
Differentiable implementation of a multi-tap delay effect.
This processor implements a parallel delay structure with multiple taps, where each tap
represents an independent echo with its own delay time and gain. The implementation uses
frequency-domain processing for precise timing control and efficient computation.
The transfer function is a sum of delayed signals:
For each tap i (where i ranges from 0 to num_taps-1):
i_tap_delays_ms: Delay time for tap i
Range: 50.0 to 500.0 milliseconds
Controls timing of each echo
Independent control per tap
Can create complex rhythmic patterns
i_tap_gains: Gain for tap i
Range: 0.0 to 1.0
Controls amplitude of each echo
Allows creation of complex patterns
Can be used for amplitude envelopes
mix: Overall wet/dry mix ratio
Range: 0.0 to 1.0
0.0: Only original signal
1.0: Only processed signal
Controls overall effect intensity
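A parallel multi-tap structure amounts to summing one phase-shifted, scaled copy of the spectrum per tap. The sketch below assumes the wet transfer function H(ω) = Σ gᵢ · e^{-jωτᵢ} with τᵢ in seconds; the function name, list-based arguments, and padding choice are assumptions, and the wet/dry mix is left out.

```python
import math
import torch


def multi_tap_delay(x, delays_ms, gains, sample_rate):
    """Sum of delayed, scaled copies of x (wet signal only)."""
    n = x.shape[-1]
    # Pad by the longest tap so no echo wraps around (assumed padding rule)
    n_fft = n + math.ceil(max(delays_ms) / 1000.0 * sample_rate)
    X = torch.fft.rfft(x, n=n_fft)
    omega = 2 * math.pi * torch.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    H = torch.zeros_like(omega, dtype=torch.complex64)
    for d_ms, g in zip(delays_ms, gains):
        # Each tap contributes g * exp(-j * omega * tau)
        H = H + g * torch.exp(-1j * omega * (d_ms / 1000.0))
    return torch.fft.irfft(X * H, n=n_fft)[..., :n]


# Two taps at 50 ms and 100 ms, evaluated at 1 kHz for easy sample counting
x = torch.zeros(1, 1, 256)
x[..., 0] = 1.0
y = multi_tap_delay(x, delays_ms=[50.0, 100.0], gains=[0.8, 0.6], sample_rate=1000)
```

There is no feedback term in this sum, so the output stays bounded for any choice of delays and gains.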
Note
Uses FFT-based delay for precise time shifting
Phase unwrapping prevents discontinuities
Automatic padding handles all delay times
Particularly effective for:
Creating complex rhythmic patterns
Building custom echo sequences
Designing unique delay textures
Each tap can be independently controlled
System is stable for all parameter values
Examples
Basic DSP Usage:
>>> # Create a 4-tap delay
>>> delay = MultiTapsDelay(sample_rate=44100, num_taps=4)
>>> # Process with rhythmic pattern
>>> params = {
...     '0_tap_delays_ms': 125.0,  # Sixteenth note at 120 BPM
...     '0_tap_gains': 0.8,
...     '1_tap_delays_ms': 250.0,  # Eighth note
...     '1_tap_gains': 0.6,
...     '2_tap_delays_ms': 375.0,  # Dotted eighth
...     '2_tap_gains': 0.4,
...     '3_tap_delays_ms': 500.0,  # Quarter note
...     '3_tap_gains': 0.2,
...     'mix': 0.5
... }
>>> output = delay(input_audio, dsp_params=params)
Neural Network Control:
>>> import torch
>>> import torch.nn as nn
>>> # 1. Simple parameter prediction
>>> class MultiTapController(nn.Module):
...     def __init__(self, input_size, num_taps):
...         super().__init__()
...         num_params = 2 * num_taps + 1  # delays, gains, and mix
...         self.net = nn.Sequential(
...             nn.Linear(input_size, 32),
...             nn.ReLU(),
...             nn.Linear(32, num_params),
...             nn.Sigmoid()  # Ensures output is in [0,1] range
...         )
...
...     def forward(self, x):
...         return self.net(x)
>>>
>>> # Process with features
>>> controller = MultiTapController(input_size=16, num_taps=4)
>>> features = torch.randn(batch_size, 16)
>>> norm_params = controller(features)
>>> output = delay(input_audio, norm_params=norm_params)
A delay effect that creates multiple delayed copies (taps) of the input signal, each
with independent timing and level controls. This allows creation of complex
rhythmic patterns by precisely controlling the timing and gain of
each echo.
Parameters:
x (torch.Tensor) – Input audio tensor. Shape: (batch, channels, samples)
norm_params (Dict[str, torch.Tensor]) – Normalized parameters (0 to 1)
Must contain the following keys:
- '{i}_tap_delays_ms': Normalized delay time for tap i (0 to 1), mapped to the 50.0 to 500.0 millisecond range
- '{i}_tap_gains': Gain for tap i (0 to 1)
- 'mix': Wet/dry balance (0 to 1)
Each value should be a tensor of shape (batch_size,)
dsp_params (Dict[str, Union[float, torch.Tensor]], optional) – Direct DSP parameters.
Can specify multi-tap parameters as:
- float/int: Single value applied to entire batch
- 0D tensor: Single value applied to entire batch
- 1D tensor: Batch of values matching input batch size
Parameters will be automatically expanded to match batch size
and moved to input device if necessary.
If provided, norm_params must be None.
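Since the multi-tap keys follow the `{i}_tap_delays_ms` / `{i}_tap_gains` pattern, a `(batch_size, 2 * num_taps + 1)` controller output can be unpacked with a short loop. The helper below is hypothetical, and the column layout (all delays, then all gains, then mix) is an assumption borrowed from the comment in the controller example, not a documented contract.

```python
import torch


def multitap_norm_params(raw: torch.Tensor, num_taps: int) -> dict:
    """Unpack a (batch, 2*num_taps + 1) tensor into per-tap parameter entries."""
    expected = 2 * num_taps + 1
    if raw.dim() != 2 or raw.shape[1] != expected:
        raise ValueError(f"expected shape (batch, {expected}), got {tuple(raw.shape)}")
    params = {}
    for i in range(num_taps):
        params[f"{i}_tap_delays_ms"] = raw[:, i]          # assumed: delays first
        params[f"{i}_tap_gains"] = raw[:, num_taps + i]   # assumed: gains second
    params["mix"] = raw[:, -1]                            # assumed: mix last
    return params


raw = torch.sigmoid(torch.randn(8, 9))  # stand-in for a 4-tap controller output
norm_params = multitap_norm_params(raw, num_taps=4)
```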