Class ScaleTempoAudioEffect

Namespace
VisioForge.Core.Types.X.AudioEffects
Assembly
VisioForge.Core.dll

The scale tempo audio effect changes the playback speed (tempo) of audio without significantly affecting pitch. It uses the GStreamer 'scaletempo' element, which implements the WSOLA (Waveform Similarity Overlap-Add) algorithm. It is suited to time-stretching audio and to slow-motion or fast-forward effects that preserve pitch.

public class ScaleTempoAudioEffect : BaseAudioEffect, ISharedAudioEffectX, IVideoEditXAudioEffect

Remarks

GStreamer element: scaletempo

Properties:

  • rate: Playback speed multiplier (0.5 = half speed, 2.0 = double speed)
  • stride: Length of audio segments processed (in milliseconds)
  • overlap: Percentage of segment overlap for smooth transitions
  • search: Time window for finding best overlap position (in milliseconds)

WSOLA Algorithm:

  1. Divides audio into overlapping segments (stride length)
  2. Searches for best position to overlap segments (search window)
  3. Cross-fades overlapping portions (overlap percentage)
  4. Reconstructs audio at new tempo
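
To make the segment sizes in the steps above concrete, the following sketch converts the time-based settings into sample counts for one channel. `SegmentMath` and its member names are hypothetical helpers for illustration, not part of the VisioForge or GStreamer API, and the rounding is an assumption. The sketch also assumes that the input position advances by stride × rate for every stride of output, which is how the tempo change arises.

```csharp
using System;

// Hypothetical helper for illustration; not part of the VisioForge API.
public static class SegmentMath
{
    // Converts time-based settings into per-channel sample counts.
    public static (int StrideSamples, int OverlapSamples, int SearchSamples, double InputHopSamples)
        Compute(int sampleRate, TimeSpan stride, double overlap, TimeSpan search, double rate)
    {
        // Samples written to the output for each segment.
        int strideSamples = (int)Math.Round(stride.TotalMilliseconds * sampleRate / 1000.0);

        // Samples cross-faded between adjacent segments (overlap is a fraction of the stride).
        int overlapSamples = (int)Math.Round(strideSamples * overlap);

        // Samples scanned when looking for the best overlap position.
        int searchSamples = (int)Math.Round(search.TotalMilliseconds * sampleRate / 1000.0);

        // Assumption: the input advances by stride * rate per output stride.
        double inputHopSamples = strideSamples * rate;

        return (strideSamples, overlapSamples, searchSamples, inputHopSamples);
    }
}
```

With the default settings (30 ms stride, 0.2 overlap, 14 ms search) at 48 kHz and a rate of 1.5, this gives 1440 stride samples, 288 overlap samples, 672 search samples, and an input hop of 2160 samples.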

Quality vs Efficiency trade-offs:

  • Larger stride = faster processing, but more artifacts
  • Larger search = better quality, but slower processing
  • More overlap = smoother, but more CPU intensive

Recommended settings:

  • Speech (podcasts, audiobooks): stride=30-50ms, search=14ms
  • Music: stride=15-30ms, search=20-30ms for better quality
  • Real-time: smaller stride and search for lower latency
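
The recommendations above can be collected into presets. This is a minimal sketch: `ContentKind` and `TempoPresets` are hypothetical names, and the specific values are picked from within the ranges given on this page rather than prescribed by the library.

```csharp
using System;

// Hypothetical types for illustration; not part of the VisioForge API.
public enum ContentKind { Speech, Music, RealTime }

public static class TempoPresets
{
    // Returns one stride/search/overlap combination per content kind.
    public static (TimeSpan Stride, TimeSpan Search, double Overlap) For(ContentKind kind)
    {
        switch (kind)
        {
            case ContentKind.Music:
                // Shorter stride, wider search, more overlap for harmonic material.
                return (TimeSpan.FromMilliseconds(20), TimeSpan.FromMilliseconds(25), 0.3);
            case ContentKind.RealTime:
                // Smaller stride and search to keep latency and CPU load low.
                return (TimeSpan.FromMilliseconds(20), TimeSpan.FromMilliseconds(8), 0.15);
            default:
                // Speech: stride in the 30-50 ms range with the default search and overlap.
                return (TimeSpan.FromMilliseconds(40), TimeSpan.FromMilliseconds(14), 0.2);
        }
    }
}

// Applying a preset, based on the constructor and properties documented on this page:
//   var p = TempoPresets.For(ContentKind.Music);
//   var effect = new ScaleTempoAudioEffect(1.25)
//   {
//       Stride = p.Stride, Search = p.Search, Overlap = p.Overlap
//   };
```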

Constructors

ScaleTempoAudioEffect(double)

Initializes a new instance of the VisioForge.Core.Types.X.AudioEffects.ScaleTempoAudioEffect class.

public ScaleTempoAudioEffect(double rate = 1)

Parameters

rate double

The tempo scale rate. 1.0 = original speed, 0.5 = half speed, 2.0 = double speed. Typical range: 0.5 to 2.0. Values outside this range may have reduced quality.
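
Because the rate is a speed multiplier, the expected output length is the input length divided by the rate. The sketch below shows that arithmetic; `TempoMath.OutputDuration` is a hypothetical helper, not a member of this class.

```csharp
using System;

// Hypothetical helper for illustration; not part of the VisioForge API.
public static class TempoMath
{
    // Expected duration after time-stretching: input duration divided by the rate.
    public static TimeSpan OutputDuration(TimeSpan input, double rate)
    {
        if (double.IsNaN(rate) || rate <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(rate), "Rate must be positive.");
        }

        return TimeSpan.FromTicks((long)Math.Round(input.Ticks / rate));
    }
}
```

For example, a 10-minute clip is expected to last 8 minutes at rate 1.25 and 20 minutes at rate 0.5.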

Properties

Overlap

Gets or sets the fraction of the stride length that overlaps between adjacent segments. Higher overlap creates smoother transitions but requires more processing. Range: 0.0 (no overlap) to 0.9 (90% overlap). Default: 0.2 (20%).

Typical values:

  • 0.1-0.15: Minimal overlap, efficient but may have artifacts
  • 0.2-0.3: Balanced quality and performance (default: 0.2)
  • 0.4-0.6: High overlap, smoother but more CPU intensive

Increase for:

  • More complex material (music with harmonics)
  • Higher quality requirements

Decrease for:

  • Real-time processing
  • Simple material (speech, single instruments)

public double Overlap { get; set; }

Property Value

double
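
The overlapping region is where adjacent segments are cross-faded. The following sketch illustrates the idea with a linear fade, which is an assumption made for illustration; `CrossFade.Blend` is a hypothetical helper and does not reproduce the element's actual implementation.

```csharp
using System;

// Hypothetical helper for illustration; not part of the VisioForge API.
public static class CrossFade
{
    // Blends the tail of the previous segment into the head of the next one.
    // Both arrays hold the overlapping samples and must have the same length.
    public static float[] Blend(float[] previousTail, float[] nextHead)
    {
        if (previousTail.Length != nextHead.Length)
        {
            throw new ArgumentException("Overlap regions must have the same length.");
        }

        int n = previousTail.Length;
        var result = new float[n];
        for (int i = 0; i < n; i++)
        {
            // Weight of the next segment rises linearly from 0 towards 1.
            float t = (float)i / n;
            result[i] = previousTail[i] * (1f - t) + nextHead[i] * t;
        }

        return result;
    }
}
```

A longer overlap region means more samples pass through this loop per segment, which is consistent with the higher CPU cost described above.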

Rate

Gets or sets the playback rate (tempo multiplier). Controls the speed of audio playback while attempting to preserve pitch.

  • Values < 1.0: Slower playback (0.5 = half speed)
  • Value = 1.0: Normal speed (no change)
  • Values > 1.0: Faster playback (2.0 = double speed)

Typical ranges:

  • 0.5-0.8: Slowed down (better comprehension, detail analysis)
  • 1.2-1.5: Sped up (faster listening, time-saving)
  • 1.5-2.0: Very fast (may reduce comprehension)

Quality decreases at extreme values (<0.4 or >3.0). For music, smaller changes (0.8-1.25) work best.

public double Rate { get; set; }

Property Value

double
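
The ranges above can be turned into a simple pre-flight check before a rate is applied. This is a sketch only: `RateQuality` and `RateAdvice` are hypothetical names, and the thresholds are the ones quoted on this page (0.5 to 2.0 typical, 0.8 to 1.25 for music, below 0.4 or above 3.0 extreme).

```csharp
using System;

// Hypothetical types for illustration; not part of the VisioForge API.
public enum RateQuality { Good, Reduced, Extreme }

public static class RateAdvice
{
    // Classifies a rate against the ranges documented for this effect.
    public static RateQuality Classify(double rate, bool isMusic)
    {
        if (double.IsNaN(rate) || rate <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(rate), "Rate must be positive.");
        }

        // Quality decreases noticeably at extreme values.
        if (rate < 0.4 || rate > 3.0)
        {
            return RateQuality.Extreme;
        }

        // Music tolerates a narrower range than speech.
        double low = isMusic ? 0.8 : 0.5;
        double high = isMusic ? 1.25 : 2.0;
        return (rate >= low && rate <= high) ? RateQuality.Good : RateQuality.Reduced;
    }
}
```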

Search

Gets or sets the length of the search window for finding the best overlap position. Larger search windows find better matches but require more processing time. Default: 14 milliseconds.

Typical values:

  • 5-10 ms: Fast, lower quality (real-time applications)
  • 14-20 ms: Balanced (default: 14ms for speech)
  • 20-40 ms: High quality (music, offline processing)

Increase for:

  • Music or harmonic content
  • When quality is priority over speed

Decrease for:

  • Real-time or low-latency requirements
  • Simple material like speech

public TimeSpan Search { get; set; }

Property Value

TimeSpan
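
To show why a larger search window costs more processing, the sketch below scans candidate offsets and keeps the one whose samples correlate best with the end of the previous segment. `OverlapSearch.BestOffset` is a hypothetical helper, and plain cross-correlation is an assumed similarity measure; the element's actual search may differ in detail.

```csharp
using System;

// Hypothetical helper for illustration; not part of the VisioForge API.
public static class OverlapSearch
{
    // Returns the offset in [0, searchSamples] at which `input` best matches `reference`.
    public static int BestOffset(float[] reference, float[] input, int searchSamples)
    {
        if (input.Length < reference.Length)
        {
            throw new ArgumentException("Input must be at least as long as the reference.");
        }

        // Never read past the end of the input buffer.
        int maxOffset = Math.Min(searchSamples, input.Length - reference.Length);
        int best = 0;
        double bestScore = double.NegativeInfinity;

        for (int offset = 0; offset <= maxOffset; offset++)
        {
            // Cross-correlation between the reference and the candidate window.
            double score = 0;
            for (int i = 0; i < reference.Length; i++)
            {
                score += reference[i] * input[offset + i];
            }

            if (score > bestScore)
            {
                bestScore = score;
                best = offset;
            }
        }

        return best;
    }
}
```

The outer loop runs once per candidate offset, so doubling the search window roughly doubles the work done for each segment.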

Stride

Gets or sets the length of each audio segment (stride) to be processed. Determines the granularity of the time-stretching operation. Default: 30 milliseconds.

Typical values:

  • 15-25 ms: Fine-grained, better for music
  • 30-40 ms: Balanced (default: 30ms for speech)
  • 50-100 ms: Coarse, efficient but may have noticeable artifacts

Shorter strides:

  • Better quality for complex material
  • More CPU intensive
  • Better for music

Longer strides:

  • More efficient processing
  • Adequate for speech
  • May produce audible artifacts in music

Note: Stride length affects perceived quality more than search or overlap.

public TimeSpan Stride { get; set; }

Property Value

TimeSpan

Methods

GenerateDescription()

Generates the description.

public string GenerateDescription()

Returns

string

System.String.