Electronic Studio Resources II: Assignment 3 — Spectral Processing

- Get familiar with a few common spectral processing techniques, using the SoundHack and SPEAR programs. We may also work with the ProVerb convolution reverb effect in DP.
- Produce some sounds that you could imagine using in the short piece you will compose for Assignment 4.

- At least two sound files from SoundHack (one phase vocoder, one
convolution), and one from SPEAR. You are welcome also to include a file
processed with ProVerb in DP. Put these files in a folder that has your
name and the assignment number as part of its name.
We expect that you will generate many more than three sound files in the process of finding three that seem good enough to submit. The success of these sound processing methods depends to a great extent on trial and error, as they are quite sensitive to the characteristics of the source material.

- Follow the assignment submission instructions to submit your assignment.

The French mathematician Jean-Baptiste Joseph Fourier (a contemporary of Beethoven)
discovered that any periodic waveform can be represented as the sum of one or
more harmonically related sine waves, each with a fixed frequency, phase, and
amplitude. He worked out the mathematics, later called the *Fourier
Transform*, for performing this decomposition — as well as its
inverse, for reconstructing a time-domain waveform from the analysis data.
Fourier’s discovery led in the twentieth century to sophisticated methods
of processing sound, using an extension of the Fourier Transform to cover
digital, rather than analog, systems. The DFT (Discrete Fourier Transform)
works with the discrete time intervals used in digital sampling systems. The
FFT (Fast Fourier Transform) is a computationally efficient version of the DFT.
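
As a rough illustration of this decomposition and its inverse, here is a minimal Python sketch using NumPy. The sample rate, frequencies, and amplitudes are arbitrary values chosen for the example (they are not part of the assignment), and the signal length is picked so that it holds a whole number of cycles, which keeps the analysis exact.

```python
import numpy as np

sample_rate = 8000            # samples per second (assumed for illustration)
n = 800                       # 0.1 s of audio: exactly 10 cycles of 100 Hz
t = np.arange(n) / sample_rate

# A periodic waveform: a 100 Hz fundamental plus its 2nd and 3rd harmonics.
amplitudes = {100: 1.0, 200: 0.5, 300: 0.25}
wave = sum(a * np.sin(2 * np.pi * f * t) for f, a in amplitudes.items())

spectrum = np.fft.rfft(wave)                    # forward transform (FFT)
freqs = np.fft.rfftfreq(n, d=1 / sample_rate)   # frequency of each bin in Hz
magnitudes = np.abs(spectrum) * 2 / n           # rescale to sine amplitudes

# Keep only the components that carry significant energy.
found = {
    int(round(f)): round(float(m), 3)
    for f, m in zip(freqs, magnitudes)
    if m > 0.01
}

rebuilt = np.fft.irfft(spectrum, n=n)           # inverse transform
max_error = float(np.max(np.abs(rebuilt - wave)))

print(found)
print(max_error)
```

The analysis should report only the three sine components that went into the waveform, and the inverse transform should reproduce the original samples to within rounding error.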

The FFT analyzes a time-domain waveform into a large number of *frequency
bands*. The bands are equally spread across the frequency spectrum, from
0 Hz to the Nyquist frequency. The number of bands is typically 512,
1024, or 2048. (The most common FFT algorithm requires that the number be a power of two.)
Each band represents a sine wave of a certain frequency, and the amplitudes of
these *frequency components* constitute the most important part of the
analysis. The FFT assumes that the sound you analyze is constant, not
changing. But that’s not the kind of sound we’re most interested
in, so FFT-based audio tools take many “snapshots” of the sound at
equal time intervals. Each snapshot is an FFT analysis at one point in time.
The results of the FFT analysis can be displayed in real time, giving you a
graphical view of the timbral evolution of a sound, snapshot by snapshot,
rather like a flip-book animation. FFT-based processing tools manipulate
the FFT analysis data for each snapshot before *resynthesizing* a modified
complex waveform. This resynthesis is the fun part.
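
The snapshot idea can be sketched in a few lines of Python with NumPy. This is only an illustration under assumed settings: the frame length, the test signal, and the helper name `snapshots` are all made up for the example, and the frames here are contiguous rather than overlapping.

```python
import numpy as np

sample_rate = 44100
fft_size = 1024                         # frame length in samples (a power of two)
bin_spacing = sample_rate / fft_size    # Hz between adjacent band centers

# Test signal: 440 Hz for the first half second, then 1760 Hz.
t = np.arange(sample_rate // 2) / sample_rate
signal = np.concatenate([np.sin(2 * np.pi * 440 * t),
                         np.sin(2 * np.pi * 1760 * t)])

def snapshots(x, size):
    """Cut x into contiguous frames and return each frame's magnitude spectrum."""
    n_frames = len(x) // size
    frames = x[:n_frames * size].reshape(n_frames, size)
    window = np.hanning(size)           # taper each frame to reduce leakage
    return np.abs(np.fft.rfft(frames * window, axis=1))

mags = snapshots(signal, fft_size)      # rows are snapshots, columns are bands
peak_freqs = mags.argmax(axis=1) * bin_spacing

print(mags.shape)
print(peak_freqs[0], peak_freqs[-1])
```

Each row of `mags` is one page of the flip book. The strongest band in the early snapshots sits near 440 Hz and in the late ones near 1760 Hz, but only to within one band width, which is the limitation the phase vocoder addresses.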

**SoundHack** is a program that lets you perform various sound processing
techniques that might not be available (at least with the same degree of
flexibility) in commercial programs like Digital Performer and Pro Tools.
SoundHack also lets you convert from one sound file format to another.

Download SoundHack for your own Mac here. (Sorry, no Windows version.)

A *phase vocoder* starts by taking a series of windowed FFT analysis
frames (the snapshots mentioned above). Recall that this divides the spectrum
into equal-sized frequency *bins* and shows the time-varying amplitudes
and phases of these bins. (A *spectrogram* is a common visualization of
this process.) The phase vocoder then processes the phase data to determine the
extent to which the actual frequency falling within a bin deviates from the
bin’s fixed frequency. This time-varying frequency deviation lets the
phase vocoder handle frequency more precisely than a simple FFT. Following
manipulation of the phase vocoder data (the amplitude and frequency deviation
for each bin), an inverse process resynthesizes a time-domain audio signal. The
phase vocoder is typically used to perform pitch-shifting without affecting
duration, or time-scaling without affecting pitch, but there are many other
processes, such as morphing and spectral retuning, that benefit from phase
vocoder analysis.
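
The frequency-deviation step can be illustrated with a small Python sketch using NumPy. It follows the textbook phase-difference calculation and is not a description of SoundHack's internals; the frame size, hop size, and test frequency are assumptions chosen for the example.

```python
import numpy as np

sample_rate = 44100
fft_size = 1024
hop = 256                      # distance between frame starts (4x overlap)
true_freq = 445.0              # deliberately falls between two bin centers

n = np.arange(fft_size + hop)
x = np.sin(2 * np.pi * true_freq * n / sample_rate)
window = np.hanning(fft_size)

# Two successive windowed analysis frames, one hop apart.
frame_a = np.fft.rfft(x[:fft_size] * window)
frame_b = np.fft.rfft(x[hop:hop + fft_size] * window)

k = int(np.argmax(np.abs(frame_a)))        # bin with the most energy
bin_freq = k * sample_rate / fft_size      # that bin's fixed center frequency

# Phase advance expected if the sound sat exactly on the bin center,
# compared with the phase advance actually measured between the frames.
expected = 2 * np.pi * k * hop / fft_size
measured = np.angle(frame_b[k]) - np.angle(frame_a[k])
deviation = (measured - expected + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)

# Convert the leftover phase into a frequency offset from the bin center.
estimated_freq = bin_freq + deviation * sample_rate / (2 * np.pi * hop)

print(bin_freq, estimated_freq)
```

The bin center alone is off by more than 10 Hz here, while the phase-corrected estimate lands very close to the actual 445 Hz.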

- Open a sound file in **SoundHack**. Choose the **Hack > Phase Vocoder** menu command.
- Set **Bands** to the number of FFT bands you want to use. A large number of bands yields better frequency resolution, while a small number of bands yields better time resolution.
- The **Window** popup menu allows you to choose different FFT window envelopes for different filtering characteristics. Stick with the **Hamming**, **von Hann**, or **Kaiser** window types.
- The **Overlap** popup menu sets the amount of overlap between successive FFT frames. 1x means the frames are contiguous; 2x means each frame starts in the middle of the previous frame. The less overlap, the more fluttery the output will sound.
- Click the **Time Scale** button for time scaling, or the **Pitch Scale** button for pitch scaling. Enter the scale factor next to the **Scaling** popup menu (above the Pitch Scale button). This popup menu lets you enter the desired length (for time scaling) or semitone transposition (for pitch scaling).

  If you want the time or pitch scaling factor to change during processing, click the **Scaling Function** check box, and then **Edit Function**. This brings up the **Function Window** editor, in which you can draw a time-varying scaling function.
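
To get a feel for the trade-off between frequency and time resolution, and for what the overlap setting implies, the arithmetic can be sketched in Python. The function name and the 44.1 kHz sample rate are assumptions for illustration, and `frame_size` here means the analysis frame length in samples; how SoundHack's **Bands** value maps onto that length is not specified above.

```python
def analysis_resolution(frame_size, overlap, sample_rate=44100):
    """Return rough resolution figures for one choice of analysis settings."""
    return {
        "bin_spacing_hz": sample_rate / frame_size,   # smaller = finer frequency detail
        "frame_ms": 1000 * frame_size / sample_rate,  # smaller = finer time detail
        "hop_samples": frame_size // overlap,         # distance between frame starts
    }

for frame_size in (512, 1024, 2048, 4096):
    info = analysis_resolution(frame_size, overlap=2)
    print(frame_size,
          round(info["bin_spacing_hz"], 1), "Hz per band,",
          round(info["frame_ms"], 1), "ms per frame,",
          info["hop_samples"], "samples between frames")
```

Doubling the frame length halves the spacing between bands but doubles the stretch of time that each snapshot smears together.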

*Convolution* is a process that multiplies the spectra of two sound files.
When this works well, you can achieve a blend between the two sounds that is
very different from simply mixing them together. For example, you might be
able to impose whispering onto the sound of the seashore. For this to work, at
least one of the sounds must have fairly broad-band energy (energy across the
frequency spectrum). If one sound has lots of low-frequency energy, and the
other has lots of high-frequency energy, then multiplying their spectra will
give you basically nothing. (Multiplying anything by zeros gives you zero.)
Convolution is typically implemented by taking FFTs of the two sounds,
multiplying corresponding bins (the amplitudes are multiplied and the phases
are added), and then taking the inverse FFT to return to a time-domain waveform.
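
The FFT route can be checked against direct time-domain convolution with a short Python sketch using NumPy. The helper name `fft_convolve` and the random test signals are made up for the example; this is a minimal version of the general technique, not SoundHack's implementation.

```python
import numpy as np

def fft_convolve(a, b):
    """Convolve two signals by multiplying their spectra."""
    n = len(a) + len(b) - 1                 # length of the full convolution
    size = 1 << (n - 1).bit_length()        # next power of two that holds it
    # rfft zero-pads each input to `size`, which avoids wrap-around.
    spectrum = np.fft.rfft(a, size) * np.fft.rfft(b, size)
    return np.fft.irfft(spectrum, size)[:n]

rng = np.random.default_rng(0)
a = rng.standard_normal(300)                # stand-in for the source sound
b = rng.standard_normal(50)                 # stand-in for the impulse sound

direct = np.convolve(a, b)                  # time-domain reference
fast = fft_convolve(a, b)
max_difference = float(np.max(np.abs(direct - fast)))

print(len(fast), max_difference)
```

The two results should agree to within rounding error. The spectra multiplied here are complex numbers, so both amplitude and phase take part in the product.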

Another use of convolution is to apply a spatial characteristic to a sound.
You can think of the *impulse response* of a space, such as a concert
hall, as the sound captured when you play a very brief broad-band sound in that
space, which then bounces off the walls and other surfaces, creating echoes and
reverberation. Convolution lets you impose this ambience onto another sound,
making it seem as if the sound is heard in that space. (This is what the DP
ProVerb plug-in does.) For more about this use of convolution, see
this article.
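
Here is a minimal Python sketch of the impulse response idea, using NumPy. The "room" is a made-up impulse response (half a second of exponentially decaying noise), not a measurement of a real space, and the low sample rate is chosen only to keep the example quick.

```python
import numpy as np

sample_rate = 8000
rng = np.random.default_rng(1)

# Hypothetical impulse response: 0.5 s of noise with an exponential decay.
ir_len = sample_rate // 2
decay = np.exp(-6.0 * np.arange(ir_len) / ir_len)
impulse_response = rng.standard_normal(ir_len) * decay

# A single click "played in the room" comes back as the impulse response itself.
click = np.zeros(1000)
click[0] = 1.0
wet_click = np.convolve(click, impulse_response)

# A short dry tone burst picks up a tail that outlasts the burst.
t = np.arange(sample_rate // 10) / sample_rate
burst = np.sin(2 * np.pi * 440 * t)
wet_burst = np.convolve(burst, impulse_response)

print(len(burst), len(wet_burst))
```

The processed burst is longer than the dry one by nearly the length of the impulse response, which is the reverberant tail.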

- Open a sound file in **SoundHack**. Choose **Hack > Convolution**. Click the **Pick Impulse** button to select an impulse sound file.

  If you’re trying to impose reverb, try using the Voxengo impulse response files (IU network or VPN only). Or check out this extensive list of impulse response files.
- Click **Process** to write the convolution result to a file. Remember that the success of the output sound depends on the spectral characteristics of the source and impulse sound files. Trial and error is an essential part of working with convolution.

For more help with SoundHack, see the online manual (IU network or VPN only).

SPEAR (Sinusoidal Partial Editing Analysis and Resynthesis) is an excellent graphical implementation of the McAulay-Quatieri (MQ) sound analysis algorithm, which decomposes a sound into many sinusoidal partials. You edit these partials using a variety of tools (time scaling, frequency or pitch shifting, amplitude scaling, or simply drawing new partials) and then resynthesize them into a new sound file.

MQ analysis is an example of a *tracking phase vocoder*. Instead of having
a fixed number of partials spanning the entire input file, tracking phase
vocoders construct a variable number of partial threads, whose lengths vary
from just a few frames to the entire file duration.
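
The idea of partial threads with different lengths can be illustrated with a heavily simplified Python sketch using NumPy. This is not the MQ algorithm as published, nor SPEAR's implementation: it only picks spectral peaks in each frame and links each one to the nearest peak in the next frame. The threshold, the maximum frequency jump, and the test signal are all assumptions chosen for the example.

```python
import numpy as np

sample_rate = 8000
frame_size = 512
hop = 256
bin_hz = sample_rate / frame_size
max_jump_hz = 2 * bin_hz            # farthest a partial may move between frames

# Test signal: 440 Hz for a full second, plus 1000 Hz during the first half only.
t = np.arange(sample_rate) / sample_rate
x = np.sin(2 * np.pi * 440 * t)
half = sample_rate // 2
x[:half] += 0.5 * np.sin(2 * np.pi * 1000 * t[:half])

def frame_peaks(frame, threshold=30.0):
    """Return the frequencies of local spectral maxima above a magnitude threshold."""
    mags = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    inner = mags[1:-1]
    is_peak = (inner > mags[:-2]) & (inner >= mags[2:]) & (inner > threshold)
    return (np.flatnonzero(is_peak) + 1) * bin_hz

tracks = []     # every partial found; each is a list of (frame_index, frequency)
active = []     # partials that were still alive in the previous frame
for i, start in enumerate(range(0, len(x) - frame_size + 1, hop)):
    peaks = list(frame_peaks(x[start:start + frame_size]))
    still_active = []
    for track in active:
        last = track[-1][1]
        match = min(peaks, key=lambda f: abs(f - last), default=None)
        if match is not None and abs(match - last) <= max_jump_hz:
            track.append((i, match))        # the partial continues
            peaks.remove(match)
            still_active.append(track)
        # otherwise the partial ends here
    for f in peaks:                         # unmatched peaks start new partials
        new_track = [(i, f)]
        tracks.append(new_track)
        still_active.append(new_track)
    active = still_active

summary = [(round(float(track[0][1]), 1), len(track)) for track in tracks]
print(summary)
```

The sketch should find two partials: one that spans every frame, and one that stops about halfway through, when its sine component ends.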

Download SPEAR for your own computer here.

- Open a sound file in **SPEAR**, using its **File > Open** command. (Just press the **Analyze** button in the window that appears.) You can play and stop the sound by pressing the space bar.
- Experiment with the **Controls** window to change pitch, speed, etc.
- Use the **Lasso Selection** tool to select part of the spectrum. You can play only this part by holding down the shift key while pressing the space bar.
- Then use the tools in the palette, or the commands in the **Transform** menu, to manipulate the selected part of the spectrum.
- Save the results to a new sound file. No, you don’t do this by using the **File > Save** command. That saves a spectral analysis file in the SDIF format. Instead, use **Sound > Synthesize to File**.