Electronic Studio Resources II: Assignment 3 — Spectral Processing

- Get familiar with a few common spectral processing techniques, using the SoundHack and SPEAR programs. We may also work with the ProVerb convolution reverb effect in DP.
- Produce some sounds that you could imagine using in the short piece you will compose for Assignment 4.

- At least two sound files from SoundHack (one phase vocoder, one convolution), and one from SPEAR. We may also look at ProVerb in DP, and you are welcome to include a file processed with that. Put these files in a folder that has your name and the assignment number as part of its name.
- Follow the assignment submission instructions to submit your assignment.

The French mathematician Jean-Baptiste Joseph Fourier (a contemporary of Beethoven)
discovered that any periodic waveform can be represented as the sum of one or
more harmonically related sine waves, each with a fixed frequency, phase, and
amplitude. He worked out the mathematics, later called the *Fourier
Transform*, for performing this decomposition — as well as its
inverse, for reconstructing a time-domain waveform from the analysis data.
Fourier’s discovery led in the twentieth century to sophisticated methods
of processing sound, using an extension of the Fourier Transform to cover
digital, rather than analog, systems. The DFT (Discrete Fourier Transform)
works with the discrete time intervals used in digital sampling systems. The
FFT (Fast Fourier Transform) is a computationally efficient version of the DFT.
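To make the decomposition concrete, here is a minimal sketch in plain Python. It builds one period of a waveform from three harmonically related sine waves, then uses a naive DFT to recover their amplitudes. The `harmonics` recipe and the 64-sample period are made-up illustration values, and the slow `dft` function stands in for a real FFT, which would give the same numbers faster.

```python
import cmath
import math

N = 64  # samples in exactly one period of the waveform (illustrative size)

# Hypothetical recipe: harmonic number -> (amplitude, phase in radians)
harmonics = {1: (1.0, 0.0), 2: (0.5, math.pi / 4), 3: (0.25, 0.0)}

# Build one period by summing harmonically related sine waves.
wave = [
    sum(a * math.sin(2 * math.pi * h * n / N + p) for h, (a, p) in harmonics.items())
    for n in range(N)
]


def dft(x):
    """Naive O(N^2) DFT; an FFT gives the same numbers, only faster."""
    size = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]


spectrum = dft(wave)

# For a real signal, the amplitude of harmonic k is 2 * |X[k]| / N.
recovered = {k: 2 * abs(spectrum[k]) / N for k in range(1, 5)}
for k, amp in recovered.items():
    print(f"harmonic {k}: amplitude {amp:.3f}")
```

The recovered amplitudes for harmonics 1 to 3 should match the recipe, and harmonic 4, which was never added, should come out as essentially zero.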

The FFT analyzes a time-domain waveform into a large number of *frequency
bands*. The bands are equally spread across the frequency spectrum, from
0 Hz to the Nyquist frequency. The number of bands is typically 512,
1024, or 2048. (The FFT algorithm requires that the number be a power of two.)
Each band represents a sine wave of a certain frequency, and the amplitudes of
these *frequency components* constitute the most important part of the
analysis. The FFT assumes that the sound you analyze is constant, not
changing. But that’s not the kind of sound we’re most interested
in, so FFT-based audio tools take many “snapshots” of the sound at
equal time intervals. Each snapshot is an FFT analysis at one point in time.
The results of the FFT analysis can be displayed in real time, giving you a
graphical view of the timbral evolution of a sound, snapshot by snapshot,
rather like a flip-book animation. FFT-based processing tools manipulate
the FFT analysis data for each snapshot before *resynthesizing* a modified
complex waveform. This resynthesis is the fun part.
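The trade-off hidden in the choice of FFT size can be worked out with a little arithmetic. The sketch below assumes a 44.1 kHz sampling rate and assumes that a frame of N samples gives N/2 + 1 equally spaced bands from 0 Hz to the Nyquist frequency. Whether a particular program's "number of bands" setting counts N or N/2 is not something this sketch settles. The function names are hypothetical.

```python
SAMPLE_RATE = 44100  # Hz; assumed CD-quality audio


def analysis_grid(frame_size, sample_rate=SAMPLE_RATE):
    """Return (band spacing in Hz, frame duration in ms) for one FFT frame."""
    spacing_hz = sample_rate / frame_size
    duration_ms = 1000 * frame_size / sample_rate
    return spacing_hz, duration_ms


def band_frequencies(frame_size, sample_rate=SAMPLE_RATE):
    """Center frequencies of the bands from 0 Hz up to the Nyquist frequency."""
    return [k * sample_rate / frame_size for k in range(frame_size // 2 + 1)]


for size in (512, 1024, 2048):
    spacing, duration = analysis_grid(size)
    print(f"{size:5d} samples: bands {spacing:6.2f} Hz apart, frame lasts {duration:5.1f} ms")
```

Doubling the frame size halves the spacing between bands but doubles the stretch of time that each snapshot smears together.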

**SoundHack** is a program that lets you perform various sound processing
techniques that might not be available (at least with the same degree of
flexibility) in commercial programs like Digital Performer and Pro Tools.
SoundHack also lets you convert from one sound file format to another.

Download SoundHack for your own Mac here. (Sorry, no Windows version.)

A *phase vocoder* starts by taking a series of windowed FFT analysis
frames (the snapshots mentioned above). Recall that this divides the spectrum
into equal-sized frequency *bins* and shows the time-varying amplitude of
these bins. (A *spectrogram* is a common visualization of this process.)
The phase vocoder then processes the analysis data to determine the extent to
which the actual frequency falling within a bin deviates from the bin’s
fixed frequency. This time-varying frequency deviation lets the phase vocoder
handle frequency more precisely than a simple FFT. Following manipulation of
the phase vocoder data (the amplitude and frequency deviation for each bin), an
inverse process resynthesizes a time-domain audio signal. The phase vocoder is
typically used to perform pitch-shifting without affecting duration, or
time-scaling without affecting pitch.
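The frequency-deviation step can be sketched for a single bin. The example below is an illustration in plain Python, not SoundHack's actual code. It analyzes a 450 Hz sine wave (chosen so that it falls between bin centers) in two overlapping frames, compares how far the phase of the nearest bin actually advanced with how far it would have advanced at the bin's fixed frequency, and converts the difference into a frequency estimate. The sampling rate, frame size, and hop size are arbitrary illustration values.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz (illustrative)
FRAME = 256          # samples per analysis frame
HOP = 64             # samples between successive frames (4x overlap)
TRUE_FREQ = 450.0    # Hz; deliberately not a bin center

signal = [math.sin(2 * math.pi * TRUE_FREQ * n / SAMPLE_RATE) for n in range(FRAME + HOP)]
window = [0.5 - 0.5 * math.cos(2 * math.pi * n / FRAME) for n in range(FRAME)]  # von Hann


def bin_value(start, k):
    """Complex value of bin k for the windowed frame beginning at sample `start`."""
    return sum(
        window[n] * signal[start + n] * cmath.exp(-2j * math.pi * k * n / FRAME)
        for n in range(FRAME)
    )


def wrap(phase):
    """Wrap a phase to the range [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


k = round(TRUE_FREQ * FRAME / SAMPLE_RATE)   # nearest bin
bin_freq = k * SAMPLE_RATE / FRAME           # the bin's fixed frequency

phase_a = cmath.phase(bin_value(0, k))
phase_b = cmath.phase(bin_value(HOP, k))

# Phase advance we would see if the sound sat exactly on the bin frequency.
expected_advance = 2 * math.pi * k * HOP / FRAME
deviation = wrap(phase_b - phase_a - expected_advance)
estimated_freq = bin_freq + deviation * SAMPLE_RATE / (2 * math.pi * HOP)

print(f"bin {k} is fixed at {bin_freq} Hz; estimated frequency {estimated_freq:.2f} Hz")
```

The bin by itself can only say "somewhere near 437.5 Hz"; the phase comparison between frames is what pins the estimate close to the true 450 Hz.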

- Open a sound file in **SoundHack**. Choose the **Hack > Phase Vocoder** menu command.
- Set **Bands** to the number of FFT bands you want to use. A large number of bands yields better frequency resolution, while a small number of bands yields better time resolution.
- The **Window** popup menu allows you to choose different FFT window envelopes for different filtering characteristics. Stick with the **Hamming**, **von Hann**, and **Kaiser** window types.
- The **Overlap** popup menu sets the amount of overlap between successive FFT frames. 1x means the frames are contiguous; 2x means one frame starts in the middle of the previous frame. The less overlap, the more fluttery the output will sound.
- Click the **Time Scale** button for time scaling, or the **Pitch Scale** button for pitch scaling. Enter the scale factor next to the **Scaling** popup menu (above the Pitch Scale button). This popup menu lets you enter desired length (for time scaling) or semitone transposition (for pitch scaling).

  If you want the time or pitch scaling factor to change during processing, click the **Scaling Function** check box, and then **Edit Function**. This brings up the **Function Window** editor, in which you can draw a time-varying scaling function.
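Two of these settings boil down to simple arithmetic, sketched here in Python. The overlap factor determines how many samples separate the start of one frame from the next, and a semitone transposition corresponds to a pitch scale factor of 2 raised to the power semitones/12. The function names are hypothetical, and the sketch assumes the frame length divides evenly by the overlap factor. It does not describe how SoundHack computes these values internally.

```python
def hop_size(frame_size, overlap):
    """Samples between frame starts: 1x -> contiguous frames, 2x -> half a frame."""
    return frame_size // overlap


def frame_starts(total_samples, frame_size, overlap):
    """Start positions of every complete frame in a sound of `total_samples`."""
    hop = hop_size(frame_size, overlap)
    return list(range(0, total_samples - frame_size + 1, hop))


def semitones_to_ratio(semitones):
    """Equal-tempered transposition expressed as a pitch scale factor."""
    return 2 ** (semitones / 12)


print(frame_starts(4096, 1024, 1))   # contiguous frames
print(frame_starts(4096, 1024, 2))   # each frame starts mid-way through the previous one
print(f"up a fifth (7 semitones): scale factor {semitones_to_ratio(7):.4f}")
```

More overlap means more snapshots per second of sound, which is why higher overlap settings tend to sound smoother.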

*Convolution* is a process that multiplies the spectra of two sound
files. When this works well, you can achieve a kind of blend between the
two sounds that is very different from simply mixing them together. For
example, you might be able to impose whispering onto the sound of the
seashore. For this to work, at least one of the sounds must have fairly
broad-band energy (energy across the frequency spectrum). If one sound
has lots of low-frequency energy, and the other has lots of high-frequency
energy, then multiplying their spectra will give you basically nothing.
(Multiplying anything by zeros gives you zero.) Convolution is typically
implemented by taking FFTs of the two sounds, multiplying the values of
corresponding bins (the amplitudes multiply and the phases add), and then
taking the inverse FFT to return to a time-domain waveform.
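The recipe in the last sentence can be sketched on two tiny "sounds" of three samples each. This is a plain-Python illustration with a slow, naive DFT standing in for the FFT. The sample values are made up, and both sounds are padded with zeros so that the result does not wrap around on itself. A direct time-domain convolution is included only to check that the spectral route gives the same answer.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive DFT (or inverse DFT); an FFT computes the same thing faster."""
    size = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / size) for n in range(size))
        for k in range(size)
    ]
    return [v / size for v in out] if inverse else out


def convolve_spectral(a, b):
    """Convolve by multiplying the two spectra bin by bin."""
    size = len(a) + len(b) - 1  # zero-pad so the result does not wrap around
    spec_a = dft(list(a) + [0.0] * (size - len(a)))
    spec_b = dft(list(b) + [0.0] * (size - len(b)))
    product = [p * q for p, q in zip(spec_a, spec_b)]  # complex multiply per bin
    return [v.real for v in dft(product, inverse=True)]


def convolve_direct(a, b):
    """Textbook time-domain convolution, for comparison."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


source = [1.0, 2.0, 3.0]    # made-up sample values
impulse = [0.0, 1.0, 0.5]
spectral = convolve_spectral(source, impulse)
direct = convolve_direct(source, impulse)
print([round(v, 6) for v in spectral])
print(direct)
```

Note that the output is longer than either input: convolving sounds of lengths A and B produces A + B - 1 samples.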

Another use of convolution is to apply a spatial characteristic to a sound.
The *impulse response* of a space, such as a concert hall, is the sound that
results when a very brief broad-band sound is made in that space and then bounces
off the walls and other surfaces, creating echoes and reverberation.
Convolution lets you impose this ambience onto another sound, making it seem as
if the sound is heard in that space. (This is what the DP ProVerb plug-in
does.) For more about this use of convolution, see
this article.
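A toy version shows why this works. In the Python sketch below, the `room` impulse response is hand-made and hypothetical: a direct sound followed by two progressively quieter echoes, rather than a recording of a real hall. Convolving a single click with it returns the impulse response itself, and convolving any other sound with it stamps a copy of that sound at each echo.

```python
def convolve(dry, impulse_response):
    """Time-domain convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


# Hypothetical impulse response: direct sound, then echoes 3 and 6 samples later.
room = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25]

click = [1.0]          # a one-sample impulse
note = [1.0, -1.0]     # a very short made-up "sound"

click_in_room = convolve(click, room)   # the room's response to the click
wet = convolve(note, room)              # the note, repeated at each echo

print(click_in_room)
print(wet)
```

A real impulse response contains thousands of such echoes packed closely together, which is what we hear as reverberation.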

- Open a sound file in **SoundHack**. Choose **Hack > Convolution**. Click the **Pick Impulse** button to select a different impulse sound file.

  If you’re trying to impose reverb, try using the Voxengo impulse response files (IU network or VPN only). Or check out this extensive list of impulse response files.

- Click **Process** to write the convolution result to a file. Remember that the success of the output sound depends on the spectral characteristics of the source and impulse sound files. Trial and error is an essential part of working with convolution.

For more help with SoundHack, see the online manual (IU network or VPN only).

SPEAR (Sinusoidal Partial Editing Analysis and Resynthesis) is an excellent graphical implementation of the McAulay-Quatieri (MQ) sound analysis algorithm, which decomposes a sound into many sinusoidal partials. You edit these partials using a variety of tools (time scaling, frequency or pitch shifting, amplitude scaling, or simply drawing new partials) and then resynthesize them into a new sound file.
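The idea of editing partials and then resynthesizing them can be sketched in a few lines of Python. This is not SPEAR's implementation or file format. It assumes a simplified, hypothetical data model in which each partial is a list of (time, frequency, amplitude) breakpoints, and it resynthesizes by running one sine oscillator per partial with linearly interpolated frequency and amplitude.

```python
import math

SAMPLE_RATE = 8000  # Hz (illustrative)

# Hypothetical data model: a partial is a list of (time_s, freq_hz, amplitude) breakpoints.
partial = [(0.0, 440.0, 0.0), (0.1, 440.0, 0.8), (0.5, 445.0, 0.0)]


def transpose(p, ratio):
    """Shift a partial's pitch by multiplying every frequency by `ratio`."""
    return [(t, f * ratio, a) for t, f, a in p]


def time_scale(p, factor):
    """Stretch (factor > 1) or compress (factor < 1) a partial in time."""
    return [(t * factor, f, a) for t, f, a in p]


def interpolate(p, t):
    """Linearly interpolate (frequency, amplitude) at time t between breakpoints."""
    for (t0, f0, a0), (t1, f1, a1) in zip(p, p[1:]):
        if t0 <= t <= t1:
            x = (t - t0) / (t1 - t0)
            return f0 + x * (f1 - f0), a0 + x * (a1 - a0)
    return p[-1][1], 0.0  # silent outside the partial's time span


def resynthesize(partials, duration):
    """Sum one sine oscillator per partial into a list of samples."""
    out = [0.0] * int(duration * SAMPLE_RATE)
    for p in partials:
        phase = 0.0
        for n in range(len(out)):
            freq, amp = interpolate(p, n / SAMPLE_RATE)
            out[n] += amp * math.sin(phase)
            phase += 2 * math.pi * freq / SAMPLE_RATE  # accumulate phase sample by sample
    return out


# Edit: an octave higher and twice as long, layered with the original partial.
edited = time_scale(transpose(partial, 2.0), 2.0)
samples = resynthesize([partial, edited], duration=1.0)
print(len(samples), "samples; peak level", round(max(abs(s) for s in samples), 3))
```

Because pitch and time live in separate fields of each breakpoint, changing one leaves the other untouched, which is what makes this kind of representation so convenient to edit.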

Download SPEAR for your own computer here.

- Open a sound file in **SPEAR**, using its **File > Open** command. (Just press the **Analyze** button in the window that appears.) You can play and stop the sound by pressing the space bar.
- Experiment with the **Controls** window to change pitch, speed, etc.
- Use the **Lasso Selection** tool to select part of the spectrum. You can play only this part by holding down the shift key while pressing the space bar.
- Then use the tools in the palette, or the commands in the **Transform** menu, to manipulate the selected part of the spectrum.
- Save the results to a new sound file. No, you don’t do this by using the **File > Save** command. That saves a spectral analysis file in the SDIF format. Instead, use **Sound > Synthesize to File**.