
Recording Analysis

Recordings of the chosen notes are made and edited to include only the primary pipe speech, excluding the initial attack and final release of the note, where other effects occur (Figure 1). The power spectral density (PSD) of each recording is then estimated, for example using Welch's method [2] (Figure 2). The PSD describes how the power of a signal is distributed over frequency, so the power of the signal at each discrete frequency can be read from it. The $N$ largest peaks in each distribution (where $N$ is chosen by the user) are found by searching for three consecutive points in which the middle point is greater than both of its neighbors (Figure 2). The percentage power contribution of each peak is found by dividing the power of the peak by the total power of the distribution. Finally, the base frequency of the recording is recorded as the frequency, among the $N$ peaks, that lies closest to the expected base frequency of the note. The exact expected base frequency cannot be used because the PSD is defined only at a list of discrete frequencies. $N$ must therefore be large enough that the peaks include a close match (generally within 2 Hz). This is rarely a problem because the base frequency is almost always one of the first two peaks.
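The per-recording steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the implementation used here: the function names (`welch_psd`, `largest_peaks`, `peak_contributions`, `closest_peak_frequency`), the segment length, the Hann window, the 50% overlap, and the synthetic three-harmonic test tone are all hypothetical choices. Ordering the selected peaks by ascending frequency is likewise an assumption.

```python
import numpy as np

def welch_psd(x, fs, nperseg=4096):
    """Minimal Welch-style PSD estimate: average the periodograms of
    Hann-windowed segments that overlap by 50% (assumed settings)."""
    x = np.asarray(x, dtype=float)
    if len(x) < nperseg:
        raise ValueError("signal is shorter than one segment")
    window = np.hanning(nperseg)
    step = nperseg // 2
    scale = fs * np.sum(window ** 2)
    spectra = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = seg - seg.mean()  # remove the DC offset of the segment
        spectra.append(np.abs(np.fft.rfft(seg * window)) ** 2 / scale)
    # One-sided doubling is omitted: only ratios of powers are used below.
    psd = np.mean(spectra, axis=0)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, psd

def largest_peaks(psd, n):
    """Indices of the n largest local maxima, where a local maximum is the
    middle of three consecutive points that exceeds both neighbors."""
    interior = psd[1:-1]
    is_peak = (interior > psd[:-2]) & (interior > psd[2:])
    candidates = np.flatnonzero(is_peak) + 1
    strongest = candidates[np.argsort(psd[candidates])[::-1][:n]]
    return np.sort(strongest)  # assumption: report peaks in ascending frequency

def peak_contributions(psd, peak_idx):
    """Power of each peak divided by the total power (multiply by 100 for percent)."""
    return psd[peak_idx] / psd.sum()

def closest_peak_frequency(freqs, peak_idx, expected_hz):
    """Frequency among the selected peaks closest to the expected base frequency."""
    peak_freqs = freqs[peak_idx]
    return peak_freqs[np.argmin(np.abs(peak_freqs - expected_hz))]

# Synthetic stand-in for a recording: a 220 Hz tone with two weaker harmonics.
fs = 8000
t = np.arange(2 * fs) / fs
signal = (1.00 * np.sin(2 * np.pi * 220 * t)
          + 0.50 * np.sin(2 * np.pi * 440 * t)
          + 0.25 * np.sin(2 * np.pi * 660 * t))

freqs, psd = welch_psd(signal, fs)
peaks = largest_peaks(psd, n=3)
contributions = peak_contributions(psd, peaks)
base = closest_peak_frequency(freqs, peaks, expected_hz=220.0)
print(freqs[peaks], contributions, base)
```

With these settings the frequency grid has a spacing of `fs / nperseg`, a little under 2 Hz, which is why the matched base frequency can differ slightly from the expected one.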

Figure 1: Excerpt of a recorded wave.

Figure 2: Excerpt of the PSD of a recorded wave. The first five peaks are marked with red circles.

At this point, processing of the individual recordings is complete, and they can be processed as a set. The power contribution of the first peak of each recording is taken, along with that recording's base frequency. A least-squares polynomial fit is applied with the base frequency ($F$) as the independent variable and the corresponding peak power contribution ($P$) as the dependent variable. Third-degree polynomials tend to produce good results. This procedure is repeated for each of the remaining peaks, so that a polynomial is fitted to the second, third, etc. peak power contributions up to the $N$th. In the second-degree form shown in Equation 1, each fit generates coefficients $a$, $b$, and $c$ for the simple polynomial

\begin{displaymath}
P = aF^2 + bF + c, \qquad (1)
\end{displaymath}

where $F$ is the desired frequency and $P$ is the resulting power contribution. One such polynomial is obtained for each of the first $N$ peaks. Together they form continuous functions from which an individual (and likely unique) harmonic structure can be generated at any frequency.
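The set-level fit can be sketched with NumPy's `polyfit` and `polyval`. This is a minimal illustration, not the implementation used here: the helper names `fit_peak_curves` and `harmonic_structure` are hypothetical, and the input data are synthetic values generated from known quadratics purely so the result can be checked. The degree is left as a parameter; 2 matches the form of Equation 1.

```python
import numpy as np

def fit_peak_curves(base_freqs, contributions, degree=2):
    """Fit one least-squares polynomial per peak.

    base_freqs:    shape (R,), base frequency F of each of R recordings
    contributions: shape (R, N), power contribution P of each of the N peaks
    Returns a list of N coefficient arrays, highest power first.
    """
    contributions = np.asarray(contributions, dtype=float)
    return [np.polyfit(base_freqs, contributions[:, k], degree)
            for k in range(contributions.shape[1])]

def harmonic_structure(curves, frequency):
    """Evaluate every fitted polynomial at one frequency, giving the
    power contribution of each of the N peaks at that frequency."""
    return np.array([np.polyval(c, frequency) for c in curves])

# Synthetic example data (not measurements): two peaks whose contributions
# follow known quadratics in F, sampled at five hypothetical base frequencies.
true_curves = [np.array([1e-7, -2e-4, 0.5]),    # a, b, c for the first peak
               np.array([-5e-8, 1e-4, 0.2])]    # a, b, c for the second peak
base_freqs = np.array([100.0, 250.0, 500.0, 750.0, 1000.0])
contributions = np.column_stack([np.polyval(c, base_freqs) for c in true_curves])

curves = fit_peak_curves(base_freqs, contributions, degree=2)
structure_440 = harmonic_structure(curves, 440.0)
print(structure_440)
```

Because the fitted polynomials are continuous in $F$, `harmonic_structure` can be called at frequencies that lie between the recorded notes.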

mjibson 2009-01-06