Creating a pitch pipe: some musical mathematics

I’ve just finished writing a new pitch pipe web application, to replace my old version which has always been a bit buggy in ways I never quite understood. By comparing the screenshots below, you can see that the new pitch pipe is not only better-looking, but has more features than the previous version. It is meant to be used on a phone in portrait mode: you might get some weird layout stuff happening otherwise.

One thing in particular I wanted to experiment with was a way of being able to play some intervals in just intonation rather than only having equal temperament available. Just intervals are always relative to a tonal centre (almost none of them can be “re-used” once the tonal centre shifts), and so I ended up with a design where the tonal centre is on the outermost ring of the pitch pipe and can be moved around. I also ended up adding a frequency spectrum because, why not?

In this post I’ll go over some of the mathematics of pitch which makes this work, and a little about how it’s programmed: it uses only standard Web APIs, in particular the Web Audio API, and excluding icon images weighs in at about 20 kB of code. It’s also a progressive web app, and so will work offline after it’s accessed for the first time. You can find the source code for the new version in this repository. I’ll be leaving the old version up too, if only because I think it’s a cute example of what is doable in about 30 lines of JavaScript.

The relationship between frequency and pitch

The pitch pipe creates sound by using the Web Audio API, which in simplified terms means that we can tell the browser a particular frequency (like 440 Hz) and a wave shape (sine, square, triangle, or sawtooth), and the browser will generate the noise for us. However, as musicians we understand pitch in terms of intervals (octaves, perfect fifths, semitones, etc) and specific notes (A, C sharp, etc) rather than frequencies, so we need to do some maths to move between the frequencies and the more interpretable pitches.

In order to understand the frequency-pitch relationship, our starting point will be the octave, the most fundamental interval in most musical traditions. I would describe an octave jump as going to a higher note that sounds “the same as, but higher than”, the original note. Here is what an octave sounds like, courtesy of Wikipedia:

An octave, played on a piano

If the first note played has frequency $f$, then the second note an octave higher has frequency $2f$, and this trend continues: another octave up will give us the frequency $4f$, and another octave again gives us $8f$. So an octave jump always corresponds to multiplying a frequency by $2$. However, a musician would say that the interval between $f$ and $8f$ is three times the interval between $f$ and $2f$ (three octaves compared to one octave), not eight times that interval. This is because we perceive pitch as the logarithm of frequency, and so to convert octaves into a frequency multiplier, we raise 2 to the power of the number of octaves, to get the correct answer $2^3 = 8$.

This pitch-frequency relationship is fundamental to understanding tuning systems, so let’s set up some names and define the relationships. We need to pick a reference point, and a standard choice is to set $f_{\mathtt{A4}} = 440~\mathrm{Hz}$ for the note A4 (the A above middle C). Based on our earlier discussion of octaves, we have these two formulas for converting between a frequency $f$ and what I’ll call a relative pitch $\omega$:

f = f_{\mathtt{A4}} \cdot 2^\omega, \quad \omega = \log_2 \left( \frac{f}{f_{\mathtt{A4}}} \right).

Here $\omega = 0$ corresponds to A4, $\omega = 1$ corresponds to A5 one octave higher, and $\omega = 2$ corresponds to A6 two octaves higher. The idea is that distances in $\omega$-space correspond to how we perceive intervals: a distance of 1 is always an octave, no matter whether it is between $\omega = 0$ and $\omega = 1$, or between $\omega = -2.43$ and $\omega = -1.43$.
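
The two conversion formulas translate almost directly into code. The following is a minimal TypeScript sketch, not the app's actual source: the names `F_A4`, `pitchToFrequency` and `frequencyToPitch` are made up for illustration.

```typescript
// Reference frequency for A4 in Hz, the standard choice used in the text.
const F_A4 = 440;

/** Convert a relative pitch omega (octaves above A4) to a frequency in Hz. */
function pitchToFrequency(omega: number): number {
  return F_A4 * Math.pow(2, omega);
}

/** Convert a frequency in Hz to a relative pitch omega (octaves above A4). */
function frequencyToPitch(frequency: number): number {
  // log2(x) written as ln(x) / ln(2) so it works on older JavaScript targets.
  return Math.log(frequency / F_A4) / Math.LN2;
}

console.log(pitchToFrequency(1));  // A5, one octave above A4
console.log(frequencyToPitch(220)); // A3, one octave below A4
```

The two functions are inverses of each other, so a round trip through both should return the starting value up to floating-point error.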

In the standard equal temperament tuning system (aka the one you very probably know and love), every semitone is exactly $1/12$th of an octave, and so a semitone interval corresponds to a movement of $\pm 1/12$ in $\omega$-space. We can now do useful calculations, like what is the frequency of C#5, the note a major 3rd above A4? A major 3rd is four semitones, so using our equations above we set $\omega = 4/12$ and the corresponding frequency will be

f_{\mathtt{C\#5}} = f_{\mathtt{A4}} \cdot 2^{4/12} \approx 554.4~\mathrm{Hz}.

And voilà, we have all the knowledge we need to calculate frequencies for all the standard notes one would find on a piano. These are the notes on the outer ring of the pitch pipe — go ahead and experiment with them! How are they different to the notes on the inside?
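
As a rough illustration of the equal temperament calculation, here is a short TypeScript sketch that computes a note's frequency from its distance to A4 in semitones. The function name `equalTemperedFrequency` is hypothetical and not taken from the pitch pipe's source.

```typescript
// Reference frequency for A4 in Hz.
const F_A4 = 440;

/**
 * Frequency of the equal-tempered note lying `semitones` semitones
 * above A4 (use a negative number for notes below A4).
 */
function equalTemperedFrequency(semitones: number): number {
  // Each semitone is a step of 1/12 in omega-space.
  return F_A4 * Math.pow(2, semitones / 12);
}

// C#5 is four semitones above A4 (A#4, B4, C5, C#5).
console.log(equalTemperedFrequency(4).toFixed(2));  // 554.37
// Middle C (C4) is nine semitones below A4.
console.log(equalTemperedFrequency(-9).toFixed(2)); // 261.63
```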

As an aside, the relative pitch $\omega$ is extremely convenient in the actual pitch pipe program: here are a few examples. The note $\omega$ should be placed at an angle of $\omega \times 360^\circ$ on the circle. To snap $\omega$ to the nearest semitone, do $\operatorname{round}(12 \omega) / 12$.
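
These two tricks could look something like the following in TypeScript. This is only a sketch with made-up function names, and wrapping the angle into the range 0 to 360 degrees is my own assumption about how one might want to handle pitches outside a single octave.

```typescript
/**
 * Angle (in degrees) at which a relative pitch omega sits on the circle.
 * Assumption: pitches an octave apart share the same position, so the
 * angle is wrapped into [0, 360).
 */
function pitchToAngleDegrees(omega: number): number {
  // The double modulo keeps the result non-negative for negative omega.
  const fractionOfOctave = ((omega % 1) + 1) % 1;
  return fractionOfOctave * 360;
}

/** Snap a relative pitch omega to the nearest equal-tempered semitone. */
function snapToSemitone(omega: number): number {
  return Math.round(12 * omega) / 12;
}

console.log(pitchToAngleDegrees(0.25)); // a quarter of the way around the circle
console.log(snapToSemitone(0.34));      // snaps to 4/12, a major third
```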

The harmonic series and just intonation

There are some intervals which occur naturally in music and sound, which will never occur in an equal-tempered system, and sometimes we want to use these intervals or chords when we perform instead of using their equal-tempered counterparts. For reasons which will become clear, this usually only applies to smaller groups of performers in fairly close harmony: say with all parts playing within a range of two or three octaves. In addition, it will usually only be done by performers using variable-tuning instruments, for example voices, un-fretted string instruments, many wind instruments, but no grand pianos.

One source of these naturally occurring intervals is in the harmonic series. When a string is plucked or bowed, or a resonant column of air is set up in a flute, there is often a fundamental frequency $f$ at which the string or air column oscillates. This is usually the loudest oscillation we hear, and we usually attribute the perceived pitch of the note to this fundamental. For some physics and maths reasons, the other modes of oscillation in the string or column tend to be integer multiples of this fundamental frequency: $2f$, $3f$, $4f$, $5f$, and so on. To get an idea of what this sounds like, see this visualisation (with your sound on).
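
To make the integer multiples concrete, here is a small TypeScript sketch that lists the first few members of a harmonic series and how far each one sits above the fundamental in octaves. Both function names are hypothetical.

```typescript
/** The first `count` frequencies f, 2f, 3f, ... of a harmonic series. */
function harmonicSeries(fundamental: number, count: number): number[] {
  const frequencies: number[] = [];
  for (let n = 1; n <= count; n++) {
    frequencies.push(n * fundamental);
  }
  return frequencies;
}

/** Distance in octaves between the fundamental and its n-th harmonic. */
function octavesAboveFundamental(n: number): number {
  return Math.log(n) / Math.LN2;
}

console.log(harmonicSeries(110, 4));      // [110, 220, 330, 440]
console.log(octavesAboveFundamental(4));  // two octaves up
```

Because pitch is logarithmic in frequency, the harmonics are evenly spaced in Hz but get closer and closer together in pitch as `n` grows.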

These extra frequencies are called overtones of the fundamental. A very rough cartoon of what the frequency spectrum of a simple instrument (a string or flute) would look like is this:

A typical frequency spectrum for an oscillator with fundamental frequency $f$.

The fundamental frequency is the strongest, then there is an overtone series dying off. The particular shape of the overtone series, and which overtones are present, will contribute to the timbre of the note (whether it sounds like a guitar string or a flute, for example). The pitch pipe shows the frequency spectrum for the waves being played — keep in mind that these waves are extremely simple, and life is full of much more rich and complex sounds (and hence frequency spectra)!

Let’s focus a little on what intervals are present in the overtone series. We clearly have octaves between $f$, $2f$, $4f$, and so on (and also octaves between $3f$ and $6f$ for instance). The first new interval is between $2f$ and $3f$, a ratio of $3/2$, which is absurdly close to (but not quite) an equal-tempered fifth of 7 semitones:

3/2 = 1.5, \quad \quad 2^{7/12} \approx 1.498.

The difference between these two notes is practically inaudible if one is played after the other, but they can be distinguished if they are held as part of a chord. In the pitch pipe, try distinguishing the fifth from the note labelled $3/2$, and then try again when holding each as part of a chord with the tonic note. A similar story holds for the equal-tempered fourth and the ratio $4/3$.

The next two intervals which are often used in just intonation settings are the major third and harmonic seventh, which correspond to frequency ratios of $5/4$ and $7/4$ respectively. By looking at their locations on the pitch pipe, you can see that the just intonation major third is a bit flat compared to the equal-tempered third; in fact the difference is

\omega = \log_2(5/4) - 4/12 \approx -14 \text{ cents},

where a cent is a step of $1/1200$ in $\omega$-space (cents are defined so that 100 of them make up a semitone). The harmonic 7th is about $31$ cents flat of a minor seventh, an easily heard difference.
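
The comparisons in this section can be checked with a few lines of TypeScript. This is a sketch with made-up function names: it converts a frequency ratio to cents and subtracts the equal-tempered interval it is usually compared against.

```typescript
/** Size of a frequency ratio in cents (1200 cents per octave). */
function ratioToCents(ratio: number): number {
  return (1200 * Math.log(ratio)) / Math.LN2;
}

/**
 * How far a just interval is from an equal-tempered interval of
 * `semitones` semitones. Negative means the just interval is flat.
 */
function deviationFromEqualTemperament(ratio: number, semitones: number): number {
  return ratioToCents(ratio) - 100 * semitones;
}

console.log(deviationFromEqualTemperament(3 / 2, 7).toFixed(1));  // fifth: about +2.0
console.log(deviationFromEqualTemperament(4 / 3, 5).toFixed(1));  // fourth: about -2.0
console.log(deviationFromEqualTemperament(5 / 4, 4).toFixed(1));  // major third: about -13.7
console.log(deviationFromEqualTemperament(7 / 4, 10).toFixed(1)); // harmonic seventh: about -31.2
```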

Why and when to use just intonation

It makes sense to want to use intervals present in the harmonic series because they are, at least to some degree, already naturally occurring in many sounds. When performing in small groups with variable-tuning instruments, it is quite easy to “lock on” to these intervals, because the first couple of overtones being produced by each instrument will line up and constructively interfere with each other. This can produce chords that sound subjectively “better”, or more “restful”, or sometimes even more “exciting”.

One style of music where this is used prominently is barbershop singing, where chords are said to ring when these overtones line up just right. Have a listen to this recording of the Lover come back tag by the Ringmasters quartet and notice how the musical style is encouraging ringing chords: notes are sustained without any vibrato so that every note can lock, the harmony is close (the four parts all sit within two octaves), and the voicing on the final chord is (from the bass part going upwards) 1-5-1-3, an exact progression from the harmonic series $(2f, 3f, 4f, 5f)$.

Lover come back, performed by the Ringmasters quartet.

Another phenomenon which happens more readily when using just intonation is phantom overtones when singing in a group. The basic idea is that if the harmonics line up just right, then either through constructive interference, amplification happening at one harmonic because of room acoustics, or even just a trick of the brain, a listener can perceive an extra voice singing a note that none of the performers are singing. Listen to the two phrases from where the video below starts (about 1:04 to 1:30), of three singers performing the Halo theme, and listen for the phantom soprano joining in at the end of each phrase. A key ingredient to achieving this kind of overtone is for all the performers to agree on a particular vowel formant, presumably so that the overtone series are the most compatible between performers.

Theme from Halo, with some phantom overtones.

The key thing to remember about just intonation is that it is always relative to a tonal centre: you can’t take a justly tuned major third above C, then go away and use it in some other key like A, because the intervals are not equal width around the octave anymore. Try it on the pitch pipe: the just third held with C will sound beautiful, but held together with an A will just sound like a very flat fifth. The key weakness of trying to fix a tuning system to a particular set of just intervals is that modulation, one of the joys of music, is all but impossible. When playing with fixed-tuning instruments, across a very large pitch range, or with many different performers and instruments, an equally tempered tuning system will be the most reliable at achieving harmony and flexibility of key.
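
The C versus A experiment can also be done numerically. The sketch below (hypothetical names, working in the octaves-relative-to-A4 units used earlier) tunes an E as a just major third above an equal-tempered C4, then measures the interval that same E makes with A3.

```typescript
// log2 written via natural logs so it works on older JavaScript targets.
const log2 = (x: number): number => Math.log(x) / Math.LN2;

/** Interval in cents between two relative pitches (1200 cents per octave). */
function centsBetween(omegaFrom: number, omegaTo: number): number {
  return 1200 * (omegaTo - omegaFrom);
}

const omegaC4 = -9 / 12;                  // C4: nine semitones below A4
const omegaA3 = -1;                       // A3: one octave below A4
const omegaJustE = omegaC4 + log2(5 / 4); // E tuned as a just third above C4

const thirdAboveC = centsBetween(omegaC4, omegaJustE); // a pure 5/4 third
const fifthAboveA = centsBetween(omegaA3, omegaJustE); // the same E against A3
const pureFifth = 1200 * log2(3 / 2);                  // what a just fifth would be

console.log(thirdAboveC.toFixed(1));               // about 386.3
console.log(fifthAboveA.toFixed(1));               // about 686.3
console.log((fifthAboveA - pureFifth).toFixed(1)); // about -15.6
```

So the E that locks perfectly with C ends up roughly 16 cents flat of a pure fifth above A, which is why it sounds sour there.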

Programming the actual app

Ever since I got a taste of declarative layouts for building GUIs I’ve never gone back: this app in particular is programmed using Svelte to do the heavy lifting of synchronising the application state to the layout, and TypeScript because I prefer my JavaScript with more types. I use Vite as a bundler since it’s great for running a hot-reloading development server and generating builds (it’s also insanely quick), and I’m experimenting with using the PWA Vite Plugin to auto-generate a service worker to make the app available offline, and installable to the home screen. Once the Svelte is compiled there are no run-time dependencies, and the whole app is about 20 kB uncompressed.

The user interface is fairly vanilla HTML and CSS, with some magic incantations to make it work correctly on a phone (I wanted a mode where it could not scroll, as if I were programming a phone app and just wanted a flat immovable canvas to paint on). The frequency spectrum display is a <canvas> element, and the circular pitch pipe itself is a dynamic SVG — dynamic in the sense that it is a template updated by Svelte as the application state changes.

The audio is generated using the Web Audio API; I think it’s pretty random (and fantastic!) that this is built into modern browsers. The four waves available, sine, sawtooth, triangle, and square, are the four built-in wave shapes of the OscillatorNode in the Web Audio API. These go through an AnalyserNode, then through a GainNode to turn the volume down, then to the speaker. The AnalyserNode is set to perform a discrete Fourier transform with an FFT size of 8192, which with a sample rate of 48 kHz corresponds to a resolution of about $6~\mathrm{Hz}$ (since $48000 / 8192 \approx 5.9$).
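
For the curious, the node chain described above might be wired up roughly as follows. This is a sketch rather than the app's real code: the function names, the gain value of 0.2, and reading the 8192 figure as the analyser's `fftSize` are all my assumptions.

```typescript
// The four built-in OscillatorNode wave shapes mentioned in the text.
type WaveShape = "sine" | "square" | "sawtooth" | "triangle";

/** Width in Hz of one frequency bin of an FFT over `fftSize` samples. */
function binWidthHz(sampleRate: number, fftSize: number): number {
  return sampleRate / fftSize;
}

/** Build the chain OscillatorNode -> AnalyserNode -> GainNode -> speakers. */
function buildToneGraph(ctx: AudioContext, frequency: number, shape: WaveShape) {
  const oscillator = ctx.createOscillator();
  oscillator.type = shape;
  oscillator.frequency.value = frequency;

  const analyser = ctx.createAnalyser();
  analyser.fftSize = 8192; // assumption: 8192 is the FFT size

  const gain = ctx.createGain();
  gain.gain.value = 0.2; // hypothetical volume, chosen only for illustration

  oscillator.connect(analyser);
  analyser.connect(gain);
  gain.connect(ctx.destination);

  return { oscillator, analyser, gain };
}

// Usage in a browser, typically from a click handler since browsers
// generally require a user gesture before audio may start:
//   const ctx = new AudioContext();
//   const { oscillator } = buildToneGraph(ctx, 440, "sine");
//   oscillator.start();

console.log(binWidthHz(48000, 8192)); // 5.859375
```

Passing the `AudioContext` in as a parameter keeps the function free of side effects at load time, so nothing makes a sound until the caller decides to start the oscillator.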

Next steps

There are still one or two bugs to iron out, to do with the app restoring after phone lock/unlock when installed as a home screen app, but overall I’m happy with how this experiment has gone. It’s much more usable-at-a-glance than my original program, and the user interface for accessing the just intervals turned out nicely.