Fourier analysis

So there was this frenchman called Joseph Fourier who was mostly looking into this hot new field of physics called “thermodynamics” (pun intended), and kind of on the side proved that any periodic function can be expressed as a sum of trigonometric functions (meaning, any function that repeats is just a bunch of sine waves added together) (as a sidenote, Gauss used the idea first because of course he did, but didn’t do anything useful with it).

This means that, for any signal, we can now extract frequency components - if you have an audio recording of a piano piece, you can combine that with a signal of 220Hz to figure out when C1 is played. This is handy if you want to extract a single frequency from a signal (it’s how AM/FM radio works, though it is usually implemented in hardware), but it is pretty useless if you want to know all the frequencies.

For a while, the standard method to achieve this was to just calculate them all individually, which was quite tedious (in technical terms, it required O(n²) computations). In a ‘65 us-am meeting on figuring out how to spy on soviet nuclear tests, a couple folks (John Tukey and Richard Garwin) realised that an earlier O(n log n) algorithm for the Hadamard transform could be reformulated to also work for Fourier transforms, and soon published their results (well, Garwin sent Tukey to IBM, where he collaborated with James Cooley). As the main guy involved in the algorithm didn’t work at IBM, they couldn’t copyright it, meaning everyone was free to use it, and considering that this happened in '65 and computers were just taking off, this algorithm became very relevant in digital signal processing (eg JPEG uses it to compress chunks).

Math

To briefly cover the basics: if you define the square root of one to equal i, you can use something called complex numbers of the form a+bi for a and b being regular “real” numbers. If you think of real numbers as a line, the complex ones exist on a plane. Most math we have for reals generalises for complex numbers with some interesting side effects, for example, raising stuff to a complex exponent rotates it on the plane. The famous equation e^(i*pi)+1=0 illustrates this in a roundabout way - raising e to the complex constant i rotates it around the plane, and raising it exactly pi times rotates it to -1. Meaning, you can now do trigonometry with exponents.

To elaborate on the above brief explanation (‘combine a piano recording with a 220Hz signal’), if we multiply any signal by a sinusoidal and a cosinusoidal wave of a certain frequency and integrate those (“sum up all the values”), we get the amplitude (“power”) of that frequency in that signal. By using the above trick for doing rotations via complex numbers, we can represent every sin-cos pair as a complex numbers, which turns out to be a lifesaver. If our signals are discrete (meaning, a signal is a list of values, and not eg a function), this means we need as many trigonometric samples as our sample is long. A list of n samples can only have up to n constituting frequencies (XXX elaborate), meaning we need a total of n² samples - leading to the O(n²) complexity above.

XXX quick intro to matrix multiplication maybe?

The idea of “take a list of n samples, multiply each with n values, then sum up the result” sounds a lot like matrix multiplication. In usual mathematical fashion, that’s because it is - the above can be represented as the following equation:

/ s1 \   / 1 1       1         .. 1         \   / f1 \
| s2 |   | 1 w       w^2 w^3   .. w^n-1     |   | f2 |
| s3 | x | 1 w^2     w^4 w^6   .. w^2*(n-1) | = | f3 |
| .. |   |   ..                             |   | .. |
\ sn /   \ 1 w^(n-1) w^2*(n-1) .. w^(n-1)^2 /   \ fn /

where s* is the input signal, f* is its fourier equivalent, and
w=e^(-2i*pi/n)

w is equal to “rotating” the sample in the complex plane by one nth of a circle - meaning, the second row is one full rotation, corresponding to a wave with a period spanning the entire input, the third row is two rotations, corresponding to a wave with a period of half the input, etc. The first row simply sums up all the values - it gives us the “average” value, from which all others are deviations. It might also help to think of 1 as w^0 and w as w^1

This is known as a Vandermonde matrix. The exact specifics of this are irrelevant, except for the fact that such a matrix has certain symmetries that can be exploited to speed up computation significantly.

XXX - folding sine waves into themselves - corresponding mathematical restructuring https://en.wikipedia.org/wiki/Cooley%E2%80%93Tukey_FFT_algorithm#The_radix-2_DIT_case

Algorithm

Code is available on my github, below is a somewhat abbreviated version:

#include<math.h> // trigs
void _fft(float buf[],float out[],float exps[],int n,int step){
    if(step>=n)return;float tr,ti;
    _fft(out,buf,exps,n,step*2);
    _fft(out+step*2,buf+step*2,exps,n,step*2);
    for(int i=0;i<n;i+=2*step){
        tr=exps[2*i]*out[2*(i+step)]-exps[2*i+1]*out[2*(i+step)+1];
        ti=exps[2*i]*out[2*(i+step)+1]+exps[2*i+1]*out[2*(i+step)];
        buf[i]=out[2*i]+tr,buf[i+1]=out[2*i+1]+ti;
        buf[i+n]=out[2*i]-tr,buf[i+n+1]=out[2*i+1]-ti;
    }
}
void fft(float buf[],int n){int s=n;
    if(n&(n-1)){int pow=0,nt=n;
        while(nt)nt>>=1,pow++;s=1<<pow;
    }float in[2*s],out[2*s],exps[2*s],a;
    for(int i=0;i<s;i++){
        out[2*i]=in[2*i]=buf[i%n],out[2*i+1]=in[2*i+1]=0;
        //sincosf(M_PI*i*2/s *(inv?-1:1),exps+2*i+1,exps+2*i);
        a=M_PI*i/s*2,exps[2*i]=cosf(a),exps[2*i+1]=sinf(a);
    }_fft(in,out,exps,s,1);
    for(int i=0;i<n;i++)buf[i]=sqrt(in[2*i]*in[2*i]+in[2*i+1]*in[2*i+1]);
}

The fft function takes an array of floats and its size. It figures out the next-largest power of two if needed (since the Cooley-Tukey algorithm requires the length of its data to be a power of two), then it creates two arrays for the algorithm to use, plus a third one for its complex exponents. The arrays all have size 2s since C-T uses complex numbers - every even-indexed item is real, and every odd one is imaginary. Exponents are likewise calculated with trigonometry (a call to sincosf would have been more efficient, unfortunately it seems to be broken in my release of gcc). The recursive algorithm is then called, and its output is cast back into the input array (storing the amplitudes of the numbers).

The recursive algorithm works somewhat like a mergesort (I should probably make a page about sorting algorithms). XXX

Shortcomings XXX

useful if you wanna analyse stuff like tides - look at differential analysers for more on that
pretty shit at figuring out the timing of signals within a recording - if a signal is only present for part of it, this will not be detected/represented. the standard solution is [xxx] (aka “cut the recording into small pieces and do FFT on those”), which kind of sucks since it just shifts the problem from one part to another. wavelet transforms do this a lot better, but i havent found a comparatively fast implementation of those so far.

(written 20241225 161635)