From: lou@caber.valid.com (Louis K. Scheffer) Subject: Re: 48k to 44.1k sample rate conversion Date: 18 May 91 08:06:35 GMT joe@media-lab.media.mit.edu.MEDIA.MIT.EDU (Joe Chung) writes: >In the digital audio world, a lot of sound is recorded at a 48k sample >rate. A CD, however, is at 44.1k. As we all know, the classic >technique of sample rate conversion would have us upsample to the >lowest common multiple, low-pass filter, and downsample to the desired >rate. The only problem is that the LCM of 44.1k and 48k is 21.168 Meg! >A 4 minute stereo soundfile at 48k takes up about 50 megabytes, which >means that it's 21.168M upsampled version would be over 22 gigabytes! >Obviously, sample rate converters don't store the entire intermediate, >upsampled version, but I was wondering if anyone has a good idea of >what sorts of schemes they do use. Do they do some sort of overlap- >save method akin to block convolution? Does anyone have any code >to do this? The most straightforward way is as follows: You interpolate 146 zeros between every sample to get a data rate of 7.056 MHz. You then design a digital filter to low-pass this to 20 KHz. This can be an FIR filter with perfect phase and arbitrarily good frequency response, subject only to the limits of your computing power. One DSP chip will easily get you +-0.05 db. You then discard 159 of every 160 samples to get 44.1 KHz. Since there is no content above 22.05 KHz, this loses nothing. Assuming you have done all high-precision math up to this point, then the last step, rounding back to 16 bits, is the only one that introduces any error. This error will most likely appear as white noise spread over the whole spectrum (just like the noise introduced on the first A/D conversion). The level should be very low (-96 db, more or less). Computationally, you never store the interpolated 0s, and you never compute the samples that will be discarded. Say the FIR filter with a 7 MHz rate and a 20KHz cutoff has 20000 coefficients (just a guess). Since 146/147 of the data are 0, you must do 20000/147 = 136 multiplies to get each output point. Hence you keep the last 136 samples at 48 KHz, and multiply by 136 coefficients (spaced 147 apart) from the FIR filter. Since all possible phases between the two signals are present (otherwise you did not have the least common multiple) you must keep all 20000 coefficients. Each output point uses a different 136 element subset up to the 147th output point, which uses the same coefficients as the first output point, and the cycle repeats. So the only question is computing the coefficients to use in the FIR filter. This is conceptually straightforward (See, for example, the fortran program in Rabiner and Gold, "Theory and Application of Digital Signal Processing") but such a long filter may stress the limits of computer arithmetic. Since you only need to compute it once, however, it should be simple to adapt the algorithm to higher precision arithmetic. I have the coefficients lying around somewhere for 48.0 -> 44.0 KHz conversion (an easier problem) but not for 48.0 -> 44.1. If someone computes them, or has them, maybe they could post a message saying where they can be found. Hope this helps, Lou Scheffer From: parks@bennett.berkeley.edu (Tom Parks) Subject: Re: 48k to 44.1k sample rate conversion Date: 26 May 91 22:15:01 GMT In article <498@valid.valid.com>, lou@caber.valid.com (Louis K. Scheffer) writes: |> True, but you only design it once (and it's done already, I've got the |> coefficients for those who want them). It takes about 10 CPU hours on a fast |> machine to design the filter (using Remez exchange, anyway). It takes 14000 |> coefficients to get +- 0.05 db passband, 104 db rejection stopband. Ok, I went ahead and did a design for converting from 48 kHz to 44.1 kHz. I cascaded 3 FIR filters to do this. Their sample rate conversion ratios were 3:2, 7:5, and 7:16 and they required 147, 75, and 53 taps respectively. Each filter had a pass band ripple of less than 0.05 dB and stop band rejection of over 100 dB. It took me less than an hour of my time to design these filters, and it took less than 1 minute of CPU time on a SPARCstation. |> Agreed, but you only design it once, and the program source is published in |> "Theory and Application of Digital Signal Processing" by Rabiner and Gold. |> (Actually, you need to tune it up a little for such large filters, but that's |> only a few days work.) You only have to design this set of filters once, too, and I didn't have to spend any time tuning up any software. |> This appraoch will work, but it's not clear the computational requirement is |> any less. Direct convolution by the 14000 tap filter takes about 95 multiply |> adds per output point (since 146/147 of the input samples are 0). In your |> example, the very first filter does: |> |> increase by 3 |> decrease by 2 at 144 KHz (but 2/3 inputs are 0) |> throw away 1 of 2, rate now 72 KHz. |> |> This filter looks like a 3x oversampling filter, and will require |> about 130 coefficients, for a 130/3 = 43 multiplies per output sample (and |> output is at a 72 KHz rate, so this implies 43 * 72/44.1 = 70 multiplies per |> (44.1 KHz) output sample. The first filter requires 49 macs per 72 kHz sample, or 3528000 macs/second. The second filter requires 10.7 macs per 100.8 kHz sample, or 1080000 macs/sec. The third filter requires 7.6 macs per 44.1 kHz sample, or 333900 macs/second. The total is 4941900 macs to produce 44100 samples, or 112 macs/sample. So this is comparable to the 14000 tap filter implementation as far as macs/second are concerned. |> Thus if you implement the filters by multiply/adds, this takes more operations, |> plus the book-keeping is *much* more complex. The direct approach is a one |> or two instruction loop on most DSP chips, for 100 or 200 cycles per output |> point. I doubt you can do the bookkeeping alone in less than 100 cycles |> per output point for the method above. |> Since the three filters are cascaded, there isn't really any additional overhead. Each filter is a one or two instruction loop, and there's one of these loops for each filter. So my implementation requires 112 multiply-accumulates per output sample (compared to 95), storage for 275 filter taps (compared to 14000), and takes less than a minute of CPU time to design (compared to 10 hours). Tom Parks From: lou@caber.valid.com (Louis K. Scheffer) Subject: Programs for 44.1<->48 conversion available Date: 15 Oct 91 04:38:50 GMT I've got some filters and programs to convert 44.1 KHz sampled data to 48 KHz sampled data and vice versa. This includes the filter programs themselves, programs to generate test data, programs to test the output via FFTs, programs to test the FFT program, and so on. These are demo programs only; they read a file of numbers (in ascii) and produces another file of numbers. One caveat: the theory is correct, and the numbers look fine, but I've never listened to the results. (I don't have the necessary I/O, nor a 48 KHz DAT player.) Here is the README file - if you want the stuff send me email. It's about 62 Kbytes. -Lou Scheffer- **** The README file from the 'resample' directory. ********** Here is some software to convert between 44.1 and 48 KHz rates. Theory of operation: 44.1 and 48 are in the ratio 147/160. To convert from 44.1 to 48, for example, we (conceptually): 1) interpolate 159 zeros between every input sample. This raises that data rate to 7.056 MHz. Since it is equivalent to reconstructing with delta functions, it also creates images of frequency f at 44.1-f, 44.1+f, 88.2-f, 88.2+f, ... 2) We remove these with an FIR digital filter, leaving a signal containing only 0-20 KHz information, but still sampled at a rate of 7.056 MHz. 3) We discard 146 of every 147 output samples. It does not hurt to do so since we have no content above 24 KHz. In practice, of course, we never compute the values of the samples we will throw out. So we need to design an FIR filter that is flat to 20 KHz, and down at least X db at 24 KHz. How big does X need to be? You might think about 100 db, since the max signal size is roughly +-32767, and the input quantization +- 1/2, so we know the input had a signal to broadband noise ratio of 98 db at most. However, the noise in the stopband (20KHz-3.5MHz) is all folded into the passband by the decimation in step 3, so we need another 22 db (that's 160 in db) to account for the noise folding. Thus 120 db rejection yields a broadband noise equal to the original quantizing noise. If you are a fanatic, you can shoot for 130 db to make the original quantizing errors dominate, and a 22.05 KHz cutoff to eliminate even ultrasonic aliasing. You will pay for your fanaticism with a penance of more taps, however. I've designed 3 filters - one minimal (but better, I suspect, than many commercial products for this purpose), one a good compromise, and one for fanatics. Most of this code is in double precision for testing purposes. If you convert to single precision or integers, watch your step! resample.c - source of conversion program. gencos.c - generates cosine test cases. full.c - generates a 32767 height sin wave, rounded to integers, for testing the FFT program. We pick this because the correct answer is known analytically. fft.f - reads data, does an FFT on it, and computes signal to noise metrics. Note - this program does NOT do any windowing on the input data, so the results are only correct if the signal to be measured repeats exactly during one fft cycle (256 samples). The test cases are picked that way. resp.f - computes responses of digital filters. Slow but very accurate. makefile - compiles programs and makes test cases. Compiles the Fortan programs fft and resp with the xlf command, which is for the IBM machine. Substitute your machine's fortran compile command. "make clean" removes files that can be easily recreated from the source files. coeff.9000 - A minimal set of coefficients. Flat +- 0.012 db to 20KHz, down 119 db by 24KHz, 96.5 db signal to noise ratio. (For comparison, input quantization to ideal 16 bits is 98 db). coeff.12500 A better set of coefficents. Flat +- 0.009 db to 20KHz, down 123 db by 23KHz, 100 db signal to noise ratio. coeff.21500 A fanatic set of coefficients. Flat to better than +- 0.001 db from 0 to 20 KHz. Down 130 db at 22.05 KHz, wideband S/N >108 db. Compiled programs: 44.1to48 - conversion program for 44.1 to 48 conversion. 48to44.1 - conversion program for other direction. gencos - cosine test data (double precision) full - full size test data (integer) fft - program for looking at results Test cases generated by makefile: 11025at48 11025 Hz signal sampled at 48 KHz. Convert this to 44.1, then test using fft. 12000at48 12000 Hz signal sampled at 48 KHz. Check the noise floor of the FFT with this. 23000at48 23000 Hz signal sampled at 48 KHz. Check the filter rejection. 689at48 689.0625 Hz signal sampled at 48 KHz. Convert this to 44.1, then test using fft. 750at44.1 11025 Hz signal sampled at 48 KHz. Convert this to 48, then test using fft. 750at48 750 Hz signal sampled at 48 KHz. Another test of the FFT. dc dc signal Sanity check full_height 32767 height sin wave, quantized to integers Should (in theory) give 98 db signal to noise ratio from the fft. You use resp tmp fft