
Hackaday Prize Entry: Open Source FFT Spectrum Analyzer

Every machine has its own way of communicating with its operator. Some send status emails, some illuminate, but most of them vibrate and make noise. If it hums happily, that’s usually a good sign, but if it complains loudly, maintenance is overdue. [Ariel Quezada] wants to make sense of machine vibrations and draw conclusions from them about the machine’s overall mechanical condition. With his project, a 3-axis Open Source FFT Spectrum Analyzer, he is not only entering the Hackaday Prize 2016 but also the highly contested field of acoustic defect recognition.

For the hardware side of the spectrum analyzer, [Ariel] equipped an Arduino Nano with an ADXL335 accelerometer, which is able to pick up vibrations within a frequency range of 0 to 1600 Hz on the X and Y axes. A film container, equipped with a strong magnet for easy installation, serves as an enclosure for the sensor. The firmware [Ariel] wrote is an efficient piece of code that samples the analog signals from the accelerometer in a free-running loop at about 5000 Hz. It streams the digitized waveforms to a host computer over the serial port, where they are captured and stored by a Python script for further processing.
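A minimal sketch of what such a free-running sampler could look like (this is an illustration, not [Ariel]’s actual firmware; the pin assignments, baud rate, and framing are assumptions):

```cpp
// Hypothetical free-running sampler: reads the ADXL335 X/Y outputs on A0/A1
// and streams the raw 10-bit readings to the host as small binary frames.
const uint8_t PIN_X = A0;   // assumed wiring
const uint8_t PIN_Y = A1;

void setup() {
  // Assumed baud rate: ~5 kHz x 5-byte frames needs roughly 250 kbit/s on the wire.
  Serial.begin(500000);
}

void loop() {
  uint16_t x = analogRead(PIN_X);         // 10-bit sample, 0..1023
  uint16_t y = analogRead(PIN_Y);
  Serial.write(0xAA);                     // sync byte so the host script can re-frame the stream
  Serial.write((const uint8_t*)&x, 2);    // little-endian 16-bit values
  Serial.write((const uint8_t*)&y, 2);
}
```

In a loop like this the sample rate is set by the ADC conversion time rather than a timer, which is what “free running” means here.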

From there, another Python script filters the captured waveform, applies a window function, calculates the Fourier transform and plots the spectrum as a graph. With the analyzer up and running, [Ariel] went on to test the device on a large bearing of an arbitrary rotating machine he had access to. A series of tests that involved adding eccentric weights to the rotating shaft shows that the analyzer already makes it possible to discriminate between different grades of imbalance.
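For reference, the windowing and transform step amounts to the following (a Hann window is shown as one common choice; the article does not say which window the script actually uses):

$$w[n] = \frac{1}{2}\left(1 - \cos\frac{2\pi n}{N-1}\right), \qquad X[k] = \sum_{n=0}^{N-1} w[n]\,x[n]\,e^{-j 2\pi k n / N},$$

with the magnitude $|X[k]|$ plotted for $0 \le k \le N/2$.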


RADIX-4 FFT (integer math).

Updates on 30 Sept. 2014:

Everything below is still correct and may be worth reading, but new code based on a Split Radix Real FFT has been published. It is faster, has lower memory demands, and versions for both the UNO and the DUE are available as libraries. The new algorithm makes it possible to run FFT_SIZE = 512 on an UNO board in less than 9.6 milliseconds.


Tweaking the FFT code that I’ve published earlier in my series of blog posts, I hit a “stone wall”. There is nothing left to improve in the “musical note recognition” version of the code to make it faster, at least not without switching completely to assembly language, which I’m trying to avoid for now. I’m sure it’s the fastest C algorithm. Looking around, it didn’t take long to find out that there is another option: replace the RADIX-2 algorithm with a higher-order RADIX (4, 8, or a split-radix approach). Putting split-radix aside (could that be my next adventure?), RADIX-4 looks promising, with a theoretical 1/4 reduction in the number of multiplications (which I believe are the “Achilles heel”).
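For reference, that 1/4 figure comes from the standard textbook complex-multiplication counts:

$$M_{\text{radix-2}} = \frac{N}{2}\log_2 N, \qquad M_{\text{radix-4}} = \frac{3N}{4}\log_4 N = \frac{3N}{8}\log_2 N, \qquad \frac{M_{\text{radix-4}}}{M_{\text{radix-2}}} = \frac{3}{4}.$$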

Googling for a while, I couldn’t find a fixed-point version in plain C or C++. There is TI’s “Autoscaling Radix-4 FFT for TMS320C6000” application report, which I find useful, but the problem is that it’s tied to the TI processor’s hardware multiplier, and any attempt to rewrite the code would probably make its performance even worse than RADIX-2. Having “tweaking” experience with the fix_fft source code from http://www.jjj.de/ , I decided to follow the same path as I did before when adapting fix_fft for the Arduino: take the floating-point source, disassemble it into pieces, and then combine all the parts back as fixed-point / integer math components. And you know what? Thank God, I succeeded!!!

I decided not to re-assemble all of the parts, which is why fft_size has to be a power of 4 (16, 64, 256, 1024, etc.). Next, the software is “adjustable” to different levels of optimization. The trade-off is always the same: accuracy against speed. I’d highlight 3 levels at this point:

1. No optimization, all math operations 15-bit. The slowest version. Not tested at all.

2. Compromise version. Switches: 12-bit sine table, regular (long) multiplication right-shifted >>12, half-scaling in sum_dif_I (RSL) >>1. Measured result: 24 milliseconds with N = 256 fft_size. (A sketch of the >>12 multiply follows this list.)

3. Maximum optimization. Switches: 8-bit sine table, macro-assembler multiplication shortcut, no scaling in the core. Timing: 10.1 milliseconds!!!
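A minimal sketch of the level-2 “long multiplication, >>12” idea (my own illustration, not the code behind the link below; the table name and values are placeholders):

```cpp
// Q12 fixed point: sine/cosine twiddles are stored as signed integers scaled
// by 4096, so a 16x16 -> 32-bit product is shifted right by 12 to bring the
// result back to sample scale.
#include <stdint.h>

#define SIN_BITS 12                    // assumed 12-bit sine table (level 2)

static const int16_t sine_q12[4] = {   // tiny stand-in table: sin(0), sin(pi/2), ...
    0, 4096, 0, -4096
};

static inline int16_t mul_q12(int16_t sample, int16_t twiddle)
{
    return (int16_t)(((int32_t)sample * twiddle) >> SIN_BITS);
}
```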

Fastest. The best FFT code ever written for an 8-bit microprocessor. Enjoy the meal: https://docs.google.com/open?id=0Bw4tXXvyWtFVMldRT3NFMGNTZVN0Y0d4eVRsenVZdw

Here is a slightly modified copy, where I moved the sine table from RAM to FLASH memory using the progmem utility. For anyone curious how much slower progmem is compared to accessing data in RAM, here is the answer: 10.16 milliseconds becomes 10.28, or 120 usec slower. Divided over the 84 x 6 = 504 table readings, each progmem access costs 0.24 microseconds, or about 4 CPU cycles.
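A minimal sketch of the PROGMEM technique (my own illustration, not the code behind the link below; the table name and values are placeholders):

```cpp
// Keep the sine table in flash instead of RAM on AVR (ATmega328).
#include <avr/pgmspace.h>
#include <stdint.h>

static const int16_t sine_q12[256] PROGMEM = {
    0, 101, 201, 301   // ...first few Q12 samples; the full table is elided in this sketch
};

static inline int16_t read_sine(uint8_t index)
{
    // pgm_read_word fetches a 16-bit value from flash; the few extra CPU
    // cycles per access add up to the ~120 usec difference measured above.
    return (int16_t)pgm_read_word(&sine_q12[index]);
}
```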

https://docs.google.com/open?id=0Bw4tXXvyWtFVQjZpZkw1c3VUZXlmaF9sOEJwMmpEUQ

Screenshot from the running application: a signal generator running on the computer feeds an audio wave into the OPA (op-amp) and then into analog input 0. Look for the hardware setup configuration in the “color organ” blog post.

BTW, there is one more important thing I missed emphasizing in my short introductory paragraph: the code offers FLEXIBILITY over the SNR. The basic FFT algorithm has an intrinsic “built-in” GAIN: G(in) = FFT_SIZE / 2, where (in) stands for intrinsic. That is the perfect value for fft_size = 64 (Gain = 64 / 2 = 32) and the Arduino (Atmel ATmega328) 10-bit ADC (max value = 1023): the FFT output would be 32 x 1023 = 32736, exactly 15 bits + sign. In other words, scaling in the algorithm core isn’t required at all! That alone improves speed and lowers the rounding-noise error significantly.

At the same time, G(in) grows too high with FFT_SIZE = 256, when G = 256 / 2 = 128 and the output of the FFT would overflow 16-bit integer math. But again, scaling doesn’t have to be 100%, as long as there is a way to keep it in balance with the ADC data. In this particular case, with a 10-bit ADC, we can keep the gain just below 32; it’s not necessary to make it exactly “1”. For a 12-bit ADC the upper G limit would be 8, still not “1”. To manipulate the gain, a division by 2 (>> 1) in “sum_dif_I” can be enabled to prevent overflow with fft_size > 64. The right-shift “gain limiter” produces a square-root adjustment, according to the new formula: G(rsl) = SQRT(FFT_SIZE) / 4, where (rsl) stands for right-shift limiter.

  1.  G = 1 for fft_size = 16,
  2.  G = 2 for fft_size = 64,
  3.  G = 4 for fft_size = 256,
  4.  G = 8 for fft_size = 1024.

Summing up: when using RADIX-4 with the Arduino ADC and FFT_SIZE <= 64, keep the division by 2 (>> 1) in “sum_dif_I” commented out. In any other circumstances (>10-bit external ADC, fft_size > 64), uncomment it.
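A minimal sketch of what that switchable half-scaling could look like inside a butterfly’s sum/difference step (sum_dif_I is the name used above, but this body is my own illustration, not the published code):

```cpp
#include <stdint.h>

// Uncomment for a >10-bit external ADC or fft_size > 64 to prevent 16-bit
// overflow; leave commented out for the Arduino 10-bit ADC with fft_size <= 64.
//#define HALF_SCALE

static inline void sum_dif_I(int16_t a, int16_t b, int16_t *sum, int16_t *dif)
{
#ifdef HALF_SCALE
    *sum = (int16_t)(((int32_t)a + b) >> 1);   // right-shift "gain limiter" described above
    *dif = (int16_t)(((int32_t)a - b) >> 1);
#else
    *sum = (int16_t)(a + b);                   // no scaling: full intrinsic gain, G(in) = FFT_SIZE / 2
    *dif = (int16_t)(a - b);
#endif
}
```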

To be continued…

