Posts with «download» label

Speech / Voice Recognition – remix.

It’s about right time to release one more “remix” for one of my blog, published almost a year ago. I haven’t got much comments, not many as I was expecting, on a topic. There are some reasons, that would explain this phenomenon, but I would better to start outlining what I did in new release, and some of you who tried old version would be impressed by progress I’ve made!

Basic structure was left almost intact. Essential parts of the project: Filtering 2D and Cross-Correlation are the same, so please read my old post, if you come across this one w/o seen it first. What differs is “preprocessing”, before we get to the filtering stage. In first, I ported a code on Leonardo board.  The easiness of connecting electret mic to Leonardo, just didn’t give me a choice!  I played already with Leonardo ADC – electret mic’s in my previous post, and would assure you, that this guys were designed to work as a team. Uno followers have to solder a pre-amplifier, not big deal, the same time not really interesting.  Code would run on Uno, except Timer and ADC settings, which you could always “copy/paste” from the old version. Feel free, be my guest.

Analog front-end is absolutely the same as I used in Sound Localization project. You would need only one mic here, just reduce the number of electrical components down.

Sampling subroutine based on Timer 4, and arduino Leonardo internal PGA set to gain x40. There are a comments in the code, so you could adjust gain up or down depends on the sensitivity of mic you have.

Windowing LUT is slightly modified Hamming/Hann cosine function,  I’d say my table is an “intermediate” version of both mention inventors.

FFT is my best achievements of this year, RADIX-4.  Compare to old code, about 3x faster. This is why I was able to increase FFT_SIZE up to 128,  still having plenty of time.  Magnitude calculation is based on approximation, very fast, because no square root extraction required. Accuracy, in the worst case scenario ~95%, which is more than enough in this project.

I changed Non-Linear compression algorithm, as now there are more Bins – 64 to pack in 16 Bands. Math is simple, and hope doesn’t need an extensive comments. Packing is necessary due memory limits, 1 sec password in current configuration (16 bands with sampling rate 4 kHz) occupied 1 kByte, full size of EEPROM on Leonardo or Uno boards.

Command Line Interface is preserved. Here is the instructions, how to set everything up and running.

And here how spectrogram looks like in LibreOffice, test phrase “Front right”, (OS Linux Ubuntu, 12.04):

                   

Instructions.

 Get your microphone wired / connected? All checked at least a couple times with multimeter, voltages looks o’k? Good, now you are ready to start! ( Don’t forget to upload a sketch to your arduino / Leonardo -);

1. X. First of all, open serial monitor window, and check boadrate.
Should be 115200. Next, type “x” and Enter. Get some response? Does
it look like a table? Excellent, data you are looking at is “raw” 
sampling data. Probably, just noise, acoustical or electrical. 

2. F. Second test, type “f” - Enter. Again, arduino would print out 
a table. This time data represent “processed” by FFT analog signal. 
Each bin corresponds to frequency range 32 Hz. If you have a signal
generator , it's right time to do a detailed check up microphone, 
wiring, and software. If don't have, you can use your computer's 
sound card and some program, there are plenty of them available 
on-line free of charge. Connect generator to PC USB speackers, and
run a single tone, anything in audible range 16 – 2000 Hz. Check 
again using “f” command, if arduino registers a signal, and it's in
right bin. Due “windowing” function, even pure single tone would 
show up at least in 2-3 neighboring bins. Amplitude depends on 
sensitivity of the mic, and volume of the sound. Sending “x” you can 
confirm, that there is no “clipping” too many “– 511” or “+511”.

3. S. Next, if all goes well till this step, you probably already
notice, that whenever a mic picking-up a sound – yellow on-board LED
lights up. Now, it's better shutdown your TV, iPad, radio. 
Make your environment as quite as possible. Try to say something 
with your own voice, and see if led lighting up when you start 
talking, and than after ~1 sec it goes off. Repeat a few times, 
adjusting a volume and / or distance to the mic, so led goes “on/off”
reliable. Send “s” command when led is off. Don't worry, it should 
be no report on screen, till you say a word. Procedure is simple, 
send “s” - say something. After you talk, and get a printing, look 
carefully. Your objective, is get a “spectrogram” which consist of 
a few spots / blobs of digits, randomly distributed all over the 
“surface”. Repeat a few times with one word, than try another one 
and so on. With short words, very likely reporting data would be 
concentrated at the top ( you may need to scroll up to see the 
beginning), if this is a case, better to choose longer words or talk
in slower tempo. Just remember, there are three dimension, time, 
frequency and volume. You can use a signal generator here again, 
spectrogram should looks like a vertical line, may be 2 - 3 parallel
lines on low frequencies test tones. 
 Changing a frequency, you can even get some curves. What is 
important, the must be no negative numbers. If you see them – 
"overloading" happens, Decrease a volume. Dynamic range is limited 
by 127, more close you can get to this value w/o negatives, 
the better. And last dimension is a frequency. For man, it would 
be a little bit hard to “detach” a spectrogram from the left border.
It's true for everyone, who is not opera singer, me no exceptions.... 

4. R and G. After you practice enough, and received Manny nice 
looking spectrograms, it's time to check with arduino, if it's 
agree with you / thinks similar. Send “r” - recording, and say 
a word you've get your best spectrum with. Wait a few seconds, 
writing to EEPROM takes time. Now send a “g” and say the same word.
In the same manner, tonality and volume. See on the outputs, what 
is the cross-correlation factor you received. More than 50% - very 
good for beginning. Less – try again a few more times sending “g” and
repeating same word, maybe slightly varying pronunciation. Try to 
reach best recognition, “crack” your own password code! 
( note: Negatives number on “G” reports-form must be present.)
If no luck try another password code. Now you know the drill: 
S – repeat, repeat, repeat....., R, G – repeat, repeat, repeat..... 
(joke). Can't get good match? Try your computer's test sounds, 
beeps, horns, clicks, barks – whatever your OS has. It doesn't have
to be shorter than 1 sec, but only 1 second in the beginning 
would be stored / compared. My computer is able to repeat the same 
sound track (speakers test - "front right") with enormously high 
cross-factor 99 % ! I'm not a computer, my best short 86 % so far...

5. P. This command is simply reading the content of the EEPROM, 
so you always can verify, what you stored last time. 
Editing / formatting this data you can store a table in the arduino 
FLASH memory using PROGMEM. About 10 – 15 commands. Of course, 
storing data in external SD card or EEPROM, could greatly increase
the “vocabulary”, the same time design of fast cross-correlation 
algorithm with multiple “pattern”s would be another brain teasing
puzzle -);.

Have fun!

Link to arduino sketch, VOR (VOice Recognition).

 


Sound Localization.

Well, it’s elementary simple in theory, how to do sound localization based on phase difference of signals, that received by two spatially distant microphones. The devil, as always, in details. I’ve not seen any such project created for arduino, and get curious if it’s possible at all. Long story short, here I’d like to present my project, which answer this question  - YES!

Moreover, quantity  of electronics components not much differs from what I’ve used in my previous blog.  Compare two drawings, you will notice only 4 resistors and 4 electret microphones were added! All circuitry is just a few capacitors, 9 resistors, one IC and mics.  Frankly speaking, writing a remix of oscilloscope, I was testing  an arduino analog inputs, keeping in mind to use it in junction with electret microphones in other projects, like sound pressure measurements (dBA),  voice recognition or something funny in “color music” series. As they call it – “a pilot” project?.  There are some issue (simplest ever) oscilloscope has when doing fast rate sampling on 4 channels (settings 7, 8 and 9 Time/Div ) I already described, so I slightly reduce sampling down to 40 kHz here.



Note: *Hardware would be different for arduino boards based on different chips, and must include pre-amplifiers with AtMega328 uCPU. 

One more important things to mention in this short introductory, as I used FFT algorithm for phase calculation ( I like FFT very much, you probably, already notice it ),

Arduino is capable not only track a MOSQUITO flying in your room, it could tell if it’s MALE of FEMALE !!!!!

                  SOFTWARE.

  As I say above, I choose 40 kHz for sampling rate, which is a good compromise between accuracy of the readings  and maximum audio frequency, that Localizator could hear. Getting signals from two mic’s simultaneously, upper limits for audio data is 10 kHz. No real-time, “conveyor belt” include 4 major separate stages:

  • sampling X dimension;
  • FFT
  • phase calculation
  • delay time extracting
  • sampling Y dimension;
  • FFT
  • phase calculation
  • delay time extracting

4 mic’s split in 2 groups for X and Y coordinate consequently. Picking up 4 mic’s simultaneously is possible, but would reduce audio range down to 5 kHz, so I decided to process two dimension (horizontal and vertical planes)  separately, in series. Removing vertical tracking from the code, if it’s not necessary, would increase speed and accuracy in leftover plane, of course. I’d refer you for description of the first and second stages to other blogs, FFT was brought w/o any modification at all. Essential and most important part of this project, stages 3 and 4.

Phase Calculation (3).

 Mathematical tutorial on a topic, I’m not any good as a teacher, so you better read somewhere else, to brush up a basic concept. Core of the process is arctangent function. This link says a number of cycles. In two words – too slow.  LUT ( Look Up Tables ) is the best solution for no-float uCPU to do complex math extremely fast, and reasonably (?) precise. Drawback of LUT is limited size, so it could be saved in FLASH memory, which in next tern  is also limited. This is what I did on “resource management” side: 1 kWords ( 16-bit integers, 2 kBytes) , 32 x 32 ( 5 x 5 bites) LUT, scaled up to 512 to get better “integer” resolution. There are a few values in top-right corner, that melted together as their differences are less than “1″ (not shown on the picture on right side). The “worst” resolution is in top-left corner, where “granularity” is reaching 256, or unacceptable 50% of the dynamic range. To stay as far away from this corner, I put a “Rainbow Noise Canceler” – single line with ” IF ” statement, which “disqualifies” any BIN with magnitude, calculated at the FFT stage, lower than 256.

IF(((sina * sina) + (cosina * cosina)) < 256) phase = -1;

 I called it “Rainbow” because of it’s shape, “red line” is an arc, going from 16 on top line to 16 on left side. Also, “Gain Reset” – 6 bit ( depends on the FFT size, has to be 6 bits for 128) reduced to 5 bits, in order to get better sensitivity. This two parameters / settings, 5-bit and 3.5 bit magnitude limit, create a “threshold” for weak spectral peaks. Basically, depends on application, both values can be adjusted in  different proportions.

 There are two category of tracking technics, with mic’s installed on moving platform, and stationary mic’s. First one is a little bit easier  to understand and build, requires Relative direction to sound source. This what I’ve done. Stationary mic’s approach, when motors are moving laser pointer (or filming camera) alone, would require Absolute direction to sound source, and must include stage #5 – angle calculation via known delay time. Math is pretty simple, acrsine function, and at this point only one calculation per several frames would be necessary, so floating point math wouldn’t be an issue at all. No LUT, scaling, rounding/truncation. Elementary school geometry knowledge – thats all you need.

Delay Time Extraction (4).

 Subtraction phase value of one “qualified” mic’s data pull from another, produce phase difference. To turn phase difference in delay time, division by BIN number is performed. Lets call this operation “Denominator” process.  The denomination is necessary, because all data after this step going to be combine and process together, doesn’t matter of wave length, which is different for every bin. Frequency and wavelength related to each other via simple formula:  Wavelength = Velocity / Frequency, where velocity is a speed of sound wave in the air ( 340 m/sec at room temperature). As distance between two mic’s is a constant,  sound with different wavelength ( frequency ) produce different phase offset, and denomination make them proportional. (WikiPedia, I’m sure, would explain this much better, mind you, I’m a Magician, not mathematician).

First picter on right side shows  ”Nuisance 3: Incorrect arctan” correction. You will find two lines with “IF” statements in the code relaited to stage #3.

Second one,  gives you idea why other correction at stage #4 is necessary As you can see, subtraction one arctan from another generates a rectangular “pulse” ( Diff. n. corr., violet line) whenever one function changes sign but other (delayed version) not yet. Light blue line (DIFF(B)) doesn’t have such abnormality. Math is simple, just two lines with “IF’s” in the same manner, only “double size” constants this time. 2048 on my scale corresponds to 2 x PI, 1024 – PI, and 512 – PI / 2.

Arduino has only 1 ADC, so there is always constant delay time equals to one sampling period ( T = 1/40 kHz = 25 usec), which also should be subtracted ( or added, depends how you associate input 1 and 2 – left / right side mic.)

Filtering.

 To fight reverberation and noise, I choose a Low Pass Filter, which I’d call here as a “Rolling Filter”. My research with regular LPF, shows that this class of filters is completely NOT appropriate for such type of data, due their high susceptibility to “spikes”, or sudden jump in magnitude level. For example, when system getting steady reading from 2-3 test frequencies with low values, let say -10, simple averaging ( should be -10 ) results will be corrupted with one accidental spike (magnitude +2000) during next 60 – 100 consecutive frames !!!  The Median Filter doing well eliminating sudden spikes, the same time is very hungry to CPU cycles, as it’s using “sort” algorithm each time new sample was arrived to the data pull. Having 64 frequencies, and setting filter kernel to 5 – 8 samples, arduino would be buried doing sorting at almost 40 ksps.  Even processing each frequencies data not individually,  and sorting only one 64 elements array still very time consuming job.  After thinking a while, I came up to conclusion, that “Rolling Filter” has almost the same efficiency as Median, but instead of “sorting” requires only 1 additive operation! On long run, the output value will “roll” and “stick” to the middle of the pull. ( Try to model it in LibreOffice. )  Adjusting “step” of the “Rolling Filter”, you can easy manipulate responsiveness,  which is almost impossible with Median Filters. (Things TO DO: Adaptive Filtering, real time adjustment depends on input data “quality”).

 To be continue…. Video will follows !

( Predicting your question, how “good” is localization?  Its about same, as Laser TRF (tracking range finder) has, look at other blogs for now to get impression.  In other words: ASTONISHINGLY GOOD,   a few (1 – 5) angular degree in closed environment.

Link to Arduino (Leonardo) sketch:  Localizator-beta-9.


Audio VU meter (AC microVoltmeter) with Extra wide Dynamic Range 69 dB.

O’K, after having some fun with stereo version of the VU meter I described in my previous blog-post, now it’s time to do a serious stuff. Studio grade VU meter !!! 24 steps, equally spaced every 3 dB, covering Extra wide Dynamic Range from -63  up to  +6 dB.  Single (mono) channel this time, no messing around, absolute precision at the stake. Plus, it keeps absolutely Top-Flat linear frequency response from 40 Hz up to 20 kHz(*).

 

 

I’m not going into details of RGB LEDs Display, which has no modification since “Tears of Rainbow” project, only plates installed in one line, form a single GIGANTIC bar-graph. There are some minor changes in mixing colors data tables, but they intuitively understandable.  The most important feature in this project is autoscaling. As you, probably know, Arduino has 10 bits ADC. Only it can’t process negative half-wave, and for this reason it has only 9 bits available for AC measurements.  According to DSP theory, maximum dynamic range is:

DR = 1.77 + 6.02 x B = 1.77 + 6.02 x 9 = 55.95 dB.

 As input audio waveform represents anything but perfect peak-to-peak 5V sine-wave, real dynamic range would be lower. How much? In first, there is a hardware limits.  OPA (NE5532), which is:

  • very low noise !!!
  •  high output-drive capability;
  •  high unity-gain and maximum-output-swing bandwidths;
  •  low distortion;
  •  high slew rate;
  •  input-protection diodes, and output short-circuit protection

 but, unfortunately,  isn’t rail-to-rail type. Test results show, that compression  become noticeable (~1 dB) when not scaled magnitude approaches level about 50 dB. That is in good agreement with observed on oscilloscope not distorted deviation peak-to-peak 2.5 V. Or only half of full range of 5V. And as theory says, half is one bit less, and real DR = 1.77 + 6.02 x 8 = 49.93  (~50 dB). In second, audio data is processed on “block” structure basis. It means, having average of the block 50 dB, doesn’t mean that there was no spikes in the sampling pull, that obviously would be clipped and introduce error in the measurements results.  This phenomenon is defined as Crest Factor. Different sources estimate crest factor of musical content between 10 – 20 dB.  So, taking direct approach, Arduino with OPA mentioned above as front-end could accurately cover only:              50 – 20 = 30 dB.  To get wider dynamic range, I have to scale input amplifier gain, and this is exactly what I did, building amplifier in two stages and selecting one cascade (by-passing second one) or two cascades using internal ADC multiplexer. As there is no switching IC in analog signal path involved, gain is defined with high stability, could be one time precisely measured – calibrated via coefficient stored in EEPROM (nice feature to add).

On the right side there are electrical drawings of “slightly” modified kit,  where stereo amplifier was converted into 2 stage mono version. First stage, with gain about  G1 = 1 + 10 k / 1 k = 11  is necessary to “bump-up” line-level signal, to create DC bias required for correct operation of the ADC, and also served as buffer to lower signal source impedance, as it seen by ADC input.  I set a gain of the second stage amplifier at 40 dB:  20 x Log_10 ( G2 ),     where    G2 = 1 + 100 k / 1 k = 101.

IMHO, setting gain limit for only 30 db per stage as it follows from paragraph above, is overkill, and would be justified for “real-time” radio broadcasting or audio processing for storage media, when high fidelity of audio program must be preserved. For visual display “clipping” of bursts in signal is not noticeable at all due high refresh rate of display, 78 Hz. Human just can’t see, if LED lights-up with such speed.  For steady AC amplitude measurements (micro Voltmeter mode) this is not a problem at all, and headroom as small as 3 dB would be sufficient, leaving wide 47 dB per stage.

 Software

  There are two thresholds are defined in program, where switching between one or two stage amplification is happening:

      if ( magn_new <=  44 ) sensitv = 1;

      if ( magn_new >= 47 ) sensitv = 0;

  44 and 47, with hysteresis 3 dB. First line defines switching to high sensitive mode (overall gain 1100), and second line, does exactly opposite. Look at the chart, hope it would save me a million words -);

 Couple words on using this device as precise AC micro-voltmeter. Having 1100 overall amplification as add-up to already quite sensitive Arduino ADC, driving overall sensitivity to enormously  5 / ( 1024 x 1100 ) = 4.439 uV Special care should be taken on grounding, shielding of amplifier PCB, probably, EMI suppressor ferrite chokes wouldn’t be an excess in power line and signal path.   In my project, w/o any modification to original kit’s board (except couple jumper wires to cascade two stage amplifier) of course, I was not expecting to get to such high sensitivity level. Moreover, in project arduino is driving LED display, “ADC noise reduction mode” is off, plus ADC is working on double speed – preselector set to 250 kHz!!!  And this is why constant 14 was subtracted in software from magn_new, just before it goes for BarGraph “mapping” procedure:

      magn_new  -= 14;

Basically 14 is a noise flour of my analog front-end.  Approximately 51 micro volts AC is turning on first LED bar. Look at the table, which reflect my current hardware set-up.

* Other things to keep in mind, there is a “gap” 78 Hz wide in frequency range at 10 kHz,  It introduces a small error, about  78 / 20.000 = 0.39% in white noise measurements result. For musical content, which has really low power density level at 10 kHz, magnitude of error would be much lower, probably, less than 0.05 %.

 Running FFT in code creates great opportunity to reject any interference in the audio band. For example, if there is a noticeable hum from electrical grid lines in the content, issue easily could be fixed NOT including bin[1] in final sum of magnitude calculation. Though to make it works more efficient, some adjustment in sampling period would be necessary, setting bin[1] frequency precisely at 50/60 Hz.

 One more advantage of having FFT based  filtering     (primary mission is HPF, look in stereo VU meter, how long kernel of the FIR filter has to be otherwise), is great opportunity to create “weighting” A, B, C or D curve for audio noise measurements. (:TO DO).

 Link to Download Arduino sketch:  Audio_VU_Meter_Mono_69dB


**********Stereo Audio VU meter on Arduino**********

This blog is a sequel of “Tears of Rainbow”.  Using the same hardware set-up of Gigantic RGB LED display, I decided to re-work software a little bit, in order to display the true RMS amplitude of musical content. Video clip on youtube:                       VU_Meter   640×480                                      VU_Meter_HD

Objective:

  • Stereo input, process both channel;
  • Full audio band, 40 Hz – 20 kHz;
  • Fast update rate of visual output.
  • Precision Full-Wave  measurements. 

To process stereo input, this time arduino is switching ADC multiplexer every time when it finish sampling input data array (size=128). Two channels “interleaved” with frame rate 78 Hz, so during each frame only one channel sampled / processed, and update rate per channel is equals to 78 / 2 = 39 Hz, which is more than enough for most audio applications.

 I’m using FFT Radix-4  to extract RMS magnitude of audio waveform, and this is why:

1.  Sampling rate in this application is 10 kHz. How I achieved  20 kHz stated in objective section, doing sampling only 10 ksps?  >>>Aliasing!<<<   What is considered to be nightmare when we need spectral information from FFT output, aliasing in this project is really helpful, reflecting all spectral components around  axis – 10 kHz back “to the field”. As all bins going to be sum-up there is no issue, only benefits. Due aliasing, I’m able to use low sampling rate, and reduce CPU workload down to 52%.

2.  In order to get accurate magnitude calculation of RMS,  which is defined as square root of the sum of squares divided by number of samples per specified period of time:    V(rms) = √ ( ∑ Vi ^2 ) / N) DC offset  must be subtracted from the input raw data of each sample    Vi = Vac + Vdc   (if you remember, AtMega328 ADC needs DC offset to read AC negative half-wave).  The problem here, DC offset value is never known with high accuracy due bunch of reason, like voltage stability of PSU,  thermal effects, resistors tolerance (+/- 1 or 5 %), ADC internal non-linearity etc. Cure for this, which works quite well for monitoring electrical grid power, high pass filter (HPF). Only instead of single 50/60 Hz frequency of power line,  I have a wide frequency range, starting from 20 Hz and ending at 20 kHz. When I feed specification of the HPF:

  • Sample Rate (Hz) ? [0 to 20000]                     ? 10000
  • Desired stop-band attenuation (dB) [10 to 200] ? 40
  • Stop-band edge frequency Fa [0 to 5000]         ? 0
  • Pass-band edge frequency Fp [0 to 5000]        ? 40

to  Parks-McClellan FIR filter design algorithm (one of the most popular, and probably, the best) it provides the result:

  • …filter length: 551 …beta: 3.395321

551 coefficient to be multiplied and sum up (MAC-ed) every 100 usec! No way. I’m not sure, if it could be done on 32-bits 100 MHz platform with build-in MAC hardware, but there is no way for 8-bit 16 MHz Arduino.

IIR filter wouldn’t make much difference here. It has lower quantity of multiplications, but more sensitive for truncation and rounding error, so I’d have to use longer (32-bits?) variables, which is not desirable on 8-bit microprocessor at all.

And here comes FFT Radix-4, which easily fulfill this extra-tough requirements in the most efficient and elegant way. All I have to do, is just NOT include bin[0] in final sum, and all DONE!. TOP-FLAT  linear frequency response  40 Hz – 20 kHz  ( -3 dB ), with complete suppression of DC, and low frequency rumble below 20 Hz attenuation.  Linearity is better than +-1 dB between 80 – 9960 Hz.

Last things, audio front-end. As VU meter was designed in stereo version, I’ve build another “line-in”  pre-amplifier based on this kit: Super Ear Amplifier Kit

Link to Download a sketch:  Stereo_VU_Meter.

 

Modified Stereo VU meter, Logarithmic scale, 8 bars per channel, spacing 6 dB.

Dynamic range: 8 x 6 = 48 dB.  Stereo_VU_Meter(Log10).
 Next blog:   Extending dynamic range to 72 dB! 

Tears of Rainbow.

Video clips on youtube, arduino is running simple demo application.

Tears of Rainbow                             BarGraph HD movie                                    " href="//www.youtube.com/watch?v=30ELYwyy4JQ&feature=youtu.be]" target="_blank">BarGraph movie.

 

It’s time to release new updates for my first (ever) project with Arduino, “Color Light Music”.  From artistic perspective, VU BarGraph style (IMHO) is the best one for spectral dynamic representation, and not much could be improved on this side. But this time, it cross my mind an another idea “Tears of Rainbow”. This blog about how successively (or awfully) the idea was brought to life. And of course, VU visual effects still there, updated with nice peak indicators, color adjustment flexibility (this time triple color LEDs), and PWM-ed brightness settings luxury.  So, this is design requirements, I was following:

  •   make it as big as possible, GIGANTIC size !;
  •   Lego style, or many blocks / modules, which could be re-arranged in different pattern;
  •   extend-able,  easy to add up more blocks later on;
  •   low price on hardware, no special display driver IC.
To simplify assembly work, I decided to buy RGB Led Strip. I had known, from my first project, that design would be composed with straight lines, and the longer lines means the more LED’s ( and consequently, soldering work). For comparison, one line on this display consist of 6 RGB leds, or 24 soldering connections. Using RGB strip, I reduce a workload 24 to 4, or 6 times. I envy to people,  who have a patience to build  8x8x8 RGB led cube (or even 10^3 !).   Addressable RGB strip would make life even easier,   but  I couldn’t find local re-seller,  and was not going to wait shipment / customs. It’s summer time!
 In order to easy reconfigure a style, for example, from  3 BarGraphs, needed in Color Music exposition, to just  1 GIGANTIC VU meter (*),  RGB led strip is chopped-up and attached to 3 rectangular shape plates. I find out, that for some reason strip isn’t “sticky” enough, and to keep its perfectly align on a plate, I used a tire-ups at both ends. Luckily, it was quite easy to punch a holes in the plates for tire-ups just using kitchen knife.
 It wouldn’t be so, if I use a glass as a back-plate
(I had such idea initially). Something to think, if you plan to work with a strip in your design. The same also true for wiring (32 wires per plate). Tin “cookie” plates just was made to be part of this project!  And I even did not mention the heat dissipation,  1/3 of 5 meters strip consume around 12 W of power,  it’s almost like my soldering iron!
 One more things before I forget, I installed 1 cm paper pads to insulate contacts from the metal plate in the middle and on one side. Heat shrink tube takes care of the other end.
 LED’s use 12V as power source, and as I need a lot of  PWM channels to control their brightness , here comes 74HC595 buffered by ULN2803 at the outputs. Nothing special, 9 shift registers daisy chained to produce 72 PWM outputs. Two IC in a pair installed in reverse on a prototype board, to minimize a number of interconnections. As you can see from the  picture, there is only 1! yellow jumper brought from pin 15 of the shift register to pin 8 of the Darlington array. Why they don’t make a shift register in DIP-16 package? There wouldn’t be any jumpers at all!  Other alternative is using TPIC6B595.
        * For clarity, schematic diagram shows only two pairs of chip, and half length of the strip lines.

Now software part.

There are on-line libraries available, to drive 74HC595 by arduino. Only some of them not using hardware “build-in”  SPI interface , and really slow in communication with peripheral IC’s (don’t forget, that LED display only second part of the project, the first one, FFT, is very time consuming). The others libraries, nicely written and perfectly optimized for speed, have too much functionality, that I don’t need in my project, plus they are memory demanding. On the other hand, I need low resolution animation function – sliding down colorful tears, that I have to create on my own.  Now I ‘d like to represent a code, very fast SPI subroutines, completely written in C !  Function shift out 9 bytes ( for 9 shift registers in this project ) approximately in less than 36 usec, or 0.5 usec per one PWM channel. One bit-set in the unrolled loop is about 4.5 cycles.

static uint8_t brightn;

brightn++;
 if(( brightn % QUANTUMS ) == 0)
{
bitClear(PORTB,LATCH_PB);
SPDR = 0;

uint8_t * srP = &brightns[PIN_NBRS];
uint8_t cmp_level = brightn;

for (int8_t iSR = 0, curBt = 0; iSR < IC_COUNT ; iSR++, curBt = 0){

if ((* –srP) > cmp_level)
curBt |= 0b00000001;
if ((* –srP) > cmp_level)
curBt |= 0b00000010;
if ((* –srP) > cmp_level)
curBt |= 0b00000100;
if ((* –srP) > cmp_level)
curBt |= 0b00001000;
if ((* –srP) > cmp_level)
curBt |= 0b00010000;
if ((* –srP) > cmp_level)
curBt |= 0b00100000;
if ((* –srP) > cmp_level)
curBt |= 0b01000000;
if ((* –srP) > cmp_level)
curBt |= 0b10000000;

loop_until_bit_is_set(SPSR, SPIF);
SPDR = curBt; // Start the transmission
}
loop_until_bit_is_set(SPSR, SPIF);
bitSet(PORTB,LATCH_PB);
}

FFT part of the code completely “copy / pasted” form my “Radix-4″ blog. Here an advise, if you wish to explore the code, look there for “pure” form of function. What is new in this publication, is magnitude calculation subroutine, without slow SQRT.

Bar Graphs “set position” sub-function, or mapping height of lighted area of a plate to integral sum of the bins, brought into this project with mild modification from first project.

Continue moving from LED’s display to audio input, I should say couple words on a sampling. There are two functions in the project, that have to be triggered periodically with a timer, “display refresh” posted above and “take ADC sample”. It looks logically, instead of having two timers and have a lot of troubles with collision / racing between them, to scale both function to the same time frame, and execute them at once. “Display refresh” rate equals to minimum rate just to avoid flickering (60/70 Hz) multiplied by the numbers of brightness level. For example, setting brightness step number to 256 ( which provides excellent 256 x 256 x 256 = 16 M colors ) would require periodicity 60 x 256 = 15360 Hz.  See, where I’m driving at? Exactly, 15 kHz is nice frequency to sample audio input!. Well, it’s not 44.1 kHz as default settings Hi-Fi audio standard would recommend, but I ‘m not using all sound data in this project, as I only interested in lower 2 kHz part of the spectrum. And BTW, it’s almost 4x times higher than bear minimum prescribed by sampling theorem (Whittaker–Shannon–Kotelnikov).  I’ve made my choice at 15.625 kHz, to simplify math of binary compare to 64. ( 1/64 usec = 15.6 kHz) If there is no big difference, why not pick up “lucky” binary number, and help a Timer to do his job?

Initially, I thought that I would just re-use sampling sub-function  from “Pitch Shifting” project, slightly adjusting it from 8.0  to 15.6 kHz. I was surprised to discover, that TIMER2 and SPI don’t want to work together! Have I missed something in a data sheet? Could be, sometimes it’s so hard to comprehend, that I’d be still experimenting with “Blinking Led”, if not help from this (must to have) masterpiece:

AVR Microcontroller and Embedded Systems: Using Assembly and C (Pearson Custom Electronics Technology)

O’K, there is TIMER1 available. As project already have been heavily “over-loaded” on software side, I decided to take TimerOne library and not bother myself this time with a bunch of registers, interruptions, masks, etc, leaving this out of the scope, as not related to subject.

FFT size =128 provides extra fine resolution for Color Music performance. BTW, may be it not obvious, but bigger size of FFT has LOWER CPU workload per sample. And last, after everything was melted in one  (BIG) sketch nothing happened. No, my arduino can’t catch up at 15.6 kHz. Shifting 9 bytes via SPI, as I mention earlier, takes 36 microseconds. It’s leaving 64 – 36 = 28 microseconds per sample for everything else, or 28 x 128 = 3 584 per frame.  Radix-4 (size = 128) takes 4.2 milliseconds, as I posted here.  Alright, hell with it, who need 16 M colors  on 8 x 3 led display, by the way?  So, I bring quantity of brightness steps down to:             256 / 4 = 64, which is more than enough -> 262 144 color combinations!  QUANTUMS definition sets coefficient 4 in SPI sub-function.  The same time frame rate equals to  122 Hz, which is 2x times higher, than 60 Hz I started my calculation with.

Default color map, or bin’s assignment to a specific plate, is shown above. This time I implemented a command in CLI to make adjustment in this map on the fly, according to music style, equalizing all three bars more or less proportionally. Automatic Gain Control loop implemented in first project, doesn’t work so great with bigger display size ( first project uses 4  lines per color ). Plus, AGC bringing noise in the visual performance in  pauses between  two songs and in quite fragments of the music.  Starting bin position for each RGB plate could not be changed using CLI  ( you still can do this modifying the scetch), but quantity of bins accumulated per plate could be adjusted simply sending “dr” for red, “dg” – for green, and “db” - for blue, where  d is a digit 0..9.  Bands could over-lap, which is not desirable, in this case red is limited too 0..3, green 0..6, and blue 0..9.

More on the audio input hardware and sampling software subroutine, I post in separate blog, as this part follows w/o much modification thorough a few previous blogs, and doesn’t need to be re-stated here.

*Note: in software G.VU is not implemented yet.

Link to download a sketch:  Tears_of_Rainbow.


Voice Pitch Shifting – Scrambler.

After I’ve made astonishing breakthrough in speed of the FFT algorithm based on RADIX-4 version, I decided to create a new project, which would take a full advantage of Radix-4 “rocket science” performance. Reviewing my “Voice Recognition” blog it looks logically to create just exactly opposite application – Voice Scrambler. To make it happened, I need FFT subroutine to be completed twice, in forward and reverse ( iFFT ) direction. Than simply manipulating individual frequencies (bins) position in the array, I could scramble a voice in well known old fashioned manner, inverting the spectrum of the human voice, making it sounds completely non intelligible (alien’s voice).  For Pitch Shifting, bins have to be “progressively” spaced between each other, driving timbre up on the scale. It is possible to lower a Pitch as well, shrinking and over-lap bins, but I made only up shifting part for now.

Making preliminary calculation, I get:   10.1 milliseconds x 2 / 256 samples = 78.9 microseconds / sample.  Or turning up side down,  12.67 kHz sampling frequency. It’s rough estimation, but regular “public phone quality” – 8 kHz sampling rate looks quite achievable.  Next, as always, if CPU is o’k, let check into memory management. Unfortunately, arduino Uno (2 kB) wouldn’t allow to use fft-256, because:  input buffer (256) + output (256) +  fft processing ( real + imaginary) (512), plus multiply by 2 (integer size, 1024 x 2 = 2048) would occupy all available 2 kB. So, I have to decrease fft size on one level down, to fft-128, or even two level down – make fft-64, from original fft-256 (presumably good quality, comparable with MP3 codec).  After completing a couple sketchy tests, fft-64 shows pure quality of speech and was rejected. Can’t say I did thoroughful  research on this matter, may be fft-64 could still be good in other circumstances or with musical material content instead of speech.

O’k, fft-128 – compromise version was selected by God. But my code, published in “RADIX-4″ blog hasn’t included RADIX-8 section, which is required when size of the array isn’t a power of 4. Nothing to do, the only way to bring  Scrambler project to life, is  to write a missing section of the RADIX-8 code… So I did. Timing measurements show, that Radix-4 with new patch is running even faster, than extrapolated from fft-256 down to fft-128 speed of RADIX-4 without it, measurements result: 4.2 milliseconds (compare to extrapolated 4.6 milliseconds). By the way, as I mention in other post about Split-Radix, if it would be my next adventure. It was, I have re-written from “Matters Computational” http://www.jjj.de/  Split-Radix C/C++ code in my tools-box now, which is to my surprise shows practically no difference in speed with Radix-4 algorithm, same  10.1 (fft-256) milliseconds execution time, and loosing ! competition giving 4.6 milliseconds (fft-128).

 
Other things to explain, as Pitch Shifting considered to be most complicated task even for monstrous DSP processors, I’ve skipped two important procedures: windowing on input samples and add-overlap on the output flow, just because no processor’s time left. Luckily, due speech’s spectrum concentration around most noticeable “middle” area, cutting off “windowing” is only slightly increases noise level. On the other hand, missing add-overlap procedure has greater negative impact on the voice quality,  ”robotize” it via parasitic amplitude modulation. When frequency bins re-mapped in the pull, running inverse iFFT doesn’t produce continuity  ”match” with previous block as it should, because input has no a continuity disruption between samples blocks (128 samples, ~16 milliseconds of voice frame).  Well, if I do everything right, there wouldn’t be any fun with arduino, would it be?  Btw, there are two outputs buffer in the software, especially to track this issue. I was thinking to implement some kind of distortion estimator based on this information. Meanwhile, one of them could be removed to save some memory for other part of the program.

Hardware.

Arduino has 10 bits ADC, this is why I decided to build a DAC for audio output based on PWM TIMER0 feature, 5 to 5 bits split equally between pin 5 and 6. One PWM pin could only play 7 bits sound (don’t forget sign bit), which is too low. Weighted 32R-R ladder potentially allows to increase PWM frequency up to 250 kHz or so, and simplify requirements for output filter design. Nevertheless, I left “default” 31 kHz settings for TIMER0, just in case I need more research with 16 bits output later on, All it would takes to switch on “full” 16 bits, just add one 256 k resistor and divide integer in  high and low byte.

Please, be advised, that included scrambler function was not fully tested, as I don’t have second arduino to do a decoding in real time-);  (Probably, I can record a scrambled sound and decode it in the second passage trough the system – things TO DO.)  The same time Pitch Shifting was successfully tested, as you can see in posted video clip. YouTube Video

Summary:

  • Sampling rate 8 kHz. (easily to adjust by TIMER2 variable.)
  • Output rate 8 kHz. (same as sampling, two processes are synchronous.)
  • Delay 16 milliseconds. (1 block, or two blocks 32 milliseconds.)
  • Input/Output resolution: 10 bits.

There are build-in command line interface:

  • if (incomingByte == ‘m’) { // FREE MEMORY BYTES
  • if (incomingByte == ‘x’) { // PRINT OUT INCOMING SAMPLING BUFFER
  • if (incomingByte == ‘s’) { // SWITCHES – SCRAMBLING ON / OFF
  • if (incomingByte == ‘y’) { // PRINT OUT OUTGOING  BUFFER
  • if (incomingByte == ‘f’) { // DATA AFTER FFT, FREQUENCIES BINS
  • if (incomingByte == ‘p’) { // DATA AFTER PITCH SHIFTING – SCRAMBLING
  • if ((incomingByte >= ’0′) && (incomingByte <= ’9′)) { // DIGITS  ”1″ to “9″ – REGULATE MAGNITUDE OF SHIFTING, “0″ – SPECTRUM INVERSION.

Link to download an Arduino sketch: Pitch Shifting – Scrambler 


RADIX-4 FFT (integer math).

Updates on 30 Sept. 2014:

Everything below is correct, and may worth to read. But new  code based on Split Radix Real is published. Faster, lower memory demands, both version for UNO and DUE available as libraries.  New algorithm makes it possible to run FFT_SIZE =  512 on UNO board in less than 9.6 milliseconds.

/**************************************************************************************************************************************

Tweaking the FFT code, that I’ve published earlier in my series of blogs, I hit a “stone wall”. There are nothing could be improved in the “musical note recognition” version of the code, in order to make it faster. At least, nothing w/o completely switching to assembler language, what I’m trying to avoid for now.  I’m sure, it’s the fastest C algorithm. Looking around it didn’t take long to find out that there is other option: change RADIX-2 algorithm for RADIX with higher order, 4, 8, or split-radix approach. Putting split-radix aside, (would it be my next adventure?), RADIX-4 looks promising, with theoretically 1/4 reduction in number of multiplications (what I believe is an “Achilles heel”).

Googling for awhile, I couldn’t find fixed point version in plain C or C++ language. There is TI’s “Autoscaling Radix-4 FFT for MS320C6000TM” application report, which I find useful , but the problem is it’s “bind” with TI microprocessors hardware multiplier, and any attempt to re-write code would, probably, make it’s performance even worse than RADIX-2. Having “tweaking” experience with fix_fft source code from:  http://www.jjj.de/             I decide to follow same path, as I did before, adapting fix_fft for arduino: take their floating point source, disassemble it to the pieces, and than combine all parts back as fixed point or integer math components.    And you know what ? Thanks God, I successed!!!

I decided not all parts to re-assemble back again, this is why fft_size has to be power of 4 ( 16, 64, 256, 1024 etc.). Next, the software is “adjustable” for different level of the optimization. Trade is always the same, accuracy against speed. I’d highlight 3 level at this point:

1. No optimization, all math operation 15-bits.   The slowest version. Not tested at all.

2. Compromise version.  Switches: 12-bits Sine table, regular multiplication (long) right shifted >>12, Half-Scaling in the sum_dif_I (RSL) >>1. Recorded measurements result:  24 milliseconds with N = 256 fft_size.

3. Maximum optimization. Switches: 8-bits Sine table, macro assembler multiplication short cut, no scaling in the core. Timing 10.1 millisecond!!!

Fastest. Best of the Best Ever written FFT code for 8-bit microprocessor.   Enjoy the meal:   https://docs.google.com/open?id=0Bw4tXXvyWtFVMldRT3NFMGNTZVN0Y0d4eVRsenVZdw

Here is slightly modified copy, where I moved sine table from RAM to FLASH memory using progmem utility. For someone, who was curious to find the answer: how much progmem slower compare to access data in the RAM, there is an answer. 10.16 milliseconds become 10.28, or 120 usec slower. Divide by 84 x 6 = 504 number of readings, each progmem costs 0.24 useconds. Its about 4 cycles CPU.

https://docs.google.com/open?id=0Bw4tXXvyWtFVQjZpZkw1c3VUZXlmaF9sOEJwMmpEUQ

Screenshot from the running application, signal generator running on the computer, feeding audio wave to OPA and than analog input 0. Look for hardware setup configuration on the “color organ” blog-post.

BTW, there is one more important thing, I missed to emphasize in my short introductory paragraph, code offers FLEXIBILITY over SNR ratio. Basic FFT algorithm has an intrinsic “build-in” GAIN: G(in) = FFT_SIZE / 2 . (in) stands for intrinsic. That is perfect value for fft_size = 64 ( Gain = 64 / 2 = 32) and arduino (Atmel AtMega328)  10-bit ADC ( max value = 1023 ). FFT output would be 32 x 1023 = 32736, exactly 15 bit + sign. In other words, scaling in the algorithm core doesn’t required at all! That alone improve speed and lower rounding noise error significantly. The same time G(in)  grows too high with FFT_SIZE = 256, when G = 256 / 2 = 128 and output of the FFT would overflow size of 16-bit integer math. But again, scaling don’t have to be 100%, as long as there is a way to keep it in balance with ADC data. In this particular case, with 10-bit ADC, we can keep gain just below 32, it’s not necessary to make it exactly “1″.  For 12-bit ADC upper G limit would be 8, still not “1″. To manipulate the gain, division by 2 (>> 1) in the “sum_dif_I” could be set, to prevent overflow with fft_size > 64. Right shift “gain limiter” creates a square root adjustment, according to new formula: G(rsl) = SQRT (FFT_SIZE) / 4 . (rsl) stands for right-shift-limiter.

  1.  G = 1 for fft_size = 16,
  2.  G = 2 for fft_size = 64,
  3.  G = 4 for fft_size = 256,
  4.  G = 8 for fft_size = 1024.

Summing up, for using RADIX-4 with arduino ADC and FFT_SIZE <= 64, keep division by 2 (>> 1) in the “sum_dif_I” commented out. In any other circumstances, >10 bits external ADC, >64 fft_size, uncomment it.

To be continue…..


RADIX-4 FFT (integer math).

Tweaking the FFT code, that I’ve published earlier in my series of blogs, I hit a “stone wall”. There are nothing could be improved in the “musical note recognition” version of the code, in order to make it faster. At least, nothing w/o completely switching to assembler language, what I’m trying to avoid for now.  I’m sure, it’s the fastest C algorithm. Looking around it didn’t take long to find out that there is other option: change RADIX-2 algorithm for RADIX with higher order, 4, 8, or split-radix approach. Putting split-radix aside, (would it be my next adventure?), RADIX-4 looks promising, with theoretically 1/4 reduction in number of multiplications (what I believe is an “Achilles heel”).

Googling for awhile, I couldn’t find fixed point version in plain C or C++ language. There is TI’s “Autoscaling Radix-4 FFT for MS320C6000TM” application report, which I find useful , but the problem is it’s ”bind” with TI microprocessors hardware multiplier, and any attempt to re-write code would, probably, make it’s performance even worse than RADIX-2. Having “tweaking” experience with fix_fft source code from:  http://www.jjj.de/             I decide to follow same path, as I did before, adapting fix_fft for arduino: take their floating point source, disassemble it to the pieces, and than combine all parts back as fixed point or integer math components.    And you know what ? Thanks God, I successed!!!

I decided not all parts to re-assemble back again, this is why fft_size has to be power of 4 ( 16, 64, 256, 1024 etc.). Next, the software is “adjustable” for different level of the optimization. Trade is always the same, accuracy against speed. I’d highlight 3 level at this point:

1. No optimization, all math operation 15-bits.   The slowest version. Not tested at all.

2. Compromise version.  Switches: 12-bits Sine table, regular multiplication (long) right shifted >>12, Half-Scaling in the sum_dif_I (RSL) >>1. Recorded measurements result:  24 milliseconds with N = 256 fft_size.

3. Maximum optimization. Switches: 8-bits Sine table, macro assembler multiplication short cut, no scaling in the core. Timing 10.1 millisecond!!!

Fastest. Best of the Best Ever written FFT code for 8-bit microprocessor.   Enjoy the meal:   https://docs.google.com/open?id=0Bw4tXXvyWtFVMldRT3NFMGNTZVN0Y0d4eVRsenVZdw

Here is slightly modified copy, where I moved sine table from RAM to FLASH memory using progmem utility. For someone, who was curious to find the answer: how much progmem slower compare to access data in the RAM, there is an answer. 10.16 milliseconds become 10.28, or 120 usec slower. Divide by 84 x 6 = 504 number of readings, each progmem costs 0.24 useconds. Its about 4 cycles CPU.

https://docs.google.com/open?id=0Bw4tXXvyWtFVQjZpZkw1c3VUZXlmaF9sOEJwMmpEUQ

Screenshot from the running application, signal generator running on the computer, feeding audio wave to OPA and than analog input 0. Look for hardware setup configuration on the “color organ” blog-post.

LInk to first version based on RADIX-2 FFT:     LINK

BTW, there is one more important thing, I missed to emphasize in my short introductory paragraph, code offers FLEXIBILITY over SNR ratio. Basic FFT algorithm has an intrinsic “build-in” GAIN: G(in) = FFT_SIZE / 2 . (in) stands for intrinsic. That is perfect value for fft_size = 64 ( Gain = 64 / 2 = 32) and arduino (Atmel AtMega328)  10-bit ADC ( max value = 1023 ). FFT output would be 32 x 1023 = 32736, exactly 15 bit + sign. In other words, scaling in the algorithm core doesn’t required at all! That alone improve speed and lower rounding noise error significantly. The same time G(in)  grows too high with FFT_SIZE = 256, when G = 256 / 2 = 128 and output of the FFT would overflow size of 16-bit integer math. But again, scaling don’t have to be 100%, as long as there is a way to keep it in balance with ADC data. In this particular case, with 10-bit ADC, we can keep gain just below 32, it’s not necessary to make it exactly “1″.  For 12-bit ADC upper G limit would be 8, still not “1″. To manipulate the gain, division by 2 (>> 1) in the “sum_dif_I” could be set, to prevent overflow with fft_size > 64. Right shift “gain limiter” creates a square root adjustment, according to new formula: G(rsl) = SQRT (FFT_SIZE) / 4 . (rsl) stands for right-shift-limiter.

  1.  G = 1 for fft_size = 16,
  2.  G = 2 for fft_size = 64,
  3.  G = 4 for fft_size = 256,
  4.  G = 8 for fft_size = 1024.

Summing up, for using RADIX-4 with arduino ADC and FFT_SIZE <= 64, keep division by 2 (>> 1) in the “sum_dif_I” commented out. In any other circumstances, >10 bits external ADC, >64 fft_size, uncomment it.

To be continue…..


Speech / Voice Recognition. Arduino project, next in a series FFT and Arduino.

 Finally, I’d like to present  the most sophisticated project I’ve done so far, build around the idea turning Arduino board into a DSP.  The results are really impressive for small microprocessor, with low memory size and low MIPS. IMHO, arduino provides better results, than Windows Vista VR system, with 1 GB / 2.2 GHz  hardware, for short one-two words commands, of course.
No HMM, neural networks, or other very popular and “scientifically sounding” theories, were considered to be implemented in the algorithm. Google brings up  millions links on a topic, just ask, but only few of them are designed on really scientific concept, rather than dumb data base “sharpening”. I’m not saying they are completely wrong, and I’m not an expert in the field, but they are not smart ether. My decision is simple 2D cross-correlation. Basically, the heart of the recognition algorithm is similar to an image matching program, which works the same way for voice/sound.  To create a Spectrogram image, arduino is continuously monitoring sound level via microphone, and start capturing data when VOX threshold is exceeded. After input array “X” filled up, data transfered on next level to calculate FFT. The same “conveyor belt” works between FFT and Filtering, flags raised when data is ready, and flags lowered when process finished. The only difference is a speed, conveyor belt is running faster passing data ADC-FFT, and slower at Filter-Correlation stage, as it requires 64 regular cycles to complete spectrogram image in one SuperCycle.  The most time consuming part is Edge Enhancement / HPF Filtering of the spectrogram. I’m still looking around to improve performance of this stage, as it holds all process back from to be fully “Real Time”.
 Specification:
-  4 kHz sampling rate:  2 kHz voice freq. range;
-  64 FFT subroutine,    62.5 Hz spectral resolution;
-  16 x 64 Spectrogram Image, around 1 second max voice password;
-  duration of the Cross-Correlation < 5 milliseconds;
-  duration of the FFT+SQRT+Compression < 4 milliseconds;
-  duration of the Edge Enhancement ~ 35 milliseconds;Main cycle time frame is 16 milliseconds, it’s defined by sampling rate x FFT size, 0.25 x 64 = 64 millisecond. Super-cycle 1.024 is needed only because EE prevents all processes to be completed in less than 16 milliseconds. There is a resources left, to increase sampling up to 8 or even 12 kHz, I just had no time to conduct experiments if it is beneficial.

There is a Command Line Interface, built-in the software, which control “record” and debug “print” functions, 7 commands for now:
if (incomingByte == ‘x’) {           // INPUT ADC DATA
if (incomingByte == ‘f’) {           // FFT OUTPUT
if (incomingByte == ‘s’) {           // SPECROGRAMM PRE  FILTERED
if (incomingByte == ‘g’) {           // SPECROGRAMM POST FILTERED
if (incomingByte == ‘r’) {           // RECORD SPECROGRAMM TO EEPROM
if (incomingByte == ‘p’) {           // PLAY SPECROGRAMM FROM EEPROM
if (incomingByte == ‘m’) {           // FREE MEMORY BYTES

Software is written for AtMega328p microprocessor, Arduino Uno board or similar. For others, all referenced registers has to be replaced with appropriate names for microprocessor.Compiles on 022 IDE, there are some conflicts with 1.0 IDE, that I was not feel myself right to troubleshoot yet. For better understanding some math background, have a look at my previous posts.

Link to download a sketch:   Voice_Recognition_24_01

Analog front-end is the same, as I used in my first project: Color Ogran
There is not much could be improved on this part, and I again used both inputs – from microphone to do tests with my own voice, and also from “line” input, for single tone test generated by computer during debugging. Next picture shows “s” command print-out in the serial monitor window, after I pronounce a word : “Spectrogram” . Due limited size of the window, data printed with 90 degree rotation, left-right is frequencies bands direction, and up-down is time. Lower freq. on left side (60 Hz) and higher (2 kHz) on the right.  The same time 3D images generated in right view angle.

This is how spectrogram looks like after “g” command entered in serial monitor and word sounds just right after that:

Next couple images created with single tone frequency  (320 Hz), just to show more clear “internal properties” of the filtering, again “s” and “g” commands were entered:

Well, as tone sounds continuously, it shows filtering in one direction only, and not the best tutorial on edge-enhancement theory. (“Home brew” lab limits). The same time last picture shows, that each “peek” on the original spectrogram, become surrounded by negative smaller peeks, resulting in “0″ overall sum  on 3×3 foot-print, and consequently on the whole map. In electronics it goes under HPF name, and essence of process is to remove DC component, plus attenuate  Low Frequencies.
Excelent on-line book

Short manual:
to be completed later


Optical Magnet, Arduino project next in a series Laser Tracking 3D

This blog considered to be next stage in the series published earlier, concentrated around the idea tracking object in the space. There are an enormous quantity of similar projects could be developed on this platform. I’ll name a few:

- star / rocket / vehicle tracking;
- follow me / robot / hands / cap etc;
- navigation to charging station / landing pad / around area;
- contact-less measurements rotational speed / angle to surface / shifting.

All of this on a few dollars micro-controller and cheap CMOS camera! Real-time, up to 60 Hz update rate!
Please, check on the first and second version, as I’d skip explanation basic design concept here.

  Most important features in this series of projects:

* 1. LOCALIZATION XY.
* 2. RANGE FINDER Z.
* 3. TRACKING 3-D.
* 4. TRACE TOOL.
* 5. TRACKING 6-D.
* 6. OPTICAL MAGNET.

Feature considered to be independent, so you can star  from  project 1:
http://fftarduino.blogspot.com/2011/12/arduino-laser-3d-tracking-range-finder.html
than move on next stage, and so on depends on a budget, parts availability or your interest!

This version of hardware/software system design capable to track object in

6 – D ++ space:

*   -  Linear motion along X, Y, Z coordinates (3D);
*   -  Rotation around fixed axis (6D);

I put two ++ plus signs, in order to underline capability of the hardware design to track Rotation of the object based not only on distance measurements, but also Reflectivity. As Power Control Loop strictly hold lasers radiation under control, simple calculation in periodicity of the emitted power would provide information about angular speed round / cylindrical object. Phase difference in 4 signals gives rotational center for Z. It’s also apply for linear motion, tracking of the object could be based on reflectivity or distance or BOTH simultaneously ,  which opens enormously great amount of possibilities.

Optical Magnet:
*  – Attract closest surface;
*  – Repel;
*  – Attract or repel surface with specific reflectivity (BLACK, WHITE, OR COLOR)!!!
*    * work in progress, Reflectivity math are not implemented yet       *
*    * algorithm to track rotation is not included, this is version for demonstration purposes  mainly.*

Link to download Arduino Uno sketch:  Optical_Magnet_6D

8 January, 2012.
Release notes of the version 3.2 software:

-  Video is De-interlaced, full image size 512 (active 492) lines;
-  digitalWrite, analogWrite functions of the arduino IDE were replaced by direct port manipulation, in order  to improve time performance of critical section of the code – interrupt subroutine and to avoid blocking interruption call (functions have locking mechanism in theirs body);
- minor changes in Power Control Loop algorithm, to prevent oscillation;

Link to download Arduino Uno sketch:  Optical_Magnet_6D_V3

 finished…