## Stamp II math note #5

#### Statistics and digital data filtering

(c) 1998 EME Systems, Berkeley CA U.S.A.
Tracy Allen

#### Contents (updated 1/8/2002)

- Smoothing
- Statistics: mean, min, max, and S.D.
- Median filter, also maximum, minimum and mode

#### Smoothing

The following routine implements a digital first order low pass filter. It filters an input data stream so that noise and fluctuations are less pronounced in the output data stream. The filter has a time constant: if a step change occurs at the input, the time constant is the number of iterations it takes for the output to slew about 2/3 (more precisely 1 - 1/e, about 63%) of the way from the initial to the final value. A larger time constant means more filtering, but slower response to changes.

Suppose the reading to be filtered comes, as it often does, from an analog to digital converter. Suppose further that the readings are in the range of 0 to 4095. Here is a filter with a time constant of 16. It implements the following numerical filter:

```
Y0 = X0*16                 initial accumulator = 16 * initial datum
Yn = Yn-1 * 15/16 + Xn     iteration adds new value to accumulator
Zn = Yn/16                 filter output

' Filter with time constant=16 iterations
X var word    ' current value, filter input
Y var word    ' smoothed value, accumulator
Z var word    ' filter output

initialize:
   gosub getX   ' get current value, not shown
   Y=X*16       ' initialize, note multiplication *16
loop:
   gosub getX   ' get current value, not shown
   Y = Y ** 61441 + X          ' smoothing function
   Z = Y/16
   debug dec X, tab, dec Z, CR ' show values
goto loop
```
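The same arithmetic is easy to check on a PC. Here is a minimal Python sketch (my illustration, not part of the original Stamp code) that reproduces the integer filter above and confirms that a step input slews roughly 2/3 of the way in one time constant of 16 iterations:

```python
def lp_filter_step(y, x):
    """One iteration of the Stamp filter: Y = Y ** 61441 + X.
    The ** operator multiplies by 61441/65536, i.e. about 15/16."""
    return ((y * 61441) >> 16) + x

# Step input: x jumps from 0 to 1000; accumulator starts at 0*16 = 0.
y = 0
x = 1000
outputs = []
for n in range(64):
    y = lp_filter_step(y, x)
    outputs.append(y // 16)   # Z = Y/16, the filter output

# After 16 iterations the output has covered about 2/3 of the step;
# after 64 iterations (4 time constants) it is within a few percent.
print(outputs[15], outputs[-1])
```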

Notes and explanation:

• The input variable is X. The values can range from 0 to 4095, and no greater, because X*16 must still fit in one 16-bit word (16 * 4095 = 65520). (See below for other implementations.)
• The time constant is in units of iterations. To convert this to absolute time, you have to multiply by the actual iteration time. If the loop happens to take 1 millisecond, then the real time constant will be 16 milliseconds; but if the loop happens to take 10 seconds, then the real time constant will be 160 seconds. The time to settle to within 2% of the final value will be about 4 time constants.
• Note that the smoothed value is initialized to Y=X*16, and the output smoothed value, Z, is Y divided by 16. This is necessary because of the integer math of the BASIC Stamp: the algorithm needs to account internally for "fractional" changes.
• Here are formulas to get faster response, at the expense of less filtering. All of these are initialized to X*16, and the output of the filter is divided by 16 to compare with the current value. All of these are limited to values less than 4096.
```
' init Y to X*16             '<--- note initialization to X*16
Y = (Y ** 61441) + X         '<--- time constant=16 iterations
' Y = (Y ** 57345) + (X*2)   '<--- time constant=8 iterations
' Y = (Y ** 49153) + (X*4)   '<--- time constant=4 iterations
' Y = (Y ** 32769) + (X*8)   '<--- time constant=2 iterations
debug "smoothed value=", dec Y/16  ' <--- note division by 16
```
• Here is an explanation of how to derive the formula when the time constant = 8. The first steps in the derivation are not good BASIC Stamp math, because they would lead to integer overflow. To make it into good BASIC Stamp math we have to convert the formulas to use the ** operator. For a time constant of 8, the new value of the accumulator Y is made up of 7 parts old Y and one part current value (scaled by 16), all divided by 8.
```
Y = (Y*7 + (X*16)) / 8    ' <-- not good BS math
                          ' due to integer overflow:
                          ' if Y>9362, then Y*7>65535
Y = (Y * 7/8) + (X*2)     ' <-- algebraically the same,
                          ' but still integer overflow
```

One way to accomplish the multiplication by 7/8 on the BASIC Stamp is with the ** or */ operators. For the 7/8, use **57344, or */224 (since 224/256 = 7/8). (See the essay on ** and */.) Empirically, **57345 works a little better than **57344.

```
the ** 61441 operator multiplies by just over 15/16   (15/16 = 61440/65536)
the ** 57345 operator multiplies by just over 7/8     ( 7/8  = 57344/65536)
the ** 49153 operator multiplies by just over 3/4     ( 3/4  = 49152/65536)
the ** 32769 operator multiplies by just over 1/2     ( 1/2  = 32768/65536)
```

Another way to accomplish the multiplication by 7/8 is Y - (Y/8).
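On a PC this ** arithmetic is just a multiply and a 16-bit shift, and */ is a multiply and an 8-bit shift. A quick Python sketch (the function names are mine) showing that **57345, */224 and Y - Y/8 all give the same 7/8 result for a round number, with only slight rounding differences elsewhere:

```python
def star_star(y, k):
    """Stamp ** operator: high word of y*k, i.e. (y*k)/65536."""
    return (y * k) >> 16

def star_slash(y, k):
    """Stamp */ operator: multiplies by k/256."""
    return (y * k) >> 8

y = 16000
print(star_star(y, 57345))   # 14000, about y*7/8
print(star_slash(y, 224))    # 14000, since 224/256 = 7/8
print(y - y // 8)            # 14000, exact 7/8 of a multiple of 8
```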

• Multiplying by 16 prior to filtering allows the filter to track "fractional" changes in the average value. Without this scaling, there are bad numerical effects (hysteresis) due to integer rounding. When the time constant is N, the current value must be multiplied by N or greater at the input to the filter in order to achieve satisfactory results. The problem amounts to the error in computing 15/16 in integer math, where 15/16 = 0. Due to this, Y will "stick" below X and never come up to exactly equal it. It is easier for the algorithm to go down than it is for it to come up, due to the nature of whole-number rounding.
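The effect is easy to demonstrate. The Python sketch below (my own illustration, using the equivalent Y - Y/16 form of the 15/16 multiply) runs the filter with and without the *16 input scaling. The unscaled filter settles at different values depending on whether the input stepped up or down to 1000, while the scaled filter settles at the true value either way:

```python
def settle(y, x, scale, iters=1000):
    """Run a first-order filter to steady state.
    scale=True  : Y = Y - Y/16 + X,    output Y/16  (recommended form)
    scale=False : Y = Y - Y/16 + X/16, output Y     (naive, shows hysteresis)"""
    for _ in range(iters):
        if scale:
            y = y - y // 16 + x
        else:
            y = y - y // 16 + x // 16
    return y // 16 if scale else y

x = 1000
print(settle(0,    x, scale=False))   # 992: stepping up, sticks below 1000
print(settle(2000, x, scale=False))   # 1007: stepping down, sticks above 1000

print(settle(0,     x, scale=True))   # 1000 exactly
print(settle(32000, x, scale=True))   # 1000 exactly
```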

Here is another method, which actually has better numerical convergence properties at the expense of a little more calculation. This algorithm operates on the difference between the new input value (times 16) and the output value. At each iteration, it acts to remove 1/4 of the difference. At the end, when the values are close together, it converges linearly at 1 unit per iteration and ends up at exactly zero difference. Say the input jumps from 1000 to 1002: the filter "fractional" value progresses from 16000 to 16032, passing through 1001 on the way, and settles at exactly 1002. So if the input value bobbles around from 1000 to 1001, the filter output will hover close to 1001. If the input makes a much larger jump, the convergence to the new value is initially much faster, removing 1/4 of the difference in each step. (This is similar to the low pass filter.)

```
' Filter with final time constant=16 iterations (linear)
X var word    ' current value, filter input
Y var word    ' smoothed value, accumulator
Z var word    ' deviation of current value from the smoothed one
              ' also the filter output
S var bit     ' sign of the difference

initialize:
   gosub getX       ' get current value, not shown
   Y=X*16           ' initialize, note multiplication *16
loop:
   gosub getX       ' get current value, not shown
   debug dec X, tab ' show it
   Z = X*16-Y       ' difference from smoothed value
   S = Z.bit15      ' sign of the difference
   Z = abs Z        ' magnitude of the difference
   Z = (Z+3)/4      ' 1/4 of the difference, or 1, or 0
   Y = Y + (-S^Z+S) ' update smoothed value
                    ' term in () is conditional negation of Z
   debug dec Y, tab, dec Y*/160, CR ' show values
goto loop
```

The */160 in the debug multiplies Y by 160/256 = 10/16, which shows the smoothed value in tenths, so you can visualize the fractional part tracking slowly behind the input value, filtering out the rapid changes, and catching up when the input stays steady at one value.
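Here is the same algorithm in Python (a sketch for checking the arithmetic on a PC; getX and the Stamp sign-bit tricks are replaced with ordinary signed math). Note how it takes big steps at first and then converges to exactly X*16 when the input holds steady:

```python
def diff_filter_step(y, x):
    """One iteration: remove 1/4 of the difference, minimum one unit."""
    z = x * 16 - y               # deviation of the input from the smoothed value
    step = (abs(z) + 3) // 4     # 1/4 of |difference|, or 1, or 0
    return y + step if z > 0 else y - step

y = 1000 * 16                    # initialized with the input steady at 1000
history = []
for _ in range(20):              # input jumps to 1002
    y = diff_filter_step(y, 1002)
    history.append(y)

print(history[:6])   # big steps first: [16008, 16014, 16019, 16023, 16026, 16028]
print(history[-1])   # then exact convergence: 16032 = 1002*16
```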

Here is a filter that works for larger numbers, such as barometer data that has to be filtered for altimetry. This filter uses a double precision accumulator, and it too converges well without numerical problems. It can be used for small numbers too.

```
example with time constant 2^kR = 8:
Y0 = X0*65536                   initial accumulator = 2^16 * initial datum
Yn = Yn-1 * 7/8 + Xn*65536/8    iteration adds new value to accumulator
Zn = Yn/65536                   filter output

' Double precision filter for input numbers up to 65535
kR con 3       ' the filter time constant is 2^3=8, done as a shift right kR
kL con 16-kR   ' for the computation shifts left
X  var word    ' current value, filter input
Y0 var word    ' smoothed value, accumulator low word
Y1 var word    ' smoothed value, accumulator high word
Z  var word    ' scratch variable for calculations
               ' also final filter output

initialize:
   gosub getX  ' get current value, not shown
   Y1=X        ' initialize, note effective multiplication *65536
loop:
   gosub getX            ' get current value, not shown
   Z=(Y0>>kR)|(Y1<<kL)   ' form low word of accumulator, divided by (2^kR)
   Y0=Y0-Z               ' subtract it from low word of accumulator
                         ' next do the same for high word, minus borrow from low word
   Y1=Y1-(Y1>>kR)+(Y0 max -Z + Z max 1 - 1)
   Z=X<<kL               ' form input data divided by (2^kR), part to low word
   Y0=Y0+Z               ' add it to low word of the accumulator
   Y1=Y1+(X>>kR)+(Y0 max Z - Z max 1)  ' do the same for the high word, plus carry
   Z=Y1+(Y0 max 1)       ' filter output, high word of accumulator with adjustment
   debug dec X, tab, dec Z, CR ' show the raw value and the filtered value
goto loop
```
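Since Python has arbitrary precision integers, the two-word accumulator can be modeled as a single 32-bit value. The sketch below (my illustration, with made-up numbers too large for the single-word filter) checks that the arithmetic Yn = Yn-1 - Yn-1/2^kR + Xn*2^16/2^kR converges exactly for full-word inputs:

```python
K_R = 3                      # time constant 2**3 = 8 iterations

def dp_filter_step(acc, x):
    """One iteration on a 32-bit accumulator (its high word is the output)."""
    acc = acc - (acc >> K_R)          # subtract acc / 2^kR
    acc = acc + ((x << 16) >> K_R)    # add (x * 65536) / 2^kR
    return acc & 0xFFFFFFFF           # model the 32-bit accumulator
                                      # (inputs up to 65535 never overflow it)

acc = 60000 << 16            # initialize accumulator to X * 65536
for _ in range(2000):        # then the input steps up to 65000
    acc = dp_filter_step(acc, 65000)

print(acc >> 16)             # filter output settles at exactly 65000
```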

#### Statistics: mean, min, max, and S.D.

This is an example of how to deal with the large numbers that arise from maintaining sums of values, which can quickly exceed the 16-bit limit of the Stamp word. It takes only 16 full-scale samples from a 12-bit A/D converter to reach that limit. But there is no problem if we can sum the values into a double precision total. On the Stamp, this takes routines for double precision addition, and also division, in order to divide the sum of the values by the number of samples to get the mean.

The sum of the squares of the data values is even more demanding, since it can exceed 16 bits in a single sample. For example 1000*1000 = 1,000,000, which is well above the 65535 limit of 16-bit math, but well within the 4,294,901,760 limit of 32-bit double precision math. The sum of the squares is essential for the calculation of the standard deviation and the variance. Once the sum of squares is in hand, it is necessary to subtract from it the square of the sum divided by the number of samples, divide by the number of samples, and take the square root; that means double precision routines for subtraction, multiplication and the square root.

In this example I arranged to use a sample size of 64, so that all the divisions can be done as shifts. (Note, I implemented this routine before I understood how to do general double precision math on the BASIC Stamp; I would do it a little differently next time around.) I cheated a little on the standard deviation, and used a division by 64 where it should really be a division by 63. The samples in this example are the result of a 12-bit A/D conversion: each sample is less than 2^12, the sum of the 64 samples will be less than 2^18, and the sum of the squares will be less than 2^30. A special trick has to be used to calculate the square of the sum, which can be as large as 2^36.
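The plan of the calculation is easy to check in Python, where the double precision is automatic. This sketch (mine, with made-up sample data) follows the same outline as the Stamp routine that follows: accumulate the sum and the sum of squares, then SD = sqrt((sum of squares - sum^2/N) / N), with N=64 so the divisions can be shifts:

```python
import math

samples = [990] * 32 + [1010] * 32      # 64 made-up 12-bit A/D readings
N = len(samples)                        # 64, so divisions can be shifts

total = sum(samples)                    # needs 18 bits: up to 64*4095
ssq   = sum(s * s for s in samples)     # needs 30 bits: up to 64*4095^2

mean = total >> 6                       # divide by 64 as a shift
var  = (ssq - total * total // N) // N  # biased variance (the /64 shortcut)
sd   = math.isqrt(var)                  # integer square root

print(min(samples), max(samples), mean, sd)   # 990 1010 1000 10
```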

The original application for this calculation was for an optical backscatter instrument, measuring the turbidity (cloudiness) of water. Currents and turbulence in the water kicked up silt off the bottom, and the fluctuations in turbidity on the scale of seconds were an important component of the research.

In this note, additions are done with a 15-bit representation, that is, numbers are adjusted so that the most significant bit of the word variable represents the carry out of an addition or subtraction. Multiplications are done in a 16-bit representation, so that the * and ** operators will work as intended. The representation of the numbers will have to change from 15 to 16 bit and vice versa as called for. And divisions are done using shift operations.

```
sum0    var     word          ' sum of the samples, least sig. word
sum1    var     word          ' most significant word
sumc    var     sum0.bit15    ' internal carry for sum
ssq0    var     word          ' sum of squares, least sig. word
ssq1    var     word          ' most significant word
ssqc    var     ssq0.bit15    ' internal carry for squares
omax    var     word          ' maximum sample value
omin    var     word          ' minimum sample value
ADres   var     word          ' A/D converter result, 12 bit
wordj   var     word          ' temporary variable
NN      var     byte          ' loop counter and temporary variable

' 64 samples are taken and summary stats accumulated
omin=$ffff         ' initialize accumulations
omax=0
sum0=0
sum1=0
ssq0=0
ssq1=0
serout 16,b96,[CR," sampling.. "]
for NN=1 to 64     ' 64 samples
pause 75           ' for timed sampling
                   ' each time around the loop eats up 25 milliseconds
                   ' pause 75 for 0.1 second, pause 975 for 1 second etc.
gosub ADread       ' get sample
omin = ADres max omin   ' minimum
omax = ADres min omax   ' maximum
sum0=sum0+ADres         ' sum of the values
sum1=sum1+sumc          ' carry to high word
sumc=0                  ' zero the carry bit (15 bit representation)
wordj=ADres*ADres       ' calculate sample^2 (low 16 bits)
ssq0=wordj & $7fff + ssq0
                        ' sum of squares, low word (15 bit representation)
wordj=ADres**ADres << 1 + wordj.bit15
                        ' calculate sample^2 (high 16 bits of r15 representation)
                        ' note that the lsb comes from the * calculation
ssq1=wordj+ssq1+ssqc    ' accumulate high word, include carry from low sum
ssqc=0                  ' zero the carry bit (15 bit representation)
next

' show minimum and maximum
serout 16,b96,["min=",dec omin,"  max=",dec omax]
' calculate mean; r15->word (4096 max)
ADres=sum0>>6+(sum1<<9)   ' divide by 64, by shift and OR
                          ' the result is in 16-bit representation
serout 16,b96,["  avg=",dec ADres]
' calculate Standard Deviation
' the notation r15 means 15-bit representation
' the notation r16 means 16-bit representation
' there will be a lot of switching back and forth
' compute square of the sums
' first divide by 8 and save remainder
' this is a trick--the square of the sums can be as large as 2^36
NN=sum0//8                      ' save remainder
sum0=sum0>>3+(sum1<<12)         ' divide by 8; r15->r16 word
wordj=sum0                      ' save a copy
sum1=sum0**sum0                 ' square of sum/8
sum0=sum0*sum0
sum1=sum1<<1+sumc               ' convert r16->r15
sumc=0
wordj=wordj/4*NN                ' compute remainder times sum
sum0=sum0+wordj                 ' add remainder
sum1=sum1+sumc
sumc=0
' subtract 64*mean^2 from sum of squares r15->r15
ssq0=ssq0-sum0
ssq1=ssq1-sum1-ssqc
ssqc=0
' divide by 64 (should really divide by 63 here, oh well)
ssq1=ssq0>>6|(ssq1<<9)          ' r15->r16 word
' take square root & interpolate; r16 word->r16 word
ssq0=sqr ssq1     ' first pass, result is byte
ssq1=ssq1-(ssq0*ssq0)*100+ssq0/(2*ssq0+1)+(ssq0*100)/10
                  ' PBASIC evaluates strictly left to right; result is SD in tenths
                  ' see separate note on interpolation of sqr
serout 16,b96,["  SD=",dec ssq1]
```

#### Median filter, also maximum, minimum and mode

The "median" (the middle value in a list of numbers when they are sorted in ascending order) is a robust statistic. It is often less sensitive to outliers than is the average value of the numbers. It is often used as the very first stage of digital signal processing. A small fifo buffer holds raw data samples. At each step, the output to subsequent stages of the filter is the median value of the numbers currently in the buffer. At each step, the oldest value in the filter is replaced with the newest value.

Another statistic is the "mode", the most common value in a list of numbers. This is an iffy statistic in a small sample. It can happen that more than one number appears the same number of times, or that no number appears more than once.

The most familiar statistic is the average, the sum of the numbers divided by the length of the buffer. The average tends to be strongly affected by outlying values that might come from noise or instrument errors, so it is not a good choice for the initial stages of a digital filter. However, the output of a median filter based on 5 to 15 samples may be fed to a second level of filtering that does in fact compute an average over a longer fixed time interval, or a moving (windowed) average.

I will build up to the median filter by developing a couple of subroutines.

Here's a subroutine that counts occurrences of a particular value. For example, there might be 3 instances of the particular value y=29 in a list of numbers. This routine "marks" the values in the list as they are counted; in the later routines, this will allow us to avoid counting the same list element more than once. The variables in this subroutine are used in all the following routines in this section.

```
  m   con   7          ' number of samples in the list
  m1  con   m-1        ' one less than that (for index)
  x   var   word(m)    ' an m-word list of samples
  y   var   word       ' value to match for count
  i   var   nib        ' index
  n   var   nib        ' count of occurrences (cumulative)
  fx  var   byte       ' 8 bits for marking
  f   var   fx.bit0    ' for bit addressing of marks

  cnty:            ' enter with y=value to count
    for i=0 to m1  ' scan the list
      if f(i)=0 then cnty1        ' already counted
      if x(i)-y max 1 then cnty1  ' not =y
      n=n+1    ' count it
      f(i)=0   ' mark it as counted
  cnty1:
    next
  return           ' exit with n=# of occurrences
```

The above example works on a list of 7 word-size entries. It should be obvious what changes need to be made for different sized lists or for different sized variables. The byte variable fx has to have at least as many bits as there are elements in the list; it is the bits of that byte that are used to "mark" the list elements as "counted".

If you use external memory, or even the scratchpad memory in a BS2sx or BS2e, you could modify the routine for a larger list. fx has to be expanded to match, and the variables i and n (here nibs) also have to be sized to count the elements. As a digital filter, the list is usually going to be only 5 to 15 samples.

Note that the function (x(i)-y max 1) in the above is equal to zero only if x(i)=y. This is a "trick" function that allows the algorithm to avoid using an IF-THEN equality test.
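The same trick translates directly to other integer environments: the Stamp's wraparound subtraction becomes `(x - y) & 0xFFFF`, and the Stamp's `max 1` (an upper limit) becomes `min(..., 1)`. A small Python sketch (the names are mine):

```python
def not_equal(x, y):
    """0 if x == y, else 1 -- the Stamp's (x-y max 1) with 16-bit wraparound."""
    return min((x - y) & 0xFFFF, 1)

# Count occurrences of y in a list without any equality IF-THEN.
data = [29, 13, 29, 7, 29, 13, 4]
y = 29
count = sum(1 - not_equal(v, y) for v in data)
print(count)   # 3
```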

Here are routines that locate the maximum or minimum value in the list of numbers. They search only the elements in the list that have their flag bit f(i)=1 (uncounted). This allows the median routine to first find the highest value, then the second highest, and so on.

```
  ' finds the maximum value in a list of m samples
  ' skips values that have been marked as counted
  ' same variables as cnty above
  findmax:
    y=0
    for i=0 to m1
      y=x(i)*f(i) min y ' maximum among the uncounted
                        ' or use y=x(i)&-f(i) min y
    next
  return    ' maximum in list is y

  ' finds the minimum value in a list of m samples
  ' same variables as cnty above
  findmin:
    y=-1        ' maximum value, $f, $ff or $ffff
    for i=0 to m1
      y=x(i)|~-f(i) max y
    next
  return    ' minimum in list is y
```

In the above, it may seem strange to see the min operator used to compute the maximum, and the max operator to compute the minimum, but that is the way it works in the BASIC Stamp. The logic expressions assure that only the flagged list elements contribute. This could be done with IF-THEN, but again it takes longer and uses more code on the BS2.
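In Python the same masking logic looks like this (a sketch; `min` and `max` here are Python's own, which, confusingly, run opposite to the Stamp operators of the same names):

```python
def find_max(x, flags):
    """Maximum among elements whose flag bit is 1 (uncounted)."""
    y = 0
    for v, f in zip(x, flags):
        y = max(v * f, y)                   # Stamp: y = x(i)*f(i) min y
    return y

def find_min(x, flags):
    """Minimum among flagged elements; unflagged ones read as $FFFF."""
    y = 0xFFFF
    for v, f in zip(x, flags):
        y = min(v | (~(-f) & 0xFFFF), y)    # Stamp: y = x(i)|~-f(i) max y
    return y

x     = [300, 25, 900, 4, 77]
flags = [1,   1,  0,   1, 1]     # 900 is already counted, so it is skipped
print(find_max(x, flags), find_min(x, flags))   # 300 4
```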

Here at last is a median routine. It works by finding the maximum value in the list, counting how many instances there are of that value, then the second largest value, how many of those, and so on. It continues until the total number counted exceeds 1/2 of the values in the list. That value is the median. We assume that there are an odd number of samples in the list, so that there really is an unambiguous middle element.

```
  ' finds the median value in a list of m samples
  ' same variables & constants as cnty above, plus..
  m2     con m/2+1  ' middle sample in the list
                    ' e.g. m=7 in list, m2=4th in the middle
  n2     var nib    ' to hold the cumulative count

  findmedian:
    fx=-1           ' mark entire array as "uncounted"
    n=0             ' none yet counted
  findmedian1:
    gosub findmax   ' maximum value in list, as yet uncounted
    gosub cnty      ' number of instances of that value
    if n<m2 then findmedian1   ' not done yet
  return            ' median=y
```
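The counting strategy is easy to mirror in Python (a sketch standing in for the Stamp routine, with lists in place of the flag bits):

```python
def median_by_counting(data):
    """Median of an odd-length list: peel off the largest remaining value,
    count its instances, stop once at least half the list is counted."""
    middle = len(data) // 2 + 1          # m2: e.g. 7 samples -> the 4th
    uncounted = list(data)
    n = 0
    while True:
        y = max(uncounted)               # findmax over the uncounted
        n += uncounted.count(y)          # cnty: instances of that value
        uncounted = [v for v in uncounted if v != y]
        if n >= middle:
            return y

print(median_by_counting([3, 1, 4, 1, 5, 9, 2]))   # 3
```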

Now suppose that the elements in the list come from an analog to digital converter or from some other means of reading data. The list will be filled dynamically, with each new entry replacing the oldest in the list, and the median value being output to the rest of the process. The median filter is an effective first stage for digital signal processing. This is a simple example of a filtering routine.

```
  ' median filter
  j     var nib       ' pointer to the oldest data in the table
                      ' (this is different from variable "i"!)
                      ' do not use j for anything else in the program

  initialize:
    for i=0 to m1     ' fill the table with readings
      gosub getdata   ' not shown, returns a reading, y
      x(i)=y          ' fill the table with readings
    next
  loop:
    gosub findmedian  ' now find the median
    debug ?y          ' show the median
                      ' might do some additional calculations here
                      ' or feed the data to some other process
    gosub getdata     ' not shown, returns a reading, y
    x(j)=y            ' replace the oldest reading with y
    j=j+1//m          ' point to the new oldest reading, circular buffer
    pause 1000        ' for demo loop
  goto loop
```
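As a sanity check, here is the whole filter in Python (a sketch with a canned reading sequence in place of getdata, and Python's sort in place of the counting median). A single outlier sample never reaches the output:

```python
def median(buf):
    """Middle value of an odd-length buffer."""
    return sorted(buf)[len(buf) // 2]

m = 7
readings = [50, 50, 50, 50, 50, 50, 50,   # first m readings fill the buffer
            50, 999, 50, 50, 50, 50]      # then a one-sample glitch arrives
buf = readings[:m]                        # initialize: fill the table
j = 0                                     # pointer to the oldest entry
outputs = []
for r in readings[m:]:
    outputs.append(median(buf))           # median first, as in the Stamp loop
    buf[j] = r                            # replace the oldest reading
    j = (j + 1) % m                       # circular buffer
print(outputs)   # the 999 glitch never appears in the output
```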

Here is a mode routine. It counts the number of occurrences of each value in the list, and returns the value and the frequency of the most common one. The mode is not a very strong statistic unless there are a relatively large number of samples distributed over a relatively small number of categories. There may be other uses for this routine besides the statistical one.

```
  k   var   nib    ' frequency of most common value
  z   var   word   ' the most common value
  j   var   nib    ' index for mode scan

  mode:
    fx=-1      ' mark all as uncounted, $ff
    k=0        ' frequency
  mode1:
    j=ncd fx   ' find an uncounted value
    if j=0 then mode2   ' done
    y=x(j-1)   ' value to count
    n=0        ' start at zero
    gosub cnty
    if n<=k then mode1  ' try again if count is low
    k=n        ' this count is higher
    z=y        ' this value is most frequent (so far)
    goto mode1 ' do until no values left to count
  mode2:
  return   ' mode=z   frequency=k
```

If there is more than one value that qualifies, the mode routine returns the first one it encounters. The variable k comes back with the count of the most common value. If k is small, that means that the "peak" is not very strong. If k=1, then no value appeared more than once. You have to decide what to do about those situations.
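For reference, here is the same mode logic in Python (a sketch; like the Stamp routine it returns both the value and its frequency, and only the first of tied values):

```python
def mode(data):
    """Most common value and its frequency; the first encountered wins ties."""
    best_value, best_count = None, 0
    for v in data:
        n = data.count(v)
        if n > best_count:       # strictly greater, so ties keep the first value
            best_value, best_count = v, n
    return best_value, best_count

print(mode([1, 2, 2, 3, 3, 3]))   # (3, 3)
print(mode([5, 6, 7]))            # (5, 1) -- no value repeats
```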

And finally, here is a routine to plot a frequency distribution, using ASCII graphics. This of course is more interesting if there is a strong mode, that is, if there are a relatively large number of samples distributed over a relatively small number of categories, so that you can see a definite peak with tails. Samples can be combined into fewer categories, so that the frequency in each category becomes significant.

```
  ' plots a frequency distribution of values in the list
  miny   var  word
  maxy   var  word

  distrib:
    fx=-1  ' mark all as uncounted
    gosub findmin
    miny=y
    gosub findmax
    maxy=y
    for y=miny to maxy
      n=0    ' zero count to start
      gosub cnty
      debug dec5 y," ",rep "*"\n, cr  ' number of stars = # of instances
    next
  return

  ' example
  '00779 **
  '00780 ***    <-- median and mode
  '00781 *      <-- average
  '00782
  '00783
  '00784
  '00785
  '00786
  '00787
  '00788
  '00789 *      <-- outlier
```
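The plotting idea in Python (a sketch; dec5 becomes a zero-padded 5-digit field and rep "*"\n becomes string repetition):

```python
def distribution_lines(data):
    """One text line per value from min to max: zero-padded value, then stars."""
    lines = []
    for y in range(min(data), max(data) + 1):
        lines.append("%05d %s" % (y, "*" * data.count(y)))
    return lines

lines = distribution_lines([779, 780, 780, 781, 780, 779, 789])
for line in lines:
    print(line)
# 00779 **
# 00780 ***
# 00781 *
# ... (empty rows) ...
# 00789 *
```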
