The following routine implements a digital first-order low-pass filter. It smooths an input data stream so that noise and fluctuations are less pronounced in the output data stream. The filter has a time constant: if a step change in the input value occurs, the time constant is the number of iterations it takes for the output to slew about 2/3 of the way from the initial to the final value. A larger time constant means more filtering, but a slower response to changes.

Suppose the reading to be filtered comes, as it often does, from an analog to digital converter. Suppose further that the readings are in the range of 0 to 4095. Here is a filter with a time constant of 16. It implements the following numerical filter:

Y_0 = X_0 * 16 (initial accumulator = 16 * initial datum)

Y_n = Y_(n-1) * 15/16 + X_n (each iteration folds the new value into the accumulator)

Z_n = Y_n / 16 (filter output)

' Filter with time constant=16 iterations

X var word ' current value, filter input

Y var word ' smoothed value, accumulator

Z var word ' filter output

initialize:

gosub getX ' get current value, not shown

Y=X*16 'initialize, note multiplication *16

loop:

gosub getX ' get current value, not shown

Y = Y ** 61441 + X ' smoothing function

Z = Y/16

debug dec X, tab,dec Z, CR ' show values

goto loop
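Before the notes, here is a quick sanity check of the step response. This is a Python model of the loop above, not Stamp code: the ** operator is modeled as a multiply that keeps the high 16 bits, and getX is replaced by a made-up step from 0 to 4000.

```python
def star_star(y, k):
    """Model of the Stamp's ** operator: (y * k) >> 16."""
    return (y * k) >> 16

x = 0
y = x * 16                       # initialize: Y = X * 16
x = 4000                         # step change on the input
for _ in range(16):              # one time constant = 16 iterations
    y = star_star(y, 61441) + x  # Y = Y ** 61441 + X
z = y // 16                      # output: Z = Y / 16
print(z)
```

After 16 iterations the output lands a little under 2/3 of the 4000 step, which is the rule of thumb stated above.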

Notes and explanation:

- The input variable is X. The values can range from 0 to 4095, and not greater, because a larger value could overflow one word when multiplied by 16. (See below for other implementations.)
- The time constant is in units of iterations. To convert this to absolute time, you have to multiply by the actual iteration time. If the loop happens to take 1 millisecond, then the real time constant will be 16 milliseconds; if the loop takes 10 seconds, the real time constant will be 160 seconds. The time to settle to within about 99% of the final value is 5 time constants.
- Note that the smoothed value is initialized to Y=X*16, and the output smoothed value, Z, is Y divided by 16. This is necessary because of the integer math of the BASIC Stamp. The algorithm needs to account internally for "fractional" changes.
- Here are formulas to get faster response, at the expense of
less filtering. All of these are initialized to X*16, and the
output of the filter is divided by 16 to compare with the current
value. All of these are limited to values less than 4096.
' init Y to X*16 '<---note initialization to X*16

Y = (Y ** 61441) + X '<---- time constant=16 iterations

' Y = (Y ** 57345) + (X*2) '<---- time constant=8 iterations

' Y = (Y ** 49153) + (X*4) '<---- time constant=4 iterations

' Y = (Y ** 32769) + (X*8) '<---- time constant=2 iterations

debug "smoothed value=",Y/16 ' <--- note division by 16

- Here is an explanation of how to derive the formula when the time constant = 8. The first steps in the derivation are not good BASIC Stamp math, because they would lead to integer overflow. To make it into good BASIC Stamp math we have to convert the formulas to use the ** operator. For a time constant of 8, the new value of Y is made up of 7 parts old value of Y and one part current value (times 16).
Y = (Y*7 + (X*16)) / 8 ' <-- not good BS math

' due to integer overflow

' if Y>9363, then Y*7>65535

Y = (Y * 7/8) + (X*2) ' <-- algebraically the same

' but still integer overflow

One way to accomplish the multiplication by 7/8 on the BASIC Stamp is with the ** or */ operators: for 7/8, use **57344, or */192. (See the essay on ** and */.) Empirically, **57345 works a little better than **57344.

the **61441 operator performs multiplication by 15/16 = 61440/65536

the **57345 operator performs multiplication by 7/8 = 57344/65536

the **49153 operator performs multiplication by 3/4 = 49152/65536

the **32769 operator performs multiplication by 1/2 = 32768/65536

Another way to accomplish the multiplication by 7/8 is Y - (Y/8).
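A quick numerical check of these equivalences, again modeling the Stamp's ** operator in Python as a multiply that keeps the high 16 bits:

```python
def star_star(y, k):
    """Model of the Stamp's ** operator: (y * k) >> 16."""
    return (y * k) >> 16

y = 1000
print(star_star(y, 61441))    # 937, same as y*15/16
print(star_star(y, 57345))    # 875, same as y*7/8
print(y - y // 8)             # 875, the Y - (Y/8) alternative
```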

- Multiplying by 16 prior to filtering allows the filter to track "fractional" changes in the average value. Without this scaling, there are bad numerical effects (hysteresis) due to integer rounding. When the time constant is N, the current value must be multiplied by N or greater at the input to the filter in order to achieve satisfactory results. The problem amounts to the error in computing 15/16 in integer math, where 15/16=0. Due to this, Y will "stick" one unit below X, and never come up completely to exactly equal X. It is easier for the algorithm to go down than to go up, due to the nature of whole number rounding.
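The sticking effect is easy to reproduce. This Python sketch runs the filter both ways on a steady input of X = 100: unscaled, the output stalls far below the input; scaled by 16, it tracks to within one count, which is the residual sticking described above.

```python
# Unscaled filter: Y = Y*15/16 + X/16 in pure integer math, X = 100.
y_unscaled = 0
for _ in range(200):
    y_unscaled = y_unscaled * 15 // 16 + 100 // 16
print(y_unscaled)              # stalls far short of 100

# Scaled filter: the accumulator carries the "fraction".
y_scaled = 0
for _ in range(200):
    y_scaled = y_scaled * 15 // 16 + 100
print(y_scaled // 16)          # within one count of 100
```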

Here is another method, which actually has better numerical convergence properties at the expense of a little more calculation. This algorithm operates on the difference between the new input value (times 16) and the output value. At each iteration, it acts to remove 1/4 of the difference. However, at the end, when the values are close together, it converges linearly at 1 unit per iteration and ends up at exactly zero difference. Say the input jumps from 1000 to 1002. Then the filter output stays at 1000 for 16 iterations, then at 1001 for 16 iterations, while the filter's "fractional" value progresses one step at a time from 16000 to 16032. So if the input value bobbles around between 1000 and 1001, the filter output will hover close to 1001. If the input makes a much larger jump, the convergence to the new value is initially much faster, removing 1/4 of the difference at each step. (This is similar to the low pass filter.)

' Filter with final time constant=16 iterations (linear)

'

X var word ' current value, filter input

Y var word ' smoothed value, accumulator

Z var word ' deviation of current value from the smoothed one

' also the filter output

S var bit ' sign of the difference

initialize:

gosub getX ' get current value, not shown

Y=X*16 'initialize, note multiplication *16

loop:

gosub getX ' get current value, not shown

debug dec X,tab ' show it

Z = X*16-Y ' difference from smoothed value

S = Z.bit15 ' sign of the difference

Z=abs Z ' magnitude of the difference

Z = (Z+3)/4 ' 1/4 of the difference, or 1, or 0

Y= Y + (-S^Z+S) ' update smoothed value

' term in () is conditional subtraction

debug dec Y, tab,dec Y*/160, CR ' show values

goto loop

The */160 in the debug shows the decimal value in tenths, so you can visualize the fractional part tracking slowly behind the input value, filtering out the rapid changes, and catching up if the input value stays steady at one value.
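The loop can be modeled in Python to confirm the exact convergence. This is a sketch, not Stamp code: the sign bit test and the conditional subtraction trick (-S^Z+S) are written out as ordinary signed arithmetic.

```python
def filter_step(y, x):
    z = x * 16 - y                 # deviation of scaled input from accumulator
    s = 1 if z < 0 else 0          # sign, like S = Z.bit15
    z = (abs(z) + 3) // 4          # 1/4 of the difference, or 1, or 0
    return y - z if s else y + z   # conditional subtraction

y = 1000 * 16                      # settled at an input of 1000
for _ in range(100):
    y = filter_step(y, 1002)       # input steps to 1002
print(y, y // 16)                  # accumulator reaches exactly 16032
```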

Here is a filter that works for larger numbers, such as barometer data that has to be filtered for altimetry. This filter uses a double precision accumulator, and it too converges well without numerical problems. It can be used for small numbers too.

Example with time constant 2^kR = 8 (kR = 3):

Y_0 = X_0 * 65536 (initial accumulator = 2^16 * initial datum)

Y_n = Y_(n-1) * 7/8 + X_n * 65536/8 (each iteration folds the new value into the accumulator)

Z_n = Y_n / 65536 (filter output)

' Double precision filter for input numbers up to 65535

kR con 3 ' the filter time constant is 2^3=8, done as a right shift by kR

kL con 16-kR ' left shift count for the computation

X var word ' current value, filter input

Y0 var word ' smoothed value, accumulator low word

Y1 var word ' smoothed value, accumulator high word

Z var word ' scratch variable for calculations

' also final filter output

initialize:

gosub getX ' get current value, not shown

Y1=X 'initialize, note effective multiplication *65536

loop:

gosub getX ' get current value, not shown

Z=(Y0>>kR)|(Y1<<kL) ' form low word of accumulator, divided by (2^kR).

Y0=Y0-Z ' subtract it from low word of accumulator

' next do the same for high word, minus borrow from low word

Y1=Y1-(Y1>>kR)+(Y0 max -Z +Z max 1 -1)

Z=X<<kL ' form input data times 65536, divided by (2^kR); low-word part

Y0= Y0 + Z ' add it to low word of the accumulator

Y1= Y1+(X>>kR)+(Y0 max Z -Z max 1) ' do the same for the high word, plus carry

Z = Y1 + (Y0 max 1) ' filter output, high word of accumulator, adjusted up by 1 when the low word is nonzero

debug dec X, tab, dec Z, CR ' show the raw value and the filtered value

goto loop
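As a check of the shift arithmetic, here is a Python model of this filter with the Y1:Y0 word pair held as a single 32-bit integer (the readings are made up for the example):

```python
kR = 3                    # time constant 2^kR = 8
X = 50000                 # initial reading (inputs up to 65535)
Y = X << 16               # initialize: accumulator = X * 65536
X = 50100                 # the reading steps up
for _ in range(200):
    # Y = Y*7/8 + X*65536/8, all in integer shifts
    Y = Y - (Y >> kR) + ((X << 16) >> kR)
print(Y >> 16)            # output converges exactly to the new reading
```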

This is an example of how to deal with the large numbers that arise from maintaining sums of values that can quickly exceed the 16 bit limit of the Stamp word. It potentially takes only 16 samples from a 12-bit A/D converter to reach that limit. But there is no problem if we can sum the values to a double precision total. On the Stamp, this takes routines for addition, and also division in order to divide the sum of the values by the number of samples to get the mean.
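The arithmetic behind those limits, checked in Python with the numbers from the paragraph above:

```python
print(16 * 4095)            # 65520: sixteen 12-bit samples nearly fill a word
print(64 * 4095)            # 262080: a 64-sample sum needs double precision
hi, lo = divmod(64 * 4095, 65536)   # how such a sum splits across two words
print(hi, lo)
```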

The sum of the squares of the data values is even more demanding, since it can exceed 16 bits in a single sample. For example, 1000*1000=1,000,000, which is well above the 65535 limit of 16 bit math, but well within the 4,294,901,760 limit of 32 bit double precision math. The sum of the squares is essential for the calculation of the standard deviation and the variance. Once the sum of squares is in hand, it is necessary to subtract from it the square of the sum divided by the number of samples, and then take the square root, so that means double precision routines for subtraction, multiplication and the square root.
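The identity being used is variance = (sum of squares - (sum)^2/n) / n. A small Python check with made-up data, kept in integers by multiplying both sides through by n^2:

```python
data = [1000, 1010, 990, 1005, 995, 1000, 1002, 998]   # made-up samples
n = len(data)
s = sum(data)                       # sum of the values
ssq = sum(d * d for d in data)      # sum of the squares
mean = s // n                       # exact here: s = 8000
direct = sum((d - mean) ** 2 for d in data)
print(n * ssq - s * s, n * direct)  # both equal n^2 * variance
```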

In this example I arranged to use a sample size of 64, so that all the divisions can be done as shifts. (Note, I implemented this routine before I understood how to do general double precision math on the BASIC Stamp. I would do it a little differently next time around.) I cheated a little on the standard deviation, and used a division by 64 where it should really be a division by 63. The samples in this example are the result of a 12-bit A/D conversion. Each sample is less than 2^12. The sum of the 64 samples will be less than 2^18. The sum of the squares will be less than 2^30. A special trick has to be used to calculate the square of the sum, which can be as large as 2^36.

The original application for this calculation was an optical backscatter instrument, measuring the turbidity (cloudiness) of water. Currents and turbulence in the water kicked up silt off the bottom, and the fluctuations in turbidity on the scale of seconds were an important component of the research.

In this note, additions are done with a 15-bit representation; that is, numbers are adjusted so that the most significant bit of the word variable represents the carry out of an addition or subtraction. Multiplications are done in a 16-bit representation, so that the * and ** operators will work as intended. The representation of the numbers has to change from 15 to 16 bits and vice versa as called for. Divisions are done using shift operations.
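The 15-bit trick can be sketched in Python: bit 15 of the low word doubles as the carry flag, just as sumc aliases sum0.bit15 in the declarations below.

```python
sum0, sum1 = 0x7FF0, 0            # low word near the top of its 15-bit range
sum0 = (sum0 + 0x0020) & 0xFFFF   # add a sample; the result spills into bit 15
sumc = sum0 >> 15                 # carry = bit 15 of the low word
sum1 += sumc                      # propagate the carry to the high word
sum0 &= 0x7FFF                    # zero the carry bit
print(sum1, sum0)                 # the pair still represents sum1*32768 + sum0
```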

sum0 var word ' sum of the samples, least sig. word

sum1 var word ' most significant word

sumc var sum0.bit15 ' internal carry for sum

ssq0 var word ' sum of squares, least sig. word

ssq1 var word ' most significant word.

ssqc var ssq0.bit15 ' internal carry for squares

omax var word ' maximum sample value

omin var word ' minimum sample value

ADres var word ' A/D converter result, 12 bit

wordj var word ' temporary variable

NN var byte ' loop counter and temporary variable

' 64 samples are taken and summary stats accumulated

omin=$ffff ' initialize accumulations

omax=0

sum0=0

sum1=0

ssq0=0

ssq1=0

serout 16,b96,[CR," sampling.. "]

for NN=1 to 64 ' 64 samples

pause 75 ' for timed sampling.

' each time around the loop eats up 25 milliseconds

' pause 75 for 0.1 second, pause 975 for 1 second etc.

gosub ADread ' get sample

omin = ADres max omin ' minimum

omax = ADres min omax ' maximum

sum0=sum0+ADres ' sum of the values

sum1=sum1+sumc ' carry to high word

sumc=0 ' zero the carry bit (15 bit representation)

wordj=ADres*ADres ' calculate sample^2 (low 16 bits)

ssq0=wordj & $7fff + ssq0

' sum of square, low word (15 bit representation)

wordj=ADres**ADres << 1 + wordj.bit15

' calculate sample^2 (high 16 bits of r15 representation)

' note that the lsb comes from the * calculation

ssq1=wordj+ssq1+ssqc ' accumulate high word, include carry from low sum

ssqc=0 ' zero the carry bit (15 bit representation)

next

' show minimum and maximum

serout 16,b96,["min=",dec omin," max=",dec omax]

' calculate mean; r15->word (4096 max)

ADres=sum0>>6+(sum1<<9) ' divide by 64, by shifts

' the result is in 16-bit representation

serout 16,b96,[" avg=",dec ADres]

' calculate Standard Deviation

' the notation r15 means 15-bit representation

' the notation r16 means 16-bit representation

' there will be a lot of switching back and forth

' compute square of the sums

' first divide by 8 and save remainder

' this is a trick--the square of the sums can be as large as 2^36

NN=sum0//8 ' save remainder

sum0=sum0>>3+(sum1<<12) ' divide by 8; r15->r16 word

wordj=sum0 ' save a copy

sum1=sum0**sum0 ' square of sum/8

sum0=sum0*sum0

sum1=sum1<<1+sumc ' convert r16->r15

sumc=0

wordj=wordj/4*NN ' compute remainder times sum

sum0=sum0+wordj ' add remainder

sum1=sum1+sumc

sumc=0

'

' subtract 64*mean^2 from sum of squares r15->r15

ssq0=ssq0-sum0

ssq1=ssq1-sum1-ssqc

ssqc=0

' divide by 64 (should really divide by 63 here, oh well)

ssq1=ssq0>>6|(ssq1<<9) ' r15--> r16 word

' take square root

' & interpolate; r16 word->r16 word

ssq0=sqr ssq1 ' first pass, result is byte

ssq1=ssq1-(ssq0*ssq0)*100+ssq0/(2*ssq0+1)+(ssq0*100)/10

' see separate note on interpolation of sqr

serout 16,b96,[" SD=",dec ssq1]

Another statistic is the "mode", the most common value in a list of numbers. This is an iffy statistic in a small sample. It can happen that more than one number appears the same number of times, or that no number appears more than once.

The most familiar statistic is the average, the sum of the numbers divided by the length of the buffer. The average tends to be strongly affected by outlying values that might come from noise or instrument errors, so it is not a good choice for the initial stages of a digital filter. However, the output of a median filter based on 5 to 15 samples may be fed to a second level of filtering that does in fact compute an average over a longer fixed time interval, or a moving (windowed) average.

I will build up to the median filter by developing a couple of subroutines.

Here's a subroutine that counts occurrences of a particular value. For example, there might be 3 instances of the particular value y=29 in a list of numbers. This routine "marks" the values in the list as they are counted, and in the later routines this will allow us to avoid counting the same list element more than once. The variables in this subroutine are used in all the following routines in this section.

m con 7 ' number of samples in the list

m1 con m-1 ' one less than that (for index)

x var word(m) ' an m-word list of samples

y var word ' value to match for count

i var nib ' index

n var nib ' count of occurrences (cumulative)

fx var byte ' 8 bits for marking

f var fx.bit0 ' for bit addressing of marks

cnty: ' enter with y=value to count

for i=0 to m1 ' scan the list

if f(i)=0 then cnty1 ' skip, already counted

if x(i)-y max 1 then cnty1 ' not =y

n=n+1 ' count it

f(i)=0 ' mark it as counted

cnty1:

next

return ' exit with n=# of occurrences

The above example works on a list of 7 word-size entries. It should be obvious what changes need to be made for different sized lists or for different sized variables. The byte variable fx has to have at least as many bits as there are elements in the list. It is the bits of that byte that are used to "mark" the list elements as "counted".

If you use external memory, or even the scratchpad memory in a BS2sx or BS2e, you could modify the routine for a larger list. fx has to be expanded to match, and the variables i and n (here nibs) too have to be sized to count the elements. As a digital filter, the list is usually going to be 5 to 15 samples only.

Note that the function (x(i)-y max 1) in the above is equal to zero only if x(i)=y. This is a "trick" function that allows the algorithm to avoid using an IF-THEN statement.
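In Python terms, with the Stamp's 16-bit unsigned wraparound made explicit, the trick looks like this (the function name here is ours):

```python
def differs(x, y):
    """Model of the Stamp's (x - y max 1): 0 only when x equals y."""
    return min((x - y) & 0xFFFF, 1)   # "max 1" limits the result to 1

print(differs(29, 29), differs(30, 29), differs(28, 29))   # 0 1 1
```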

Here are routines that locate the maximum or minimum value in the list of numbers. This only searches the elements in the list that have their flag bit, f(i) =1 (uncounted). This allows the median routine to first find the highest value, then the second highest, and so on.

' finds the maximum value in a list of m samples

' skips values that have been counted as marked.

' same variables as cnty above

findmax:

y=0

for i=0 to m1

y=x(i)*f(i) min y ' maximum among the uncounted

' or use y=x(i)&-f(i) min y

next

return ' maximum in list is y

' finds the minimum value in a list of m samples

' same variables as cnty above

findmin:

y=-1 ' maximum value, $f, $ff or $ffff

for i=0 to m1

y=x(i)|~-f(i) max y

next

return ' minimum in list is y

In the above, it may seem strange to see the min operator used to compute the maximum, and the max operator to compute the minimum, but that is the way it works in the BASIC Stamp. The logic expressions assure that only the flagged list elements contribute. This could be done with IF-THEN, but again it takes longer and uses more code on the BS2.
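The masking can be checked in Python. Here -f and ~-f are the same bit tricks as in the routines above, written out for 16-bit words; entries with flag 0 are forced to 0 or $FFFF so they never win the comparison.

```python
M = 0xFFFF   # 16-bit word mask

def masked_max(xs, flags):
    """x & -f keeps x when f=1, forces 0 when f=0."""
    return max(x & (-f & M) for x, f in zip(xs, flags))

def masked_min(xs, flags):
    """x | ~-f keeps x when f=1, forces $FFFF when f=0."""
    return min(x | (~(-f & M) & M) for x, f in zip(xs, flags))

xs    = [300, 900, 120, 500]
flags = [1,   0,   1,   1]      # 900 is already "counted"
print(masked_max(xs, flags), masked_min(xs, flags))   # 500 120
```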

Here at last is a median routine. It works by finding the maximum value in the list, counting how many instances there are of that value, then the second largest value, how many of those, and so on. It continues until the total number counted exceeds 1/2 of the values in the list. That value is the median. We assume that there are an odd number of samples in the list, so that there really is an unambiguous middle element.

' finds the median value in a list of m samples

' same variables & constants as cnty above, plus..

m2 con m/2+1 ' middle sample in the list

' e.g. m=7 in list, m2=4th in the middle

n2 var nib ' to hold the cumulative count

findmedian:

fx=-1 ' mark entire array as "uncounted"

n=0 ' none yet counted

findmedian1:

gosub findmax ' maximum value in list, as yet uncounted

gosub cnty ' number of instances of that value

if n<m2 then findmedian1 ' not done yet

return ' median=y

Now suppose that the elements in the list come from an analog to digital converter or from some other means of reading data. The list will be filled dynamically, with each new entry replacing the oldest in the list, and the median value being output to the rest of the process. The median filter is an effective first stage for digital signal processing. This is a simple example of a filtering routine.

' median filter

j var nib ' pointer to the oldest data in the table

' (this is different from variable "i"!)

' do not use j for anything else in the program

initialize:

for i=0 to m1 ' fill the table with readings

gosub getdata ' not shown, returns a reading, y

x(i)=y ' fill the table with readings

next

loop:

gosub findmedian ' now find the median

debug ?y ' show the median

' might do some additional calculations here

' or feed the data to some other process

gosub getdata ' not shown, returns a reading, y

x(j)=y ' replace the oldest reading with y

j=j+1//m ' point to the new oldest reading, circular buffer

pause 1000 ' for demo loop

goto loop
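For comparison, the whole median filter can be sketched in a few lines of Python: the same count-down-from-the-maximum median, fed from a 7-entry moving window (the reading list is made up, standing in for getdata).

```python
from collections import deque

def median_by_counting(values):
    """Find the median by counting instances from the largest value down."""
    remaining = list(values)
    counted = 0
    while True:
        top = max(remaining)
        counted += remaining.count(top)
        if counted >= len(values) // 2 + 1:   # past the middle element
            return top
        remaining = [v for v in remaining if v != top]

window = deque(maxlen=7)          # circular buffer of the last 7 readings
for reading in [1000, 1001, 999, 1000, 5000, 1002, 1000, 998]:
    window.append(reading)
    if len(window) == 7:
        print(median_by_counting(window))   # the 5000 outlier never shows
```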

Here is a mode routine. It counts the number of occurrences of each value in the list, and returns the value and the frequency of the most common one. The mode is not a very strong statistic unless there are a relatively large number of samples distributed over a relatively small number of categories. There may be other uses for this routine besides the statistical one.

k var nib ' frequency of most common value

z var word

j var nib ' index for mode scan

mode:

fx=-1 ' mark all as uncounted, $ff

k=0 ' frequency

mode1:

j=ncd fx ' find an uncounted value

if j=0 then mode2 ' done

y=x(j-1) ' value to count

n=0 ' start at zero

gosub cnty

if n<=k then mode1 ' try again if count is low

k=n ' this count is higher

z=y ' this value is most frequent (so far)

goto mode1 ' do until no values left to count

mode2:

return ' mode=z frequency=k

If there is more than one value that qualifies, the mode routine returns the first one it encounters. The variable k comes back with the count of the most common value. If k is small, that means that the "peak" is not very strong. If k=1, then no value appeared more than once. You have to decide what to do about those situations.

And finally, here is a routine to plot a frequency distribution, using ascii graphics. This of course is more interesting if there is a strong mode, that is, if there are a relatively large number of samples distributed over a relatively small number of categories, so that you can see a definite peak with tails. Samples can be combined into fewer categories, so that the frequency in each category becomes significant.

' plots a frequency distribution of values in the list

miny var word

maxy var word

distrib:

fx=-1 ' mark all as uncounted

gosub findmin

miny=y

gosub findmax

maxy=y

for y=miny to maxy

n=0 ' zero count to start

gosub cnty

debug dec5 y," ",rep "*"\n ' number of stars=# of instances

next

return

' example

'00779 **

'00780 *** <-- median and mode

'00781 * <-- average

'00782

'00783

'00784

'00785

'00786

'00787

'00788

'00789 * <-- outlier