18.2: Analysis Techniques for Laboratory


Data Analysis (for introductory students)

Continuing from the previous section: once we have data from an experiment, we want to analyze it so that we can draw reasonable conclusions. There are many methods of data analysis, but for this introductory class we will look at just two: fitting and the Fast Fourier Transform (FFT).

In a typical experiment we have one independent variable and one dependent variable. (When many independent variables produce different effects, more sophisticated analysis techniques are needed. They are not the focus of this course, though the functions to perform such analyses are available in programs like Octave.) Here we will show how one might use fitting for laboratory results, and how the FFT can be used to analyze data acquired in noisy circumstances (i.e., reality) and possibly pull out some useful information.

Fitting

During an experiment we vary some variable to get results about which, of course, we would like to draw some conclusions. One of the most common techniques is to see whether there is a relationship (such as linear or quadratic) between the independent variable and the dependent variable. To do that, one must fit a model to the data. The most common fitting (or regression)¹ technique is the least squares method. In Octave (and MATLAB/Scilab, Python, IDL, etc.) there are routines available to do this type of fit for polynomials and other functions. Here we will detail the Octave method of doing one of these fits, with the additional component of including error. Another fitting technique, the spline, will be demonstrated in the next section on FFTs.
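As a minimal sketch of what a least squares fit with errors does, the core-Octave script below fits a quadratic to made-up data, weighting each point by one over its standard deviation. The data values, the variable names, and the choice of solving the weighted problem with the backslash operator are assumptions made for illustration. It is not the routine used later in this section.

```octave
% Made-up data (illustrative only): roughly y = 2*x^2 + 1
x     = [0, 1, 2, 3, 4, 5];
y     = [1.1, 2.9, 9.2, 18.8, 33.1, 51.0];
sigma = [0.2, 0.2, 0.3, 0.3, 0.5, 0.5];   % assumed standard deviation of each point

% Design matrix for a degree-2 polynomial: columns are x^2, x, 1
A = [x(:).^2, x(:), ones(numel(x), 1)];

% Scale every row by 1/sigma so that precise points count more,
% then solve the least squares problem with the backslash operator
w = 1 ./ sigma(:);
p = ((A .* w) \ (y(:) .* w))';            % coefficients, highest power first

% Model values and the weighted chi-square of the fit
yfit = polyval(p, x);
chi2 = sum(((y - yfit) ./ sigma).^2);
printf("p = [%g %g %g], chi^2 = %g\n", p, chi2);
```

With equal errors on every point, the weights cancel and the result matches what polyfit(x, y, 2) returns.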

Here we display the results of an experiment along with their model fits, each establishing a relationship (the data and the script are below).


This is a plot of the data from an experiment with errors, together with the model from the least squares fit to the data, also with errors (different for each model point). To get more detail, open the image in another window (or run the program below in Octave). The relationship between the resonance frequency and the volume in the wine glass is \(-(8.1152 \times 10^{-3} \pm 2.6242 \times 10^{-4}) x^2 + (0.277000 \pm 0.050156) x + (954.88 \pm 2.1531)\)

A plot similar to the previous one, but with resonance frequency versus height. The equation is \(-(106.62 \pm 2.1685) x + (802.96 \pm 5.5370)\)

Close-up of the error bars for the height plot, to emphasize the different errors on both the data and the model from the least squares fit. Another possible way to display this would be with a confidence interval; however, for this class we would like to stick with error bars. Confidence intervals should be investigated in your junior or senior year.

The code for the data and the fit, with some notes in the comments:

%
% Example of fitting from old experiment
%
% Empty wine glass
%
% Experiment 1 uses the volume of water which, because of 
% the shape of the wine glass, was
% not linearly proportional to the height of the water from the stem
%
% load the wpolyfit routines which will be used to
% get errorbars (standard deviation) on the fit to the data
% NB: Each point in the fit model has a different error - as it should be
% NB2: If you run wpolyfit without getting the output, you will get a 
% graph of the data with the fit and a confidence interval...we don't 
% want to do that here, but you might feel the need to do it elsewhere
%
% The optim package provides wpolyfit and polyconf
%
pkg load optim
%
% Experimental values (with a number of trials)
%
Empty = [954, 956, 955, 955, 956];
w40mL = [952, 952.5, 951, 952.4, 951];
w80mL = [925.5, 926, 926, 927.5, 926];
w120mL = [871.5, 872.5, 871.5, 872.5, 872.5];
w160mL = [790, 789, 789, 789, 788];
w200mL = [690, 688, 687, 689.5, 689.75];
%
% Experiment 2 uses height of water from the stem (zero centimeters means 
% just wetting the wine glass on bottom)
%
w0cm = [798.2, 800, 800.5, 798.5, 798.8];
w1cm = [691, 691.8, 692.4, 692.2,  691.6];
w2cm = [592.4, 592.8, 592.2, 592.6, 592.8];
w3cm = [482.4, 482, 482.2, 482.4, 482];
%
% Experiment 3 used Corn Syrup...
% No need to show results
% Conclusion: never use corn syrup in an experiment like this
%
% Mean of experimental value and standard deviation which will be used for error
%
% Experiment 1
%
mempty = mean(Empty);
smempty = std(Empty);
m40 = mean(w40mL);
sm40 = std(w40mL);
m80 = mean(w80mL);
sm80 = std(w80mL);
m120 = mean(w120mL);
sm120 = std(w120mL); 
m160 = mean(w160mL);
sm160 = std(w160mL); 
m200 = mean(w200mL);
sm200 = std(w200mL);
%
% Experiment 2
%
mw0cm = mean(w0cm);
sw0cm = std(w0cm);
mw1cm = mean(w1cm);
sw1cm = std(w1cm);
mw2cm = mean(w2cm);
sw2cm = std(w2cm);
mw3cm = mean(w3cm);
sw3cm = std(w3cm);
wvol = [mempty,m40,m80,m120,m160,m200];
wvolstd = [smempty,sm40,sm80,sm120,sm160,sm200];
volume = [0,40,80,120,160,200];
%
% Experiment 1 graph
% Figure 1 (does not need to be expressly activated)
%
h1 = errorbar(volume,wvol,wvolstd);
set(h1,"marker",".");
set(h1,"linestyle","none");
%
% It is obvious that this will need to be an order 2 fit
% 
[p,s] = wpolyfit(volume,wvol,wvolstd,2);
[wvoln,wvolstdn]=polyconf(p,volume,s,'ci');
hold on
plot(volume,wvol,'bo;Average Frequency;');
plot(volume,wvoln,'r.-;Curve Fit;');
xlabel('Total water volume (mL) in the wine glass ');
ylabel('Resonance Frequency (Hz)');
axis([-10,210])
h2 = errorbar(volume,wvoln,wvolstdn);
set(h2,"color","red");
set(h2,"marker",".");
% axis([650,1000,-50,250])
title("Wine Glass - Volume","fontsize",12);
hold off
% 
% The equation:
%
dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr
equation = p
%
% So we have -(8.1152e-3 +- 2.6242e-4)*x^2 + (0.277000 +- 0.050156)*x + (954.88 +- 2.1531)
%
% Experiment 2 graph
% Figure 2 does need to be expressly activated
%
figure(2);
wcm = [mw0cm,mw1cm,mw2cm,mw3cm];
wcmstd = [sw0cm,sw1cm,sw2cm,sw3cm];
height = [0,1,2,3];
%
h1 = errorbar(height,wcm,wcmstd);
set(h1,"marker",".");
set(h1,"linestyle","none");
%
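% Issue: is this linear or a higher order polynomial?
% We compare the chi^2 of the two fits to see which is better...
% The F distribution is used to test the hypothesis (see help of wpolyfit)
% Note: in recent Octave versions fcdf is provided by the statistics
% package (pkg load statistics)
%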
% Running this
% wpolyfit(height,wcm,wcmstd,1)  
% will give you a graph and the polynomial;
% but the method below gives you control...
%
[poly1,s1] = wpolyfit(height,wcm,wcmstd,1);
[poly2,s2] = wpolyfit(height,wcm,wcmstd,2);
F = (s1.normr^2 - s2.normr^2)/(s1.df-s2.df)/(s2.normr^2/s2.df);
prob = 1-fcdf(F,s1.df-s2.df,s2.df)
%
% prob = 0.42651. This is definitely not less than 0.01, so we cannot reject
% the lower order polynomial...
% so we have some confidence in saying this is linear
%
% Degree 3 is automatically rejected since there are only 4 points
%
% This data set is too small to give a useful chi-square
%
[p,s] = wpolyfit(height,wcm,wcmstd,1);
%
[wcmn,wcmstdn]=polyconf(p,height,s,'ci');
hold on
plot(height,wcm,'bo;Average Frequency;');
plot(height,wcmn,'r.-;Curve Fit;');
xlabel('Height of added water in the wine glass (cm) ');
ylabel('Resonance Frequency (Hz)');
axis([-0.1,3.1]);
h2 = errorbar(height,wcmn,wcmstdn);
set(h2,"color","red");
set(h2,"marker",".");
% axis([-1,4,400,900]);
title("Wine Glass - Height","fontsize",12);
hold off
% 
% The equation:
%
dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr
equation = p
%
% So we have -(106.62+-2.1685)*x + (802.96+-5.5370) 
%

Fitting as shown in this part is commonly done in the laboratory, along with other techniques.
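The comments in the script compare a linear fit and a quadratic fit with an F test. The sketch below repeats that comparison using only core Octave functions. It assumes made-up data, uses the unweighted polyfit instead of wpolyfit, and evaluates the F distribution through the regularized incomplete beta function (betainc) so that no extra package is needed. All names and values are illustrative.

```octave
% Made-up, nearly linear data (illustrative only)
x = [0, 1, 2, 3, 4, 5, 6, 7];
y = [10.1, 7.9, 6.2, 3.8, 2.1, -0.1, -1.8, -4.2];

% Fit degree 1 and degree 2 polynomials. In the second output,
% normr is the norm of the residuals and df the degrees of freedom
% (number of points minus number of coefficients)
[p1, S1] = polyfit(x, y, 1);
[p2, S2] = polyfit(x, y, 2);

% F statistic: reduction in the sum of squared residuals per added
% parameter, relative to what remains per degree of freedom
d1 = S1.df - S2.df;
d2 = S2.df;
F  = ((S1.normr^2 - S2.normr^2) / d1) / (S2.normr^2 / d2);

% Upper-tail probability of the F distribution with (d1, d2)
% degrees of freedom, written with the incomplete beta function
prob = 1 - betainc(d1*F / (d1*F + d2), d1/2, d2/2);
printf("F = %g, prob = %g\n", F, prob);
```

A small probability (for example below 0.01, the threshold used in the script) would suggest that the extra quadratic term is justified. Otherwise the lower order polynomial cannot be rejected.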

FFT

The Fast Fourier Transform (FFT) is the method computers use to take a Fourier transform. The theory can wait until later courses, but we can still use the function since it is readily available in Octave (and MATLAB/Scilab, Python, IDL, etc.). As stated in previous sections, the Fourier transform converts a time-domain signal into a frequency-domain signal, and this transformation gives a different view of the data. For the purposes of this section, the FFT's main use for engineers is to extract hidden signals or data from time-dependent signals and to use that information for analysis or even for cleaning up noise.
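To make the idea of pulling a hidden signal out of noisy data concrete, here is a small sketch that buries a weak tone in random noise and then locates it with the FFT. The sampling rate, the 50 Hz tone, the noise level, and the variable names are all assumptions chosen for illustration.

```octave
fs = 1000;                       % assumed sampling rate (samples/second)
n  = 2*fs;                       % two seconds of samples
t  = (0:n-1)/fs;                 % time of each sample in seconds
randn("state", 1);               % fixed seed so the run is repeatable
sig = 0.5*sin(2*pi*50*t) + randn(size(t));   % weak 50 Hz tone buried in noise

spec = abs(fft(sig)).^2 / n;     % power in each frequency bin
half = spec(1:n/2+1);            % keep the one-sided part (0 to fs/2)
freq = (0:n/2) * fs / n;         % frequency of each kept bin in Hz

[~, k] = max(half(2:end));       % skip the DC bin when searching
peak_freq = freq(k+1);
printf("Strongest component near %g Hz\n", peak_freq);
```

Plotting sig against t shows little besides noise, while plotting half against freq shows a single sharp peak at the tone frequency.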

Example (FFT and some fitting as well)

For this example we create a program that generates some "perfect" data and then adds random noise to it, plus a 60 Hz noise (which itself has random noise). With this very noisy data we use the FFT and a spline fit to try to clean the data and see if we can find the original data. Note that while we know we added a 60 Hz noise, we want to pretend that we have no prior knowledge of what noise has corrupted our data. Instead we will see whether there is any noise in our data with a specific frequency characteristic. By the way, why 60 Hz? This question is for you to answer.

Here we display the results, where (A) is the data we get from the instrument. What do we see in (A)?

(B) is our data after we apply an FFT (actually a power density spectrum)
- Here we can see that there is a 60 Hz signal which we did not want, so we define that as noise; pretty obvious, isn't it?
- This "noise" was certainly not obvious in (A)
(C) is our data with the 60 Hz removed just by zeroing out that component (we could do better, but it would obscure the topic)
- Note we do this because the 60 Hz is noise; if it were not noise, then the FFT analysis in (B) would be enough
- Just zeroing out the component has the undesirable side effect of also zeroing out any part of the actual signal at that frequency
(D) is our spline fit, which in this case acts as a filter
- With (D) we can actually see some data which we could not see in (A)
(E) and (F) are the actual data that we generated (which in a real-world situation we would not know)
- (E) is the data with random noise added but not the 60 Hz noise
- (C) and (E) compare favorably, though (C) is still pretty noisy
- (F) is the original pristine signal created with zero noise (which is NOT how reality works)
- (D) and (F) compare favorably
We have used some of our previously described techniques to take a very noisy signal (A) that is seemingly useless and, through analysis (B) and a cleaning process (C and D), produce a reasonable signal (D)
- Caveat: Any cleaning procedure will lose some of your signal, so the engineer or scientist has to be very gentle with the data
- This is a little bit like art restoration, where you have to look at each piece carefully and slowly clean it without destroying the art
Example program is below...

# For those who saw the old page: This works now...yes!
# This is an example script to produce a signal with two types of noise on it: Random and specific 60 Hz noise
# This is not a function because it is a one time use example
#
# We choose a sample frequency of 8000 samples/second because this program was originally used for sound and that is the frequency
# that Octave uses for its record function (for recording sound)
# 
# So here note the sampling of the data and the time (5 secs) of the data
#
fsample = 8000;
time = 0:1/fsample:5-1/fsample;
#
# Here we produce a brute force data signal with noise
#
 x1 = 0.01*ones(1,1000);
 x2 = 0.017*ones(1,1000);
 x3 = 0.011*ones(1,1000);
 x4 = 0.02*ones(1,1000);
 x5 = 0.025*ones(1,1000);
 x6 = 0.012*ones(1,1000);
 x7 = 0.008*ones(1,1000);
 x8 = 0.024*ones(1,1000);
 x9 = 0.032*ones(1,1000);
 x10 = 0.04*ones(1,1000);
 x11 = 0.033*ones(1,1000);
 x12 = 0.025*ones(1,1000);
 x13 = 0.017*ones(1,1000);
 x14 = 0.012*ones(1,1000);
 x15 = 0.022*ones(1,1000);
 x16 = 0.01*ones(1,1000);
 x17 = 0.017*ones(1,1000);
 x18 = 0.014*ones(1,1000);
 x19 = 0.026*ones(1,1000);
 x20 = 0.02*ones(1,1000);
 x21 = 0.014*ones(1,1000);
 x22 = 0.007*ones(1,1000);
 x23 = 0.021*ones(1,1000);
 x24 = 0.029*ones(1,1000);
 x25 = 0.017*ones(1,1000);
 x26 = 0.026*ones(1,1000);
 x27 = 0.016*ones(1,1000);
 x28 = 0.006*ones(1,1000);
 x29 = 0.018*ones(1,1000);
 x30 = 0.011*ones(1,1000);
 x31 = 0.005*ones(1,1000);
 x32 = 0.021*ones(1,1000);
 x33 = 0.014*ones(1,1000);
 x34 = 0.02*ones(1,1000);
 x35 = 0.012*ones(1,1000);
 x36 = 0.005*ones(1,1000);
 x37 = 0.0015*ones(1,1000);
 x38 = 0.024*ones(1,1000);
 x39 = 0.032*ones(1,1000);
 x40 = 0.022*ones(1,1000);
 xcomp =[x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28,x29,x30,x31,x32,x33,x34,x35,x36,x37,x38,x39,x40];
 #
 # We add noise with a random function. Note that rand is only positive, never negative, so we have
 # actually shifted the signal itself as well. Since this is a demo we will just assume
 # that we have added a component to the signal, so the "true" signal is xcomp + 0.025
 x = xcomp + 0.05*rand(1,40000);
 truexcomp = xcomp + 0.025*ones(1,40000);
#
# Now we produce the 60 Hz noise interference
# 
elec_noise = cos(2*pi*60*time) + 0.05*rand(1,40000);
sig = x + elec_noise;
# 
subplot(321)
plot(time,sig)
title("(A) Our signal with ALL the noise components")
xlabel("Time (seconds)")
nlength = length(sig);
fftsig = fft(sig);
#
# Rids us of the two-sided 
# portion of the fft
#
fftsig2 = fftsig(1:nlength/2+1);
#
# The first part is a normalization (though not quite)
#
psdsig = (1/(fsample*nlength)).*fftsig2.*conj(fftsig2);
#
# Multiplying by two to ensure conservation of total power; the DC and
# Nyquist components are left out because they appear only once
#
psdsig(2:end-1)=2*psdsig(2:end-1);
freq = 0:fsample/nlength:fsample/2;
#
# Test plot
#
subplot(322)
plot(freq,psdsig,"color","magenta")
title("(B) FFT (in power form) of noisy (left) signal")
xlabel("Frequency (Hz)")
axis([0,100])
#
# From the power density function we can see there is 60 Hz noise (which we knew, 
# but in real analysis we would not know)
#
# Let us see if we can clear this noise out to get our signal back again by
# zeroing out the large 60 Hz component in the fft (not the psd, so we will have to
# do both sides).
# This is a brute force method for demonstration purposes.
#
# round() keeps the indices exact integers despite floating-point arithmetic
pos1 = round((60/fsample)*nlength) + 1;
pos2 = nlength - round((60/fsample)*nlength) + 1;
fftsigsub = fftsig;
fftsigsub(pos1) = 0.0;
fftsigsub(pos2) = 0.0;
#
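# Invert the FFT to return to the time domain. Note that ifft already undoes
# the scaling of fft; the division by sqrt(2) is an additional amplitude
# rescaling used in this demo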
#
timesig = ifft(fftsigsub)/sqrt(2);
#
subplot(323)
plot(time,timesig)
title("(C) 60 Hz removed (brute force)")
xlabel('Time (seconds)')
#
# OK, this recovers the original signal, but it is really noisy; let's filter it,
# or in other words fit it with a spline
#
# For a spline we need to decide a good knot number; this is quite difficult
#
# knots = linspace(0,5,400)
# knots = linspace(0,5,800)
# The previous knots removed more noise, but also removed some signal; this is 
# a delicate balance
#
knots = linspace(0,5,1600);
thespline = splinefit(time,timesig,knots);
timesigfilt = ppval(thespline,time);
subplot(324)
plot(time,timesigfilt)
title("(D) The signal filtered with spline fit")
xlabel('Time (seconds)')
#
subplot(325)
plot(time,x,"color","red")
title("(E) The signal before the 60Hz noise addition")
xlabel('Time (seconds)')
#
# This is the made-up signal with no noise whatsoever (you will never find
# this in real life)
#
subplot(326)
plot(time,truexcomp,"color","red")
title("(F) The original signal before noise was added")
xlabel('Time (seconds)')

There are many different applications of FFTs, and there are other useful transformations as well (like the Radon transform), so this is just a brief introduction to the many possible analysis and data recovery techniques. Now we will look at an FFT-based idea that is primarily for analysis.

Spectrogram (FFT-based function)

For analysis in a number of fields, a spectrogram is a useful tool. Spectrograms can be used with any signal (wave), such as sound or electromagnetic waves. A spectrogram is an image of what someone is hearing or seeing, and it is a convenient presentation that allows an engineer or scientist to see signals from a different point of view.

This analysis method is used in many different fields such as speech analysis, bird identification², seismology, vibration analysis for materials, audio engineering, astronomy, etc. For a classic demonstration of the spectrogram we will analyze a bird call made by a professor.

Example

Because recording sound depends on machine configuration, this Octave program will have to be run, modified, and experimented with on your personal computer.


This script demonstrates how to make a spectrogram of sound for use in identifying birds (or other things). Because the record function depends on machine configuration, this Octave program will have to be run, modified, and experimented with on your own computer, which shouldn't be difficult since Octave is free.

Spectrogram of the sound coming from the rare professor bird. A human can hear best between 1000 Hz and 5000 Hz³, so you would anticipate that a whistle would be in that range (which it is). Also notice the quick chirping followed by longer chirping.
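As a rough sketch of how a spectrogram is built, the script below cuts a signal into short windows and takes the FFT of each one, using only core Octave functions. It does not record sound, and it is not the script described above. The synthetic two-tone "whistle", the window length of 256 samples, and all variable names are assumptions made for illustration.

```octave
fs = 8000;                               % assumed sampling rate (samples/second)
t  = (0:fs-1)/fs;                        % one second of samples
% Synthetic "whistle": 2000 Hz for the first half second, 3000 Hz afterwards
sig = [sin(2*pi*2000*t(1:fs/2)), sin(2*pi*3000*t(fs/2+1:end))];

win    = 256;                            % samples per analysis window
nwin   = floor(length(sig)/win);         % number of non-overlapping windows
frames = reshape(sig(1:win*nwin), win, nwin);     % one window per column
hann   = 0.5 - 0.5*cos(2*pi*(0:win-1)'/(win-1));  % Hann taper to reduce leakage
S      = abs(fft(frames .* hann));       % fft works down each column
S      = S(1:win/2+1, :);                % keep frequencies from 0 to fs/2

freq = (0:win/2) * fs / win;             % frequency axis in Hz
tmid = ((0:nwin-1) + 0.5) * win / fs;    % centre time of each window

imagesc(tmid, freq, 20*log10(S + eps));  % intensity in decibels
axis xy;                                 % put low frequencies at the bottom
xlabel("Time (seconds)");
ylabel("Frequency (Hz)");
title("Spectrogram of a synthetic two-tone whistle");
```

Each column of S is the spectrum of one short slice of time, so the image shows how the frequency content changes: a bright band at 2000 Hz that jumps to 3000 Hz halfway through. A longer window sharpens the frequency axis but blurs the time axis, and a shorter window does the opposite.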

This is an example of a public domain image from the National Park Service of a spectrogram on their Soundscapes of Mount Rainier web site.

Spectrogram taken at Mount Rainier by the National Park Service. This spectrogram has been marked up with known sounds to help the general public understand some uses of spectrograms. There are more spectrograms at the web site linked above.

Final thoughts

This concludes our discussion of various techniques that engineers and scientists use computers for in their efforts to aid the world through knowledge and useful products. This is a small sampling of all the techniques used by engineers and scientists. A student should take a numerical methods course and a signals and systems course (or a combination of them in the same course) to learn more, especially about other important methods not discussed here.

¹Regression and curve fitting can mean the same thing to some and different things to others. Fitting as described here is not drawing a line from data point to data point, but rather a regression to produce the best model (curve) of the data. If you wish to delve into the debates on the two subjects, please feel free to use your favorite web search to wade into the thorny issues of definitions.

²In ornithology the spectrogram is also called the sonogram (which can be confused with the medical sonogram which is not the same idea). Ornithology has been using this technique for decades and there are many examples available including apps that describe the sound of a bird usually accompanied with a spectrogram.

³The full range of human hearing is from about 20 Hz to 20000 Hz though this varies depending on genetics and age.