Error Notes

Error notes (Physical chemistry)

Uploaded by

jagabandhu_patra
Chapter Three
DATA HANDLING AND SPREADSHEETS IN ANALYTICAL CHEMISTRY

"Facts are stubborn, but statistics are much more pliable." - Mark Twain

"43.89% of all statistics are worthless." - Anonymous

Although data handling normally follows the collection of data in an analysis, it is treated early in the text because a knowledge of statistical analysis will be required as you perform experiments in the laboratory. Also, statistics are necessary to understand the significance of the data that are collected and therefore to set limitations on each step of the analysis. The design of experiments (including size of sample required, accuracy of measurements required, and number of analyses needed) is determined from a proper understanding of what the data will represent.

The availability of spreadsheets to process data has made statistical and other calculations very efficient. You will first be presented with the details of various calculations throughout the text, which are necessary for full understanding of the principles. But spreadsheet calculations will also be introduced throughout to illustrate how to take advantage of this software for routine calculations. We will introduce the principles of the use of spreadsheets in this chapter.

3.1 Accuracy and Precision: There Is a Difference

Accuracy is the degree of agreement between the measured value and the true value. An absolute true value is seldom known. A more realistic definition of accuracy, then, would assume it to be the agreement between a measured value and the accepted true value.

Accuracy is how close you get to the bullseye. Precision is how close the repetitive shots are to one another. It is nearly impossible to have accuracy without good precision.

Fig. 3.1. Accuracy versus precision.

We can, by good analytical technique, such as making comparisons against a known standard sample of similar composition, arrive at a reasonable assumption about the accuracy of a method, within the limitations of the knowledge of
the "known" sample (and of the measurements). The accuracy to which we know the value of the standard sample is ultimately dependent on some measurement that will have a given limit of certainty in it.

Precision is defined as the degree of agreement between replicate measurements of the same quantity. That is, it is the repeatability of a result. The precision may be expressed as the standard deviation, the coefficient of variation, the range of the data, or as a confidence interval (e.g., 95%) about the mean value. Good precision does not assure good accuracy. This would be the case, for example, if there were a systematic error in the analysis. A weight used to measure each of the samples may be in error. This error does not affect the precision, but it does affect the accuracy. On the other hand, the precision can be relatively poor and the accuracy, more or less by chance, may be good. Since all real analyses are unknown, the higher the degree of precision, the greater the chance of obtaining the true value. It is fruitless to hope that a value is accurate if the precision is poor, and the analytical chemist strives for repeatable results to assure the highest possible accuracy.

Good precision does not guarantee accuracy.

"To be sure of hitting the target, shoot first, and call whatever you hit the target." - Ashleigh Brilliant

Determinate or systematic errors are nonrandom and occur when something is wrong with the measurement.

These concepts can be illustrated with a target, as in Figure 3.1. Suppose you are at target practice and you shoot the series of bullets that all land in the bullseye (left target). You are both precise and accurate. In the middle target, you are precise (steady hand and eye), but inaccurate. Perhaps the sight on your gun is out of alignment.
In the right target you are imprecise and therefore probably inaccurate. So we see that good precision is needed for good accuracy, but it does not guarantee it.

As we shall see later, the more measurements that are made, the more reliable will be the measure of precision. The number of measurements required will depend on the accuracy required and on the known reproducibility of the method.

3.2 Determinate Errors: They Are Systematic

Two main classes of errors can affect the accuracy or precision of a measured quantity. Determinate errors are those that, as the name implies, are determinable and that presumably can be either avoided or corrected. They may be constant, as in the case of an uncalibrated weight that is used in all weighings. Or, they may be variable but of such a nature that they can be accounted for and corrected, such as a buret whose volume readings are in error by different amounts at different volumes.

The error can be proportional to sample size or may change in a more complex manner. More often than not, the variation is unidirectional, as in the case of solubility loss of a precipitate (negative error). It can, however, be random in sign. Such an example is the change in solution volume and concentration occurring with changes in temperature. This can be corrected for by measuring the solution temperature. Such measurable determinate errors are classed as systematic errors.

Some common determinate errors are:

1. Instrumental errors. These include faulty equipment, uncalibrated weights, and uncalibrated glassware.
2. Operative errors. These include personal errors and can be reduced by experience and care of the analyst in the physical manipulations involved [...] effervescence and "bumping" during sample dissolution, incomplete drying of samples, and so on. These are difficult to correct for. Other personal errors include mathematical errors in calculations and prejudice in [...]

[...] where N is the number of measurements.
In practice, we must calculate the individual deviations from the mean of a limited number of measurements, $\bar{x}$, in which it is anticipated that $\bar{x} \to \mu$, although we have no assurance this will be so; $\bar{x}$ is given by $\sum (x_i / N)$.

For a set of N measurements, it is possible to calculate N independently variable deviations from some reference number. But if the reference number chosen is the estimated mean, $\bar{x}$, the sum of the individual deviations (retaining signs) must necessarily add up to zero, and so values of N − 1 deviations are adequate to define the Nth value. That is, there are only N − 1 independent deviations from the mean; when N − 1 values have been selected, the last is predetermined. We have, in effect, used one degree of freedom of the data in calculating the mean, leaving N − 1 degrees of freedom for calculating the precision.

As a result, the estimated standard deviation $s$ of a finite set of experimental data (generally N < 30) more nearly approximates $\sigma$ if the number of degrees of freedom is substituted for N (N − 1 adjusts for the difference between $\bar{x}$ and $\mu$):

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}} \tag{3.2}$$

The value of $s$ is only an estimate of $\sigma$, then, and will more nearly approach $\sigma$ as the number of measurements increases. Since we deal with small numbers of measurements in an analysis, the precision is necessarily represented by $s$.

Example 3.7
Calculate the mean and the standard deviation of the following set of analytical results: 15.67, 15.69, and 16.03 g.

Solution
[...]

The standard deviation may be calculated also using the following equivalent equation:

$$s = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2 / N}{N - 1}} \tag{3.3}$$

This is useful for computations with a calculator.
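As a hedged illustration that is not part of the original text, Equations 3.2 and 3.3 can be checked against each other with the Example 3.7 data. The function names below are hypothetical, chosen only for this sketch.

```python
import math


def std_dev_definition(values):
    """Equation 3.2: s = sqrt(sum((x_i - mean)^2) / (N - 1))."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def std_dev_shortcut(values):
    """Equation 3.3: s = sqrt((sum(x_i^2) - (sum(x_i))^2 / N) / (N - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


results = [15.67, 15.69, 16.03]  # Example 3.7 data, in grams
mean = sum(results) / len(results)
s_def = std_dev_definition(results)
s_short = std_dev_shortcut(results)

print(f"mean = {mean:.2f} g")
print(f"s (Eq. 3.2) = {s_def:.2f} g, s (Eq. 3.3) = {s_short:.2f} g")
```

Carried at full floating-point precision, the two forms agree. Equation 3.3 subtracts two large, nearly equal sums, so rounding them by hand can shift the last digit, which is why extra digits should be carried when the shortcut form is used on a calculator.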
Many calculators, in fact, have a standard deviation program that automatically calculates the standard deviation from entered individual data.

See Section 3.15 and Equation 3.17 for another way of estimating $s$ for four or less numbers.

The precision improves as the square root of the number of measurements.

Example 3.8
Calculate the standard deviation for the data in Example 3.7 using Equation 3.3.

Solution
[...]

The difference of 0.01 g from Example 3.7 is not statistically significant since the variation is at least ±0.2 g. In applying this formula, it is important to keep an extra digit or even two in $x^2$ for the calculation.

The standard deviation calculation considered so far is an estimate of the probable error of a single measurement. The arithmetical mean of a series of N measurements taken from an infinite population will show less scatter from the "true value" than will an individual observation. The scatter will decrease as N is increased; as N gets very large the sample average will approach the population average $\mu$, and the scatter approaches zero. The arithmetical mean derived from N measurements can be shown to be $\sqrt{N}$ times more reliable than a single measurement. Hence, the random error in the mean of a series of four observations is one-half that of a single observation. In other words, the precision of the mean of a series of N measurements is inversely proportional to the square root of N of the deviation of the individual values. Thus,

$$\text{Standard deviation of the mean} = s(\text{mean}) = \frac{s}{\sqrt{N}} \tag{3.4}$$

The standard deviation of the mean is sometimes referred to as the standard error.

The standard deviation is sometimes expressed as the relative standard deviation (rsd), which is just the standard deviation expressed as a fraction of the mean; usually it is given as the percentage of the mean (% rsd), which is often called the coefficient of variation.

Example 3.9
The following replicate weighings were obtained: 29.8, 30.2, 28.6, and 29.7 mg.
Calculate the standard deviation of the individual values and the standard deviation of the mean. Express these as absolute (units of the measurement) and relative (% of the measurement) values.

Solution

$x_i$        $|x_i - \bar{x}|$      $(x_i - \bar{x})^2$
29.8         0.2                    0.04
30.2         0.6                    0.36
28.6         1.0                    1.00
29.7         0.1                    0.01
Σ 118.3      Σ 1.9                  Σ 1.41

$$\bar{x} = \frac{118.3}{4} = 29.6$$

$$s = \sqrt{\frac{1.41}{4 - 1}} = 0.69 \text{ mg (absolute)}; \quad \frac{0.69}{29.6} \times 100\% = 2.3\% \text{ (coefficient of variation)}$$

$$s(\text{mean}) = \frac{0.69}{\sqrt{4}} = 0.34 \text{ mg (absolute)}; \quad \frac{0.34}{29.6} \times 100\% = 1.1\% \text{ (relative)}$$

The precision of a measurement can be improved by increasing the number [...] is increased and would approach zero as the number of observations approached infinity. However, as seen above (Equation 3.4), the deviation of the mean does not decrease in direct proportion to the number of observations, but instead it decreases as the square root of the number of observations. A point will be reached where a slight increase in precision will require an unjustifiably large increase in the number of observations. For example, to decrease the standard deviation by a factor of 10 requires 100 times as many observations.

The practical limit of useful replication is reached when the standard deviation of the random errors is comparable to the magnitude of the determinate or systematic errors (unless, of course, these can be identified and corrected for). This is because the systematic errors in a determination cannot be removed by replication.

The significance of $s$ in relation to the normal distribution curve is shown in Figure 3.2. The mathematical treatment from which the curve was derived reveals that 68% of the individual deviations fall within one standard deviation (for an infinite population) from the mean, 95% are less than twice the standard deviation, and 99% are less than 2.5 times the standard deviation. So, a good approximation is that 68% of the individual values will fall within the range $\bar{x} \pm s$, 95% will fall within $\bar{x} \pm 2s$, 99% will fall within $\bar{x} \pm 2.5s$, and so on.

Actually, these percentage ranges were derived assuming an infinite number of measurements.
There are then two reasons why the analyst cannot be 95% certain that the true value falls within $\bar{x} \pm 2s$. First, one makes a limited number of measurements, and the fewer the measurements, the less certain one will be. Second, the normal distribution curve assumes no determinate errors, but only random errors. Determinate errors, in effect, shift the normal error curve from the true value. An estimate of the actual certainty a number falls within $\pm s$ can be obtained from a calculation of the confidence limit (see below).

It is apparent that there are a variety of ways in which the precision of a number can be reported. Whenever a number is reported as $\bar{x} \pm x$, you should always qualify under what conditions this holds, that is, how you arrived at $\pm x$. It may, for example, represent $s$, $2s$, $s$ (mean), or the coefficient of variation.

"Randomness is required to make statistical calculations come out right." - Anonymous

The true value will fall within $\bar{x} \pm 2s$ 95% of the time for an infinite number of measurements. See the confidence limit and Example 3.15.

The variance equals $s^2$.

A term that is sometimes useful in statistics is the variance. This is the square of the standard deviation, $s^2$. We shall use this in determining the propagation of error and in the F test below (Section 3.13).

3.8 Use of Spreadsheets in Analytical Chemistry

A spreadsheet is a powerful software program that can be used for a variety of functions, such as data analysis and plotting. Spreadsheets are useful for organizing data, doing repetitive calculations, and displaying the calculations graphically or in chart form. They have built-in functions, for example, standard deviation and other statistical functions, for carrying out computations on data that are input by the user. Popular spreadsheet programs include Microsoft Excel, Lotus 1-2-3, and Quattro Pro. All operate basically the same but differ somewhat in specific commands and syntax. Because of its widespread availability and popularity, we will use Excel in our illustrations.

You probably have used a spreadsheet program before and are familiar with the basic functions.
But we will summarize here the most useful aspects for analytical chemistry applications. You should consult the spreadsheet manual for more detailed information. Also, the Excel Help on the tool bar provides specific information.

You are referred to the excellent tutorial on using the Excel spreadsheet prepared by faculty at California State University at Stanislaus: [...]. The basic functions in the spreadsheet are described, including entering data and formulas, formatting cells, graphing, and regression analysis. You will find this very helpful and should definitely read it before continuing. The website [...] at Western Kentucky University gives summary instructions for graphing using either Microsoft Excel or Lotus 1-2-3. [...]

A spreadsheet consists of cells arranged in columns (labeled A, B, C, ...) and rows (numbered 1, 2, 3, ...). An individual cell is identified by its column letter and row number, for example, B3. Figure 3.3 has the identifiers typed into some of the cells to illustrate. When the mouse pointer (the cross) is clicked on an individual cell, it becomes the active cell (dark lines around it), and the active cell is indicated at the top left of the formula bar, and the contents of the cell are listed to the right of the equal sign on the bar.

Fig. 3.3. Spreadsheet cells.

FILLING THE CELL CONTENTS

You may enter text, numbers, or formulas in specific cells. Formulas are the key to the utility of spreadsheets, allowing the same calculation to be applied to many numbers. We will illustrate with calculations of the weights of water delivered by two different 20-mL pipets, from the difference in the weights of a flask plus water and the empty flask. Refer to Figure 3.4 as you go through the steps.

Open an Excel spreadsheet by clicking on the Excel icon (or the Microsoft Excel program under Start:Programs). You will enter text, numbers, and formulas. Double click on the specific cell to activate it.
Enter as follows (information typed into a cell is entered by depressing the Enter key):

Cell A1: Net weights
Cell A3: Pipet
Cell A4: Weight of flask + water, g
Cell A5: Weight of flask, g
Cell A6: Weight of water, g

You may make corrections by double clicking on a cell; then edit the text. (You can also edit the text in the formula bar.) If you single click, new text replaces the old text. You will have to widen the A cells to accommodate the lengthy text. Do so by placing the mouse pointer on the line between A and B on the row at the top, and dragging it to the right until all the text shows. This moves the other cells to the right.

Cell B3: 1
Cell C3: 2
Cell B4: 47.700
Cell C4: 49.239
Cell B5: 27.687
Cell C5: 29.199
Cell B6: =B4-B5

You can also enter the formula by typing =, then click on B4, then type -, and click on B5. You need to format the cells B4 to C6 to three decimal places. Highlight that block of cells by clicking on one corner and dragging to the opposite corner of the block. In the Menu bar, click on Format:Cells:Number. For Decimal places, type 3, and click OK.

You need to add the formula to cell C6. You can retype it. But there is an easier way, by copying (filling) the formula in cell B6. Place the mouse pointer on the lower right corner of cell B6 and drag it to cell C6. This fills the formula into C6 (or additional cells to the right if there are more pipet columns). You may also fill formulas into highlighted cells by clicking on Edit:Fill:Down (or Right).

Double click on B6. This shows the formula in the cell and outlines the other cells contained in the formula. Do the same for C6. Note that when you activate the cell by either single or double clicking on it, the formula is shown in the formula bar.

Fig. 3.4. Filling cell contents.
SAVING THE SPREADSHEET

Save the spreadsheet you have just created by clicking on File:Save As. I like to save documents to the desktop first. Then they can be dragged to whatever file you wish, for example, My Documents. That way they don't get lost. So select Desktop at the top. Give the document a File Name at the bottom, for example, Pipet Calibration. Then click Save. If you wish to place the saved document on a disk, you can drag it from the desktop to the opened disk.

PRINTING THE SPREADSHEET

Click File:Page Setup. Normally, a sheet is printed in the Portrait format, that is, vertically on the 8½ × 11-inch paper. If there are many columns, you may wish to print in Landscape, that is, horizontally. If you want gridlines to print, click on Sheet:Gridlines. Now you are ready to print. Click on Print:OK. Just the working area of the spreadsheet will print, not the column and row identifiers.

RELATIVE VS. ABSOLUTE CELL REFERENCES

In the example above, we used relative cell references in copying the formula. The formula in cell B6 said to subtract the cell immediately above it (B5) from the cell above that one (B4). The copied formula in C6 said the same for the cells above it (C4 minus C5).

Sometimes we need to include a specific cell in each calculation, containing, say, a constant. To do this, we need to identify it in the formula as an absolute reference. This is accomplished by placing a $ sign in front of the column and row cell identifiers, for example, $B$2. Placing the sign in front of both assures that whether we move across columns or rows, it will remain an absolute reference.

We can illustrate this by creating a spreadsheet to calculate the means of different series of numbers. Fill in the spreadsheet as follows (refer to Figure 3.5):

A1: Titration means
A3: Titn. No.
B3: Series A, mL
C3: Series B, mL
B4: 39.27
B5: 39.18
B6: 39.30
B7: 39.20
C4: 45.59
C5: 45.55
C6: 45.65
C7: 45.66
A4: 1

We can type in each of the titration numbers (1 through 4), but there are automatic ways of incrementing a string of numbers. Click on Edit:Fill:Series.
Check Columns and Linear, and leave Step Value at 1. For Stop Value, enter 4 and click OK. The numbers 2 through 4 are inserted in the spreadsheet. You could also first highlight the cells you want filled (beginning with cell A4). Then you do not have to insert a Stop Value. Another way of incrementing a series is to do it by formula. In cell A5, type =A4+1. Then you can fill down by highlighting from A5 down, and clicking on Edit:Fill:Down. (This is a relative reference.) Or, you can highlight cell A5, click on its lower right corner, and drag it to cell A7. This automatically copies the formula in the other cells.

Fig. 3.5. Relative and absolute cell references.

Now we wish to insert a formula in cell B8 to calculate the mean. This will be the sum divided by the number of titrations (cell A7).

B8: =sum(B4:B7)/$A$7

We place the $ signs in the divisor because it will be an absolute reference that we wish to copy to the right in cell C8. Placing a $ before both the column and row addresses assures that the cell will be treated as absolute whether it is copied horizontally or vertically. The sum(B4:B7) is a syntax in the program for summing a series of numbers, from cell B4 through cell B7. Instead of typing in the cell addresses, you can also type "=sum(", then click on cell B4 and drag to cell B7, and type ")".

We have now calculated the mean for series A. We wish to do the same for series B. Highlight cell B8, click on its lower right corner, and drag it to cell C8. Voilà, the next mean is calculated! Double click on cell C8, and you will see that the formula has the same divisor (absolute reference), but the sum is a relative reference. If we had not typed in the $ signs to make the divisor absolute, the formula would have assumed it was relative, and the divisor in cell C8 would be cell B7.
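For readers who think in code rather than cells, the same calculation can be sketched in Python. This is only an analogy and is not part of the original text: the dictionary stands in for columns B and C, and the single `n_titrations` variable (a name assumed here) plays the role of the absolute reference $A$7, a divisor shared by every column.

```python
# Titration volumes (mL) as entered in cells B4:B7 and C4:C7 of the example.
series = {
    "Series A": [39.27, 39.18, 39.30, 39.20],
    "Series B": [45.59, 45.55, 45.65, 45.66],
}

# Stands in for the absolute reference $A$7: one divisor used by every column.
n_titrations = 4

# The sum over each column changes from column to column (like the relative
# range B4:B7 becoming C4:C7), while the divisor stays fixed.
means = {name: sum(volumes) / n_titrations for name, volumes in series.items()}

for name, mean in means.items():
    print(f"{name}: mean = {mean:.2f} mL")
```

If the divisor were instead looked up relative to each column, as a relative reference would be, the second mean would divide by whatever value sat in the neighboring cell, which is the mistake the $ signs prevent.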
USE OF EXCEL STATISTICAL FUNCTIONS

Excel has a large number of mathematical and statistical functions that can be used for calculations in lieu of writing your own formulas. Let's try the statistical functions to automatically calculate the mean. Highlight an empty cell and click on fx in the tool bar. The Paste Function window appears. Select Statistical in the Function category. A window listing the statistical functions appears.

Select AVERAGE for the Function name. Click OK, and for Number1, type B4:B7, and click OK. The same average is calculated as you obtained with your own formula. You can also type in the activated cell the syntax =average(B4:B7). Try it.

Let's calculate the standard deviation of the results. Highlight cell B9. Under the Statistical function, select STDEV for the Function name. Alternatively, you can type the syntax into cell B9, =stdev(B4:B7). Now copy the formula to cell C9. Perform the standard deviation calculation using Equation 3.2 and compare with the Excel values. The calculation for series A is ±0.057 mL. The value in the spreadsheet, of course, should be rounded to ±0.06 mL.

USEFUL SYNTAXES

Excel has numerous mathematical and statistical functions or syntaxes that can be used to simplify setting up calculations. Peruse the Function names for the Math & Trig and the Statistical function categories under fx in the toolbar.
Some you will find useful for this text are:

Math and trig functions
LOG10      Calculates the base-10 logarithm of a number
PRODUCT    Calculates the products of a series of numbers
POWER      Calculates the result of a number raised to a power
SQRT       Calculates the square root of a number

Statistical functions
AVERAGE    Calculates the mean of a series of numbers
MEDIAN     Calculates the median of a series of numbers
STDEV      Calculates the standard deviation of a series of numbers
TTEST      Calculates the probability associated with Student's t test
VAR        Calculates the variance of a series of numbers

The syntaxes may be typed, followed by the range of cells in parentheses, as we did above.

This tutorial should provide you the basis for other spreadsheet applications. You can write any formula that is in this book into an active cell, and insert appropriate data for calculations. And obviously, we can perform a variety of data analyses. We can prepare plots and charts of the data, for example, a calibration curve of instrument response versus concentration, along with statistical information. We will illustrate this later in the chapter.

3.9 Propagation of Errors: Not Just Additive

When discussing significant figures earlier, we stated that the relative uncertainty in the answer to a multiplication or division operation could be no better than the relative uncertainty in the operator that had the poorest relative uncertainty. Also, the absolute uncertainty in the answer of an addition or subtraction could be no better than the absolute uncertainty in the number with the largest absolute uncertainty. Without specific knowledge of the uncertainties, we assumed an uncertainty of at least ±1 in the last digit of each number.

From a knowledge of the uncertainties in each number, it is possible to estimate the actual uncertainty in the answer.
The error in the individual numbers will propagate throughout a series of calculations, in either a relative or an absolute fashion, depending on whether the operation is a multiplication or division or whether it is an addition or a subtraction.

ADDITION AND SUBTRACTION: THINK ABSOLUTE VARIANCES

The absolute variances of additions and subtractions are additive.

Consider the addition and subtraction of the following numbers:

$$(65.06 \pm 0.07) + (16.13 \pm 0.01) - (22.68 \pm 0.02) = 58.51\ (\pm ?)$$

The uncertainties listed represent the random or indeterminate errors associated with each number, expressed as standard deviations of the numbers. The maximum error of the summation, expressed as a standard deviation, would be ±0.10; that is, it could be either +0.10 or −0.10 if all uncertainties happened to have the same sign. The minimum uncertainty would be 0.00 if all combined by chance to cancel. Both of these extremes are not highly likely, and statistically the uncertainty will fall somewhere in between. For addition and subtraction, it is the absolute variances, not the uncertainties themselves, that are additive. The most probable error is represented by the square root of the sum of the absolute variances. That is, the absolute variance of the answer is the sum of the individual variances. For $a = b + c - d$,

$$s_a^2 = s_b^2 + s_c^2 + s_d^2 \tag{3.5}$$

$$s_a = \sqrt{s_b^2 + s_c^2 + s_d^2} \tag{3.6}$$

In the above example,

$$s_a = \sqrt{(\pm 0.07)^2 + (\pm 0.01)^2 + (\pm 0.02)^2} = \sqrt{(49 \times 10^{-4}) + (1 \times 10^{-4}) + (4 \times 10^{-4})} = \sqrt{54 \times 10^{-4}} = \pm 7.3 \times 10^{-2}$$

So the answer is 58.51 ± 0.07. The number ±0.07 represents the absolute uncertainty. If we wish to express it as relative uncertainty, this would be

$$\frac{\pm 0.07}{58.51} \times 100\% = \pm 0.12\%$$

Example 3.10
You have received three shipments of uranium ore of equal weight. Analysis of the three ores indicated contents of 3.978 ± 0.004%, 2.536 ± 0.003%, and 3.680 ± 0.003%, respectively. What is the average uranium content of the ores and what are the absolute and relative uncertainties?

Solution

$$\bar{x} = \frac{(3.978 \pm 0.004\%) + (2.536 \pm 0.003\%) + (3.680 \pm 0.003\%)}{3}$$
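The uncertainty in the summation is

$$s_a = \sqrt{(\pm 0.004)^2 + (\pm 0.003)^2 + (\pm 0.003)^2} = \sqrt{(16 \times 10^{-6}) + (9 \times 10^{-6}) + (9 \times 10^{-6})} = \sqrt{34 \times 10^{-6}} = \pm 5.8 \times 10^{-3}\ \%\ \text{U}$$

Hence, the sum is 10.194 ± 0.0058% U. Dividing by 3, which has no uncertainty, scales the value and its absolute uncertainty equally:

$$\bar{x} = \frac{10.194 \pm 0.0058}{3} = 3.398 \pm 0.002\%\ \text{U}$$

Note that since there is no uncertainty in the divisor 3, the relative uncertainty in the uranium content is the same as that of the sum:

$$\frac{5.8 \times 10^{-3}\ \%\ \text{U}}{10.194\ \%\ \text{U}} = 6 \times 10^{-4} \text{ or } 0.06\%$$

MULTIPLICATION AND DIVISION: THINK RELATIVE VARIANCES

The relative variances of multiplication and division are additive.

Consider the following operation:

$$\frac{(13.67 \pm 0.02)(120.4 \pm 0.2)}{4.623 \pm 0.006} = 356.0\ (\pm ?)$$

Here, the relative variances are additive, and the most probable error is represented by the square root of the sum of the relative variances. That is, the relative variance of the answer is the sum of the individual relative variances. For $a = bc/d$,

$$(s_a^2)_{\text{rel}} = (s_b^2)_{\text{rel}} + (s_c^2)_{\text{rel}} + (s_d^2)_{\text{rel}} \tag{3.7}$$

$$(s_a)_{\text{rel}} = \sqrt{(s_b^2)_{\text{rel}} + (s_c^2)_{\text{rel}} + (s_d^2)_{\text{rel}}} \tag{3.8}$$

In the above example,

$$(s_b)_{\text{rel}} = \frac{\pm 0.02}{13.67} = \pm 0.0015 \qquad (s_c)_{\text{rel}} = \frac{\pm 0.2}{120.4} = \pm 0.0017 \qquad (s_d)_{\text{rel}} = \frac{\pm 0.006}{4.623} = \pm 0.0013$$

$$(s_a)_{\text{rel}} = \sqrt{(\pm 0.0015)^2 + (\pm 0.0017)^2 + (\pm 0.0013)^2} = \sqrt{(2.2 \times 10^{-6}) + (2.9 \times 10^{-6}) + (1.7 \times 10^{-6})} = \sqrt{6.8 \times 10^{-6}} = \pm 2.6 \times 10^{-3}$$

The absolute uncertainty is given by

$$s_a = a \times (s_a)_{\text{rel}} = 356.0 \times (\pm 2.6 \times 10^{-3}) = \pm 0.93$$

So the answer is 356.0 ± 0.9.

Example 3.11
Calculate the uncertainty in the number of millimoles of chloride contained in 250.0 mL of a sample when three equal aliquots of 25.00 mL are titrated with silver nitrate with the following results: 36.78, 36.82, and 36.75 mL. The molarity of the AgNO₃ solution is 0.1167 ± 0.0002 M.

Solution
The mean volume is

$$\frac{36.78 + 36.82 + 36.75}{3} = 36.78 \text{ mL}$$

The standard deviation is

$x_i$        $x_i - \bar{x}$      $(x_i - \bar{x})^2$
36.78        0.00                 0.0000
36.82        0.04                 0.0016
36.75        0.03                 0.0009
                                  Σ 0.0025

$$s = \sqrt{\frac{0.0025}{3 - 1}} = 0.035$$

Mean volume = 36.78 ± 0.04 mL

mmol Cl⁻ titrated = (0.1167 ± 0.0002 mmol/mL)(36.78 ± 0.04 mL) = 4.292 (±?)

$$(s_b)_{\text{rel}} = \frac{\pm 0.0002}{0.1167} = \pm 0.0017 \qquad (s_c)_{\text{rel}} = \frac{\pm 0.035}{36.78} = \pm 0.00095$$

$$(s_a)_{\text{rel}} = \sqrt{(\pm 0.0017)^2 + (\pm 0.00095)^2} = \sqrt{(2.9 \times 10^{-6}) + (0.90 \times 10^{-6})} = \sqrt{3.8 \times 10^{-6}} = \pm 1.9 \times 10^{-3}$$

The absolute uncertainty in the millimoles of Cl⁻ is 4.292 × (±0.0019) = ±0.0082 mmol

mmol Cl⁻ in 25 mL = 4.292 ± 0.0082 mmol

mmol Cl⁻ in 250 mL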
= 10(4.292 ± 0.0082) = 42.92 ± 0.08 mmol

Note that we retained one extra figure in computations until the final answer. Here, the absolute uncertainty determined is proportional to the size of the sample; it would not remain constant for twice the sample size, for example.

If there is a combination of multiplication/division and addition/subtraction in a calculation, the uncertainties of these must be combined.

Example 3.12
You have received three shipments of iron ore of the following weights: 2852, 1578, and 1877 lb. There is an uncertainty in the weights of ±5 lb. Analysis of the ores gives 36.28 ± 0.04%, 22.68 ± 0.03%, and 49.23 ± 0.06%, respectively. You are to pay $300 per ton of iron. What should you pay for these three shipments and what is the uncertainty in the payment?

Solution
We need to calculate the weight of iron in each shipment, with the uncertainties, and then add these together to obtain the total weight of iron and the uncertainty in this. The relative uncertainties in the weights are

$$\frac{\pm 5}{2852} = \pm 0.0017 \qquad \frac{\pm 5}{1578} = \pm 0.0032 \qquad \frac{\pm 5}{1877} = \pm 0.0027$$

The relative uncertainties in the analyses are

$$\frac{\pm 0.04}{36.28} = \pm 0.0011 \qquad \frac{\pm 0.03}{22.68} = \pm 0.0013 \qquad \frac{\pm 0.06}{49.23} = \pm 0.0012$$

The weights of iron in the shipments are

$$\frac{(2852 \pm 5 \text{ lb})(36.28 \pm 0.04\%)}{100} = 1034.7\ (\pm ?) \text{ lb Fe}$$

$$(s_a)_{\text{rel}} = \sqrt{(\pm 0.0017)^2 + (\pm 0.0011)^2} = \pm 0.0020$$

s_a =
1034.7 × (±0.0020) = ±2.1 lb

lb Fe = 1034.7 ± 2.1 (We will carry an additional figure throughout.)

$$\frac{(1578 \pm 5 \text{ lb})(22.68 \pm 0.03\%)}{100} = 357.89\ (\pm ?) \text{ lb Fe}$$

$$(s_a)_{\text{rel}} = \sqrt{(\pm 0.0032)^2 + (\pm 0.0013)^2} = \pm 0.0034$$

s_a = 357.89 × (±0.0034) = ±1.2 lb

lb Fe = 357.9 ± 1.2

$$\frac{(1877 \pm 5 \text{ lb})(49.23 \pm 0.06\%)}{100} = 924.05\ (\pm ?) \text{ lb Fe}$$

$$(s_a)_{\text{rel}} = \sqrt{(\pm 0.0027)^2 + (\pm 0.0012)^2} = \pm 0.0030$$

s_a = 924.05 × (±0.0030) = ±2.8 lb

lb Fe = 924.0 ± 2.8

Total Fe = (1034.7 ± 2.1 lb) + (357.9 ± 1.2 lb) + (924.0 ± 2.8 lb) = 2316.6 (±?) lb

$$s_a = \sqrt{(\pm 2.1)^2 + (\pm 1.2)^2 + (\pm 2.8)^2} = \pm 3.7 \text{ lb}$$

Total Fe = 2317 ± 4 lb

Price = (2316.6 ± 3.7 lb)($0.15/lb) = $347.49 ± 0.56

Hence, you should pay $347.50 ± 0.60.

Example 3.13
You determine the acetic acid content of vinegar by titrating with a standard (known concentration) solution of sodium hydroxide to a phenolphthalein end point. An approximately 5-mL sample of vinegar is weighed on an analytical balance in a weighing bottle (the increase in weight represents the weight of the sample) and is found to be 5.0268 g. The uncertainty in making a single weighing is ±0.2 mg. The sodium hydroxide must be accurately standardized (its concentration determined) by titrating known weights of high-purity potassium acid phthalate, and three such titrations give molarities of 0.1167, 0.1163, and 0.1164 M. A volume of 36.78 mL of sodium hydroxide is used to titrate the sample. The uncertainty in reading the buret is ±0.02 mL. What is the percent acetic acid in the vinegar, and what is its uncertainty?

Solution
Two weighings are required to obtain the weight of the sample: that of the empty weighing bottle and that of the bottle plus sample. Each has an uncertainty of ±0.2 mg, and so the uncertainty of the net sample weight (the difference of the two weights) is

$$s_{\text{wt}} = \sqrt{(\pm 0.2)^2 + (\pm 0.2)^2} = \pm 0.3 \text{ mg}$$

The mean of the molarity of the sodium hydroxide is 0.1165 M, and its standard deviation is ±0.0002 M. Similarly, two buret readings (initial and final) are required to obtain the volume of base delivered, and the total uncertainty is

$$s_{\text{vol}} = \sqrt{(\pm 0.02)^2 + (\pm 0.02)^2} = \pm 0.03 \text{ mL}$$
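The two difference uncertainties just computed (net sample weight and delivered volume) follow the addition/subtraction rule, and the product/quotient rule can be checked the same way. The sketch below is an illustration only and is not from the original text; the helper names `combine_absolute` and `combine_relative` are hypothetical.

```python
import math


def combine_absolute(*uncertainties):
    """Addition/subtraction rule: the absolute variances add."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))


def combine_relative(*value_uncertainty_pairs):
    """Multiplication/division rule: the relative variances add.

    Each argument is a (value, absolute_uncertainty) pair; the result is
    the relative uncertainty of the product or quotient.
    """
    return math.sqrt(sum((u / v) ** 2 for v, u in value_uncertainty_pairs))


# Example 3.13: each quantity is the difference of two readings.
s_weight = combine_absolute(0.2, 0.2)    # mg, two weighings
s_volume = combine_absolute(0.02, 0.02)  # mL, two buret readings

# Product/quotient illustration: (13.67 +/- 0.02)(120.4 +/- 0.2)/(4.623 +/- 0.006)
a = 13.67 * 120.4 / 4.623
rel = combine_relative((13.67, 0.02), (120.4, 0.2), (4.623, 0.006))
s_a = a * rel  # convert the relative uncertainty back to an absolute one

print(f"s_weight = {s_weight:.1f} mg, s_volume = {s_volume:.2f} mL")
print(f"a = {a:.1f}, relative = {rel:.4f}, absolute = {s_a:.1f}")
```

Because the code carries unrounded intermediate values, its last digits can differ slightly from a hand calculation that rounds each relative uncertainty first; after rounding to the justified number of figures the results agree (±0.3 mg, ±0.03 mL, and 356.0 ± 0.9).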
The moles of acetic acid are equal to the moles of sodium hydroxide used to titrate it, so the percent of acetic acid is

$$\%\ \text{HOAc} = \frac{(0.1165 \pm 0.0002)\ \text{mmol/mL} \times (36.78 \pm 0.03)\ \text{mL} \times 60.05\ \text{mg/mmol}}{(5026.8 \pm 0.3)\ \text{mg}} \times 100\% = 5.119\ (\pm ?)\%$$

The uncertainty in the formula weight of acetic acid is assumed to be negligible (could actually calculate it to six figures to be exact). The relative uncertainties of the three measured quantities are

$$(s_M)_{\text{rel}} = \frac{\pm 0.0002}{0.1165} = \pm 0.0017 \qquad (s_{\text{vol}})_{\text{rel}} = \frac{\pm 0.03}{36.78} = \pm 0.0008 \qquad (s_{\text{wt}})_{\text{rel}} = \frac{\pm 0.3}{5026.8} = \pm 0.00006$$

The uncertainty in the analysis is

$$(s_a)_{\text{rel}} = \sqrt{(\pm 0.0017)^2 + (\pm 0.0008)^2 + (\pm 0.00006)^2} = \pm 0.0019$$

$$s_a = 5.119 \times (\pm 0.0019) = \pm 0.010\% \text{ acetic acid}$$

Hence, the acetic acid content is 5.119 ± 0.010%. The relative uncertainty is about 2 parts per thousand.

The factor that limited the uncertainty the most was the variance in the molarity of the sodium hydroxide solution. This illustrates the importance of careful calibration, which is discussed in Chapter 2.

3.10 Significant Figures and Propagation of Error

The number of significant figures in an answer is determined by the uncertainty due to propagation of error.

We noted earlier that the total uncertainty in a computation determines how accurately we can know the answer. In other words, the uncertainty sets the number of significant figures. Take the following example:

$$(73.1 \pm 0.2)(2.245 \pm 0.008) = 164.1 \pm 0.7$$

We are justified in keeping four figures, even though the key number has three. Here, we don't have to carry the additional figure as a subscript since we have indicated the actual uncertainty in it. Note that the greatest relative uncertainty in the multipliers is 0.0036, while that in the answer is 0.0043; so, due to the propagation of error, we know the answer somewhat less accurately than the key number. The key number (the one with the greatest uncertainty), when actual uncertainties are known, may not necessarily be the one with the smallest number of digits. For
For example, the relative uncertainty in 78.1 ± 0.2 is 0.003, while that in 11.21 ± 0.08 is 0.007.

Suppose we have the following calculation:

$$(73.1 \pm 0.9)(2.245 \pm 0.008) = 164.1 \pm 2.1 = 164 \pm 2$$

Now the uncertainty in the answer is in the units place, and so figures beyond that are meaningless. In this instance, the relative uncertainties in the key number and in the answer are similar (±0.012), since the uncertainty in the other multiplier is significantly smaller.

Example 3.14

Provide the answers to the following calculations to the proper number of significant figures:

(a) $(38.68 \pm 0.07) - (6.16 \pm 0.09) = 32.52$

(b) $\dfrac{(12.18 \pm 0.08)(23.04 \pm 0.07)}{3.247 \pm 0.006} = 86.43$

Solution

(a) The calculated absolute uncertainty in the answer is ±0.11. Therefore, the answer is 32.5 ± 0.1.

(b) The calculated relative uncertainty in the answer is 0.0075, so the absolute uncertainty is 0.0075 × 86.43 = 0.65. Therefore, the answer is 86.4 ± 0.6, even though we know all the other numbers to four figures; there is substantial uncertainty in the fourth digit, which leads to the uncertainty in the answer. The relative uncertainty in that answer is 0.0075, and the largest relative uncertainty in the other numbers is 0.0066, which is very similar.

3.11 Control Charts

A quality control chart is a time plot of a measured quantity that is assumed to be constant (with a Gaussian distribution) for the purpose of ascertaining that the measurement remains within a statistically acceptable range. It may be a day-to-day plot of the measured value of a standard that is run intermittently with samples. The control chart consists of a central line representing the known or assumed value of the control and either one or two pairs of limit lines, the inner and outer control limits. Usually the standard deviation of the procedure is known (a good estimate of σ), and this is used to establish the control limits.
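The construction of the limit lines described above can be sketched in a few lines of Python. This is only an illustration: the function names, the control value (5.00), the standard deviation (0.10), and the daily results are all made up for the example, and the limits are assumed to sit at 2 and 3 standard deviations, which is a common choice rather than a fixed rule.

```python
def control_limits(center, sigma, inner_k=2.0, outer_k=3.0):
    """Return the (low, high) pairs for the inner and outer control limits."""
    inner = (center - inner_k * sigma, center + inner_k * sigma)
    outer = (center - outer_k * sigma, center + outer_k * sigma)
    return inner, outer


def classify(value, inner, outer):
    """Label one control result relative to the two pairs of limit lines."""
    if not (outer[0] <= value <= outer[1]):
        return "outside outer limit"
    if not (inner[0] <= value <= inner[1]):
        return "warning"
    return "in control"


# Hypothetical control sample: assumed value 5.00, known sigma 0.10 (illustrative only).
center, sigma = 5.00, 0.10
inner, outer = control_limits(center, sigma)

# Hypothetical day-to-day results for the control.
daily = [5.03, 4.92, 5.22, 4.65]
flags = [classify(v, inner, outer) for v in daily]

for value, flag in zip(daily, flags):
    print(f"{value:.2f}  {flag}")
```

A result flagged "warning" lies between the inner and outer limits; a run of results on one side of the central line would need a separate trend check, which this sketch does not attempt.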
An example of a control chart is illustrated in Figure 3.6, representing a plot of day-to-day results of the analysis of a pooled serum calcium or a control sample that is run randomly and blindly with samples each day. A useful inner control limit is two standard deviations, since there is only 1 chance in 20 that an individual measurement will exceed this purely by chance. This might represent a warning limit. The outer limit might be 2.5 or 3σ, in which case there is only 1 chance in 100 or 1 chance in 500 that a measurement will fall outside this range in the absence of systematic error. Usually, one control is run with each batch of samples (e.g., 20 samples), so several control points may be obtained each day. The mean of these may be plotted each day. The random scatter of this would be expected to be smaller by $\sqrt{N}$, compared to individual points.

Particular attention should be paid to trends in one direction; that is, the points lie largely on one side of the central line. This would suggest that either the control is in error or there is a systematic error in the measurement. A tendency for points to lie outside the control limits would indicate the presence of one or more determinate errors in the determination, and the analyst should check for deterioration of reagents, instrument malfunction, or environmental and other effects. Trends should signal contamination of reagents, improper calibration or erroneous standards, or change in the control lot.

(Margin note: A control chart is constructed by periodically running a "known" control sample.)

Fig. 3.6. Typical quality control chart. (Panel title: Calcium quality control chart for October 2002.)

3.12 The Confidence Limit—How Sure Are You?

(Margin note: The true value falls within the confidence limit, estimated using t at the desired confidence level.)

Calculation of the standard deviation for a set of data provides an indication of the precision inherent in a particular procedure or analysis.
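But unless there is a large number of data, it does not by itself give any information about how close the experimentally determined mean $\bar{x}$ might be to the true mean value $\mu$. Statistical theory, though, allows us to estimate the range within which the true value might fall, within a given probability, defined by the experimental mean and the standard deviation. This range is called the confidence interval, and the limits of this range are called the confidence limit. The likelihood that the true value falls within the range is called the probability, or confidence level, usually expressed as a percent. The confidence limit is given by

$$\text{Confidence limit} = \bar{x} \pm \frac{ts}{\sqrt{N}} \tag{3.9}$$

where $t$ is a statistical factor that depends on the number of degrees of freedom and the confidence level desired. The number of degrees of freedom is one less than the number of measurements. Values of $t$ at different confidence levels and degrees of freedom $\nu$ are given in Table 3.1. Note that the confidence limit is simply the product of $t$ and the standard deviation of the mean ($s/\sqrt{N}$). (The confidence limit for a single observation, $x$, is given by $x \pm ts$, being larger than that of the mean by a factor $\sqrt{N}$; here $\nu$ is $N - 1$ for the number of measurements used to determine $s$.)

Table 3.1 Values of t for ν Degrees of Freedom for Various Confidence Levels (a)

                 Confidence Level
  ν       90%      95%      99%      99.5%
  1      6.314   12.706   63.657   127.32
  2      2.920    4.303    9.925    14.089
  3      2.353    3.182    5.841     7.453
  4      2.132    2.776    4.604     5.598
  5      2.015    2.571    4.032     4.773
  6      1.943    2.447    3.707     4.317
  7      1.895    2.365    3.500     4.029
  8      1.860    2.306    3.355     3.832
  9      1.833    2.262    3.250     3.690
 10      1.812    2.228    3.169     3.581
 15      1.753    2.131    2.947     3.252
 20      1.725    2.086    2.845     3.153
 25      1.708    2.060    2.787     3.078
  ∞      1.645    1.960    2.576     2.807

(a) ν = N − 1 = degrees of freedom.

Example 3.15

A soda ash sample is analyzed in the analytical chemistry laboratory by titration with standard hydrochloric acid. The analysis is performed in triplicate with the following results: 93.50, 93.58, and 93.43% Na2CO3. Within what range are you 95% confident that the true value lies?

Solution

The mean is 93.50%. The standard deviation $s$ is calculated to be 0.075% Na2CO3 (absolute; calculate it with a spreadsheet). At the 95% confidence level and two degrees of freedom, $t = 4.303$ and

$$\text{Confidence limit} = \bar{x} \pm \frac{ts}{\sqrt{N}} = 93.50 \pm \frac{4.303 \times 0.075}{\sqrt{3}} = 93.50 \pm 0.19\%$$

So you are 95% confident that, in the absence of a determinate error, the true value falls within 93.31 to 93.69%. Note that for an infinite number of measurements, we would have predicted with 95% confidence that the true value falls within two standard deviations (Figure 3.2); we see that for $\nu = \infty$, $t$ is actually 1.96 (Table 3.1), and so the confidence limit would indeed be about twice the standard deviation of the mean (which approaches σ for large N).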
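The soda ash calculation above can be checked with a short Python sketch. The function name `confidence_limit` and the small lookup dictionary are illustrative choices, not part of the text; the t values are the 95% column of Table 3.1 for 1 to 5 degrees of freedom, and the sketch assumes the number of measurements is covered by that lookup.

```python
from math import sqrt
from statistics import mean, stdev

# 95% confidence level t values from Table 3.1, keyed by degrees of freedom (N - 1).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def confidence_limit(values, t_table=T_95):
    """Return (mean, sample standard deviation, half-width t*s/sqrt(N))."""
    n = len(values)
    x_bar = mean(values)
    s = stdev(values)              # sample standard deviation (N - 1 in the denominator)
    t = t_table[n - 1]             # degrees of freedom = N - 1
    half_width = t * s / sqrt(n)   # Equation 3.9
    return x_bar, s, half_width


# Triplicate soda ash results (% Na2CO3) from the example above.
x_bar, s, half_width = confidence_limit([93.50, 93.58, 93.43])
print(f"{x_bar:.2f} ± {half_width:.2f} % Na2CO3")  # prints "93.50 ± 0.19 % Na2CO3"
```

Replacing the dictionary with a different column of Table 3.1 gives the limits at another confidence level.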
Remember from Section 3.7 and Figure 3.2 that we are 68% confident that the true value falls within ±1σ, 95% confident it will fall within ±2σ, and 99% confident it will fall within ±2.5σ. Note that it is possible to estimate a standard deviation from a stated confidence interval and, vice versa, a confidence interval from a standard deviation. If a mean value is 7.37 ± 0.06 g at the 95% confidence interval, then since this is two standard deviations for a suitably large number of measurements, the standard deviation is 0.03 g. If we know the standard deviation is 0.03 g, then this is the confidence interval at the 68% confidence level, or it is 0.06 g at the 95% confidence level. For small numbers of measurements, $t$ will be larger, which proportionately changes these numbers.

As the number of measurements increases, both $t$ and $s/\sqrt{N}$ decrease, with the result that the confidence interval is narrowed. So the more measurements you make, the more confident you will be that the true value lies within a given range or, conversely, that the range will be narrowed at a given confidence level. However, $t$ decreases exponentially with an increase in $N$, just as the standard deviation of the mean does (see Table 3.1), so a point of diminishing returns is eventually reached in which the increase in confidence is not justified by the increase in the number of analyses required.

(Margin note: Too high a confidence level will give a wide range that may encompass nonrandom numbers. Too low a confidence level will give a narrow range and exclude valid random numbers. Confidence levels of 90 to 95% are generally accepted as reasonable. Compare with Figure 3.2, where 95% of the values fall within 2σ.)

3.13 Tests of Significance—Is There a Difference?

In developing a new analytical method, it is often desirable to compare the results of that method with those of an accepted (perhaps standard) method. […]

(Margin note: The F test is used to determine if two variances are statistically different.)

The F test is designed to indicate whether there is a significant difference between two methods based on their standard deviations.
F is defined in terms of the variances of the two methods, where the variance is the square of the standard deviation:

$$F = \frac{s_1^2}{s_2^2} \tag{3.10}$$

where $s_1^2 > s_2^2$. There are two different degrees of freedom, $\nu_1$ and $\nu_2$, where degrees of freedom is defined as $N - 1$ for each case.

If the calculated F value from Equation 3.10 exceeds a tabulated F value at the selected confidence level, then there is a significant difference between the variances of the two methods. A list of F values at the 95% confidence level is given in Table 3.2.

Table 3.2 Values of F at the 95% Confidence Level

        ν1 = 2     3     4     5     6     7     8     9    10    15    20    30
ν2 = 2   19.0  19.2  19.2  19.3  19.3  19.4  19.4  19.4  19.4  19.4  19.4  19.5
     3   9.55  9.28  9.12  9.01  8.94  8.89  8.85  8.81  8.79  8.70  8.66  8.62
     4   6.94  6.59  6.39  6.26  6.16  6.09  6.04  6.00  5.96  5.86  5.80  5.75
     5   5.79  5.41  5.19  5.05  4.95  4.88  4.82  4.77  4.74  4.62  4.56  4.50
     6   5.14  4.76  4.53  4.39  4.28  4.21  4.15  4.10  4.06  3.94  3.87  3.81
     7   4.74  4.35  4.12  3.97  3.87  3.79  3.73  3.68  3.64  3.51  3.44  3.38
     8   4.46  4.07  3.84  3.69  3.58  3.50  3.44  3.39  3.35  3.22  3.15  3.08
     9   4.26  3.86  3.63  3.48  3.37  3.29  3.23  3.18  3.14  3.01  2.94  2.86
    10   4.10  3.71  3.48  3.33  3.22  3.14  3.07  3.02  2.98  2.85  2.77  2.70
    15   3.68  3.29  3.06  2.90  2.79  2.71  2.64  2.59  2.54  2.40  2.33  2.25
    20   3.49  3.10  2.87  2.71  2.60  2.51  2.45  2.39  2.35  2.20  2.12  2.04
    30   3.32  2.92  2.69  2.53  2.42  2.33  2.27  2.21  2.16  2.01  1.93  1.84

Example 3.16

You are developing a new colorimetric procedure for determining the glucose content of blood serum. You have chosen the standard Folin-Wu procedure with which to compare your results. From the following two sets of replicate analyses on the same sample, determine whether the variance of your method differs significantly from that of the standard method.

Your Method (mg/dL)        Folin-Wu Method (mg/dL)
[…]                        […]
mean 127                   mean […]

Solution

[…] The variances are arranged so that the F value is greater than 1. The tabulated F value for $\nu_1 = 6$ and $\nu_2 = 5$ is 4.95. Since the calculated value is less than this, we conclude that there is no significant difference in the precision of the two methods; that is, the standard deviations are from random error alone and don't depend on the sample.

THE STUDENT T TEST—ARE THERE DIFFERENCES IN THE METHODS?
(Margin note: The t test is used to determine if two sets of measurements are statistically different.)

[…] comparison is made between two sets of replicate measurements made by two different methods; one of them will be the test method, and the other will be an accepted method. A statistical t value is calculated and compared with a tabulated value for the given number of tests at the desired confidence level (Table 3.1). If the calculated t value exceeds the tabulated t value, then there is a significant difference between the results by the two methods at that confidence level. If it does not exceed the tabulated value, then we can predict that there is no significant difference between the methods. This in no way implies that the two results are identical.

[…] a series of replicate analyses on a single sample may be performed using two methods, or a series of analyses may be performed on a set of different samples by the two methods.

1. t Test When an Accepted Value Is Known. Note that Equation 3.9 is a representation of the true value $\mu$. We can write that

$$\mu = \bar{x} \pm \frac{ts}{\sqrt{N}} \tag{3.11}$$

It follows that

$$\pm t = (\bar{x} - \mu)\frac{\sqrt{N}}{s} \tag{3.12}$$

If a good estimate of the "true" value is available from other measurements, for example, from a National Institute of Standards and Technology (NIST) standard reference material (or the ultimate, in chemical analysis, an atomic weight), then Equation 3.12 can be used to determine whether the value obtained from a test method is statistically equal to the accepted value.

Example 3.17

You are developing a procedure for determining traces of copper in biological materials using a wet digestion followed by measurement by atomic absorption spectrophotometry. In order to test the validity of the method, you obtain an NIST orchard leaves standard reference material and analyze this material. Five replicas are sampled and analyzed, and the mean of the results is found to be 10.8 ppm with a standard deviation of ±0.7 ppm. The listed value is 11.7 ppm. Does your method give a statistically correct value at the 95% confidence level?

Solution

$$\pm t = (\bar{x} - \mu)\frac{\sqrt{N}}{s} = (10.8 - 11.7)\frac{\sqrt{5}}{0.7} = 2.9$$

There are five measurements, so there are four degrees of freedom ($N - 1$). From Table 3.1, we see that the tabulated value of t at the 95% confidence level is 2.776.
This is less than the calculated value, so there is a determinate error in the new procedure. That is, there is a 95% probability that the difference between the reference value and the measured value is not due to chance.

Note from Equation 3.12 that as the precision is improved, that is, as $s$ becomes smaller, the calculated t becomes larger. Thus, there is a greater chance that the tabulated t value will be less than this. That is, as the precision improves, it is easier to distinguish nonrandom differences. Looking again at Equation 3.12, this means that as $s$ decreases, so must the difference between the two methods ($\bar{x} - \mu$) in order for the difference to be ascribed only to random error. What this means is that comparing very large sets of samples, with a smaller $s$, will nearly always lead to a statistically significant difference, but a statistically significant result is not necessarily important, because the large number of samples simply describes the population better.

2. Comparison of the Means of Two Samples. When the t test is applied to two sets of data, $\mu$ in Equation 3.12 is replaced by the mean of the second set. The reciprocal of the standard deviation of the mean ($\sqrt{N}/s$) is replaced by that of the differences between the two, which is readily shown to be $\sqrt{N_1 N_2/(N_1 + N_2)}\,/\,s_p$:

$$\pm t = \frac{\bar{x}_1 - \bar{x}_2}{s_p}\sqrt{\frac{N_1 N_2}{N_1 + N_2}} \tag{3.13}$$

where $s_p$ is the pooled standard deviation of the individual measurements of the two sets.

The pooled standard deviation is sometimes used to obtain an improved estimate of the precision of a method, and it is used for calculating the precision of the two sets of data in the t test. That is, rather than relying on a single set of data to describe the precision of a method, it is sometimes preferable to perform several sets of analyses, for example, on different days, or on different samples with slightly different compositions. If the indeterminate (random) error is assumed to be the same for each set, then the data of the different sets can be pooled.
This provides a more reliable estimate of the precision of a method than is obtained from a single set. The pooled standard deviation $s_p$ is given by

$$s_p = \sqrt{\frac{\sum (x_{i1} - \bar{x}_1)^2 + \sum (x_{i2} - \bar{x}_2)^2 + \cdots + \sum (x_{ik} - \bar{x}_k)^2}{N - k}} \tag{3.14}$$

where $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ are the means of each of $k$ sets of analyses, and $x_{i1}, x_{i2}, \ldots, x_{ik}$ are the individual values in each set. $N$ is the total number of measurements and is equal to ($N_1 + N_2 + \cdots + N_k$). If five sets of 20 analyses each are performed, $k = 5$ and $N = 100$. (The number of samples in each set need not be equal.) $N - k$ is the degrees of freedom obtained from $(N_1 - 1) + (N_2 - 1) + \cdots + (N_k - 1)$; one degree of freedom is lost for each subset. This equation represents a combination of the equations for the standard deviations of each set of data.

(Margin note: The F test can be applied to the variances of the two methods rather than assuming they are statistically equal before applying the t test.)

In applying the t test between two methods, it is assumed that both methods have essentially the same standard deviation, that is, each represents the precision of the population (the same σ). This can be verified using the F test above.

Example 3.18

A new gravimetric method is developed for iron(III) in which the iron is precipitated in crystalline form with an organoboron "cage" compound. The accuracy of the method is checked by analyzing the iron in an ore sample and comparing with the results using the standard precipitation with ammonia and weighing of Fe2O3. The results, reported as % Fe for each analysis, were as follows:

Test Method        Reference Method
20.10%             18.89%
20.50              19.20
18.65              19.00
19.25              19.70
19.40              19.40
19.99
x̄1 = 19.65%        x̄2 = 19.24%

Is there a significant difference between the two methods?
Solution

  x_i1    |x_i1 − x̄1|   (x_i1 − x̄1)²       x_i2    |x_i2 − x̄2|   (x_i2 − x̄2)²
  20.10      0.45          0.202            18.89      0.35          0.122
  20.50      0.85          0.722            19.20      0.04          0.002
  18.65      1.00          1.000            19.00      0.24          0.058
  19.25      0.40          0.160            19.70      0.46          0.212
  19.40      0.25          0.062            19.40      0.16          0.026
  19.99      0.34          0.116
             Σ(x_i1 − x̄1)² = 2.262                     Σ(x_i2 − x̄2)² = 0.420

$$F = \frac{s_1^2}{s_2^2} = \frac{2.262/5}{0.420/4} = \frac{0.452}{0.105} = 4.31$$

This is less than the tabulated value (6.26), so the two methods have comparable standard deviations and the t test can be applied.

$$s_p = \sqrt{\frac{\sum (x_{i1} - \bar{x}_1)^2 + \sum (x_{i2} - \bar{x}_2)^2}{N_1 + N_2 - 2}} = \sqrt{\frac{2.262 + 0.420}{6 + 5 - 2}} = 0.546$$

$$\pm t = \frac{19.65 - 19.24}{0.546}\sqrt{\frac{(6)(5)}{6 + 5}} \approx 1.2$$

The tabulated t for nine degrees of freedom ($N_1 + N_2 - 2$) at the 95% confidence level is 2.262, so there is no statistical difference in the results by the two methods.

Rather than comparing two methods using one sample, two samples could be compared for comparability using a single analysis method in a manner identical to the above examples.

3. Paired t Test. In the clinical chemistry laboratory, a new method is frequently tested against an accepted method by analyzing several different samples of slightly varying composition (within physiological range). In this case, the t value is calculated in a slightly different form. The difference between each of the paired measurements on each sample is computed, and the mean difference and the standard deviation of the differences are used:

$$t = \frac{\bar{D}}{s_d}\sqrt{N} \tag{3.15}$$

$$s_d = \sqrt{\frac{\sum (D_i - \bar{D})^2}{N - 1}} \tag{3.16}$$

where $D_i$ is the individual difference between the two methods for each sample, with regard to sign, and $\bar{D}$ is the mean of all the individual differences.

Example 3.19

You are developing a new analytical method for the determination of blood urea nitrogen (BUN). You want to determine whether your method differs significantly from a standard one for analyzing a range of sample concentrations expected to be found in the routine laboratory. It has been ascertained that the two methods have comparable precisions. Following are two sets of results for a number of individual samples.

Sample   Your Method (mg/dL)   Standard Method (mg/dL)    D_i    D_i − D̄   (D_i − D̄)²
  A            10.2                  10.5                −0.3     −0.6       0.36
  B            12.7                  11.9                 0.8      0.5       0.25
  C             8.6                   8.7                −0.1     −0.4       0.16
  D            17.5                  16.9                 0.6      0.3       0.09
  E            11.2                  10.9                 0.3      0.0       0.00
  F            11.5                  11.1                 0.4      0.1       0.01
                                                    Σ = 1.7             Σ = 0.87
                                                    D̄ = 0.28

Solution

$$s_d = \sqrt{\frac{0.87}{6 - 1}} = 0.42$$

$$t = \frac{0.28}{0.42}\sqrt{6} = 1.6$$

The tabulated t value at the 95% confidence level for five degrees of freedom is 2.571.
Therefore, $t_{calc} < t_{table}$, and there is no significant difference between the two methods at this confidence level.

Usually, a test at the 95% confidence level is considered significant, while one at the 99% level is highly significant. That is, the smaller the calculated t value, the more confident you are that there is no significant difference between the two methods. If you employ too low a confidence level (e.g., 80%), you are likely to conclude erroneously that there is a significant difference between two methods (type I error). On the other hand, too high a confidence level will require too large a difference to detect (type II error). If a calculated t value is near the tabulated value at the 95% confidence level, more tests should be run to ascertain definitely whether the two methods are significantly different.

"Finagle's third law: In any collection of data, the figure most obviously correct, beyond all checking, is the mistake."

"And now the sequence of events in no particular order." —Dan Rather, television news anchor

3.14 Rejection of a Result: The Q Test

(Margin note: The Q test is used to determine if an "outlier" is due to a determinate error. If it is not, then it falls within the expected random error and should be retained.)

Frequently, when a series of replicate analyses is performed, one of the results will appear to differ markedly from the others. A decision will have to be made whether to reject the result or to retain it. Unfortunately, there are no uniform criteria that can be used to decide if a suspect result can be ascribed to accidental error rather than chance variation. It is tempting to delete extreme values from a data set because they will alter the calculated statistics in an unfavorable way, that is, increase the standard deviation and variance (measures of spread), and they may substantially alter the reported mean. The only reliable basis for rejection occurs when it can be decided that some specific error may have been made in obtaining the doubtful result.
No result should be retained in cases where a known error has occurred in its collection.

Experience and common sense may serve as just as practical a basis for judging the validity of a particular observation as a statistical test would be. Frequently, the experienced analyst will gain a good idea of the precision to be expected in a particular method and will recognize when a particular result is suspect.

Additionally, an analyst who knows the standard deviation expected of a method may reject a data point that falls outside 2s or 2.5s of the mean, because there is about 1 chance in 20 or 1 chance in 100 that this will occur.

A wide variety of statistical tests have been suggested and used to determine whether an observation should be rejected. In all of these, a range is established within which statistically significant observations should fall. The difficulty with all of them is determining what the range should be. If it is too small, then perfectly good data will be rejected; and if it is too large, then erroneous measurements will be retained too high a proportion of the time. The Q test is, among the several suggested tests, one of the most statistically correct for a fairly small number of observations and is recommended when a test is necessary. The ratio Q is calculated by arranging the data in decreasing order of numbers. The difference between the suspect number and its nearest neighbor ($a$) is divided by the range ($w$), that is, the difference between the highest number and the lowest number. Referring to the figure in the margin, $Q = a/w$. This ratio is compared with tabulated values of Q. If it is equal to or greater than the tabulated value, the suspected observation can be rejected. The tabulated values of Q at the 90, 95, and 99% confidence levels are given in Table 3.3.
If Q exceeds the tabulated value for a given number of observations and a given confidence level, the questionable measurement may be rejected with, for example, 95% confidence that some definite error is in this measurement.

Table 3.3 Rejection Quotient, Q, at Different Confidence Limits (a)

No. of Observations     Q90      Q95      Q99
        3              0.941    0.970    0.994
        4              0.765    0.829    0.926
        5              0.642    0.710    0.821
        6              0.560    0.625    0.740
        7              0.507    0.568    0.680
        8              0.468    0.526    0.634
        9              0.437    0.493    0.598
       10              0.412    0.466    0.568
       15              0.338    0.384    0.475
       20              0.300    0.342    0.425
       25              0.277    0.317    0.393
       30              0.260    0.298    0.372

(a) Adapted from D. B. Rorabacher, Anal. Chem., 63 (1991) 139.

Example 3.20

The following set of chloride analyses on separate aliquots of a pooled serum were reported: 103, 106, 107, and 114 meq/L. One value appears suspect. Determine if it can be ascribed to accidental error, at the 95% confidence level.

Solution

The suspect result is 114 meq/L. It differs from its nearest neighbor, 107 meq/L, by 7 meq/L. The range is 114 to 103, or 11 meq/L. Q is therefore 7/11 = 0.64. The tabulated value for four observations is 0.829. Since the calculated Q is less than the tabulated Q, the suspected number may be ascribed to random error and should not be rejected.

For a small number of measurements (e.g., three to five), the discrepancy of the measurement must be quite large before it can be rejected by this criterion, and it is likely that erroneous results may be retained. This would cause a significant change in the arithmetic mean, because the mean is greatly influenced by a discordant value. For this reason it has been suggested that the median rather than the mean be reported when a discordant number cannot be rejected from a small number of measurements. The median is the middle result of an odd number of results, or the average of the central pair for an even number, when they are arranged in order of magnitude. The median has the advantage of not being unduly influenced by an outlying value.
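The Q test procedure and the median described above can be sketched as follows. The function name `q_test` and the dictionary `Q_95` are illustrative; the critical values are the 95% column of Table 3.3 for 3 to 10 observations, and the sketch assumes the suspect value is whichever extreme lies farther from its nearest neighbor.

```python
from statistics import mean, median

# 95% confidence rejection quotients from Table 3.3, keyed by number of observations.
Q_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625, 7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def q_test(values, q_table=Q_95):
    """Return (suspect value, calculated Q, True if the suspect may be rejected)."""
    data = sorted(values)
    w = data[-1] - data[0]                 # range: highest minus lowest
    if w == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = data[1] - data[0]            # gap between lowest value and its neighbor
    gap_high = data[-1] - data[-2]         # gap between highest value and its neighbor
    if gap_high >= gap_low:
        suspect, a = data[-1], gap_high
    else:
        suspect, a = data[0], gap_low
    q_calc = a / w
    return suspect, q_calc, q_calc >= q_table[len(data)]


# Chloride results (meq/L) from the example above.
chloride = [103, 106, 107, 114]
suspect, q_calc, reject = q_test(chloride)
print(suspect, round(q_calc, 2), reject)   # prints "114 0.64 False"
print(median(chloride), mean(chloride))    # prints "106.5 107.5"
```

Because the suspect value is retained here, the median (106.5) is less affected by it than the mean (107.5), which is the point made in the paragraph above.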
In the above example, the median could be taken as the average of the two middle values [(106 + 107)/2 = 106.5, rounded to 106]. This compares with a mean of 107.5, rounded to 108, which is influenced more by the suspected number.

(Margin note: Consider reporting the median when an outlier cannot quite be rejected.)

The following procedure is suggested for interpretation of the data of three to five measurements if the precision is considerably poorer than expected and if one of the observations is considerably different from the others of the set.

1. […] likely to fail. (See the paragraph below.)
2. Check the data leading to the suspected number to see if a definite error can be identified.

[…] consider reporting the median rather than the mean for small sets of data.

As a last resort, run another analysis. Agreement of the new result with the apparently valid data previously collected will lend support to the opinion that the suspected result should be rejected. You should avoid, however, continually running experiments until the "right" answer is obtained.

The Q test should not be applied to three data points if two are identical. In that case, the test always indicates rejection of the third value, regardless of the magnitude of the deviation, because $a$ is equal to $w$ and $Q_{calc}$ is always equal to 1. The same obviously applies for three identical data points in four measurements, and so forth.

3.15 Statistics for Small Data Sets

(Margin note: Large population statistics do not strictly apply for small populations.)

(Margin note: The median may be a better representative of the true value than the mean, for small numbers of measurements.)

We have discussed, in previous sections, ways of estimating, for a normally distributed population, the central value (mean, $\bar{x}$), the spread of results (standard deviation, $s$), and the confidence limits (t test). These statistical values hold strictly for a large population. In analytical chemistry, we typically deal with fewer than 10 results, and for a given analysis, perhaps 2 or 3.
For such small sets of data, other estimates may be more appropriate.

The Q test in the previous section is designed for small data sets, and we mentioned there some rules for dealing with suspect results.

THE MEDIAN MAY BE BETTER THAN THE MEAN

The median $M$ may be used as an estimate of the central value. It has the advantage that it is not markedly influenced by extraneous (outlier) values, as is the mean, $\bar{x}$. The efficiency of $M$, defined as the ratio of the variances of sampling distributions of these two estimates of the "true" mean value and denoted by $E_M$, is given in Table 3.4. It varies from 1 for only two observations (where the median is necessarily identical with the mean) to 0.64 for large numbers of observations. The numerical value of the efficiency implies that the median from, for example, 100 observations, where the efficiency is essentially 0.64, conveys as much information about the central value of the population as does the mean calculated from 64 observations. The median of 10 observations is as efficient in conveying the information as is the mean from 10 × 0.71 = 7 observations. It may be desirable to use the median in order to avoid deciding whether a gross error is present, that is, using the Q test. It has been shown that for three observations from a normal population, the median is better than the mean of the best two out of three (the two closest) values.

Table 3.4 Efficiencies and Conversion Factors for 2 to 10 Observations (a)

No. of          Efficiency of        Range Deviation    Range Confidence Factor (t_r)
Observations    Median    Range      Factor, K_R        t_r(0.95)    t_r(0.99)
                 E_M       E_R
     2           1.00      1.00         0.89               6.4         31.83
     3           0.74      0.99         0.59               1.3          3.01
     4           0.84      0.98         0.49               0.72         1.32
     5           0.69      0.96         0.43               0.51         0.84
     6           0.78      0.93         0.40               0.40         0.63
     7           0.67      0.91         0.37               0.33         0.51
     8           0.74      0.89         0.35               0.29         0.43
     9           0.65      0.87         0.34               0.26         0.37
    10           0.71      0.85         0.33               0.23         0.33
     ∞           0.64      0.00         0.00               0.00         0.00

(a) Adapted from R. B. Dean and W. J. Dixon, Anal. Chem., 23 (1951) 636.

RANGE INSTEAD OF THE STANDARD DEVIATION

The range $R$ for a small set of measurements is highly efficient for describing the spread of results.
The efficiency of the range, $E_R$, shown in Table 3.4, is virtually identical to that of the standard deviation for four or fewer measurements. This high relative efficiency arises from the fact that the standard deviation is a poor estimate of the spread for a small number of observations, although it is still the best known estimate for a given set of data. To convert the range to a measure of spread that is independent of the number of observations, we must multiply it by the deviation factor, $K_R$, given in Table 3.4. This factor adjusts the range so that on average it reflects the standard deviation of the population, which we represent by $s_r$:

$$s_r = R K_R \tag{3.17}$$

In Example 3.9 the standard deviation of the four weights is 0.69 mg. The range is 1.6 mg. Multiplying by $K_R$ for four observations, $s_r$ = 1.6 mg × 0.49 = 0.78 mg. As $N$ increases, the efficiency of the range decreases relative to the standard deviation.

The median $M$ may be used in computing the standard deviation, in order to minimize the influence of extraneous values. Taking Example 3.9 again, the standard deviation calculated using the median, 29.8, in place of the mean in Equation 3.2, is 0.73 mg, instead of 0.69 mg.

(Margin note: The range is as good a measure of the spread of results as is the standard deviation for four or fewer measurements.)

CONFIDENCE LIMITS USING THE RANGE

Confidence limits could be calculated using $s_r$ obtained from the range, in place of $s$ in Equation 3.9, and a corresponding but different t table. It is more convenient, though, to calculate the limits directly from the range as

$$\text{Confidence limit} = \bar{x} \pm R t_r \tag{3.18}$$

The factor for converting $R$ to $s_r$ has been included in the quantity $t_r$, which is tabulated in Table 3.4 for 99 and 95% confidence levels. The calculated confidence limit at the 95% confidence level in Example 3.15 using Equation 3.18 is 93.50 ± 0.15(1.3) = 93.50 ± 0.20% Na2CO3.

"If a straight line fit is required, obtain only two data points." —Anonymous

Fig. 3.7. Straight-line plot.

3.16
Linear Least Squares—How to Plot the Right Straight Line

The analyst is frequently confronted with plotting data that fall on a straight line, as in an analytical calibration curve. Graphing, that is, curve fitting, is critically important in obtaining accurate analytical data. It is the calibration graph that is used to calculate the unknown concentration. Straight-line predictability and consistency will determine the accuracy of the unknown calculation. All measurements will have a degree of uncertainty, and so will the plotted straight line. Graphing is often done intuitively, that is, by simply "eyeballing" the best straight line by placing a ruler through the points, which invariably have some scatter. A better approach is to apply statistics to define the most probable straight-line fit of the data. The availability of statistical functions in spreadsheets today makes it straightforward to prepare straight-line, or even nonlinear, fits. We will first learn the computations that are involved in curve fitting and statistical evaluation.

If a straight-line relationship is assumed, then the data fit the equation

$$y = mx + b \tag{3.19}$$

where $y$ is the dependent variable, $x$ is the independent variable, $m$ is the slope of the curve, and $b$ is the intercept on the ordinate ($y$ axis). $y$ is usually the measured variable, plotted as a function of changing $x$ (see Figure 3.7). In a spectrophotometric calibration curve, $y$ would represent the measured absorbances and $x$ would be the concentrations of the standards. Our problem, then, is to establish values for $m$ and $b$.

LEAST-SQUARES PLOTS

It can be shown statistically that the best straight line through a series of experimental points is that line for which the sum of the squares of the deviations (the residuals) of the points from the line is minimum. This is known as the method of least squares.
If $x$ is the fixed variable (e.g., concentration) and $y$ is the measured variable (absorbance in a spectrophotometric measurement, the peak area in a chromatographic measurement, etc.), then the deviation of $y$ vertically from the line at a given value of $x$ ($x_i$) is of interest. If $y_l$ is the value on the line, it is equal to $mx_i + b$. The sum of the squares of the differences, $S$, is then

$$S = \sum (y_i - y_l)^2 = \sum [y_i - (mx_i + b)]^2 \tag{3.20}$$

This equation assumes no error in $x$, the independent variable.

(Margin note: The least-squares slope and intercept define the most probable straight line.)

The best straight line occurs when $S$ goes through a minimum. This is obtained by use of differential calculus by setting the derivatives of $S$ with respect to $m$ and $b$ equal to zero and solving for $m$ and $b$. The result is

$$m = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \tag{3.21}$$

$$b = \bar{y} - m\bar{x} \tag{3.22}$$

where $\bar{x}$ is the mean of all the values of $x_i$ and $\bar{y}$ is the mean of all the values of $y_i$. The use of differences in calculations is cumbersome, and Equation 3.21 can be transformed into an easier to use form, especially if a calculator is available:

$$m = \frac{\sum x_i y_i - \left[\left(\sum x_i \sum y_i\right)/n\right]}{\sum x_i^2 - \left[\left(\sum x_i\right)^2/n\right]} \tag{3.23}$$

where $n$ is the number of data points.

Example 3.21

Riboflavin (vitamin B2) is determined in a cereal sample by measuring its fluorescence intensity in 5% acetic acid solution. A calibration curve was prepared by measuring the fluorescence intensities of a series of standards of increasing concentrations. The following data were obtained. Use the method of least squares to obtain the best straight line for the calibration curve and to calculate the concentration of riboflavin in the sample solution. The sample fluorescence intensity was 15.4.

Riboflavin,        Fluorescence Intensity,
µg/mL (x_i)        Arbitrary Units (y_i)        x_i²         x_i y_i
  0.000                  0.0                   0.0000         0.00
  0.100                  5.8                   0.0100         0.58
  0.200                 12.2                   0.0400         2.44
  0.400                 22.3                   0.1600         8.92
  0.800                 43.3                   0.6400        34.64
Σx_i = 1.500         Σy_i = 83.6          Σx_i² = 0.8500   Σx_i y_i = 46.58
(Σx_i)² = 2.250      ȳ = 16.72
x̄ = 0.300            n = 5

Solution

Using Equations 3.23 and 3.22,

$$m = \frac{46.58 - [(1.500 \times 83.6)/5]}{0.8500 - 2.250/5} = 53.7_5 \text{ fluor. units/ppm}$$

$$b = 16.72 - (53.7_5 \times 0.300) = 0.6_0 \text{ fluor. units}$$

We have retained the maximum number of significant figures in computation. Since the experimental values of $y$ are obtained to only the first decimal place, we can round $m$ and $b$ to the first decimal. The equation of the straight line is (FU = fluorescence units; ppm = µg/mL)

$$y\,(\text{FU}) = 53.8\,(\text{FU/ppm})\,x\,(\text{ppm}) + 0.6\,(\text{FU})$$

The sample concentration is

$$15.4 = 53.8x + 0.6$$

$$x = 0.275\ \mu\text{g/mL}$$

To prepare an actual plot of the line, take two arbitrary values of $x$ sufficiently far apart and calculate the corresponding $y$ values (or vice versa) and use these as points to draw the line. The intercept $y = 0.6$ (at $x = 0$) could be used as one point. At 0.500 µg/mL, $y = 27.5$. A plot of the experimental data and the least-squares line drawn through them is shown in Figure 3.8. This was plotted using Excel, with the equation of the line and the square of the correlation coefficient (a measure of agreement between the two variables; ignore this for now, we will discuss it later). The program automatically gives additional figures, but note the agreement with our calculated values for the slope and intercept.

Fig. 3.8. Least-squares plot of data from Example 3.21. (Axes: fluorescence intensity versus riboflavin, ppm.)

STANDARD DEVIATIONS OF THE SLOPE AND INTERCEPT—THEY DETERMINE THE UNKNOWN UNCERTAINTY

(Margin note: The standard deviations of m and b give an equation from which the uncertainty in the unknown is calculated, using propagation of error.)

Each data point on the least-squares line exhibits a normal (Gaussian) distribution about the line on the $y$ axis. The deviation of each $y_i$ from the line is $y_i - y_l = y_i - (mx_i + b)$, as in Equation 3.20.
The standard deviation of each of these y-axis deviations is given by an equation analogous to Equation 3.2, except that there are two less degrees of freedom since two are used in defining the slope and the intercept:

$$s_y = \sqrt{\frac{\sum [y_i - (mx_i + b)]^2}{N - 2}} = \sqrt{\frac{\left[\sum y_i^2 - \left(\sum y_i\right)^2/N\right] - m^2\left[\sum x_i^2 - \left(\sum x_i\right)^2/N\right]}{N - 2}} \qquad (3.24)$$

This quantity is also called the standard deviation of regression, $s_r$. The value can be used to obtain uncertainties for the slope, m, and intercept, b, of the least-squares line since they are related to the uncertainty in each value of y. For the slope:

$$s_m = \sqrt{\frac{s_y^2}{\sum (\bar{x} - x_i)^2}} = \sqrt{\frac{s_y^2}{\sum x_i^2 - \left(\sum x_i\right)^2/N}} \qquad (3.25)$$

where $\bar{x}$ is the mean of all $x_i$ values. For the intercept:

$$s_b = s_y\sqrt{\frac{\sum x_i^2}{N\sum x_i^2 - \left(\sum x_i\right)^2}} = s_y\sqrt{\frac{1}{N - \left(\sum x_i\right)^2/\sum x_i^2}} \qquad (3.26)$$

In calculating an unknown concentration, $x_i$, from Equation 3.19, representing the least-squares line, the uncertainties in y, m, and b are all propagated in the usual manner, from which we can determine the uncertainty in the unknown.

Example 3.22

Estimate the uncertainty in the slope, intercept, and y for the least-squares plot in Example 3.21, and the uncertainty in the determined riboflavin concentration.

Solution

In order to solve for all the uncertainties, we need values for $\sum y_i^2$, $(\sum y_i)^2$, $\sum x_i^2$, $(\sum x_i)^2$, and $m^2$. From Example 3.21, $(\sum y_i)^2 = (83.6)^2 = 6988.96$; $\sum x_i^2 = 0.8500$; $(\sum x_i)^2 = 2.250$; and $m^2 = (53.75)^2 = 2889.06$. The $(y_i)^2$ values are $(0.0)^2$, $(5.8)^2$, $(12.2)^2$, $(22.3)^2$, and $(43.3)^2$ = 0.00, 33.64, 148.84, 497.29, and 1874.89, and $\sum y_i^2 = 2554.66$ (carrying extra figures). From Equation 3.24,

$$s_y = \sqrt{\frac{(2554.66 - 6988.96/5) - 2889.06\,(0.8500 - 2.250/5)}{5 - 2}} = \pm 0.64 \text{ FU}$$

From Equation 3.25,

$$s_m = \sqrt{\frac{(0.64)^2}{0.8500 - 2.250/5}} = \pm 1.0 \text{ FU/ppm}$$

From Equation 3.26,

$$s_b = 0.64\sqrt{\frac{0.8500}{5(0.8500) - 2.250}} = \pm 0.42 \text{ FU}$$

Therefore, $m = 53.8 \pm 1.0$ and $b = 0.6 \pm 0.4$.

The unknown riboflavin concentration is calculated from

$$x = \frac{(y \pm s_y) - (b \pm s_b)}{m \pm s_m} = \frac{(15.4 \pm 0.6) - (0.6 \pm 0.4)}{53.8 \pm 1.0}$$

Applying the principles of propagation of error (absolute variances in the numerator additive, relative variances in the division step additive), we calculate that $x = 0.275 \pm 0.015$ ppm.
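The uncertainty estimates of Example 3.22 can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than the text's own procedure: the variable names are hypothetical, $s_y$ is taken from the residual form of Equation 3.24, and the propagation step treats the errors in y, b, and m as independent, as the worked example does.

```python
import math

# Calibration data from Example 3.21.
x = [0.000, 0.100, 0.200, 0.400, 0.800]   # ppm
y = [0.0, 5.8, 12.2, 22.3, 43.3]          # fluorescence units (FU)

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum_xx - sum_x ** 2 / n             # sum of (x_i - mean x)^2

m = (sum_xy - sum_x * sum_y / n) / sxx    # Eq. 3.23
b = sum_y / n - m * sum_x / n             # Eq. 3.22

# Eq. 3.24 (residual form): standard deviation of regression, N - 2 d.f.
residual_ss = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
s_y = math.sqrt(residual_ss / (n - 2))

s_m = math.sqrt(s_y ** 2 / sxx)                              # Eq. 3.25
s_b = s_y * math.sqrt(sum_xx / (n * sum_xx - sum_x ** 2))    # Eq. 3.26

# Propagation of error for x = (y - b) / m, assuming independent errors:
# absolute variances add in the numerator, relative variances in the division.
y_unknown = 15.4
numerator = y_unknown - b
s_numerator = math.sqrt(s_y ** 2 + s_b ** 2)
x_unknown = numerator / m
rel_s_x = math.sqrt((s_numerator / numerator) ** 2 + (s_m / m) ** 2)
s_x = x_unknown * rel_s_x

print(f"s_y = {s_y:.2f} FU, s_m = {s_m:.1f} FU/ppm, s_b = {s_b:.2f} FU")
print(f"x = {x_unknown:.3f} +/- {s_x:.3f} ppm")
```

Small differences from the hand calculation can arise because the script carries unrounded values of m and b throughout.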
See Chapter 16 for the spreadsheet calculation of the standard deviation of regression and the standard deviation of an unknown for this example.

3.17 Correlation Coefficient and Coefficient of Determination

The correlation coefficient is used as a measure of the correlation between two variables. When variables x and y are correlated rather than being functionally related (i.e., are not directly dependent upon one another), we do not speak of the "best" y value corresponding to a given x value, but only of the most "probable" value. The closer the observed values are to the most probable values, the more definite is the relationship between x and y.

A correlation coefficient near 1 means there is a direct relationship between two variables, e.g., absorbance and concentration. A correlation coefficient greater than 0.99 indicates excellent linearity. An r > 0.999 can sometimes be obtained with care.

The correlation coefficient gives the dependent and independent variables equal weight, which is usually not true in scientific measurements. The r value tends to give more confidence in the goodness of fit than warranted. The fit must be quite poor before r becomes smaller than about 0.98 and is really very poor when less than 0.9.

A more conservative measure of closeness of fit is the square of the correlation coefficient, $r^2$, and this is what most statistical programs calculate (including Excel; see Figure 3.8). An r value of 0.90 corresponds to an $r^2$ value of only 0.81, while an r of 0.95 is equivalent to an $r^2$ of 0.90. The goodness of fit is judged by the number of 9's. So three 9's (0.999) or better represents an excellent fit. We will use $r^2$ as a measure of fit. This is also called the coefficient of determination.

It should be mentioned that it is possible to have a high degree of correlation between two methods ($r^2$ near unity) but to have a statistically significant difference between the results of each according to the t test. This would occur, for example, if there were a constant determinate error in one method. This would make the differences significant (not due to chance), but there would be a direct correlation between the results [$r^2$
would be near unity, but the slope (m) may not be near unity or the intercept (b) not near zero]. In principle, an empirical correction factor (a constant) could be applied to make the results by each method the same over the concentration range analyzed.

The coefficient of determination ($r^2$) is a better measure of fit.

3.18 Using Spreadsheets for Plotting Calibration Curves

The availability of spreadsheets makes it unnecessary to plot data on graph paper and do hand calculations for the least-squares regression analysis and statistics. We will use the data in Example 3.21 to prepare the plot shown in Figure 3.8, using Excel.

Open a new spreadsheet and enter:

Cell A1: Riboflavin, ppm (adjust the column width to incorporate the text)
Cell B1: Fluorescence intensity
Cell A3: 0.000
Cell A4: 0.100
Cell A5: 0.200
Cell A6: 0.400
Cell A7: 0.800
Cell B3: 0.0
Cell B4: 5.8
Cell B5: 12.2
Cell B6: 22.3
Cell B7: 43.3

Format the cell numbers to have three decimal places for column A and one for column B.

Click on the Chart Wizard icon on the toolbar (the one with the vertical bars). Step 1 (Chart Type) of the Chart Wizard will appear. Follow this sequence:

Select XY (Scatter) and Scatter (no line) for Chart subtype
Next
Data Range: enter A3:B7 (click on Series, and note the X values and Y values addresses)
Check: Columns (after going back to Data Range)
Next
Chart title: enter Calibration Curve
Value (X) axis: enter Riboflavin
Value (Y) axis: enter Fluorescence intensity
Gridlines: uncheck Major gridlines
Legend: Delete Show legend
Data labels: None (Try Show Value, and note the data entered on each point on the line)
Next
Click on As New sheet: Chart 1
Finish

The calibration graph is plotted on a new Excel sheet.

Now we wish to enter the least-squares equation line and the $r^2$ value. Click on the figure, and Chart will appear in the toolbar.
Click on it and continue:

Add Trendline
Linear
Options
Display equation on chart
Display R-squared value on chart
OK

Now look at the chart. Click on it to remove the end markers. You can move the equation on the line toward the left and enlarge it. Click on the equation so it is highlighted with small squares. Click on a corner and drag it to the left, down the line. You can increase the font size. Click on Format: Selected Data Labels: Font. Select size 14, then OK. Drag the equation close to the line. You can also increase the font size of the axis labels by highlighting them and doing the same, as well as the title.

Let's get rid of the gray background. Click on the gray area, then Format: Selected Plot Area. Click on the white color square, then OK. The chart you have now prepared should look similar to Figure 3.8.

When you prepare the graph, you can initially highlight the cells (A3:B7) that you want to graph, and the addresses will automatically be placed in the Data Range. Instead of placing the graph on a new sheet, you could have selected As object in Sheet 1. This would have placed it in the spreadsheet in which you entered the data. You can adjust its position and size by clicking on it and dragging the corner. Figure 3.9 shows the graph inserted into the spreadsheet. Try doing this. Once you have the graph inserted in the spreadsheet, this becomes a generic plot for new data; that is, if you change the data in columns A and B, a new line is automatically charted. Try this. (You should save your original spreadsheet graph and rename the new one.) You may print only the graph by first clicking on it to highlight it.

3.19 Slope, Intercept, and Coefficient of Determination

We can use the Excel statistical functions to calculate the slope and intercept for a series of data, and the $R^2$ value, without a plot. Open a new spreadsheet and enter the calibration data from Example 3.21, as in Figure 3.9, in cells A3:B7.
In cell A9 type Intercept, in cell A10, Slope, and in cell A11, R². Highlight cell B9, click on $f_x$: Statistical, scroll down to INTERCEPT under Function name, and click OK. For Known_x's, enter the array A3:A7, and for Known_y's, enter B3:B7. Click OK, and the intercept is displayed in cell B9. Now repeat, highlighting cell B10, scrolling to SLOPE, and entering the same arrays. The slope appears in cell B10. Repeat again, highlighting cell B11, and scrolling to RSQ. R² appears in cell B11. Compare with the values in Figure 3.9.

Fig. 3.9. Calibration graph inserted in spreadsheet (Sheet 1).

3.20 LINEST for Additional Statistics

The LINEST program of Excel allows us to quickly obtain several statistical functions for a set of data, in particular, the slope and its standard deviation, the intercept and its standard deviation, the coefficient of determination, and the standard error of the estimate, besides others we will not discuss now. LINEST will automatically calculate a total of 10 functions in 2 columns of the spreadsheet.

Open a new spreadsheet, and enter the calibration data from Example 3.21 as you did above, in cells A3:B7. Refer to Figure 3.10. The statistical data will be placed in 10 cells, so let's label them now. We will place them in cells B9:C13. Type labels as follows:

Cell A9: slope
Cell A10: std. dev.
Cell A11: R²
Cell A12: F
Cell A13: sum sq. regr.
Cell D9: intercept
Cell D10: std. dev.
Cell D11: std. error of estim.
Cell D12: d.f.
Cell D13: sum sq. resid.

Highlight cells B9:C13, and click on $f_x$. From the Statistical function, scroll down to LINEST and click OK.
For Known_y's, enter the array B3:B7, and for Known_x's enter A3:A7. Then in each of the boxes labeled Const and Stats, type "true". Now we have to use the keyboard to execute the calculations. Depress Shift, Control, and Enter, and release. The statistical data are entered into the highlighted cells. This keystroke combination must be used whenever performing a function on an array of cells like here. The slope is in cell B9 and its standard deviation in cell B10. The intercept is in cell C9 and its standard deviation in cell C10. The coefficient of determination is in cell B11. Compare the standard deviations with those calculated in Example 3.22, and the slope, intercept, and R² with Example 3.21 or Figure 3.8.

Cell C11 contains the standard error of the estimate (or standard deviation of the regression) and is a measure of the error in estimating values of y. The smaller it is, the closer the numbers are to the line. The other cells contain data we will not consider here: cell B12 is the F value, cell C12 the degrees of freedom (used for F), cell B13 the sum of squares of the regression, and cell C13 the sum of squares of the residuals.

How many significant figures should we keep for the least-squares line? The standard deviations give us the answer. The slope has a standard deviation of 1.0, and so we write the slope as 53.8 ± 1.0 at best. The intercept standard deviation is ±0.42, so for the intercept we write 0.6 ± 0.4. See also Example 3.22.

3.21 Statistics Software Packages

Excel offers a number of statistical functions, listed under the Tools menu. Go to Add-Ins, and check Analysis ToolPak. Click OK and return to the spreadsheet. Now when you go to the Tools menu, you will see Data Analysis. Go to that, and you will see 19 statistical programs listed. As you experiment with these, you will find some very useful. One Add-In that is very useful is Solver, for solving complicated formulas. Its use is described in Chapter 6. See also the text website www.wiley.com/college/christian for a list of some commercial software packages for performing basic as well as more advanced statistical calculations.

3.22 Detection Limits: There Is No Such Thing as Zero

The previous discussions have dealt with statistical methods to estimate the reliability of analyses at specific confidence levels, these being ultimately determined by the precision of the method. All instrumental methods have a degree of noise associated with the measurement that limits the amount of analyte that can be detected. The noise is reflected in the precision of the blank or background signal, and noise may be apparent even when there is no significant blank signal. This may be due to fluctuation in the dark current of a photomultiplier tube, flame flicker in an atomic absorption instrument, and other factors.

Fig. 3.10. Using LINEST for statistics.

Fig. 3.11. Peak-to-peak noise level as a basis for detection limit.

The concentration that gives a signal equal to three times the standard deviation of the background is generally taken as the detection limit.

The limit of detection is the lowest concentration level that can be determined to be statistically different from an analyte blank. There are numerous ways that detection limits have been defined. For example, the concentration that gives twice the peak-to-peak noise of a series of background signal measurements (or of a continuously recorded background signal) may be taken as the detection limit (see Figure 3.11). A generally accepted detection limit is the concentration that gives a signal three times the standard deviation of the background signal.

Example 3.23

A series of sequential baseline absorbance measurements are made in a spectrophotometric method for determining the purity of aspirin in tablets, using a blank solution. The absorbance readings are 0.002, 0.000, 0.008, 0.006, and 0.003. A
standard 1 ppm aspirin solution gives an absorbance reading of 0.051. What is the detection limit?

Solution

The standard deviation of the blank readings is 0.0032 absorbance units, and the mean of the blank readings is 0.004 absorbance units. The detection limit is that concentration of analyte that gives a reading of 3 × 0.0032 = 0.0096 absorbance units above the blank signal. The net reading for the standard is 0.051 − 0.004 = 0.047. The detection limit would correspond to 1 ppm × (0.0096/0.047) = 0.2 ppm and would give a total absorbance reading of 0.0096 + 0.004 = 0.014.

The precision at the detection limit is by definition about 33%. For quantitative measurements, concentrations should be at least 10 times the detection limit (2 ppm in the above example).

There have been various attempts to place the concept of detection limit on a more firm statistical ground. The International Conference on Harmonization (ICH; see Chapter 4) of Technical Requirements for Registration of Pharmaceuticals for Human Use has proposed guidelines for analytical method validation (Ref. 18). The ICH Q2B guideline on validation methodology suggests calculation based on the standard deviation, s, of the response and the slope or sensitivity, S, of the calibration curve at levels approaching the limit. For the limit of detection (LOD),

$$\text{LOD} = 3.3\,(s/S) \qquad (3.29)$$

And for the limit of quantitation (LOQ),

$$\text{LOQ} = 10\,(s/S) \qquad (3.30)$$

The standard deviation of the response can be determined based on the standard deviation of either the blank, the residual standard deviation of the least-squares regression line, or the standard deviation of the y intercept of the regression line. The Excel statistical function can be used to obtain the last two.

The International Union of Pure and Applied Chemistry (IUPAC) uses a value of 3 in Equation 3.29 (for blank measurements), derived from a confidence level of 95% for a reasonable number of measurements.
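The detection limit calculation for the aspirin blank readings can be sketched in a few lines. This is an illustrative sketch only: the variable names are hypothetical, and using the single 1 ppm standard to estimate the sensitivity S (net absorbance per ppm) is an assumption made here so that Equations 3.29 and 3.30 can be evaluated with the same data. The unrounded blank mean (0.0038) is carried, so the last digit may differ slightly from the hand calculation.

```python
import statistics

# Blank absorbance readings from the aspirin example.
blank = [0.002, 0.000, 0.008, 0.006, 0.003]
s_blank = statistics.stdev(blank)     # sample standard deviation (n - 1)
mean_blank = statistics.mean(blank)

# Single standard: 1 ppm aspirin gives an absorbance of 0.051.
std_conc = 1.0                        # ppm
std_abs = 0.051
net_signal = std_abs - mean_blank     # blank-corrected reading
sensitivity = net_signal / std_conc   # assumed slope S, absorbance per ppm

# Three-times-the-blank-standard-deviation criterion used in the example.
lod_3s = 3 * s_blank / sensitivity

# ICH-style estimates, Equations 3.29 and 3.30, with s taken as the blank s.
lod_ich = 3.3 * s_blank / sensitivity
loq_ich = 10 * s_blank / sensitivity

print(f"s(blank) = {s_blank:.4f}, mean(blank) = {mean_blank:.4f}")
print(f"LOD (3s) = {lod_3s:.2f} ppm")
print(f"LOD (ICH) = {lod_ich:.2f} ppm, LOQ (ICH) = {loq_ich:.2f} ppm")
```

In practice the slope of a full calibration curve near the limit would normally be used for S rather than a single standard.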
The confidence level for such a multiplier, of course, varies with the number of measurements, and 7 to 10 measurements should be taken. The bottom line is that one should regard a detection limit as an approximate guide to performance and not make efforts to determine it too precisely.

3.23 Statistics of Sampling: How Many Samples, How Large?

The acquiring of a valid analytical sample is perhaps the most critical part of any analysis. The physical sampling of different types of materials (solids, liquids, gases) is discussed in Chapter 2. We describe here some of the statistical considerations in sampling.

THE PRECISION OF A RESULT: SAMPLING IS THE KEY

More often than not, the accuracy and precision of an analysis is limited by the sampling rather than the measurement step. The overall variance of an analysis is the sum of the sampling variance and the variance of the remaining analytical operations, that is,

$$s_o^2 = s_s^2 + s_a^2 \qquad (3.31)$$

If the variance due to sampling is known (by having performed multiple samplings of the material of interest and analyzing it using a precise measurement technique), then there is little to be gained by reduction of $s_a$ to less than $\tfrac{1}{3}s_s$. For example, if the absolute standard deviation for sampling is 3.0% and that of the analysis is 1.0%, then $s_o^2 = (1.0)^2 + (3.0)^2 = 10.0$, or $s_o = 3.2\%$. Here, 94% of the imprecision is due to sampling and only 6% is due to measurement ($s_o$ increased from 3.0 to 3.2%, so 0.2% is due to the measurement). If the sampling imprecision is relatively large, it is better to use a rapid, lower precision method and analyze more samples.

Little is gained by improving the analytical variance to less than one-third the sampling variance. It is better to analyze more samples using a faster, less precise method.

We are really interested in the value and variance of the true value.
The total variance is $s_{total}^2 = s_g^2 + s_s^2 + s_a^2$, where $s_g^2$ describes the "true" variability of the analyte in the system, the value of which is the goal of the analysis. For reliable interpretation of the chemical analysis, the combined sampling and analytical variance should not exceed 20% of the total variance. (See M. H. Ramsey, "Appropriate Precision: Matching Analytical Precision Specifications to the Particular Application," Anal. Proc., 30 (1993) 110.)

THE "TRUE VALUE"

The range in which the true value falls for the analyte content in a bulk material can be estimated from a t test at a given confidence level (Equation 3.11). Here, $\bar{x}$ is the average of the analytical results for the particular material analyzed, and s is the standard deviation that is obtained previously from analysis of similar material samples or from the present analysis if there are sufficient samples.

MINIMUM SAMPLE SIZE

Statistical guidelines have been developed for the proper sampling of heterogeneous materials, based on the sampling variance. The minimum size of individual increments for a well-mixed population of different kinds of particles can be estimated from Ingamells' sampling constant, $K_s$:

$$wR^2 = K_s \qquad (3.32)$$

where w is the weight of sample analyzed and R is the percent relative standard deviation of the sample composition. $K_s$ represents the weight of sample for 1% sampling uncertainty at a 68% confidence level and is obtained by determining the standard deviation from the measurement of a series of samples of weight w. This equation, in effect, says that the sampling variance is inversely proportional to the sample weight.

Example 3.24

Ingamells' sampling constant for the analysis of the nitrogen content of wheat samples is 0.50 g. What weight sample should be taken to obtain a sampling precision of 0.2% rsd in the analysis?

Solution

$$w(0.2)^2 = 0.50 \text{ g}$$

$$w = 12.5 \text{ g}$$

Note that the entire sample is not likely to be analyzed. The 12.5-g gross sample will be finely ground, and a few hundred milligrams of the homogeneous material analyzed.
If the sample were not made homogeneous, then the bulk of it would have to be analyzed.

MINIMUM NUMBER OF SAMPLES

The number of individual sample increments needed to achieve a given level of confidence in the analytical results is estimated by

$$n = \frac{t^2 s_s^2}{R^2 \bar{x}^2} \qquad (3.33)$$

where t is the Student t value for the confidence level desired, $s_s^2$ is the sampling variance, R is the acceptable relative standard deviation of the average of the analytical results, $\bar{x}$; $s_s$ is the absolute standard deviation, in the same units as $\bar{x}$, and so n is unitless. Values of $s_s$ and $\bar{x}$ are obtained from preliminary measurements or prior knowledge. Since $s_s/\bar{x}$ is equal to the relative standard deviation of sampling, $s_r$, we can write that

$$n = \frac{t^2 s_r^2}{R^2} \qquad (3.34)$$

where $s_r$ and R are expressed in the same way (both as fractions or both as percent). Since n is initially unknown, the t value for the given confidence level is initially estimated and an iterative procedure is used to calculate n.

Example 3.25

The iron content in a blended lot of bulk ore material is about 5% (wt/wt), and the relative standard deviation of sampling, $s_r$, is 0.021 (2.1% rsd). How many samples should be taken in order to obtain a relative standard deviation, R, of 0.016 (1.6% rsd) in the results at the 95% confidence level [i.e., the standard deviation, $s_x$, for the 5% iron content is 0.08% (wt/wt)]?

Solution

We can use either Equation 3.33 or 3.34. We will use the latter. Set t = 1.96 (for n = ∞, Table 3.1) at the 95% confidence level. Calculate a preliminary value of n. Then use this n to select a closer t value, and recalculate n; continue iteration to constant n.

$$n = \frac{(1.96)^2(0.021)^2}{(0.016)^2} = 6.6$$

For n = 7, t = 2.365:

$$n = \frac{(2.365)^2(0.021)^2}{(0.016)^2} = 9.6$$

For n = 10, t = 2.262:

$$n = \frac{(2.262)^2(0.021)^2}{(0.016)^2} = 8.8 \approx 9$$

See if you get the same result using Equation 3.33.

Equation 3.33 holds for a Gaussian distribution of analyte concentration within the bulk material, that is, it will be centered around $\bar{x}$ with 68% of the values falling within one standard deviation, or 95% within two standard deviations.
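Returning to the iron ore example, the first two passes of the iteration for n can be checked with a small helper. This is a minimal sketch: the function name `samples_needed` is hypothetical, it evaluates Equation 3.34 only, and the t values are those quoted in the worked example (1.96 for n = ∞ and 2.365 for the second pass) rather than values looked up by the code.

```python
def samples_needed(t, s_r, R):
    """Equation 3.34: n = t^2 * s_r^2 / R^2.

    t   : Student t value for the chosen confidence level
    s_r : relative standard deviation of sampling (fraction)
    R   : acceptable relative standard deviation of the mean (fraction)
    """
    return (t * s_r / R) ** 2


s_r = 0.021   # 2.1% rsd of sampling
R = 0.016     # 1.6% rsd acceptable in the result

n_first = samples_needed(1.96, s_r, R)     # preliminary estimate, t for n = infinity
n_second = samples_needed(2.365, s_r, R)   # recalculated with the closer t value

print(f"first pass:  n = {n_first:.1f}")
print(f"second pass: n = {n_second:.1f}")
```

Each pass feeds a new n back into the t table, and the process stops when n no longer changes after rounding up to a whole number of samples.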
For such a Gaussian distribution, the variance of the population, $\sigma^2$, is small compared to the true value. If the concentration follows a Poisson distribution, that is, follows a random distribution in the bulk material such that the true or mean value $\bar{x}$ approximates the variance, $s_s^2$, of the population, then Equation 3.33 is somewhat simplified:

$$n = \frac{t^2}{R^2 \bar{x}} \qquad (3.35)$$

Note that since $s_s^2$ is equal to $\bar{x}$, the right-hand part of the expression becomes equal to $1/\bar{x}$, but the units do not cancel. In this case, when the concentration distribution is broad rather than narrow, many more samples are required to get a representative result from the analysis.

If the analyte occurs in clumps or patches, the sampling strategy becomes more complicated. The patches can be considered as separate strata and sampled separately. If bulk materials are segregated or stratified, and the average composition is desired, then the number of samples from each stratum should be in proportion to the size of the stratum.

Learning Objectives

WHAT ARE SOME OF THE KEY THINGS WE LEARNED FROM THIS CHAPTER?

• Accuracy and precision
• Types of errors in measurements
• Significant figures in measurements and calculations
• Standard deviation
• How to use spreadsheets
• Propagation of errors
• Control charts
• Statistics: confidence limits, t tests, F tests
• Rejection of a result
• Least-squares plots and coefficient of determination
• Using spreadsheets for plotting calibration curves
• Detection limits
• Statistics of sampling

Questions

1. Distinguish between accuracy and precision.
2. What is a determinate error? An indeterminate error?
3. The following is a list of common errors encountered in research laboratories. Categorize each as a determinate or an indeterminate error, and further categorize determinate errors as instrumental, operative, or methodic: (a) An unknown being weighed is hygroscopic. (b) One component of a mixture being
analyzed quantitatively by gas chromatography reacts with the column packing. (c) A radioactive sample being counted repeatedly without any change in conditions yields a slightly different count at each trial. (d) The tip of the pipet used in the analysis is broken. (e) In measuring the same peak heights of a chromatogram, two technicians each report different heights.

Problems

For the statistical problems, do the calculations manually first and then use the Excel statistical functions and see if you get the same answers. See the CD, Problems 14-18, 20, 21, 25-20, and 37-40.

SIGNIFICANT FIGURES

4. How many significant figures does each of the following numbers have? (a) 200.06, (b) 6.030 × 10°, and (c) 7.80 × 108.
5. How many significant figures does each of the following numbers have? (a) 0.02670, (b) 328.0, (c) 7000.0, and (d) 0.00200.
6. Calculate the formula weight of LiNO3 to the correct number of significant figures.
7. Calculate the formula weight of PdCl2 to the correct number of significant figures.
8. Give the answer to the following problem to the maximum number of significant figures: 50.00 × 27.8 × 0.1167.
9. Give the answer of the following to the maximum number of significant figures: (2.776 × 0.0050) − (6.3 × 107) + (0.036 × 0.0271).
10. An analyst wishes to analyze spectrophotometrically the copper content in a bronze sample. If the sample weighs about 5 g and if the absorbance (A) is to be read to the nearest 0.001 absorbance unit, how accurately should the sample be weighed? Assume the volume of the measured solution will be adjusted to obtain minimum error in the absorbance, that is, so that 0.1
