Statistics for Social Workers
J. Timothy Stocks
Statistics refers to a branch of mathematics dealing with the direct description of sample or population characteristics and the analysis of population characteristics by inference from samples. It covers a wide range of content, including the collection, organization, and interpretation of data. It is divided into two broad categories: descriptive statistics and inferential statistics.
Descriptive statistics involves the computation of statistics or parameters to describe a sample or a population. All the data are available and used in computation of these aggregate characteristics. This may involve reports of central tendency or variability of single variables (univariate statistics). It also may involve enumeration of the relationships between or among two or more variables (bivariate or multivariate statistics). Descriptive statistics are used to provide information about a large mass of data in a form that may be easily understood. The defining characteristic of descriptive statistics is that the product is a report, not an inference.
Inferential statistics involves the construction of a probable description of the characteristics of a population based on sample data. We compute statistics from a partial set of the population data (a sample) to estimate the population parameters. These [...] to provide evidence for the existence [...]

Part I • Quantitative Approaches: Foundations of Data Collection
Chapter 6 • Statistics for Social Workers

Descriptive Statistics

Measures of Central Tendency

Measures of central tendency are individual numbers that typify the total set of scores. The three most frequently used measures of central tendency are the arithmetic mean, the mode, and the median.

Arithmetic Mean. The arithmetic mean usually is simply called the mean. It also is called the average. It is computed by adding up all of a set of scores and dividing by the number of scores in the set. The algebraic representation of this is

$$\mu = \frac{\sum X}{n},$$

where $\mu$ represents the population mean, $X$ represents an individual score, and $n$ is the number of scores being added.

The formula for the sample mean is the same except that the mean is represented by the variable letter with a bar above it:

$$\bar{X} = \frac{\sum X}{n}.$$

Following are the numbers of class periods skipped by 20 seventh-graders during 1 week: {1, 6, 2, 6, 15, 20, 3, 20, 17, 11, 15, 18, 8, 3, 17, 16, 14, 17, 0, 10}. We compute the mean by adding up the class periods missed and dividing by 20:

$$\mu = \frac{\sum X}{n} = \frac{219}{20} = 10.95.$$

Mode. The mode is the most frequently appearing score. It really is not so much a measure of centrality as it is a measure of typicalness. It is found by organizing scores into a frequency distribution and determining which score has the greatest frequency. Table 6.1 displays the truancy scores arranged in a frequency distribution.

TABLE 6.1 Truancy Scores

Score   Frequency
20      2
19      0
18      1
17      3
16      1
15      2
14      1
13      0
12      0
11      1
10      1
9       0
8       1
7       0
6       2
5       0
4       0
3       2
2       1
1       1
0       1

Because 17 is the most frequently appearing number, the mode (or modal number) of class periods skipped is 17. Unlike the mean or median, a distribution of scores can have more than one mode.

Median. If we take all the scores in a set of scores, place them in order from least to greatest, and count in to the middle, then the score in the middle is the median. This is easy enough if there is an odd number of scores. However, if there is an even number of scores, then there is no single score in the middle. In this case, the two middle scores are selected, and their average is the median.

There are 20 scores in the previous example. The median would be the average of the 10th and 11th scores. We use the frequency table to find these scores, which are 11 and 14. Thus, the median is 12.5.

Measures of Variability

Whereas measures of central tendency are used to estimate a typical score in a distribution, measures of variability may be thought of as a way in which to measure departure from typicalness. They provide information on how "spread out" scores in a distribution are.

Range. The range is the distance from the minimum (lowest) score to the maximum (highest) score. It is obtained by subtracting the minimum score from the maximum score. Let us compute the range for the following data set:

{2, 6, 10, 14, 18, 22}.

The minimum is 2, and the maximum is 22:

$$\text{Range} = 22 - 2 = 20.$$

Sum of Squares. The sum of squares is a measure of the total amount of variability in a set of scores. Its name tells how to compute it. Sum of squares is short for sum of squared deviation scores. It is represented by the symbol SS. The formulas for sample and population sums of squares are the same except for sample and population mean symbols:

$$SS = \sum (X - \mu)^2 \quad \text{(population)}, \qquad SS = \sum (X - \bar{X})^2 \quad \text{(sample)}.$$

Using the data set for the range, the sum of squares would be computed as in Table 6.2.

TABLE 6.2 Computing the Sum of Squares

$X$     $X - \mu$    $(X - \mu)^2$
2       -10          100
6       -6           36
10      -2           4
14      +2           4
18      +6           36
22      +10          100

NOTE: $\sum X = 72$; $n = 6$; $\mu = 12$; $\sum (X - \mu)^2 = 280$.

Variance. Another name for variance is mean square. This is short for mean of squared deviation scores. This is obtained by dividing the sum of squares by the number of scores ($n$). It is a measure of the average amount of variability associated with each score in a set of scores. The population variance formula is

$$\sigma^2 = \frac{SS}{n},$$

where $\sigma^2$ is the symbol for population variance, $SS$ is the symbol for sum of squares, and $n$ stands for the number of scores in the population.

The variance for the example we used to compute sum of squares would be

$$\sigma^2 = \frac{280}{6} = 46.67.$$

The sample variance is not an unbiased estimator of the population variance. If we compute the variances for samples drawn from a population using the $SS/n$ formula, then the sample variances will average out smaller than the population variance. For this reason, the sample variance is computed differently from the population variance:

$$s^2 = \frac{SS}{n - 1}.$$

The $n - 1$ is a correction factor for this tendency to underestimate. It is called degrees of freedom. If our example were a sample, then the variance would be

$$s^2 = \frac{280}{6 - 1} = \frac{280}{5} = 56.$$

Standard Deviation. Although the variance is a measure of average variability associated with each score, it is on a different scale from the scores themselves: it is the average squared deviation from the mean. To get a measure of variability on the same scale as the original scores, we take the square root of the variance. The result is the standard deviation.

Using the same set of numbers as before, the population standard deviation would be

$$\sigma = \sqrt{46.67} = 6.83,$$

and the sample standard deviation would be

$$s = \sqrt{56} = 7.48.$$

For a normally distributed set of scores, approximately 68% of all scores will be within 1 standard deviation of the mean.

Measures of Relationship

Table 6.3 shows the relationship between number of stressors experienced [...]

TABLE 6.3 Frequency of Stressors and Use of Corporal Punishment (columns: Stressors, Punishment)

Figure 6.1 Frequency of Stressors and Use of Corporal Punishment (punishment frequency plotted against number of stressors, with the line $Y_{\text{pred}} = -3.555 + 1.279X$)

One can use regression procedures to derive the line that best fits the data. This line is referred to as a regression line (or line of best fit or prediction line). Such a line has been calculated for the example plot. It has a Y intercept of -3.555 and a slope of +1.279. This gives us the prediction equation of

$$Y_{\text{pred}} = -3.555 + 1.279X,$$

where $Y$ is frequency of corporal punishment and $X$ is number of stressors.

Slope is the change in $Y$ for a unit increase in $X$. So, the slope of +1.279 means that an increase in stressors ($X$) of 1 will be accompanied by an increase in predicted frequency of corporal punishment ($Y$) of +1.279 incidents per week. If the slope were a negative number, then an increase in $X$ would be accompanied by a predicted decrease in $Y$.

The equation does not give the actual value of $Y$ (called the obtained or observed score); rather, it gives a prediction of the value of $Y$ for a certain value of $X$. For example, if $X$ were 3, then we would predict that $Y$ would be

$$-3.555 + 1.279(3) = -3.555 + 3.837 = 0.282.$$
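The prediction for a stressor count of 3 can be reproduced with a short Python sketch. The intercept and slope are the values reported for the example regression line. The function name `predict_punishment` and the constant names are hypothetical labels chosen for illustration and do not come from the text.

```python
# Prediction equation reported in the text: Y_pred = -3.555 + 1.279 * X
INTERCEPT = -3.555  # Y intercept of the regression line
SLOPE = 1.279       # change in predicted Y for a unit increase in X


def predict_punishment(stressors: float) -> float:
    """Predicted weekly frequency of corporal punishment (hypothetical helper)."""
    return INTERCEPT + SLOPE * stressors


# Prediction for X = 3, rounded to three decimals as in the text.
print(round(predict_punishment(3), 3))  # 0.282

# Raising X by 1 raises the prediction by the slope.
print(round(predict_punishment(4) - predict_punishment(3), 3))  # 1.279
```

The value returned is a predicted score, not an observed one, so it need not match any row of the observed data.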
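Looking back at the truancy scores, the three measures of central tendency can be checked with Python's standard `statistics` module. This is only a sketch, and the variable names are illustrative. `multimode` is used instead of `mode` because a distribution can have more than one mode. Sorting the 20 listed scores puts 11 and 14 in the 10th and 11th positions, so the script reports a median of 12.5.

```python
from collections import Counter
from statistics import mean, median, multimode

# Class periods skipped by the 20 seventh-graders (data from the text).
truancy = [1, 6, 2, 6, 15, 20, 3, 20, 17, 11,
           15, 18, 8, 3, 17, 16, 14, 17, 0, 10]

truancy_mean = mean(truancy)        # sum of scores divided by number of scores
truancy_modes = multimode(truancy)  # a list, since several scores can tie
truancy_median = median(truancy)    # averages the two middle scores when n is even
frequency_table = Counter(truancy)  # score -> frequency; unlisted scores count as 0

print(sum(truancy), len(truancy), truancy_mean)  # 219 20 10.95
print(truancy_modes)                             # [17]
print(sorted(truancy)[9:11], truancy_median)     # [11, 14] 12.5
```

`frequency_table[17]` returns 3, the highest frequency in the distribution.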
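The variability measures for the data set {2, 6, 10, 14, 18, 22} can be verified the same way. The sketch below follows the formulas step by step using only the standard library. The variable names are assumptions made for illustration.

```python
from math import sqrt

scores = [2, 6, 10, 14, 18, 22]  # data set used for the range example

n = len(scores)
mu = sum(scores) / n                                 # 72 / 6 = 12.0
score_range = max(scores) - min(scores)              # maximum minus minimum
sum_of_squares = sum((x - mu) ** 2 for x in scores)  # sum of squared deviations

population_variance = sum_of_squares / n    # SS / n
sample_variance = sum_of_squares / (n - 1)  # SS / (n - 1), degrees of freedom
population_sd = sqrt(population_variance)
sample_sd = sqrt(sample_variance)

print(score_range, sum_of_squares)                               # 20 280.0
print(round(population_variance, 2), round(sample_variance, 2))  # 46.67 56.0
print(round(population_sd, 2), round(sample_sd, 2))              # 6.83 7.48
```

Dividing by n - 1 instead of n is the only difference between the sample and population calculations. That is why the sample variance (56) and sample standard deviation (7.48) come out larger than their population counterparts.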