Please see attachment
Assignment 1: At least 200 words, APA format; close your writing with an open-ended question and cite relevant sources. Please be sure to use your own words; the assignment will be submitted through SafeAssign for plagiarism checking.
See attachment for Box 1.2
Tests and testing can be issues that draw strong negative and positive emotions in individuals. As you can see in Ch. 1, the Army Alpha Test was used in World War I to determine a recruit’s placement in the military.
Take this test (Box 1.2) and then share your impressions about the cultural fairness and possible bias presented by some of the questions on the test.
What would it be like if this test was used to determine your entrance into the Clinical Mental Health Counseling (CMHC) program? As a result of your success or lack of success on the Army Alpha Test, how might your experience positively or negatively impact your work with future clients?
Assignment 3: At least 200 words, APA format, cite relevant sources; please be sure to write in your own words. The assignment will be submitted to SafeAssign for plagiarism checking.
Write concisely but substantively, giving detailed (i.e., specific) but very brief explanations. Your response should be in formal English, in APA format (i.e., title page, double spacing, running head, page numbers, credited sources, reference page, etc.), and no longer than 2 to 3 pages of writing, not counting your title page and reference page. Be sure to use spell-check and the other tools provided in Word, and please write in your own words. The assignment will be submitted through SafeAssign.
Please see textbook attachment
1. In Chapter 1, Figure 1.3 shows different types of cognitive assessments. On the same page in Ch. 1, Figure 1.4 shows 3 types of personality assessments.
2. In some detail (i.e., be specific), compare and contrast the 2 main types of assessments in the cognitive domain (Achievement Testing and Aptitude Testing). Ch. 8 and 9 may be referred to for additional info.
3. For the 3 types of personality assessments, explain briefly how these personality assessments differ from each other. You may refer to chapters 10 & 11 to help you with this assignment.
Assignment 2: At least 200 words, APA format; close your writing with an open-ended question and cite relevant sources. Please be sure to write in your own words. The assignment will be submitted to SafeAssign for plagiarism checking.
Do some research on informed consent and testing procedures, which are addressed in Chapter 2 of your text. Attach a sample "Informed Consent Form" appropriate for someone with your intended license (Mental Health Counseling), and say what you like about it.
Also, review and describe specific “Procedures for Testing” which should be followed by counselors.
Textbook:
Neukrug, E. S., & Fawcett, R. C. (2020). The essentials of testing and assessment (3rd ed., enhanced). Stamford, CT: Cengage Learning.
Chapter 9
This chapter provides an overview of three kinds of achievement tests (survey battery, diagnostic, and readiness tests) and one kind of aptitude test (cognitive ability test). Together, these make up the four domains of educational assessment (see the shaded domains in Figure 8.1). Individual intelligence tests, and special and multiple aptitude tests, which largely do not focus on educational ability, are also sometimes used in the schools and have other broad applications. These tests will be covered in Chapter 9 and Chapter 10, respectively.
As you might recall from Chapter 1, we defined survey battery, diagnostic, readiness, and cognitive ability tests in the following ways:
Survey Battery Tests: Tests, usually given in school settings, that measure broad content areas and are often used to assess progress in school.
Diagnostic Tests: Tests that assess problem areas of learning and are often used to assess learning disabilities.
Readiness Tests: Tests that measure one's readiness for moving ahead in school and are often used to assess readiness to enter first grade.
Cognitive Ability Tests: Tests that measure a broad range of cognitive ability. These tests are usually based on what one has learned in school and are useful in making predictions about the future (e.g., whether an individual might succeed in school or in college).
You might also remember from Chapter 1 that sometimes the difference between Achievement Testing and Aptitude Testing has more to do with how the test is being used than what the test is actually measuring. For instance, although the SAT seems to be measuring content knowledge or what you have learned in school, it is used to predict success in college and is therefore often listed in the Aptitude Section of Ability Testing. That is why there is a double-headed arrow in Figure 8.1—to remind us that many of these tests share much in common with one another—at least in terms of what they are measuring. In the rest of this chapter, we examine a number of survey battery, diagnostic, readiness, and cognitive ability tests and conclude with some final thoughts about the use and misuse of such tests.
Cognitive Ability Tests
As noted earlier, cognitive ability tests are aptitude tests that measure what one is capable of doing and are often used to assess a student's potential to succeed in grades K–12, college, or graduate school. First, we look at two K–12 tests: the Otis-Lennon School Ability Test (OLSAT) and the Cognitive Ability Test (CogAT). Then we look at tests used to assess potential ability in college and graduate school: the American College Testing Assessment (ACT), the SAT, the Graduate Record Exam (GRE), the Miller Analogies Test (MAT), the Law School Admission Test (LSAT), and the Medical College Admission Test (MCAT).
Otis-Lennon School Ability Test, Eighth Edition (OLSAT 8)
The Otis-Lennon School Ability Test, Eighth Edition (OLSAT 8), is one of the more common cognitive ability tests. “The OLSAT 8 supplies educators with valuable information to enhance the insights gained from traditional achievement tests” (Pearson, 2012b, “overview”). Usually given in large group format and for students in K–12, the test assesses different clusters in the verbal and nonverbal realms. For instance, the two clusters for verbal ability include verbal comprehension and verbal reasoning. The three clusters for nonverbal ability include pictorial reasoning, figural reasoning, and quantitative reasoning. As you might expect, different grade levels are given different clusters, and each cluster has different subtests. Testing time is between 60 and 75 minutes depending on the age of the student.
A variety of scores can be used in describing OLSAT 8 results, including a School Ability Index (SAI), which uses a mean of 100 and SD of 16, percentile ranks based on age and grade, stanines, scaled scores, and NCEs based on grade. In addition, when given along with the Stanford 10, an Achievement/Ability Comparison (AAC) score can be obtained to give teachers insights into how students are actually doing in school compared to their potential. Significantly higher scores on a cognitive ability test (e.g., the OLSAT 8) as compared to an achievement test (e.g., Stanford 10) could be an indication of a learning disability (see Box 8.2). Figure 8.8 shows an Individual Profile Report from the OLSAT 8 and Figure 8.9 shows 20 of 56 students from a Grade Profile Report of the OLSAT 8.
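Because the SAI is reported on a standard scale (mean 100, SD 16), it can be translated into an approximate percentile rank with a normal-curve conversion. A minimal sketch, assuming SAI scores are roughly normally distributed (the function name here is our own, not Pearson's):

```python
# Hypothetical conversion of an OLSAT 8 School Ability Index (SAI) to a
# z-score and an approximate percentile rank, using the mean of 100 and
# SD of 16 cited in the text and assuming approximate normality.
from statistics import NormalDist

SAI_MEAN, SAI_SD = 100, 16

def sai_percentile(sai: float) -> float:
    """Approximate percentile rank for a given SAI score."""
    z = (sai - SAI_MEAN) / SAI_SD
    return NormalDist().cdf(z) * 100

# An SAI of 116 is one SD above the mean (z = 1.0), about the 84th percentile.
print(round(sai_percentile(116)))
```

The same arithmetic underlies percentile ranks on any of the standard-score scales discussed in this chapter; only the mean and SD change.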
Figure 8.8. OLSAT 8 Individual Profile Report
Source: OLSAT 8 Results Online. Retrieved from http://pearsonassess.com/haiweb/Cultures/en-US/Site/ProductsAndServices/Products/OLSAT8/OLSAT+8+Results+Online.htm
Figure 8.9. OLSAT 8 Grade Profile Report
Source: OLSAT 8 Results Online. Retrieved from http://pearsonassess.com/haiweb/Cultures/en-US/Site/ProductsAndServices/Products/OLSAT8/OLSAT+8+Results+Online.htm
The norm groups of the OLSAT 8 consisted of 275,500 students in the spring of 2002 and an additional 135,000 in the fall of 2002 (Morse, 2010). Internal consistency measures of reliability based on the KR-20 for the composite score ranged from 0.89 to 0.94. Reliabilities for individual subtests using the KR-21 ranged from 0.52 to 0.82, with most falling in the 0.60s and 0.70s. Evidence of test content validity is somewhat vague, as the publisher notes that each user must determine whether the content fits the population being tested. Correlation coefficients for the OLSAT 8 composite scores with OLSAT 7 scores ranged from 0.74 to 0.85, depending on grade level. Similarly, correlations among different levels of the OLSAT 8 were adequate. The test also showed reasonable correlations with the Stanford Achievement Test (Stanford 10). Finally, the test is often given around the same time as the Stanford Achievement Test, and scores on each test can be listed on the profile reports, making comparisons relatively easy (see Box 8.2 and Figures 8.8 and 8.9).
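The KR-20 coefficient cited above has a simple closed form: r = (k/(k−1))(1 − Σpᵢqᵢ/σ²), where k is the number of items, pᵢqᵢ is each dichotomous item's variance, and σ² is the variance of total scores. A minimal illustration with a made-up response matrix (the data are invented; this is not the publisher's computation):

```python
# Illustrative KR-20 computation for dichotomous (0/1) item responses.
# Rows are examinees, columns are items; data below are fabricated.
from statistics import pvariance

def kr20(responses):
    k = len(responses[0])                      # number of items
    n = len(responses)                         # number of examinees
    totals = [sum(person) for person in responses]
    total_var = pvariance(totals)              # variance of total scores
    pq = 0.0                                   # sum of item variances p*(1-p)
    for item in range(k):
        p = sum(person[item] for person in responses) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / total_var)

data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(kr20(data), 2))
```

Real reliability estimates, of course, come from full norm-group item data rather than a toy matrix like this one.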
The Cognitive Ability Test (CogAT)
Another common cognitive ability test, the Cognitive Ability Test (CogAT), has a name befitting its test category. Although a Form 8 has been developed, technical information about it is not readily available, so this review examines Form 7, which is very similar to the new form.
Form 7 of the CogAT is designed to assess the cognitive skills of children from kindergarten through 12th grade (Houghton Mifflin Harcourt, 2018b). The purpose of the CogAT is to identify students' abilities, provide a different means of measuring cognitive ability than achievement tests, and identify students who might have large discrepancies between their cognitive ability testing (as measured on the CogAT) and their achievement testing (Ackerman & David, 2017). Such discrepancies can be indicative of learning problems, lack of motivation, problems at home, problems at school, or self-esteem issues. Teachers and support staff should be cognizant of such discrepancies and make appropriate referrals as necessary (see Boxes 8.2 and 8.4).
Box 8.4. Identifying Needed Services
When I was in private practice, I worked with a third-grader who, a couple of years prior to my first meeting him, had been identified as having a math learning disability. This disability was first hypothesized after a large discrepancy was found between his cognitive ability score in math and his math achievement in school. Further testing verified the disability, and he was soon given an Individualized Education Plan that included three one-hour sessions of individualized assistance with math each week. After receiving this help, he soon began to do much better in math. However, after he had been getting higher math scores for about a year, the school discontinued these services. I met him soon after this, following his parents’ divorce, when his math scores had once again dropped. The scores had likely dropped due to the chaos at home as well as the removal of services. I immediately realized that this young man was not being given the extra help he was legally entitled to receive. I contacted the school, which agreed that he should be given assistance in math based on his Individualized Education Plan (IEP). Within a few weeks after obtaining this assistance, his math scores once again improved, and he was a noticeably happier child.
— Ed Neukrug
© Cengage
The CogAT measures three broad areas of ability: verbal ability, quantitative ability, and nonverbal ability, and offers a composite score. It is constructed with two models of intelligence in mind: Vernon's hierarchical abilities and Cattell's fluid and crystallized abilities (Ackerman & David, 2017) (see Chapter 9). However, cognitive ability tests should never be viewed as substitutes for individual intelligence tests, as the manner in which they are created and administered is vastly different; they tend to focus primarily on traditional knowledge as obtained in school, particularly verbal and mathematical ability.
CogAT scores can be converted to standard scores that use a mean of 100 and standard deviation of 16, percentile ranks, and stanines. Comparisons are made by age and by grade level. The entire test takes between 90 minutes and two hours, depending on the age range of the student. The most recent norm group contained a representative sample of over 65,000 students from public and private schools. Internal consistency reliability of the CogAT ranges from 0.80 to 0.94 for the verbal, quantitative, and nonverbal sections and from 0.88 to 0.99 for the composite score (Ackerman & David, 2017).
All cognitive ability tests have difficulty establishing content validity, since their purpose is to predict how well a student can do. However, the CogAT's developers suggest that the content domain was defined logically and fits the basic theory of reasoning that the CogAT measures (Ackerman & David, 2017). Past forms of the CogAT showed relatively high correlations between the instrument and tests of general intelligence. However, only a few small studies of this recent form have been conducted. Two studies suggest decent correlations with the Iowa Assessments and with the Wechsler Intelligence Scale for Children (Fourth Edition). Although the CogAT would benefit from additional validity studies, it seems to be a good instrument for identifying large discrepancies between aptitude and achievement results. This can help identify students who could potentially do better if the reason for their not achieving at optimal levels could be determined.
College and Graduate School Admission Exams
A number of cognitive ability tests are used to predict achievement in college and graduate school. Some of the ones with which you might be familiar include the ACT, the SATs, the GRE General Test and Subject Tests, the Miller Analogies Test (MAT), the LSAT, and the MCAT. Despite much consternation over the use of these tests, research indicates that such tests tend to predict performance in undergraduate and graduate school about as well or better than other indicators and are especially useful when combined with grade predictors (e.g., grades in high school or college grades) (Kobrin, Patterson, Shaw, Mattern, & Barbuti, 2008; Kuncel & Hezlett, 2007).
ACT
The ACT and the SAT (see next section) are the two most widely used admission exams at the undergraduate level (Straus, 2012). The ACT assesses four skill areas based on what one has learned in high school: English, math, reading, and science; there is also an optional writing test and a composite score. The test contains 215 multiple-choice questions and takes 3.5 hours to complete. The mean ACT composite score for graduating seniors tends to be about 21, with a standard deviation around 5, and the SEM for the composite score is about 1.25 (ACT, 2017). Publishers of the ACT performed a major norm sampling in 1988 with more than 100,000 high school students and another resampling update in 1995 with 24,000 students stratified against the U.S. Department of Education figures for region, school size, affiliation, and ethnicity.
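The reported SEM of about 1.25 fits the classical test theory relation SEM = SD × √(1 − reliability). A minimal sketch, assuming the composite SD of about 5 and the composite reliability of about .94 reported for the ACT:

```python
# Sketch of where an SEM of ~1.25 comes from, using the classical formula
# SEM = SD * sqrt(1 - reliability), with the ACT composite SD (~5) and
# composite reliability (~.94) cited in this section.
from math import sqrt

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement under classical test theory."""
    return sd * sqrt(1 - reliability)

composite_sem = sem(5, 0.94)
print(round(composite_sem, 2))  # close to the ~1.25 reported

# A 95% confidence band around a hypothetical observed composite of 24:
score = 24
lo, hi = score - 1.96 * composite_sem, score + 1.96 * composite_sem
print(round(lo, 1), round(hi, 1))
```

A band like this is how counselors typically report a "true score range" rather than a single point estimate.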
Reliability of the ACT has ranged between 0.85 and 0.92 for the four skill areas and is 0.94 for the composite score. Evidence of content validity comes from the test development process, which consistently shows that test items are related to how students have "developed the academic skills and knowledge that are important for success in college" (ACT, 2017, Chapter 11.2). The ACT publishers also performed studies that showed a sound correlation between students' ACT scores and their high school GPAs. Predictive validity studies correlating ACT scores with first-year college GPA yielded a correlation of .51. High school GPA and socioeconomic status also predict first-year college GPA, with correlations of .58 and .24, respectively (see Box 8.5).
Box 8.5. Use of College Admission Exams: Selective or Oppressive?
Today, tests like the SAT and ACT, along with a student's GPA and other materials, are used to determine college readiness. Supporters of these college admission exams believe the tests "level the playing field" in that they allow all students to compete on the same test, whereas high school grades and rank can vary dramatically as a function of the student's high school. However, others have suggested that these exams actively prevent some students from gaining access to some colleges.
For instance, consider the student who has the ability to do exceptionally well but attended a disadvantaged school that lacked the resources to promote college readiness. Or what about the student who does not have the economic resources to hire a tutor or to pay for an SAT or ACT study course? Or think about the student who hopes to be the first in his or her family to go to college, and compare the kind of intellectual stimulation he or she gets at home to that of the child of highly educated parents. Unfortunately, all education is not created equal, particularly if you live in poverty.
At best, college admissions exams are a tool that standardizes the admission process by allowing students from various academic backgrounds to compete equally. At worst, they are tools that have the potential to widen the educational gap by discouraging, excluding, and stigmatizing those that lack access to adequate educational resources. What do you think?
— Michelle Reaves, Graduate Student, Old Dominion University
SAT
In addition to the ACT, the other major college entrance exam is the SAT. In 2016, the SAT was redesigned to better reflect the kind of work that students do in college. The new SAT is composed of four tests: a reading test (52 multiple-choice questions, 65 minutes), a writing and language test (44 multiple-choice questions, 35 minutes), a math test (58 questions, 13 of which are grid-ins), and an optional essay (The College Board, 2017a, 2017b). Students receive an evidence-based reading and writing (EBRW) score and a separate math score, both of which range between 200 and 800. The mean standard score for each of these tests tends to be around 500 and the standard deviation near 100, although these scores fluctuate from year to year. Percentile ranks are also given to help students understand where they score relative to their peer group. Subscores on reading, writing and language, and math are also provided. Essays are graded on a scale from 2 through 8, which is a combined rating from two raters who each rate the essay for reading, analysis, and writing from 1 through 4. The SAT technical manual provides a wide range of psychometric information broken down by ethnicity and socioeconomic status (The College Board, 2017a). Reliability of the whole test is high, about .96, while the math and EBRW sections are .90 and .94, respectively. The essay test has a reliability coefficient of about .90 (The College Board, 2015).
One of the most important types of validity for a test like the SATs is its predictive validity, since a major goal of the SAT is to predict achievement in college. It’s also interesting to see how high school GPA predicts college grades, as it too is generally used as a measure of how well a student might do in college. Based on recent research with high school GPA and with SAT scores (Shaw et al., 2016), high school GPA showed a correlation of .48 with first year GPA (FYGPA) in college. The SAT EBRW scores showed a correlation of .51 with FYGPA, while the math scores showed a correlation of .49 with FYGPA. Combining the SAT EBRW with the SAT math, a correlation of .53 is found with FYGPA, and adding HSGPA to the mix, we find a correlation of .58 with FYGPA. However, there is some evidence that the SAT might not be as predictive for minorities and for women, which is clearly a serious drawback to the test (Jaschik, 2016).
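One way to read the coefficients above: squaring a correlation gives the proportion of variance in first-year GPA accounted for by the predictor. A quick sketch using the values cited in the text:

```python
# Variance in first-year college GPA (FYGPA) explained by each predictor,
# computed as r squared, using the correlations cited in the text.
predictors = {
    "HSGPA alone": 0.48,
    "SAT EBRW alone": 0.51,
    "SAT EBRW + math": 0.53,
    "SAT + HSGPA": 0.58,
}
for name, r in predictors.items():
    print(f"{name}: r = {r}, variance explained = {r**2:.0%}")
```

Note that even the best combination leaves roughly two-thirds of the variance in FYGPA unexplained, which is one reason admissions offices weigh multiple criteria.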
GRE General Test
The GRE General Test is a cognitive ability test frequently required by U.S. graduate schools. The General Test contains three sections: verbal reasoning, quantitative reasoning, and analytical writing (Educational Testing Service [ETS], 2018). The scaled scores for the verbal and quantitative reasoning sections used to be similar to those of the SAT but have recently changed: the verbal and quantitative sections now range from 130 to 170, while the analytical writing section is scored on a scale of 0 to 6 in half-point increments by two trained readers (a third reader is brought in if the two scores are more than 1 point apart). Recent scaled score means and standard deviations for the verbal and quantitative sections were 150.05 (8.43) and 152.8 (9.13), respectively, while the mean and standard deviation of the writing section were 3.5 and 0.87. The Educational Testing Service does not "set" the mean or standard deviation and instead uses a scaled score mean and standard deviation that floats over time. Thus, it is probably prudent to look at percentile ranks rather than scaled scores, as they allow examinees to compare themselves to those who are currently taking the same exam.
Reliability estimates for the GRE are 0.92 for verbal reasoning and 0.95 for quantitative reasoning, while the analytical writing section is lower at 0.85 (ETS, 2018). Predictive validity of the GRE General Test seems to indicate that the test does predict first-year graduate GPA to some degree. For instance, Burton and Wang (2005) found that the correlation of combined GRE scores with graduate school GPA was 0.51 for psychology, 0.51 for biology, 0.50 for chemistry, 0.62 for English, and 0.66 for education. However, some suggest the test does not capture motivation, perseverance, and other attributes that might be more favorable to women and minorities (Clayton, 2016).
GRE Subject Tests
In addition to the GRE General Test, a number of subject tests are provided for those graduate programs that wish to assess more specific ability. The available subject tests include biochemistry, cell and molecular biology; biology; chemistry; computer science; literature in English; mathematics; physics; and psychology. Like the GRE General Test, the subject tests use a floating mean and standard deviation; however, the Subject Test’s scaled score ranges between 200 and 980 (ETS, 2018). Means and standard deviations can vary dramatically among subject tests, so subject scores should not be compared to each other, although they can be compared to themselves over time. For instance, the mean and standard deviation for those examinees who took the biochemistry, cell and molecular biology test between July 2014 and June 2017 were 667 and 120, respectively, while the mean and standard deviation for psychology during that time were 617 and 104 (ETS, 2018). Reliability is in the low to mid .90s and scores are moderately related to GPA in a number of graduate degrees (Bridgeman, Burton, & Cline, 2008).
Miller Analogies Test (MAT)
Another test used for admission to graduate school is the Miller Analogies Test (MAT), which "measures your ability to recognize relationships between ideas, your fluency in the English language, and your general knowledge of the humanities, natural sciences, mathematics, and social sciences" (MAT, 2017, p. 4). The test, which has 120 analogies, can be taken on computer or by hand and takes 1 hour to complete. A scaled score, based on all who took the test between January 2012 and December 2015, ranges from 200 to 600 with a mean set at approximately 400. Percentile scores are also given relative to the examinee's intended major and to the total group who took the test. Internal reliability is 0.91 for combined majors. Predictive validity studies for the MAT show weak to moderate correlations with first-year graduate GPA (Meagher, 2008; Pearson, 2012c): one study found a correlation of .27 with first-year graduate GPA, while a meta-analysis found a correlation of .41. Additional research showing the construct and predictive validity of this instrument is needed.
Law School Admission Test (LSAT)
The Law School Admission Test (LSAT) is a half-day test that consists of five 35-minute sections used to determine admission to law school (Law School Admission Council [LSAC], 2018). The test includes three scored multiple-choice sections measuring reading comprehension, analytical reasoning, and logical reasoning; a fourth section asks for a writing sample that is not scored but is sent directly to the law schools to which the applicant is applying; and a fifth section is unscored and used to pretest new questions. LSAT scores range from 120 to 180 (mean about 150, SD about 10), although scores fluctuate from test to test. Percentiles are also given, and the test's reliability is in the low .90s (Law School Admission Council, n.d.). The LSAT predicts first-year law school grades better than the undergraduate GPA (UGPA), but the UGPA is very slightly better at predicting cumulative grades (.28 for LSAT, .29 for UGPA) (Marks & Moss, 2016). STEM and EAF majors and work experience are also significant predictors of grades.
Medical College Admission Test (MCAT)
The most widely used exam for admission to medical school is the Medical College Admission Test (MCAT). The MCAT has four sections: chemical and physical foundations of biological systems; biological and biochemical foundations of living systems; psychological, social, and biological foundations of behavior; and critical analysis and reasoning skills (Association of American Medical Colleges [AAMC], 2018a). The MCAT takes more than 7 hours to administer its 230+ items. Examinees receive a standard score for each section that ranges from 118 to 132 (M ≈ 126, SD ≈ 3) and a total score that ranges between 472 and 528 (M ≈ 502, SD ≈ 9.5). Percentiles are also provided. Preliminary data from predictive validity studies of the MCAT with performance in medical school (GPA) after one year show the MCAT to be about as predictive as undergraduate GPA (UGPA), with correlations of .56 (MCAT) and .52 (UGPA), respectively (AAMC, 2018b). Together, the MCAT and the UGPA show a correlation of .65 with first-year grades. In addition to the MCAT, medical schools are now looking at other admissions criteria, such as experience, demographics, and type of degree.
The Role of Helpers in the Assessment of Educational Ability
In addition to teachers, a variety of helping professionals can play critical roles in assessing educational ability. For instance, school counselors, school psychologists, learning disabilities specialists, and school social workers often work together as members of the school's special education team to determine eligibility for learning-disability assessment and to help develop a child's Individualized Education Plan (IEP). School psychologists and learning disability specialists are generally the testing experts who are called upon to do the testing to identify learning problems. Sometimes outside experts in assessment, such as clinical and counseling psychologists, are called upon to do additional assessments or to contribute a second opinion to the school's assessment.
Because school counselors are some of the few experts in assessment who are housed at the school (school psychologists generally float from school to school), they will sometimes assist teachers in understanding and interpreting educational ability tests. In addition, by disaggregating the data from achievement tests, school counselors and others can play an important role in helping to identify students and classrooms that might need additional assistance in learning specific subject areas. Finally, licensed professionals in private practice need to know about the assessment of educational ability when working with children who are having problems at school. In fact, it is often critical that these clinicians consult with professionals in the schools to assure that the child is making adequate progress (see Box 8.6).
Box 8.6. Assisting Teachers to Interpret Standardized Testing
A school system once hired me to run a series of workshops for their teachers on how to interpret the results of national standardized testing, such as survey battery achievement test scores. I was quite surprised to learn that few of them had ever taken a course on how to interpret such data and thus had no idea of the important information that could be gained from analyzing these data. I was, however, impressed with the content knowledge that these teachers possessed. It was almost as though I was teaching them Greek (basic test statistics used in test interpretation) and they were teaching me Latin (the meaning of the content areas that were being reported on the test reports). It was certainly a learning experience for all of us!
Final Thoughts About the Assessment of Educational Ability
As you can see from this chapter, the assessment of educational ability has become an important aspect of testing in the United States. Despite the widespread use of these tests, many criticisms have arisen, including the following:
Teachers are increasingly being forced to teach to the test. This prevents them from being creative and limits the kind of learning that takes place in the schools.
Testing leads to labeling and this can cause peers, teachers, parents, and others to treat the child as the label. For the child, the label sometimes becomes a self-fulfilling prophecy that prevents him or her from being able to transcend the label.
Some tests, particularly readiness tests and cognitive ability tests, are just a mechanism to allow majority children to move ahead and to keep minority children behind.
Testing fosters competitiveness and peer pressure, creating a failure identity for a large percentage of students.
On the other hand, many have spoken positively about tests of educational ability and have made the following points:
Tests allow us to identify children, classrooms, schools, and school systems that are performing poorly. This, in turn, allows us to address weaknesses in learning. In fact, evidence already exists that as a result of state standards of learning and the achievement testing associated with them, poor children, minority children, and others who traditionally have not done as well in schools have been doing better academically.
Without diagnostic testing, we could not identify a large portion of those children who have a learning disability, and we would not be able to offer them services to help them learn.
Testing allows a child to be accurately placed in his or her grade level. This ultimately provides a better learning environment for all children.
Testing helps children identify what they are good at and pinpoint weak areas that require added attention.
Probably both the criticisms and the praises of educational ability testing hold some truth. Perhaps as we recognize the positive aspects of such testing, we should also pay attention to the criticisms and find ways to address them.
Defining Intelligence Testing
As mentioned in Chapter 1, intelligence testing is a subset of intellectual and cognitive functioning and assesses a broad range of cognitive capabilities that generally results in an “IQ” score (see Figure 1.3 and Box 1.5). Like special aptitude testing, multiple aptitude testing, and cognitive ability testing, intelligence testing measures aptitude, or what one is capable of doing. Special and multiple aptitude testing will be examined in Chapter 10 as types of tests sometimes used in occupational and career counseling. Cognitive ability tests were examined in Chapter 8 as a type of educational ability testing, and although intelligence tests are sometimes used to assess aspects of educational ability, they tend to have much broader applications. For instance, intelligence tests are used:
to assist in determining giftedness;
to assess intellectual disabilities;
to identify certain types of learning disabilities;
to assess intellectual ability following an accident, the onset of dementia, substance abuse, disease processes, and trauma to the brain;
as part of the admissions process to certain private schools; and
as part of a personality assessment battery to aid in understanding the whole person.
But before we examine intelligence tests, let’s take a look at some models of intelligence that have been used as templates for the development of some of the more frequently used intelligence tests.
Models of Intelligence
Some models of intelligence have been around for 100 years, and others are new. Many of them are complicated, and all of them have contributed to the development of present-day intelligence tests. What follows is a brief overview of some of the more prevalent models, including Spearman’s two-factor approach, Thurstone’s multifactor approach, Vernon’s hierarchical model of intelligence, Guilford’s multifactor/multidimensional model, Cattell’s fluid and crystallized intelligence, Piaget’s cognitive development theory, Gardner’s theory of multiple intelligences, Sternberg’s triarchic theory of successful intelligence, and the Cattell-Horn-Carroll (CHC) integrated model of intelligence.
Spearman’s Two-Factor Approach
When Alfred Binet created the first widely used intelligence test, he developed a number of subtests that assessed a range of what he considered to be intellectual tasks. Then, he determined the average scores that individuals, at different age groups, obtained on these tasks. Consequently, when an individual was assessed on the Binet scale, the individual’s score could be compared to the average score of individuals at different age levels. Asserting that such a test was a hodgepodge or “promiscuous pooling” of factors, Charles Edward Spearman (1863–1945) was critical of Binet and others (Spearman, 1970, p. 71). Spearman, in other words, felt that Binet had lumped a number of different factors together in a spurious fashion.
Spearman (1970) believed in a two-factor approach to intelligence that included a general factor (g) and a specific factor (s), with the “weight” of g varying as a function of what was being measured. For example, he stated that the “talent for classics” (e.g., understanding the ancient worlds of Rome and Greece) had a ratio of g to s of 15 to 1; that is, general intelligence is much more significant than any specific ability in understanding the ancient worlds. Conversely, he proposed that the ratio of general intelligence (g) to specific talent for music (s) was 1 to 4, meaning that specific musical ability contributes much more to talent for music than does general intelligence (Spearman, 1970). Although Spearman’s theory was one of the earlier models of intelligence, many still adhere to the concept that there is a g factor that mediates general intelligence and s factors that speak to a variety of specific talents.
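Spearman’s ratios can be read as relative weights placed on g and s. The following toy sketch illustrates the idea; the numeric ability scores and the weighted-average formula are hypothetical illustrations, not Spearman’s actual computations:

```python
# Toy illustration of Spearman's two-factor idea (hypothetical numbers).
# A g-to-s ratio of 15:1 (classics) versus 1:4 (music) becomes a pair of
# weights applied to a person's general and specific ability scores.

def two_factor_score(g_score, s_score, g_weight, s_weight):
    """Weighted mix of general (g) and specific (s) ability."""
    total = g_weight + s_weight
    return (g_weight * g_score + s_weight * s_score) / total

# Same hypothetical person: strong general ability, modest musical talent.
g = 0.9

classics = two_factor_score(g, 0.5, g_weight=15, s_weight=1)  # g dominates
music = two_factor_score(g, 0.4, g_weight=1, s_weight=4)      # s dominates

print(round(classics, 3))
print(round(music, 3))
```

Under these made-up weights, the same high general ability yields a strong predicted “classics” score but only a middling music score, because musical talent is dominated by the specific factor.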
Thurstone’s Multifactor Approach
Using multiple-factor analysis, Thurstone developed a model that included seven primary factors or mental abilities. Although Thurstone’s research did not substantiate Spearman’s general factor (g), he did not rule out the possibility that it existed since there appeared to be some commonality among the seven factors (Thurstone, 1938). The seven primary mental abilities he recognized were verbal meaning, number ability, word fluency, perception speed, spatial ability, reasoning, and memory.
Guilford’s Multifactor/Multidimensional Model
Guilford (1967) originally developed a model of intelligence with 120 factors. As if that were not enough, he later expanded it to 180 factors (Guilford, 1988). His three-dimensional model can be represented as a cube and involves three kinds of cognitive ability: operations, or the general intellectual processes we use in understanding; content, or how we apply our intellectual processes; and products, or how we apply our operations to our content (see Figure 9.2). Different mental abilities require different combinations of operations, contents, and products, and combining all possible values of the three dimensions yields the 6 × 6 × 5 = 180 factors. Guilford’s multifactor model provides a broad view of intelligence (Guilford, 1967); however, his model is sometimes considered too unwieldy to implement and has not significantly influenced the testing community.
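The 180-factor count is simply the number of cells in Guilford’s cube, one per combination of the three dimensions. A quick sketch (the generic labels below are placeholders, not Guilford’s actual category names, and the 6/6/5 counts follow the arithmetic above):

```python
# Guilford's expanded model: one factor per (operation, content, product)
# combination. Labels here are placeholders, not Guilford's category names.
from itertools import product

operations = [f"operation_{i}" for i in range(1, 7)]  # 6 kinds
contents = [f"content_{i}" for i in range(1, 7)]      # 6 kinds
products = [f"product_{i}" for i in range(1, 6)]      # 5 kinds

factors = list(product(operations, contents, products))
print(len(factors))  # 180
```

The sheer size of this enumeration hints at why the model is considered unwieldy: each of the 180 cells would, in principle, need its own measurement.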
Cattell’s Fluid and Crystallized Intelligence
After attempting to remove cultural bias from intelligence tests, Raymond Cattell observed that as information based on learning was removed from such tests (the portion most affected by cultural influences), the raw or unlearned abilities provided a different score (Cattell, 1971). He then considered the possibility that two “general factors” made up intelligence: “fluid” (gf) intelligence, or that culture-free portion of intelligence that is inborn and unaffected by new learning, and “crystallized” intelligence (gc), which is acquired as we learn and is affected by our experiences, schooling, culture, and motivation (Cattell, 1979). He eventually estimated that heritability variance within families for fluid intelligence was about 0.92, which basically means if your parents have it, you are likely to have it (Cattell, 1980). Abilities such as memory and spatial capability are aspects of fluid intelligence.
As one might expect, crystallized intelligence will generally increase with age, while many research studies have found that fluid intelligence tends to decline slightly as we get older (see Box 9.1). Therefore, many theorists believe that overall intelligence (g) is maintained evenly across the lifespan (see Figure 9.3). As we look at specific intelligence tests later in the chapter, see if you can identify how Cattell’s ideas have influenced their development.
Box
9.1.
Example of Fluid and Crystallized Intelligence
At our university, we have a good mix of traditional-aged as well as older adult (35+-year-old) students. It is interesting to observe students when we give exams. The last five or six students left taking an exam will invariably be almost all of the older adult students. Sure, some of the reason is that they may be more careful, but the decrease in fluid intelligence could also be a major factor. Are the older students’ scores lower? Absolutely not. As a matter of fact, they are at least equal to the younger students’ scores, if not higher. Why is this? It might be that the older students are experiencing a decrease in fluid intelligence, but their crystallized intelligence is making up the difference!
— Charlie Fawcett
© Cengage
Figure
9.3.
Fluid and Crystallized Intelligence across the Lifespan
Piaget’s Cognitive Development Theory
Piaget (1950) approached intelligence from a developmental perspective rather than a factors approach. Spending years observing how children’s cognitions were shaped as they grew, he developed the now familiar four stages of cognitive development: sensorimotor, preoperational, concrete operational, and formal operational. Piaget (1950) believed that cognitive development is adaptive; that is, as new information from our environment is presented, we are innately programmed to take it in and make sense of it in some manner to maintain a sense of order and equilibrium in our lives.
Piaget believed that we adapt our mental structures to maintain equilibrium through two methods: assimilation and accommodation. Assimilation is incorporating new stimuli or information into existing cognitive structures. On the other hand, accommodation is creating new cognitive structures and/or behaviors based on new stimuli. For example, a parent might teach a young child that the word hot means that one should stay away from certain items (e.g., a stove, iron, etc.) because touching those items can result in something bad happening. The child then learns that stoves are “hot” and should be avoided. In addition, every time the child is near a hot object, he or she knows that this object should be avoided (e.g., match, coal, frying pan, etc.). The new object has been assimilated as something to be avoided.
As the child grows older, he or she comes to realize that not all hot items are hot all the time, and the child accommodates to this new information. For instance, as the child comes to understand the stove is not hot all the time, he or she creates a new meaning for the concept “stove,” which now is seen as an object that can be hot or cold. Consequently, the child’s behavior around a stove will change as he or she has accommodated this new meaning into his or her mental framework.
Consider how assimilation and accommodation are also important in the learning of important concepts in school, such as the shift between addition and multiplication (multiplication being a type of advanced addition). Also, consider how you have assimilated or accommodated information from this text into your existing structures of understanding! Although Piaget’s understanding of cognitive development does not speak directly to the amount of learning taking place, it does highlight the process of learning—a critical concept for teachers and helpers to understand.
Gardner’s Theory of Multiple Intelligences
Gardner (1999, 2011), who is vehemently opposed to most other models of intelligence and the manner in which intelligence tends to be measured, refers to the predominant notion of intelligence as the “dipstick theory” of the mind; that is, it holds that there is a specific amount or level of intelligence in the brain, and if you could place a dipstick in the brain and pull it out, you could accurately read how smart a person is (Gardner, 1996). In contrast, he believes that intelligence is much too vast and complex to be measured accurately by our current methods.
Based on his research on brain-damaged individuals, as well as literature in the areas of the brain, evolution, genetics, psychology, and anthropology, Gardner developed his theory of multiple intelligences, which asserts that there are eight or nine intelligences and that, with more research, others might be found. Following are the nine identified intelligences, although research on the ninth, existential intelligence, has not clearly established its validity at this point (Gardner, 2003).
Verbal-Linguistic Intelligence: well-developed verbal skills and sensitivity to the sounds, meanings, and rhythms of words
Mathematical-Logical Intelligence: ability to think conceptually and abstractly, and capacity to discern logical or numerical patterns
Musical Intelligence: ability to produce and appreciate rhythm, pitch, and timbre
Visual-Spatial Intelligence: capacity to think in images and pictures, to visualize accurately and abstractly
Bodily-Kinesthetic Intelligence: ability to control one’s body movements and to handle objects skillfully
Interpersonal Intelligence: capacity to detect and respond appropriately to the moods, motivations, and desires of others
Intrapersonal Intelligence: capacity to be self-aware and in tune with inner feelings, values, beliefs, and thinking processes
Naturalist Intelligence: ability to recognize and categorize plants, animals, and other objects in nature
Existential Intelligence: sensitivity and capacity to tackle deep questions about human existence, such as the meaning of life, why do we die, and how did we get here. (Educational Broadcasting Corporation, 2004, section: “What is …”)
Gardner’s understanding of intelligence is revolutionary and not yet mainstream. However, agreement with this theory appears to be growing in academic and nonacademic settings. Not only are his identified categories novel, but his understanding of how intelligence manifests itself is different, as noted in the following list:
All human beings possess a certain amount of all of the intelligences.
All humans have different profiles or amounts of the multiple intelligences (even identical twins!).
Intelligences are manifested by the way a person carries out a task in relationship to his or her goals.
Intelligences can work independently or together, and each is located in a distinct part of the brain (Gardner, 2003, 2011).
Sternberg’s Triarchic Theory of Successful Intelligence
Like Gardner, Sternberg (2009, 2012) has a novel view of intelligence, one based on the individual’s capacity to use his or her abilities and talents, navigate his or her environment, and adapt to new situations. He states that
successful intelligence is:
(1)the use of an integrated set of abilities needed to attain success in life, however an individual defines it, within his or her sociocultural context. People are successfully intelligent by virtue of
(2)recognizing their strengths and making the most of them, at the same time that they recognize their weaknesses and find ways to correct or compensate for them. Successfully intelligent people
(3)adapt to, shape, and select environments through
(4)finding a balance in their use of analytical, creative, and practical abilities.
(Sternberg, 2012, pp. 156–157)
Sternberg’s model is composed of three subtheories, which he calls the componential, the experiential, and the contextual (Sternberg, 2012; see Figure 9.4).
Figure
9.4.
Sternberg’s Triarchic Theory of Successful Intelligence
© Cengage
In Sternberg’s theory, the componential subtheory, sometimes called the analytical facet, is related to the more traditional types of intelligence and has to do with higher-order thinking (metacomponents), how one acts on that higher-order thinking (performance), and the strategies one uses to store and use knowledge (knowledge acquisition).
The experiential subtheory, sometimes called the creative facet, is focused on the ability to deal with novel situations and one’s adeptness at automatically attending to tasks so that one can focus on other tasks (e.g., multitasking). Creative thinkers can focus on novel situations, attend to them, and eventually deal with similar situations in the future in an automatic manner as they no longer are novel.
Finally, Sternberg’s contextual subtheory, sometimes called the practical facet, has to do with the ability to adapt to an ever-changing environment, to shape one’s environment to meet one’s goals, and to select a new environment (renouncing the old one) if adaptation and shaping are not successful.
Although Sternberg believes that his theory is universal, how it is applied can vary in different cultures. For instance, what is seen as important higher-order thinking in one culture may be different in another culture, and the kinds of novel situations faced in one’s environment will vary from culture to culture (see Exercise 9.1).
Exercise
9.1.
Using Your Intelligence to Successfully Navigate through School
In class, discuss the kinds of componential intelligence, experiential intelligence, and contextual intelligence that are needed to successfully complete your major or graduate program. How does each type of intelligence impact the others?
Cattell-Horn-Carroll (CHC) Integrated Model of Intelligence
Like those who came before them, Horn and Cattell (1966) examined the idea of multiple abilities. Using factor analysis, they provided evidence that intelligence could be understood in reference to six factors: fluid intelligence, crystallized intelligence, general visualization, general speediness, facility in the use of concept labels, and carefulness. This work was later expanded by Carroll (1993). More recently, Horn and Blankson (2012) theorized eight or nine factors, and Carroll (2005) and Schneider and McGrew (2012) reported support for three additional abilities (i.e., kinesthetic ability, olfactory ability, and tactile ability), which, they argued, help to explain Gardner’s (2003, 2011) multiple intelligences and Sternberg’s (2009, 2012) triarchic theory of successful intelligence.
Building on Cattell, Horn, and Carroll’s work, an integrated model that consolidates their research has since been developed (McGrew, 2009; Schneider & McGrew, 2012). The Cattell-Horn-Carroll (CHC) integrated model includes 16 broad ability factors, 6 of which are tentative (see Table 9.1). In addition, this approach suggests over 70 narrow abilities that tie into the different factors. Finally, although Carroll (1993) suggested that a g factor mediates the various abilities, Cattell and Horn suggested it did not (Horn & Blankson, 2012). Despite this difference, their theories tie together nicely.
Table
9.1.
CHC Factors Listed under Broad Domains*
Domain Free
Fluid Reasoning (Gf): “… the deliberate but flexible control of attention to solve novel, ‘on the spot’ problems that cannot be performed by relying exclusively on previously learned habits, schemas, and scripts.”
Memory
Short-Term Memory (Gsm): “… the ability to encode, maintain, and manipulate information in one’s immediate awareness.”
Long-Term Storage & Retrieval (Glr): “… the ability to store, consolidate, and retrieve information over periods of time measured in minutes, hours, days, and years.”
General Speed
* Psychomotor Speed (Gps): “… the speed and fluidity with which physical body movements can be made.”
Processing Speed (Gs): “… [t]he ability to perform simple repetitive cognitive tasks quickly and fluently.”
Reaction and Decision Speed (Gt): “… [t]he speed of making very simple decisions or judgments when items are presented one at a time.”
Motor
* Kinesthetic Abilities (Gk): “… the ability to detect and process meaningful information in proprioceptive sensations.”
* Psychomotor Abilities (Gp): “… the ability to perform physical body motor movements (e.g., movement of fingers, hands, legs) with precision, coordination, or strength.”
Sensory
* Olfactory Abilities (Go): “… the ability to detect and process meaningful information in odors.”
* Tactile Abilities (Gh): “… the ability to detect and process meaningful information in haptic (touch) sensations.”
Visual Processing (Gv): “… the ability to make use of simulated mental imagery (often in conjunction with currently perceived images) to solve problems.”
Auditory Processing (Ga): “… the ability to detect and process meaningful nonverbal information in sound.”
Acquired Knowledge
Reading and Writing (Grw): “… [t]he depth and breadth of knowledge and skills related to written language.”
Quantitative Knowledge (Gq): “… the depth and breadth of knowledge related to mathematics.”
Comprehension-Knowledge (Gc): “… the depth and breadth of knowledge and skills that are valued by one’s culture.”
* Domain-Specific Knowledge (Gkn): “… the depth, breadth, and mastery of specialized knowledge (knowledge not all members of a society are expected to have).” (McGrew & Schneider, 2012, pp. 111–137)
*Tentative ability factor
Source: McGrew, K. S., & Schneider, K. S. (2012). The Cattell-Horn-Carroll model of intelligence. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary Intellectual Assessment: Theories, Tests, and Issues (3rd ed., pp. 99–144). New York: Guilford Press.
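Table 9.1 can be captured as a small lookup structure, which also makes it easy to verify the counts given in the text: 16 broad factors, 6 of them tentative (the asterisked entries). The dictionary below is simply a transcription of the table, not part of the CHC model itself:

```python
# CHC broad ability factors from Table 9.1, keyed by factor code.
# Each entry: (domain, tentative?). Tentative factors are the
# asterisked entries in the table.
chc_factors = {
    "Gf":  ("Domain Free", False),
    "Gsm": ("Memory", False),
    "Glr": ("Memory", False),
    "Gps": ("General Speed", True),
    "Gs":  ("General Speed", False),
    "Gt":  ("General Speed", False),
    "Gk":  ("Motor", True),
    "Gp":  ("Motor", True),
    "Go":  ("Sensory", True),
    "Gh":  ("Sensory", True),
    "Gv":  ("Sensory", False),
    "Ga":  ("Sensory", False),
    "Grw": ("Acquired Knowledge", False),
    "Gq":  ("Acquired Knowledge", False),
    "Gc":  ("Acquired Knowledge", False),
    "Gkn": ("Acquired Knowledge", True),
}

print(len(chc_factors))                                         # 16 broad factors
print(sum(tent for _, tent in chc_factors.values()))            # 6 tentative
```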
Intelligence Testing
It would make sense that the models of intelligence just discussed form the basis for intelligence tests. So, it is not surprising that over the years a number of intelligence tests have been devised to measure such factors as general intelligence (g), specific intelligence (s), fluid and crystallized intelligence, and other factors traditionally seen to be related to intellectual ability. However, one should keep in mind that intelligence tests measure only a portion of the competencies involved in human intelligence. In fact, such tests are best seen as estimates of performance in school, work, and the broad range of life activities. Although there is likely some innate capacity being measured by intelligence tests, the variability of intelligence test scores is affected by a wide range of family, cultural, and societal factors (Rose, 2006), and to some degree, IQ is a reflection of how well individuals have mastered middle-class facts, concepts, and problem-solving strategies. This is highlighted by the fact that intelligence test scores are not necessarily fixed, as some persons will exhibit significant increases or decreases in their measured intellectual abilities.
Although a number of intelligence tests are available today, the Stanford-Binet and the three Wechsler scales of intelligence are the most widely used. Thus, we examine these tests as well as another popular intelligence test known as the Kaufman Assessment Battery for Children, Second Edition. We also briefly introduce the concept of nonverbal intelligence tests and introduce the Comprehensive Test of Nonverbal Intelligence, Second Edition (CTONI-2), the Universal Nonverbal Intelligence Test, Second Edition (UNIT2), and the Wechsler Nonverbal Scale of Ability (WNV).
Chapter 11
Objective Personality Testing
As discussed in Chapter 1, objective personality testing is a type of personality assessment that mostly uses multiple-choice, true/false, and related types of formats to assess various aspects of personality. Each objective personality test measures different aspects of an individual’s personality based on the specific constructs defined by the test developer. For example, the Minnesota Multiphasic Personality Inventory (MMPI) measures psychopathology and is used to assist in the diagnosis of emotional disorders. The Myers-Briggs Type Indicator (MBTI)® measures personality based on a construct created by Carl Jung related to how people perceive and make judgments about the world, and the Substance Abuse Subtle Screening Inventory (SASSI)® is used to assess the probability of one having substance abuse issues. Although these three tests measure very different aspects of personality, they all can be useful in developing a picture of a client. Let’s take a look at some of the more common objective personality tests and explore how they are used.
Projective Testing
In Chapter 1, projective personality testing was defined as a type of personality assessment in which a client is presented a stimulus to which to respond, and subsequently, personality factors are interpreted based on the client’s responses. We noted that such testing is often used to identify psychopathology and to assist in treatment planning.
When interpretations about client responses are made, they are often based on normative data. With projective testing, however, clients respond in an open-ended manner to vague stimuli, resulting in a wide range of responses that often limits norm-group comparisons. Thus, the validity and reliability of these instruments are not as solid as those of most objective tests, and projective tests should never be used alone. Although they are powerful tools that can elicit information often not found through objective tests, projective instruments must be paired with objective measures as well as the clinical interview and other collateral data to make a well-rounded assessment. Dozens of projective tests exist, and we explore some of the more prevalent ones in the following sections.
Common Projective Tests
The most popular projective tests include the Thematic Apperception Test (TAT); the Rorschach Inkblot Test; the Bender Visual-Motor Gestalt Test, Second Edition; the House-Tree-Person (HTP); the Kinetic House-Tree-Person Test (KHTP); the Sentence Completion Series; and the Rotter Incomplete Sentence Blank. The following discussion offers a brief overview of these instruments.
The Thematic Apperception Test and Related Instruments
The Thematic Apperception Test (TAT) was developed by Henry Murray and his colleagues in 1938 and consists of a series of 31 cards with vague pictures on them, although only 8 to 12 cards are generally used during an assessment depending on the age and gender of the client as well as the client’s presenting issues. Showing the cards one at a time, the examiner asks the client to create and describe a story that has a beginning, middle, and end. The storytelling process allows great access to the client’s inner world and shows how that world is affected by the client’s needs and by environmental forces, known as press.
The ambiguous pictures on the cards are more structured than inkblot tests such as the Rorschach; consequently, the TAT tends to draw out from the client issues related more to current life situations than deep-seated personality structures (Groth-Marnat, 2003) (see Figure 11.4). The TAT is based on Murray’s need-press personality theory, which states that people are driven by their internal desires, such as attitudes, values, goals, and so on (needs), or by external stimuli (press) from the environment. Therefore, individuals are constantly struggling to balance these two opposing forces.
The TAT has been extensively researched; however, it still lacks the level of standardization that most objective personality tests have achieved (Groth-Marnat, 2003). There is no universally agreed-upon scoring and interpretation method, although most clinicians use a qualitative process of interpreting responses; hence, there is considerable controversy over the reliability and validity of the instrument. When scoring systems have been used in controlled settings, interscorer reliability has been seen to range between 0.83 and 0.92 and test-retest reliabilities between 0.64 and 0.83 (Ronan, Gibbs, Dreer, & Lombardo, 2008); however, if responses are interpreted by clinicians outside of more controlled environments, these figures would likely drop (Groth-Marnat, 2003). Studies of the TAT’s validity are controversial, with some researchers arguing in favor of the instrument and others against it (see Lilienfeld, Wood, & Garb, 2000 vs. Woike & McAdams, 2001). However, others contend that due to the nature of projective tests, evidence of validity for these instruments is not as important as it is for objective measures (Karon, 2000). For instance, some suggest that the rich narrative detail developed through the TAT gives the therapist a unique window into the client’s psyche. Indeed, the value of the TAT seems to be supported by its widespread use: it is the sixth most frequently used test by both clinical and counseling psychologists (Camara et al., 2000) and is heavily used by counselors and counselor educators (Neukrug et al., 2013; Peterson et al., 2014) (see Table 1 and Table 2 in the Section III Introduction).
Because the cards are dated and the human figures in them are almost exclusively White, many of the cards may raise concerns about historical and cross-cultural bias. The Southern Mississippi TAT (SM-TAT) and the Apperceptive Personality Test (APT) are two attempts to counter some of these problems. The APT has only eight cards with multicultural pictures and an objective scoring method. Although the SM-TAT and APT are probably superior instruments (more modern, more rigorous methodology, and greater validity), the long tradition of the TAT will probably prevent its replacement (Groth-Marnat, 2003).
In addition to the TAT, the Children’s Apperception Test (CAT) has been developed for children ages 3 to 10. Because children have shorter attention spans than adults, the instrument has only 10 cards, and because children tend to relate more easily to animals than to humans, the pictures depict animals. A later version of the CAT, the CAT-H, depicts humans instead. Despite the development of these instruments, the TAT is still frequently used with children, probably due to its familiarity among clinicians.
Rorschach Inkblot Test
Herman Rorschach developed his famous inkblot test in 1921 by splattering ink onto pieces of paper and folding them in half to form a symmetrical image (see Figure 11.5). After much experimentation, he chose 10 cards to create the Rorschach Inkblot Test that is still used today. When giving the Rorschach, clinicians show clients the cards, one at a time, and ask them to talk about what they see on the card. A follow-up inquiry with clients addresses issues of what they actually saw, how they saw it, and where on the card it was seen. Ultimately, the clinician wants to see exactly what the client saw on the card.
Figure
11.5.
Inkblot Similar to Those on the Rorschach
Source: Lambert/Archive Photos/Getty Images
Rorschach, a student of Carl Jung, believed the ambiguous shapes of the inkblots allowed test-takers to project their unconscious minds onto the images. By 1959 the Rorschach had become the most frequently used instrument in clinical practice (Sundberg, 1961), and it continues to be one of the most frequently used projective personality tests (Camara et al., 2000; Hogan, 2005; Neukrug et al., 2013; Peterson et al., 2014). Although it has had tremendous popularity, it has also been closely scrutinized and criticized. By 1955, more than 3,000 journal articles had been written about it (Exner, 1974), and a recent ERIC and PsycInfo database search shows more than 8,500 articles in which the Rorschach is cited. The greatest difficulty with the Rorschach has been providing adequate evidence of validity. Another challenge is that the test requires extensive training and practice to use. However, we believe this instrument still has merit and can be a useful tool in the assessment process (see Box 11.1).
Box
11.1.
Rorschach Use in Clinical Practice
Card IV of the Rorschach is known by some as the “father card” because it shows what some see as a hefty, overbearing figure with a large penis. When I was giving the Rorschach to a 17-year-old female high school student, I obtained what is sometimes called a “shock response.” I showed her this card and although she had made fairly normal responses to the other cards, when seeing this card she said to me, rather emphatically, “I see nothing there.” Upon inquiry later, I again asked her to give me a response to that card, at which point she firmly put the card face down and said, “I told you I didn’t see anything there.” When all testing was complete, I looked at the young woman and asked, “Were you molested?” at which point she broke down and started to sob …. This was the beginning of counseling for this young woman who had never shared this secret before.
( Ed Neukrug )
One of the most popular scoring systems for the Rorschach was developed by Exner (1974). This system uses three components: location, determinants, and content. Location is the portion of the blot to which the response occurred, and the examinee’s responses are broken down into categories such as the whole blot (W), common details (D), unusual details (Dd), and white space details (S). Determinants describe the manner in which the examinee understood what he or she saw and are broken down into
(1)form (“that looks just like a bat”),
(2)color (e.g., “it’s blood, because it’s red”), and
(3)shading (“it looks like smoke because it’s grayish-white”).
Finally, content is scored based on 22 categories such as whole human, human detail, animal, art, blood, clouds, fire, household items, sex, and so on. Specific content can hold meaning; for instance, a goat can be an indication of a person being obstinate, or a number of animal responses by an adult could be an indication of immature psychosexual development (children tend to include lots of animals in their responses). Once all of the data have been recorded, a fairly complex series of calculations is used to create numerical ratios, percentages, and derivations with which to interpret the results. Scoring systems such as Exner’s are very complex and are important ways of managing the large amount of interpretive material the client is presenting.
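The bookkeeping behind such a system can be pictured as records of coded responses from which summary percentages are computed. The sketch below is a simplified illustration of that idea only; the example protocol is invented, and the whole-blot percentage shown is illustrative arithmetic, not one of Exner’s actual ratios:

```python
# Simplified sketch of Rorschach response coding in the spirit of Exner's
# system: each response gets a location code (W, D, Dd, S), a determinant,
# and a content category, and summary percentages are computed afterward.
# This is illustrative bookkeeping, not Exner's actual formulas.
from dataclasses import dataclass

@dataclass
class Response:
    card: int          # card number, 1-10
    location: str      # "W", "D", "Dd", or "S"
    determinant: str   # e.g., "form", "color", "shading"
    content: str       # e.g., "animal", "whole human", "clouds"

# A tiny invented protocol (real protocols record every response per card).
protocol = [
    Response(1, "W", "form", "animal"),
    Response(2, "D", "color", "blood"),
    Response(3, "W", "form", "whole human"),
    Response(4, "Dd", "shading", "clouds"),
]

# Percentage of whole-blot (W) responses across the protocol.
w_pct = 100 * sum(r.location == "W" for r in protocol) / len(protocol)
print(w_pct)  # 50.0
```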
Bender Visual-Motor Gestalt Test, Second Edition
Lauretta Bender originally published the Bender Visual-Motor Gestalt Test in 1938, and after several revisions, it is now called the Bender Gestalt II. A brief test that takes only 5 to 10 minutes to administer, it measures an individual’s developmental level and psychological functioning and is also used to assess neurological deficits after a traumatic brain injury (Pearson, 2018h). The test asks children aged 4 to 7 and individuals aged 8 to 85+ to copy the nine figures shown in Figure 11.6. Children aged 4 to 7 have four additional figures to replicate, and individuals aged 8 to 85+ have three additional figures to copy.
Figure 11.6. The Bender Gestalt Figures
© Cengage
Above are reproductions of four of the nine figures of the Bender Visual-Motor Gestalt from two children who are developmentally on target. When Hannah couldn’t reproduce figure 6, she became very frustrated. Look at how much easier it is for the older child, Rebecca, to reproduce the figures.
The latest version of the instrument, the Bender Gestalt II, used a norm group of 4,000 individuals who were representative of the 2000 U.S. Census (MHS, n.d.b). This version uses a new, global, five-point scoring system. A score of 0 represents no resemblance or scribbling, and a score of 4 represents a nearly perfect drawing (Brannigan, Decker, & Madsen, 2004). This version provides standard scores, T-scores, and percentile ranks. As you might suspect, the test can examine psychomotor development of children by comparing a child’s response to mean responses of children belonging to the child’s age group; personality factors such as compulsivity in completing the drawing accurately (graduate students!); and neurological impairment, such as might be evidenced when an individual cannot accurately place the diamond in drawing 8 (see Figure 11.6).
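The norm-referenced logic behind those standard scores, T-scores, and percentile ranks can be illustrated with a short sketch: sum the 0-to-4 ratings across figures, compare the total with the mean and standard deviation of the child’s age group, and express the result as a T-score (mean 50, SD 10). The age-group mean and standard deviation below are invented for the example; the real Bender Gestalt II norms come from the published manual.

```python
# Hypothetical norm-referenced scoring sketch (invented norm values).
def to_t_score(raw, age_group_mean, age_group_sd):
    # z-score: how far the child's raw total falls from the age-group mean,
    # in standard-deviation units; T-score rescales z to mean 50, SD 10.
    z = (raw - age_group_mean) / age_group_sd
    return 50 + 10 * z

ratings = [4, 3, 3, 2, 4, 3, 2, 3, 4]  # one 0-4 global rating per figure
raw_total = sum(ratings)                # 28
print(to_t_score(raw_total, age_group_mean=24.0, age_group_sd=4.0))  # 60.0
```

A child scoring one standard deviation above the age-group mean, as here, would earn a T-score of 60, which is the comparison-to-age-peers idea described above.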
Accurately interpreting a number of these factors takes advanced training and should not be attempted without such preparation. The original version of the Bender Gestalt showed some evidence that it was measuring the factors it purported to measure, and reliability data showed test-retest reliability at 0.84, while interrater reliability was shown to be in the low to mid 0.90s (Naglieri, Obrzut, & Boliek, 1992). Evidence of convergent validity is demonstrated in studies examining scores on the Bender-Gestalt II with visual-spatial subtests on the WISC-III (Decker, Allen, & Choca, 2006) and quantitative reasoning, fluid reasoning, and visual-spatial factors on the Stanford-Binet intelligence test (SB-5; Decker, Englund, Carboni, & Brooks, 2011). Predictive validity has been demonstrated by showing that kindergarten children’s Bender-Gestalt scores are predictive of academic achievement, social adjustment, and emotional adjustment one year later (Bart, Hajami, & Bar-Haim, 2007).
House-Tree-Person and Other Drawing Tests
It’s not quite clear when the use of drawing tests began, but over the last 60 years they have become some of the most popular, simple, and effective projective devices. By asking clients to draw simple pictures, one can gain tremendous insight into a client’s life and perhaps unconscious undertow.
Buck (1948) introduced the original House-Tree-Person (HTP) drawing test when he simply asked clients to draw a house, a tree, and a person on three separate pieces of paper. The Kinetic House-Tree-Person Drawing Test (KHTP) is slightly different in that the client is asked to draw all the figures “with some kind of action” on one sheet of paper (8½ × 11, presented horizontally) (Burns, 1987, p. 5). Burns believes the tree is the universal metaphor for human development as seen in religion, myth, poetry, art, and sacred literature:
In drawing a tree, the drawer reflects his or her individual transformation process. In creating a person, the drawer reflects the self or ego functions interacting with the tree to create a larger metaphor. The house reflects the physical aspects of the drama. (p. 3)
Numerous books and materials describe how to specifically interpret the H-T-P and K-H-T-P drawings. Table 11.10 provides examples of a few interpretive suggestions (Burns, 1987).
Table 11.10. Sample of Suggested K-H-T-P Interpretations (Burns, 1987)
Characteristic | Interpretation

General
Unusually large drawings | Aggressive tendencies; grandiose tendencies; possible hyper or manic conditions
Unusually small drawings | Feelings of inferiority, ineffectiveness, or inadequacy; withdrawal tendencies; feelings of insecurity
Very short, circular, sketch strokes | Anxiety, uncertainty, depression, and timidity

House
Large chimney | Concerns about power, psychological warmth, or sexual masculinity
Very small door | Reluctant accessibility; shyness
Absence of windows | Suggests withdrawal and possible paranoid tendencies

Tree
Broken or cut-off branches | Feelings of trauma
Upward-pointed branches | Reaching for opportunities in the environment
Slender trunk | Precarious adjustment or hold on life

Person
Unusually large head | Overvaluation of intelligence; dissatisfaction with body; preoccupation with headaches
Hair emphasis on head, chest, or beard | Virility striving; sexual preoccupation
Wide stance | Aggressive defiance and/or insecurity
© Cengage
Other drawing tests exist, such as the Draw-A-Man, Draw-A-Woman, or the Kinetic Family Drawing (KFD), which asks the individual to draw his or her family doing something together. These tests all try to tap into unconscious aspects of the individual’s self by focusing on slightly different content. Drawing tests do not require artistic prowess on the part of the client, are quickly administered, and can often produce important interpretive material for the clinician.
Sentence Completion Tests
“This book is ____.” Completing a sentence has been used as a projective device since Galton and Jung (see Chapter 1), and although you are not likely to see a sentence stem asking about a specific book, you are likely to see sentence stems that ask you to describe your relationship with your mother, father, spouse, lover, friends, and so on. Two of the more common sentence completion tests are the Sentence Completion Series and the Rotter Incomplete Sentence Blank. In addition, clinicians will sometimes create their own sentence completion tests that can be used when giving a personality battery.
The Sentence Completion Series is a “semi-projective” series of tests for gathering personality and psychodiagnostic information from adolescents and adults. The series contains eight forms with 50 sentence stems per form (PAR, 2012d). Individual forms address specific issues such as family, marriage, work, aging, and so on. An interpretive manual makes very broad suggestions about how to interpret the responses, such as assessing the examinee’s general tone and level of defensiveness; however, there is no objective scoring methodology. In addition, no reliability, validity, or norm data are available. It is certainly a useful instrument, but its greatest weakness is the lack of any psychometric data (Moreland & Werner, 2001).
The Rotter Incomplete Sentence Blank®, Second Edition (RISB®-2; Pearson, 2018i), is designed to assess the overall adjustment of high school students, college students, and adults. The instrument has 40 items, each beginning with a one- or two-word sentence stem to be completed. It takes approximately 20 minutes to complete and has a semi-objective scoring method that rates each response on a seven-point ordinal scale (Boyle, 1995). The norm group in the manual was composed of college students, which brings into question the instrument’s validity for use with high school students and adults. Researchers have shown internal consistency alphas of 0.78, split-half alphas of 0.76, and interrater agreement of 0.92 (Logan & Waehler, 2001; Weis, Toolis, & Cerankosky, 2008). Results also suggested convergent validity between RISB-2 scores and self-report, parent report, and teacher report. Criterion-related validity was suggested by the finding that students with scores of 140 or higher could be screened for maladaptive behaviors (Weis et al., 2008).
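The semi-objective arithmetic described above (rate each of the 40 completions on an ordinal scale, sum the ratings, and compare the total against the 140 screening cutoff) can be sketched as follows. The 0-to-6 coding direction and the sample ratings are assumptions made for illustration, not the RISB-2 manual’s actual rating rules.

```python
# Hedged sketch of semi-objective sentence-blank scoring: each of the
# 40 items receives an ordinal rating (assumed here to run 0 = most
# adjusted through 6 = most conflicted), the ratings are summed to an
# overall adjustment score, and totals at or above the screening cutoff
# reported in the text (140) flag the protocol for follow-up.
CUTOFF = 140

def overall_adjustment(item_ratings):
    assert len(item_ratings) == 40, "the instrument has 40 items"
    assert all(0 <= r <= 6 for r in item_ratings), "seven-point scale"
    return sum(item_ratings)

fabricated_ratings = [3] * 30 + [5] * 10   # hypothetical protocol
total = overall_adjustment(fabricated_ratings)
print(total, total >= CUTOFF)              # 140 True
```

The semi-objective part, of course, lies in assigning each rating in the first place, which requires clinical judgment; only the summation and cutoff comparison are mechanical.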
Although questions about the validity and reliability of sentence completion tests and other projective measures remain, it is clear that these instruments can provide a quick method of obtaining a client’s feelings and unconscious thoughts about important issues in the client’s life (Youngstrom, 2013).
The Role of Helpers in Clinical Assessment
Because clinical assessment can add one more piece of knowledge about a client, it should always be considered an additional tool to use. Therefore, all helpers can, and perhaps should, be involved in some aspects of clinical assessment. For instance, an elementary school counselor might consider using a self-esteem inventory when working with young children to help identify students who are struggling with self-esteem issues, while a high school counselor might want to use any of a wide range of objective personality measures to help identify concerns and aid in setting goals for students. College counselors, agency clinicians, social workers, and private practice professionals all might use a wide range of clinical assessment tools as part of their repertoire to help identify issues and devise strategies for problem solving. Clinicians should reflect on what kinds of clinical assessment tools would benefit their clients in their particular setting and whether they have sufficient training to give such assessments (e.g., projective tests generally require advanced training).
eTextbook: Essentials of Testing and Assessment: A Practical Guide for Counselors, Social Workers, and Psychologists, Enhanced