
9  Coaching Issues

Barbara Griffin

Overview
The use of commercial coaching services to gain a competitive advantage has become an issue across industries and countries, particularly in the high-stakes contexts of employment selection and admission into sought-after university degrees, including law, business and medicine. Indeed, it appears so pervasive that by 2004 McGaghie, Downing, and Kubilius (p. 203) were claiming that commercial test preparation had become “part of the fabric of U.S. medical education”.

This chapter begins with a definition of the term “coaching” and a brief description of its development within educational settings, including information on the prevalence of the use of commercial coaching. A discussion of the potential impact of coaching, both positive and negative, follows. Empirical evidence is then examined, considered in the light of methodological limitations of the extant literature. The chapter concludes with a summary of the practical implications, illustrated by a case study of how one university adjusted its process to limit the effect of coaching on selection decisions. Given the limited research available in the broader healthcare professions, most of the statistics presented here relate to medicine and the selection of medical students.

B. Griffin (*)
Macquarie University, Sydney, Australia
e-mail: [email protected]

© The Author(s) 2018
F. Patterson and L. Zibarras (eds.), Selection and Recruitment in the Healthcare Professions, https://doi.org/10.1007/978-3-319-94971-0_9
By the end of this chapter you should:

• understand how and why commercial coaching has developed into a significant industry
• be able to evaluate the potential benefits and risks of coaching for selection of healthcare professionals
• have some evidence for developing processes designed to limit any unwanted effects of coaching on selection decisions.

Definition, Description and Prevalence


There is a broad range of education and training activities encompassed under the umbrella term of “coaching”. For example, athletes can be coached to improve their underlying skill, and in educational settings students can seek external support in developing skills such as understanding mathematics, improving essay-writing skill, or revising resuscitation techniques. However, the coaching discussed in this chapter concerns commercial training in test-taking strategies. This form of coaching focuses on a specific test (usually used for selection purposes) with the aim of improving test scores, regardless of whether or not the underlying skill or ability measured by the test is improved.

Lievens, Buyse, Sackett, and Connelly (2012, p. 273) define this sort of coaching as “a formal intervention of a coaching firm to teach candidates test-related content and test-taking strategies (i.e., test familiarization, drill and practice with feedback, training in strategies for specific item formats and for general test taking)”. The focus is typically on test content and drill rather than skill development and is largely instructor-driven.
Commercial coaching implies that applicants must pay to attend such a program. Within the education literature, commercial coaching falls under the rubric of what is classed as “shadow education”: activities that fall outside of formal schooling (Buchmann, Condron, & Roscigno, 2010). Researchers typically distinguish the commercial coaching industry from the large array of informal or freely available preparation activities. Some describe the latter as “test familiarization”, which is driven by the applicant and likely to reduce individual differences in test familiarity (Arendasy, Sommer, Gutierrez-Lobos, & Punter, 2016).
The first recorded commercial coaching company was founded by Stanley Kaplan in 1938. His focus was on increasing clients’ scores on the Scholastic Aptitude Test (SAT), used for selection into tertiary education in the United States of America. With successful marketing, it was not long before the coaching industry was flourishing. It remains a significant force today, with substantial revenue. In 2016, Kaplan Inc’s test preparation division alone earned US$287 million (Graham Holdings, 2016), with one of its Medical College Admission Test (MCAT) courses advertised for US$9499 per person. A basic coaching program for the Undergraduate Medicine and Health Sciences Admission Test (UMAT), used in Australia and New Zealand for admission into health-related training programs, can cost up to AUS$1970 (MedEntry, 2017). In Britain, expensive coaching courses for the UK Clinical Aptitude Test (UKCAT) are offered not only locally but also in many countries across Asia.
The coaching industry appears to respond quite rapidly to trends in selection, as seen in the development of manuals and courses designed specifically to address the growing use of situational judgement tests for selection (Lievens et al., 2012) and multiple mini interviews (e.g., To, 2013). The companies, which generally operate outside national education accreditation regulations, undertake quite aggressive marketing, arguably capitalizing on the anxiety and vulnerability of applicants who face significant hurdles in becoming one of the small proportion actually offered a place in a medical or other health science degree (Tompkins, 2011). For example, one company (MedEntry, 2017) not only asserts on its website that 90% of its “students” have been offered places in one or more medical schools but makes the somewhat grandiose claim that “our training is geared towards making you a better person, with better reasoning and problem-solving skills, and emotional intelligence.”
But just how prevalent is the uptake of commercial coaching? Despite anecdotal accounts of its ubiquitous use in high-stakes selection contexts, there are limited data available (Stemig, Sackett, & Lievens, 2015), and figures may be downwardly biased if applicants believe they may somehow be disadvantaged by admitting to having been coached. In relation to applicants to medical schools, Stemig et al. (2015) report that while 69% of those applying in Belgium completed freely available practice exercises, just 26% paid for coaching. In a study of those sitting the UKCAT (Lambe, Waters, & Bristow, 2012), only 9% of respondents admitted to attending a commercial coaching course, although significantly more made use of commercially available resources. This contrasts with the figures from New Zealand (Wilkinson & Wilkinson, 2013), where 60% of applicants to healthcare programs said they were coached by a commercial company. In Australia, reported prevalence ranges from 36% (Laurence, Zajac, Lorimer, Turnbull, & Sumner, 2013) to 56.2% (Griffin, Carless, & Wilson, 2013a), while one study in Israel indicated that 20–25% of applicants attended commercial coaching (Moshinsky, Ziegler, & Gafni, 2017). Similarly, an early study of MCAT test takers in North America (Jones, 1986) estimated that 25% used commercial coaching. However, this figure is unlikely to reflect current prevalence, given increasing competition for places and evidence that the number of MCAT practice tests sold increased by over 20,000 in the two years from 2007 to 2009 (Matthew, 2010).

Potential Impact of Coaching


This section outlines the four major areas that raise concerns about the impact of coaching on the use of selection tests, namely the implications for widening participation of disadvantaged applicants, the possible impact on the construct validity and predictive validity of the test, and how applicant well-being might be affected.

Widening Participation

Given the potential for higher education to improve social mobility (Brown, Reay, & Vincent, 2013), governments have enacted policies that encourage and enable those from disadvantaged groups to enroll in university degrees. Initially the focus was on gender and ethnicity, where there has been some success in widening participation (Cleland, Nicholson, Kelly, & Moffat, 2015), but more recently attention has turned to the lack of diversity with regard to socioeconomic status (SES). This is highlighted within healthcare training programs, particularly medicine, psychology, and dentistry. Griffin and Hu (2015), for example, showed that there are not only fewer low-SES-background applicants for medical degrees, but that the cognitive ability testing used in making selection decisions has a further adverse impact on these applicants, disproportionately reducing their success rate. The high cost of commercial coaching (see above) may be one factor that accounts for these findings (see Chapter 10 for a review).
As discussed later, many applicants have quite an entrenched view of the efficacy of commercial coaching (Moshinsky et al., 2017; Wilkinson & Wilkinson, 2013). Low-SES individuals, lacking the financial resources to access coaching, may not even bother applying if they believe that their chance of success is minimal without it. Alternatively, if they are among the few aspirants in a low-SES high school environment, they may not even be aware of opportunities for coaching on university entrance tests (Arendasy et al., 2016).

If such coaching does actually increase test scores, then lack of access to coaching (either because one cannot afford it or lives remote from the urban training centers) creates an issue of measurement fairness in the use of testing. Not only is it ethically unfair that these applicants do not have the opportunity to increase performance, but the test is also psychometrically biased because, as outlined in the next section, it could measure different latent constructs for uncoached compared to coached applicants, invalidating any rank ordering (Arendasy et al., 2016). So seriously is this viewed that some have argued that shadow education activities, such as coaching, could well magnify inequality and undermine efforts to provide education fairly across all sectors of a country (Arendasy et al., 2016).

Construct Validity

Coaching may alter the construct validity (what is being measured) of a test, in either a positive or negative manner.

First, if coaching familiarizes people with the test format, reduces their anxiety, and increases their self-confidence about being able to perform on the test, then it can actually remove error variance from the test scores, thereby enhancing construct validity. In other words, the test score has the noise, or construct-irrelevant variance, taken out of it and is therefore more likely to reflect whatever it is supposed to measure (Ryan, Ployhart, Greguras, & Schmit, 1998).
Second, coaching may have the opposite effect by increasing construct-irrelevant variance, thus undermining measurement by artificially inflating scores. This outcome has been attributed to coached applicants having increased “test-wiseness”, defined as understanding rules or procedures for how to take a test (Chung-Herrera, Ehrhart, Solamon, & Kilian, 2009). When coaching focuses on drill and teaching specific strategies for answering test items, it can result in higher scores without actually altering the test-taker’s true level of the construct (e.g., cognitive ability) that the test was designed to measure (Millman, Bishop, & Ebel, 1965).

Fig. 9.1  Example item from UMAT Section 3 (used with permission, UMAT Candidate Information Booklet, ACER, 2017)
Arendasy et al. (2016) describe a third alternative (other than coaching having no effect on construct validity), whereby coaching might increase the very specific ability assessed in the test, but the improved ability will apply only to those types of test items. The ability to generalize knowledge or ability beyond that domain will be significantly compromised. For example, a test such as the UMAT Section 3 assesses abstract reasoning via items that require non-verbal pattern recognition, such as identifying the next picture in a series of pictures (see Fig. 9.1 for an example). Coaching to assist in answering these types of items might well improve performance on similar items but will not generalize to improving overall abstract reasoning.

Predictive Validity

It follows that if the construct validity of a test is not altered, then there should be no changes to its predictive validity. However, if coaching removes construct-irrelevant variance (e.g., anxiety, slowness due to unfamiliarity), then predictive validity may well be improved. If, on the other hand, the test is a less pure assessment of the quality or construct it was designed to measure due to the effect of coaching, we might expect the predictive validity of the test to be degraded. In this case, the scores of coached applicants should over-predict their later performance—they will not, for instance, achieve as good a GPA once enrolled in a degree as their selection test result might indicate. Evidence of measurement bias, such as altered predictive validity, could increase the risk of litigation for any institution using the test.

Applicant Well-Being

Much of the attention in research on coaching has been from the perspective of the educational or employing institution, with far less focus on the applicant’s reaction to coaching. And yet, if we have a duty of care to applicants and future students, then the potential effect of coaching on their well-being is worth considering.

Selection into one’s preferred training program can be the fulfilment of long-held career aspirations. When high levels of uncertainty are attached to this outcome due to the extremely competitive nature of the admissions process, it is bound to engender stress and anxiety in the applicant. It appears that many applicants firmly believe that coaching will improve their chances of selection (Bardes, Best, Kremer, & Dienstag, 2009; Wilkinson & Wilkinson, 2013), and so it is not surprising that there is a relatively high uptake of commercial coaching. There is little research to show whether coaching does reduce this test-related anxiety. However, anecdotally, it seems that the coaching industry, by means of aggressive marketing, capitalizes on this selection anxiety and may even make it worse in its effort to secure more clients.

Also not generally considered is the ongoing effect on coached applicants who end up being selected. If their test scores over-predict performance, we would expect students who were coached on admission tests and who fall into the category of “false positive” (should not have been selected due to inadequate true ability) to under-perform academically, possibly at risk of failure. By degrading the validity of a selection decision, coaching might have the long-term effect of creating greater stress in students who struggle to maintain satisfactory performance.

In the next sections, the empirical evidence regarding the four issues identified above is presented.

Empirical Evidence of the Effect of Coaching


There are decades of research in educational settings on the effect of test preparation, including several large meta-analyses. Much of this has been concentrated on the Scholastic Aptitude Test (SAT), used for selection into college in North America. More recently, researchers have turned from general college admission to the more high-stakes context of medical student selection. Before examining these findings, it is important to understand the methodological issues that can influence studies of coaching, to the extent that the design might result in an under- or over-estimation of any effects. The next section reviews four factors that potentially impact study results: type of participant, type of test, type of coaching, and study design.

Methodological Challenges to Studying Coaching Effects

First is the problem of generalizability of findings across participant groups. There is likely to be, for example, a greater range of ability among those undertaking the SAT exam (the bulk of research studies) than among those who are applying for medicine or another high-stakes, high-ability-requirement degree. Not only are there restriction-of-range issues in the latter, but studies have shown that pre-existing ability influences the extent to which an individual will benefit from coaching (Griffin, Carless, & Wilson, 2013a). Therefore, when there is a broader range of ability among a coached group, the effect of coaching might be diminished.
Second is the problem of generalizability of findings across tests. It appears that some tests and some test items (e.g., the UMAT Section 3 items compared to the UMAT Section 2 items) are more easily coachable than others (Domingue & Briggs, 2009; Griffin et al., 2013a), and therefore omnibus conclusions about the effectiveness of coaching may not be warranted.

Third is the problem of generalizability of findings across types of coaching. As indicated above, coaching activities can range from simple test familiarization to paid one-on-one tutoring by commercial companies. Studies (e.g., Lievens et al., 2012) that compare these typically find differential effects, which can also depend on the length of time spent in the coaching “intervention” (Ryan et al., 1998).
Perhaps most important is the actual study design. Apart from the fact that the majority of, if not all, field studies rely on self-reports of coaching behavior, a significant proportion of the empirical work on the effectiveness of coaching has been criticized for its lack of rigorous control and potential bias through the unavoidable self-selection that occurs into coaching programs (McGaghie et al., 2004). For example, applicants who seek coaching are found to have higher motivation and higher test anxiety than those who choose not to be coached, and they are more likely to be male, from a non-minority culture, and of higher SES (Domingue & Briggs, 2009; Ryan et al., 1998; Stemig et al., 2015). It may well be these differences that influence test scores rather than the coaching itself. Indeed, Powers (1993) reports that researchers who do not control for self-selection find effects four to five times greater than those found in more rigorously designed studies.

Although laboratory studies allow random assignment into coached and uncoached groups, their fidelity is questionable, as the typical research participant is unlikely to have the same motivation as a real-world applicant. However, because of the high-stakes nature of many of the selection contexts, it would not really be ethical to conduct prospective randomized trials on real-world applicants. More recently, propensity score matching has been used to overcome some of the self-selection bias (see Domingue & Briggs, 2009; Lievens et al., 2012). Propensity scoring involves identifying a set of covariates likely to influence outcomes, then estimating the probability each participant has of being in the coached group based on these. This enables comparisons between coached and uncoached groups by matching individuals with equal probabilities of having been coached.
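The propensity-matching logic just described can be sketched in a few lines of code. This is an illustrative simulation only, not data or code from any cited study: the covariates (SES, baseline ability), their coefficients, the sample size, and the 0.25 SD “true” coaching effect are all invented for demonstration.

```python
# Illustrative propensity score matching: self-selection into coaching
# inflates the naive group difference; matching corrects much of it.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Invented covariates that drive self-selection into coaching.
ses = rng.normal(size=n)          # socioeconomic status (standardized)
ability = rng.normal(size=n)      # baseline ability (standardized)

# Higher-SES, higher-ability applicants are more likely to be coached.
p_coach = 1 / (1 + np.exp(-(0.8 * ses + 0.5 * ability - 0.5)))
coached = rng.random(n) < p_coach

# Test score depends on ability and SES plus a modest true coaching effect.
score = ability + 0.3 * ses + 0.25 * coached + rng.normal(scale=0.5, size=n)

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Plain gradient-descent logistic regression (no external dependencies)."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return 1 / (1 + np.exp(-Xb @ w))  # fitted propensity scores

ps = fit_logistic(np.column_stack([ses, ability]), coached.astype(float))

# Nearest-neighbor matching: pair each coached applicant with the
# uncoached applicant whose propensity score is closest.
treated = np.flatnonzero(coached)
control = np.flatnonzero(~coached)
pairs = control[np.abs(ps[control][None, :] - ps[treated][:, None]).argmin(axis=1)]

naive_effect = score[coached].mean() - score[~coached].mean()
matched_effect = (score[treated] - score[pairs]).mean()
print(f"naive difference:   {naive_effect:.2f}")   # inflated by self-selection
print(f"matched difference: {matched_effect:.2f}")
```

With self-selection left uncontrolled, the naive group difference substantially overstates the coaching effect; matching on the propensity score recovers an estimate much closer to the simulated 0.25 SD, mirroring Powers’ (1993) observation above.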

Research Findings on Coaching Effects

As mentioned, the majority of early studies examined the effect of coaching on performance in the SAT, followed by studies of general cognitive ability or IQ tests. These have been summarized in a number of meta-analyses (e.g., Hausknecht, Halpert, Di Paolo, & Moriarty Gerrard, 2007; Kulik, Bangert-Drowns, & Kulik, 1984; te Nijenhuis, van Vianen, & van der Flier, 2007), which will be referred to briefly below. However, most of the studies cited in this section have been chosen for their relevance to healthcare (albeit predominantly in the selection of medical students).

Relationship of Socioeconomic Status (SES) to Coaching

Although limited, findings indicate that those from low-SES backgrounds are less likely to engage in commercial coaching for college entry tests such as the SAT (Buchmann et al., 2010). With regard to applicants to medical school, Griffin (2016) reported that those from lower-SES areas had a lower rate of coaching than those from high-SES areas, and that longer time spent living in an area of inequality predicted lower likelihood of engaging in coaching. However, it seems that this difference may not be due solely to financial reasons, as Stemig et al. (2015) showed that low-SES-background applicants for medical school were also less likely to attend freely available information sessions.

Effect of Coaching on Selection Test Performance

The following section reflects the fact that there is quite an extensive
literature on the effectiveness of coaching on tests of cognitive ability
(such as the SAT, IQ tests, MCAT, and UMAT), with far fewer studies
on other types of tests (such as interviews or personality tests).

1. Tests of cognitive ability

Meta-analytic evidence on tests used with wider populations, including the SAT (Kulik et al., 1984) and cognitive ability tests used in employee selection (Hausknecht et al., 2007), suggests that coaching has a significant positive effect on performance, although the increment of commercial coaching over mere test familiarization is relatively small at approximately 0.25 SD.
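The effects quoted throughout this chapter are standardized mean differences (Cohen’s d): the gain of the coached group over the uncoached group divided by the pooled standard deviation. A minimal sketch of the metric, with invented scores rather than data from any cited study:

```python
# Illustrative only: computing a coaching effect "in SD units" (Cohen's d).
# The two score lists below are invented for demonstration.
from statistics import mean, stdev

def cohens_d(treated, control):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * stdev(treated) ** 2 +
                  (n2 - 1) * stdev(control) ** 2) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / pooled_var ** 0.5

coached_scores = [52, 55, 58, 54, 57, 53]
uncoached_scores = [50, 53, 51, 49, 54, 52]
d = cohens_d(coached_scores, uncoached_scores)
print(round(d, 2))
```

On this scale, the roughly 0.25 SD increment reported for commercial coaching over test familiarization corresponds to a mean gain of a quarter of a standard deviation in test scores.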
The picture is less clear when it comes to tests used in medical student selection, such as the MCAT, UKCAT, and UMAT. Jones (1986) and McGaghie et al. (2004) found small and minimal positive effects, respectively, in studies of the effects of commercial coaching on the MCAT. Lambe et al. (2012) reported that coaching per se did not impact performance on the UKCAT, but time spent practicing or being coached did have a small positive relationship with scores. In contrast, the four recent studies that investigated the effects of coaching on the UMAT (Griffin et al., 2008; Griffin, Carless, & Wilson, 2013a; Laurence et al., 2013; Wilkinson & Wilkinson, 2013) showed a small benefit of coaching when overall score was examined. However, this benefit was driven by just one of the three sections of the test.
The UMAT’s three sub-sections are problem-solving, which relates primarily to verbal reasoning; understanding people, which relates to verbal and emotional intelligence; and non-verbal reasoning, which relates most strongly to numerical reasoning (Griffin, Carless, & Wilson, 2013b). Griffin and colleagues (2008, 2013a) and Wilkinson and Wilkinson (2013) showed that commercial coaching was unable to increase scores on the items assessing problem-solving or understanding people but did increase the non-verbal reasoning score. The non-verbal section is similar to tests like Raven’s progressive matrices (Raven, Raven, & Court, 1998), which consist of geometric designs with a missing piece that the test-taker must identify from a number of possible choices. Such tests require rapid pattern recognition, but apparently the solutions depend on a limited number of rules, which are likely to be implicitly learnt with regular practice (Carpenter, Just, & Shell, 1990; Verguts & de Boeck, 2002). Interestingly, this differential effect of coaching is also found in the non-medical domain, where verbal tests appear to have limited coachability, while numerical tests seem somewhat more sensitive to coaching interventions (Domingue & Briggs, 2009; Hausknecht et al., 2007; Ryan et al., 1998). Furthermore, there is consistent evidence (e.g., Griffin et al., 2008; Kulik et al., 1984; Ryan et al., 1998) that, contrary to expectations, the only applicants who benefit from coaching are those of high ability, who would probably do well regardless.
These studies of cognitive ability tests highlight an important feature of the effect of coaching on the construct validity of a test. Briefly, there is evidence supporting the hierarchical structure of cognitive ability (see Carroll, 1993), whereby narrow abilities (e.g., verbal, quantitative, and abstract reasoning) are sub-facets of a general cognitive ability factor, widely referred to as “g”. The variance in a cognitive ability test score comprises variance due to the narrow ability targeted by the test plus variance due to g. It is the g component of each test that is generalizable across domains and drives predictive validity for later performance (te Nijenhuis et al., 2007). There is ample evidence that a test’s g saturation is reduced by coaching (Estrada, Ferrer, Abad, Román, & Colom, 2015; Hayes, Petrov, & Sederberg, 2015; te Nijenhuis, Voskuijl, & Schijve, 2001; te Nijenhuis et al., 2007). In other words, coaching does not increase g or the applicant’s general intelligence, only (if anything) the very narrow ability assessed by the test. This principle was underscored in a major recent review of so-called “brain training” exercises (Simons et al., 2016), which demonstrated that practice and coaching on cognitive exercises may improve performance on closely related tasks but do nothing to improve everyday cognitive performance. Similarly, a study by Arendasy et al. (2016) of applicants to the Medical University of Vienna found that test preparation (including commercial coaching) did not lead to increases in g, and that gains in test-specific factors exhibited only minimal transfer to general science knowledge. These authors argue that coaching therefore does not compromise the fair use of cognitive ability tests for selection.
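The mechanism described above can be made concrete with a toy simulation. This is a hypothetical sketch, not data or a model from any cited study: a test score is modeled as g plus a narrow ability, and coaching is assumed only to inject extra construct-irrelevant “strategy” variance, which lowers the score’s correlation with a g-driven outcome. All loadings and sample sizes are invented.

```python
# Hypothetical simulation: coaching adds test-wiseness variance that is
# unrelated to g, reducing the score's correlation with later performance.
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

g = rng.normal(size=n)             # general cognitive ability
narrow = rng.normal(size=n)        # test-specific narrow ability
strategy = rng.normal(size=n)      # learnt test-taking tricks, unrelated to g
noise = rng.normal(size=n)

# Same g-loading (0.7) in both scores; coaching only injects extra
# construct-irrelevant "strategy" variance.
uncoached_score = 0.7 * g + 0.3 * narrow + 0.3 * noise
coached_score = 0.7 * g + 0.3 * narrow + 0.8 * strategy + 0.3 * noise

outcome = g + rng.normal(size=n)   # later performance, driven by g

r_unc = np.corrcoef(uncoached_score, outcome)[0, 1]
r_coa = np.corrcoef(coached_score, outcome)[0, 1]
print(f"predictive validity, uncoached: {r_unc:.2f}")
print(f"predictive validity, coached:   {r_coa:.2f}")
```

Because the coached score carries extra construct-irrelevant variance, its correlation with the g-driven outcome is visibly lower, even though the weight on g itself is unchanged: the score’s g saturation, not g, is what coaching alters in this sketch.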
In summary, applicants who engage in coaching may sometimes achieve higher cognitive ability test scores, but these scores reflect a somewhat different construct than those achieved by uncoached applicants, although the difference may not be sufficient to alter predictive validity. Nevertheless, there is little research on how much the coaching gains on cognitive ability tests, which are generally small in medical selection contexts, actually improve an applicant’s chance of selection.

2. Situational judgment tests

In recent years, there has been quite a rapid uptake of situational judgment tests (SJTs) for high-stakes selection, including in medical contexts. Perhaps unsurprisingly, this has been accompanied by a growth in coaching firms targeting SJTs in their courses and material (Lievens et al., 2012).

Lievens et al. (2012) estimated the positive effects of coaching on SJT performance to be as large as 0.5 SD. In a subsequent study (Stemig et al., 2015), a positive effect was also found for other test familiarization activities (attending an information session, completing practice exercises, etc.), although private tutoring actually had a negative effect on performance. Furthermore, the authors showed that higher SJT scores (as a result of preparation) were not necessarily accompanied by improved interpersonal skills.
As with any selection method, as its use increases in popularity, so too does demand for commercial coaching programs designed to teach candidates strategies for improving their SJT scores and thus their chance of success in the selection process (Lievens et al., 2008). Given that SJTs are a measurement method, factors relating to coachability and coaching effects will vary between individual SJTs. As such, the potential risks and impacts of coaching should be assessed for each SJT used in selection (Patterson, Ashworth, Kerrin, & Neill, 2013).
Although research on the “coachability” of SJTs is limited and merits further exploration, some studies have addressed this issue and contribute valuable points to consider in reducing the risk of coaching effects. The evidence suggests that elements of best-practice SJT development and design can also serve as precautionary measures that mitigate the risk of an SJT being “coachable”. First, SJTs designed for a specific target role should contain content that is highly contextualized for that role. Second, tailoring response instructions and formats to the test specification is also advisable (Patterson et al., 2013). Response formats that involve a greater cognitive load, such as ranking formats and “choose three” multiple-choice formats, are less susceptible to coaching effects than less cognitively loaded formats (e.g., “choose one best response from four”). Additionally, research supports the use of cognitively oriented response instructions (e.g., “What should you do?”) over behaviorally oriented instructions (i.e., “What would you do?”), as the latter are more susceptible to self-deception and impression management and thus to coaching effects (Patterson et al., 2013).

3. Personality inventories

Although there is widespread use of personality questionnaires for selection in corporate settings, their use for selection into healthcare professions or training programs is less common. There are also few studies on the effect of coaching on personality test scores—most of the research focus has been on applicants faking good (even without the assistance of coaching). For example, medical school applicants showed clear evidence of faking good on a personality test battery at the time of selection (using their answers after acceptance as a comparison), despite the fact that they were told the test would not be used to make selection decisions (Griffin & Wilson, 2012).

It is therefore not surprising that coached applicants (for the police force) achieved higher scores than uncoached applicants on a test of conscientiousness (Miller & Barrett, 2008). Interestingly, though, this study also found that coached applicants were able to fake the test more effectively than those who simply faked good without having had training on how to do so.

4. Interviews

Interviews, which are ubiquitous in corporate selection contexts, are also utilized to make selection decisions by a growing number of medical schools and medical specialty training programs and, increasingly, for entry into other health-related training programs such as psychology and physiotherapy. Nevertheless, there are few studies on the effects of coaching on selection interview performance, despite the fact that an internet search will reveal hundreds, if not thousands, of sites offering training designed to improve interview performance.

Using police and fire department employees applying for promotion, Maurer, Solamon, Andrews, and Troxtel (2001) found that those who attended an interview coaching session had higher ratings on a situational interview than those who did not attend. In contrast to this result (which replicated Maurer and colleagues’ earlier study in 1998), two studies on the effect of coaching on multiple mini interviews (Griffin et al., 2008; Moshinsky et al., 2017) found that coaching had no influence on station ratings. Indeed, at one station in the Griffin et al. study, coached students had significantly lower scores. It may be that coached students are quicker to assume what they think is being assessed at a particular station, so may be less aware of subtle cues and therefore provide the wrong information (Griffin, 2014), or that interviewers have been trained to identify “coached” responses.

Predictive Validity

As previously explained, when construct validity is altered there is a risk
that the predictive validity of the test also suffers. Predictive validity
studies investigate whether the relationship between selection test scores
and later performance (on the job or in an educational program) is dif-
ferent for those who have been coached compared to those who haven’t.
Such studies are rare because the data are not easily obtainable.
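The core comparison in such a study can be sketched in a few lines. The following Python sketch uses hypothetical figures, not data from any study cited in this chapter: it computes the predictor-outcome correlation separately for coached and uncoached groups and then tests whether the two correlations differ, using the standard Fisher r-to-z method for independent correlations.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fisher_z_diff(r1, n1, r2, n2):
    """z statistic for the difference between two independent
    correlations, via the Fisher r-to-z transformation."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se

# Hypothetical validities: a test-GPA correlation of .45 among 200
# uncoached entrants versus .25 among 150 coached entrants.
z = fisher_z_diff(0.45, 200, 0.25, 150)
# |z| > 1.96 would indicate a significant validity difference at p < .05
```

In practice such analyses also contend with range restriction (only accepted applicants have outcome data) and with the self-selection of applicants into coaching, which is one reason the studies remain rare.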
In terms of cognitive ability tests used in general selection contexts
(where the range of ability is typically less restricted than in medical
selection), te Nijenhuis et al. (2001) argue that a short coaching session
is unlikely to have a substantial impact on predictive validity.
However, this type of coaching is quite different to the intensive coach-
ing courses and associated practice material that are marketed today
for applicants to many health professions. For example, in the course
of research on this topic, I have had applicants to medical school con-
fess that they spent half an hour a day answering practice items for six
months prior to the UMAT test.
Lievens, Reeve, and Heggestad (2007) studied medical school appli-
cants who were retested on selection tests (cognitive ability) and found
that scores gained on the second occasion had significantly poorer pre-
dictive ability, which was due to a reduction in g-loading caused by
retesting. Even though coaching may or may not have occurred dur-
ing the period between the first and second testing, both coaching and
retesting have the similar effect of reducing the g-loading of test scores.
We might therefore expect a similar reduction in validity for coached
scores. However, there are no direct predictive validity studies of cogni-
tive ability tests used for selection in the healthcare professions.
Nevertheless, Griffin, Yeomans, and Wilson (2013) have shown that
when applicants coached for the UMAT test in Australia were accepted
into a medical degree, they went on to achieve a significantly lower
GPA than those who had not been coached. A later analysis (Griffin,
Bayl-Smith, & Hu, under review) found that, even after controlling
for selection test scores, having been coached significantly increased the
odds that a student would be among those who consistently achieved
below-average results across the whole degree. These findings could
imply that coached scores did not represent true ability or that those
who rely on coaching cope less well with the rigors of medical study.
In contrast, when Stemig et al. (2015) compared the ability of
SJT scores of coached and uncoached applicants (for Belgian medical
schools) to predict interpersonal skill (measured after enrolling),
there was only a small, non-significant decrement in validity for
the coached scores. Coaching may therefore be less problematic for SJT
prediction, although there is too little available evidence to indicate
whether this is due to the stability of SJT constructs in the face of
coaching.

Coaching and Applicant Well-Being

It seems clear that those with poor stress tolerance are more likely to
engage in a coaching intervention when faced with the demands of a
selection process (Ryan et al., 1998). However, while coaching is not
very effective in reducing this anxiety (Ryan et al., 1998), coached
applicants are convinced that they will perform better as a result of
being coached, even though this confidence has been shown to be mis-
placed (Wilkinson & Wilkinson, 2013). A letter from two senior medi-
cal students highlights this concern when they state that those applying
for medicine “now opt to undertake coaching for fear of ‘missing out’
on what may potentially be an edge obtained by other prospective stu-
dents. It is the psychological comfort provided to students that they
have done some preparation that marketers of coaching courses have
been exploiting” (Wong & Roberts-Thomson, 2009).
Given the initial evidence described above that medical students
who were coached on admissions tests underperform compared to
their uncoached peers, there may be a lasting negative effect of coach-
ing. Students may even suspect that they would not be enrolled without
external assistance to boost their selection scores, setting up potential
issues about fear of ongoing failure or lack of ability. This is an area for
future research.

Finally, applicant attitudes and reactions to selection tests have
important implications for their performance on the test, their
attraction to the hiring organization, their behavior and attitudes after
being hired, their well-being, and their propensity to initiate legal
proceedings regarding unfairness (Ryan & Ployhart, 2000; Truxillo,
Bauer, & McCarthy, 2015; see also Chapter 8 for a review). The perceived
coachability of tests such as the UMAT and UKCAT is a strong factor
in creating negative attitudes to these tests among applicants (Brown &
Griffin, 2014; Cleland et al., 2011).

Future Research

The above review of the empirical research on coaching has identified
some areas for ongoing research. Primarily, we need predictive validity
studies to provide evidence of the accuracy of current selection deci-
sion-making. Related information is needed on the extent to which
any score improvements produced by coaching actually result
in "false positive" selection decisions or cause "false negative" choices
regarding potentially more deserving applicants.
Clearly, more work is needed on non-academic assessment tests
to determine if they are more or less amenable to coaching. Lievens
et al. (2012) add that there may be some types of items (for example
in an SJT) that are more resilient to coaching efforts; a greater
understanding of what determines coachability is therefore an
important avenue for future research. In addition, while there have been a
few studies on the type of coaching and familiarization programs that
are most effective, more information on this and whether there are
individual differences in who benefits from what type of intervention
will also assist in identifying any negative impact (te Nijenhuis et al.,
2007).
In the face of applicant perceptions regarding unfairness related to
coaching, one institution made the decision to make all the MMI sta-
tion content from one year’s testing freely available on the internet the
following year (Moshinsky et al., 2017). One presumes that this then
requires new station content to be developed each year. Not only is this
resource-intensive, but it may also threaten the reliability of items when there
is little chance of pilot testing, etc. It would be useful to have more evi-
dence of the effects of this practice, which is not uncommon for some
of the larger cognitive ability selection tests.

Implications for Practice
Although made nearly 30 years ago, Bond's (1989, p. 442) observation
appears to remain true: "the coaching debate will probably continue
unabated for some time to come. One reason for this of course is that so
long as tests are used in college admission decisions, employment, and
professional certification, people will continue to seek a competitive advantage."
Knowing how or whether to implement policies or procedural changes
to selection processes in light of the above evidence is a complex issue,
in particular because it is not generally known at the time of selection
who has or hasn’t undertaken commercial coaching, which after all, is
not an illegal activity.
A number of test providers publicly state that coaching is of no
value for performance on their test [see, for example, the UMAT
(Australian Council for Educational Research, 2017)]. However, it
seems that applicants are impervious to or actively disbelieve such
statements (McGaghie et al., 2004; Moshinsky et al., 2017; Wilkinson
& Wilkinson, 2013). In response, one strategy is to make training
and training materials freely available to counteract any fairness issues
(Arendasy et al., 2016). However, as Stemig et al. (2015) caution, this
will not ensure, in and of itself, that all applicants, especially those from
low SES backgrounds whose parents do not have tertiary education, will
access such material. Institutions therefore need to develop processes
that will promote wide access.
Another focus for dealing with the effect of coaching is to develop
strategies for managing potentially biased scores. For example,
McDaniel, Psotka, Legree, Yost, and Weekley (2011) offer a suggestion
for eliminating certain response styles on SJT items. As described in the
Case Study 9.1, one Australian university also changed the way they
utilized scores of the most coachable section of the UMAT test in an
effort to eliminate from the shortlisting pool those who have benefited
overly from coaching.

Case Study 9.1: Minimizing the Effect of Coaching on Selection Success
Griffin and Hu (2015b) report on efforts to reduce any impact of coaching
on selection decisions. This action was motivated by three research
findings (described above). First, evidence that one of the three components
of the UMAT test, which was used to shortlist applicants for interview,
was amenable to coaching. Second, scores on this subtest had a negative
correlation with GPAs obtained in medical school. Third, those who were
coached at the point of selection appeared to be consistently underper-
forming throughout their medical degree, suggesting that they cope less
well in a high-pressure training program, either for academic or psycho-
logical reasons, than students who did not attend coaching.
In the first few cohorts of this new medical school, based in an area of
disadvantage, just over 50% of selected applicants had undertaken com-
mercial coaching. The metric used for combining scores of the UMAT’s
three components was altered in 2014. This action resulted in a
significant drop in the proportion of interviewees who had attended
coaching, to 35.5%. Furthermore, there were now no differences in either UMAT scores
or MMI scores when comparing coached and uncoached applicants. This
change in policy appears to have significantly reduced the likelihood of
coached applicants gaining a place in medical school, making the competi-
tion more even.
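The case study does not specify how the combination metric was changed, but the general idea can be sketched as a weighted sum that down-weights the most coachable component relative to an equal-weights baseline. The weights and scores below are entirely hypothetical, chosen only to illustrate the mechanism:

```python
def combined_score(subtests, weights):
    """Weighted sum of subtest scores (weights should sum to 1)."""
    return sum(w * s for w, s in zip(weights, subtests))

# A hypothetical applicant who is strong mainly on the coachable
# third component:
scores = (50.0, 50.0, 90.0)

equal = combined_score(scores, (1 / 3, 1 / 3, 1 / 3))    # ~63.3
downweighted = combined_score(scores, (0.4, 0.4, 0.2))   # 58.0
# Down-weighting the coachable component lowers this applicant's
# combined score, and hence their shortlisting rank, relative to
# applicants with more balanced profiles.
```

Any such reweighting involves a trade-off: it reduces the influence of coached gains but also reduces the contribution of genuinely high scores on that component.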

Conclusion
The global rise of shadow education activities including commercial
coaching has not gone unnoticed by major bodies such as the United
Nations Educational, Scientific and Cultural Organization (UNESCO).
A UNESCO report (Bray, 2007) suggests that private tutoring and
commercial coaching have the potential to negatively impact national
education systems in terms of both equity and quality. In this chapter,
the existing evidence has been reviewed, which indicates that coach-
ing may well produce small improvements in test scores (depending on
the test), but whether these are large enough to make a difference to
selection decisions is not yet clear. They are, however, not as large as the
coaching companies would assert. Nonetheless, they may compound
the barriers to widening participation faced by low SES applicants.
Among others, applicants to healthcare professional jobs and training
programs appear willing to invest large sums of money to gain a
competitive advantage. Given the impact on them as individuals and
the potential for poor selection decisions into what are safety-critical and
demanding roles, it is important for researchers to continue to investigate
the effects of commercial coaching programs and for practitioners who
have influence on selection policy and procedure to consider this evidence.

Practice Points

1. Large numbers of applicants will undertake commercial coaching to
gain a competitive advantage when large-scale testing is used to make
selection decisions.
2. Coaching could potentially improve scores on some tests of cogni-
tive ability without improving the test-taker’s actual level of cognitive
ability. However, this does not necessarily reduce the predictive
validity of the test or translate into an improved chance of selection.
3. Coaching potentially poses a risk to accurate selection decisions.
4. Tests should be designed to minimize coaching effects, particularly in
high-stakes selection.

Explore Further

Key references to provide a background to coaching effects include:
Arendasy, M. E., Sommer, M., Gutierrez-Lobos, K., & Punter, J. F.
(2016). Do individual differences in test preparation compromise the
measurement fairness of admission tests? Intelligence, 55, 44–56.
Buchman, C., Condron, D. J., & Roscigno, V. J. (2010). Shadow edu-
cation, American style: Test preparation, the SAT and college enroll-
ment. Social Forces, 89(2), 435–461.
Hausknecht, J. P., Halpert, J. A., Di Paolo, N. T., & Moriarty Gerrard,
M. O. (2007). Retesting in selection: A meta-analysis of coaching
and practice effects for tests of cognitive ability. Journal of Applied
Psychology, 92, 373–385.

References
Arendasy, M.E., Sommer, M., Gutierrez-Lobos, K., & Punter, J. F. (2016). Do
individual differences in test preparation compromise the measurement fair-
ness of admission tests? Intelligence, 55, 44–56.
Australian Council for Educational Research. (2017). https://umat.acer.edu.au/.
Accessed 8 December 2017.
Bardes, C. L., Best, P. C., Kremer, S. J., & Dienstag, J. L. (2009). Perspective:
Medical school admissions and noncognitive testing: Some open questions.
Academic Medicine, 84(10), 1360–1363.
Bond, L. (1989). The effects of special preparation on measures of scholastic
ability. In R. Linn (Ed.), Educational measurement (3rd ed., pp. 429–444).
Washington, DC: National Council on Education and American Council
on Education.
Bray, M. (2007). The shadow education system: Private tutoring and its impli-
cations for planners (2nd ed.). Paris: UNESCO, International Institute for
Educational Planning.
Brown, J., & Griffin, B. (2014). Stakeholder perceptions of selection in a high-
stakes context. Poster presented at the 29th Annual conference of the society
for industrial and organizational psychology, Honolulu, Hawaii.
Brown, P., Reay, D., & Vincent, C. (2013). Education and social mobility.
British Journal of Sociology of Education, 34(5–6), 637–643.
Buchman, C., Condron, D. J., & Roscigno, V. J. (2010). Shadow education,
American style: Test preparation, the SAT and college enrollment. Social
Forces, 89(2), 435–461.
Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test
measures: A theoretical account of the processing in the Raven Progressive
Matrices Test. Psychological Review, 97(3), 404–431.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor analytic stud-
ies. New York: Cambridge University Press.
Chung-Herrera, B. G., Ehrhart, K. H., Ehrhart, M. G., Solamon, J., & Kilian,
B. (2009). Can test preparation help to reduce the black—White test per-
formance gap? Journal of Management, 35(5), 1207–1227.
Cleland, J. A., French, F. H., Johnston, P. W., & Scottish Medical Careers
Cohort Study Group. (2011). A mixed-methods study identifying and explor-
ing medical students’ views of the UKCAT. Medical Teacher, 33(3), 244–249.
Cleland, J. A., Nicholson, S., Kelly, N., & Moffat, M. (2015). Taking con-
text seriously: Explaining widening access policy enactments in UK medical
schools. Medical Education, 49(1), 25–35.
Domingue, B., & Briggs, D. C. (2009). Using linear regression and propen-
sity score matching to estimate the effect of coaching on the SAT. Multiple
Linear Regression Viewpoints, 35(1), 12–29.
Estrada, E., Ferrer, E., Abad, F. J., Román, F. J., & Colom, R. (2015). A gen-
eral factor of intelligence fails to account for changes in tests’ scores after
cognitive practice: A longitudinal multi-group latent-variable study.
Intelligence, 50, 93–99.
Graham Holdings. (2016). 2016 Annual Report. Arlington, VA: Graham
Holdings.
Griffin, B. (2014). The ability to identify criteria: its relationship with social
understanding, preparation, and impression management in affecting pre-
dictor performance in a high stakes selection context. Human Performance,
27(2), 147–164.
Griffin, B. (2016). Coaching: Much ado about nothing? Keynote address
InRESH conference, Perth, Australia.
Griffin, B., Carless, S., & Wilson, I. (2013a). The effect of commercial coach-
ing on selection test performance. Medical Teacher, 35(4), 295–300.
Griffin, B., Carless, S., & Wilson, I. (2013b). The undergraduate medical
and health sciences admissions test: What is it measuring? Medical Teacher,
35(9), 727–730.
Griffin, B., Harding, D. W., Wilson, I. G., & Yeomans, N. D. (2008). Does
practice make perfect? The effect of coaching and retesting on selection tests
used for admission to an Australian medical school. The Medical Journal of
Australia, 189(5), 270–273.
Griffin, B., & Hu, W. (2015a). The interaction of socio-economic status and
gender in widening participation in medicine. Medical Education, 49(1),
103–113.
Griffin, B., & Hu, W. (2015b). Reducing the impact of coaching on selection
into medicine. Medical Journal of Australia, 203(9), 363.
Griffin, B., & Wilson, I. G. (2012). Faking good: Self-enhancement in medi-
cal school applicants. Medical Education, 46(5), 485–490.
Griffin, B., Yeomans, N. D., & Wilson, I. G. (2013). Students coached for
an admission test perform less well throughout a medical course. Internal
Medicine Journal, 43(8), 927–932.
Hausknecht, J. P., Halpert, J. A., Di Paolo, N. T., & Moriarty Gerrard, M. O.
(2007). Retesting in selection: A meta-analysis of coaching and prac-
tice effects for tests of cognitive ability. Journal of Applied Psychology, 92,
373–385.
Hayes, T. R., Petrov, A. A., & Sederberg, P. B. (2015). Do we really become
smarter when our fluid-intelligence test scores improve? Intelligence, 48,
1–14.
Jones, R. F. (1986). The effect of commercial coaching courses on performance
on the MCAT. Academic Medicine, 61(4), 273–284.
Kulik, J. A., Bangert-Drowns, R. L., & Kulik, C. L. C. (1984). Effectiveness of
coaching for aptitude tests. Psychological Bulletin, 95(2), 179–188.
Lambe, P., Waters, C., & Bristow, D. (2012). The UK clinical aptitude test: Is
it a fair test for selecting medical students? Medical Teacher, 34(8), 557–565.
Laurence, C. O., Zajac, I. T., Lorimer, M., Turnbull, D. A., & Sumner, K.
E. (2013). The impact of preparatory activities on medical school selec-
tion outcomes: A cross-sectional survey of applicants to the university of
Adelaide medical school in 2007. BMC Medical Education, 13(1), 159.
Lievens, F., Buyse, T., Sackett, P. R., & Connelly, B. S. (2012). The effects
of coaching on situational judgment tests in high-stakes selection.
International Journal of Selection and Assessment, 20(3), 272–282.
Lievens, F., Peeters, H., & Schollaert, E. (2008). Situational judgment tests: A
review of recent research. Personnel Review, 37(4), 426–441.
Lievens, F., Reeve, C. L., & Heggestad, E. D. (2007). An examination of psy-
chometric bias due to retesting on cognitive ability tests in selection set-
tings. Journal of Applied Psychology, 92(6), 1672–1682.
Matthew, D. (2010). Evolving behaviors of MCAT examinees who apply to
U.S. medical schools. Academic Medicine, 85(6), 1100.
Maurer, T., Solamon, J., & Troxtel, D. (1998). Relationship of coaching with
performance in situational employment interviews. Journal of Applied
Psychology, 83(1), 128–136.
Maurer, T. J., Solamon, J. M., Andrews, K. D., & Troxtel, D. D. (2001).
Interviewee coaching, preparation strategies, and response strategies in rela-
tion to performance in situational employment interviews: An extension of
Maurer, Solamon, and Troxtel (1998). Journal of Applied Psychology, 86(4),
709–711.
McDaniel, M. A., Psotka, J., Legree, P. J., Yost, A. P., & Weekley, J. A. (2011).
Toward an understanding of situational judgment item validity and group
differences. Journal of Applied Psychology, 96(2), 327–336.
McGaghie, W. C., Downing, S. M., & Kubilius, R. (2004). What is the
impact of commercial test preparation courses on medical examination per-
formance? Teaching and Learning in Medicine, 16(2), 202–211.
MedEntry UMAT Products. (2017). https://www.medentry.edu.au/courses/umat-courses.
Accessed July 5, 2017.
Miller, C. E., & Barrett, G. V. (2008). The coachability and fakability of per-
sonality-based selection tests used for police selection. Public Personnel
Management, 37(3), 339–351.
Millman, J., Bishop, C. H., & Ebel, R. L. (1965). An analysis of test-wiseness.
Educational and Psychological Measurement, 25, 707–726.
Moshinsky, A., Ziegler, D., & Gafni, N. (2017). Multiple mini-interviews in
the age of the internet: Does preparation help applicants to medical school?
International Journal of Testing. https://doi.org/10.1080/15305058.2016.1263638.
Patterson, F., Ashworth, V., Kerrin, M., & O’Neill, P. (2013). Situational
judgement tests represent a measurement method and can be designed to
minimize coaching effects. Medical Education, 47(2), 220–221.
Powers, D. E. (1993). Coaching for the SAT: A summary of the summaries
and an update. Educational Measurement: Issues and Practice, 12, 24–30.
Raven, J., Raven, J. C., & Court, J. H. (1998). Manual for Raven’s progressive
matrices and vocabulary scales. Oxford: Information Press.
Ryan, A., Ployhart, R. E., Greguras, G. J., & Schmit, M. J. (1998). Test prepa-
ration programs in selection contexts: Self-selection and program effective-
ness. Personnel Psychology, 51(3), 599–621.
Ryan, A. M., & Ployhart, R. E. (2000). Applicants’ perceptions of selection
procedures and decisions: A critical review and agenda for the future.
Journal of Management, 26(3), 565–606.
Simons, D. J., Boot, W. R., Charness, N., Gathercole, S. E., Chabris, C. F.,
Hambrick, D. Z., & Stine-Morrow, E. A. (2016). Do “brain-training” pro-
grams work? Psychological Science in the Public Interest, 17(3), 103–186.
Stemig, M. S., Sackett, P. R., & Lievens, F. (2015). Effects of organization-
ally endorsed coaching on performance and validity of situational judgment
tests. International Journal of Selection and Assessment, 23(2), 174–181.
te Nijenhuis, J., van Vianen, A. E., & van der Flier, H. (2007). Score gains on
g-loaded tests: No g. Intelligence, 35(3), 283–300.
te Nijenhuis, J., Voskuijl, O. F., & Schijve, N. B. (2001). Practice and coach-
ing on IQ tests: Quite a lot of g. International Journal of Selection and
Assessment, 9(4), 302–308.
To, K. (2013). Multiple Mini Interview (MMI) for the mind. United States:
Advisor Prep Education.
Tompkins, J. (2011). Money for nothing? The problem of the board-exam
coaching industry. The New England Journal of Medicine, 365(2), 104–105.
Truxillo, D. M., Bauer, T. N., & McCarthy, J. M. (2015). Applicant fairness
reactions to the selection process. In R. Cropanzano & M. Ambrose (Eds.),
The Oxford handbook of justice in work organizations (pp. 621–640). Oxford:
Oxford University Press.
Verguts, T., & de Boeck, P. (2002). The induction of solution rules in Raven’s
Progressive Matrices Test. European Journal of Cognitive Psychology, 14(4),
521–547.
Wilkinson, T. M., & Wilkinson, T. J. (2013). Preparation courses for a medi-
cal admissions test: Effectiveness contrasts with opinion. Medical Education,
47, 417–424.
Wong, C. X. J., & Roberts-Thomson, R. L. (2009). Does practice make per-
fect? The effect of coaching and retesting on selection tests used for admis-
sion to an Australian medical school [letter]. Medical Journal of Australia,
190(2), 101–102.
