Oana, Setmethods
Abstract
This article presents the functionalities of the R package SetMethods, aimed at performing
advanced set-theoretic analyses. This includes functions for performing set-theoretic multi-method
research, set-theoretic theory evaluation, Enhanced Standard Analysis, diagnosing the impact of
temporal, spatial, or substantive clusterings of the data on the results obtained via Qualitative Com-
parative Analysis (QCA), indirect calibration, and visualising QCA results via XY plots or radar charts.
Each functionality is presented in turn, the conceptual idea and the logic behind the procedure being
first summarized, and afterwards illustrated with data from Schneider et al. (2010).
Introduction
Set-theoretic methods, in general (Goertz and Mahoney, 2012), and Qualitative Comparative Analysis,
in particular, are becoming increasingly popular within different disciplines in the social sciences and
neighboring fields (Rihoux et al., 2013). Parallel to conceptual developments and increasing numbers
of applied studies, accelerating progress in terms of software development can be witnessed. While
less than a decade ago only two functioning software packages were available to users (fsQCA (Ragin
et al., 2006) and Tosmana (Cronqvist, 2011)), there are now over a dozen different software solutions
offered (see http://compasss.org/software.htm). Many of them are developed within the R software
environment, with R package QCA (Dusa, 2007) being not only the one with the longest history, but
also the most complete and complex.
In this paper, we discuss the different functionalities of the R package SetMethods (Medzihorsky
et al., 2016). It is best perceived as an add-on tool to package QCA that allows applied researchers to
perform advanced set-theoretic analyses. More precisely, SetMethods enables researchers to perform
Set-Theoretic Multi-Method Research, the Enhanced Standard Analysis (ESA), Set-Analytic Theory
Evaluation, to run diagnostics in the presence of clustered data structures, and to display their results
in various ways.
We proceed as follows. Each of the different functionalities within SetMethods is presented in
a separate section. Within each section, we first briefly summarize the conceptual idea behind the
analysis in question, then describe the computational logic of the function for performing the analysis,
after which we demonstrate the use of the function by displaying the R syntax and selected output by
using an example from published research.
Even though the main purpose is to present the functionality of R package SetMethods, this
article is also useful for researchers who perform their QCA in software environments other than R
because we present the logic of several of the main advanced set-analytic procedures in a concise and
transparent manner.
library(SetMethods)
# First rows of the Schneider et al. (2010) data called SCHF from package
# SetMethods:
data(SCHF)
head(SCHF)
Key to combining QCA with process tracing is the sorting of cases into different case types based on
the QCA solution formula. The literature identifies five different types (Schneider and Rohlfing, 2013).
Membership in a type is defined by the membership scores of a case in the outcome Y, on the one
hand, and the sufficient term T or the solution formula S, on the other hand. Table 1 summarizes
the definition of each case type and the analytic purpose of the within-case analysis in single cases.
Figure 2 visualizes the location of each case type in an XY plot.
Typical cases and deviant cases consistency are defined based on their membership in a sufficient
term T, whereas deviant cases coverage and IIR cases are defined based on their membership in the
solution formula S. Deviant cases consistency are subdivided into deviant in degree and deviant in kind. The
latter are always preferable for within-case analysis. IIR cases are not useful for single-case studies,
but they play an important role for comparative within-case analyses (see Section 2.3.2).
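The sorting logic behind these five types can be sketched as a simple classifier of a case's fuzzy membership scores, where s is membership in the sufficient term or solution formula and y is membership in the outcome (a sketch following the definitions above; the function name is hypothetical):

```r
# Sketch: sort a case into one of the five case types based on its fuzzy
# membership s in the sufficient term/solution and y in the outcome Y.
case_type <- function(s, y) {
  if (s > 0.5 && y > 0.5 && s <= y) "typical"
  else if (s > 0.5 && y > 0.5)      "deviant consistency (in degree)"
  else if (s > 0.5 && y < 0.5)      "deviant consistency (in kind)"
  else if (s < 0.5 && y > 0.5)      "deviant coverage"
  else                              "individually irrelevant (IIR)"
}

case_type(0.7, 0.9)  # "typical"
case_type(0.8, 0.3)  # "deviant consistency (in kind)"
```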
Table 1 is adapted from Schneider and Rohlfing (manuscript).
1 For a systematic discussion of the pre-QCA case studies, see Rihoux and Lobe (2009).
2 For MMR after an analysis of necessity, see Rohlfing and Schneider (2013).
3 For a systematic test of the mathematical formulas used for selecting single cases or pairs of cases for set-
[Figure 1: Sorting of post-QCA MMR into case types and the corresponding values of arguments match and cases. Single Case (match=FALSE): Typical (cases=2), Deviant Consistency (cases=3), Deviant Coverage (cases=4), IIR (cases=5). Comparative (match=TRUE): Typical-Typical (cases=1), Typical-IIR (cases=2), Typical-Deviant Consistency (cases=3), Deviant Coverage-IIR (cases=4).]
[Table 1: Type of case, membership in T and Y, and goal of the within-case analysis.]
[Figure 2: Location of each case type, (1) to (5), in XY plots of Outcome Y against Sufficient Term T and against Solution Formula S.]
Typical cases
Process tracing in typical cases aims at empirically probing the causal mechanism(s) linking the
sufficient term S to outcome Y. For a conjunction S to be causal, each conjunct C of S must be causal,
i.e., each must make a difference to outcome Y by making a difference to mechanism M. This requires
as many within-case analyses of typical cases as there are conjuncts in the sufficient conjunction.
For each analysis, one is the focal conjunct FC and the others are the complementary conjuncts CC.
The focal conjunct FC is the conjunct for which we want to find out whether it makes a difference
for the mechanism M, while the complementary conjuncts CC represent the other conjuncts of the
sufficient term S (Schneider and Rohlfing, manuscript). For causal inference on the configuration we
proceed by taking each conjunct at a time as the focal conjunct FC. Additionally, we also apply the test
severity principle. With fuzzy-sets the membership in mechanism M can only vary within the corridor
established by the membership in FC (the lowest value M can take) and Y (the highest value M can
take) for preserving the causal chain FC → M → Y (Schneider and Rohlfing, manuscript). The smaller
the corridor, the smaller the range of membership values M can take. Therefore, the most severe test for
M is the one in which FC = S = Y because the only consistent membership score in M equals FC = S =
Y.
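The corridor described above can be expressed directly: for the chain FC → M → Y to remain consistent, the case's membership in M must lie between its membership in FC and in Y (a small sketch; the function name is hypothetical):

```r
# Consistent corridor for mechanism M in a typical case: FC <= M <= Y.
m_corridor <- function(fc, y) c(lower = fc, upper = y)

m_corridor(fc = 0.6, y = 0.9)   # wide corridor: less severe test
m_corridor(fc = 0.85, y = 0.9)  # narrow corridor: more severe test
```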
The best-available typical case fulfills the following criteria: a) the focal conjunct is the one that
defines the membership of the typical case in the term (FC ≤ CC); b) the corridor for mechanism M as
defined by the sufficient term S (from (a), we also have S = FC) and Y is small; c) membership in the
sufficient term S is high; d) the case is uniquely covered by the sufficient term S.
Figure 3 visualizes the test severity principle in two different ways. The XY plot in the upper
panel shows that for cases closer to the diagonal, test severity increases. The length of the vertical
and horizontal arrows, respectively, visualizes the range of fuzzy set membership scores for M that
would still be consistent. The larger this range, the less severe the test. The Euler diagram in the lower
panel visualizes the same by contrasting S1 almost as big as Y with S2 being much smaller than Y. The
former leaves little and the latter a lot of room for M.
The ideal typical case is located in the upper-right corner of the XY plot in Figure 3 with FC = S =
Y = 1. In applied QCA, such cases usually do not exist in the data at hand. Function mmr() identifies
the best available typical case in a given data set.
Function mmr() first sorts each typical case based on whether FC ≤ CC (rank 1) or FC > CC (rank
2). Cases in each rank are then further sorted according to Formula 1. Smaller values indicate more
suitable cases.4
[Figure 3: The test severity principle. Upper panel: XY plot of Outcome Y against Focal Conjunct S, with test severity increasing toward the diagonal. Lower panel: Euler diagrams contrasting S1, almost as big as Y, with S2, much smaller than Y.]
Typical cases are identified by setting argument cases to 2. As argument term is set to 1, the output
shows the typical cases for each focal conjunct in the first sufficient term, together with some additional
information. The output comprises the membership values of the typical cases in the focal conjunct,
the complementary conjuncts, the whole sufficient term, and the outcome (in this case EXPORT), the
formula values St, whether the case is the most typical according to the formula, which rank the case
sits in, and whether the case is uniquely covered by the sufficient term. Users should first check whether
the case is uniquely covered, then which rank the case is in (the smaller, the better), and then what
formula value St the case has (the smaller, the better). For example, for focal conjunct
emp in sufficient term emp ∗ bargain ∗ OCCUP, Switzerland_03 appears to be the best available typical
case, being uniquely covered, being in Rank 1, and having the smallest formula value (St=0.59).
# Truth table for outcome EXPORT (call reconstructed; only the arguments
# below appear in the original excerpt):
TT_y <- truthTable(SCHF, outcome = "EXPORT",
                   conditions = c("EMP", "BARGAIN", "UNI", "OCCUP", "STOCK", "MA"),
                   incl.cut = .9,
                   complete = TRUE,
                   PRI = TRUE,
                   sort.by = c("out", "incl", "n"))
# Get typical cases for the first term of the second intermediate solution:
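Assuming the intermediate solution is stored in an object sol_yi produced by minimize(), the corresponding call might be sketched as follows (arguments match, cases, and term follow the text; the names results, outcome, and sol are assumptions):

```r
# Sketch: typical cases (cases = 2) for the first term (term = 1) of the
# second intermediate solution (sol = 2); sol_yi assumed from minimize().
mmr(results = sol_yi, outcome = "EXPORT", sol = 2,
    match = FALSE, cases = 2, term = 1)
```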
Deviant cases consistency are puzzling because their membership in the sufficient term S exceeds that
in the outcome Y, i.e. S > Y. This becomes even more puzzling if S > 0.5 and Y < 0.5, that is, if we have
deviant cases consistency in kind rather than just in degree (see Table 1). The more S exceeds Y, the
bigger the empirical puzzle, especially if membership in S is high. Within-case analysis of a deviant
case consistency aims at identifying the reasons why mechanism M is either absent or prevented from
producing Y. The reason must be an INUS condition omitted from S. Formula 2 identifies the best
available deviant case consistency in a data set.
Using the same data from Schneider et al. (2010) and focusing on the parsimonious solution,
function mmr() identifies the deviant consistency cases for each sufficient term. For obtaining this,
we need to keep argument match set to FALSE, as we are doing single-case identification, but set
argument cases to 3, the identifier for deviant cases consistency (see Figure 1). The output shows the
deviant consistency cases (first column) grouped by sufficient term (second column) together with
term membership, outcome membership, formula value Sd, and whether the case is the most deviant
for a particular term. In the output we see that, for example, for term emp ∗ OCCUP the most deviant
case consistency is Switzerland_90 with the smallest formula value (Sd=0.67). Figure 4 shows all the
deviant cases consistency (cases in the lower right corner) for the first sufficient path emp ∗ OCCUP of
the parsimonious solution.
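Assuming the parsimonious solution is stored in an object sol_yp produced by minimize(), the call described above might be sketched as (the argument names results and outcome are assumptions):

```r
# Sketch: deviant consistency cases (cases = 3) for each term of the
# parsimonious solution.
mmr(results = sol_yp, outcome = "EXPORT", match = FALSE, cases = 3)
```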
[Figure 4: XY plot of sufficient term emp*OCCUP against outcome EXPORT. Cons.Suf: 0.836; Cov.Suf: 0.353; PRI: 0.596; Cons.Suf(H): 0.776.]
Deviant cases coverage are puzzling because they are members of the outcome without, however,
being members of any known sufficient term. Within-case analysis aims at identifying a sufficient term
S+ omitted from the solution formula, which triggers a mechanism M and outcome Y.
Since deviant cases coverage are defined by what they are not - members of the solution formula
(see Table 1) - this solution formula is not a good place to start selecting the best available deviant
cases coverage. Instead, this type of case is selected based on their membership in their truth table row
TT. For each TT with at least one deviant case coverage, a within-case analysis can be performed. If
more than one deviant case coverage populates the same TT, Formula 3 identifies the best available
case for within-case analysis.
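Following the pattern of the calls above, deviant coverage cases might be listed as follows (a sketch; sol_yp is assumed to hold the parsimonious solution from minimize(), and the argument names results and outcome are assumptions):

```r
# Sketch: deviant coverage cases (cases = 4), sorted into the truth table
# rows to which they belong.
mmr(results = sol_yp, outcome = "EXPORT", match = FALSE, cases = 4)
```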
Truth table row emp ∗ bargain ∗ UNI ∗ occup ∗ STOCK ∗ MA (rows 9, 10, 4, and 5 of the output) is populated by 4
deviant coverage cases, out of which UK_90 is the best available for within-case analysis, having the
smallest formula value (Sd=0.51).
Individually irrelevant (IIR) cases owe their name to the fact that single within-case analyses of this
type of case are not useful. IIR cases do play a crucial role in two forms of comparative within-case
analysis (see Section 2.3.2). Even if not useful for single case studies, identifying IIR cases is informative,
as their list - together with the deviant cases coverage - indicates the diversity among cases without the
outcome. The more different truth table rows are populated by IIR cases (and deviant cases coverage),
the more heterogeneous this group of cases is.
Function mmr() lists all individually irrelevant cases with respect to the entire solution formula
(also called globally uncovered IIR cases) and sorts each of them into the truth table row to which it
belongs best. Since these cases are not informative for single case studies and are used only to
indicate diversity among the cases without the outcome, the function does not involve a formula-based
ranking of IIR cases.
The literature identifies four feasible within-case comparisons after a QCA between two types of
cases each (Schneider and Rohlfing, manuscript). With each comparison a different analytic goal is
pursued. Figure 5 summarizes these goals. The two comparisons ’along the main diagonal’ pursue
a causal inference goal, whereas the two ’vertical’ comparisons aim at improving the QCA model
specification by identifying either an INUS condition missing from a known sufficient term or an
entire new sufficient term missing from the solution formula.
[Figure 5: Analytic goals of the four forms of comparative within-case analysis, shown in XY plots of Outcome Y against Sufficient Term S and against Truth Table Row TT.]
The purpose of the within-case comparison between a typical case and an IIR case is to empirically
investigate whether a sufficient term is a difference-maker, i.e. causal, not only for the outcome (Y) at
the cross-case level, but also for the mechanism M at the within-case level. Similarly, the within-case
comparison of two typical cases empirically probes whether the same mechanism M links the sufficient
term S to outcome Y in typical cases that are as different from each other as possible. For both forms
of comparison, it holds that if S is a conjunction, each of its conjuncts C must be a difference-maker.
Hence, the comparisons between a typical case and an IIR case (or another typical case) must be
performed for each single conjunct C at a time. The following sections provide more details for each
form of comparison and spell out the sorting mechanisms and mathematical formulas that underlie
the respective functions in SetMethods.
Table 2: Possible membership constellations between focal (FC) and complementary conjuncts (CC)
in comparison of typical and IIR case
Function mmr() first sorts each pair of typical and IIR cases into ranks 1-8 as defined in Table 2.
Cases in smaller rank numbers are more adequate for the analytic goal of the comparative within-case
analysis of these two case types. For case pairs in rank 1, for example, it holds that the difference-
making quality can be attributed to the focal conjunct FC in both the typical and the IIR case, and that
it is determinate.
Within each rank, Formula 4 maximizes the following criteria: between both cases, the difference
in FC and in Y, respectively, should be small; both should have high membership in CC; and both
should be close to the diagonal. Within each rank, case pairs with smaller formula values are more
appropriate. Additionally, typical cases should be uniquely covered by the sufficient term under
investigation, while IIR cases should be globally uncovered (not covered by any of the sufficient
terms).
# Get matching pairs of typical and IIR cases for the first term
# of the parsimonious solution:
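The call for this comparison might be sketched as follows (match = TRUE and the cases identifier follow Figure 1; sol_yp and the argument names results, outcome, and term are assumptions):

```r
# Sketch: matching pairs of typical and IIR cases (cases = 2) for the
# first term (term = 1) of the parsimonious solution.
mmr(results = sol_yp, outcome = "EXPORT",
    match = TRUE, cases = 2, term = 1)
```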
The matching of two typical cases follows a logic similar to the one between a typical and an IIR case.
The goal is to probe the difference-making properties of each conjunct (FC) in sufficient term S to
mechanism M. Table 3 defines the four ranks that can occur based on two typical cases’ membership
in FC and the complementary conjuncts CC. After sorting each possible pair of typical cases into one
of these ranks, Formula 5 further ranks those pairs such that their difference in FC and the outcome,
respectively, is minimized; that their membership in CC is maximized; and that both are close to the
diagonal (test severity principle). Additionally, the two typical cases should be uniquely covered by
the sufficient term.
Table 3: Possible membership constellations between focal (FC) and complementary conjuncts (CC)
in comparison of two typical cases
1 FC ≤ CC FC ≤ CC Yes Yes
2 FC ≤ CC FC > CC Yes No
3 FC > CC FC ≤ CC No Yes
4 FC > CC FC > CC No No
# Get matching pairs of typical and typical cases for the first term
# of the parsimonious solution:
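Per the identifiers in Figure 1, this comparison uses cases = 1 with match = TRUE; a sketch of the call (sol_yp and the argument names results, outcome, and term are assumptions):

```r
# Sketch: matching pairs of two typical cases (cases = 1) for the
# first term (term = 1) of the parsimonious solution.
mmr(results = sol_yp, outcome = "EXPORT",
    match = TRUE, cases = 1, term = 1)
```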
The comparative within-case analysis of a typical and a deviant case consistency aims at identifying
the INUS condition missing from the sufficient term S in question. The best available pair of cases
maximizes the following criteria: their membership in S should be as high and as similar as possible,
and their membership in Y as different as possible. Formula 6 translates these matching criteria into
practice.
Setting cases to 3, we get the best available pair of typical and deviant consistency cases for each
sufficient term in the parsimonious solution sol_yp. For identifying a missing INUS in sufficient term
emp ∗ OCCUP, the best available pair of cases that we could choose for process-tracing would be the
one between typical case Switzerland_03 and deviant consistency case Australia_90, as they have
the smallest formula value (Distance=0.84).
# Get matching pairs of typical and deviant consistency cases for the
# parsimonious solution:
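The call producing the output shown below might be sketched as (sol_yp and the argument names results and outcome are assumptions):

```r
# Sketch: matching pairs of typical and deviant consistency cases
# (cases = 3) for each term of the parsimonious solution.
mmr(results = sol_yp, outcome = "EXPORT", match = TRUE, cases = 3)
```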
## Term emp*OCCUP :
## ----------
## typical deviant_consistency distance term best_matching_pair
## 1 Switzerland_03 Australia_90 0.84 emp*OCCUP TRUE
## 2 Switzerland_99 Australia_90 0.85 emp*OCCUP FALSE
## 3 Switzerland_99 Switzerland_90 0.85 emp*OCCUP FALSE
## 4 Switzerland_03 Switzerland_90 0.92 emp*OCCUP FALSE
## 5 Switzerland_03 Australia_95 0.96 emp*OCCUP FALSE
##
## Term BARGAIN*UNI*STOCK :
## ----------
## typical deviant_consistency distance term
## 1 Netherlands_03 Australia_95 0.55 BARGAIN*UNI*STOCK
## 2 Netherlands_03 Australia_03 0.59 BARGAIN*UNI*STOCK
## 3 Netherlands_99 Australia_95 0.65 BARGAIN*UNI*STOCK
## 4 Netherlands_99 Australia_03 0.69 BARGAIN*UNI*STOCK
## 5 Netherlands_99 Spain_99 0.73 BARGAIN*UNI*STOCK
## best_matching_pair
## 1 TRUE
## 2 FALSE
## 3 FALSE
## 4 FALSE
## 5 FALSE
##
## Term occup*STOCK*ma :
## ----------
## typical deviant_consistency distance term best_matching_pair
## 1 USA_03 Spain_03 0.84 occup*STOCK*ma TRUE
## 2 Japan_99 Spain_03 0.86 occup*STOCK*ma FALSE
## 3 Japan_03 Spain_03 0.88 occup*STOCK*ma FALSE
## 4 USA_90 Spain_03 0.89 occup*STOCK*ma FALSE
## 5 USA_95 Spain_03 0.91 occup*STOCK*ma FALSE
The comparative within-case analysis of a deviant case coverage and an IIR case aims at identifying
the sufficient conjunction S+ missing from the sufficient solution formula generated with QCA. The
point of reference for matching cases is their membership in the truth table row TT to which they
belong. Analogous to the within-case comparison of a typical and a deviant case consistency case,
the goal is to maximize both cases’ membership and their similarity in TT and their difference in Y.
Cases for this fourth type of comparison can be identified by setting cases to 4. Since for deviant
coverage and IIR cases we are interested in identifying an entire missing sufficient term, the output
for these pairs focuses on matching pairs in truth table rows, rather than in sufficient terms.
Therefore, the output is sorted by truth table rows (the columns starting with TT showing the
combination of conditions) and for each truth table row we can identify a best matching pair of
cases according to formula values in column distance. For example, if we focus on truth table row
EMP ∗ BARGAIN ∗ uni ∗ OCCUP ∗ stock ∗ ma (rows 6, 7, 8, 9, 10 in the output), the deviant case
coverage France_95 and the IIR case Finland_90 constitute the best matching pair, having the smallest
formula value for this specific truth table row (distance=1.43).
# Get matching pairs of deviant coverage and IIR cases for the
# parsimonious solution:
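Per the identifiers in Figure 1, this comparison uses cases = 4 with match = TRUE; a sketch of the call (sol_yp and the argument names results and outcome are assumptions):

```r
# Sketch: matching pairs of deviant coverage and IIR cases (cases = 4),
# grouped by truth table rows, for the parsimonious solution.
mmr(results = sol_yp, outcome = "EXPORT", match = TRUE, cases = 4)
```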
The Standard Analysis (SA) gives researchers three options: to exclude all logical remainders from the
logical minimization (yielding the conservative or complex solution CS), to include all remainders that
are simplifying (yielding the most parsimonious solution PS), or to include only those simplifying
assumptions that are easy based on so-called directional expectations (yielding the intermediate solution
IS).
Schneider and Wagemann (2012, chapter 8) propose the Enhanced Standard Analysis (ESA), which
argues that simplifying assumptions on specific remainders can be untenable. There are three sources
of untenability: incoherent counterfactuals, which are either logical remainders contradicting claims
of necessity6 or assumptions also made for the negated outcome7, and implausible counterfactuals, which
consist of claims about impossible remainders8. ESA simply stipulates that no QCA solution formula
can be based on untenable assumptions.
Figure 6 provides a graphical representation of the different types of assumptions as defined
by SA and ESA. Both approaches only allow for simplifying assumptions9 (i.e. those in the inner
circle) and both distinguish between difficult and easy counterfactuals (i.e. the vertical line inside the
circle)10. ESA, but not SA, blocks any untenable assumption (i.e. the gray area in the lower part).
A risk of making untenable assumptions arises whenever a researcher claims the presence of a
necessary condition, when statements of sufficiency for both the outcome and its negation are made,
and/or when two or more conditions with mutually exclusive categories are used in a truth table.
ESA requires that researchers identify those logical remainder rows whose inclusion into the logical
minimization would amount to an untenable claim. As a result, one obtains the enhanced PS and the
enhanced IS.11
Function esa() provides a straightforward tool for avoiding untenable assumptions and thus
putting ESA into practice. First, function esa() can exclude remainders that contradict single necessary
conditions, unions of necessary conditions, or more complicated expressions of necessity. For example,
assuming that the disjunction STOCK + MA is necessary for the outcome EXPORT, we ban all
remainder rows implied by this necessity claim via the nec_cond argument. All the logical remainder
rows that are subsets of ¬STOCK ∗ ¬MA are subsequently set to OUT = 0 in the truth table object ttnew
and thus excluded from further logical minimization.
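Assuming the truth table is stored in an object TT_y, such a call might be sketched as follows (argument nec_cond is named in the text; the argument name oldtt is an assumption):

```r
# Sketch: ban all remainders contradicting the necessity claim STOCK + MA;
# the resulting truth table ttnew is then passed on to minimize().
ttnew <- esa(oldtt = TT_y, nec_cond = "STOCK + MA")
```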
neither. This is because all assumptions not constrained by directional expectations (if there are any, or just some)
are by default easy.
11 For an application of ESA, see, for instance Thomann (2015).
## 31 0 1 1 1 1 0 ? 0 - -
## 33 1 0 0 0 0 0 0 0 - -
## 34 1 0 0 0 0 1 ? 0 - -
## 35 1 0 0 0 1 0 ? 0 - -
## 36 1 0 0 0 1 1 ? 0 - -
## 37 1 0 0 1 0 0 0 0 - -
## 38 1 0 0 1 0 1 ? 0 - -
## 39 1 0 0 1 1 0 ? 0 - -
## 40 1 0 0 1 1 1 ? 0 - -
## 41 1 0 1 0 0 0 0 0 - -
## 42 1 0 1 0 0 1 ? 0 - -
## 44 1 0 1 0 1 1 ? 0 - -
## 45 1 0 1 1 0 0 0 0 - -
## 46 1 0 1 1 0 1 ? 0 - -
## 47 1 0 1 1 1 0 ? 0 - -
## 48 1 0 1 1 1 1 ? 0 - -
## 50 1 1 0 0 0 1 ? 0 - -
## 51 1 1 0 0 1 0 ? 0 - -
## 52 1 1 0 0 1 1 ? 0 - -
## 54 1 1 0 1 0 1 ? 0 - -
## 58 1 1 1 0 0 1 ? 0 - -
## 59 1 1 1 0 1 0 ? 0 - -
Second, the esa() function can also ban implausible counterfactuals to produce truth tables
in which specific logical remainders identified through conjunctions are excluded. For example,
we can ban all remainder rows that are subsets of BARGAIN + ∼OCCUP by using the Boolean expression
in argument untenable_LR. Finally, the function can exclude contradictory simplifying assumptions
(which are another form of untenable assumptions) and empirically observed rows that are part of
simultaneous subset relations12 by just using the unique truth table row identifier in the argument
contrad_rows. While argument untenable_LR accepts Boolean expressions for excluding only logical
remainders, argument contrad_rows can exclude both empirically observed rows and remainder rows
through their unique identifier (row number).13
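These two uses might be sketched as follows (TT_y and the argument name oldtt are assumptions; the row numbers passed to contrad_rows are purely hypothetical):

```r
# Sketch: ban remainders covered by a Boolean expression ...
ttnew2 <- esa(oldtt = TT_y, untenable_LR = "BARGAIN + ~OCCUP")
# ... or exclude rows by their identifiers (hypothetical row numbers):
ttnew3 <- esa(oldtt = TT_y, contrad_rows = c(8, 22))
```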
[Figure 6: Types of assumptions under SA and ESA. The inner circle contains the simplifying assumptions, divided into easy and difficult counterfactuals; the gray area marks the untenable assumptions blocked by ESA.]
The intersections between the theory and the empirical solution define different types of cases. Each
case has partial membership in all areas but membership higher than 0.5 in only one of them. Figure 7
provides a visualization of the areas and the kinds of cases in each area.14
[Figure 7: Intersections between Theory T and Solution S (T∗S, T∗¬S, ¬T∗S, ¬T∗¬S) and the types of cases located in each area, depending on their membership in the outcome Y.]
Function theory.evaluation() performs the theory evaluation procedure between a theory spec-
ified in Boolean terms and results obtained using the QCA package. Assuming that the theory can
be summarized as EMP*∼MA + STOCK, the example below shows how theory evaluation works using
the second intermediate solution for outcome EXPORT. The first part of the output shows the names
and proportion of cases in each of the intersections between theory and the empirical solution. The
second part of the output shows parameters of fit for the solution, the theory, and their intersections,
which indicate how much each of these areas is in line with a statement of sufficiency for EXPORT.
Additionally, the function also stores the membership of each case in each intersection between theory
and empirics, which can be accessed by setting argument print.data to TRUE.
14 For applied examples of theory evaluation, see, for instance, Sager and Thomann (2017); Schneider and Maerz
(2017).
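The call producing the output below might be sketched as follows (the theory string and argument print.data follow the text; sol_yi and the argument names empirics, outcome, and sol are assumptions):

```r
# Sketch: evaluate the theory EMP*~MA + STOCK against the second
# intermediate solution for outcome EXPORT.
theory.evaluation(theory = "EMP*~MA + STOCK",
                  empirics = sol_yi, outcome = "EXPORT", sol = 2,
                  print.data = FALSE)
```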
##
## Cases:
## ----------
##
## Covered Most Likely (T*E and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 23 / 76 = 30.26 %
##
## Case Names:
## [1] "Ireland_90" "Japan_90" "USA_90" "Ireland_95"
## [5] "Japan_95" "Switzerland_95" "USA_95" "Denmark_99"
## [9] "Finland_99" "France_99" "Japan_99" "Netherlands_99"
## [13] "Sweden_99" "Switzerland_99" "Belgium_03" "Denmark_03"
## [17] "Finland_03" "France_03" "Japan_03" "Netherlands_03"
## [21] "Sweden_03" "Switzerland_03" "USA_03"
##
## Covered Least Likely (t*E and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 0 / 76 = 0 %
##
## Case Names:
## [1] "No cases in this intersection"
##
## Uncovered Most Likely (T*e and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 12 / 76 = 15.79 %
##
## Case Names:
## [1] "UK_90" "France_95" "Netherlands_95" "Sweden_95"
## [5] "UK_95" "Germany_99" "Ireland_99" "UK_99"
## [9] "USA_99" "Germany_03" "Ireland_03" "UK_03"
##
## Uncovered Least Likely (t*e and Y > 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 0 / 76 = 0 %
##
## Case Names:
## [1] "No cases in this intersection"
##
## Inconsistent Most Likely (T*E and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 12 / 76 = 15.79 %
##
## Case Names:
## [1] "Canada_90" "Switzerland_90" "Australia_95" "Canada_95"
## [5] "Denmark_95" "Finland_95" "Australia_99" "Belgium_99"
## [9] "Spain_99" "Australia_03" "Norway_03" "Spain_03"
##
## Inconsistent Least Likely (t*E and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 1 / 76 = 1.32 %
##
## Case Names:
## [1] "Australia_90"
##
## Consistent Most Likely (T*e and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 23 / 76 = 30.26 %
##
## Case Names:
## [1] "Austria_90" "Belgium_90" "Denmark_90" "Finland_90"
## [5] "France_90" "Germany_90" "Italy_90" "Netherlands_90"
## [9] "Norway_90" "Spain_90" "Sweden_90" "Austria_95"
## [13] "Belgium_95" "Germany_95" "Italy_95" "New Zealand_95"
## [17] "Norway_95" "Spain_95" "Austria_99" "Canada_99"
## [21] "Italy_99" "Canada_03" "Italy_03"
##
## Consistent Least Likely (t*e and Y < 0.5) :
## *******************
##
## Cases in the intersection/Total number of cases: 4 / 76 = 5.26 %
##
## Case Names:
## [1] "New Zealand_90" "New Zealand_99" "Norway_99" "New Zealand_03"
##
##
## Fit:
## ----------
##
## Cons.Suf Cov.Suf PRI Cons.Suf(H)
## emp*bargain*OCCUP 0.909 0.194 0.721 0.865
## BARGAIN*UNI*STOCK 0.796 0.497 0.665 0.704
## emp*UNI*OCCUP*ma 0.919 0.171 0.611 0.894
## emp*occup*STOCK*ma 0.904 0.298 0.802 0.859
## UNI*occup*STOCK*ma 0.894 0.341 0.795 0.853
## Sol.Formula 0.799 0.705 0.691 0.716
## Theory 0.639 0.973 0.515 0.550
## T*E 0.811 0.705 0.707 0.726
## t*E 0.825 0.165 0.423 0.764
## T*e 0.651 0.547 0.419 0.592
## t*e 0.697 0.203 0.232 0.640
data(SCHLF)
head(SCHLF)
# Get pooled, within, and between consistencies for the intermediate solution:
## Consistencies:
## ---------------
## emp*bargain*OCCUP BARGAIN*UNI*STOCK emp*UNI*OCCUP*ma
## Pooled 0.909 0.796 0.919
## Between 1990 0.839 0.873 0.733
## Between 1995 0.903 0.727 0.953
## Between 1999 0.928 0.802 1.000
## Between 2003 0.951 0.818 1.000
## Within Australia 1.000 0.405 0.634
Function cluster() can be applied in a similar fashion to necessity relationships by setting
argument necessity to TRUE and entering the necessary condition to be diagnosed in argument
results. Additionally, we can also diagnose Boolean expressions by entering them into the results
argument.
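A necessity diagnostic might be sketched as follows (the arguments necessity and results are named in the text; the remaining argument names are assumptions):

```r
# Sketch: diagnose the necessity claim STOCK + MA for EXPORT across the
# clustered SCHLF data.
cluster(data = SCHLF, results = "STOCK + MA", outcome = "EXPORT",
        necessity = TRUE)
```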
Additional functions
Plotting sufficient terms and solutions, truth table rows, and necessity relations
Package SetMethods also includes a function pimplot() for plotting each sufficient term and the
solution formula (obtained by using the minimize() function in package QCA). The function can also
plot truth table rows against the outcome by using arguments incl.tt or ttrows as in the examples
below. Additionally, the function can plot results obtained from necessity analyses using an object
of class "sS" (obtained by using the superSubset() function in package QCA) by setting argument
necessity to TRUE.15
# Plot all truth table rows with a consistency higher than 0.9:
QCAradar
Another function included in the package is QCAradar(), which allows the visualization of
QCA results or simple Boolean expressions in the form of a radar chart16. In argument results, the
function accepts sufficient solutions obtained through function minimize() in package QCA, or
Boolean expressions involving more than three conditions, as in the second example below.
Figure 8a shows a radar chart for the second intermediate solution formula. The different sufficient
terms overlap on the radar in different shades. For example, in the first term
emp*bargain*OCCUP, condition EMP is absent and therefore set to 0 on its corner, condition
BARGAIN is absent and set to 0, and condition OCCUP is present and set to 1. Since the rest of the
conditions are not specified in this term, they are all left at -.
15 The plots resulting from these functions are not included in the paper due to length reasons.
16 See Maerz (2017) for an applied example of radar charts.
[Figure 8: (a) Radar chart of the solution formula (Cons.Suf: 0.799; Cov.Suf: 0.705; PRI: 0.691; Cons.Suf(H): 0.716) over conditions EMP, BARGAIN, UNI, OCCUP, STOCK, and MA. (b) Radar chart of the Boolean expression A*~B*C*~D.]
QCAradar(results = "A*~B*C*~D")
Figure 8b shows a radar chart for the Boolean expression "A∗ ∼B ∗ C ∗ ∼D". Conditions A and
C, which are present, are set to 1 on their respective corners, while conditions B and D, which are
absent, are set to 0. No conditions are left at - in this figure, as all conditions are specified.
Indirect calibration
SetMethods also includes a function for performing the indirect calibration procedure described by
Ragin (2008)17 . This procedure assumes that the cases included in the analysis have interval-scale raw
scores which can be initially sorted broadly into different levels of fuzzy set membership. Subsequently,
the raw scores are transformed into calibrated scores using a binomial or a beta regression. Assuming
that vector x contains the initial raw scores, while vector x_cal contains the rough grouping of those
values into set membership scores, function indirectCalibration() can produce a vector of fuzzy-set
scores by fitting x to x_cal using a binomial regression if binom is set to TRUE.
set.seed(4)
x <- runif(20, 0, 1)
# Find quantiles (the cut points below are chosen for illustration):
quant <- quantile(x, probs = c(.1, .3, .5, .7, .9))
# Theoretical calibration
x_cal <- NA
x_cal[x <= quant[1]] <- 0
x_cal[x > quant[1] & x <= quant[2]] <- .2
x_cal[x > quant[2] & x <= quant[3]] <- .4
x_cal[x > quant[3] & x <= quant[4]] <- .6
x_cal[x > quant[4] & x <= quant[5]] <- .8
x_cal[x > quant[5]] <- 1
x_cal
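The grouped scores can then be fitted to the raw values. The first (commented) line below uses the package function exactly as described in the text; the glm() lines are our own base-R sketch of the underlying idea — a (quasi)binomial regression of x_cal on x whose fitted values serve as the calibrated scores — and are an assumption about, not a reproduction of, the package internals. The vectors x and x_cal are taken from the snippet above.

```r
# Package call as described in the text (requires SetMethods):
# x_fuzzy <- indirectCalibration(x, x_cal, binom = TRUE)

# Base-R sketch of the idea (an assumption about the internals):
# regress the rough groupings on the raw scores and take the fitted
# values as calibrated fuzzy-set membership scores in (0, 1).
fit <- glm(x_cal ~ x, family = quasibinomial(logit))
x_fuzzy <- predict(fit, type = "response")
round(x_fuzzy, 2)
```

Because the fitted values come from a logistic link, they are guaranteed to lie between 0 and 1, as fuzzy-set membership scores require.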
Conclusions
In this article, we have presented the main functionalities of the R package SetMethods. It is true
that starting to perform QCA in R is more onerous than starting with point-and-click software.
Yet, the flexibility offered by R is also its strength, especially for a young method like QCA. As set
methods continue to develop, software implementations need to be updated and improved at a fast
rate. Package SetMethods is designed to do precisely this: to provide a tool for implementing new
ideas that enhance set-theoretic analyses for applied researchers.
Acknowledgements
We thank Juraj Medzihorsky and Mario Quaranta for their input into previous versions of the
SetMethods package. We also thank the participants of various ECPR Summer and Winter Schools in
Methods and Techniques whose questions and testing are continuously improving the package.
Bibliography
L. Cronqvist. Tosmana: Tool for Small-n Analysis, Version 1.3.2.0 [Computer Program]. Record ID:
8880, 2011. [p507]
A. Dusa. User Manual for the QCA(GUI) Package in R. Journal of Business Research, 60(5):576–586, 2007.
[p507]
R. García-Castro and M. A. Arino. A General Approach to Panel Data Set-Theoretic Research. Journal
of Advances in Management Sciences & Information Systems, 2:63–76, 2016. [p526]
G. Goertz and J. Mahoney. A Tale of Two Cultures: Contrasting Qualitative and Quantitative Paradigms.
Princeton University Press, Princeton, N.J, 2012. [p507]
S. Maerz. The Many Faces of Authoritarian Persistence. PhD thesis, Central European University, 2017.
[p530]
J. Medzihorsky, I.-E. Oana, M. Quaranta, and C. Q. Schneider. SetMethods: Functions for Set-Theoretic
Multi-Method Research and Advanced QCA, R Package, Version 2.1. https://cran.r-project.org/web/packages/SetMethods/index.html, 2016. [p507]
K. S. Mikkelsen. Fuzzy-Set Case Studies. Sociological Methods & Research, 46(3):422–455, 2017. ISSN
0049-1241. URL https://doi.org/10.1177/0049124115578032. [p508]
C. C. Ragin. The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. University
of California Press, Berkeley, 1987. ISBN 0520058348. [p521, 523]
C. C. Ragin. Redesigning Social Inquiry: Fuzzy Sets and Beyond. University of Chicago Press, Chicago,
2008. [p508, 521]
B. Rihoux and B. Lobe. The Case for Qualitative Comparative Analysis (QCA): Adding Leverage for
Thick Cross-Case Comparison. In D. Byrne and C. C. Ragin, editors, Sage Handbook Of Case-Based
Methods, pages 222–242. Sage, 2009. [p508]
B. Rihoux, P. Alamos, D. Bol, A. Marx, and I. Rezsohazy. From Niche to Mainstream Method? A
Comprehensive Mapping of QCA Applications in Journal Articles from 1984 to 2011. Political
Research Quarterly, 66(1):175–184, 2013. [p507]
I. Rohlfing and C. Q. Schneider. A Unifying Framework for Causal Analysis in Set-Theoretic Multi-Method
Research. Sociological Methods & Research, 47(1):37–63, 2018. URL https://doi.org/10.1177/0049124115626170. [p508]
F. Sager and E. Thomann. Multiple Streams in Member State Implementation: Politics, Problem
Construction and Policy Paths in Swiss Asylum Policy. Journal of Public Policy, 37(3):287–314, 2017.
ISSN 0143-814X. URL https://doi.org/10.1017/s0143814x1600009x. [p524]
C. Q. Schneider and S. Maerz. Legitimation, Cooptation, and Repression and the Survival of Electoral
Autocracies. Zeitschrift für Vergleichende Politikwissenschaft, 11:213–235, 2017. ISSN 1865-2646. URL
https://doi.org/10.1007/s12286-017-0332-2. [p524]
C. Q. Schneider and I. Rohlfing. Combining QCA and Process Tracing in Set-Theoretic Multi-Method
Research. Sociological Methods and Research, 42(4):559–597, 2013. [p508]
C. Q. Schneider and I. Rohlfing. Case Studies Nested in Fuzzy-Set QCA on Sufficiency: Formalizing
Case Selection and Causal Inference. Sociological Methods & Research, 45(3):526–568, 2016. ISSN
0049-1241. URL https://doi.org/10.1177/0049124114532446. [p508]
C. Q. Schneider and C. Wagemann. Set-Theoretic Methods for the Social Sciences: A Guide to Qualitative
Comparative Analysis. Cambridge University Press, Cambridge, 2012. [p522, 523]
Ioana-Elena Oana
Central European University (CEU)
Nador utca 9, 1051 Budapest
Hungary
[email protected]
Carsten Q. Schneider
Central European University (CEU)
Nador utca 9, 1051 Budapest
Hungary
[email protected]