Backtesting Framework for Market Risk

This document from the Basle Committee on Banking Supervision presents a supervisory framework for incorporating backtesting into the internal models approach to market risk capital requirements. It describes backtesting as the comparison of daily profits and losses with model-generated risk measures in order to evaluate risk measurement systems. The framework calls for backtests that use one-day risk measures and one-day trading outcomes to check whether the risk measures achieve the intended 99% level of coverage. It acknowledges that changes in portfolio composition and fee income complicate comparisons with actual trading outcomes, and it urges banks to develop the capability to backtest against both hypothetical and actual trading outcomes.

SUPERVISORY FRAMEWORK FOR THE

USE OF "BACKTESTING" IN CONJUNCTION WITH THE

INTERNAL MODELS APPROACH TO MARKET RISK

CAPITAL REQUIREMENTS

Basle Committee on Banking Supervision

January 1996

I. Introduction
This document presents the framework developed by the Basle Committee on
Banking Supervision ("the Committee") for incorporating backtesting into the internal models
approach to market risk capital requirements. It represents an elaboration of Part B.4 (j) of the
Amendment to the Capital Accord, which is being released simultaneously.
Many banks that have adopted an internal model-based approach to market risk
measurement routinely compare daily profits and losses with model-generated risk measures
to gauge the quality and accuracy of their risk measurement systems. This process, known as
“backtesting”, has been found useful by many institutions as they have developed and
introduced their risk measurement models.
As a technique for evaluating the quality of a firm’s risk measurement model,
backtesting continues to evolve. New approaches to backtesting are still being developed and
discussed within the broader risk management community. At present, different banks
perform different types of backtesting comparisons, and the standards of interpretation also
differ somewhat across banks. Active efforts to improve and refine the methods currently in
use are underway, with the goal of distinguishing more sharply between accurate and
inaccurate risk models.
The essence of all backtesting efforts is the comparison of actual trading results with
model-generated risk measures. If this comparison is close enough, the backtest raises no
issues regarding the quality of the risk measurement model. In some cases, however, the
comparison uncovers sufficient differences that problems almost certainly must exist, either
with the model or with the assumptions of the backtest. In between these two cases is a grey
area where the test results are, on their own, inconclusive.
The Basle Committee believes that backtesting offers the best opportunity for
incorporating suitable incentives into the internal models approach in a manner that is
consistent and that will cover a variety of circumstances. Indeed, many of the public
comments on the April 1995 internal models proposal stressed the need to maintain strong
incentives for the continual improvement of banks’ internal risk measurement models. In
considering how to incorporate backtesting more closely into the internal models approach to
market risk capital requirements, the Committee has sought to reflect both the fact that the
industry has not yet settled on a single backtesting methodology and concerns over the
imperfect nature of the signal generated by backtesting.
The Committee believes that the framework outlined in this document strikes an
appropriate balance between recognition of the potential limitations of backtesting and the
need to put in place appropriate incentives. At the same time, the Committee recognises that
the techniques for risk measurement and backtesting are still evolving, and the Committee is
committed to incorporating important new developments in these areas into its framework.
The remainder of this document describes the backtesting framework that is to
accompany the internal models capital requirement. The aim of this framework is the
promotion of more rigorous approaches to backtesting and the supervisory interpretation of
backtesting results. The next section deals with the nature of the backtests themselves, while
the section that follows concerns the supervisory interpretation of the results and sets out the
agreed standards of the Committee in this regard.

II. Description of the backtesting framework

The backtesting framework developed by the Committee is based on that adopted by
many of the banks that use internal market risk measurement models. These backtesting
programs typically consist of a periodic comparison of the bank’s daily value-at-risk
measures with the subsequent daily profit or loss (“trading outcome”). The value-at-risk
measures are intended to be larger than all but a certain fraction of the trading outcomes,
where that fraction is determined by the confidence level of the value-at-risk measure.
Comparing the risk measures with the trading outcomes simply means that the bank counts
the number of times that the risk measures were larger than the trading outcome. The fraction
actually covered can then be compared with the intended level of coverage to gauge the
performance of the bank’s risk model. In some cases, this last step is relatively informal,
although there are a number of statistical tests that may also be applied.
The supervisory framework for backtesting in this document involves all of the steps
identified in the previous paragraph, and attempts to set out as consistent an interpretation of
each step as is feasible without imposing unnecessary burdens. Under the value-at-risk
framework, the risk measure is an estimate of the amount that could be lost on a set of
positions due to general market movements over a given holding period, measured using a
specified confidence level.
The backtests to be applied assess whether the observed percentage of outcomes
covered by the risk measure is consistent with a 99% level of confidence. That is, they attempt
to determine if a bank’s 99th percentile risk measures truly cover 99% of the firm’s trading
outcomes. While it can be argued that the extreme-value nature of the 99th percentile makes
it more difficult to estimate reliably than other, lower percentiles, the Committee has
concluded that it is important to align the test with the confidence level specified in the
Amendment to the Capital Accord.
An additional consideration in specifying the appropriate risk measures and trading
outcomes for backtesting arises because the value-at-risk approach to risk measurement is
generally based on the sensitivity of a static portfolio to instantaneous price shocks. That is,
end-of-day trading positions are input into the risk measurement model, which assesses the
possible change in the value of this static portfolio due to price and rate movements over the
assumed holding period.
While this is straightforward in theory, in practice it complicates the issue of
backtesting. For instance, it is often argued that value-at-risk measures cannot be compared
against actual trading outcomes, since the actual outcomes will inevitably be “contaminated”
by changes in portfolio composition during the holding period. According to this view, fee
income and the trading gains and losses resulting from changes in the composition of the
portfolio should not be included in the definition of the trading outcome, because they do
not relate to the risk inherent in the static portfolio that was assumed in
constructing the value-at-risk measure.
This argument is persuasive with regard to the use of value-at-risk measures based
on price shocks calibrated to longer holding periods. That is, comparing the ten-day, 99th
percentile risk measures from the internal models capital requirement with actual ten-day
trading outcomes would probably not be a meaningful exercise. In particular, in any given ten
day period, significant changes in portfolio composition relative to the initial positions are
common at major trading institutions. For this reason, the backtesting framework described
here involves the use of risk measures calibrated to a one-day holding period. Other than the
restrictions mentioned in this paper, the test would be based on how banks model risk
internally.
Given the use of one-day risk measures, it is appropriate to employ one-day trading
outcomes as the benchmark to use in the backtesting program. The same concerns about
“contamination” of the trading outcomes discussed above continue to be relevant, however,
even for one-day trading outcomes. That is, there is a concern that the overall one-day trading
outcome is not a suitable point of comparison, because it reflects the effects of intra-day
trading, possibly including fee income that is booked in connection with the sale of new
products.
On the one hand, intra-day trading will tend to increase the volatility of trading
outcomes, and may result in cases where the overall trading outcome exceeds the risk
measure. This event clearly does not imply a problem with the methods used to calculate the
risk measure; rather, it is simply outside the scope of what the value-at-risk method is
intended to capture. On the other hand, including fee income may similarly distort the
backtest, but in the other direction, since fee income often has annuity-like characteristics.
Since this fee income is not typically included in the calculation of the risk measure,
problems with the risk measurement model could be masked by including fee income in the
definition of the trading outcome used for backtesting purposes.
Some have argued that the actual trading outcomes experienced by the bank are the
most important and relevant figures for risk management purposes, and that the risk measures
should be benchmarked against this reality, even if the assumptions behind their calculations
are limited in this regard. Others have also argued that the issue of fee income can be
addressed sufficiently, albeit crudely, by simply removing the mean of the trading outcomes
from their time series before performing the backtests. A more sophisticated approach would
involve a detailed attribution of income by source, including fees, spreads, market
movements, and intra-day trading results.
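
The crude mean-removal adjustment mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function name `demean_outcomes` and the sample figures are hypothetical, and the framework does not prescribe any particular implementation.

```python
from statistics import fmean

def demean_outcomes(outcomes):
    """Subtract the sample mean from each daily trading outcome.

    Assumption: a steady, annuity-like fee stream shows up mainly as a
    positive average outcome, so removing the mean crudely strips it out.
    """
    mean = fmean(outcomes)
    return [x - mean for x in outcomes]

# Hypothetical daily outcomes (profits positive, losses negative).
daily = [1.5, -0.5, 2.0, -3.0, 1.0]
adjusted = demean_outcomes(daily)
```

The adjusted series has zero mean by construction; it does not separate fees from other sources of income, which is why the text calls the adjustment crude.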
To the extent that the backtesting program is viewed purely as a statistical test of the
integrity of the calculation of the value-at-risk measure, it is clearly most appropriate to
employ a definition of daily trading outcome that allows for an “uncontaminated” test. To
meet this standard, banks should develop the capability to perform backtests based on the
hypothetical changes in portfolio value that would occur were end-of-day positions to remain
unchanged.
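
A hypothetical trading outcome of this kind might be computed along the following lines. The sketch assumes, purely for illustration, that every position is a quantity of a linearly priced instrument; the instrument names, prices and the function `hypothetical_outcome` are invented for the example, and non-linear instruments would require full revaluation instead.

```python
def hypothetical_outcome(positions, prices_today, prices_tomorrow):
    """Value change of the end-of-day positions if held unchanged for one day.

    Assumption: each position is a quantity of a linearly priced
    instrument, so its value change is quantity * price change.
    """
    return sum(
        qty * (prices_tomorrow[name] - prices_today[name])
        for name, qty in positions.items()
    )

# Hypothetical end-of-day book and closing prices on two consecutive days.
positions = {"bond_a": 100.0, "fx_b": -50.0}
prices_today = {"bond_a": 99.0, "fx_b": 1.20}
prices_tomorrow = {"bond_a": 98.5, "fx_b": 1.25}
pnl = hypothetical_outcome(positions, prices_today, prices_tomorrow)
```

Because the positions are frozen, the result excludes intra-day trading and fee income by construction.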
Backtesting using actual daily profits and losses is also a useful exercise since it can
uncover cases where the risk measures are not accurately capturing trading volatility in spite
of being calculated with integrity.
For these reasons, the Committee urges banks to develop the capability to perform
backtests using both hypothetical and actual trading outcomes. Although national supervisors
may differ in the emphasis that they wish to place on these different approaches to
backtesting, it is clear that each approach has value. In combination, the two approaches are
likely to provide a strong understanding of the relation between calculated risk measures and
trading outcomes.
The next step in specifying the backtesting program concerns the nature of the
backtest itself, and the frequency with which it is to be performed. The framework adopted
by the Committee, which is also the most straightforward procedure for comparing the risk
measures with the trading outcomes, is simply to calculate the number of times that the
trading outcomes are not covered by the risk measures (“exceptions”). For example, over 200
trading days, a 99% daily risk measure should cover, on average, 198 of the 200 trading
outcomes, leaving two exceptions.
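
The counting of exceptions described above can be sketched as follows. The sign convention (risk measures positive, losses negative) and the name `count_exceptions` are assumptions made for the example; banks may record these series differently.

```python
def count_exceptions(var_measures, outcomes):
    """Count days on which the loss exceeded the value-at-risk measure.

    Assumptions: var_measures[i] is the one-day risk measure (a positive
    number) set at the close of day i, and outcomes[i] is the following
    day's trading outcome, with losses negative.
    """
    if len(var_measures) != len(outcomes):
        raise ValueError("series must have the same length")
    return sum(1 for var, pnl in zip(var_measures, outcomes) if -pnl > var)

# Hypothetical four-day sample: only the second day's loss exceeds its measure.
var_series = [10.0, 10.0, 12.0, 11.0]
pnl_series = [3.0, -11.5, -4.0, -11.0]
n_exceptions = count_exceptions(var_series, pnl_series)
```

In this sketch a loss exactly equal to the risk measure is treated as covered; that tie-breaking choice is an assumption, not something the framework specifies.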
With regard to the frequency of the backtest, the desire to base the backtest on as
many observations as possible must be balanced against the desire to perform the test on a
regular basis. The backtesting framework to be applied entails a formal testing and
accounting of exceptions on a quarterly basis using the most recent twelve months of data.
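
A quarterly accounting over the most recent twelve months amounts to counting exceptions in a trailing window. The following sketch assumes a window of 250 trading days as an approximation to twelve months; the function name and the sample history are hypothetical.

```python
def exceptions_in_window(exception_flags, window=250):
    """Number of exceptions over the most recent `window` trading days.

    Assumption: exception_flags is a chronological list of booleans, one
    per trading day, with True marking an exception.
    """
    return sum(exception_flags[-window:])

# Hypothetical history: 300 days with exceptions on three of them.
flags = [False] * 300
for day in (10, 120, 290):
    flags[day] = True
recent = exceptions_in_window(flags)
```

Only the exceptions on days 120 and 290 fall inside the last 250 days, so the earliest one drops out of the count.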
The implementation of the backtesting program should formally begin on the date
that the internal models capital requirement becomes effective, that is, by year-end 1997 at
the latest. This implies that the first formal accounting of exceptions under the backtesting
program would occur by year-end 1998. This of course does not preclude national
supervisors from requesting backtesting results prior to that date, and in particular does not
preclude their usage, at national discretion, as part of the internal model approval process.
Using the most recent twelve months of data yields approximately 250 daily
observations for the purposes of backtesting. The national supervisor will use the number of
exceptions (out of 250) generated by the bank’s model as the basis for a supervisory response.
In many cases, there will be no response. In other cases, the supervisor may initiate a
dialogue with the bank to determine if there is a problem with a bank’s model. In the most
serious cases, the supervisor may impose an increase in a bank’s capital requirement or
disallow use of the model.
The appeal of using the number of exceptions as the primary reference point in the
backtesting process is the simplicity and straightforwardness of this approach. From a
statistical point of view, using the number of exceptions as the basis for appraising a bank’s
model requires relatively few strong assumptions. In particular, the primary assumption is
that each day’s test (exception/no exception) is independent of the outcome of any of the
others.
The Committee of course recognises that tests of this type are limited in their power
to distinguish an accurate model from an inaccurate model. To a statistician, this means that it
is not possible to calibrate the test so that it correctly signals all the problematic models
without giving false signals of trouble at many others. This limitation has been a prominent
consideration in the design of the framework presented here, and should also be prominent
among the considerations of national supervisors in interpreting the results of a bank’s
backtesting program. However, the Committee does not view this limitation as a decisive
objection to the use of backtesting. Rather, conditioning supervisory standards on a clear
framework, though limited and imperfect, is seen as preferable to a purely judgmental
standard or one with no incentive features whatsoever.

III. Supervisory framework for the interpretation of backtesting results

(a) Description of three-zone approach

It is with the statistical limitations of backtesting in mind that the Committee is
introducing a framework for the supervisory interpretation of backtesting results that
encompasses a range of possible responses, depending on the strength of the signal generated
from the backtest. These responses are classified into three zones, distinguished by colours, that
form a hierarchy of responses. The green zone corresponds to backtesting results that do not
themselves suggest a problem with the quality or accuracy of a bank’s model. The yellow
zone encompasses results that do raise questions in this regard, but where such a conclusion is
not definitive. The red zone indicates a backtesting result that almost certainly indicates a
problem with a bank’s risk model.
The Committee has agreed to standards regarding the definitions of these zones in
respect of the number of exceptions generated in the backtesting program, and these are set
forth below. To place these definitions in proper perspective, however, it is useful to examine
the probabilities of obtaining various numbers of exceptions under different assumptions
about the accuracy of a bank’s risk measurement model.

(b) Statistical considerations in defining the zones

Three zones have been delineated and their boundaries chosen in order to balance
two types of statistical error: (1) the possibility that an accurate risk model would be
classified as inaccurate on the basis of its backtesting result, and (2) the possibility that an
inaccurate model would not be classified that way based on its backtesting result.
Table 1 reports the probabilities of obtaining a particular number of exceptions from
a sample of 250 independent observations under several assumptions about the actual
percentage of outcomes that the model captures (that is, these are binomial probabilities). For
example, the left-hand portion of Table 1 reports probabilities associated with an accurate
model (that is, a true coverage level of 99%). Under these assumptions, the column labelled
“exact” reports that exactly five exceptions can be expected in 6.7% of the samples.
The right-hand portion of Table 1 reports probabilities associated with several
possible inaccurate models, namely models whose true levels of coverage are 98%, 97%,
96%, and 95%, respectively. Thus, the column labelled “exact” under an assumed coverage
level of 97% shows that five exceptions would then be expected in 10.9% of the samples.
Table 1 also reports several important error probabilities. For the assumption that the
model covers 99% of outcomes (the desired level of coverage), the table reports the
probability that selecting a given number of exceptions as a threshold for rejecting the
accuracy of the model will result in an erroneous rejection of an accurate model (“type 1”
error). For example, if the threshold is set as low as one exception, then accurate models will
be rejected fully 91.9% of the time, because they will escape rejection only in the 8.1% of
cases where they generate zero exceptions. As the threshold number of exceptions is
increased, the probability of making this type of error declines.
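
The binomial figures quoted above can be reproduced with a short calculation. The sketch below assumes independent daily outcomes, as the text does; the function names are illustrative only.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k exceptions in n independent days."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def type1_error(threshold, n=250, p=0.01):
    """Chance that an accurate model shows `threshold` or more exceptions."""
    return 1.0 - sum(binom_pmf(k, n, p) for k in range(threshold))

exact_five = binom_pmf(5, 250, 0.01)   # about 0.067
reject_at_one = type1_error(1)         # about 0.919
```

Raising the threshold lowers `type1_error`, which is the decline described in the text.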
Under the assumptions that the model’s true level of coverage is not 99%, Table 1
reports the probability that selecting a given number of exceptions as a threshold for rejecting
the accuracy of the model will result in an erroneous acceptance of a model with the assumed
(inaccurate) level of coverage (“type 2” error). For example, if the model’s actual level of
coverage is 97%, and the threshold for rejection is set at seven or more exceptions, the table
indicates that this model would be erroneously accepted 37.5% of the time.
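
The corresponding "type 2" calculation can be sketched in the same way, again assuming independent daily outcomes. The function name `type2_error` is hypothetical.

```python
from math import comb

def type2_error(threshold, coverage, n=250):
    """Chance that a model with the given true coverage shows fewer than
    `threshold` exceptions and is therefore accepted."""
    p = 1.0 - coverage  # daily exception probability
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold))

accept_97 = type2_error(7, 0.97)   # about 0.375
```

Here acceptance means six or fewer exceptions, since rejection begins at seven.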
In interpreting the information in Table 1, it is also important to understand that
although the alternative models appear close to the desired standard in probability terms (97%
is close to 99%), the difference between these models in terms of the size of the risk measures
generated can be substantial. That is, a bank’s risk measure could be substantially less than
that of an accurate model and still cover 97% of the trading outcomes. For example, in the
case of normally distributed trading outcomes, the 97th percentile corresponds to 1.88
standard deviations, while the 99th percentile corresponds to 2.33 standard deviations, an
increase of nearly 25%. Thus, the supervisory desire to distinguish between models providing
99% coverage, and those providing say, 97% coverage, is a very real one.
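
The percentile comparison above can be checked with the standard normal quantile function. The sketch uses Python's `statistics.NormalDist`; the normality of trading outcomes is the text's own illustrative assumption.

```python
from statistics import NormalDist

z97 = NormalDist().inv_cdf(0.97)   # about 1.88 standard deviations
z99 = NormalDist().inv_cdf(0.99)   # about 2.33 standard deviations
increase = z99 / z97 - 1.0         # about 0.24, i.e. nearly 25%
```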

(c) Definition of the green, yellow, and red zones

The results in Table 1 also demonstrate some of the statistical limitations of
backtesting. In particular, there is no threshold number of exceptions that yields both a low
probability of erroneously rejecting an accurate model and a low probability of erroneously
accepting all of the relevant inaccurate models. It is for this reason that the Committee has
rejected an approach that contains only a single threshold.
Given these limitations, the Committee has classified outcomes into three categories.
In the first category, the test results are consistent with an accurate model, and the possibility
of erroneously accepting an inaccurate model is low (green zone). At the other extreme, the
test results are extremely unlikely to have resulted from an accurate model, and the
probability of erroneously rejecting an accurate model on this basis is remote (red zone). In
between these two cases, however, is a zone where the backtesting results could be consistent
with either accurate or inaccurate models, and the supervisor should encourage a bank to
present additional information about its model before taking action (yellow zone).
Table 2 sets out the Committee’s agreed boundaries for these zones and the
presumptive supervisory response for each backtesting outcome, based on a sample of 250
observations. For other sample sizes, the boundaries should be deduced by calculating the
binomial probabilities associated with true coverage of 99%, as in Table 1. The yellow zone
begins at the point such that the probability of obtaining that number or fewer exceptions
equals or exceeds 95%. Table 2 reports these cumulative probabilities for each number of
exceptions. For 250 observations, it can be seen that five or fewer exceptions will be obtained
95.88% of the time when the true level of coverage is 99%. Thus, the yellow zone begins at
five exceptions.
Similarly, the beginning of the red zone is defined as the point such that the
probability of obtaining that number or fewer exceptions equals or exceeds 99.99%. Table 2
shows that for a sample of 250 observations and a true coverage level of 99%, this occurs
with ten exceptions.
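
For sample sizes other than 250, the boundary rule described above might be applied as in the following sketch. It assumes independent observations and uses the 95% and 99.99% cumulative levels stated in the text; the function name `zone_boundaries` is illustrative.

```python
from math import comb

def zone_boundaries(n, coverage=0.99, yellow=0.95, red=0.9999):
    """Return (first yellow count, first red count) for n observations.

    Each boundary is the smallest number of exceptions whose cumulative
    binomial probability reaches the stated level.
    """
    p = 1.0 - coverage
    cumulative = 0.0
    yellow_start = red_start = None
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if yellow_start is None and cumulative >= yellow:
            yellow_start = k
        if cumulative >= red:
            red_start = k
            break
    return yellow_start, red_start

bounds_250 = zone_boundaries(250)   # (5, 10)
```

For 250 observations the sketch returns five and ten, matching the boundaries given in the text.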

(d) The green zone

The green zone needs little explanation. Since a model that truly provides 99%
coverage would be quite likely to produce as many as four exceptions in a sample of 250
outcomes, there is little reason for concern raised by backtesting results that fall in this range.
This is reinforced by the results in Table 1, which indicate that accepting outcomes in this
range leads to only a small chance of erroneously accepting an inaccurate model.
(e) The yellow zone

The range from five to nine exceptions constitutes the yellow zone. Outcomes in this
range are plausible for both accurate and inaccurate models, although Table 1 suggests that
they are generally more likely for inaccurate models than for accurate models. Moreover, the
results in Table 1 indicate that the presumption that the model is inaccurate should grow as
the number of exceptions increases in the range from five to nine.
The Committee has agreed that, within the yellow zone, the number of exceptions
should generally guide the size of potential supervisory increases in a firm’s capital
requirement. Table 2 sets out the Committee’s agreed guidelines for increases in the
multiplication factor applicable to the internal models capital requirement, resulting from
backtesting results in the yellow zone.
These guidelines help in maintaining the appropriate structure of incentives
applicable to the internal models approach. In particular, the potential supervisory penalty
increases with the number of exceptions. The results in Table 1 generally support the notion
that nine exceptions is a more troubling result than five exceptions, and these steps are meant
to reflect that.
These particular values reflect the general idea that the increase in the multiplication
factor should be sufficient to return the model to a 99th percentile standard. For example, five
exceptions in a sample of 250 implies only 98% coverage. Thus, the increase in the
multiplication factor should be sufficient to transform a model with 98% coverage into one
with 99% coverage. Needless to say, precise calculations of this sort require additional
statistical assumptions that are not likely to hold in all cases. For example, if the distribution
of trading outcomes is assumed to be normal, then the ratio of the 99th percentile to the 98th
percentile is approximately 1.14, and the increase needed in the multiplication factor is
therefore approximately 0.40 for a scaling factor of 3. If the actual distribution is not normal,
but instead has “fat tails”, then larger increases may be required to reach the 99th percentile
standard. The concern about fat tails was also an important factor in the choice of the specific
increments set out in Table 2.
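
The normal-distribution reasoning above can be sketched as follows. The sketch assumes that the observed exception rate equals the model's true exception rate and that trading outcomes are normally distributed; it illustrates the argument only and is not a reconstruction of how the increments in Table 2 were set. The function name `implied_increase` is hypothetical.

```python
from statistics import NormalDist

def implied_increase(exceptions, n=250, base_factor=3.0, target=0.99):
    """Increase in the multiplication factor implied by normality.

    Assumptions: the observed exception rate equals the model's true
    exception rate, and trading outcomes are normally distributed.
    Only meaningful for 0 < exceptions < n / 2.
    """
    observed_coverage = 1.0 - exceptions / n
    normal = NormalDist()
    ratio = normal.inv_cdf(target) / normal.inv_cdf(observed_coverage)
    return base_factor * (ratio - 1.0)

five = implied_increase(5)   # roughly 0.40
```

With unrounded quantiles the ratio for five exceptions is close to 1.13; rounding the quantiles to two decimals first gives the figure of approximately 1.14 quoted in the text. Either way the implied increase is roughly 0.40.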
It is important to stress, however, that these increases are not meant to be purely
automatic. The results in Table 1 indicate that results in the yellow zone do not always imply
an inaccurate model, and the Committee has no interest in penalising banks solely for bad
luck. Nevertheless, to keep the incentives aligned properly, backtesting results in the yellow
zone should generally be presumed to imply an increase in the multiplication factor unless the
bank can demonstrate that such an increase is not warranted.
In other words, the burden of proof in these situations should not be on the
supervisor to prove that a problem exists, but rather should be on the bank to prove that its
model is fundamentally sound. In such a situation, there are many different types of
additional information that might be relevant to an assessment of the bank’s model.
For example, it would then be particularly valuable to see the results of backtests
covering disaggregated subsets of the bank’s overall trading activities. Many banks that
engage in regular backtesting programs break up their overall trading portfolio into trading
units organised around risk factors or product categories. Disaggregating in this fashion could
allow the tracking of a problem that surfaced at the aggregate level back to its source at the
level of a specific trading unit or risk model.
Banks should also document all of the exceptions generated from their ongoing
backtesting program, including an explanation for the exception. This documentation is
important to determining an appropriate supervisory response to a backtesting result in the
yellow zone. Banks may also implement backtesting for confidence intervals other than the
99th percentile, or may perform other statistical tests not considered here. Naturally, this
information could also prove very helpful in assessing their model.
In practice, there are several possible explanations for a backtesting exception, some
of which go to the basic integrity of the model, some of which suggest an under-specified or
low-quality model, and some of which suggest either bad luck or poor intra-day trading
results. Classifying the exceptions generated by a bank’s model into these categories can be a
very useful exercise.

Basic integrity of the model

1) The bank’s systems simply are not capturing the risk of the positions
themselves (e.g., the positions of an overseas office are being reported
incorrectly).
2) Model volatilities and/or correlations were calculated incorrectly (e.g., the
computer is dividing by 250 when it should be dividing by 225).

Model's accuracy could be improved

3) The risk measurement model is not assessing the risk of some instruments with
sufficient precision (e.g., too few maturity buckets or an omitted spread).

Bad luck or markets moved in a fashion unanticipated by the model

4) Random chance (a very low probability event).

5) Markets moved by more than the model predicted was likely (i.e., volatility
was significantly higher than expected).
6) Markets did not move together as expected (i.e., correlations were significantly
different than what was assumed by the model).

Intra-day trading

7) There was a large (and money-losing) change in the bank’s positions or some
other income event between the end of the first day (when the risk estimate
was calculated) and the end of the second day (when trading results were
tabulated).

In general, problems relating to the basic integrity of the risk measurement model are
potentially the most serious. If there are exceptions attributed to this category for a particular
trading unit, the plus factor should apply. In addition, the model may be in need of substantial
review and/or adjustment, and the supervisor would be expected to take appropriate action to
ensure that this occurs.
The second category of problem (lack of model precision) is one that can be
expected to occur at least part of the time with most risk measurement models. No model can
hope to achieve infinite precision, and thus all models involve some amount of
approximation. If, however, a particular bank’s model appears more prone to this type of
problem than others, the supervisor should impose the plus factor and also consider what
other incentives are needed to spur improvements.
The third category of problems (markets moved in a fashion unanticipated by the
model) should also be expected to occur at least some of the time with value-at-risk models.
In particular, even an accurate model is not expected to cover 100% of trading outcomes.
Some exceptions are surely the random 1% that the model can be expected not to cover. In
other cases, the behaviour of the markets may shift so that previous estimates of volatility and
correlation are less appropriate. No value-at-risk model will be immune from this type of
problem; it is inherent in the reliance on past market behaviour as a means of gauging the risk
of future market movements.
Finally, depending on the definition of trading outcomes employed for the purpose
of backtesting, exceptions could also be generated by intra-day trading results or an unusual
event in trading income other than from positioning. Although exceptions for these reasons
would not necessarily suggest a problem with the bank’s value-at-risk model, they could still
be cause for supervisory concern and the imposition of the plus factor should be considered.
The extent to which a trading outcome exceeds the risk measure is another relevant
piece of information. All else equal, exceptions generated by trading outcomes far in excess
of the risk measure are a matter of greater concern than are outcomes only slightly larger than
the risk measure.
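
One simple way to record the extent of each exception is the ratio of the loss to the risk measure, as in the sketch below. The ratio is only one possible yardstick and is an assumption of this example, as are the sign convention (losses negative) and the function name.

```python
def exception_sizes(var_measures, outcomes):
    """For each exception, return (day index, loss / risk measure).

    Assumptions: risk measures are positive, losses are negative outcomes,
    and a ratio above 1 means the loss exceeded the risk measure.
    """
    return [
        (day, -pnl / var)
        for day, (var, pnl) in enumerate(zip(var_measures, outcomes))
        if -pnl > var
    ]

# Hypothetical sample: a marginal exception on day 0, a severe one on day 2.
sizes = exception_sizes([10.0, 10.0, 10.0], [-10.5, 2.0, -25.0])
```

A ratio only slightly above one corresponds to the less troubling case described above, while a ratio far above one corresponds to the more troubling case.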
In deciding whether or not to apply increases in a bank’s capital requirement, it is
envisioned that the supervisor could weigh these factors as well as others, including an
appraisal of the bank’s compliance with applicable qualitative standards of risk management.
Based on the additional information provided by the bank, the supervisor will decide on the
appropriate course of action.
In general, the imposition of a higher capital requirement for outcomes in the yellow
zone is an appropriate response when the supervisor believes the reason for being in the
yellow zone is a correctable problem in a bank’s model. This can be contrasted with the case
of an unexpected bout of high market volatility, which nearly all models may fail to predict.
While these episodes may be stressful, they do not necessarily indicate that a bank’s risk
model is in need of redesign. Finally, in the case of severe problems with the basic integrity
of the model, the supervisor should consider whether to disallow the use of the model for
capital purposes altogether.

(f) The red zone

Finally, in contrast to the yellow zone where the supervisor may exercise judgement
in interpreting the backtesting results, outcomes in the red zone (ten or more exceptions)
should generally lead to an automatic presumption that a problem exists with a bank’s model.
This is because it is extremely unlikely that an accurate model would independently generate
ten or more exceptions from a sample of 250 trading outcomes.
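As an illustration of how unlikely this is, the following minimal sketch computes the probability of ten or more exceptions under the assumptions used throughout this document: 250 independent observations and a true coverage level of 99%. The function name and its parameters are hypothetical and chosen only for this example.

```python
from math import comb


def prob_at_least(k_min: int, n: int = 250, p: float = 0.01) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p).

    Assumes the n trading outcomes are independent and that each one
    is an exception with probability p (1% for 99% coverage).
    """
    # Probability of observing fewer than k_min exceptions.
    below = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min))
    return 1.0 - below


red_zone_probability = prob_at_least(10)
print(f"P(10 or more exceptions | 99% coverage) = {red_zone_probability:.4%}")
```

Under these assumptions the result is a few hundredths of one percent, which is the complement of the 99.97% cumulative probability reported for nine exceptions in Table 2.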
In general, therefore, if a bank’s model falls into the red zone, the supervisor should
automatically increase the multiplication factor applicable to the bank’s model by one (from
three to four). Needless to say, the supervisor should also begin investigating the reasons why
the bank’s model produced such a large number of misses, and should require the bank to
begin work on improving its model immediately.
Although ten exceptions is a very high number for 250 observations, there will on
very rare occasions be a valid reason why an accurate model will produce so many
exceptions. In particular, when financial markets are subjected to a major regime shift, many
volatilities and correlations can be expected to shift as well, perhaps substantially. Unless a
bank is prepared to update its volatility and correlation estimates instantaneously, such a
regime shift could generate a number of exceptions in a short period of time. In essence,
however, these exceptions would all be occurring for the same reason, and therefore the
appropriate supervisory reaction might not be the same as if there were ten exceptions, but
each from a separate incident. For example, one possible supervisory response in this instance
would be to simply require the bank’s model to take account of the regime shift as quickly as
it can while maintaining the integrity of its procedures for updating the model.
It should be stressed, however, that the Committee believes that this exception
should be allowed only under the most extraordinary circumstances, and that it is committed
to an automatic and non-discretionary increase in a bank’s capital requirement for backtesting
results that fall into the red zone.

IV. Conclusion
The above framework is intended to set out a consistent approach for incorporating
backtesting into the internal models approach to market risk capital requirements. The goals
of this effort have been to build appropriate and necessary incentives into a framework that
relies heavily on the efforts of banks themselves to calculate the risks they face, to do so in a
way that respects the inherent limitations of the available tools, and to keep the burdens and
costs of the imposed procedures to a minimum.
The Basle Committee believes that the framework described above strikes the right
balance in this regard. Perhaps more importantly, however, the Committee believes that this
approach represents the first, and therefore critical, step toward a tighter integration of
supervisory guidelines with verifiable measures of bank performance.

Table 1

              Model is accurate   Model is inaccurate: possible alternative levels of coverage

Exceptions    Coverage = 99%      Exceptions    Coverage = 98%    Coverage = 97%    Coverage = 96%    Coverage = 95%
(out of 250)   exact   type 1     (out of 250)   exact   type 2    exact   type 2    exact   type 2    exact   type 2
 0              8.1%   100.0%      0              0.6%     0.0%     0.0%     0.0%     0.0%     0.0%     0.0%     0.0%
 1             20.5%    91.9%      1              3.3%     0.6%     0.4%     0.0%     0.0%     0.0%     0.0%     0.0%
 2             25.7%    71.4%      2              8.3%     3.9%     1.5%     0.4%     0.2%     0.0%     0.0%     0.0%
 3             21.5%    45.7%      3             14.0%    12.2%     3.8%     1.9%     0.7%     0.2%     0.1%     0.0%
 4             13.4%    24.2%      4             17.7%    26.2%     7.2%     5.7%     1.8%     0.9%     0.3%     0.1%
 5              6.7%    10.8%      5             17.7%    43.9%    10.9%    12.8%     3.6%     2.7%     0.9%     0.5%
 6              2.7%     4.1%      6             14.8%    61.6%    13.8%    23.7%     6.2%     6.3%     1.8%     1.3%
 7              1.0%     1.4%      7             10.5%    76.4%    14.9%    37.5%     9.0%    12.5%     3.4%     3.1%
 8              0.3%     0.4%      8              6.5%    86.9%    14.0%    52.4%    11.3%    21.5%     5.4%     6.5%
 9              0.1%     0.1%      9              3.6%    93.4%    11.6%    66.3%    12.7%    32.8%     7.6%    11.9%
10              0.0%     0.0%     10              1.8%    97.0%     8.6%    77.9%    12.8%    45.5%     9.6%    19.5%
11              0.0%     0.0%     11              0.8%    98.7%     5.8%    86.6%    11.6%    58.3%    11.1%    29.1%
12              0.0%     0.0%     12              0.3%    99.5%     3.6%    92.4%     9.6%    69.9%    11.6%    40.2%
13              0.0%     0.0%     13              0.1%    99.8%     2.0%    96.0%     7.3%    79.5%    11.2%    51.8%
14              0.0%     0.0%     14              0.0%    99.9%     1.1%    98.0%     5.2%    86.9%    10.0%    62.9%
15              0.0%     0.0%     15              0.0%   100.0%     0.5%    99.1%     3.4%    92.1%     8.2%    72.9%

Notes: The table reports both exact probabilities of obtaining a certain number of exceptions from a sample of 250 independent observations under several assumptions about the true
level of coverage, as well as type 1 or type 2 error probabilities derived from these exact probabilities.
The left-hand portion of the table pertains to the case where the model is accurate and its true level of coverage is 99%. Thus, the probability of any given observation being an
exception is 1% (100% - 99% = 1%). The column labelled "exact" reports the probability of obtaining exactly the number of exceptions shown under this assumption in a sample of 250
independent observations. The column labelled "type 1" reports the probability that using a given number of exceptions as the cut-off for rejecting a model will imply erroneous rejection of
an accurate model using a sample of 250 independent observations. For example, if the cut-off level is set at five or more exceptions, the type 1 column reports that the probability of falsely
rejecting an accurate model with 250 independent observations is 10.8%.
The right-hand portion of the table pertains to models that are inaccurate. In particular, the table concentrates on four specific inaccurate models, namely models whose true levels
of coverage are 98%, 97%, 96% and 95% respectively. For each inaccurate model, the "exact" column reports the probability of obtaining exactly the number of exceptions shown under
this assumption in a sample of 250 independent observations. The columns labelled "type 2" report the probability that using a given number of exceptions as the cut-off for rejecting a
model will imply erroneous acceptance of an inaccurate model with the assumed level of coverage using a sample of 250 independent observations. For example, if the cut-off level is set
at five or more exceptions, the type 2 column for an assumed coverage level of 97% reports that the probability of falsely accepting a model with only 97% coverage with 250 independent
observations is 12.8%.
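
The definitions in these notes can be sketched in a few lines of Python. This is an illustrative reconstruction based only on the binomial assumptions stated above (250 independent observations, a fixed exception probability equal to one minus the coverage level); the function names are hypothetical and are not taken from the Committee's own calculations.

```python
from math import comb

N_OBS = 250  # sample size assumed in Table 1


def exact(k: int, p: float, n: int = N_OBS) -> float:
    """Probability of exactly k exceptions when each day is an exception with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def type1_error(cutoff: int, n: int = N_OBS) -> float:
    """Probability of rejecting an accurate (99% coverage) model:
    P(number of exceptions >= cutoff) with p = 1%."""
    return 1.0 - sum(exact(k, 0.01, n) for k in range(cutoff))


def type2_error(cutoff: int, coverage: float, n: int = N_OBS) -> float:
    """Probability of accepting an inaccurate model with the given true coverage:
    P(number of exceptions < cutoff) with p = 1 - coverage."""
    return sum(exact(k, 1.0 - coverage, n) for k in range(cutoff))


# The two worked examples from the notes, with a cut-off of five or more exceptions.
print(f"type 1 error, cut-off 5:               {type1_error(5):.1%}")
print(f"type 2 error, cut-off 5, 97% coverage: {type2_error(5, 0.97):.1%}")
```

Looping the same functions over cut-offs 0 to 15 and coverage levels 98%, 97%, 96% and 95% should give the remaining columns of Table 1, up to rounding.
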

Table 2

Zone           Number of     Increase in       Cumulative
               exceptions    scaling factor    probability

Green Zone     0             0.00               8.11%
               1             0.00              28.58%
               2             0.00              54.32%
               3             0.00              75.81%
               4             0.00              89.22%
Yellow Zone    5             0.40              95.88%
               6             0.50              98.63%
               7             0.65              99.60%
               8             0.75              99.89%
               9             0.85              99.97%
Red Zone       10 or more    1.00              99.99%

Notes: The table defines the green, yellow and red zones that supervisors will use to assess backtesting results in
conjunction with the internal models approach to market risk capital requirements. The boundaries shown in the table are
based on a sample of 250 observations. For other sample sizes, the yellow zone begins at the point where the cumulative
probability equals or exceeds 95%, and the red zone begins at the point where the cumulative probability equals or
exceeds 99.99%.
The cumulative probability is simply the probability of obtaining a given number or fewer exceptions in a
sample of 250 observations when the true coverage level is 99%. For example, the cumulative probability shown for four
exceptions is the probability of obtaining between zero and four exceptions.
Note that these cumulative probabilities and the type 1 error probabilities reported in Table 1 do not sum to one
because the cumulative probability for a given number of exceptions includes the possibility of obtaining exactly that
number of exceptions, as does the type 1 error probability. Thus, the sum of these two probabilities exceeds one by the
amount of the probability of obtaining exactly that number of exceptions.
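
The rule stated in the notes for other sample sizes can be sketched as follows. This is a minimal illustration and assumes independent observations and a true coverage level of 99%. The function name `zone_boundaries` and its parameters are hypothetical; the 95% and 99.99% thresholds are the ones given in the notes.

```python
from math import comb


def zone_boundaries(n: int, p: float = 0.01,
                    yellow_level: float = 0.95,
                    red_level: float = 0.9999) -> tuple[int, int]:
    """Return (first yellow-zone count, first red-zone count) for n observations.

    A zone begins at the smallest number of exceptions whose cumulative
    binomial probability equals or exceeds the corresponding level.
    """
    cumulative = 0.0
    yellow_start = None
    red_start = n  # fallback; normally overwritten in the loop below
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if yellow_start is None and cumulative >= yellow_level:
            yellow_start = k
        if cumulative >= red_level:
            red_start = k
            break
    if yellow_start is None:
        yellow_start = red_start
    return yellow_start, red_start


yellow, red = zone_boundaries(250)
print(f"250 observations: yellow zone from {yellow}, red zone from {red} exceptions")
```

For 250 observations this gives the boundaries shown in Table 2 (yellow from five exceptions, red from ten). Calling `zone_boundaries(n)` with another sample size applies the same thresholds to that sample.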
