
## Revista de Análisis Económico

*On-line version* ISSN 0718-8870

### RAE vol.28 no.2 Santiago oct. 2013

#### http://dx.doi.org/10.4067/S0718-88702013000200001

**CONDITIONAL PREDICTIVE ABILITY OF EXCHANGE RATES IN LONG RUN REGRESSIONS***

*PREDICTIBILIDAD DEL TIPO DE CAMBIO: UNA EVALUACION CONDICIONAL CON REGRESIONES DE LARGO PLAZO*

**PABLO M. PINCHEIRA****

* The author is grateful to Steven Durlauf for his permanent support, to Kenneth West for his brilliant comments, to Salvador Navarro for a fruitful discussion, and to Alan Spearot and Katie for their marvelous assistance. The views expressed in this paper do not necessarily represent those of the Central Bank of Chile or its Board members. All errors are the responsibility of the author.

** Central Bank of Chile, Address: Agustinas 1180, Santiago, Chile. Phone: +56 2 26702874. E-mail: ppinchei@bcentral.cl.

**Abstract**

*In this paper we evaluate exchange rate predictability using a framework developed by Giacomini and White (2006). This new framework tests for conditional predictive ability rather than unconditional predictive ability, which has been the standard approach. Using several shrinkage based forecasting methods, including new methods proposed here, we evaluate conditional predictability of five bilateral exchange rates at differing horizons. Our results indicate that for most currencies a random walk would not be the optimal forecasting method in a real time forecasting exercise, at least for some predictive horizons. We also show that our proposed shrinkage methods in general perform on par with Bayesian shrinkage and ridge regressions, and sometimes they even perform better.*

**Keywords:** *Exchange rate predictability, conditional predictive ability, Bayesian shrinkage, ridge regression, forecast evaluation.*

JEL Classification: *C22, C53, E37, F31.*

**Resumen**

*In this paper we evaluate the predictability of several exchange rates using a new approach developed by Giacomini and White (2006). The novelty of this approach is that it tests conditional predictability, not only unconditional predictability, as has been the rule until now. Using several shrinkage-based forecasting techniques, including some new methods presented in this paper, we evaluate the conditional predictability of five bilateral exchange rates against the US dollar at different forecasting horizons. Our results indicate that, for most currencies, the random walk would not be the best predictor in a real-time forecasting exercise, at least for certain horizons. We also show that, in general, the shrinkage methods proposed in this paper forecast as well as traditional shrinkage methods, and sometimes even better.*

**Palabras clave**: *Exchange rate predictability, conditional predictive evaluation, Bayesian shrinkage, ridge regression, predictive ability evaluation.*

Clasificación JEL: *C22, C53, E37, F31.*

**1. INTRODUCTION**

One of the most striking contributions in the exchange rate literature is the well known result of Meese and Rogoff (1983a,b). Using a variety of linear exchange rate models, these authors showed that no economic model was able to consistently display improved forecast accuracy over a simple random walk model. This result was shown to be robust across different exchange rates and predictive horizons.

Later on, improved methodological techniques showed some results that partially overturned this seminal work. Some evidence of predictability is shown in Chinn and Meese (1995), Mark (1995), MacDonald and Marsh (1997), McCracken and Sapp (2005), and Clark and West (2006). Nevertheless, this evidence is still weak and no conclusive result on exchange rate predictability has been shown.

These improved methodological techniques are partly based upon the development of econometric strategies for forecast comparison under general loss functions. West (1996) and Diebold and Mariano (1995) established the basic econometric framework under which out-of-sample tests of predictive ability are carried out.

An important observation needs to be made. When engaging in tests of predictive ability there are two major questions that might be addressed. One is a question about theory. Namely, tests of predictive ability are used as instruments to test an economic theory. The second question is an empirical question seeking to find a profitable forecasting method irrespective of any theoretical implications about the underlying data generating process. These two questions are not equivalent. In particular it is possible to show that even when the null hypothesis of no predictability is rejected, it is likely that a forecasting method based upon the rejected null model will outperform some forecasting methods based upon the alternative model.

This distinction is analyzed in depth in Giacomini and White (2006). They argue that the framework for out-of-sample predictive ability testing, developed by West (1996) and Diebold and Mariano (1995), might not be useful or appropriate for an applied forecaster trying to assess which of two competing forecasting methods will provide more accurate forecasts in the future. They propose an alternative approach that claims to be more relevant to economic forecasters.

The main distinction between the two approaches is twofold. First, Giacomini and White (2006) focus their analysis on conditional expectations of forecasts, while West (1996) and Diebold and Mariano (1995) focus on unconditional expectations. According to this distinction we will call the Giacomini and White (2006) approach the conditional approach, and that of West (1996) and Diebold and Mariano (1995) the unconditional approach. This difference is relevant for a forecaster who is highly interested in finding the best forecast for the next relevant period instead of a forecast that is the best on average. Second, the conditional approach is concerned with the whole "forecasting method" rather than just with the theoretical model used to generate forecasts, which is the main object of interest of the unconditional approach. The "forecasting method" is a much more general notion than the forecasting model because it includes the model, its estimation technique, the size of estimation and forecasting windows, and in general all the elements of the forecasting method that could possibly affect its future predictive performance.

The recent literature that has partially overturned the result of Meese and Rogoff (1983a,b) has built on the unconditional approach to draw inference about exchange rate predictability. Consequently, it is entirely possible that, even for those currencies, models and horizons for which predictability is found, forecasts from these models may be outperformed by a simple random walk strategy in a real-time forecasting exercise. Little or no research has addressed the evaluation of conditional predictive ability for exchange rates.

To fill this gap, in this paper we perform tests of conditional predictive ability for several exchange rates, using a variety of shrinkage based forecasting methods based upon models of interest parity. Besides this contribution, we also introduce a new shrinkage estimation approach aimed at improving forecast accuracy under quadratic loss.

The rest of the paper is organized as follows: Section 2 further develops the conditional predictive ability approach and its differences with the unconditional approach. The relevant econometric environment is presented in Section 3^{1}. Section 4 displays a description of the model and different estimation techniques that are used to build the different "forecasting methods". Empirical results are reported for five bilateral exchange rates in Section 5. Section 6 concludes.

**2. CONDITIONAL VERSUS UNCONDITIONAL TESTING FRAMEWORK**

To correctly illustrate the main differences between the conditional and unconditional approaches, we consider two competing parametric forecasting models for the conditional expectation of a scalar time series y_{t+1}. We denote the forecasts from these two models as y^{1}_{t+1}(β_{1}) and y^{2}_{t+1}(β_{2}), where β_{1} and β_{2} denote the population parameters of the two competing models. For a given loss function L = L(y_{t+1}, y^{i}_{t+1}(β_{i})), i = 1, 2, the unconditional approach suggests a test of equal forecast accuracy as follows

H_{0}: E[L(y_{t+1}, y^{1}_{t+1}(β_{1})) - L(y_{t+1}, y^{2}_{t+1}(β_{2}))] = 0,      (1)

whereas the conditional approach suggests the following testing strategy

H_{0}: E[L(y_{t+1}, y^{1}_{t+1}(β̂_{1,t})) - L(y_{t+1}, y^{2}_{t+1}(β̂_{2,t})) | Ψ_{t}] = 0,      (2)

where β̂_{1,t} and β̂_{2,t} denote parameter estimates of β_{1} and β_{2} with information up until time *t.* The implementation of the conditional approach relies on the fact that (2) is equivalent to

E[h_{t}(L(y_{t+1}, y^{1}_{t+1}(β̂_{1,t})) - L(y_{t+1}, y^{2}_{t+1}(β̂_{2,t})))] = 0

for all Ψ_{t}-measurable functions h_{t}.

Some of the differences between the two approaches are evident. First, the unconditional approach asks a question directly involving the true unknown parameters of the competing models, whereas the conditional approach asks a direct question involving only estimates of those parameters^{2}. When focusing on the true population parameters, the unconditional approach is implicitly testing the appropriateness of a model to correctly approximate the true data generating process. However, it is clear that even the true model might yield poor forecasts in the presence of parameter uncertainty, and clearly some "false" models have the chance to outperform the correct model in this context. In this regard, the use of known parameter estimates in the conditional approach might be more useful to determine which model will provide more accurate forecasts in a real time forecasting application. This is because in the conditional approach testing and future forecast accuracy both depend upon the same magnitudes (β̂_{1,t} and β̂_{2,t}), whereas in the unconditional approach testing focuses on the true population parameters but forecast accuracy is measured using β̂_{1,t} and β̂_{2,t}. Second, Giacomini and White (2006) argue that a null hypothesis established as (1) can be interpreted as saying that, on average, the two models provide equal forecast accuracy. This information might not be very useful for a forecaster who needs to know which model provides the best forecast for tomorrow given information available today. The conditional null hypothesis seems a better choice for this scenario.

Some other differences are subtle. In particular we want to emphasize that when the conditional null hypothesis is stated in terms of the estimates of the true population parameters, this null is implicitly imposing restrictions on those population parameters, on the size of the estimation window and also on the ridge or shrinkage factors that may be used for estimation^{3}. In other words, whereas the unconditional null only imposes a typically simple restriction on the parameters of the models, the conditional null imposes a restriction involving these parameters, the estimation sample size and the shrinkage factor used for estimation. This is important because the choice of estimation sample size and shrinkage factor may have an impact on the size of the conditional test^{4}. As there is no empirical guide about how to choose these two magnitudes, we recommend caution when interpreting the results using conditional tests.

Further differences are also worth mentioning. For instance, the unconditional approach relies on stationarity assumptions, whereas the conditional approach relies on a more general assumption of heterogeneity. Besides, the conditional approach applies to both nested and non-nested models. On the contrary, the unconditional approach, originally established only for non-nested models, needs significant adjustments when models are nested, McCracken (2004). Further differences are described in detail in Giacomini and White (2006).

**3. ECONOMETRIC ENVIRONMENT**

Consider a scalar time series process with general term denoted by y_{t} and the set of information available until time *t* denoted by Ψ_{t}. We want to build τ-step-ahead forecasts for this scalar time series based upon information available until time *t.* We have two different methods to build τ-step-ahead forecasts for the relevant time series y_{t}. These methods provide two different forecasts denoted by f_{Rt} and g_{Rt}. We will further assume that these forecasts are built from estimates of parametric models. The *R* subscript means that the forecasts are constructed using at most the last *R* sample observations available until time *t.* This strategy is well known as a rolling estimation window of maximum size *R.*

We will be using two models to build our forecasts:

Model 1: y_{t+1} = e_{t+1},
Model 2: y_{t+1} = X'_{t+1}β + e_{t+1},

where X'_{t+1} is a vector of exogenous random variables and e_{t+1} is a zero mean martingale difference series, meaning that E(e_{t+1} | Ψ_{t}) = 0. The optimal forecast under quadratic loss is 0 for Model 1 and X'_{t+1}β for Model 2. Therefore we propose the following forecasting methods

f_{Rt} = 0,   g_{Rt} = X'_{t+1}β̂_{i,R,t},

where β̂_{i,R,t} represents a rolling estimate of the unknown parameter *β* using rolling window size *R,* information available up to time *t* and estimation method *i.*

Forecast evaluation is carried out simulating an out-of-sample exercise. One has *T* + 1 observations of y_{t+1} and X'_{t+1}. The first *R* observations are used for the first estimation, so the first τ-step-ahead forecast is built at time *R* and compared with the realization y_{R+τ}. The second forecast is obtained using the last *R* observations available for estimation and is compared with the realization y_{R+1+τ}. We iterate like this until the (*T* + 2 - *τ* - *R*)-th forecast is built, again using the last *R* observations available for estimation; this forecast is compared with the realization y_{T+1}. We generate a total of P_{τ} forecasts, with P_{τ} satisfying *R* + (P_{τ} - 1) + *τ* = *T* + 1, so

P_{τ} = *T* + 2 - *τ* - *R.*

These forecasts are evaluated using a loss function depending on both the forecasts and the realization of the data. We will focus our analysis on a quadratic loss function. Then we test the following null hypothesis

H_{0}: E[L(y_{t+τ}, f_{Rt}) - L(y_{t+τ}, g_{Rt}) | Ψ_{t}] = 0.

The implementation of the conditional approach relies on the fact that the null hypothesis is equivalent to

E[h_{t} ΔL_{R,t+τ}] = 0, with ΔL_{R,t+τ} = L(y_{t+τ}, f_{Rt}) - L(y_{t+τ}, g_{Rt}),

for all Ψ_{t}-measurable functions h_{t}.

We first select our preferred choice of a *q* × 1 test function h_{t} to construct the relevant statistics that are described next.

**3.1. One-Step-Ahead Conditional Test**

When τ = 1, the sequence {h_{t} ΔL_{R,t+1}} is a martingale difference sequence if the null is true. Giacomini and White (2006) propose the following statistic for the test of equal conditional predictive ability

T_{P} = P Z̄'_{P} Ω̂^{-1}_{P} Z̄_{P}, with Z_{t} = h_{t} ΔL_{R,t+1}, Z̄_{P} the sample mean of Z_{t} over the out-of-sample period, and Ω̂_{P} = P^{-1} Σ_{t} Z_{t}Z'_{t}.

Giacomini and White (2006) give conditions under which the asymptotic distribution of T_{P} is Chi-square^{5}.
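As an illustration, the one-step statistic can be computed from the two out-of-sample loss series and a chosen test function. The sketch below uses our own function name and conventions (it returns the statistic and its chi-square degrees of freedom rather than a p-value); it is not code from the paper.

```python
import numpy as np

def gw_onestep_stat(loss_f, loss_g, h):
    """One-step-ahead Giacomini-White statistic for equal conditional
    predictive ability.  loss_f, loss_g: (n,) out-of-sample losses of the
    two methods; h: (n, q) matrix of Psi_t-measurable test functions.
    Under the null the statistic is asymptotically chi-square(q)."""
    dL = loss_f - loss_g                 # loss differential Delta L_{R,t+1}
    Z = h * dL[:, None]                  # Z_t = h_t * Delta L_{R,t+1}
    n, q = Z.shape
    Zbar = Z.mean(axis=0)
    Omega = Z.T @ Z / n                  # variance estimate (mean zero under H0)
    stat = n * Zbar @ np.linalg.solve(Omega, Zbar)
    return stat, q                       # compare stat with a chi2(q) critical value
```

With h_{t} = (1, x_{t})' the test uses q = 2 degrees of freedom, matching the conditional panels of the empirical section.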

**3.2. Multi-Step Conditional Test**

When τ > 1, Giacomini and White (2006) propose the following statistic for the test of equal conditional predictive ability

T_{P,τ} = P Z̄'_{P} Ω̂^{-1}_{P} Z̄_{P},

where Z_{t} = h_{t} ΔL_{R,t+τ}, Z̄_{P} is the sample mean of Z_{t} over the out-of-sample period, and Ω̂_{P} is a HAC estimate of the variance of Z_{t} computed according to Newey and West (1987).

Giacomini and White (2006) give conditions under which the asymptotic distribution of T_{P,τ} is Chi-square^{6}.
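For the multi-step case, the HAC variance can be estimated with Bartlett-kernel weights as in Newey and West (1987). A minimal sketch, with a function name of our own choosing:

```python
import numpy as np

def newey_west(Z, lags):
    """Newey-West (1987) HAC estimate of the long-run variance of the
    q-dimensional series Z_t = h_t * dL_{R,t+tau}, using Bartlett weights
    so the resulting matrix is positive semi-definite."""
    n, q = Z.shape
    Zc = Z - Z.mean(axis=0)
    S = Zc.T @ Zc / n                    # lag-0 covariance
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)       # Bartlett kernel weight
        G = Zc[j:].T @ Zc[:-j] / n       # j-th sample autocovariance
        S += w * (G + G.T)
    return S
```

A common default for the truncation lag at horizon τ is τ − 1, since under the null the loss differentials are at most (τ − 1)-dependent.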

**3.3. A Forecasting Decision Rule**

Assume that we are able to carry out a test of conditional predictive ability and we are also able to reject the null hypothesis. We then need to decide how to build a forecast for time *T* + 2. Rejection of the null hypothesis gives statistical evidence indicating that one forecasting method is more accurate than the other, and that the test function *h _{T}+_{1}* contains useful information for the determination of the best forecasting method. Giacomini and White (2006) propose the following decision rule:

1. Pick a threshold level *c.*

2. Regress ΔL_{R,t+τ} on *h_{t}* over the out-of-sample period to obtain the regression coefficient δ̂.

3. Pick the forecast *g* if δ̂'h_{T+1} > *c* and choose *f* otherwise.

Giacomini and White (2006) also propose an indicator to evaluate the number of times this decision rule would have chosen forecast method *g* over *f,* or the other way around, over the out-of-sample period.

We will implement this same indicator in our empirical application.
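The decision rule above can be sketched in a few lines. The function name, the least-squares implementation, and the zero default threshold are our own choices, not the paper's:

```python
import numpy as np

def choose_method(dL, h, h_next, c=0.0):
    """Giacomini-White style decision rule: regress the loss differential
    dL_t = L(f) - L(g) on the test function h_t over the out-of-sample
    period, then choose g for the next period when the predicted loss
    differential exceeds the threshold c (f's loss expected to be larger)."""
    delta, *_ = np.linalg.lstsq(h, dL, rcond=None)   # regression coefficient
    return "g" if h_next @ delta > c else "f"
```

Counting how often each branch fires over the evaluation sample reproduces the indicator mentioned above.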

**4. FORECASTING METHODS AND MODELS **

**4.1. Derivation of a Forecasting Shrinkage Estimator**

In this subsection we derive a new shrinkage estimator for the parameter of a linear regression model. Interestingly, this new shrinkage estimator provides a natural interpretation for the matrix of perturbations typically used in ridge regressions.

Let us assume that {y_{t+1}} is a sequence satisfying the following expression

y_{t+1} = X'_{t+1}β_{0} + e_{t+1},   E(e_{t+1} | Ψ_{t}) = 0,

where *β_{0}* is the true value of the parameter of the model and {Ψ_{t}} represents a filtration such that Ψ_{t} is the sigma-field generated by current and past Xs and es.

The traditional OLS and ridge estimators for *β_{0}* are given by the following expressions:

β̂_{OLS} = (Σ_{s} X_{s}X'_{s})^{-1} Σ_{s} X_{s}y_{s+1},   β̂_{ridge} = (Σ_{s} X_{s}X'_{s} + λI)^{-1} Σ_{s} X_{s}y_{s+1},

where λ > 0 is called the ridge factor. In principle the right or optimal ridge factor is unknown and needs to be estimated. We propose a natural approach to this problem: an approach that uses the context of out-of-sample model evaluation and has the same variance reduction advantages of traditional ridge regressions.
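For concreteness, the OLS and ridge estimators differ only in the perturbation added to X'X. A minimal numpy sketch (function name ours):

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimate (X'X + lam*I)^{-1} X'y; lam = 0 reduces to OLS.
    Larger lam shrinks the coefficient vector toward zero, trading bias
    for variance reduction."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)
```

The derivation that follows can be read as replacing the arbitrary lam*I perturbation with a matrix motivated by out-of-sample loss minimization.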

We will assume that the number of observations available is *T* + 1 = *P* + R, where *R* is the size of the estimation window and *P* is the size of the prediction window. In this case we want to find an estimate of β_{0} by solving the following problem:

To build our first forecasts we only use *R* observations of our sample, and we want to engage in a one-step-ahead prediction exercise that not only minimizes an in-sample estimate of the loss function, but also a combination of an in-sample estimate and an out-of-sample estimate of the loss function.

Notice that we could rewrite the problem (9) as

The expectation in (9) can be estimated as follows

Let us consider now N ≥ R. We have that

The terms inside the first expectation on the right hand side are a function of terms belonging to the information set Ψ* _{R}* so we could write the previous expression as

Furthermore the second term on the right hand side can be decomposed as follows:

Notice that

because for all > 0

therefore

so finally

Taking derivatives with respect to *β,* we finally have that *β* satisfies

We define our estimator β̂ by replacing the unknowns in the expressions above with sample estimates:

We will assume that the conditional expectation of the square of the perturbations is independent of the parameter β. Furthermore, we are interested in a shrinkage estimator to obtain benefits from variance reduction. This leads us to pick *β _{0}*= 0. Therefore, with appropriate assumptions of identification, we propose the following estimator:

This estimator is similar to the ridge estimator presented earlier. We need to be precise about two elements of this new estimator: the choice of *N* and the expectation formation. For the latter we propose to estimate a VAR(p) model on the regressors X_{t+1}. Usual model selection criteria may be followed. For the choice of *N* we propose three strategies:

1. *N* = R. The idea here is to impose the fact that when forecasting with rolling OLS regressions we are only imposing an in-sample minimization of the loss function. However the evaluation of the forecast accuracy involves comparing the forecast with the unknown predicted value. We try to overcome this situation with this new estimator.

2. *N* = *T* + 1. In this case we are imposing that our estimate will minimize a combination of the in-sample loss function and an estimate of the future loss function that will be used to evaluate forecast accuracy. The problem with this scheme is that now *N* depends on *P.* For many applications this is not a problem, but for the application we are interested in here, dependence on *P* may render a degenerate distribution for our loss function comparisons.

3. *N* = θR, where θ > 1 gives us an approximation of the importance that the forecaster gives to the out-of-sample minimization versus the in-sample minimization.

Finally, we want to present a particular case in which there is only one regressor and this is a constant.

**4.1.1. Example**

Consider the original model in which X'_{t+1} is a constant, and the shock e_{t+1} is *i.i.d.* and homoscedastic with variance σ². We could rewrite our model as

In this particular case we have that

so, our estimate is

If we choose * =* 0 then we get a shrunken *OLS* for an arbitrary N:

Furthermore, if we choose *N = T,* we have

Finally, if we choose *N* = θR - 1, where θ > 1, then we have

In our empirical application we will implement three of these variations. First, we will use (10) with a choice of *N* given by *N* = θR, where θ > 1. Second, we will implement the full shrinkage approach, and finally we will also implement the out-of-sample OLS variant. These are labeled Det OOS-OLS, Full and OOS-OLS in the next subsection.

**4.2. Forecasting Methods**

In this subsection we introduce the model and estimation strategies that are used for the implementation of the conditional tests. Our target is to build forecasts of the log difference of five US dollar bilateral exchange rates using monthly data. We analyze the cases of Canada, Japan, Switzerland, the UK and Chile^{7}. We want to evaluate the conditional predictive ability of six different forecasting methods based upon an uncovered interest parity model, and compare their predictive ability with a forecasting method based upon a random walk model. All of the six methods basically posit that exchange rate returns are predicted by two regressors: a constant and the one-month interest differential. The forecasts are constructed according to Mark (1995) using the following equation:

s_{t+τ} - s_{t} = X'_{t}β_{τ} + e_{t+τ},

and we will denote X'_{t} = (1, x_{t}), where s_{t} represents the nominal exchange rate at time *t* and x_{t} represents the interest rate differential.

The only difference between these six forecasting methods is the way the parameters are estimated. We will denote β̂_{i,t} as the estimate of *β_{τ}* using method *i* and the information available up until time *t.* A description of the different estimation methods follows next. For simplicity the analysis is written assuming τ = 1.

The six different estimation approaches that we use have the following two features in common. First, all of these estimation approaches are rolling with an estimation window of the same size, R. Second, all six estimation techniques can be summarized by the following general expression

β̂_{i,t} = (Σ_{s=t-R+1}^{t} X_{s}X'_{s} + M_{i})^{-1} Σ_{s=t-R+1}^{t} X_{s}y_{s+1},

where M_{i} is a real matrix that truly identifies each of the proposed methods. The choice of M_{i} is described next:

1. Rolling OLS (OLS). The choice of M_{1} is given by:

M_{1} = 0,

so the unknown parameter *β* is estimated via OLS using the last *R* available observations.

2. Rolling Bayesian Shrinkage (Bayesian): The choice of M_{2} is given by:

M_{2} = σ̂²V^{-1},

where σ̂² is the variance of the residuals of a regression between y_{t+1} and y_{t}, and *V* is the diagonal variance-covariance matrix of the prior distribution of the parameters, built from σ̂_{y}, the sample standard deviation of the dependent variable (exchange rate returns), and σ̂²_{x}, the sample variance of the interest rate differential variable. We also need to provide *a priori* values for the hyperparameters of the prior; following Litterman (1986) we set both hyperparameters equal to 0.2.

3. Deterministic Rolling Out-of-Sample OLS (Det OOS-OLS): The choice of M_{3} is given by:

with a choice of *μ* given by

4. Full Shrinkage Approach (Full): The choice of M_{4} is given by:

where the parameter *μ* - 1 > 0 is arbitrarily big. In our empirical application we set *μ* = 20. It is easy to see that this choice shrinks the parameter estimate almost entirely toward zero, so the resulting forecast is close to the random walk forecast.

5. Rolling Out-of-Sample OLS (OOS-OLS): The choice of M_{5} is given by:

where the expectation E(X_{s}X'_{s} | Ψ_{t}) is estimated by fitting a VAR(p) model over the vector X_{t+1}, *t* = R, ..., *R* + P_{τ} - 1. In our empirical application we use prior information about the process of the interest rate differential. Following Clark and West (2006) we fit an AR(1) model.

_{t}6. Rolling Ridge Regression (Ridge): The choice of *M _{6}* is given by:

where the ridge parameter λ > 0 is set to λ = 20, and *k* is the number of variables in the regression.

We will use these methods to evaluate conditional predictive ability of several bilateral exchange rates in the next section.
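Under our reading of the general expression above, all six methods can be generated by a single rolling loop parameterized by the matrix M_{i}. The sketch below uses hypothetical names and a simplified alignment in which the regressor row X[t] is available when forecasting y[t]:

```python
import numpy as np

def rolling_shrinkage_forecasts(y, X, R, M):
    """Rolling one-step-ahead forecasts from the general shrinkage family
    beta_hat = (sum X_s X_s' + M)^{-1} sum X_s y_s over the last R
    observations; M = 0 reproduces rolling OLS, and a large diagonal M
    approximates the full shrinkage (near random walk) method."""
    T, k = X.shape
    forecasts = []
    for t in range(R, T):
        Xw, yw = X[t - R:t], y[t - R:t]              # rolling window of size R
        beta = np.linalg.solve(Xw.T @ Xw + M, Xw.T @ yw)
        forecasts.append(X[t] @ beta)                # forecast of y[t]
    return np.array(forecasts)
```

Swapping in the appropriate M_{i} (zero, σ̂²V^{-1}, a ridge perturbation, and so on) yields each of the six competing forecasting methods from the same loop.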

**5. EMPIRICAL RESULTS**

In this section we present results for a number of tests of conditional predictive ability for five bilateral exchange rates. We analyze the cases of Canada, Japan, Switzerland, the UK and Chile. For these countries we take the series of 1-, 2-, 4-, 6-, 8-, 12- and 16-months ahead forecast errors to conduct tests of conditional predictive ability using a quadratic loss function. For Canada, Japan, Switzerland and the UK we set *R* = 120 and *P* = *T* + 1 - *R,* where *T* + 1 is the total number of observations. For Chile we set *R* = 36, *P* = 108. Our main goal is to evaluate whether any of our forecasting methods may outperform the random walk. For each country we run a total of 126 tests of the following form

where subscript *i* denotes the type of test function used in the analysis, *j* denotes the estimation technique used to obtain parameter estimates of the model and *k* denotes the 7 horizons used in the analysis according to the following description:


*j* = 1 means traditional OLS estimation, *j* = 2 means Bayesian shrinkage estimation, *j* = 3 means deterministic out-of-sample OLS, *j* = 4 means a full shrinkage procedure, *j* = 5 means Out-of-Sample OLS, and *j* = 6 means a ridge regression. All these methods are described in the previous section. Finally, *k* = 1, 2, 4, 6, 8, 12, 16, denotes the horizon of the analysis.

In case of rejection of the null hypothesis we also implement the decision rule in (7) to evaluate which method would have been selected. We also report the percentage gain (loss) in Mean Square Prediction Error (MSPE) for all the predictions.

Tables 1-5 in the appendix show p-values of the tests of conditional predictive ability for each of the five countries. Tables 6-10 in the appendix show the percentage gain (loss) in MSPE of each forecasting method with respect to the random walk. We analyze our results according to the following criteria. First, we simply want to know whether it is possible to beat the random walk in a real-time forecasting exercise under quadratic loss. Second, in case evidence of predictability is found, we want to know how predictability varies along different predictive horizons. Third, in case evidence of predictability is found, we would like to know whether the conditional tests perform better than the unconditional tests. Finally, we would like to know the size of the improvement in predictability should any evidence of predictability be found, and we would also like to identify the best estimation method to carry out a real time exchange rate forecasting exercise.

**5.1. Predictability**

Tables 1-5 show p-values for the null of equal predictive ability. A minus sign indicates that the decision rule in (7) suggests using the random walk as a forecasting method, while a plus sign indicates the decision rule points to the corresponding alternative approach. Tables 1-4 show that evidence of predictability is found for Canada, Chile, Japan and Switzerland, as some p-values are lower than the 10% significance level and (7) suggests using the corresponding alternative model to build forecasts. No predictability evidence is found for the UK, as every time the null of equal predictability is rejected, the decision rule in (7) suggests the use of the random walk over any other alternative approach considered.

**5.2. Predictability Horizons**

According to Tables 1-4, evidence of predictability is found for Canada at the 1-, 2- and 4-month ahead forecast horizon. For Chile and Switzerland, evidence of predictability is found at every single considered horizon. For Japan, evidence of predictability is only found at the 4-, 6- and 8-month ahead forecast horizon.

In contrast to Mark (1995), we do not see long-term predictability dominating short-term predictability. In fact, we see that every time there is long-term predictability (Chile and Switzerland) there is also short-term predictability. In this respect these results are consistent with those of McCracken and Sapp (2005) and those of Kilian (1999).

**5.3. Conditional and Unconditional Predictability**

Tables 1-5 report three panels of p-values. Each panel corresponds to a different testing function h_{it}, *i* = 1,..., 3. We remark here that the first testing function is h_{1t} = 1. In this case the conditional approach of Giacomini and White (2006) reduces to an unconditional approach in which the true unknown value of the parameters is replaced by sample estimates. We compare results from the first panel (h_{1t} = 1) with results obtained using more general testing functions. These results are shown in panels 2 and 3, labeled h2 and h3 in Tables 1-5. We compare whether rejection in panels 2 and 3 is encompassed by rejection in panel 1. In case rejections in panel 2 and 3 provide new information, we attempt to check for robustness of these rejections by comparing suggestions from the decision rule (7) and the sign of the difference in MSPE^{8}.

Table 1 shows that for Canada there is no new information from the truly conditional panels 2 and 3, as any rejection of the null of equal predictive ability in panels 2 and 3 is also found in the "unconditional" panel 1. Quite the contrary happens with Chile. For this country, the "unconditional" approach shows no rejection whatsoever. The "truly" conditional panels, however, show a number of rejections that are consistent with Table 7 in terms of choosing the forecasting model displaying the lowest MSPE. Out of 28 rejections, there is only 1 "mistake"^{9}. For Japan the conditional panels add two new "correct" rejections in favor of the random walk forecasting method. For Switzerland, the conditional panels add 11 new and "correct" rejections whereas for the UK the conditional panels add two new and "correct" rejections.

Overall, we see that conditioning seems to help in getting statistically significant information about conditional predictive ability. Notice that with a simple unconditional approach (panel 1), fewer rejections of the null of equal conditional predictive ability would have occurred.

**5.4. Predictability Size and Best Method**

Tables 6-10 show percentage gains (losses) in out-of-sample MSPE between the random walk and the corresponding alternative forecasting method. A negative value means that the random walk displays lower out-of-sample MSPE whereas a positive value means that the corresponding alternative forecasting method is more accurate in terms of quadratic loss.

Once the null of equal conditional predictive ability is rejected against the random walk, we care about the size of the out-of-sample MSPE. We measure this as the percentage gain in MSPE over the random walk. From Tables 6-10 we see that MSPE percentage gains range from 0% to 2.5% with an average gain of 0.76%. Even though gains are mild, statistical rejection suggests that they are also systematic.
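The percentage gain reported in Tables 6-10 can be computed directly from the two forecast-error series; a small sketch with names of our own choosing:

```python
import numpy as np

def mspe_gain_pct(e_rw, e_alt):
    """Percentage gain in MSPE of an alternative forecasting method over
    the random walk benchmark.  Positive values mean the alternative is
    more accurate under quadratic loss; negative values favor the random
    walk."""
    mspe_rw = np.mean(e_rw ** 2)
    mspe_alt = np.mean(e_alt ** 2)
    return 100.0 * (mspe_rw - mspe_alt) / mspe_rw
```

Under this convention, the 0% to 2.5% range quoted above corresponds to the alternative methods' squared errors being, on average, slightly smaller than the random walk's.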

In terms of choosing a forecasting method we consider two variables: power and predictive accuracy. In other words, a forecasting method is good if, in a testing environment, it yields a powerful test and, if used in a predictive exercise, its accuracy is high. From the first point of view we see that the full shrinkage approach is the most powerful method, as it accounts for 43% of all the rejections. The rest of the shrinkage procedures provide roughly the same number of rejections. Notice that via direct OLS estimation there is no rejection whatsoever. It is worth mentioning that, excluding the case of Chile, the only three methods providing rejections are the three proposed shrinkage methods: full shrinkage, Det OOS-OLS and OOS-OLS.

In terms of forecast accuracy, the full shrinkage approach performs poorly, as it provides percentage MSPE gains ranging only between 0% and 0.3%. Much higher MSPE gains can be obtained with Det OOS-OLS and OOS-OLS, which deliver improvements of up to 1.2% and 2.1%, respectively.

In summary, we confirm that shrinkage methods are more appropriate than simple OLS estimation for providing both more powerful tests of conditional predictive ability and more accurate forecasts. We also show that the three proposed shrinkage methods perform well, and sometimes much better than their considered competitors.

**6. DISCUSSION**

This paper evaluates exchange rate predictability using a new conditional framework developed by Giacomini and White (2006). Instead of testing an economic theory, this framework is more appropriate for an applied forecaster trying to assess which of two competing forecasting methods will provide more accurate forecasts in the future.

We use six different forecasting methods, based upon a model of interest parity, to test the null of equal conditional predictive ability when the benchmark forecasting method is a random walk. We consider seven different predictive horizons to perform a total of 126 tests for each bilateral exchange rate corresponding to Canada, Chile, Japan, the UK and Switzerland.

Our results indicate that all bilateral exchange rates, with the exception of the British pound, display statistically significant evidence of conditional predictability against the random walk, at least for some small group of predictive horizons. Furthermore, our results reveal that conditional predictive ability is more frequently found at shorter or medium horizons rather than at longer horizons.

This is interesting because it coincides with results reported by McCracken and Sapp (2005) and Kilian (1999). We emphasize again that our question and testing framework differ from those in previous papers: we are trying to detect exchange rate predictability from a forecaster's point of view, and we are not directly interested in testing economic theory.

We also provide evidence indicating that shrinkage methods are more appropriate than simple OLS estimation in providing both more powerful tests of conditional predictive ability and more accurate forecasts. Similarly, we show that the three proposed shrinkage methods perform well and sometimes much better than their considered competitors.

We have made a number of assumptions to obtain our results. For instance, all of our forecasting methods are based upon the simple interest parity model. We have also chosen priors, testing functions, ridge factors, loss functions, and forecasting and estimation window sizes, among other variables. A natural extension of this paper should relax some of these assumptions. The consideration of more models, more estimation techniques and the use of bootstrap critical values are left for future research.

**APPENDIX**

**A.1. Theoretical Appendix**

**Theorem 1** *(Conditional Predictive Accuracy Test). For forecast horizon τ = 1, maximum estimation window of size R < ∞ and q × 1 test function sequence {h_{t}}, suppose:*

*1. {y_{t}, X_{t}}, {h_{t}} are mixing sequences with φ of size -r / (2r - 1), r ≥ 1, or α of size -r / (r - 1), r > 1.*

*2. E|h_{i,t} ΔL_{t+1}|^{2(1+δ)} < C for some δ > 0, i = 1,...,q and for all t.*

*3. Ω_{n} ≡ n^{-1} Σ_{t} E[(h_{t} ΔL_{t+1})(h_{t} ΔL_{t+1})'] is uniformly positive definite.*

*Then under the null of equal conditional predictive accuracy,*

*n Z̄_{n}' Ω̂_{n}^{-1} Z̄_{n} →_{d} χ²_{q} as n → ∞, where Z̄_{n} = n^{-1} Σ_{t} h_{t} ΔL_{t+1} and Ω̂_{n} = n^{-1} Σ_{t} (h_{t} ΔL_{t+1})(h_{t} ΔL_{t+1})'.*

**Proof** See Giacomini and White (2006).
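A sketch of how the statistic in Theorem 1 could be computed in practice (Python; the function name, variable names and the example choice of test functions are ours, not the paper's):

```python
import numpy as np
from scipy import stats

def gw_test_onestep(delta_loss, h):
    """Giacomini-White conditional predictive ability test, one-step horizon.
    delta_loss: (n,) loss differentials between the two forecasting methods.
    h: (n, q) test function sequence (e.g. a constant plus lagged information).
    Returns the chi-square statistic n * zbar' inv(Omega) zbar and its p-value."""
    delta_loss = np.asarray(delta_loss, dtype=float)
    h = np.asarray(h, dtype=float)
    if h.ndim == 1:
        h = h[:, None]
    z = h * delta_loss[:, None]      # z_t = h_t * delta_L_{t+1}
    n, q = z.shape
    zbar = z.mean(axis=0)
    omega = z.T @ z / n              # no HAC correction needed at horizon 1
    stat = n * zbar @ np.linalg.solve(omega, zbar)
    pval = stats.chi2.sf(stat, df=q)
    return stat, pval

# Illustrative use with simulated loss differentials (not the paper's data)
rng = np.random.default_rng(1)
dl = rng.normal(0.0, 1.0, 500)
h = np.column_stack([np.ones(500), np.roll(dl, 1)])
print(gw_test_onestep(dl, h))
```

Under the null the statistic is asymptotically χ² with q degrees of freedom, so large values reject equal conditional predictive ability.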

**Theorem 2** *(Multi-Step Conditional Predictive Accuracy Test). For a given forecast horizon τ > 1, maximum estimation window of size R < ∞ and q × 1 test function sequence {h_{t}}, suppose:*

*1. {y_{t}, X_{t}}, {h_{t}} are mixing sequences with φ of size -r / (2r - 2), r ≥ 2, or α of size -r / (r - 2), r > 2.*

*2. E|h_{i,t} ΔL_{t+τ}|^{4(1+δ)} < C for some δ > 0, i = 1,...,q and for all t.*

*3. Ω̂_{n} is uniformly positive definite, where Ω̂_{n} is a HAC estimate of the variance of Z_{t} = h_{t} ΔL_{t+τ}.*

*Then under the null of equal conditional predictive accuracy,*

*n Z̄_{n}' Ω̂_{n}^{-1} Z̄_{n} →_{d} χ²_{q} as n → ∞, where Ω̂_{n} is given by (6).*

**Proof** See Giacomini and White (2006).
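At horizons beyond one step the sequence h_{t} ΔL_{t+τ} is serially correlated, so Ω̂ must be a HAC estimator such as Newey and West (1987). A minimal sketch with Bartlett-kernel weights (the function name is ours; the paper's equation (6) may differ in its kernel or lag choice):

```python
import numpy as np

def newey_west(z, lags):
    """Newey-West (1987) HAC estimate of the long-run variance of z_t.
    z: (n, q) array of serially correlated sequences; lags: truncation lag.
    Bartlett weights 1 - j/(lags+1) keep the estimate positive semi-definite."""
    z = np.asarray(z, dtype=float)
    n = z.shape[0]
    zc = z - z.mean(axis=0)
    omega = zc.T @ zc / n                # lag-0 term (sample covariance)
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)       # Bartlett weight
        gamma = zc[j:].T @ zc[:-j] / n   # j-th sample autocovariance
        omega += w * (gamma + gamma.T)   # add both lead and lag terms
    return omega
```

With `lags = 0` the estimator reduces to the plain sample covariance used in the one-step case, which is one way to see why Theorem 1 needs no HAC correction.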

1 Sections 2 and 3 are based upon Giacomini and White (2006).

2 We should point out that depending on the estimation method, a direct question involving only estimates of the population parameters defining the underlying data generating process will also impose some restrictions over these parameters. The main difference between the unconditional and conditional approach reduces to the fact that restrictions on the population parameters are in general different.

3 We are extremely grateful to Kenneth West for making this point.

4 In some cases there is a unique choice of sample size and shrinkage factor for which the conditional test is correctly sized.

5 These are summarized in the Appendix.

6 These are summarized in the Appendix.

7 The data from Canada, Japan, Switzerland, and the UK were generously provided by Todd Clark and correspond to the same database used in Clark and West (2006). Interest rates correspond to 1-month Eurocurrency deposit rates, taking an average of bid and ask rates at London close. Monthly time series are formed as the last daily rate of each month. Data was obtained from Global Insight's FACS database. We obtained the data for Chile from the International Financial Statistics. This time we use the discount rates as measures of interest rates.

8 We have three tests for each forecasting method and predictive horizon. We make a forecasting decision if the number of rejections in favor of one method outnumbers the number of rejections in favor of the competing method.

9 For two month ahead forecasts and OOS-OLS forecasting method, the use of the testing function h3 jointly with the decision rule suggests choosing the OOS-OLS forecasting method over the random walk, yet the MSPE of the random walk is lower than that of its competing forecasting method. We label this situation as a mistake. It only happens once.

**REFERENCES**

CHINN, M. and R.A. MEESE (1995). "Banking on Currency Forecasts: How Predictable is Change in Money?", *Journal of International Economics* 38, pp. 161-178.

CLARK, T. and K. WEST (2006). "Using Out-of-Sample Mean Squared Prediction Errors to Test the Martingale Difference Hypothesis", *Journal of Econometrics* 135, pp. 155-186.

DIEBOLD, F. and R. MARIANO (1995). "Comparing Predictive Accuracy", *Journal of Business and Economic Statistics* 13, pp. 253-263.

GIACOMINI, R. and H. WHITE (2006). "Tests of Conditional Predictive Ability", *Econometrica* 74, pp. 1545-1578.

KILIAN, L. (1999). "Exchange Rates and Monetary Fundamentals: What Do We Learn From Long-Horizon Regressions?", *Journal of Applied Econometrics* 14, pp. 491-510.

LITTERMAN, R.B. (1986). "Forecasting with Bayesian Vector Autoregressions - Five Years of Experience", *Journal of Business and Economic Statistics* 4, pp. 25-38.

MARK, N. (1995). "Exchange Rates and Fundamentals: Evidence on Long-Horizon Predictability", *American Economic Review* 85, pp. 201-218.

MacDONALD, R. and I. MARSH (1997). "On Fundamentals and Exchange Rates: A Casselian Perspective", *Review of Economics and Statistics* 79, pp. 655-664.

McCRACKEN, M. (2004). "Asymptotics for Out-of-sample Tests of Causality". Manuscript, University of Missouri.

McCRACKEN, M. and S. SAPP (2005). "Evaluating the Predictability of Exchange Rates Using Long-Horizon Regressions", *Journal of Money, Credit, and Banking* 37 (3), pp. 473-494.

MEESE, R. and K. ROGOFF (1983a). "Empirical Exchange Rate Models of the Seventies. Do They Fit Out-of-Sample?", *Journal of International Economics* 14, pp. 3-24.

MEESE, R. and K. ROGOFF (1983b). "The Out-of-Sample Failure of Empirical Exchange Rate Models?", in J. Frenkel (ed.), *Exchange Rates and International Macroeconomics*, pp. 67-105. Chicago: University of Chicago Press.

NEWEY, W. and K. WEST (1987). "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix", *Econometrica* 55, pp. 703-708.

WEST, K. (1996). "Asymptotic Inference About Predictive Ability", *Econometrica* 64, pp. 1067-1084.

WEST, K. (2006). "Forecast Evaluation", in G. Elliott, C.W.J. Granger and A. Timmermann (eds.), *Handbook of Economic Forecasting*, Volume 1, Amsterdam: Elsevier.