Two Observations in the Application of Logarithm Theory and their Implications for Economic Modeling and Analysis

The contents of this paper apply to research in economics, statistics, the physical and life sciences, other social sciences, accounting and finance, business management, and mathematics, core and applied. First, I discuss the misconception, and the implications thereof, inherent in the conventional practice of entering interest rates as natural (untransformed) series in data analysis, most especially in regression models. The trends and variabilities of the transformed and untransformed interest rate series are shown to be similar, enhancing the likelihood of similar performance in regressions. By extension, therefore, the indicated conventional practice unnecessarily and unjustifiably precludes elasticity inference on the coefficients of interest rates, amounting to procedural inefficiency, since an independent computation of elasticity becomes the only available option. Percentages are not equivalent to percentage changes; only series already in growth terms, that is, percentage changes, should be spared log transformation. Secondly, the paper stresses the imperative of avoiding unwieldy and theory-incongruent expressions after preliminary data analysis, by flagging the point that regression models, particularly of the growth variety, should, as far as practicable, conform to the dictates of modern time series econometrics in the specification of final equations.


Introduction
Logarithmic transformation of data, a mathematical modification of the values of a variable, is often observed in the empirical literature for various reasons. These include stabilizing the variance of a variable over the observed range (overcoming the problem of heteroskedasticity) and improving the normality of a skewed distribution, especially in serious cases. However, data transformation should be applied with caution, and correctly, as it fundamentally alters the nature of the variable, making the interpretation of results somewhat more complex, for example, Osborne [1]. Also, a distribution becomes normal or close to normal when log-transformed only if it is naturally log-normal; if not, applying the log transformation may render the distribution more skewed than before transformation, as seen in some cases, for example, Feng, Wang, Lu, Chen, He, Lu, & Tu [2]. Besides, adopting a correct retransformation (back-transformation) procedure is crucial to ensuring the unbiasedness of the point estimate of the expectation of the dependent variable, for example, [3,4].
This paper highlights two observations in the application of logarithm theory and draws out their implications for economic modeling and analysis. It is organized as follows. Reviews of the related literature (theoretical, methodological and empirical), forming the materials and methods, occupy the next three sections in turn. Observations and implications are discussed in the succeeding section, and the final section concludes the paper.

Theoretical Literature
Osborne [1] explained that logarithmic transformation is not a single transformation but a class of transformations. A logarithm is the power to which a base must be raised to obtain the original number: if b^a = x, then log_b(x) = a. Using base 10 as an example, 1 is 10^0, 100 is 10^2, and 16 is approximately 10^1.204; therefore log10(100) = 2 and log10(16) ≈ 1.204. Besides base 10, a common option is the natural logarithm, where the constant e (2.7182818…) is the base; in this case, ln(100) ≈ 4.605. The base of a logarithm can be any positive number other than 1; hence, transformations can be done in an infinite number of ways. Also, the logarithm of zero or of a negative number is undefined, and values between 0 and 1 yield negative logs; a constant should therefore be added to move the minimum value of the distribution, preferably to 1.00. This is because a transformation has the greatest effect when the distribution is anchored at 1, and the effect declines as the anchor moves away from 1, for example, Osborne [5]. According to Osborne [5], transformation improves normality by compressing the right side of the distribution more than the left side; and since transformations improve normality by altering the relative distances between data points, if the right transformation is applied, all data points remain in the same relative order as before transformation. However, the curvilinear nature of the transformations makes the results more complex to interpret.
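The numerical claims above are easy to verify. The following Python sketch (illustrative values only) reproduces the base-10 and natural-log examples and the constant-shift device for series containing zero or negative values.

```python
import math

# Base-10 logarithms: log10(x) = a means that 10**a = x.
assert math.isclose(math.log10(100), 2.0)
print(round(math.log10(16), 3))   # 1.204

# Natural logarithm, with e = 2.7182818... as the base.
print(round(math.log(100), 3))    # 4.605

# Logs of zero or negative values are undefined, so shift the series
# by a constant that moves its minimum to 1 before transforming.
data = [-3.0, 0.0, 2.0, 7.0]
shift = 1.0 - min(data)           # 4.0 for this series
shifted = [x + shift for x in data]
logged = [math.log10(x) for x in shifted]

# The transformation preserves the relative order of the data points.
assert logged == sorted(logged)
```

The final assertion illustrates Osborne's point that a correctly applied transformation changes relative distances between observations but never their ordering.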
In obtaining the point estimate of a dependent variable given an explanatory variable, E(Y│X), there may be a need to use log transformation in the regression analysis. However, obtaining the expected value of the original series by retransformation using the exponential (anti-log) function may yield a biased estimate. Granger & Newbold [6] proved theoretically that an instantaneous nonlinear transformation of the conditional expectation of a variable does not, in general, give the conditional expectation of the transformed variable. For log-transformed series, they showed that obtaining the forecast of the original variable by taking the exponential of the forecast of the log-transformed variable, which they referred to as the naïve forecast, results in a biased forecast. The optimal forecast of the original variable is obtained only by multiplying the naïve forecast by a 'correction' factor containing the variance of the forecast error of the log-transformed variable. The factor corrects for the bias due to the change in the distribution of the transformed variable, for example, Lütkepohl & Xu [7]. The result of Granger & Newbold [6] was extended to VAR time series models by Ariño & Franses [8].
In the Granger & Newbold [6] model, a univariate series can be represented as Z = log(Y), where Y is the dependent variable, so that Y = exp(Z). Let Ẑ_t(h) denote the forecast of Z at time t + h made at time t. The naïve forecast of Y at time t + h is

Ŷ_t(h) = exp(Ẑ_t(h))   (1)

while the log series satisfies

Z_{t+h} = Ẑ_t(h) + e_{t,h}   (2)

where e_{t,h} is the h-step-ahead forecast error with mean zero and variance σ_z²(h). Although the forecast in equation (2) is optimal for the log series, equation (1) is not the optimal forecast of the original variable Y. The optimal forecast of Y, according to the theory, is given by equation (3):

Ŷ_t(h) = exp(Ẑ_t(h) + σ_z²(h)/2)   (3)

where the second term inside the exponential on the right-hand side (RHS) is the correction factor.
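Equations (1) to (3) can be illustrated with a small simulation. The sketch below is a minimal illustration, assuming a Gaussian AR(1) process for the log series with made-up parameter values; it is not part of the original exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Log series z_t follows a Gaussian AR(1); y_t = exp(z_t).
phi, sigma = 0.7, 0.3
n_reps, h = 50_000, 1
z_t = 1.0                                  # current value of the log series

# h-step forecast of z and its forecast-error variance (h = 1 here).
z_hat = phi * z_t
var_h = sigma**2                           # sigma_z^2(h) for h = 1

# Simulate many one-step-ahead realizations of y_{t+1}.
z_next = phi * z_t + sigma * rng.standard_normal(n_reps)
y_next = np.exp(z_next)

naive = np.exp(z_hat)                      # equation (1): biased downward
optimal = np.exp(z_hat + 0.5 * var_h)      # equation (3): with correction

# The optimal forecast tracks E(y_{t+1}); the naive one undershoots it.
print(y_next.mean(), naive, optimal)
```

Because the forecast error is Gaussian, E(Y_{t+h}) = exp(Ẑ_t(h) + σ_z²(h)/2), so the naïve forecast systematically undershoots the mean of the level series.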
Likewise, according to [3,9], given again that Z = log(Y) and Y = exp(Z), with Z normally distributed with mean μ_z and variance σ_z², Y follows a log-normal distribution. However, transforming μ_z, the mean of Z, into a mean for Y by taking the anti-log (exponential) function,

exp(μ_z)   (5)

does not give an accurate mean of Y; hence, (5) is not an unbiased mean of Y. The unbiased mean of the original variable Y is

E(Y) = exp(μ_z + σ_z²/2)   (6)

Teekens & Koerts [10] also proved the case for the multiplicative model. When a multiplicative model in a p-vector of explanatory variables x_p is log-transformed, the assumption that the error term has zero mean holds in the resulting linear model, but it does not hold for the original multiplicative model. If a constant term is introduced into the relationship, the constant term takes account of the mean of the error term; hence, in the linear model, only the estimator of the constant term is biased. If the constant terms of the linear and multiplicative models are denoted β_0 and δ_0 respectively, then exp(β̂_0) is a biased estimator of δ_0; the corrected estimator is

δ̂_0 = exp(β̂_0 − σ_0²/2 + σ²/2)

where σ_0² is the variance of β̂_0 and σ² is the variance of the error term in the linear model. Also, in the linear model, the least squares (LS) estimator of the expected value of the dependent variable, E(Y│x_p), is unbiased. However, this is not so for the multiplicative model. For this type of model, the bias in the LS estimator of the expectation of Y given some x_p-vector consists of two parts: the first, exp(−σ²/2), due to the bias of the LS estimator of the constant term in the linear model, and the second a "transformation bias".
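The lognormal mean correction can be checked numerically. The sketch below uses arbitrary values of μ_z and σ_z and compares the plain anti-log of the mean of Z with the corrected mean of Y.

```python
import numpy as np

rng = np.random.default_rng(1)

mu_z, sigma_z = 2.0, 0.5
z = rng.normal(mu_z, sigma_z, size=200_000)
y = np.exp(z)                                # Y is log-normally distributed

biased = np.exp(mu_z)                        # plain anti-log of the mean of Z
unbiased = np.exp(mu_z + 0.5 * sigma_z**2)   # corrected mean of Y

# The sample mean of y sits near the corrected value, not the anti-log.
print(y.mean(), biased, unbiased)
```

The gap between the two analytic values, a factor of exp(σ_z²/2), is exactly the correction factor discussed above.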

Methodological Literature
Some studies examined the effects of using different methodologies for estimating log-transformed data or for retransforming the estimates, while others sought to examine the effects of log transformation on empirical findings and forecasts by subjecting the results from level and log-transformed data analyses to various tests. For example, Manning [14] belonged to the latter category.
Manning & Mullahy [4] examined different approaches to log models in the literature, in terms of their precision and bias in the presence of common data problems such as skewness, heteroskedasticity and heavy tails (kurtosis). They adopted five different estimators for each data-generating process in the study. Two were classes of OLS: ordinary least squares (OLS) regression of ln(y) on x and an intercept, using a homoscedastic smearing factor to retransform the results and obtain the expected value of y given x, and the same regression using a heteroskedastic retransformation. The other three estimators, the gamma, Poisson-like and nonlinear least squares (NLS) versions, were variants of generalized linear models (GLM) for y with a log link function. The GLM models could provide estimates of ln(E(y|x)) and E(y|x) directly, without the need for retransformation.
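For the first of these estimators, the homoscedastic retransformation is commonly implemented as Duan's smearing factor, the sample mean of the exponentiated log-scale residuals. The sketch below is a simplified numpy-only illustration on simulated data (all parameter values are invented), not a reproduction of the authors' estimation code.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Hypothetical data: ln(y) is linear in x with homoscedastic errors.
x = rng.uniform(0, 2, n)
u = rng.normal(0, 0.6, n)
y = np.exp(0.5 + 1.2 * x + u)

# OLS of ln(y) on x and an intercept.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
resid = np.log(y) - X @ beta

# Homoscedastic smearing factor: mean of the exponentiated residuals.
smear = np.exp(resid).mean()

x0 = np.array([1.0, 1.0])
naive = np.exp(x0 @ beta)       # biased estimate of E(y | x = 1)
smeared = naive * smear         # retransformed (smeared) estimate

true_mean = np.exp(0.5 + 1.2 * 1.0 + 0.5 * 0.6**2)
print(naive, smeared, true_mean)
```

With homoscedastic log-scale errors the smearing factor approximates E(exp(u)), so the smeared prediction lands much closer to the true conditional mean than the plain anti-log of the fitted value.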
Monte Carlo simulation was applied to each model, evaluated on 1,000 random samples of 10,000 observations each. The empirical example involved estimating survey data with 27,598 observations on adults who had at least one doctor visit during the 12 months prior to the survey. The socio-economic characteristics considered were age, gender (male), years of completed schooling, race (white), marital status (married), and health status (excellent, very good, good). The statistics evaluated were the mean, standard error, and 95% interval of the simulation estimates of the slope β_1 of ln(E(y)) with respect to x. The mean provided evidence on the consistency of the estimator, while the standard error and 95% simulation interval indicated the precision of the estimate. Others were the mean squared error (MSE) and the absolute prediction error (APE) of the estimate of β_1, the absolute value of the estimate of β_1 minus its true value. A more precise estimator should be closer to the true value.
Kramer & Davies [11] undertook a Monte Carlo simulation involving 1,000 runs to test the robustness of the null hypothesis of the DF test when the data are improperly transformed. Dambolena, Eriksen & Kopcso [3] employed least squares linear regression to estimate the average price of a four-color full-page advertisement in a magazine against the circulation of the magazine, using a sample of 2,400 magazines. They used the log-log transformation in their regression analysis, and then compared retransformation results obtained with the anti-logarithm (exponential) against those from the 'correct' retransformation method incorporating the correction factor.
Ariño & Franses [8] used a bivariate cointegrated vector series of US real GNP and real domestic investment from 1947Q1 to 1988Q1 for an empirical examination of their theoretical findings. They used observations from 1947Q1 to 1980Q4 to estimate the VAR model for the log-transformed data, and the remaining quarters (29 horizons) to evaluate the naïve and unbiased forecasts of the two series. The forecast evaluation criteria were the mean error (ME), mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE).
Bårdsen & Lütkepohl [12] re-evaluated the empirical example of Ariño & Franses [8] using the RMSE as the forecast criterion. In addition to estimating the VAR model, they employed a Monte Carlo simulation with a sample size of 100 observations. They used a different strategy for computing the RMSE: starting with a sample of 100 observations, forecasts were computed recursively over a shorter horizon; the sample was then increased by one period and the estimation and forecasting redone, over an evaluation period of 65 quarters at the end of the sample.
Mayr & Ulbricht [14] also adopted a similar approach, a recursive scheme with VAR models, using data from the G4 countries (USA, Japan, Germany, and the UK) to forecast GDP in logs and levels, 1 to 8 quarters ahead. The level forecast, the naïve forecast and the optimal forecast based on Ariño & Franses [8] were all compared using the forecast errors. For their study, Lütkepohl & Xu [13] adopted different AR models to forecast inflation in 24 European Union (EU) member countries, and later also the USA, for the period 1996M1-2007M12, using seasonally unadjusted monthly CPI data. They compared the results from levels with those from log-transformed data (naïve and optimal forecasts). The (root) mean squared error was used as the measure of forecast precision.
Lütkepohl & Xu [7] explored the conditions under which logarithms can help improve the forecasts of economic variables. They compared forecasts of the original series with forecasts based on logs, using simulations centered on an AR(1) process. They also used three different predictors to forecast economic variables that are typically logged in economic analysis: data on the top nine stock indices from 1990M1 to 2007M12; private consumption expenditure for Australia, Belgium, Canada, Japan, Norway, the United Kingdom (UK) and the US from 1980Q1 to 2006Q4; and a set of gross domestic product (GDP) data for a range of countries. The forecasts compared were (i) an ARIMA forecast for the original variable without the log transformation; (ii) an ARIMA forecast based on the logs of the series, where the forecast of the original series was obtained by applying the exponential function to the forecast of the log series (the naïve approach); and (iii) the forecast for the log series obtained under (ii) converted to a forecast for the original series by the optimal procedure. The MSE was used as the measure of forecast precision.

Empirical Literature
In the study of Manning & Mullahy [4], three main data problems were selected: skewness, heteroskedasticity and heavy-tail (kurtosis).
On skewness: Since severe skewness in health utilization is the main reason for using a log transformation, the analysis started by providing results on the consistency and precision of the estimates of β_1, the slope of ln(E(y)) with respect to x. According to the authors, the skewness in y on the raw scale increased with the variance on the log scale for the log-normal models. In the absence of heteroskedasticity in x in the error term, the OLS model with homoscedastic retransformation and the three versions of the GLM gave consistent estimates of the slope. In terms of precision, however, the OLS estimator gave the most precise estimates, followed by the gamma, Poisson, and NLS versions of the GLM model, in that order from lower to higher variance. The efficiency loss from using the GLM models relative to the OLS model was high and grew as the variance on the log scale increased.
On heteroskedasticity: If the heteroskedasticity depended on x, then OLS with heteroskedastic retransformation applied to ln(y) led to consistent estimates of the impact of x on the expectation of y; the GLM models were also consistent. In terms of precision, however, OLS with heteroskedastic retransformation was the most precise, followed by the gamma, Poisson, and NLS models in that order. In the case of a heavy-tailed (k = 4 or 5) error distribution on the log scale, its presence did not cause consistency problems for the estimators either, but the estimates were much more imprecise for the three GLM models. The analysis showed that the efficiency losses of the GLM models (relative to the OLS-based estimator) were substantial and increased with the coefficient of kurtosis of the log-scale error.
Generally, each data generating mechanism had a more suitable choice of estimator. There were substantial gains from selecting the best estimator for a given data situation.
They concluded that the problems of skewness, kurtosis, and heteroskedasticity can lead to significant bias for some estimators (e.g., simple OLS on ln(y)) and significant losses in precision for others (e.g., GLM). OLS with homoscedastic retransformation seemed to overcome the various data problems, except when the log-scale error term is heteroskedastic, in which case it led to significantly biased results.
The GLM models, such as the NLS, Poisson-like, and gamma models, provide the alternative of directly estimating E(y|x) or ln(E(y|x)), without having to estimate the variance function of x for the log-scale error that is required for retransforming log-OLS results. However, when the true model is a heteroskedastic equation for ln(y), the GLM methods are less precise in dealing with some of these problems.
The findings of Kramer & Davies [11] showed that over-rejection was the more dangerous problem when the Dickey-Fuller (DF) test was applied to the levels while the random walk was in the logs, whereas under-rejection was the major issue when the true random walk was in the levels but the DF test was applied to the logs instead.
Dambolena, Eriksen & Kopcso [3] found that using the anti-log retransformation, rather than the correct retransformation with the correction factor, underestimated the average price of advertisement by six (6) percent. The findings of Ariño & Franses [8] indicated that, for both real GNP and real investment, the unbiased (optimal) forecast performed better than the naïve forecast, except on the mean error for real GNP; the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE) were smaller for the unbiased forecast in both series. The unbiased forecast also outperformed the naïve forecast over most of the horizons, especially the longer ones.
In re-evaluating the findings of Ariño & Franses [8], the overall Monte Carlo results of Bårdsen & Lütkepohl [12] indicated that there were no gains from optimal forecasts relative to naïve forecasts at any horizon, except when forecasting integrated variables at short horizons. They also computed forecasts of the investment-GNP ratio, which was stationary; these showed no, or at best only very small, gains from using optimal forecasts at any forecast horizon. In addition, except at short horizons, where the two integrated variables had very similar RMSEs, the naïve forecasts performed better than the optimal forecasts, and the relative performance of the naïve forecast improved with the forecast horizon. In their opinion, the gains reported by Ariño & Franses [8] from using the optimal forecast were a result of the specific way in which the RMSEs were computed.
Using a similar methodology but a different precision measure, the study by Mayr & Ulbricht [14] showed that retransforming the forecasts from the log-transformed series to levels by simply taking the exponential led to biased forecasts. However, when the optimal forecast was carried out as well, no appreciable difference was found between the naïve and optimal forecasts for any country; test results showed only that the differences were more pronounced at shorter horizons. The findings indicated that for the USA and Japan, one-quarter-ahead forecasts based on levels were outperformed by those based on log-transformed data (both naïve and optimal) in most cases, whereas for Germany and the UK the reverse was the case. They concluded that, in general, the automatic log transformation of the data was at best harmless.
Lütkepohl & Xu [13] found that in most cases sizable and statistically significant forecast improvements were obtained by modeling the CPI series in levels rather than in logs (naïve and optimal forecasts); even where forecasts based on logs were better, the improvements were usually small. Compared with the naïve forecast, the optimal forecast was only marginally better. Thus, the common practice of using log differences to approximate the inflation rate is not necessarily optimal, and they therefore recommended using levels as the default. They could not find a reliable in-sample method for discriminating between levels and logs in terms of out-of-sample forecast performance.
Lütkepohl & Xu's [7] simulation results indicated that log-transformed variables could produce better forecasts if the log transformation made the variance more homogeneous throughout the sample; forecasts using the levels were preferable if the level series had the more homogeneous variance. The comparison of forecasts from the three predictors was in line with the simulation: where the log transformation did not lead to a homogeneous variance, forecasts based on ARIMA models for the original series were better, and in that case forecasts based on the log series could damage forecast precision. Comparing the naïve with the optimal forecast, the MSEs of the two were close. They nevertheless recommended the log-transformed series as the default. Their efforts to find optimal criteria for selecting between forecasts in levels and in logs were also unsuccessful.

Observations
The first observation on the application of logarithm theory concerns the widespread practice of entering nominal interest rates as untransformed series in regression estimations. The motivation for this preference (though largely unsubstantiated) appears to be their usual expression in percent terms. By this practice, analysts seem to overlook the difference between percentages and percentage changes. Data series in percent terms are just like any other series and can be logged without violating any mathematical principle; it is series computed as percentage changes or growth rates, which are already equivalent to log differences, that should be spared another round of log transformation. The inherent fear of inadmissible negative values in a series has the antidote of adding a constant 'across the board'. In addition, whether in logs or natural levels, the trend of any class of interest rate is similar, implying that their variabilities are the same. Figure 1 plots the log-transformed and untransformed series of three interest rates for Nigeria: the monetary policy rate, the real interest rate and the three-month deposit rate.
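The distinction drawn here between a series quoted in percent and a series of percentage changes can be made concrete with a short sketch (the rate and GDP values are hypothetical):

```python
import numpy as np

# A nominal interest rate quoted in percent is an ordinary level series:
# it can be log-transformed like any other positive series.
rate = np.array([12.5, 13.0, 14.0, 13.5, 12.0])   # hypothetical policy rate, %
log_rate = np.log(rate)
# The direction of movement (and hence the trend shape) is preserved.
assert np.all(np.sign(np.diff(log_rate)) == np.sign(np.diff(rate)))

# A series already in growth terms is (approximately) a log difference,
# so a further log transformation would be double counting.
gdp = np.array([100.0, 102.0, 104.5, 106.0])
pct_change = np.diff(gdp) / gdp[:-1]              # percentage changes
log_diff = np.diff(np.log(gdp))                   # log differences
print(np.round(pct_change, 4))
print(np.round(log_diff, 4))                      # close to pct_change
```

The near-equality of the last two lines is why a growth-rate series should not be logged again, while a percent-quoted level series like the policy rate may be.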
The only conclusion that can be reached from this comparison is one of striking similarity in the trends of each pair of representations, suggesting that they are likely to perform similarly in regressions. Denying the relevant variables log transformation therefore imposes an inability to infer elasticity directly from the coefficients. By implication, the extra task of computing such elasticity, say, at the means, makes the overall procedure inefficient (a study such as RaziraAniza, Chin & Darmesah [15], attributing improved diagnostic estimates to alternative estimation methods for small samples, may in a way fit this relative-efficiency proposition).
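The elasticity point can be illustrated as follows: in a log-log regression the slope coefficient is itself the elasticity, whereas a regression in levels requires the extra step of evaluating the elasticity, say, at the sample means. The sketch uses simulated data with a true constant elasticity of 1.5; all values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Hypothetical constant-elasticity relationship: y = A * x**1.5 * e**u.
x = np.exp(rng.normal(0.0, 0.4, n))
y = np.exp(0.3 + 1.5 * np.log(x) + rng.normal(0, 0.2, n))

# Log-log regression: the slope is the elasticity, read off directly.
X = np.column_stack([np.ones(n), np.log(x)])
b_loglog, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

# Level regression: the slope must still be converted to an elasticity,
# e.g. evaluated at the sample means (an extra computational step).
Xlev = np.column_stack([np.ones(n), x])
b_lev, *_ = np.linalg.lstsq(Xlev, y, rcond=None)
elasticity_at_means = b_lev[1] * x.mean() / y.mean()

print(round(b_loglog[1], 2))          # close to the true elasticity of 1.5
print(round(elasticity_at_means, 2))  # only an approximation, extra work
```

The log-log slope recovers the elasticity in one step; the level specification both approximates it and demands the additional computation, which is the procedural inefficiency flagged above.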
A linked observation on the application of logarithm principles has to do with its use in growth analysis conducted within the 'general to specific' methodology. A basic understanding in growth modeling is that the dependent variable, usually gross domestic product (GDP) or another preferred measure of output or production, must be specified in growth terms (that is, as a percentage change or log difference). However, a dictate of the 'general to specific' methodology is that, irrespective of the outcome of unit root tests, all variables in final equations under an error correction parameterization should be specified in first differences. In other words, the methodology guarantees by default an automatic expression of the dependent variable in growth terms, as can be seen in the example of Adams [16]. Rigidly imposing the basic structure of the growth model at the preliminary stage would therefore result in double differencing (a growth rate of a growth rate, since the log difference is taken twice) and a seemingly unwieldy expression that violates a key requirement of the methodology. The implication of this submission is that, in the preliminary data analysis of growth models, any attempt to first-difference the dependent variable at the outset would ultimately produce an unscientific outcome that does not yield itself to any meaningful or insightful interpretation.
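The double-differencing point can be seen directly: if the dependent variable is first converted to growth terms (log differences) and the error correction parameterization then differences it again, what enters the final equation is the growth rate of a growth rate. The sketch uses invented GDP values.

```python
import numpy as np

gdp = np.array([100.0, 103.0, 107.0, 110.0, 115.0])  # hypothetical levels

# Growth terms: the log difference of the level series.
growth = np.diff(np.log(gdp))

# An error correction parameterization enters variables in first
# differences, so a dependent variable already in growth terms would be
# differenced again, yielding a growth rate of a growth rate.
double_diff = np.diff(growth)
assert np.allclose(double_diff, np.diff(np.log(gdp), n=2))

# Specifying the level series in logs and letting the methodology take
# the first difference yields the growth rate exactly once.
print(np.round(growth, 4))
```

Leaving the dependent variable in log levels and letting the first-differencing of the methodology generate the growth rate avoids the unintended second difference.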

Conclusions
This paper has been concerned with two observations in the application of logarithm theory and their implications for economic modeling and analysis. Log transformation undoubtedly confers elegance and convenience on models and analyses, but the somewhat general perception on the treatment of interest rates in model specification and estimation tends to introduce methodological and procedural inefficiency. The submission of this study is that there is no convincing justification for denying interest rates log transformation. The paper also showed that, given the inbuilt feature of the 'general to specific' methodology of always specifying all variables in the final equations of an error correction model in first differences, the insistence in growth models that the dependent variable enter in growth terms amounts to double differencing, which might not convey any valuable information.