ArmaModelling {fSeries}R Documentation

Integrated ARMA Time Series Modelling

Description

A collection and description of simple to use functions to model univariate autoregressive moving average time series processes, including time series simulation, parameter estimation, diagnostic analysis of the fit, and predictions of future values.

The functions are:

armaSim Simulates an artificial ARMA time series process,
armaFit Fits the parameters of an ARMA time series process,
predict Forecasts an ARMA time series process,
print S3 Print Method,
plot S3 Plot Method,
summary S3 Summary Method.

The function armaFit allows for the following formula arguments:

x ~ ar() autoregressive time series processes,
x ~ arma() autoregressive moving average processes,
x ~ arima() autoregressive integrated moving average processes, and
x ~ fracdiff() fractionally integrated ARMA processes.

For the first and third selection x ~ ar() or x ~ arima() the function uses the AR and ARIMA modelling as implemented in R's ts package, for the second selection x~arma() the function uses the ARMA modelling implemented in R's contributed tseries package, and for the last selection x~fracdiff() the function uses fractional ARIMA modelling from R's contributed fracdiff package. Note that AR, MA, and ARMA processes can all be modelled by the same algorithm specifying the formula x~arima(p,d,q) in the proper way, i.e. setting d=0 and choosing the orders of p and q as zero in agreement with the desired model specification.

Alternatively, one can use the functions from R's "stats" package:

arima.sim Simulates from an ARIMA time series model,
ar, arima, arima0 Fits an AR, ARIMA model to a univariate time series,
predict.ARIMA Forecast from models fitted by "arima",
tsdiag Plots time-series diagnostics.

In addition there are two functions for the analysis of the true ARMA process. The first investigates the characteristic polynomial and the second the true autocorrelation function:

armaRoots Roots of the characteristic ARMA polynomial,
armaTrueacf True autocorrelation function of an ARMA process.

Usage

armaSim(model = list(ar = c(0.5, -0.5), d = 0, ma = 0.1), n = 100, innov = 
        NULL, n.start = 100, start.innov = NULL, rand.gen = rnorm, ...) 

armaFit(formula = x ~ arima(2, 0, 1), 
        method = c("CSS-ML", "ML", "CSS", "yw", "burg", "ols", "mle"), 
        include.mean = TRUE, fixed = NULL, fracdiff.M = 100, fracdiff.h = -1, 
        title = "", description = "", ...)

## S3 method for class 'fARMA':
predict(object, n.ahead = 10, n.back = 50, conf = c(80,95), 
        doplot = TRUE, doprint = TRUE, ...)
        
## S3 method for class 'fARMA':
print(x, ...)
## S3 method for class 'fARMA':
plot(x, gof.lag, reference.grid = TRUE, col = "steelblue4", ...)
## S3 method for class 'fARMA':
summary(object, doplot = TRUE, ...)

armaRoots(coefficients, n.plot = 400, digits = 4, ...) 
armaTrueacf(model, lag.max = 20, type = "correlation", doplot = TRUE) 

Arguments

coefficients [armaRoots] -
a numeric vector with the coefficients of the characterisitic polynomial.
col, reference.grid [plot] -
a character string to set the plot color, by default "steelblue4"; and a logical flag, by default TRUE, to draw a reference grid on the plot or not.
description, title two character strings, assigning a title and a brief description to an "fARMA" object.
digits [armaRoots] -
output precision, an integer value.
doplot [armaRoots] -
a logical. Should a plot be displayed?
[predict][summary] -
is used by the predict and summary methods. By default, this value is set to TRUE and thus the function calls generate beside written also graphical printout. Additional arguments required by underlying functions have to be passed through the dots argument.
doprint [predict] -
should intermediate results be printed? Arguments used by the predict method.
fixed [armaFit] -
is an optional numeric vector of the same length as the total number of parameters. If supplied, only NA entries in fixed will be varied. In this way subset ARMA processes can be modeled. ARIMA modelling supports this option. Thus for estimating parameters of subset ARMA and AR models the most easiest way is to specify them by the formulas x~ARIMA(p, 0, q) and x~ARIMA(p, 0, 0), respectively.
formula [armaFit] -
a formula specifying the general structure of the ARMA form. Can have one of the forms x ~ ar(q), x ~ arma(p, q), x ~ arima(p, d, q), or x ~ fracdiff(p, q). x is the response variable and must appear in the formula expression. In the first case R's function ar from the ts package will be used to estimate the parameters, in the second case R's function arma from the tseries package will be used, in the third case R's function arima from the ts package will be used, and in the last case R's function fracdiff from the fracdiff package will be used. The state space modelling based arima function allows also to fit ARMA models using arima(p, d=0, q), and AR models using arima(q, d=0, q=0), or pure MA models using arima(q=0, d=0, p). (Exogenous variables are also allowed and can be passed through the ... argument.)
fracdiff.M, fracdiff.h two numeric parameter values for the "fracdiff" estimator. M is the number of terms in the likelihood approximation. h is the size of the finite difference interval for numerical derivatives. By default, or if negative, h=min(0.1,eps.5*(1+ abs(cllf))), where clff:=log.max.likelihood, as returned, and eps.5:=sqrt(.Machine$double.neg.eps), typically 1.05e-8.
gof.lag [print][plot][summary][predict] -
the maximum number of lags for a goodness-of-fit test.
include.mean [armaFit] -
Should the ARIMA model include a mean term? The default is TRUE, note that for differenced series a mean would not affect the fit nor predictions.
innov [armaSim] -
is a univariate time series or vector of innovations to produce the series. If not provided, innov will be generated using the random number generator specified by rand.gen. Missing values are not allowed. By default the normal random number generator will be used.
lag.max [armaTrueacf] -
maximum number of lags at which to calculate the acf or pacf, an integer value by default 20.
method [armaFit] -
denotes the method used to fit the model by a character string. method must be one of the strings in the default argument. For ARIMA models a valid selection is one of "CSS-ML", "ML", "CSS", for ARMA models only one valid selection is possible "CSS", for AR models a valid selection is one of "yw", "burg", "ols", "mle", and for FRACDIFF models only one valid selection is possible "FRACDIFF". If no method argument specified the following default settings are used: "CSS-ML" for ARIMA, "FRACDIFF" for FRACDIFF, "CSS" for ARMA, and "yw" for AR modelling.
Additional arguments required by the optimization algorithm have to be passed through the dots argument.
model [armaSim] -
a list with one (AR), two (ARMA) or three (ARIMA, FRACDIFF) elements . ar is a numeric vector giving the AR coefficients, d is an integer value giving the degree of differencing, and ma is a numeric vector giving the MA coefficients. Thus the order of the time series process is (F)ARIMA(p, d, q) with p=length(ar) and q=length(ma). d is a positive integer for ARIMA models and a numeric value for FRACDIFF models. By default an ARIMA(2, 0, 1) model with coefficients ar=c(0.5, -0.5) and ma=0.1 will be generated.
[armaTrueacf] -
a specification of the ARMA model with two elements: model$ar is the vector of the AR coefficients, and model$ma is the vector of the MA coefficients.
n [armaSim] -
is the length of the series to be simulated (optional if innov is provided). The default value is 100.
n.ahead, n.back, conf [print][plot][summary][predict] -
are presetted arguments for the predict method. n.ahead determines how far ahead forecasts should be evaluated together with errors on the confidence intervals given by the argument conf. If a forecast plot is desired, which is the default and expressed by doplot=TRUE, then n.back sets the number of time steps back displayed in the graph.
n.plot [armaRoots] -
the number of data point to plot the unit circle; an integer value.
n.start [armaSim] -
gives the number of start-up values discarded when simulating non-stationary models. The start-up innovations will be generated by rand.gen if start.innov is not provided.
object [summary][predict] -
is an object of class fARMA returned by the fitting function armaFit and serves as input for the summary, and predict methods. Some methods allow for additional arguments.
rand.gen [armaSim] -
is the function which is called to generate the innovations. Usually, rand.gen will be a random number generator. Additional arguments required by the random number generator rand.gen, usually the location, scale and/or shape parameter of the underlying distribution function, have to be passed through the dots argument.
start.innov [armaSim] -
is a univariate time series or vector of innovations to be used as start up values. Missing values are not allowed.
type [armaTrueacf] -
a character string, "correlation" to compute the true autocorrelation function, "partial" to compute the true partial autocorrelation function, or "both" if both functions are desired. The start of one of the strings will suffice.
x [print][plot] -
is an object of class fARMA returned by the fitting function armaFit and serves as input for the predict, print, print.summary, and plot methods. Some methods allow for additional arguments.
... additional arguments to be passed.

Details

AR - Auto-Regressive Modelling:

The argument x ~ar() calls the underlying functions ar.yw, ar.burg, code{ar.ols} or ar.mle. For definiteness, the AR models are defined through

(x[t] - m) = a[1]*(x[t-1] - m) + ... + a[p]*(x[t-p] - m) + e[t]

Order selection can be achieved through the comparison of AIC values for different model specifications. However this may be problematic, as of the methods here only ar.mle performs true maximum likelihood estimation. The AIC is computed as if the variance estimate were the MLE, omitting the determinant term from the likelihood. Note that this is not the same as the Gaussian likelihood evaluated at the estimated parameter values. With method="yw" the variance matrix of the innovations is computed from the fitted coefficients and the autocovariance of x. method="burg" allows two methods to estimate the innovations variance and hence AIC. Method 1 is to use the update given by the Levinson-Durbin recursion (Brockwell and Davis, 1991), and follows S-PLUS. Method 2 is the mean of the sum of squares of the forward and backward prediction errors (as in Brockwell and Davis, 1996). Percival and Walden (1998) discuss both.
[ts:ar]

ARMA - Auto-Regressive Moving-Average Modelling:

The argument x ~ arma() calls the underlying functions arma from R's tseries package. The following parametrization is used for the ARMA(p,q) model:

y[t] = a[0] + a[1]y[t-1] + ... + a[p]y[t-p] + b[1]e[t-1] + ... + b[q]e[t-q] + e[t],

where a[0] is set to zero if no mean is included. By using the argument fitted, it is possible to fit a parsimonious subset model by setting arbitrary a[i] and b[i] to zero. The optimization uses optim to minimize the conditional sum-of-squared errors, {method="CSS"}. The gradient is computed, if it is needed, by a finite-difference approximation. Default initialization is done by fitting a pure high-order AR model (see ar.ols). The estimated residuals are then used for computing a least squares estimator of the full ARMA model.
[tseries:arma]

ARIMA - Integrated ARMA Modelling:

The argument x ~ arima() calls the underlying function arima from R's ts package. For definiteness, the AR models are defined through

x[t] = a[1]x[t-1] + ... + a[p]x[t-p] + e[t] + b[1]e[t-1] + ... + b[q]e[t-q]

and so the MA coefficients differ in sign from those of S-PLUS. Further, if include.mean is TRUE, this formula applies to x-m rather than x. For ARIMA models with differencing, the differenced series follows a zero-mean ARMA model.
The variance matrix of the estimates is found from the Hessian of the log-likelihood, and so may only be a rough guide.
Optimization is done by optim. It will work best if the columns in xreg are roughly scaled to zero mean and unit variance, but does attempt to estimate suitable scalings. The exact likelihood is computed via a state-space representation of the ARIMA process, and the innovations and their variance found by a Kalman filter. The initialization of the differenced ARMA process uses stationarity. For a differenced process the non-stationary components are given a diffuse prior (controlled by kappa). Observations which are still controlled by the diffuse prior (determined by having a Kalman gain of at least 1e4) are excluded from the likelihood calculations. (This gives comparable results to arima0 in the absence of missing values, when the observations excluded are precisely those dropped by the differencing.)
Missing values are allowed, and are handled exactly in method "ML".
If transform.pars is true, the optimization is done using an alternative parametrization which is a variation on that suggested by Jones (1980) and ensures that the model is stationary. For an AR(p) model the parametrization is via the inverse tanh of the partial autocorrelations: the same procedure is applied (separately) to the AR and seasonal AR terms. The MA terms are not constrained to be invertible during optimization, but they will be converted to invertible form after optimization if transform.pars is true.
Conditional sum-of-squares is provided mainly for expositional purposes. This computes the sum of squares of the fitted innovations from observation n.cond on, (where n.cond is at least the maximum lag of an AR term), treating all earlier innovations to be zero. Argument n.cond can be used to allow comparability between different fits. The ``part log-likelihood'' is the first term, half the log of the estimated mean square. Missing values are allowed, but will cause many of the innovations to be missing.
When regressors are specified, they are orthogonalized prior to fitting unless any of the coefficients is fixed. It can be helpful to roughly scale the regressors to zero mean and unit variance.
Note from arima: The functions parse their arguments to the original time series functions available in R's time series library ts.
The results are likely to be different from S-PLUS's arima.mle, which computes a conditional likelihood and does not include a mean in the model. Further, the convention used by arima.mle reverses the signs of the MA coefficients.
[ts:arima]

FRACDIFF Modelling:

The argument x ~ fracdiff() calls the underlying functions from R's fracdiff package. The estimator calculates the maximum likelihood estimators of the parameters of a fractionally-differenced ARIMA (p,d,q) model, together (if possible) with their estimated covariance and correlation matrices and standard errors, as well as the value of the maximized likelihood. The likelihood is approximated using the fast and accurate method of Haslett and Raftery (1989). Note, the number of AR and MA coefficients should not be too large (say < 10) to avoid degeneracy in the model.
The optimization is carried out in two levels: an outer univariate unimodal optimization in d over the interval [0,.5], and an inner nonlinear least-squares optimization in the AR and MA parameters to minimize white noise variance.
[fracdiff:fracdiff]

Note

There is nothing really new in this package. The benefit you will get with this collection is, that all functions have a common argument list with a formula to specify the model and presetted arguments for the specification of the algorithmic method. For users who have already modeled GARCH processes with SPlus, this approach will be quite natural.

On the other hand, the user can still use the underlying functions from the ts, tseries, or fracdiff packages. No function from these packages is masked or overwritten.

The output of the predict, print and summary methods have all the same style of format with some additional algorithm specific printing. This makes it easier to interpret the results obtained from different algorithms implemented in different functions.

Author(s)

M. Plummer and B.D. Ripley for ar functions and code,
A. Trapletti for arma functions and code,
B.D. Ripley for arima and ARMAacf functions and code,
C. Fraley and F. Leisch for fracdiff functions and code, and
Diethelm Wuertz for the Rmetrics R-port.

References

Brockwell, P.J. and Davis, R.A. (1996); Introduction to Time Series and Forecasting, Second Edition, Springer, New York.

Durbin, J. and Koopman, S.J. (2001); Time Series Analysis by State Space Methods, Oxford University Press.

Gardner, G, Harvey, A.C., Phillips, G.D.A. (1980); Algorithm AS154. An algorithm for exact maximum likelihood estimation of autoregressive-moving average models by means of Kalman filtering, Applied Statistics, 29, 311–322.

Hannan E.J. and Rissanen J. (1982); Recursive Estimation of Mixed Autoregressive-Moving Average Order. Biometrika 69, 81–94.

Harvey, A.C. (1993); Time Series Models, 2nd Edition, Harvester Wheatsheaf, Sections 3.3 and 4.4.

Jones, R.H. (1980); Maximum likelihood fitting of ARMA models to time series with missing observations, Technometrics, 20, 389–395.

Percival, D.P. and Walden, A.T. (1998); Spectral Analysis for Physical Applications. Cambridge University Press.

Whittle, P. (1963); On the fitting of multivariate autoregressions and the approximate canonical factorization of a spectral matrix. Biometrika 40, 129–134.

Haslett J. and Raftery A.E. (1989); Space-time Modelling with Long-memory Dependence: Assessing Ireland's Wind Power Resource (with Discussion), Applied Statistics 38, 1–50.

See Also

GarchModelling, EquationsModelling, RandomInnovations, RegressionModelling.

Examples

## armaSim/armaFit -
   xmpSeries("\nStart: Simulate an ARMA(2, 1) process > ")
   x = armaSim(model = list(ar = c(0.5, -0.5), ma = 0.1), n = 1000)
   Continue = xmpSeries("Press any key > ")
   # Estimate the parameters:
   fit = armaFit(x ~ arma(2, 1))
   print(fit)
   Continue = xmpSeries("Press any key > ") 
   # Diagnostic Analysis:
   par(mfrow = c(3, 2), cex = 0.7)
   summary(fit)
   Continue = xmpSeries("Press any key > ")
   # 5 Steps ahead Forecasts:
   predict(fit, 5)
     
## armaFit -
   xmpSeries("\nNext: Estimate the parameters > ")
   # Now use the arima fitting method with d=0:
   fit = armaFit(x ~ arima(2, 0, 1))
   print(fit)
   Continue = xmpSeries("Press any key > ") 
   # Diagnostic Analysis:
   par(mfrow = c(3, 2), cex = 0.7)
   summary(fit)
   Continue = xmpSeries("Press any key > ") 
   # 5 Step ahead Forecasts:
   predict(fit, n.ahead = 5)
   
## armaRoots -
   xmpSeries("\nNext: Roots of the Characteristic Polynomial > ")
   # Calculate and plot the  of an ARMA process:
   par(mfrow = c(2, 2), cex = 0.7)
   coefficients = c(-0.5, 0.9, -0.1, -0.5)
   armaRoots(coefficients)
   
## armaTrueacf -
   xmpSeries("\nNext: True ACF of an ARMA Process > ")
   model = list(ar = c(0.3, +0.3), ma = 0.1)
   armaTrueacf(model)
   model = list(ar = c(0.3, -0.3), ma = 0.1)
   armaTrueacf(model)

[Package fSeries version 200.10058 Index]