next up previous contents
Next: 2.6 Need for Alternative Up: 2. BACKGROUND Previous: 2.4 Sensitivity and Sensitivity/Uncertainty

Subsections

2.5 Conventional Sensitivity/Uncertainty Analysis Methods

Conventional methods for sensitivity analysis and uncertainty propagation can be broadly classified into four categories: (a) ``sensitivity testing'', (b) analytical methods, (c) sampling based methods, and (d) computer algebra based methods.

Sensitivity testing involves studying model response for a set of changes in model formulation, and for a selected model parameter combinations. Analytical methods involve either the differentiation of model equations and subsequent solution of a set of auxiliary sensitivity equations, or the reformulation of original model using stochastic algebraic/differential equations. On the other hand, the sampling based methods involve running the original model for a set of input/parameter combinations (sample points) and estimating the sensitivity/uncertainty using the model outputs at those points. Yet another sensitivity analysis method is based on direct manipulation of the computer code of the model, and is termed automatic differentiation. These methods are elaborated in the following.

2.5.1 Sensitivity Testing Methods

In this approach, the model is run for a set of sample points for the parameters of concern or with straightforward changes in model structure (e.g., in model resolution). This approach is often used to evaluate the robustness of the model, by testing whether the model response changes significantly in relation to changes in model parameters and structural formulation of the model. The application of this approach is straightforward, and it has been widely employed in conjunction with transport-transformation modeling.

Sistla et al.[189] and Al-Wali et al. [4] used this approach for the sensitivity analysis of urban air quality models. Roselle [173] used this approach to study the effect of biogenic emission uncertainties by performing model simulations at three emission biogenic estimate levels. Vieux et al. [207] and Vanderperk [206] have used a similar approach in water quality modeling.

The primary advantage of this approach is that it accommodates both qualitative and quantitative information regarding variation in the model. However, the main disadvantage of this approach is that detailed information about the uncertainties is difficult to obtain using this approach. Further, the sensitivity information obtained depends to a great extent on the choice of the sample points, especially when only a small number of simulations can be performed.

2.5.2 Analytical Methods

Some of the widely used analytical methods for sensitivity/uncertainty are: (a) differential analysis methods, (b) Green's function method, (c) spectral based stochastic finite element method, and (d) coupled and decoupled direct methods.

2.5.2.1 Differential Analysis Methods

Differential analysis methods include the Neumann expansion [3,197], and the perturbation method [3,198]. The Neumann expansion method involves finding the inverse of the model operator through the expansion of the model equations, and hence has limitations on the type of model equations it can address. The perturbation method involves expansion of model outputs as a series in terms of small random perturbations in model parameters, and the subsequent solution of the series coefficients. The Neumann expansion and perturbation based methods have been applied in uncertainty analysis of ground water models [1,2,224], and in the design of structures [26,118,140]. The main limitation of these methods is the requirement that the perturbation terms be small. Further, these methods are in general difficult to apply in conjunction with the modeling of complex, nonlinear systems, as the model equations are often mathematically intractable.

2.5.2.2 Green's Function Method

In the Green's function method, the sensitivity equations of a model are obtained by differentiating the model equations. The sensitivity equations are then solved by constructing an auxiliary set of Green's functions. This method minimizes the number of differential equations that are solved for sensitivity, and replaces them with integrals that can be easily evaluated [55,56]. This approach has been applied to the sensitivity analysis of atmospheric chemistry models [106,209].

2.5.2.3 Spectral Based Stochastic Finite Element Method

This method relies on the use of representing stochastic processes in terms of a series expansion, specifically the Karhunen-Loeve expansion [83,160]. For finite element method problems, this approach results in a set of linear matrix equations with deterministic matrices multiplied by random vectors. The matrix equations are solved either using operator expansions or by using the Galerkin's method [208]. One of the main features of this method is the representation of random parameters in terms of orthogonal functions of a set of standardized random variables; the expansion is also known as ``polynomial chaos expansion'', and forms the basis for the development of the ``Stochastic Response Surface Method'' (SRSM). The polynomial chaos expansion and the SRSM are discussed in detail in Chapter 3.

2.5.2.4 Coupled/Decoupled Direct Method

The direct method involves the differentiation of model equations and the subsequent solution of the sensitivity equations. The sensitivity equations are then solved along with the original model equations (Coupled Direct Method) [200], or separately (Decoupled Direct Method) [60,165]. The decoupled method is advantageous both in terms of computational efficiency and stability of the solution. This method has evolved into a standard module in conjunction with commercial sensitivity analysis codes for chemical kinetics, such as CHEMKIN [41], and GCKP86 [168]. The decoupled method is also reported to be more efficient than the Green's function method [60].

The analytical methods require access to the governing model equations, and may involve writing additional computer code for the solution of the auxiliary equations, which may be impractical and sometimes impossible. For example, reformulating an existing computational model developed by others could require prohibitive amounts of resources.

2.5.3 Sampling Based Methods

Sampling based methods do not require access to model equations or even the model code. These methods involve running a set of model at a set of sample points, and establishing a relationship between inputs and outputs using the model results at the sample points. Some of the widely used sampling based sensitivity/uncertainty analysis methods are: (a) Monte Carlo and Latin Hypercube Sampling methods, (b) Fourier Amplitude Sensitivity Test (FAST) (c) reliability based methods, and (d) response surface methods.

2.5.3.1 Monte Carlo and Latin Hypercube Sampling Methods

Monte Carlo (MC) methods are the most widely used means for uncertainty analysis, with applications ranging from aerospace engineering [12] to zoology [28]. These methods involve random sampling from the distribution of inputs and successive model runs until a statistically significant distribution of outputs is obtained. They can be used to solve problems with physical probabilistic structures, such as uncertainty propagation in models or solution of stochastic equations, or can be used to solve non-probabilistic problems, such as finding the area under a curve [49,75]. Monte Carlo methods are also used in the solution of problems that can be modeled by the sequence of a set of random steps that eventually converge to a desired solution. Problems such as optimization and the simulation of movement of fluid molecules are often addressed through Monte Carlo simulations [75]. For the interested reader, a wide range of literature describing the methodology, tools, and the applicability of the Monte Carlo methods is available in the literature [116,51,71,177,192].

Since these methods require a large number of samples (or model runs), their applicability is sometimes limited to simple models. In case of computationally intensive models, the time and resources required by these methods could be prohibitively expensive.2.1. A degree of computational efficiency is accomplished by the use of Modified Monte Carlo (MMC) methods that sample from the input distribution in an efficient manner, so that the number of necessary solutions compared to the simple Monte Carlo method is significantly reduced.

The Latin Hypercube Sampling [107,146,194] is one such widely used variant of the standard Monte Carlo method. In this method, the range of probable values for each uncertain input parameter is divided into ordered segments of equal probability. Thus, the the whole parameter space, consisting of all the uncertain parameters, is partitioned into cells having equal probability, and they are sampled in an ``efficient'' manner such that each parameter is sampled once from each of its possible segments. The advantage of this approach is that the random samples are generated from all the ranges of possible values, thus giving insight into the extremes of the probability distributions of the outputs. A detailed description of the implementation of Latin Hypercube Sampling is presented in Section 3.8.1.

Monte Carlo methods and Latin Hypercube Sampling have been applied in probabilistic assessment of risk in physiologically based pharmacokinetic models by considering uncertainty in the estimates of the model parameters [44,67,187]. These methods have also been applied in groundwater contamination models [14,89,172,157,143], in air pollution modeling [48,74,34,179,190,84], in process engineering [169,215], and in the reliability assessment of structures [8].

Guidelines for the application of Monte Carlo and Latin Hypercube Sampling methods in risk assessment are provided by the U.S.EPA [204,203]. Several computer packages containing routines for Monte Carlo and Latin Hypercube Sampling methods are reported in the literature [112,108]. The EPA approved FORTRAN package for Latin Hypercube Sampling by Iman et al. [108] is available through the EPA's Exposure Models Library [202]. Commercial plugins based on spreadsheet software include Crystal ball [46], @RISK [218], RISKMAN [162], and MonteCarlo [164]. This list is only representative of the available software tools. In the present work, a Latin Hypercube Sampling utility in FORTRAN has been developed as a part of the implementation of the Stochastic Response Surface Method, and is freely available to the research community.

2.5.3.2 Fourier Amplitude Sensitivity Test (FAST)

Fourier Amplitude Sensitivity Test (FAST) is a method based on Fourier transformation of uncertain model parameters into a frequency domain, thus reducing the a multi-dimensional model into a single dimensional one. For a model with $m$ model parameters, $k_1,k_2,\ldots,k_m$, and $n$outputs, $u_1,u_2,\ldots,u_m$, such that

\begin{displaymath}u_i = f_i(t;k_1,k_2,\ldots,k_m);\ \ \ i=1,2,\ldots,n,
\end{displaymath}

the FAST method involves the transformation of the parameters into a frequency domain spanned by a scalar $s$, as follows:

\begin{displaymath}k_l = G_l(\sin w_l s), \ \ \ l = 1,2,\ldots,m
\end{displaymath}

The outputs are then approximated as:

\begin{displaymath}\begin{array}{rcl}
\displaystyle \overline{u}_i(t) & = &
\dis...
...t;k_1(s),k_2(s),\ldots,k_m(s))ds - \overline{u}^2_i
\end{array}\end{displaymath} (2.6)

These integrals are evaluated by repeatedly sampling the parameter space of $s$, which corresponds to the sampling in the multidimensional model parameter space. The details of the transformation of model parameters into the frequency domain, and the subsequent sampling are explained by Koda et al. [127], and a computational implementation of FAST is presented by McRae et al. [149].

FAST has been applied in the sensitivity/uncertainty analysis of atmospheric photochemical mechanisms [66], in conjunction with land surface processes in global climate models models [145,40], and in the disposal of radioactive waste [124]. A review article by Helton [96] reports that Monte Carlo methods are more widely applicable than FAST.

2.5.3.3 Reliability Based Methods (FORM and SORM)

First- and second-order reliability methods (FORM and SORM, respectively) are approximation methods that estimate the probability of an event under consideration (typically termed ``failure''). For example, these methods can provide the probability that a contaminant concentration exceeds a target level at a location (or, the probability of failure). In addition, these methods provide the contribution to the probability of failure from each input random variable, at no additional computational effort [147,92,117]. These methods are useful in uncertainty analysis of models with a single failure criterion.

For a model with random parameters

\begin{displaymath}{\ensuremath{\textstyle\mbox{\boldmath ${X}$}} } = (X_1,X_2,\ldots,X_n),
\end{displaymath}

and a failure condition

\begin{displaymath}g(X_1,X_2,\ldots,X_n) < 0
\end{displaymath}

the objective of the reliability based approach is to estimate the probability of failure. In case of pollutant contamination exceedance, the failure condition can be defined as

\begin{displaymath}g({\ensuremath{\textstyle\mbox{\boldmath ${X}$}} })\ =\ C_{R}-C({\ensuremath{\textstyle\mbox{\boldmath ${X}$}} })\ <\ 0,
\end{displaymath}

where $C_{R}$ is a pre-specified maximum permissible concentration at a location of interest.

If the joint probability density function for the set \ensuremath{\textstyle\mbox{\boldmath${X}$ }} is given by $f_{{\mathbf X}}$, then the probability of failure is given by the $n$-fold integral:

\begin{displaymath}\displaystyle P_{F} = P\{g({\ensuremath{\textstyle\mbox{\bold...
...{{\mathbf X}}d{\ensuremath{\textstyle\mbox{\boldmath ${X}$}} }
\end{displaymath}

where the integration is carried out over the failure domain. The evaluation of this integral becomes computationally demanding as the number of random variables (the dimension of the integration) increases; in fact if $m$is the number of function calls of the integrand per dimension, and $n$ is the dimension, the computation time grows as $m^n$ [97]. In addition, since the value of the integrand is small, the numerical inaccuracies can be considerably magnified when integrated over a multi-dimensional space [23].

FORM and SORM use analytical schemes to approximate the probability integral, through a series of the following simple steps, as illustrated by Bjerager [20]:

These methods are reported to be computationally very efficient compared to Monte Carlo methods, especially for scenarios corresponding to low probabilities of failure [92]. Further, SORM is more accurate than FORM, but computationally more intensive, since it involves a higher order approximation. These methods have been applied to problems of ground-water flow and contaminant transport [111,92,91,90] and to problems in safety assessment [39,147].

The main drawbacks of FORM and SORM are that the mapping of the failure function on to a standardized set, and the subsequent minimization of the function, involve significant computational effort for nonlinear black box numerical models [82]. In addition, simultaneous evaluation of probabilities corresponding to multiple failure criteria would involve significant additional effort. Furthermore, these methods impose some conditions on the joint distributions of the random parameters [92], thus limiting their applicability.

2.5.3.4 Response Surface Methods

The response surface methods consist of (i) screening to determine a subset of important model input parameters, (ii) making multiple runs of the computer model using specific values and pairings of these input parameters, and (iii) fitting a general polynomial model to the model data (using the method of least squares). This fitted response-surface is then used as a replacement or proxy for the computer model, and all inferences related to sensitivity/uncertainty analysis for the original model are derived from this fitted model. This approach is sometimes termed as a ``secondary model technique'' [68]. Box et al. [22,21] describe the adaptive nature of these methods, and the methodology for identifying the sampling points and for subsequent analysis is presented in detail in the literature [59,70,123].

Helton [96] studied the applicability of response surface methods in performance assessment of radioactive waste disposal. It has also been used in conjunction with soil erosion modeling [33], with vegetative plant growth modeling [141], and with structural reliability problems [27].

  
2.5.4 Computer Algebra Based Methods


  
Table 2.4: A representative list of packages available for automatic differentiation
\begin{table}
\newcolumntype{B}{>{\centering\noindent\arraybackslash\small\sff...
...h~et~al.~\cite{Rich:1992} & 1992 & Matlab & \\
\hline
\end{tabularx}\end{table}

Computer algebra based methods involve the direct manipulation of the computer code, typically available in the form of a high level language code (such as C or FORTRAN), and estimation of the sensitivity and uncertainty of model outputs with respect to model inputs. These methods do not require information about the model structure or the model equations, and use mechanical, pattern-matching algorithms, to generate a ``derivative code'' based on the model code. One of the main computer algebra based methods is the automatic differentiation, which is sometimes also termed automated differentiation.

Automatic differentiation involves direct manipulation of a model code to generate a corresponding ``derivative calculating code''. Given the source code, and the information about what the dependent and independent variables of interest are, the automatic differentiation tools can generate derivative information without further user intervention. This approach is essentially a ``black-box'' approach, as it does not require access to the model formulation or model equations. The main advantages of this approach are that partial derivatives can be calculated with the accuracy of the machine precision [17], and this approach results in substantial computer time savings [113,105].

An overview of the methodology and applications of automatic differentiation is presented by Griewank [85,86]. In addition, various reviews of automatic differentiation in the fields of engineering design, process engineering and numerical analysis can be found in the literature [13,35,109]. A large number of packages for automatic differentiation of model codes written in various programming languages are currently available. Table 2.4 lists a majority of the automatic differentiation tools currently available, along with some of their applications reported in the literature.

Two of the notable automatic differentiation tools that have been applied in conjunction with transport-transformation models are GRESS and ADIFOR. GRESS (GRadient Enhanced Software System) [100] is a Fortran preprocessor system that searches the model equations based on the appearance of the ``='' symbol, and analyzes the functional dependence of independent and dependent variables. The differentiation is carried out for each statement and the results are stored numerically at each stage of computation. This method has been applied to models describing transport of radionuclides through groundwater [156,161]. It has also been applied a model describing the atmospheric dispersion of radionuclides, AIRDOS [98,152]. The main limitation of GRESS is that the particulars of every operation performed in the code are stored in the memory at each step. This interpretation overhead associated with the storage, and the potentially great memory demands, can sometimes limit its applicability [16].

ADIFOR (Automatic DIfferentiation of FORtran) [17] applies the rules of automatic differentiation, and rewrites the original code, inserting statements for computing the first-order derivatives. The generated code can then be compiled just like a normal program, and can be executed to calculate derivatives in the model, along with the model outputs. This approach avoids the limitation of GRESS, with respect to the interpretation overhead.

In the area of photochemical air quality modeling, Carmichael et al. [29] applied the ADIFOR method to calculate the sensitivity of ozone levels to the initial concentrations (84 species) and all reaction rate constants (178 chemical reactions) for six different chemical regimes. Further, Hwang et al. [105] studied the applicability of automatic differentiation techniques in numerical advection schemes in air quality models. They calculated the sensitivity of concentration to a global perturbation of wind velocity in advection models and comparing that with other methods. Their results indicate that ADIFOR-generated code can produce exact sensitivity information up to the machine epsilon, and that the CPU requirements were lower for the ADIFOR method.

ADIFOR has also been applied in several other research areas, such as groundwater modeling [214], in weather modeling [18], in biostatistics [110], and in aeronautics [101].

The details of the ADIFOR approach and its application are presented in Chapter 4 where the coupling of the Stochastic Response Surface Method and ADIFOR is described.



Footnotes

... expensive.2.1
For example, in order to perform the uncertainty analysis of a grid-based photochemical model, such as the Urban Airshed Model (UAM-IV), which requires about six hours of CPU time for the simulation of air quality near the vicinity of New Jersey (see the example in Section 5.3), a thousand samples would require more than eight months of computer time

next up previous contents
Next: 2.6 Need for Alternative Up: 2. BACKGROUND Previous: 2.4 Sensitivity and Sensitivity/Uncertainty
Sastry S. Isukapalli
1999-01-19