Random variables are mathematical quantities that are used to represent
probabilistic uncertainty. They describe the probabilities associated
with each value, or each sub-range of values, that an uncertain quantity can
take. For example, if a random variable $X$ represents the uncertainty in
the concentration of a pollutant, the following questions can be answered by
analyzing $X$: given a concentration $x$, what is the probability that $X$
takes a value lower than $x$; or, given two numbers $a$ and $b$, what is the
probability that $X$ takes values between $a$ and $b$?
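The two questions above can be answered numerically once a distribution is
assumed for the concentration. The following is a minimal sketch, assuming
purely for illustration that the concentration is lognormally distributed;
the parameter values and the helper names (`lognormal_cdf`, `prob_below`,
`prob_between`) are hypothetical and not taken from the text.

```python
import math
from statistics import NormalDist

# Assumed, illustrative parameters: ln(X) is normal with mean MU and
# standard deviation SIGMA.
MU, SIGMA = 0.0, 0.5


def lognormal_cdf(x, mu=MU, sigma=SIGMA):
    """P(X <= x) for a lognormal X, via the normal CDF of ln(x)."""
    if x <= 0.0:
        return 0.0  # a concentration cannot be negative
    return NormalDist(mu, sigma).cdf(math.log(x))


def prob_below(x):
    """Probability that the concentration takes a value lower than x."""
    return lognormal_cdf(x)


def prob_between(a, b):
    """Probability that the concentration lies between a and b (a < b)."""
    return lognormal_cdf(b) - lognormal_cdf(a)


print(prob_below(1.0))
print(prob_between(0.5, 2.0))
```

For a continuous variable it makes no difference whether the interval end
points are included, since each single point carries zero probability.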
Mathematically, a random variable $X$ maps a probability space $\Omega$
onto the real line. The random variable $X$ can assume values from $-\infty$
to $+\infty$, and there is an associated probability for each value (or
interval) that $X$ takes. Based on the nature of the values that random
variables can assume, they can be classified into three types:
- (a) continuous random variables: these variables can assume
continuous values from an interval. Examples include contaminant
concentrations in the environment; emissions from an industrial source;
and physical parameters in an exposure model, such as body weight or
respiration rate. In these cases, one cannot assign a nonzero probability to
the event that a random variable $X$ is exactly equal to a value $x$, since
there is an uncountably infinite number of possible values, and the answer
for each point would be zero. Hence, in such cases, probabilities are defined
on intervals (e.g., the probability that $X$ lies between $a$ and $b$).
Further, a probability density can be defined at each point in the interval;
the probability density at a point is representative of the probability in
the vicinity of that point.
- (b) discrete random variables: these variables can assume discrete
values from a set. Examples include the following: the roll of a die,
where the outcome can have only integer values between 1 and 6; the number
of days an air quality standard is violated in a year, which can assume
integer values between 0 and 365; and the number of defective cars in a
production line. These variables have probabilities associated with a
countable number of values they can assume. Continuous and discrete random
variables can be contrasted as follows: one can define the probability
that the number of air quality exceedances is exactly equal to a given
number $n$, whereas the probability that the atmospheric concentration is
exactly equal to a given concentration $x$ is zero.
- (c) mixed random variables: these variables can assume continuous as
well as discrete values. For example, a measured concentration that is
reported as zero whenever it falls below the detection limit, and as a
continuous value otherwise, is a mixed random variable. Such variables
have the properties of continuous random variables in certain ranges, and
the properties of discrete random variables at certain values.
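The contrast between discrete and continuous random variables can be
sketched numerically. The example below is illustrative only: it assumes a
fair six-sided die for the discrete case and a standard normal variable for
the continuous case, and the names `prob_die_equals` and `prob_within` are
hypothetical helpers. It shows that the probability of a small interval
around a point shrinks with the interval width, while the ratio of
probability to width approaches the probability density at that point.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete case: each face of a fair die has its own probability.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def prob_die_equals(k):
    """P(die == k); zero for values the die cannot take."""
    return die_pmf.get(k, Fraction(0))


# Continuous case: an assumed standard normal variable.
X = NormalDist(0.0, 1.0)


def prob_within(x0, half_width):
    """P(x0 - half_width < X <= x0 + half_width)."""
    return X.cdf(x0 + half_width) - X.cdf(x0 - half_width)


x0 = 0.5
widths = [1e-1, 1e-2, 1e-3]
interval_probs = [prob_within(x0, w) for w in widths]
# Probability per unit length; this tends to the density at x0.
density_estimates = [p / (2 * w) for p, w in zip(interval_probs, widths)]

print(prob_die_equals(3))
print(interval_probs)
print(density_estimates, X.pdf(x0))
```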
A major part of uncertainty analysis in environmental modeling and risk
characterization involves uncertainties in continuous quantities. Examples
include the uncertainties in measured or predicted concentrations, and the
uncertainties in the estimated time of exposure to a contaminant.
The present work focuses mainly on uncertainties described by continuous
random variables.
Continuous random variables are characterized through the following
functions:
- (a) Cumulative density function, $F_X(x)$. This denotes the
probability that the random variable $X$ has a value less than or
equal to $x$, i.e., $F_X(x) = P(X \le x)$.
This function is also known as the cumulative distribution function.
The probability that $X$ has a value greater than $x$ is given by
$1 - F_X(x)$. Further, the probability that $X$ takes on values between
$a$ and $b$, for $b > a$, is given by:
$$P(a < X \le b) = F_X(b) - F_X(a)$$
The important characteristics of the cumulative density function are:
$$F_X(-\infty) = 0, \qquad F_X(+\infty) = 1, \qquad F_X(a) \le F_X(b) \ \text{ for } a < b$$
In population risk characterization, the corresponding cumulative density
function can be considered analogous to the fraction of the population that
is at risk with respect to a given risk criterion.
- (b) Probability density function, $f_X(x)$. This function is also known
as the probability distribution. This is the derivative of the cumulative
density function:
$$f_X(x) = \frac{dF_X(x)}{dx}$$
The main properties of the probability density function are:
$$f_X(x) \ge 0, \qquad \int_{-\infty}^{\infty} f_X(x)\,dx = 1, \qquad F_X(x) = \int_{-\infty}^{x} f_X(u)\,du$$
- (c) Expected value of a function, $g(X)$, of a random variable.
This is defined as
$$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$$
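The relationships among the cumulative density function, the probability
density function, and the expected value of a function can be checked
numerically. The sketch below is illustrative only: it assumes a normal
distribution with arbitrarily chosen parameters, takes $g(x) = x^2$ as an
example function, and uses a simple trapezoidal rule (the helper `integrate`
is hypothetical) in place of exact integration.

```python
from statistics import NormalDist

X = NormalDist(mu=2.0, sigma=0.5)  # assumed distribution, for illustration


def integrate(func, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


# Ten standard deviations on each side stand in for (-inf, +inf).
LO, HI = X.mean - 10 * X.stdev, X.mean + 10 * X.stdev

# 1. The density integrates to one.
total_prob = integrate(X.pdf, LO, HI)

# 2. An interval probability is a CDF difference, or equally the
#    integral of the density over the interval.
a, b = 1.5, 2.5
prob_from_cdf = X.cdf(b) - X.cdf(a)
prob_from_pdf = integrate(X.pdf, a, b)

# 3. The density is the derivative of the CDF (central difference).
x0, step = 2.3, 1e-4
pdf_from_cdf = (X.cdf(x0 + step) - X.cdf(x0 - step)) / (2 * step)


# 4. Expected value of g(X) = X**2, the integral of g(x) * pdf(x).
def g(x):
    return x * x


expected_g = integrate(lambda x: g(x) * X.pdf(x), LO, HI)

print(total_prob, prob_from_cdf, prob_from_pdf)
print(pdf_from_cdf, X.pdf(x0))
print(expected_g)
```

For a normal variable the expected value of $X^2$ equals the squared mean
plus the variance, which gives a closed-form value to compare against.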
A probability density function or a cumulative density function fully
characterizes a random variable. In the following sections, the probability
density function $f_X(x)$ and the random variable $X$ are used
interchangeably, since the density fully specifies the variable. Further,
the terms ``distribution'' and ``random variable'' are also used
interchangeably.
Even though the density functions of a random variable provide all the
information about that random variable, they do not convey the extent of the
uncertainty at a quick glance, especially when they consist of complex
algebraic expressions. In such cases, the moments of a random variable serve
as concise metrics that summarize much of the information about a
distribution. The following are the moments that describe a random variable.
- (a) The expected value or mean of a random variable: this
denotes the average value for the distribution of the random variable. If
a large number of samples from the distribution are considered, the
arithmetic mean of the sample values approaches the expected value of the
distribution.
Mathematically, the mean $\mu$ of a random variable $X$ is given by
$$\mu = E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$$
It should be noted that the mean does not necessarily represent a realistic
value from the distribution. For example, if a coin is tossed a large number of
times, and if heads is assigned a value 1, and tails a value 0, the
mean is 1/2, which is not a possible outcome in the coin tossing problem.
- (b) The variance or dispersion of a distribution: this indicates the
spread of the distribution with respect to the mean value. A lower value of
variance indicates that the distribution is concentrated close to the mean
value, and a higher value indicates that the distribution is spread out over a
wider range of possible values.
The variance $\sigma^2$ of $X$ is given by:
$$\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx$$
The square root of the variance is called the standard deviation ($\sigma$).
- (c) The skewness of a distribution indicates the asymmetry of the
distribution around its mean, characterizing the shape of the distribution.
It is given by
$$\gamma_1 = \frac{1}{\sigma^3} \int_{-\infty}^{\infty} (x - \mu)^3 f_X(x)\,dx$$
A positive value of skewness indicates that the distribution is skewed
towards values greater than the mean (i.e., it has a longer tail on the right
side), and a negative value indicates that the distribution is skewed
towards the left side.
- (d) The kurtosis of a distribution indicates the peakedness and tail
weight of the distribution relative to the normal distribution, for which
the kurtosis is 3. It is given by
$$\gamma_2 = \frac{1}{\sigma^4} \int_{-\infty}^{\infty} (x - \mu)^4 f_X(x)\,dx$$
A value of kurtosis higher than 3 indicates that the distribution has
heavier tails, and typically a sharper peak around the mean value, than the
normal distribution; a smaller value indicates lighter tails and a flatter
shape (e.g., the uniform distribution has a kurtosis of 1.8).
- (e) Higher order moments: The higher order moments of a random
variable are defined as follows:
The $k$th moment $m_k$ of $X$ is given by
$$m_k = E[X^k] = \int_{-\infty}^{\infty} x^k f_X(x)\,dx$$
The $k$th central moment $\mu_k$ of a random variable $X$ is defined as
$$\mu_k = E[(X - \mu)^k] = \int_{-\infty}^{\infty} (x - \mu)^k f_X(x)\,dx$$
Clearly, $m_0 = 1$, $m_1 = \mu$, $\mu_0 = 1$, $\mu_1 = 0$, and
$\mu_2 = \sigma^2$.
Further, the central moments $\mu_k$'s and the moments $m_k$'s are
related as follows:
$$\mu_k = \sum_{j=0}^{k} \binom{k}{j} (-1)^j \mu^j\, m_{k-j} \qquad (8.1)$$
$$m_k = \sum_{j=0}^{k} \binom{k}{j} \mu^j\, \mu_{k-j} \qquad (8.2)$$
where
$$\binom{k}{j} = \frac{k!}{j!\,(k-j)!}$$
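The moments described above can be estimated from samples, and the
relations between moments and central moments can be verified on the same
samples. The following is a minimal sketch under stated assumptions: the
samples are drawn from an arbitrarily chosen lognormal distribution with a
fixed seed, the relations are taken in their standard binomial-expansion
form, and all function names are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Assumed illustrative distribution: ln(X) ~ Normal(0.0, 0.5).
samples = [rng.lognormvariate(0.0, 0.5) for _ in range(10000)]


def raw_moment(xs, k):
    """Sample estimate of the k-th moment, E[X**k]."""
    return sum(x ** k for x in xs) / len(xs)


def central_moment(xs, k):
    """Sample estimate of the k-th central moment, E[(X - mean)**k]."""
    mean = raw_moment(xs, 1)
    return sum((x - mean) ** k for x in xs) / len(xs)


def central_from_raw(xs, k):
    """Central moment rebuilt from raw moments by binomial expansion."""
    mean = raw_moment(xs, 1)
    return sum(
        math.comb(k, j) * (-1) ** j * mean ** j * raw_moment(xs, k - j)
        for j in range(k + 1)
    )


def raw_from_central(xs, k):
    """Raw moment rebuilt from central moments by binomial expansion."""
    mean = raw_moment(xs, 1)
    return sum(
        math.comb(k, j) * mean ** j * central_moment(xs, k - j)
        for j in range(k + 1)
    )


mean = raw_moment(samples, 1)
variance = central_moment(samples, 2)
std_dev = math.sqrt(variance)
skewness = central_moment(samples, 3) / std_dev ** 3
kurtosis = central_moment(samples, 4) / std_dev ** 4

print(mean, variance, skewness, kurtosis)
```

A right-skewed distribution such as the lognormal should give a positive
sample skewness and a sample kurtosis above 3, consistent with its long
right tail.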
In addition to the information provided by the moments of a distribution,
some other metrics such as the median and the mode provide
useful information. The median of a distribution is the value for the 50th
percentile of the distribution (i.e., the probability that a random variable
takes a value below the median is 0.5). The mode of a distribution is the
value at which the probability density is the highest. For example, for the
normal distribution N($0, \sigma$), with zero mean, the mean, median, and
mode are all equal to $0$, whereas for the lognormal distribution
LN($\mu, \sigma$), where $\mu$ and $\sigma$ are the mean and standard
deviation of $\ln X$, the mean is $\exp(\mu + \sigma^2/2)$, the median is
$\exp(\mu)$, and the mode is $\exp(\mu - \sigma^2)$.
Additionally, other percentiles of the distribution may sometimes be
desired. For example, the 95th percentile indicates the value above which
5% of the samples occur.
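The median, mode, mean, and percentiles of a lognormal distribution can be
computed numerically and compared with their closed forms. The sketch below
is illustrative only: it assumes the parameterization in which the two
parameters are the mean and standard deviation of $\ln X$, uses arbitrarily
chosen parameter values, and relies on hypothetical helpers
(`lognormal_pdf`, `lognormal_percentile`, `integrate`).

```python
import math
from statistics import NormalDist

MU, SIGMA = 1.0, 0.4  # assumed mean and standard deviation of ln(X)
log_x = NormalDist(MU, SIGMA)


def lognormal_pdf(x):
    """Density of X = exp(Y), where Y is normal with (MU, SIGMA)."""
    return log_x.pdf(math.log(x)) / x


def lognormal_percentile(p):
    """Value below which a fraction p of the probability lies."""
    return math.exp(log_x.inv_cdf(p))


def integrate(func, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


median = lognormal_percentile(0.50)
p95 = lognormal_percentile(0.95)  # 5% of the probability lies above this

# Mode: the grid point with the highest density (grid spacing 0.001).
grid = [0.001 * i for i in range(1, 20001)]
mode = max(grid, key=lognormal_pdf)

# Mean: E[X] = E[exp(Y)], integrated over Y with a truncated range.
mean = integrate(
    lambda y: math.exp(y) * log_x.pdf(y), MU - 12 * SIGMA, MU + 12 * SIGMA
)

print(mode, median, mean, p95)
```

Because the lognormal distribution is skewed to the right, the three
measures are ordered as mode, then median, then mean.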
Sastry S. Isukapalli
1999-01-19