Appendix B - Probability Distributions
Normal and Related Distributions
Normal
| Field | Details |
|---|---|
| Description | The Normal (or Gaussian) distribution is a very widely used two-parameter probability distribution. It is fundamental to most statistical modeling due to the Central Limit Theorem. It fits many natural phenomena such as body height, blood pressure, measurement error, and annual rainfall. |
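| Parameters | location (μ); scale (σ), with σ > 0 |
| Support | $-\infty < x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left[-\dfrac{(x-\mu)^2}{2\sigma^2}\right]$; CDF: $F(x) = \Phi\left(\dfrac{x-\mu}{\sigma}\right)$; inverse CDF: $x(F) = \mu + \sigma\,\Phi^{-1}(F)$; where $\Phi$ is the standard Normal CDF |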
Log-Normal
| Field | Details |
|---|---|
| Description | The log-Normal distribution is a two-parameter, positively skewed distribution that describes a random variable whose logarithm is Normally distributed. A log-Normal process arises as the product of many independent, positive random variables. In hydrology, the log-Normal distribution is used for frequency analysis of annual maximum discharge. In reliability analysis, it is often used to model the time to repair a system. RMC-BestFit contains two log-Normal distributions. The first, named “Ln-Normal”, is based on the natural logarithm (log base e) and is parameterized using real-space moments, which are more intuitive for multi-disciplinary users of the software. The second, named “Log-Normal”, uses log base 10 and is parameterized using log₁₀ moments, consistent with typical practice in hydrologic frequency analysis. The two distributions are functionally identical and produce the same statistical inference. |
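| Parameters | location (μ); scale (σ), with σ > 0 |
| Support | $0 < x < \infty$ |
| Distribution Functions | For the base-10 form, with $y = \log_{10} x$ and μ, σ the mean and standard deviation of $y$: PDF: $f(x) = \dfrac{1}{x \ln(10)\,\sigma\sqrt{2\pi}} \exp\left[-\dfrac{(y-\mu)^2}{2\sigma^2}\right]$; CDF: $F(x) = \Phi\left(\dfrac{y-\mu}{\sigma}\right)$; inverse CDF: $x(F) = 10^{\,\mu + \sigma\,\Phi^{-1}(F)}$; where $\Phi$ is the standard Normal CDF |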
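
The statement that the base-e and base-10 forms are functionally identical can be checked numerically. The following is a minimal sketch, not RMC-BestFit code: the function names and the sample moments are hypothetical, and the Ln-Normal form is written here in natural-log moments (the real-space moments are computed separately at the end). Multiplying the log₁₀ moments by ln(10) gives the natural-log moments, and both forms then return the same quantile.

```python
import math
from statistics import NormalDist

LN10 = math.log(10.0)

def quantile_log10(mu10, sigma10, p):
    """Quantile of x when log10(x) ~ Normal(mu10, sigma10)."""
    z = NormalDist().inv_cdf(p)
    return 10.0 ** (mu10 + sigma10 * z)

def quantile_ln(mu_e, sigma_e, p):
    """Quantile of x when ln(x) ~ Normal(mu_e, sigma_e)."""
    z = NormalDist().inv_cdf(p)
    return math.exp(mu_e + sigma_e * z)

def real_space_moments(mu_e, sigma_e):
    """Mean and standard deviation of x itself, from natural-log moments."""
    mean = math.exp(mu_e + 0.5 * sigma_e ** 2)
    sd = mean * math.sqrt(math.expm1(sigma_e ** 2))
    return mean, sd

# Hypothetical log10 moments, chosen only for illustration.
mu10, sigma10 = 3.5, 0.25
mu_e, sigma_e = mu10 * LN10, sigma10 * LN10  # change of logarithm base

q10 = quantile_log10(mu10, sigma10, 0.99)
qe = quantile_ln(mu_e, sigma_e, 0.99)
mean_x, sd_x = real_space_moments(mu_e, sigma_e)
print(q10, qe, mean_x, sd_x)
```

The two quantiles agree to floating-point precision, so the choice of base affects only how the parameters are reported.
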
Gamma Family of Distributions
Exponential
| Field | Details |
|---|---|
| Description | The two-parameter (or shifted) Exponential distribution is a special case of the Gamma family of distributions, which includes the two-parameter Gamma, Pearson Type III, and Log-Pearson Type III. The Exponential distribution describes the time between events in a Poisson process; i.e., a process in which events occur continuously and independently at a constant average rate. The Exponential distribution is often used in reliability applications, where it can be used to model data with a constant failure rate. This distribution is useful for modeling highly positively skewed data that have a non-zero lower bound. |
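| Parameters | location (ξ); scale (α), with α > 0 |
| Support | $\xi \le x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\alpha} \exp\left[-\dfrac{x-\xi}{\alpha}\right]$; CDF: $F(x) = 1 - \exp\left[-\dfrac{x-\xi}{\alpha}\right]$; inverse CDF: $x(F) = \xi - \alpha \ln(1-F)$ |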
Gamma
| Field | Details |
|---|---|
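| Description | The Gamma distribution is a two-parameter, positively skewed distribution. The Exponential, Erlang, and Chi-squared distributions are special cases of the Gamma distribution. There are three different parameterizations for the Gamma distribution in common use: (1) a shape parameter κ and a scale parameter θ; (2) a shape parameter α = κ and a rate (inverse scale) parameter β = 1/θ; and (3) a shape parameter κ and a mean parameter μ = κθ. The parameters listed below follow the first form. |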
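| Parameters | scale (θ), with θ > 0; shape (κ), with κ > 0 |
| Support | $0 < x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{x^{\kappa-1} e^{-x/\theta}}{\Gamma(\kappa)\,\theta^{\kappa}}$; CDF: $F(x) = \dfrac{\gamma(\kappa,\, x/\theta)}{\Gamma(\kappa)}$, where $\gamma(\cdot,\cdot)$ is the lower incomplete gamma function; the inverse CDF has no closed form and is evaluated numerically |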
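
The commonly cited Gamma parameterizations (shape and scale; shape and rate; shape and mean) describe the same distribution, so converting between them is simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical and are not part of RMC-BestFit. It converts the other two forms to shape and scale, then confirms that a shape of 1 reproduces the Exponential density, one of the special cases mentioned above.

```python
import math

def gamma_pdf(x, shape, scale):
    """Gamma PDF in the shape/scale form, evaluated in log space."""
    if x <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def from_shape_rate(alpha, beta):
    """(shape, rate) -> (shape, scale), using scale = 1 / rate."""
    return alpha, 1.0 / beta

def from_shape_mean(kappa, mean):
    """(shape, mean) -> (shape, scale), using mean = shape * scale."""
    return kappa, mean / kappa

# Both conversions below describe the same distribution: shape 2, scale 2.
print(from_shape_rate(2.0, 0.5), from_shape_mean(2.0, 4.0))

# With shape = 1 the Gamma PDF equals the Exponential PDF exp(-x/scale)/scale.
x, scale = 2.0, 3.0
print(gamma_pdf(x, 1.0, scale), math.exp(-x / scale) / scale)
```
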
Pearson Type III
| Field | Details | |
|---|---|---|
| Description | The Pearson Type III (PIII) distribution is a three-parameter distribution that is widely used in hydrologic frequency analysis. It has also been used to model the probability of wind speed and rainfall intensity. The PIII distribution is derived from the two-parameter Gamma distribution and converges to a Normal distribution as its skewness (γ) approaches zero. In RMC-BestFit, the PIII distribution is parameterized using the moments of the distribution: the mean (μ), standard deviation (σ), and skewness (γ). The true parameters (location, scale, and shape) are computed from the specified moments. This is done because the moments are more intuitive for end-users familiar with Bulletin 17B [?] and Bulletin 17C [?]. RMC-BestFit uses the same parameterization as [?], with the underlying location parameter ξ, scale parameter β, and shape parameter α. | |
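| Parameters | mean: μ; standard deviation: σ; skewness: γ | location: $\xi = \mu - 2\sigma/\gamma$; scale: $\beta = \tfrac{1}{2}\sigma\lvert\gamma\rvert$; shape: $\alpha = 4/\gamma^{2}$ |
| Support | If $\gamma > 0$: $\xi \le x < \infty$; If $\gamma = 0$: $-\infty < x < \infty$; If $\gamma < 0$: $-\infty < x \le \xi$ | ; however, the method of maximum likelihood can only produce a solution if |
| Distribution Functions | If $\gamma > 0$: $f(x) = g(x-\xi)$, $F(x) = G(x-\xi)$, $x(F) = \xi + G^{-1}(F)$; If $\gamma = 0$ the distribution is Normal: $f(x) = \tfrac{1}{\sigma}\,\phi\left(\tfrac{x-\mu}{\sigma}\right)$, $F(x) = \Phi\left(\tfrac{x-\mu}{\sigma}\right)$, $x(F) = \mu + \sigma\,\Phi^{-1}(F)$; If $\gamma < 0$: $f(x) = g(\xi-x)$, $F(x) = 1 - G(\xi-x)$, $x(F) = \xi - G^{-1}(1-F)$ | where: $g$ is the Gamma distribution probability density function (PDF) with shape α and scale β; $G$ is the Gamma cumulative distribution function (CDF); and $G^{-1}$ is the Gamma inverse CDF. where: $\phi$ is the standard Normal distribution PDF; $\Phi$ is the standard Normal CDF; and $\Phi^{-1}$ is the standard Normal inverse CDF. |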
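
The conversion from the specified moments to the underlying location, scale, and shape parameters can be sketched as follows. This is an illustration under stated assumptions, not RMC-BestFit code: it assumes the Hosking and Wallis form of the relations (α = 4/γ², β = σ|γ|/2, ξ = μ − 2σ/γ), the function names are hypothetical, and the tolerance used to detect the Normal limit is an arbitrary choice.

```python
import math

def piii_parameters(mu, sigma, gamma, tol=1e-8):
    """Moments -> (xi, beta, alpha); returns None in the Normal limit.

    Assumed relations (Hosking and Wallis form):
      alpha = 4 / gamma^2, beta = sigma * |gamma| / 2, xi = mu - 2 * sigma / gamma
    """
    if abs(gamma) < tol:
        return None  # skewness ~ 0: treat the distribution as Normal
    alpha = 4.0 / gamma ** 2
    beta = 0.5 * sigma * abs(gamma)
    xi = mu - 2.0 * sigma / gamma
    return xi, beta, alpha

def piii_moments(xi, beta, alpha, positive_skew=True):
    """Inverse mapping, used here only as a round-trip check."""
    sign = 1.0 if positive_skew else -1.0
    mu = xi + sign * alpha * beta
    sigma = beta * math.sqrt(alpha)
    gamma = sign * 2.0 / math.sqrt(alpha)
    return mu, sigma, gamma

# Hypothetical moments: with positive skew, xi is the lower bound.
xi, beta, alpha = piii_parameters(100.0, 20.0, 0.5)
print(xi, beta, alpha)
print(piii_moments(xi, beta, alpha, positive_skew=True))
```

With negative skewness the same relations place ξ above the mean, where it acts as the upper bound of the distribution.
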
Log-Pearson Type III
| Field | Details | |
|---|---|---|
| Description | The log-Pearson Type III (LPIII) distribution is a flexible three-parameter distribution that describes a random variable whose logarithm is PIII distributed. The LPIII distribution was originally used to model annual maximum flood flows. In RMC-BestFit, the LPIII uses log base 10 and is parameterized using the log₁₀ moments of the distribution: the mean (μ), standard deviation (σ), and skewness (γ). The true parameters (location, scale, and shape) are computed from the specified moments. This is done because the moments are more intuitive for end-users familiar with Bulletin 17B [?] and Bulletin 17C [?]. | |
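| Parameters | mean (of log): μ; standard deviation (of log): σ; skewness (of log): γ | location: $\xi = \mu - 2\sigma/\gamma$; scale: $\beta = \tfrac{1}{2}\sigma\lvert\gamma\rvert$; shape: $\alpha = 4/\gamma^{2}$ |
| Support | If $\gamma > 0$: $10^{\xi} \le x < \infty$; If $\gamma = 0$: $0 < x < \infty$; If $\gamma < 0$: $0 < x \le 10^{\xi}$ | ; however, the method of maximum likelihood can only produce a solution if |
| Distribution Functions | If $\gamma > 0$: $f(x) = \dfrac{g(y-\xi)}{x \ln(10)}$, $F(x) = G(y-\xi)$, $x(F) = 10^{\,\xi + G^{-1}(F)}$; If $\gamma = 0$ then the distribution is Normal in $y$: $f(x) = \dfrac{1}{\sigma\, x \ln(10)}\,\phi\left(\tfrac{y-\mu}{\sigma}\right)$, $F(x) = \Phi\left(\tfrac{y-\mu}{\sigma}\right)$, $x(F) = 10^{\,\mu + \sigma\,\Phi^{-1}(F)}$; If $\gamma < 0$: $f(x) = \dfrac{g(\xi-y)}{x \ln(10)}$, $F(x) = 1 - G(\xi-y)$, $x(F) = 10^{\,\xi - G^{-1}(1-F)}$ | where: $g$ is the Gamma distribution probability density function (PDF) with shape α and scale β; $G$ is the Gamma cumulative distribution function (CDF); $G^{-1}$ is the Gamma inverse CDF; and $y = \log_{10} x$. where: $\phi$ is the standard Normal distribution PDF; $\Phi$ is the standard Normal CDF; and $\Phi^{-1}$ is the standard Normal inverse CDF. |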
Extreme Value Distributions
Gumbel (Extreme Value Type I)
| Field | Details |
|---|---|
| Description | The Gumbel, or Extreme Value Type-I (EVI) distribution, is a two-parameter distribution with a fixed positive skewness of ≈ 1.14. The Gumbel distribution is used to describe the maximum (or minimum) of a number of samples, and has seen widespread use in hydrologic frequency analysis. The Gumbel distribution is a particular case of the Generalized Extreme Value (GEV) distribution when the GEV shape parameter is zero. The Gumbel distribution is also used for a probability plotting scale because it exaggerates the extreme tails of the data. |
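| Parameters | location (ξ); scale (α), with α > 0 |
| Support | $-\infty < x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\alpha} \exp\left[-y - e^{-y}\right]$; CDF: $F(x) = \exp\left[-e^{-y}\right]$; inverse CDF: $x(F) = \xi - \alpha \ln(-\ln F)$; where $y = \dfrac{x-\xi}{\alpha}$ |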
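
The fixed skewness of approximately 1.14 quoted above follows from the standard closed-form result for the Gumbel distribution, 12√6·ζ(3)/π³, where ζ(3) is Apéry's constant. The short sketch below is an illustrative check only: the function name and the number of series terms are arbitrary choices, and the quantile function is the standard Gumbel inverse CDF.

```python
import math

def apery_constant(terms=100_000):
    """Approximate zeta(3) by direct summation of 1/n^3."""
    return sum(1.0 / n ** 3 for n in range(1, terms + 1))

# Closed-form skewness of the Gumbel distribution; independent of xi and alpha.
gumbel_skew = 12.0 * math.sqrt(6.0) * apery_constant() / math.pi ** 3

def gumbel_quantile(p, xi, alpha):
    """Inverse CDF: x(F) = xi - alpha * ln(-ln F)."""
    return xi - alpha * math.log(-math.log(p))

print(round(gumbel_skew, 2))
print(gumbel_quantile(0.99, 0.0, 1.0))
```
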
Weibull
| Field | Details |
|---|---|
| Description | The Weibull distribution is a two-parameter distribution commonly used in reliability analysis. It is related to a number of other probability distributions. In particular, the Weibull distribution interpolates between the exponential distribution (for κ=1) and the Rayleigh distribution (when κ=2). If the quantity x is the “time to failure”, the Weibull distribution gives the distribution for which the failure rate is proportional to a power of time. |
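| Parameters | scale (λ), with λ > 0; shape (κ), with κ > 0 |
| Support | $0 \le x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{\kappa}{\lambda}\left(\dfrac{x}{\lambda}\right)^{\kappa-1} \exp\left[-\left(\dfrac{x}{\lambda}\right)^{\kappa}\right]$; CDF: $F(x) = 1 - \exp\left[-\left(\dfrac{x}{\lambda}\right)^{\kappa}\right]$; inverse CDF: $x(F) = \lambda\left[-\ln(1-F)\right]^{1/\kappa}$ |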
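
The special cases named in the description, and the statement that the failure rate is proportional to a power of time, can be illustrated with the standard two-parameter Weibull CDF and hazard function. This is a minimal sketch with hypothetical function names and sample values; it is not taken from RMC-BestFit.

```python
import math

def weibull_cdf(x, lam, kappa):
    """F(x) = 1 - exp(-(x / lam)^kappa) for x >= 0."""
    return -math.expm1(-((x / lam) ** kappa))

def weibull_hazard(x, lam, kappa):
    """Failure rate h(x) = (kappa / lam) * (x / lam)^(kappa - 1)."""
    return (kappa / lam) * (x / lam) ** (kappa - 1.0)

lam = 2.0

# kappa = 1: Exponential distribution with a constant failure rate 1 / lam.
print(weibull_hazard(0.5, lam, 1.0), weibull_hazard(5.0, lam, 1.0))

# kappa = 2: Rayleigh distribution with Rayleigh scale s = lam / sqrt(2);
# the failure rate grows linearly with time.
s = lam / math.sqrt(2.0)
x = 1.5
rayleigh_cdf = 1.0 - math.exp(-x ** 2 / (2.0 * s ** 2))
print(weibull_cdf(x, lam, 2.0), rayleigh_cdf)
```
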
Generalized Extreme Value
| Field | Details |
|---|---|
| Description | The Generalized Extreme Value (GEV) distribution is a three-parameter distribution that subsumes the three extreme-value distributions: Gumbel (EVI), Fréchet (EVII), and Weibull (EVIII). The shape parameter κ determines which sub-distribution the GEV represents. By the extreme value theorem, the GEV distribution is the limiting distribution of the suitably normalized maxima of a sequence of independent and identically distributed random variables. The GEV is used in hydrologic frequency analysis, insurance, and financial risk analysis. RMC-BestFit uses Hosking’s parameterization (Hosking and Wallis, 1997) [?], in which the distribution has no upper bound when the shape parameter κ is negative and a fixed upper bound when κ is positive. Other sources adopt the opposite sign convention, in which a negative shape parameter implies an upper bound. |
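| Parameters | location (ξ); scale (α), with α > 0; shape (κ) |
| Support | If $\kappa > 0$: $-\infty < x \le \xi + \alpha/\kappa$; If $\kappa = 0$: $-\infty < x < \infty$; If $\kappa < 0$: $\xi + \alpha/\kappa \le x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\alpha} \exp\left[-(1-\kappa)\,y - e^{-y}\right]$; CDF: $F(x) = \exp\left[-e^{-y}\right]$; inverse CDF: $x(F) = \xi + \dfrac{\alpha}{\kappa}\left[1 - (-\ln F)^{\kappa}\right]$ if $\kappa \ne 0$, and $x(F) = \xi - \alpha \ln(-\ln F)$ if $\kappa = 0$; where $y = -\dfrac{1}{\kappa} \ln\left[1 - \dfrac{\kappa (x-\xi)}{\alpha}\right]$ if $\kappa \ne 0$, and $y = \dfrac{x-\xi}{\alpha}$ if $\kappa = 0$ |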
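
Because the sign convention for κ differs between sources, it helps to see how the bounds follow from the quantile function. The sketch below assumes the Hosking and Wallis form of the GEV inverse CDF; the function names are hypothetical and this is not RMC-BestFit code. A shape parameter taken from a source that uses the opposite convention would need its sign flipped before being passed to these functions.

```python
import math

def gev_quantile(p, xi, alpha, kappa):
    """GEV inverse CDF in Hosking's convention (kappa > 0 is upper bounded)."""
    y = -math.log(-math.log(p))  # Gumbel reduced variate
    if kappa == 0.0:
        return xi + alpha * y    # Gumbel (EVI) case
    # Equivalent to xi + alpha * (1 - (-ln p)^kappa) / kappa
    return xi + alpha * (-math.expm1(-kappa * y)) / kappa

def gev_bounds(xi, alpha, kappa):
    """(lower, upper) support limits implied by the sign of kappa."""
    if kappa > 0.0:
        return -math.inf, xi + alpha / kappa
    if kappa < 0.0:
        return xi + alpha / kappa, math.inf
    return -math.inf, math.inf

# Hypothetical parameters: same location and scale, opposite shape signs.
for kappa in (0.2, 0.0, -0.2):
    print(kappa, gev_bounds(0.0, 1.0, kappa), gev_quantile(0.999999, 0.0, 1.0, kappa))
```

With κ = 0.2 the extreme quantile stays below the upper bound ξ + α/κ = 5, while with κ = −0.2 it grows without limit as p approaches 1.
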
Generalized Pareto
| Field | Details |
|---|---|
| Description | The Generalized Pareto (GPA) distribution is a three-parameter distribution with a fixed lower bound. The GPA is upper bounded when the shape parameter κ is positive. When κ is zero, the distribution reduces to a shifted Exponential distribution. The GPA distribution is most often used with peaks-over-threshold data. Hydrologic frequency analysis uses the GPA in cases where rainfall or flow maxima exceed a specified threshold. RMC-BestFit uses Hosking’s parameterization (Hosking and Wallis, 1997) [?], in which the distribution has no upper bound when κ is negative and a fixed upper bound when κ is positive. |
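| Parameters | location (ξ); scale (α), with α > 0; shape (κ) |
| Support | If $\kappa > 0$: $\xi \le x \le \xi + \alpha/\kappa$; If $\kappa \le 0$: $\xi \le x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\alpha} \exp\left[-(1-\kappa)\,y\right]$; CDF: $F(x) = 1 - e^{-y}$; inverse CDF: $x(F) = \xi + \dfrac{\alpha}{\kappa}\left[1 - (1-F)^{\kappa}\right]$ if $\kappa \ne 0$, and $x(F) = \xi - \alpha \ln(1-F)$ if $\kappa = 0$; where $y = -\dfrac{1}{\kappa} \ln\left[1 - \dfrac{\kappa (x-\xi)}{\alpha}\right]$ if $\kappa \ne 0$, and $y = \dfrac{x-\xi}{\alpha}$ if $\kappa = 0$ |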
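
The reduction to a shifted Exponential distribution at κ = 0, and the upper bound for positive κ, can be seen directly from the quantile function. The sketch below assumes the Hosking and Wallis form of the GPA inverse CDF; the function name and parameter values are hypothetical, and this is not RMC-BestFit code.

```python
import math

def gpa_quantile(p, xi, alpha, kappa):
    """GPA inverse CDF in Hosking's convention (kappa > 0 is upper bounded)."""
    y = -math.log1p(-p)          # equals -ln(1 - p)
    if kappa == 0.0:
        return xi + alpha * y    # shifted Exponential case
    # Equivalent to xi + alpha * (1 - (1 - p)^kappa) / kappa
    return xi + alpha * (-math.expm1(-kappa * y)) / kappa

xi, alpha = 10.0, 2.0  # hypothetical threshold (location) and scale

# p = 0 returns the fixed lower bound xi for any shape.
print(gpa_quantile(0.0, xi, alpha, 0.25))

# A tiny kappa is numerically indistinguishable from the Exponential case.
print(gpa_quantile(0.9, xi, alpha, 1e-9), gpa_quantile(0.9, xi, alpha, 0.0))

# For kappa > 0, quantiles approach the upper bound xi + alpha / kappa = 18.
print(gpa_quantile(0.999999, xi, alpha, 0.25))
```
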
Logistic Distributions
Logistic
| Field | Details |
|---|---|
| Description | The Logistic distribution is a two-parameter, symmetric distribution with heavier tails (higher kurtosis) than the Normal distribution. The Logistic distribution has applications in hydrology for long duration discharge or rainfall, such as monthly or yearly totals. The most common application is in logistic regression where the errors follow a Logistic distribution. |
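| Parameters | location (ξ); scale (α), with α > 0 |
| Support | $-\infty < x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\alpha}\,\dfrac{e^{-y}}{\left(1 + e^{-y}\right)^{2}}$; CDF: $F(x) = \dfrac{1}{1 + e^{-y}}$; inverse CDF: $x(F) = \xi + \alpha \ln\left(\dfrac{F}{1-F}\right)$; where $y = \dfrac{x-\xi}{\alpha}$ |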
Generalized Logistic
| Field | Details |
|---|---|
| Description | The Generalized Logistic (GLO) distribution is a heavy-tailed, three-parameter distribution. The GLO distribution has been used to fit extreme values, such as stock return fluctuations and sea levels. It has been used extensively for modeling annual rainfall maxima and for flood frequency analysis. RMC-BestFit uses Hosking’s parameterization (Hosking and Wallis, 1997) [?], in which the distribution has no upper bound when the shape parameter κ is negative and a fixed upper bound when κ is positive. |
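| Parameters | location (ξ); scale (α), with α > 0; shape (κ) |
| Support | If $\kappa > 0$: $-\infty < x \le \xi + \alpha/\kappa$; If $\kappa = 0$: $-\infty < x < \infty$; If $\kappa < 0$: $\xi + \alpha/\kappa \le x < \infty$ |
| Distribution Functions | PDF: $f(x) = \dfrac{1}{\alpha}\,\dfrac{e^{-(1-\kappa)\,y}}{\left(1 + e^{-y}\right)^{2}}$; CDF: $F(x) = \dfrac{1}{1 + e^{-y}}$; inverse CDF: $x(F) = \xi + \dfrac{\alpha}{\kappa}\left[1 - \left(\dfrac{1-F}{F}\right)^{\kappa}\right]$ if $\kappa \ne 0$, and $x(F) = \xi - \alpha \ln\left(\dfrac{1-F}{F}\right)$ if $\kappa = 0$; where $y = -\dfrac{1}{\kappa} \ln\left[1 - \dfrac{\kappa (x-\xi)}{\alpha}\right]$ if $\kappa \ne 0$, and $y = \dfrac{x-\xi}{\alpha}$ if $\kappa = 0$ |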