# Probability Distributions

Statgraphics contains several procedures for manipulating statistical probability distributions. Each of more than 50 distributions may be plotted, fit to data, and used to calculate critical values or tail areas. Random samples may also be generated from each of the distributions with this stat software.

The following procedures are provided:

Procedure Statgraphics Centurion 18/19 Statgraphics
Sigma express
Statgraphics
stratus
Statgraphics
Web Services
StatBeans
Probability Distributions     Sampling Distributions     Normal Probability Plot     Probability Plots for Nonnormal Data  Distribution Fitting (Uncensored Data)     Distribution Fitting (Censored Data) Distribution Fitting (Arbitrarily Censored Data) Random Number Generation     Multivariate Normal Random Numbers Statistical Tolerance Limits     Multivariate Tolerance Limits Univariate Mixture Distributions (V19 only) Bivariate Mixture Distributions (V19 only) #### Probability Distributions

The Statgraphics Probability Distributions procedure calculates probabilities for 46 discrete and continuous distributions. The stat software will plot the probability density or mass function, cumulative distribution function, survivor function, log survivor function, or hazard function. It also calculates critical values and tail areas. Random samples may be generated from any of the distributions given specified parameters value. #### Sampling Distributions

The stat software's Sampling Distributions procedure calculates tail areas and critical values for the normal, Student's t, chi-square, and F distributions. It also plots the calculated results. #### Normal Probability Plot

The Normal Probability Plot is used to help judge whether or not a sample of numeric data comes from a normal probability distribution. If it does, the points should fall close to a straight line when plotted against the specially scaled Y-axis. For non-normal data, you can often determine the type of departure from normality by examining the way in which the data deviate from the normal reference line. #### Probability Plots for Nonnormal Data

The Probability Plots procedure plots the data in a single numeric column on graphs that are specifically scaled so that, if the data come from a particular distribution, the observations will fall approximately along a straight line. The procedure includes plots for the uniform, normal, lognormal, Weibull, smallest extreme value, logistic, and exponential distributions. #### Distribution Fitting (Uncensored Data)

The Distribution Fitting (Uncensored Data) procedure fits any of 46 probability distributions to a column of numeric data. The data are assumed to be uncensored, i.e., the data represent random samples from the selected distribution. If requested, many distributions may be fit and ordered by the stat software for their ability to match the data. Goodness-of-fit tests are performed to determine which distributions adequately model the observed values. #### Distribution Fitting (Censored Data)

The Distribution Fitting (Censored Data) procedure fits any of 45 probability distributions to a column of censored numeric data. Censoring occurs when some of the data values are not known exactly. For example, when measuring failure times, some items under study may not have failed when the study is stopped, resulting in only a lower bound on the failure times for those items. As with uncensored data, the distributions may be sorted according to their goodness-of-fit. #### Distribution Fitting (Arbitrarily Censored Data)

The Distribution Fitting (Arbitrarily Censored Data) procedure analyzes data in which one or more observations are not known exactly. In particular, observations may be:

1. Left-censored: known only to be less than a stated value.
2. Right-censored: known only to be greater than a stated value.
3. Interval censored: known only to fall within a stated interval.

The procedure calculates summary statistics, fits distributions, creates graphs, and calculates a nonparametric estimate of the survival function. #### Random Number Generation

Random numbers may be generated by the stat software from any of the 46 probability distributions using the Probability Distributions procedure. They may also be generated as part of the Monte Carlo simulations in Statgraphics Centurion. #### Multivariate Normal Random Numbers

This procedure generates random numbers from a multivariate normal distribution involving up to 12 variables. The user inputs the variable means, standard deviations, and the correlation matrix. Random samples are generated which may be saved to the Statgraphics databook. #### Statistical Tolerance Limits

Statistical tolerance limits bound a specified proportion of a population at a specified confidence level. They have many uses, including demonstrating that a large proportion of the members of a population lie within required limits. Statgraphics Centurion calculates statistical tolerance limits for 11 specific probability distributions, as well as nonparametric limits. #### Multivariate Tolerance Limits

The Multivariate Tolerance Limits procedure creates statistical tolerance limits for data consisting of more than one variable. It includes a tolerance region that bounds a selected p% of the population with 100(1-alpha)% confidence. It also includes joint simultaneous tolerance limits for each of the variables using a Bonferroni approach. The data are assumed to be a random sample from a multivariate normal distribution. Multivariate tolerance limits are often compared to specifications for multiple variables to determine whether or not most of the population is within spec. #### Univariate Mixture Distributions (Version 19)

The Distribution Fitting (Univariate Mixture Models) fits a distribution to continuous numeric data that consists of a mixture of 2 or more univariate Gaussian distributions. The components of the mixture may represent different groups in the sample used to fit the overall distribution, or the mixture model may approximate some distribution with a complicated shape. The procedure fits the distribution, creates graphs, and calculates tail areas and critical values. Tools are also provided for determining how many components are needed to represent a data sample. #### Bivariate Mixture Distributions (Version 19)

The Distribution Fitting (Bivariate Mixture Distributions) procedure fits a distribution to continuous numeric data that consists of a mixture of 2 or more bivariate Gaussian distributions. The components of the mixture may represent different groups in the sample used to fit the overall distribution, or the mixture model may approximate some distribution with a complicated shape. The procedure fits the distribution and creates graphs of the fitted model. Tools are also provided for determining how many components are needed to represent a data sample. 