Statistics for Data Science
上QQ阅读APP看书,第一时间看更新

Fitting

Fitting is the process of measuring how well a statistical model or process describes a data scientist's observations pertaining to a recordset or experiment. These measures will attempt to point out the discrepancy between observed values and probable values. The probable values of a model or process are known as a distribution or a probability distribution.

Therefore, a probability distribution fitting (or distribution fitting) is when the data scientist fits a probability distribution to a series of data concerning the repeated measurement of a variable phenomenon.

The object of a data scientist performing a distribution fitting is to predict the probability or to forecast the frequency of, the occurrence of the phenomenon at a certain interval.

One of the most common uses of fitting is to test whether two samples are drawn from identical distributions.

There are numerous probability distributions a data scientist can select from. Some will fit better to the observed frequency of the data than others will. The distribution giving a close fit is supposed to lead to good predictions; therefore, the data scientist needs to select a distribution that suits the data well.