The use of goodness-of-fit criteria

Goodness-of-fit (agreement) criteria

To test the hypothesis that an empirical distribution corresponds to a theoretical distribution law, special statistical measures are used: goodness-of-fit criteria. These include the criteria of Pearson, Kolmogorov, Romanovsky, Yastremsky, and others. Most goodness-of-fit criteria are based on the deviations of empirical frequencies from theoretical ones. Obviously, the smaller these deviations, the better the theoretical distribution matches (describes) the empirical one.

Goodness-of-fit criteria are criteria for testing hypotheses that an empirical distribution corresponds to a theoretical probability distribution. Such criteria fall into two classes: general and special. General goodness-of-fit criteria apply to the most general formulation of a hypothesis, namely the hypothesis that observed outcomes agree with any a priori assumed probability distribution. Special goodness-of-fit tests rest on special null hypotheses that formulate agreement with a particular form of probability distribution.

Based on the assumed distribution law, goodness-of-fit criteria make it possible to establish when the discrepancies between theoretical and empirical frequencies should be recognized as insignificant (random) and when as significant (non-random). It follows that goodness-of-fit criteria make it possible to reject or confirm the hypothesis put forward, when fitting (leveling) the empirical series, about the nature of its distribution, and to answer whether a model expressed by some theoretical distribution law can be accepted for a given empirical distribution.

Pearson's χ² (chi-square) goodness-of-fit test is one of the main goodness-of-fit criteria. It was proposed by the English mathematician Karl Pearson (1857-1936) to assess the randomness (significance) of the discrepancies between the frequencies of empirical and theoretical distributions:

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - f_i^{T})^2}{f_i^{T}},$$

where k is the number of groups into which the empirical distribution is divided; f_i is the empirical frequency of the trait in the i-th group; f_i^T is the theoretical frequency of the trait in the i-th group.

The scheme for applying the χ² criterion to assess the agreement between a theoretical and an empirical distribution reduces to the following (a code sketch follows the list).

  • 1. The calculated measure of discrepancy χ²_calc is determined.
  • 2. The number of degrees of freedom v is determined.
  • 3. From the number of degrees of freedom v, using a special table, the tabular value χ²_table is determined.
  • 4. If χ²_calc > χ²_table, then for the given significance level α and number of degrees of freedom v, the hypothesis that the discrepancies are insignificant (random) is rejected. Otherwise, the hypothesis can be recognized as not contradicting the experimental data, and with probability (1 − α) it can be asserted that the discrepancies between the theoretical and empirical frequencies are random.
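A minimal sketch of this four-step scheme in Python, assuming numpy and scipy are available; the grouped frequencies and the group boundaries are illustrative numbers, not data from the text, and the hypothesized law N(0, 1) is likewise an assumption.

```python
import numpy as np
from scipy.stats import chi2, norm

f_emp = np.array([6, 12, 24, 30, 18, 10])      # empirical group frequencies (assumed)
n, k = f_emp.sum(), len(f_emp)
edges = np.array([-3, -2, -1, 0, 1, 2, 3])     # group boundaries (assumed)

# Theoretical frequencies under the hypothesized N(0, 1) law
p = np.diff(norm.cdf(edges))
f_theor = n * p / p.sum()

# Step 1: calculated measure of discrepancy chi^2_calc
chi2_calc = np.sum((f_emp - f_theor) ** 2 / f_theor)

# Step 2: degrees of freedom (one constraint here: equal totals)
v = k - 1

# Step 3: tabular value for the chosen significance level alpha
alpha = 0.05
chi2_table = chi2.ppf(1 - alpha, v)

# Step 4: compare and decide
print(f"chi2_calc = {chi2_calc:.2f}, chi2_table = {chi2_table:.2f}")
print("reject H0" if chi2_calc > chi2_table else "discrepancies may be random")
```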

The significance level is the probability of erroneously rejecting the hypothesis put forward, i.e., the probability that a correct hypothesis will be rejected. In statistical studies, depending on the importance and responsibility of the problems being solved, one of the following three significance levels is used:

  • 1) α = 0.10, then P = 0.90;
  • 2) α = 0.05, then P = 0.95;
  • 3) α = 0.01, then P = 0.99.

When using the χ² goodness-of-fit test, the following conditions must be observed.

  • 1. The size of the studied population must satisfy n > 50, and the frequency (size) of each group must be at least 5. If this condition is violated, the small frequencies (less than 5) must first be merged.
  • 2. The empirical distribution must consist of data obtained by random selection, i.e., the observations must be independent.

A disadvantage of Pearson's goodness-of-fit criterion is the loss of part of the initial information, caused by the need to group the observations into intervals and to combine individual intervals with small numbers of observations. It is therefore recommended to supplement the χ² check of the correspondence of distributions with other criteria. This is especially necessary when the sample size is n ≈ 100.

In statistics, the Kolmogorov goodness-of-fit test (also known as the Kolmogorov-Smirnov goodness-of-fit test) is used to determine whether two empirical distributions obey the same law, or whether an obtained distribution obeys an assumed model. The Kolmogorov criterion is based on the maximum difference between the accumulated (cumulative) frequencies or relative frequencies of the empirical and theoretical distributions. The Kolmogorov statistic is calculated by the following formulas:

$$\lambda = \frac{D}{\sqrt{N}} \quad\text{or}\quad \lambda = \frac{d}{\sqrt{N}},$$

where D and d are, respectively, the maximum difference between the cumulative frequencies (f − f′) and between the cumulative relative frequencies (p − p′) of the empirical and theoretical distribution series; N is the number of units in the population.

Having calculated the value λ, one determines from a special table the probability with which it can be asserted that the deviations of the empirical frequencies from the theoretical ones are random. If λ takes values up to 0.3, then P(λ) ≈ 1, i.e., the frequencies practically coincide. With a large number of observations, the Kolmogorov test is able to detect any deviation from the hypothesis: any difference between the sample distribution and the theoretical one will be detected if there are enough observations. The practical significance of this property is small, since in most cases it is difficult to count on obtaining a large number of observations under constant conditions, the theoretical idea of the distribution law that the sample must obey is always approximate, and the accuracy of statistical checks should not exceed the accuracy of the chosen model.
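As an illustration of the Kolmogorov(-Smirnov) check of a fully specified law, the sketch below uses scipy.stats.kstest; the sample is simulated, and the N(0, 1) parameters are fixed a priori rather than estimated from the data, as the simple-hypothesis setting requires.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)   # simulated observations

# H0: the sample comes from N(0, 1); parameters fixed a priori, not fitted.
res = kstest(x, "norm", args=(0.0, 1.0))
print(res.statistic, res.pvalue)               # D_n and the corresponding probability
```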

Romanovsky's goodness-of-fit test is based on Pearson's test, i.e., on an already found value of χ² and the number of degrees of freedom:

$$C = \frac{\chi^2 - \nu}{\sqrt{2\nu}},$$

where ν is the number of degrees of freedom of variation.

The Romanovsky criterion is convenient in the absence of tables for χ². If C < 3, the discrepancies between the distributions are random; if C > 3, they are not random, and the theoretical distribution cannot serve as a model for the empirical distribution under study.
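A tiny sketch of this rule, using the ratio C = (χ² − ν)/√(2ν) reconstructed above; the numeric inputs are assumed for illustration only.

```python
import math

def romanovsky(chi2_value: float, v: int) -> float:
    """Romanovsky's ratio for an already computed chi-square value."""
    return (chi2_value - v) / math.sqrt(2 * v)

C = romanovsky(chi2_value=7.8, v=5)            # assumed example values
print(C, "random" if abs(C) < 3 else "not random")
```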

B. S. Yastremsky used in his goodness-of-fit criterion not the number of degrees of freedom but the number of groups (k), a special quantity θ depending on the number of groups, and the chi-square value. Yastremsky's criterion has the same meaning as Romanovsky's criterion and is expressed by the formula

$$L = \frac{|\chi^2 - k|}{\sqrt{2k + 4\theta}},$$

where χ² is Pearson's goodness-of-fit statistic; k is the number of groups; θ is a coefficient equal to 0.6 when the number of groups is less than 20.

If L_fact > 3, the discrepancies between the theoretical and empirical distributions are not random, i.e., the empirical distribution does not meet the requirements of a normal distribution. If L_fact < 3, the discrepancies are considered random.
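A sketch under the formula reconstructed above; since the formula itself was elided in the source, treat the expression |χ² − k| / √(2k + 4θ) as an assumption, and the inputs as illustrative.

```python
import math

def yastremsky(chi2_value: float, k: int, theta: float = 0.6) -> float:
    """Yastremsky's statistic; theta = 0.6 for fewer than 20 groups."""
    return abs(chi2_value - k) / math.sqrt(2 * k + 4 * theta)

L = yastremsky(chi2_value=7.8, k=6)            # assumed example values
print(L, "random" if L < 3 else "not random")
```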

MINISTRY OF EDUCATION AND SCIENCE OF UKRAINE

AZOV REGIONAL INSTITUTE OF MANAGEMENT

ZAPORIZHIA NATIONAL TECHNICAL UNIVERSITY

Department of Mathematics

COURSE WORK

In the discipline "STATISTICS"

On the topic: "GOODNESS-OF-FIT CRITERIA"

2nd-year student

Group 207 Faculty of Management

Batura Tatyana Olegovna

Scientific adviser

Associate Professor Kosenkov O.I.

Berdyansk - 2009


INTRODUCTION

SECTION I. THEORETICAL SUBSTANTIATION OF GOODNESS-OF-FIT CRITERIA

1.1 The Kolmogorov and omega-square goodness-of-fit criteria in the case of a simple hypothesis

1.2 Pearson's χ² goodness-of-fit test for a simple hypothesis

1.3 Goodness-of-fit tests for a complex hypothesis

1.4 Fisher's χ² goodness-of-fit tests for a complex hypothesis

1.5 Other goodness-of-fit criteria. Goodness-of-fit tests for the Poisson distribution

SECTION II. PRACTICAL APPLICATION OF GOODNESS-OF-FIT CRITERIA

APPENDICES

LIST OF USED LITERATURE


INTRODUCTION

This course work describes the most common goodness-of-fit criteria: omega-square, chi-square, Kolmogorov, and Kolmogorov-Smirnov. Particular attention is paid to the case when one must check whether the distribution of the data belongs to some parametric family, for example, the normal one. Because of its complexity, this situation, which is very common in practice, has not been fully studied and is not fully reflected in the educational and reference literature.

Goodness-of-fit criteria are statistical tests designed to check the agreement between experimental data and a theoretical model. This question is best posed when the observations represent a random sample. The theoretical model in this case describes the distribution law.

The theoretical distribution is the probability distribution that governs the random selection. Not only theory can give ideas about it: tradition, past experience, and previous observations can also be sources of such knowledge. We need only emphasize that this distribution must be chosen independently of the data on which we are going to test it. In other words, it is unacceptable first to "fit" a certain distribution law to a sample and then try to check agreement with the obtained law on the same sample.

Simple and complex hypotheses. Speaking of the theoretical distribution law that the elements of a given sample hypothetically follow, we must distinguish between simple and complex hypotheses about this law:

a simple hypothesis directly specifies a single concrete law of probabilities (probability distribution) from which the sample values arose;

a complex hypothesis specifies not a single distribution but some set of them (for example, a parametric family).

Goodness-of-fit criteria are based on various measures of distance between the analyzed empirical distribution and the distribution function of the feature in the general population.

The non-parametric goodness-of-fit tests of Kolmogorov, Smirnov, and omega-square are widely used. However, they are also associated with widespread errors in the application of statistical methods.

The fact is that the listed criteria were developed for testing agreement with a fully known theoretical distribution. Calculation formulas, distribution tables, and critical values are widely available. The main idea of the Kolmogorov, omega-square, and similar criteria is to measure the distance between the empirical distribution function and the theoretical distribution function. These criteria differ in the form of the distance used in the space of distribution functions.

Starting this course work, I set myself the goal of finding out what goodness-of-fit criteria exist and why they are needed. To achieve this goal, the following tasks must be completed:

1. Reveal the essence of the concept of "goodness-of-fit criteria";

2. Determine what goodness-of-fit criteria exist and study each of them;

3. Draw conclusions on the work done.


SECTION I. THEORETICAL SUBSTANTIATION OF GOODNESS-OF-FIT CRITERIA

1.1 The Kolmogorov and omega-square goodness-of-fit criteria in the case of a simple hypothesis

Simple hypothesis. Consider the situation where the measured data are numbers, in other words, one-dimensional random variables. The distribution of a one-dimensional random variable can be fully described by its distribution function, and many goodness-of-fit tests are based on checking the closeness of the theoretical and empirical (sample) distribution functions.

Suppose we have a sample of size n. Let us denote by G(x) the true distribution function to which the observations are subject, by F_n(x) the empirical (sample) distribution function, and by F(x) the hypothetical distribution function. Then the hypothesis H that the true distribution function is F(x) is written as H: G(·) = F(·).

How can hypothesis H be tested? If H is true, then F_n and F should show a certain similarity, and the difference between them should decrease as n grows. By Bernoulli's theorem, F_n(x) → F(x) as n → ∞ for each fixed x. Various methods are used to quantify the similarity of the functions F_n and F.

To express the similarity of the functions, one or another distance between them can be used. For example, one can compare F_n and F in the uniform metric, i.e., consider the quantity

$$D_n = \sup_{-\infty<x<\infty} |F_n(x) - F(x)|. \qquad (1.1)$$

The statistic D_n is called the Kolmogorov statistic.

Obviously, D_n is a random variable, since its value depends on the random object F_n. If the hypothesis H_0 is true and n → ∞, then F_n(x) → F(x) for any x. Therefore, it is natural that under these conditions D_n → 0. If the hypothesis H_0 is false, then F_n → G with G ≠ F, and therefore D_n tends to sup_{−∞<x<∞} |G(x) − F(x)| > 0, so that D_n no longer tends to zero.

As always when testing a hypothesis, we reason as if the hypothesis were true. Clearly, H_0 must be rejected if the value of the statistic D_n obtained in the experiment seems improbably large. But for this we need to know how the statistic D_n is distributed under the hypothesis H: F = G for given n and G.

A remarkable property of D_n is that if G = F, i.e., if the hypothetical distribution is specified correctly, then the distribution law of the statistic D_n is the same for all continuous functions G: it depends only on the sample size n.

The proof of this fact rests on the observation that the statistic does not change its value under monotone transformations of the x-axis. By such a transformation, any continuous distribution G can be turned into the uniform distribution on the interval [0, 1]. In this case, F_n(x) passes into the empirical distribution function of a sample from this uniform distribution.

For small n, tables of percentage points have been compiled for the statistic D_n under the hypothesis H_0. For large n, the distribution of D_n (under H_0) is described by the limit theorem found in 1933 by A. N. Kolmogorov. It concerns the statistic √n·D_n

(since D_n itself tends to 0 under H_0, it must be multiplied by an unboundedly growing factor for the distribution to stabilize). Kolmogorov's theorem states that if H_0 is true and G is continuous, then

$$\lim_{n\to\infty} P\left(\sqrt{n}\,D_n \le t\right) = K(t) = \sum_{j=-\infty}^{\infty} (-1)^j e^{-2j^2t^2}, \quad t > 0. \qquad (1.2)$$
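The series (1.2) converges very fast, so a short partial sum reproduces the limit distribution to machine precision; a plain-Python sketch (the value 1.36 below is the familiar 5% point):

```python
import math

def kolmogorov_cdf(t: float, terms: int = 100) -> float:
    """Partial sum of the Kolmogorov limit distribution K(t) from (1.2)."""
    if t <= 0:
        return 0.0
    return 1 + 2 * sum((-1) ** j * math.exp(-2 * j * j * t * t)
                       for j in range(1, terms + 1))

print(kolmogorov_cdf(1.36))    # ~0.95, i.e. the 5% critical point
```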

This sum is equally easy to compute in Maple. To test a simple hypothesis H_0: G = F, the value of the statistic D_n must be computed from the initial sample. A simple formula works for this:

$$D_n = \max_{1\le k\le n} \max\left(\frac{k}{n} - F(x_{(k)}),\; F(x_{(k)}) - \frac{k-1}{n}\right). \qquad (1.3)$$

Here x_(k) denotes the elements of the variational series (order statistics) constructed from the original sample. The obtained value of D_n must then be compared with critical values extracted from tables or computed by the asymptotic formula. The hypothesis H_0 is rejected (at the chosen significance level) if the value of D_n obtained in the experiment exceeds the critical value corresponding to the accepted significance level.
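A direct computation of D_n by formula (1.3) from the variational series; the sample is simulated and the hypothesized law N(0, 1) is an assumption for illustration.

```python
import numpy as np
from scipy.stats import norm

def kolmogorov_Dn(x, cdf):
    x = np.sort(np.asarray(x))        # variational series x_(1) <= ... <= x_(n)
    n = len(x)
    F = cdf(x)
    k = np.arange(1, n + 1)
    return max(np.max(k / n - F), np.max(F - (k - 1) / n))

rng = np.random.default_rng(1)
sample = rng.normal(size=100)
print(kolmogorov_Dn(sample, norm.cdf))    # compare with the critical value
```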

Another popular goodness-of-fit criterion is obtained by measuring the distance between F_n and F in an integral metric. It is based on the so-called omega-square statistic:

$$\omega_n^2 = \int_{-\infty}^{\infty} \left(F_n(x) - F(x)\right)^2 \, dF(x). \qquad (1.4)$$

To calculate it from real data, you can use the formula:

$$n\omega_n^2 = \frac{1}{12n} + \sum_{k=1}^{n} \left(F(x_{(k)}) - \frac{2k-1}{2n}\right)^2. \qquad (1.5)$$

If the hypothesis H 0 is true and the function G is continuous, the distribution of the omega-square statistic, just like the distribution of the statistic D n , depends only on n and does not depend on G.

Just as for D_n, tables of percentage points are available for nω_n² for small n, while for large n the limiting (as n → ∞) distribution of the statistic nω_n² should be used; here again one multiplies by an unboundedly growing factor for the distribution to stabilize. The limiting distribution was found by N. V. Smirnov in 1939; detailed tables and computational programs have been compiled for it. An important theoretical property of the criteria based on D_n and nω_n² is that they are consistent against any alternative G ≠ F.
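A sketch of the omega-square check via scipy.stats.cramervonmises (available in scipy 1.6 and later), whose statistic is exactly nω² from (1.5); the data are simulated and the parameters fixed a priori.

```python
import numpy as np
from scipy.stats import cramervonmises

rng = np.random.default_rng(2)
x = rng.normal(size=150)                   # simulated observations

res = cramervonmises(x, "norm", args=(0.0, 1.0))
print(res.statistic, res.pvalue)           # n*omega^2 from (1.5) and its p-value
```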

Since all assumptions about the nature of a particular distribution are hypotheses, they must be subjected to statistical verification using goodness-of-fit criteria, which make it possible to establish when the discrepancies between theoretical and empirical frequencies should be recognized as insignificant, i.e., random, and when as significant (non-random). Thus, goodness-of-fit criteria make it possible to reject or confirm the hypothesis put forward, when fitting the empirical series, about the nature of its distribution.

There are a number of goodness-of-fit criteria. The Pearson, Romanovsky, and Kolmogorov criteria are used most often.

The Pearson goodness-of-fit test is one of the main ones:

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - f_i^{T})^2}{f_i^{T}},$$

where k is the number of groups into which the empirical distribution is divided,
f_i is the observed frequency of the trait in the i-th group,
f_i^T is the theoretical frequency.
Tables have been compiled for the χ² distribution, giving the critical value of the goodness-of-fit criterion for a chosen significance level α and number of degrees of freedom df (or ν).
The significance level α is the probability of erroneously rejecting the hypothesis put forward, i.e., the probability that a correct hypothesis will be rejected. Three levels are used in statistics:

  • α = 0.10, then P = 0.90 (in 10 cases out of 100 a correct hypothesis may be rejected);
  • α = 0.05, then P = 0.95;
  • α = 0.01, then P = 0.99.

The number of degrees of freedom df is defined as the number of groups in the distribution series minus the number of constraints: df = k − z. The number of constraints is understood as the number of indicators of the empirical series used in calculating the theoretical frequencies, i.e., indicators linking the empirical and theoretical frequencies.
For example, when fitting a normal curve there are three constraints:

$$\sum f_i = \sum f_i^{T}; \qquad \bar{x} = \bar{x}^{T}; \qquad \sigma = \sigma^{T}.$$

Therefore, when fitting a normal distribution curve, the number of degrees of freedom is df = k − 3.
To assess significance, the calculated value χ²_calc is compared with the tabular value χ²_table.
With complete coincidence of the theoretical and empirical distributions χ² = 0; otherwise χ² > 0. If χ²_calc > χ²_table, then for the given significance level and number of degrees of freedom we reject the hypothesis that the discrepancies are insignificant (random).
If χ²_calc < χ²_table, we conclude that the empirical series agrees well with the hypothesized distribution, and with probability P = (1 − α) it can be asserted that the discrepancy between the theoretical and empirical frequencies is random.
Pearson's goodness-of-fit test is used when the population size is sufficiently large, and the frequency of each group must be at least 5.

Romanovsky's criterion C is based on the Pearson criterion, i.e., on an already found value of χ² and the number of degrees of freedom df:

$$C = \frac{\chi^2 - df}{\sqrt{2\,df}}.$$

It is useful when there are no tables for χ².
If C < 3, the discrepancies between the distributions are random; if C > 3, they are not random and the theoretical distribution cannot serve as a model for the empirical distribution under study.

Kolmogorov's criterion λ is based on the maximum discrepancy between the cumulative frequencies (or relative frequencies) of the empirical and theoretical distributions:

$$\lambda = \frac{D}{\sqrt{N}} \quad\text{or}\quad \lambda = \frac{d}{\sqrt{N}},$$

where D and d are, respectively, the maximum difference between the cumulative frequencies and between the cumulative relative frequencies of the empirical and theoretical distribution series;
N is the number of population units.
Having calculated the value of λ, one determines from the table P(λ) the probability with which it can be asserted that the deviations of the empirical frequencies from the theoretical ones are random. The probability P(λ) can vary from 0 to 1: at P(λ) = 1 there is complete coincidence of the frequencies, at P(λ) = 0 complete divergence. If λ takes values up to 0.3, then P(λ) ≈ 1.
The main condition for using the Kolmogorov criterion is a sufficiently large number of observations.

In this section we consider one of the issues related to testing the plausibility of hypotheses, namely, the consistency between theoretical and statistical distributions.

Suppose a given statistical distribution has been fitted (leveled) by some theoretical curve f(x) (Fig. 7.6.1). No matter how well the theoretical curve is chosen, some discrepancies between it and the statistical distribution are inevitable. The question naturally arises: are these discrepancies due only to random circumstances associated with the limited number of observations, or are they significant and related to the fact that the chosen curve fits the statistical distribution poorly? Goodness-of-fit criteria are used to answer this question.

The idea behind applying the goodness-of-fit criteria is as follows.

On the basis of the given statistical material we must test the hypothesis H that the random variable X obeys some definite distribution law. This law can be given in one form or another: for example, as a distribution function F(x), as a distribution density f(x), or as a set of probabilities p_i, where p_i is the probability that the value X falls into the i-th interval.

Since of these forms the distribution function F(x) is the most general and determines any other, we will formulate the hypothesis H as the statement that the quantity X has the distribution function F(x).

To accept or reject the hypothesis H, consider some quantity U characterizing the degree of discrepancy between the theoretical and statistical distributions. The quantity U can be chosen in various ways: for example, as U one can take the sum of the squared deviations of the theoretical probabilities p_i from the corresponding frequencies p_i*, or the sum of the same squares with certain coefficients ("weights"), or the maximum deviation of the statistical distribution function F*(x) from the theoretical F(x), etc. Suppose the quantity U has been chosen in one way or another. Obviously, it is a random variable. The distribution law of this random variable depends on the distribution law of the random variable X, on which the experiments were carried out, and on the number of experiments n. If the hypothesis H is true, then the distribution law of the quantity U is determined by the distribution law of the quantity X (the function F(x)) and the number n.

Let us assume that this distribution law is known to us. Suppose that as a result of the given series of experiments the chosen measure of discrepancy U took on some value u. The question is: can this be explained by random causes, or is this discrepancy too large, indicating the presence of a significant difference between the theoretical and statistical distributions and, consequently, the unsuitability of the hypothesis H? To answer this question, assume that the hypothesis H is correct, and under this assumption calculate the probability that, owing to random causes associated with the insufficient amount of experimental material, the measure of discrepancy U turns out to be no less than the value u observed in the experiment, i.e., calculate the probability of the event

$$P(U \ge u).$$

If this probability is very small, then the hypothesis H should be rejected as hardly plausible; if this probability is significant, it should be recognized that the experimental data do not contradict the hypothesis H.

The question arises: how should the measure of discrepancy U be chosen? It turns out that for some ways of choosing it, the distribution law of the quantity U has very simple properties and, for sufficiently large n, is practically independent of the function F(x). Precisely such measures of discrepancy are used in mathematical statistics as goodness-of-fit criteria.

Let us consider one of the most commonly used goodness-of-fit criteria, the so-called χ² criterion of Pearson.

Assume that there are n independent experiments, in each of which the random variable X took a certain value. The results of the experiments are grouped into k intervals and presented in the form of a statistical series.

Theoretical and empirical frequencies. Test for normal distribution

When analyzing variational distribution series, it is of great importance how closely the empirical distribution of the trait corresponds to the normal one. For this, the frequencies of the actual distribution must be compared with theoretical frequencies characteristic of the normal distribution. This means that the theoretical frequencies of the normal distribution curve, which are a function of the normalized deviations, must be calculated from the actual data.

In other words, the empirical distribution curve must be aligned with the normal distribution curve.

An objective characteristic of the correspondence between theoretical and empirical frequencies can be obtained using special statistical indicators called goodness-of-fit criteria.

A goodness-of-fit criterion is a criterion that makes it possible to determine whether the discrepancy between the empirical and theoretical distributions is random or significant, i.e., whether the observational data are consistent with the statistical hypothesis put forward or not. The distribution of the general population that it has by virtue of the hypothesis put forward is called theoretical.

There is thus a need to establish a criterion (rule) that would allow one to judge whether the discrepancy between the empirical and theoretical distributions is random or significant. If the discrepancy turns out to be random, the observational data (sample) are considered consistent with the hypothesis put forward about the distribution law of the general population, and the hypothesis is accepted; if the discrepancy is significant, the observational data do not agree with the hypothesis, and it is rejected.

Usually empirical and theoretical frequencies differ for one of two reasons:

    the discrepancy is random and associated with the limited number of observations;

    the discrepancy is not accidental and is explained by the fact that the statistical hypothesis that the general population is normally distributed is erroneous.

In this way, goodness-of-fit criteria make it possible to reject or confirm the hypothesis put forward, when fitting the empirical series, about the nature of its distribution.

Empirical frequencies are obtained from observation. Theoretical frequencies are calculated by formulas.

For the normal distribution law they can be found as follows (a code sketch follows the list):

$$f^{T} = \frac{\sum f_i \cdot h}{\sigma}\,\varphi(t),$$

where

    Σf_i is the sum of the empirical frequencies (the total number of observations);

    h is the difference between two adjacent variants;

    σ is the sample standard deviation;

    t is the normalized (standardized) deviation;

    φ(t) is the probability density function of the standard normal distribution (found from the table of values of the local Laplace function for the corresponding value of t).
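A hedged sketch of this theoretical-frequency formula for a grouped series; the class midpoints and the empirical frequencies below are illustrative, not data from the text, and scipy's norm.pdf plays the role of the Laplace-function table for φ(t).

```python
import numpy as np
from scipy.stats import norm

x_mid = np.array([2.0, 4.0, 6.0, 8.0, 10.0])    # class midpoints (assumed)
f_emp = np.array([5.0, 18.0, 34.0, 25.0, 8.0])  # empirical frequencies (assumed)

n = f_emp.sum()                                 # sum of empirical frequencies
h = x_mid[1] - x_mid[0]                         # step between adjacent variants
mean = np.average(x_mid, weights=f_emp)
sigma = np.sqrt(np.average((x_mid - mean) ** 2, weights=f_emp))

t = (x_mid - mean) / sigma                      # normalized deviations
f_theor = n * h / sigma * norm.pdf(t)           # f^T = (n*h/sigma) * phi(t)
print(np.round(f_theor, 1))
```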

There are several goodness-of-fit tests, the most common being the chi-square (Pearson) test, the Kolmogorov test, and the Romanovsky test.

Pearson's goodness-of-fit test χ² is one of the main ones; it can be represented as the sum of the ratios of the squared differences between the theoretical (f_T) and empirical (f) frequencies to the theoretical frequencies:

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - f_T)^2}{f_T},$$

where

    k is the number of groups into which the empirical distribution is divided,

    f_i is the observed frequency of the trait in the i-th group,

    f_T is the theoretical frequency.

Tables have been compiled for the χ² distribution, indicating the critical value of the goodness-of-fit criterion χ² for the chosen significance level α and degrees of freedom df (or ν). The significance level α is the probability of erroneously rejecting the hypothesis put forward, i.e., the probability that a correct hypothesis will be rejected; P is the probability of accepting a correct hypothesis. In statistics, three significance levels are most commonly used:

α = 0.10, then P = 0.90 (a correct hypothesis may be rejected in 10 cases out of 100);

α = 0.05, then P = 0.95 (in 5 cases out of 100);

α = 0.01, then P = 0.99 (in 1 case out of 100).

The number of degrees of freedom df is defined as the number of groups in the distribution series minus the number of constraints: df = k − z. The number of constraints is understood as the number of indicators of the empirical series used in calculating the theoretical frequencies, i.e., indicators linking the empirical and theoretical frequencies. For example, when fitting a normal (bell) curve there are three constraints, so the number of degrees of freedom is df = k − 3. To assess significance, the calculated value χ²_calc is compared with the tabular value χ²_table.

With complete coincidence of the theoretical and empirical distributions χ² = 0; otherwise χ² > 0. If χ²_calc > χ²_table, then for the given significance level and number of degrees of freedom we reject the hypothesis that the discrepancies are insignificant (random). If χ²_calc < χ²_table, the hypothesis is accepted, and with probability P = (1 − α) it can be asserted that the discrepancy between the theoretical and empirical frequencies is random. Consequently, there are grounds to assert that the empirical distribution obeys the normal distribution. Pearson's goodness-of-fit test is used if the population size is sufficiently large (N > 50), and the frequency of each group must be at least 5.
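A sketch of this full normality check with parameters estimated from the data, hence df = k − 3; both frequency columns below are assumed numbers for illustration, not results from the text.

```python
import numpy as np
from scipy.stats import chi2

f_emp = np.array([5.0, 18.0, 34.0, 25.0, 8.0])     # assumed empirical frequencies
f_theor = np.array([6.1, 17.2, 32.8, 26.0, 9.3])   # assumed fitted normal frequencies

chi2_calc = np.sum((f_emp - f_theor) ** 2 / f_theor)
df = len(f_emp) - 3                                # k - 3 for a fitted normal law
chi2_table = chi2.ppf(0.95, df)                    # alpha = 0.05
print(chi2_calc, chi2_table, chi2_calc < chi2_table)
```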

Kolmogorov's goodness-of-fit criterion is based on the maximum discrepancy between the accumulated empirical and theoretical frequencies:

$$\lambda = \frac{D}{\sqrt{N}} \quad\text{or}\quad \lambda = \frac{d}{\sqrt{N}},$$

where D and d are, respectively, the maximum difference between the cumulative frequencies and between the cumulative relative frequencies of the empirical and theoretical distributions, and N is the number of observations. From the distribution table of Kolmogorov's statistic, the probability P(λ) is determined; it can vary from 0 to 1. At P(λ) = 1 there is complete coincidence of the frequencies, at P(λ) = 0 complete divergence. If the probability P(λ) is significant for the found value of λ, then the discrepancies between the theoretical and empirical distributions can be considered insignificant, i.e., random. The main condition for using the Kolmogorov criterion is a sufficiently large number of observations.

Kolmogorov's goodness-of-fit criterion

Let us consider how the Kolmogorov criterion (λ) is applied when testing the hypothesis that the general population is normally distributed. Fitting the actual distribution to the normal curve consists of several steps:

    Compare the actual and theoretical frequencies.

    From the actual data, determine the theoretical frequencies of the normal distribution curve, which is a function of the normalized deviation.

    Check to what extent the distribution of the trait corresponds to the normal one.

For column IV of the table:

In MS Excel, the normalized deviation (t) is calculated using the STANDARDIZE function (НОРМАЛИЗАЦИЯ in Russian versions). Select a range of free cells equal in number to the variants (rows of the spreadsheet). Without removing the selection, call the STANDARDIZE function. In the dialog box that appears, specify the cells containing, respectively, the observed values (X_i), the mean (X̄), and the standard deviation σ. The operation must be completed as an array formula by pressing Ctrl+Shift+Enter simultaneously.

For column V of the table:

The probability density function of the normal distribution φ(t) is found from the table of values of the local Laplace function for the corresponding value of the normalized deviation (t).

For column VI of the table:

The Kolmogorov goodness-of-fit criterion (λ) is determined by dividing the modulus of the maximum difference between the empirical and theoretical cumulative frequencies by the square root of the number of observations:

$$\lambda = \frac{\max\left|F_{emp} - F_{theor}\right|}{\sqrt{N}}.$$
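A sketch of this λ computation: the maximum absolute difference of cumulative frequencies over √N; the frequency columns are illustrative counts, not the table's actual data.

```python
import numpy as np

f_emp = np.array([5.0, 18.0, 34.0, 25.0, 8.0])     # assumed empirical frequencies
f_theor = np.array([6.1, 17.2, 32.8, 26.0, 9.3])   # assumed theoretical frequencies

D = np.max(np.abs(np.cumsum(f_emp) - np.cumsum(f_theor)))
lam = D / np.sqrt(f_emp.sum())
print(lam)
```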

Using a special probability table for the goodness-of-fit criterion λ, we find that the value λ = 0.59 corresponds to a probability of 0.88. Since this probability is high, the discrepancies between the empirical and theoretical frequencies can be considered random.

Distribution of empirical and theoretical frequencies, probability density of theoretical distribution

When applying goodness-of-fit tests to test whether an observed (empirical) distribution is consistent with a theoretical one, one should distinguish between testing simple and complex hypotheses.

The one-sample Kolmogorov-Smirnov normality test is based on the maximum difference between the cumulative empirical distribution of the sample and the assumed (theoretical) cumulative distribution. If the Kolmogorov-Smirnov statistic D is significant, then the hypothesis that the corresponding distribution is normal must be rejected.