Where does the law of large numbers apply? Laws of large numbers

Distribution function of a random variable and its properties.

The distribution function of a random variable X is the function F(x) expressing, for each x, the probability that the random variable X takes a value less than x: F(x) = P(X < x).

The function F(x) is sometimes called the integral distribution function or the integral law of distribution.

Properties of the distribution function:

1. The distribution function of a random variable is a non-negative function between zero and one:

0 ≤ F(x) ≤ 1.

2. The distribution function of a random variable is a non-decreasing function on the entire numerical axis.

3. At minus infinity the distribution function equals zero, and at plus infinity it equals one, i.e.: F(−∞) = 0, F(+∞) = 1.

4. The probability that a random variable falls into the interval [x1, x2) (including x1) equals the increment of its distribution function on this interval, i.e. P(x1 ≤ X < x2) = F(x2) − F(x1).
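As an illustration (not part of the original text), here is a small Python sketch that checks property 4 numerically; the choice of the standard normal law and of the interval endpoints is arbitrary.

```python
# Hypothetical illustration: P(x1 <= X < x2) = F(x2) - F(x1) for a standard normal X.
import numpy as np
from scipy.stats import norm

x1, x2 = -0.5, 1.2                       # arbitrary interval endpoints
F = norm.cdf                             # distribution function F(x) = P(X < x)

p_interval = F(x2) - F(x1)               # increment of F on [x1, x2)
print("P(x1 <= X < x2) =", p_interval)   # about 0.576

# The same probability estimated from simulated values of X:
rng = np.random.default_rng(0)
sample = rng.standard_normal(100_000)
print("empirical        =", np.mean((sample >= x1) & (sample < x2)))
```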


Markov and Chebyshev inequality

Markov's inequality

Theorem: If a random variable X takes only non-negative values and has a mathematical expectation, then for any positive number A the following inequality is true: P(X > A) ≤ M(X)/A.

Since the events X > A and X ≤ A are opposite, replacing P(X > A) by 1 − P(X ≤ A) we arrive at another form of Markov's inequality: P(X ≤ A) ≥ 1 − M(X)/A.

Markov's inequality applies to any non-negative random variables.

Chebyshev's inequality

Theorem: For any random variable that has a mathematical expectation and variance, the Chebyshev inequality is valid:

P(|X − a| > ε) ≤ D(X)/ε² or P(|X − a| ≤ ε) ≥ 1 − D(X)/ε², where a = M(X), ε > 0.
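Both inequalities are easy to check by simulation. The sketch below is an added illustration, not part of the lecture; the exponential distribution and the thresholds are arbitrary choices.

```python
# Monte Carlo check of Markov's and Chebyshev's inequalities for X ~ Exp(1),
# for which M(X) = 1 and D(X) = 1.
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)

A, eps = 3.0, 2.0                                   # arbitrary positive thresholds
print("P(X > A)        =", np.mean(x > A), "<=", 1.0 / A)       # Markov bound M(X)/A
print("P(|X - a| > eps) =", np.mean(np.abs(x - 1.0) > eps),
      "<=", 1.0 / eps**2)                                        # Chebyshev bound D(X)/eps^2
```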


The law of large numbers “in the form” of Chebyshev’s theorem.

Chebyshev's theorem: If the variances of n independent random variables X1, X2, …, Xn are bounded by the same constant, then as the number n increases without bound the arithmetic mean of the random variables converges in probability to the arithmetic mean of their mathematical expectations a1, a2, …, an, i.e. for any ε > 0, P(|(X1 + X2 + … + Xn)/n − (a1 + a2 + … + an)/n| < ε) → 1 as n → ∞.

The meaning of the law of large numbers is that the average values of random variables tend in probability to their mathematical expectations as n → ∞. The deviation of the average values from the mean of the mathematical expectations becomes arbitrarily small with a probability close to unity if n is large enough. In other words, the probability of any given deviation of the average values from the average of the expectations becomes as small as desired as n grows.
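A short simulation (added here as an illustration; the uniform distribution is an arbitrary choice) shows the arithmetic mean settling near the common mathematical expectation as n grows.

```python
# Arithmetic mean of n independent U(0,1) variables; the expectation of each is 0.5.
import numpy as np

rng = np.random.default_rng(2)
for n in (10, 100, 10_000, 1_000_000):
    mean_n = rng.uniform(0.0, 1.0, size=n).mean()
    print(f"n = {n:>9}: mean = {mean_n:.4f}, deviation from 0.5 = {abs(mean_n - 0.5):.4f}")
```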



30. Bernoulli's theorem.

Bernoulli's theorem: The frequency of an event in n repeated independent trials, in each of which it can occur with the same probability p, converges in probability, as the number n increases without bound, to the probability p of this event in an individual trial: P(|m/n − p| < ε) → 1 as n → ∞, for any ε > 0.

Bernoulli's theorem is a consequence of Chebyshev's theorem, because the frequency of an event can be represented as the arithmetic mean of n independent alternative random variables that have the same distribution law.
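For Bernoulli's theorem specifically, the frequency m/n can be watched directly; the sketch below is an illustration only, with p = 0.3 chosen arbitrarily.

```python
# Frequency of an event with probability p in n independent trials.
import numpy as np

p = 0.3
rng = np.random.default_rng(3)
for n in (100, 10_000, 1_000_000):
    m = rng.binomial(1, p, size=n).sum()      # number of occurrences of the event
    print(f"n = {n:>9}: m/n = {m / n:.4f}, |m/n - p| = {abs(m / n - p):.4f}")
```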

18. Mathematical expectation of discrete and continuous random variables and their properties.

The mathematical expectation of a discrete random variable is the sum of the products of all its values and their corresponding probabilities.

For a discrete random variable: M(X) = Σ xi·pi.

For a continuous random variable: M(X) = ∫ x·f(x) dx, the integral being taken over the whole real axis.

Properties of mathematical expectation:

1. The mathematical expectation of a constant value is equal to the constant itself: M(C) = C.

2. The constant factor can be taken out of the sign of the mathematical expectation, i.e. M(kX)=kM(X).

3. The mathematical expectation of an algebraic sum of a finite number of random variables is equal to the same sum of their mathematical expectations, i.e. M(X±Y)=M(X)±M(Y).

4. The mathematical expectation of the product of a finite number of independent random variables is equal to the product of their mathematical expectations: M(XY)=M(X)*M(Y).

5. If all values ​​of a random variable are increased (decreased) by a constant C, then the mathematical expectation of this random variable will increase (decrease) by the same constant C: M(X±C)=M(X)±C.

6. The mathematical expectation of the deviation of a random variable from its mathematical expectation is zero: M[X − M(X)] = 0.
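The defining formula and several of the properties can be checked on a small discrete distribution; the table of values and probabilities below is invented purely for illustration.

```python
# M(X) = sum(x_i * p_i) for a hypothetical discrete random variable.
import numpy as np

x = np.array([1.0, 2.0, 5.0])
p = np.array([0.2, 0.5, 0.3])
assert np.isclose(p.sum(), 1.0)

M = float(np.sum(x * p))
print("M(X)        =", M)                                                # 2.7
print("M(3X)       =", float(np.sum(3 * x * p)), "= 3*M(X) =", 3 * M)    # property 2
print("M(X + 4)    =", float(np.sum((x + 4) * p)), "= M(X) + 4")         # property 5
print("M(X - M(X)) =", float(np.sum((x - M) * p)))                       # property 6, equals 0
```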

LAW OF LARGE NUMBERS

a general principle, by virtue of which the combination of random factors leads, under certain very general conditions, to a result almost independent of chance. The convergence of the frequency of occurrence of a random event with its probability as the number of trials increases (first noticed, apparently, in gambling) can serve as the first example of the operation of this principle.

At the turn of the 17th and 18th centuries J. Bernoulli proved a theorem stating that, in a sequence of independent trials in each of which a certain event A occurs with the same probability p, the following relation holds:

P(|μn/n − p| ≥ ε) → 0 as n → ∞   (1)

for any ε > 0, where μn is the number of occurrences of the event in the first n trials and μn/n is the frequency of occurrences. This theorem of Bernoulli was extended by S. Poisson to the case of a sequence of independent trials in which the probability of occurrence of the event A may depend on the number of the trial. Let this probability for the kth trial equal pk and let p̄n = (p1 + p2 + … + pn)/n.

Then Poisson's theorem states that

P(|μn/n − p̄n| ≥ ε) → 0 as n → ∞   (2)

for any ε > 0. The first rigorous proof of this theorem was given by P. L. Chebyshev (1846), whose method is completely different from Poisson's and rests on certain extremal considerations; Poisson himself derived (2) from an approximate formula for the indicated probability, based on the use of the Gaussian law and at that time not yet rigorously substantiated. The term "law of large numbers" is first found in Poisson, who applied it to his generalization of Bernoulli's theorem.

A natural further generalization of the theorems of Bernoulli and Poisson arises if we notice that the random variable μn can be represented as a sum

μn = X1 + X2 + … + Xn

of independent random variables, where Xk = 1 if A occurs in the kth trial and Xk = 0 otherwise. The mathematical expectation of μn/n (which coincides with the arithmetic mean of the mathematical expectations of the Xk) equals p in the Bernoulli case and p̄n in the Poisson case. In other words, in both cases one considers the deviation of the arithmetic mean of the Xk from the arithmetic mean of their mathematical expectations.

In the work of P. L. Chebyshev "On Mean Values" (1867) it was established that for independent random variables X1, X2, … the relation

P(|(X1 + … + Xn)/n − (M(X1) + … + M(Xn))/n| ≥ ε) → 0 as n → ∞   (3)

(for any ε > 0) is true under very general assumptions. P. L. Chebyshev assumed that the mathematical expectations of all the Xk² are bounded by the same constant, although it is clear from his proof that the requirement of bounded variances D(Xk) ≤ C is sufficient,

or even the weaker demand

(D(X1) + … + D(Xn))/n² → 0 as n → ∞.

Thus, P. L. Chebyshev showed the possibility of a broad generalization of Bernoulli's theorem. A. A. Markov noted the possibility of further generalizations and proposed applying the name "law of large numbers" to the entire set of generalizations of Bernoulli's theorem [and in particular to (3)]. Chebyshev's method is based on a precise elaboration of the general properties of mathematical expectations and on the use of the so-called Chebyshev inequality [for the probability in (3) it gives the estimate that this probability does not exceed

(D(X1) + … + D(Xn))/(ε²n²);

this bound can be replaced by a more accurate one, of course under more substantial restrictions, see Bernstein's inequality]. Subsequent proofs of various forms of the law of large numbers are, to one degree or another, developments of Chebyshev's method. Applying an appropriate "truncation" of the random variables Xk (replacing them by auxiliary variables that coincide with Xk where |Xk| does not exceed certain constants and vanish elsewhere), A. A. Markov extended the law of large numbers to cases in which the variances of the terms do not exist. For example, he showed that (3) holds if, for certain constants δ > 0 and C and for all k, M|Xk|^(1+δ) ≤ C.

LECTURE 5

Repetition of what has been covered

Part 1 - CHAPTER 9. LAW OF LARGE NUMBERS. LIMIT THEOREMS

When probability is defined statistically, it is interpreted as the number to which the relative frequency of a random event tends. In the axiomatic definition, probability is essentially an additive measure of the set of outcomes favorable to the random event. In the first case we are dealing with an empirical limit, in the second with the theoretical concept of a measure. It is by no means obvious that they refer to the same concept. The relationship between the different definitions of probability is established by Bernoulli's theorem, which is a special case of the law of large numbers.

As the number of trials increases, the binomial law tends to the normal distribution. This is the de Moivre–Laplace theorem, which is a special case of the central limit theorem. The latter states that the distribution function of a sum of independent random variables tends to the normal law as the number of terms increases.

The law of large numbers and the central limit theorem lie at the basis of mathematical statistics.

9.1. Chebyshev's inequality

Let the random variable ξ have finite mathematical expectation M[ξ] and variance D[ξ]. Then for any positive number ε the following inequality holds:

P(|ξ − M[ξ]| ≥ ε) ≤ D[ξ]/ε².

Notes

For the opposite event: P(|ξ − M[ξ]| < ε) ≥ 1 − D[ξ]/ε².
Chebyshev's inequality is valid for any distribution law.
Putting ε = kσ (a multiple of the standard deviation), we get the nontrivial fact: P(|ξ − M[ξ]| ≥ kσ) ≤ 1/k².

9.2. Law of large numbers in Chebyshev form

Theorem. Let the random variables ξ1, ξ2, … be pairwise independent and have finite variances bounded by the same constant C. Then for any ε > 0 we have

P(|(ξ1 + … + ξn)/n − (M[ξ1] + … + M[ξn])/n| < ε) → 1 as n → ∞.

Thus, the law of large numbers asserts the convergence in probability of the arithmetic mean of the random variables (itself a random variable) to the arithmetic mean of their mathematical expectations (a non-random variable).

9.2. Law of large numbers in Chebyshev form: addition

Theorem (Markov): the law of large numbers holds if the variance of the sum of the random variables does not grow too quickly as n grows:

D[ξ1 + … + ξn]/n² → 0 as n → ∞.

10. 9.3. Bernoulli's theorem

Theorem: Consider the Bernoulli scheme. Let μn be the number of occurrences of event A in n independent trials and p the probability of occurrence of event A in a single trial. Then for any ε > 0

P(|μn/n − p| < ε) → 1 as n → ∞,

i.e. the probability that the deviation of the relative frequency of the random event from its probability p is arbitrarily small in absolute value tends to unity as the number of trials n increases.

11.

Proof: The random variable μn is distributed according to the binomial law, therefore M[μn] = np and D[μn] = npq, so that M[μn/n] = p and D[μn/n] = pq/n → 0; the statement now follows from Chebyshev's inequality applied to μn/n.

12. 9.4. Characteristic functions

The characteristic function of a random variable ξ is the function

φ(t) = M[exp(itξ)], where exp(x) = e^x.

Thus, φ(t) represents the mathematical expectation of a certain complex random variable associated with ξ. In particular, if ξ is a discrete random variable given by the distribution series (xi, pi), where i = 1, 2, …, n, then

φ(t) = Σ pi·exp(itxi).

13.

For a continuous random variable ξ with probability distribution density f(x),

φ(t) = ∫ exp(itx)·f(x) dx, the integral being taken over the whole real axis.
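A minimal sketch (added for illustration) computing the characteristic function of a discrete distribution series (xi, pi); the specific series is invented.

```python
# phi(t) = sum_i p_i * exp(i*t*x_i) for a hypothetical discrete random variable.
import numpy as np

x = np.array([0.0, 1.0, 3.0])
p = np.array([0.5, 0.3, 0.2])

def phi(t):
    return np.sum(p * np.exp(1j * t * x))

print(phi(0.0))                        # always 1
print(phi(0.7))                        # a complex number of modulus <= 1

# The derivative at 0 recovers i*M(X); numerically:
h = 1e-6
print((phi(h) - phi(-h)) / (2 * h))    # approximately i * 0.9 = i * M(X)
```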

14.

15. 9.5. Central limit theorem (Lyapunov's theorem)

16.

Repetition of what has been covered

17. FUNDAMENTALS OF PROBABILITY THEORY AND MATHEMATICAL STATISTICS

PART II. MATHEMATICAL
STATISTICS

18. Epigraph

“There are three kinds of lies: lies, damned lies, and statistics.”
Benjamin Disraeli

19. Introduction

Two main problems of mathematical
statistics:
collection and grouping of statistical
data;
development of analysis methods
received data depending on
research purposes.

20. Methods of statistical data analysis:

assessment of the unknown probability of an event;
unknown function estimate
distribution;
estimation of parameters of known
distribution;
testing statistical hypotheses about a species
unknown distribution or
values ​​of the parameters of the known
distributions.

21. CHAPTER 1. BASIC CONCEPTS OF MATHEMATICAL STATISTICS

22. 1.1. Population and sample

The general population is the entire set of objects under study; a sample is a set of objects randomly selected from the general population for study.
The population size and the sample size, i.e. the numbers of objects in the general population and in the sample, will be denoted by N and n respectively.

23.

A sample is called repeated if each selected object is returned to the general population before the next one is chosen, and non-repeated if a selected object is not returned to the general population.

24. Representative sample:

A sample is representative if it correctly reflects the features of the general population. By the law of large numbers it can be asserted that this condition is satisfied if:
1) the sample size n is sufficiently large;
2) each object of the sample is chosen randomly;
3) for each object the probability of getting into the sample is the same.

25.

Population and sample
may be one-dimensional
(single factor)
and multidimensional (multifactorial)

26. 1.2. Sample distribution law (statistical series)

Suppose that in a sample of size n the random variable ξ of interest to us (some parameter of the objects of the population) takes the value x1 n1 times, the value x2 n2 times, …, and the value xk nk times. Then the observed values x1, x2, …, xk of the random variable ξ are called variants, and n1, n2, …, nk their frequencies.

27.

The difference xmax − xmin is the range of the sample, and the ratio ωi = ni/n is the relative frequency of the variant xi. Obviously, n1 + n2 + … + nk = n and ω1 + ω2 + … + ωk = 1.

28.

If we write the variants in ascending order, we obtain a variation series. A table consisting of these ordered variants and their frequencies (and/or relative frequencies) is called a statistical series, or the sample distribution law.
It is the analogue of the distribution law of a discrete random variable in probability theory.

29.

If the variation series consists of a very large number of values, or a continuous characteristic is being studied, a grouped sample is used. To obtain it, the interval containing all the observed values of the characteristic is divided into several, usually equal, parts (subintervals) of length h. When compiling the statistical series, the midpoints of the subintervals are usually chosen as the xi, and ni is set equal to the number of variants falling into the i-th subinterval.
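The construction of a grouped sample can be sketched in a few lines of Python (an added illustration; the generated data and the number of subintervals are arbitrary).

```python
# Grouped sample: split [min, max] into equal subintervals of length h,
# take the midpoints as x_i and the counts as n_i.
import numpy as np

rng = np.random.default_rng(4)
data = rng.normal(loc=170.0, scale=6.0, size=500)   # hypothetical continuous characteristic

s = 8                                               # number of subintervals
counts, edges = np.histogram(data, bins=s)          # n_i and subinterval boundaries
midpoints = (edges[:-1] + edges[1:]) / 2            # x_i
h = edges[1] - edges[0]

for xi, ni in zip(midpoints, counts):
    print(f"x_i = {xi:7.2f}   n_i = {ni:3d}")
print("h =", round(h, 3), "  total =", counts.sum())
```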

30.

[Figure: histogram of a grouped sample; the frequencies n1, n2, n3, …, ns are plotted over the subinterval midpoints a + h/2, a + 3h/2, …, b − h/2 of the interval [a, b].]

31. 1.3. Frequency polygon, sample distribution function

Let us plot the values xi of the random variable along the abscissa axis and the values ni along the ordinate axis. The broken line whose segments connect the points with coordinates (x1, n1), (x2, n2), …, (xk, nk) is called the frequency polygon. If, instead of the absolute values ni, the relative frequencies ωi are plotted on the ordinate axis, we obtain the polygon of relative frequencies.

32.

By analogy with the distribution function of a discrete random variable, the sample distribution law can be used to construct the sample (empirical) distribution function

F*(x) = Σ ωi,

where the summation is taken over all frequencies whose corresponding variants are smaller than x. Note that the empirical distribution function depends on the sample size n.
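An empirical distribution function is easy to build directly from its definition; the sketch below is an illustration with a small invented sample.

```python
# F*(x) = (number of sample values smaller than x) / n
import numpy as np

sample = np.array([2, 5, 2, 11, 5, 6, 3, 13, 5], dtype=float)
n = sample.size

def F_star(x):
    return np.sum(sample < x) / n

for x in (2, 4, 6, 14):
    print(f"F*({x}) = {F_star(x):.3f}")
# F*(x) is 0 to the left of the smallest variant and 1 to the right of the largest.
```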

33.

Unlike the empirical function F*(x), found for the random variable ξ experimentally by processing statistical data, the true distribution function F(x), which refers to the general population, is called theoretical. (Usually the general population is so large that it is impossible to process it in full, i.e. it can only be studied theoretically.)

34.

Notice, that:

35. 1.4. Properties of the empirical distribution function

Stepped
view

36.

Another graphical representation of the sample of interest to us is the histogram: a step figure consisting of rectangles whose bases are the subintervals of width h and whose heights are segments of length ni/h (frequency histogram) or ωi/h (histogram of relative frequencies).
In the first case the area of the histogram equals the sample size n, in the second it equals one.

37. Example

38. CHAPTER 2. NUMERICAL CHARACTERISTICS OF THE SAMPLE

39.

The problem of mathematical statistics is
get from the available sample
information about general
totality. Numerical characteristics of a representative sample - assessment of the corresponding characteristics
random variable under study,
related to the general
as a whole.

40. 2.1. Sample mean and sample variance, empirical moments

The sample mean is the arithmetic mean of the values of the variants in the sample:

x̄* = (1/n)·(n1x1 + n2x2 + … + nkxk).

The sample mean is used for the statistical estimation of the mathematical expectation of the random variable under study.

41.

The sample variance is the quantity

D* = (1/n)·Σ ni(xi − x̄*)².

The sample mean square deviation is σ* = √D*.

42.

It is easy to show that the following relation, convenient for computing the variance, holds:

D* = (1/n)·Σ ni·xi² − (x̄*)²,

i.e. the sample variance equals the mean of the squares minus the square of the mean.
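A small numeric check of the shortcut formula (an added illustration; the sample is the one used below for the mode and median).

```python
# Sample mean and sample variance via the definition and via the shortcut
# D* = mean of squares - square of the mean.
import numpy as np

x = np.array([2, 5, 2, 11, 5, 6, 3, 13, 5], dtype=float)

mean = x.mean()                          # sample mean
D_def = np.mean((x - mean) ** 2)         # definition of the sample variance
D_short = np.mean(x ** 2) - mean ** 2    # shortcut formula
sigma = np.sqrt(D_def)                   # sample mean square deviation

print(mean, D_def, D_short, sigma)       # the two variance values coincide
```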

43.

Other characteristics of a variation series are the mode M0, the variant having the highest frequency, and the median me, the variant that divides the variation series into two parts with equal numbers of variants.
2, 5, 2, 11, 5, 6, 3, 13, 5 (mode = 5)
2, 2, 3, 5, 5, 5, 6, 11, 13 (median = 5)

44.

By analogy with the corresponding theoretical expressions, empirical moments can be constructed; they are used for the statistical estimation of the initial and central moments of the random variable under study.

45.

By analogy with the moments of probability theory, the initial empirical moment of order m is the quantity

α*m = (1/n)·Σ ni·xi^m,

and the central empirical moment of order m is

μ*m = (1/n)·Σ ni·(xi − x̄*)^m.

46. 2.2. Properties of statistical estimates of distribution parameters: unbiasedness, efficiency, consistency

After obtaining statistical estimates of the parameters of the distribution of the random variable ξ (the sample mean, the sample variance, etc.), one must make sure that they are a good approximation to the corresponding parameters of the theoretical distribution of ξ. Let us find the conditions that must be satisfied for this.

47.

48.

A statistical estimate A* is called unbiased if its mathematical expectation equals the estimated parameter A of the general population for any sample size, i.e. M[A*] = A.
If this condition is not satisfied, the estimate is called biased.
Unbiasedness is not a sufficient condition for a good approximation of the statistical estimate A* to the true (theoretical) value of the estimated parameter A.

49.

The scatter of the individual values of A* about the mean value M[A*] depends on the size of the variance D[A*]. If the variance is large, then a value found from the data of a single sample may differ significantly from the parameter being estimated. Therefore, for a reliable estimate the variance D[A*] should be small. A statistical estimate is called efficient if, for a given sample size n, it has the smallest possible variance.

50.

One more requirement is imposed on statistical estimates: consistency. An estimate is called consistent if, as n → ∞, it tends in probability to the parameter being estimated. Note that an unbiased estimate is consistent if its variance tends to 0 as n → ∞.
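The difference between a biased and an unbiased estimate is easy to see in a simulation; the sketch below (an added illustration, with an arbitrary normal population) averages the sample variance and the corrected sample variance over many samples.

```python
# Bias of the sample variance D* versus the corrected variance s^2 = n/(n-1) * D*.
import numpy as np

rng = np.random.default_rng(5)
true_var = 4.0
n, trials = 5, 200_000

samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))
D_star = samples.var(axis=1, ddof=0)     # sample variance (biased)
s2 = samples.var(axis=1, ddof=1)         # corrected sample variance (unbiased)

print("mean of D*  =", D_star.mean())    # close to (n-1)/n * 4 = 3.2
print("mean of s^2 =", s2.mean())        # close to 4
```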

51. 2.3. Properties of a sample mean

We shall assume that the variants x1, x2, …, xn are values of independent identically distributed random variables ξ1, ξ2, …, ξn having mathematical expectation a and variance σ². Then the sample mean can be treated as a random variable

X̄ = (ξ1 + ξ2 + … + ξn)/n.

52.

Unbiasedness. From the properties of the mathematical expectation it follows that

M[X̄] = (M[ξ1] + … + M[ξn])/n = a,

i.e. the sample mean is an unbiased estimate of the mathematical expectation of the random variable. One can also show the efficiency of the sample-mean estimate of the mathematical expectation (for the normal distribution).

53.

Consistency. Let a be the parameter being estimated, namely the mathematical expectation of the general population, and σ² the variance of the general population. Consider Chebyshev's inequality

P(|X̄ − a| ≥ ε) ≤ D[X̄]/ε².

We have D[X̄] = D[ξ1 + … + ξn]/n² = σ²/n, so

P(|X̄ − a| ≥ ε) ≤ σ²/(nε²).

As n → ∞ the right-hand side of the inequality tends to zero for any ε > 0, and therefore the value X̄ representing the sample estimate tends in probability to the estimated parameter a.

54.

Thus, we may conclude that the sample mean is an unbiased, efficient (at least for the normal distribution) and consistent estimate of the mathematical expectation of the random variable associated with the general population.

55.

56.

LECTURE 6

57. 2.4. Properties of sample variance

Let us examine the unbiasedness of the sample variance D* as
estimates of the variance of a random variable

58.

59.

60. Example

Find the sample mean, sample
variance and mean square
deviation, mode and corrected sample
variance for a sample having the following
distribution law:
Solution:

61.

62. CHAPTER 3. POINT ESTIMATION OF PARAMETERS OF A KNOWN DISTRIBUTION

63.

We shall assume that the general form of the distribution law is known to us and that it remains to specify the details: the parameters defining its actual form. There are several methods for solving this problem, two of which we shall consider: the method of moments and the method of maximum likelihood.

64. 3.1. Method of moments

65.

The method of moments, developed by Karl Pearson in 1894, is based on the use of approximate equalities: the theoretical moments are calculated from the known distribution law with parameters θ, and the sample moments are calculated from the available sample. The unknown parameters are determined by solving a system of r equations that equate the corresponding theoretical and empirical moments, for example α1(θ) = α*1, …, αr(θ) = α*r.

66.

It can be shown that the parameter estimates θ obtained by the method of moments are consistent, their mathematical expectations differ from the true values of the parameters by quantities of the order of n^(-1), and their mean square deviations are quantities of the order of n^(-0.5).

67. Example

It is known that the characteristic ξ of the objects of the general population, being a random variable, has a uniform distribution depending on the parameters a and b. It is required to determine the parameters a and b by the method of moments from the known sample mean x̄* and sample variance D*.

68. Reminder

α1 = M[ξ] = (a + b)/2 is the mathematical expectation and μ2 = D[ξ] = (b − a)²/12 is the variance of the uniform distribution.

69.

Solving the system x̄* = (a + b)/2, D* = (b − a)²/12 for a and b, we obtain

a = x̄* − √(3D*), b = x̄* + √(3D*).   (*)
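A sketch of the same method-of-moments computation in Python (an illustration; the data are generated from a uniform law whose parameters the code then tries to recover).

```python
# Method of moments for the uniform distribution on [a, b]:
#   mean = (a + b)/2,  variance = (b - a)^2 / 12
#   =>  a = mean - sqrt(3*D*),  b = mean + sqrt(3*D*)
import numpy as np

rng = np.random.default_rng(6)
data = rng.uniform(2.0, 7.0, size=10_000)     # hypothetical sample, true a = 2, b = 7

x_bar = data.mean()
D_star = data.var(ddof=0)                     # sample variance

a_hat = x_bar - np.sqrt(3 * D_star)
b_hat = x_bar + np.sqrt(3 * D_star)
print("a* =", round(a_hat, 3), "  b* =", round(b_hat, 3))
```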

70.

71. 3.2. Maximum likelihood method

The method is based on the likelihood function L(x1, x2, …, xn, θ), which is the distribution law of the vector (ξ1, ξ2, …, ξn), where the random variables ξi take the values of the sample variants, i.e. have the same distribution. Since the random variables are independent, the likelihood function has the form

L(x1, x2, …, xn, θ) = f(x1, θ)·f(x2, θ)·…·f(xn, θ),

where f is the distribution density (or, for a discrete variable, the probability) corresponding to the parameter value θ.

72.

The idea of the maximum likelihood method is that we look for those values of the parameters θ for which the probability of the sample values x1, x2, …, xn appearing is greatest. In other words, as the estimate of the parameters θ one takes the vector θ* at which the likelihood function has a local maximum for the given x1, x2, …, xn:

L(x1, x2, …, xn, θ*) = max L(x1, x2, …, xn, θ).

73.

Estimates by the maximum likelihood method are obtained from the necessary condition for an extremum of the function L(x1, x2, …, xn, θ) at the point θ*:

∂L/∂θ = 0.

74. Notes:

1. When seeking the maximum of the likelihood function, the calculations can be simplified by operations that do not change the result: first, use instead of L(x1, x2, …, xn, θ) the log-likelihood function l(x1, x2, …, xn, θ) = ln L(x1, x2, …, xn, θ); second, discard in the expression for the likelihood function the terms independent of θ (for l) or the positive factors independent of θ (for L).
2. The parameter estimates we have considered may be called point estimates, since for the unknown parameter θ a single point θ* is determined, which is its approximate value. However, this approach can lead to gross errors, and a point estimate may differ significantly from the true value of the estimated parameter (especially in the case of a small sample size).

75. Example

Solution. In this problem (a sample from a normal distribution) it is necessary to estimate two unknown parameters: a and σ². The log-likelihood function has the form

l = −(n/2)·ln(2πσ²) − (1/(2σ²))·Σ (xi − a)².

76.

Discarding in this formula the term that does not depend on a and σ², we set up the system of likelihood equations ∂l/∂a = 0, ∂l/∂σ² = 0. Solving, we obtain:

a* = (1/n)·Σ xi = x̄*, (σ²)* = (1/n)·Σ (xi − x̄*)².
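These closed-form estimates can be checked numerically; the sketch below (illustration only, with generated data) maximizes the log-likelihood directly and compares the result with the formulas above.

```python
# MLE for a normal sample: a* = sample mean, (sigma^2)* = (1/n) * sum (x_i - a*)^2.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
x = rng.normal(10.0, 3.0, size=2_000)          # hypothetical sample
n = x.size

def neg_log_likelihood(theta):
    a, log_sigma2 = theta
    sigma2 = np.exp(np.clip(log_sigma2, -20, 20))   # keep the variance positive and finite
    return 0.5 * n * np.log(2 * np.pi * sigma2) + np.sum((x - a) ** 2) / (2 * sigma2)

res = minimize(neg_log_likelihood, x0=[0.0, 0.0])
a_mle, sigma2_mle = res.x[0], np.exp(res.x[1])

print("numerical MLE:", a_mle, sigma2_mle)
print("closed form  :", x.mean(), x.var(ddof=0))   # the two agree
```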

77. CHAPTER 4. INTERVAL ESTIMATION OF PARAMETERS OF A KNOWN DISTRIBUTION

78.









79.

80. 4.1. Estimation of the mathematical expectation of a normally distributed quantity with a known variance







81.

82.

83. 4.2. Estimation of the mathematical expectation of a normally distributed quantity with unknown variance

84.





85.

86. Student distribution density with n – 1 degrees of freedom

87.

88.

89.








90. 4.3. Estimating the standard deviation of a normally distributed quantity






91. 4.3.1. A special case of the well-known mathematical expectation







92.







93.




94.

95.

96.

97. 4.3.2. A special case of unknown mathematical expectation









98.

99. 4.4. Estimating the mathematical expectation of a random variable for a random sample











100.





101.







102.

103.

Lecture 7

104.

Repetition of what has been covered

105. CHAPTER 4. INTERVAL ESTIMATION OF PARAMETERS OF A KNOWN DISTRIBUTION

106.

The problem of estimating a parameter of a known distribution can be solved by constructing an interval into which the true value of the parameter falls with a given probability. This method of estimation is called interval estimation.
Usually, to estimate a parameter θ, one constructs the inequality

|θ* − θ| < δ,   (*)

where the number δ characterizes the accuracy of the estimate: the smaller δ, the better the estimate.

107.

The probability γ = P(|θ* − θ| < δ) that the interval (θ* − δ, θ* + δ) covers the parameter θ is called the reliability (confidence probability), and the interval itself is called the confidence interval.   (*)

108. 4.1. Estimation of the mathematical expectation of a normally distributed quantity with a known variance

Let the random variable ξ under study be distributed according to the normal law with a known standard deviation σ and an unknown mathematical expectation a. It is required to estimate the mathematical expectation of ξ from the sample mean.
As before, we shall treat the resulting sample mean X̄ as the value of a random variable, and the values of the sample variants x1, x2, …, xn as the values of identically distributed independent random variables ξ1, ξ2, …, ξn, each of which has mathematical expectation a and standard deviation σ.

109.

We have:

M[X̄] = a,   (1)

D[X̄] = σ²/n, so that X̄ is normally distributed with mean a and standard deviation σ/√n.   (2)

110.

From (1) and (2), P(|X̄ − a| < δ) = 2Φ(δ√n/σ), where Φ is the Laplace function. Setting this probability equal to the required reliability γ and denoting by t the root of the equation 2Φ(t) = γ, we obtain the confidence interval (*):

x̄ − tσ/√n < a < x̄ + tσ/√n.   (*)
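A sketch of this interval in Python (illustration only; the data, σ and γ are arbitrary). The quantity t with 2Φ(t) = γ is expressed here through the standard normal quantile of order (1 + γ)/2.

```python
# Confidence interval for the mean of a normal sample with known sigma.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
sigma, gamma, n = 2.0, 0.95, 50
x = rng.normal(5.0, sigma, size=n)             # hypothetical sample, true a = 5

x_bar = x.mean()
z = norm.ppf((1 + gamma) / 2)                  # quantile such that 2*Phi(z) = gamma
delta = z * sigma / np.sqrt(n)

print(f"{gamma:.0%} interval for a: ({x_bar - delta:.3f}, {x_bar + delta:.3f})")
```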

111. 4.2. Estimation of the mathematical expectation of a normally distributed quantity with unknown variance

112.

It is known that the random variable tn = (X̄ − a)√n/s, where s is the corrected sample mean square deviation, has the Student distribution with k = n − 1 degrees of freedom. The probability distribution density of such a quantity is

f(t, k) = C(k)·(1 + t²/k)^(−(k+1)/2),

where C(k) is a normalizing constant depending only on k.

113.

114. Student distribution density with n – 1 degrees of freedom

115.

116.

117.

Note. For a large number of degrees of freedom k the Student distribution tends to the normal distribution with zero mathematical expectation and unit variance. Therefore, for k ≥ 30 the confidence interval can in practice be found by the formulas of the normal case (Section 4.1), with s used in place of σ.
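The corresponding computation with unknown variance uses the Student quantile; a short sketch (illustration only, arbitrary data).

```python
# Confidence interval for the mean with unknown variance (Student distribution).
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(9)
gamma, n = 0.95, 12
x = rng.normal(5.0, 2.0, size=n)               # hypothetical small sample

x_bar = x.mean()
s = x.std(ddof=1)                              # corrected mean square deviation
t_q = t.ppf((1 + gamma) / 2, df=n - 1)         # quantile of Student with n-1 d.o.f.
delta = t_q * s / np.sqrt(n)

print(f"{gamma:.0%} interval for a: ({x_bar - delta:.3f}, {x_bar + delta:.3f})")
```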

118. 4.3. Estimating the standard deviation of a normally distributed quantity

Let the random variable ξ under study be normally distributed with mathematical expectation a and unknown mean square deviation σ.
We shall consider two cases: with a known and with an unknown mathematical expectation.

119. 4.3.1. A special case of the well-known mathematical expectation

Let the value M[ξ] = a be known, and let it be required to estimate only σ, or the variance D[ξ] = σ². Recall that for a known mathematical expectation the unbiased estimate of the variance is the sample variance D* = (σ*)².
Using the quantities ξ1, ξ2, …, ξn defined above, we introduce a random variable Y taking the values of the sample variance D*:

Y = (1/n)·Σ (ξi − a)².

120.

Consider the random variable

Hn = nY/σ² = Σ ((ξi − a)/σ)².

The terms under the summation sign are squares of random variables (ξi − a)/σ having the normal distribution with density fN(x, 0, 1). Then Hn has the χ² distribution with n degrees of freedom, being the sum of the squares of n independent standard (a = 0, σ = 1) normal random variables.

121.

Let us determine the confidence interval from the condition

P(H1 < Hn < H2) = ∫ from H1 to H2 of f(x) dx = γ,

where f(x) is the distribution density of χ² and γ is the reliability (confidence probability). The quantity γ is numerically equal to the area of the shaded figure in the figure.

122.

123.

124.

125. 4.3.2. A special case of unknown mathematical expectation

In practice, the most common situation is that in which both parameters of the normal distribution are unknown: the mathematical expectation a and the standard deviation σ.
In this case the construction of the confidence interval is based on Fisher's theorem, from which it follows that the random variable

(n − 1)s²/σ²

(where the random variable s² takes the values of the unbiased corrected sample variance) has the χ² distribution with n − 1 degrees of freedom.
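A sketch of the resulting χ²-based interval for σ in the case of an unknown mathematical expectation (illustration only, with arbitrary data and reliability).

```python
# Confidence interval for sigma using (n-1)*s^2 / sigma^2 ~ chi^2 with n-1 d.o.f.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(10)
gamma, n = 0.95, 30
x = rng.normal(0.0, 3.0, size=n)               # hypothetical sample, true sigma = 3

s2 = x.var(ddof=1)                             # corrected sample variance
q_low = chi2.ppf((1 - gamma) / 2, df=n - 1)
q_high = chi2.ppf((1 + gamma) / 2, df=n - 1)

sigma_low = np.sqrt((n - 1) * s2 / q_high)
sigma_high = np.sqrt((n - 1) * s2 / q_low)
print(f"{gamma:.0%} interval for sigma: ({sigma_low:.3f}, {sigma_high:.3f})")
```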

126.

127. 4.4. Estimating the mathematical expectation of a random variable for a random sample

Interval estimates of the mathematical expectation M[ξ] obtained for a normally distributed random variable ξ are, generally speaking, unsuitable for random variables having a different form of distribution. However, there is a situation in which similar interval relations can be used for arbitrary random variables: this occurs for a large sample size (n >> 1).

128.

As above, we shall consider the variants x1, x2, …, xn as values of independent, identically distributed random variables ξ1, ξ2, …, ξn having mathematical expectation M[ξi] = mξ and variance D[ξi] = σξ², and the resulting sample mean X̄ as a value of the random variable

X̄ = (ξ1 + ξ2 + … + ξn)/n.

According to the central limit theorem, the quantity X̄ has an asymptotically normal distribution law with mathematical expectation mξ and variance σξ²/n.

129.

Therefore, if the value of the variance of the random variable ξ is known, then we can use the approximate formulas of Section 4.1. If the value of the variance of ξ is unknown, then for large n the same formulas can be used with s in place of σ, where s is the corrected mean square deviation.

130.

Repetition of what has been covered

131. CHAPTER 5. TESTING STATISTICAL HYPOTHESES

132.

A statistical hypothesis is a hypothesis about the form of an unknown distribution or about the parameters of a known distribution of a random variable.
The hypothesis being tested, usually denoted H0, is called the null or main hypothesis. The additionally used hypothesis H1, which contradicts hypothesis H0, is called the competing or alternative hypothesis.
A statistical test of the advanced null hypothesis H0 consists in comparing it with the sample data. In such a test two types of errors may occur:
a) errors of the first kind, when the correct hypothesis H0 is rejected;
b) errors of the second kind, when the incorrect hypothesis H0 is accepted.

133.

The probability of an error of the first kind is called the significance level and is denoted by α.
The main technique for testing a statistical hypothesis is that the value of a statistical criterion is computed from the available sample: some random variable T having a known distribution law. The range of values of T for which the main hypothesis H0 should be rejected is called the critical region, and the range of values of T for which this hypothesis can be accepted is called the region of acceptance of the hypothesis.

134.

135. 5.1. Testing hypotheses about the parameters of a known distribution

5.1.1. Testing the hypothesis about the mathematical expectation of a normally distributed random variable
Let the random variable ξ have a normal distribution. We need to test the assumption that its mathematical expectation equals some number a0. Let us consider separately the cases when the variance of ξ is known and when it is unknown.

136.

In the case of a known variance D[ξ] = σ², as in Section 4.1, we define a random variable X̄ taking the values of the sample mean. The hypothesis H0 is initially formulated as M[ξ] = a0. Since the sample mean X̄ is an unbiased estimate of M[ξ], the hypothesis H0 can be represented as M[X̄] = a0.

137.

Taking into account the unbiasedness of the corrected sample variance, the null hypothesis can be written as M[s²] = σ0², where the random variable s² takes the values of the corrected sample variance of ξ and is analogous to the random variable Z considered in Section 4.2.
As the statistical criterion we choose a random variable F taking the value of the ratio of the greater corrected sample variance to the smaller one.

145.

The random variable F has the Fisher–Snedecor distribution with numbers of degrees of freedom k1 = n1 − 1 and k2 = n2 − 1, where n1 is the size of the sample from which the larger corrected variance was computed, and n2 is the size of the second sample, for which the smaller variance was found.
Let us consider two types of competing hypotheses.

146.

147.

148. 5.1.3. Comparison of mathematical expectations of independent random variables

Let us first consider the case of normal distributions of random variables with known variances, and then, on its basis, the more general case of an arbitrary distribution for sufficiently large independent samples.
Let the random variables ξ1 and ξ2 be independent and normally distributed, and let their variances D[ξ1] and D[ξ2] be known. (For example, they may be found from some other experiment or computed theoretically.) Samples of sizes n1 and n2 are drawn, respectively. Let x̄1 and x̄2 be the sample means for these samples. It is required, from the sample means, at a given significance level α, to test the hypothesis of equality of the mathematical expectations of the random variables under consideration.
Hypotheses about the parameters of distributions are made from a priori considerations, based on the conditions of the experiment, and the assumptions about the parameters are then tested as shown previously. However, the need often arises to test an advanced hypothesis about the distribution law itself. Statistical tests intended for such checks are usually called goodness-of-fit criteria.

154.

Several goodness-of-fit criteria are known. The advantage of the Pearson criterion is its universality: it can be used to test hypotheses about various distribution laws.
The Pearson criterion is based on comparing the frequencies found from the sample (empirical frequencies) with the frequencies computed using the distribution law being tested (theoretical frequencies).
Usually the empirical and theoretical frequencies differ. It is necessary to find out whether the discrepancy between the frequencies is accidental, or whether it is significant and explained by the fact that the theoretical frequencies were computed on the basis of an incorrect hypothesis about the distribution of the general population.
The Pearson criterion, like any other, answers the question of whether the proposed hypothesis agrees with the empirical data at the given significance level.
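A minimal sketch of the Pearson comparison of empirical and theoretical frequencies (an added illustration; the die data are simulated and the hypothesis tested is that all faces are equally likely).

```python
# Pearson chi-square comparison of empirical and theoretical frequencies
# for a hypothetical die, testing H0: all six faces are equally likely.
import numpy as np
from scipy.stats import chisquare

rng = np.random.default_rng(11)
rolls = rng.integers(1, 7, size=600)                 # simulated fair die
observed = np.bincount(rolls, minlength=7)[1:]       # empirical frequencies of faces 1..6
expected = np.full(6, rolls.size / 6)                # theoretical frequencies under H0

stat, p_value = chisquare(observed, expected)
print("chi^2 =", round(stat, 2), "  p-value =", round(p_value, 3))
# H0 is rejected when the p-value is below the chosen significance level alpha.
```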

155. 5.2.1. Testing the hypothesis of normal distribution

Let there be a random variable ξ and a sample of sufficiently large size n with a large number of different values of the variants. It is required, at significance level α, to test the null hypothesis H0 that the random variable ξ is normally distributed.
For convenience of processing the sample, take two numbers α and β, and divide the interval [α, β] into s subintervals. We shall assume that the values of the variants falling into each subinterval are approximately equal to the number specifying the middle of the subinterval. Counting the number of variants falling into each subinterval, we pass to a grouped sample.

The quantile of order α (0 < α < 1) of a continuous random variable ξ is the number xα for which the equality P(ξ < xα) = α holds.
The quantile x1/2 is called the median of the random variable ξ, the quantiles x1/4 and x3/4 are its quartiles, and x0.1, x0.2, …, x0.9 are its deciles.
For the standard normal distribution (a = 0, σ = 1) the distribution function is FN(x, 0, 1) = 0.5 + Φ(x), where FN(x, a, σ) is the distribution function of a normally distributed random variable and Φ(x) is the Laplace function. Therefore the quantile xα of the standard normal distribution for a given α can be found from the relation Φ(xα) = α − 0.5.

162. 6.2. Student distribution

If ξ0, ξ1, …, ξn are independent random variables having the normal distribution with zero mathematical expectation and unit variance, then the distribution of the random variable

t = ξ0 / √((ξ1² + ξ2² + … + ξn²)/n)

is called the Student distribution with n degrees of freedom (W. S. Gosset).

Law of Large Numbers

The practice of studying random phenomena shows that although the results of individual observations, even those carried out under the same conditions, may differ greatly, at the same time, the average results for a sufficiently large number of observations are stable and weakly depend on the results of individual observations. The theoretical basis for this remarkable property of random phenomena is the law of large numbers. The general meaning of the law of large numbers is that the combined action of a large number of random factors leads to a result that is almost independent of chance.

Central limit theorem

Lyapunov's theorem explains the wide occurrence of the normal distribution law and the mechanism of its formation. The theorem allows us to state that whenever a random variable is formed as the sum of a large number of independent random variables whose variances are small compared with the variance of the sum, the distribution law of this random variable turns out to be practically the normal law. And since random variables are always generated by an infinite number of causes, and most often none of them has a variance comparable to the variance of the random variable itself, most random variables encountered in practice obey the normal distribution law.

Let us dwell in more detail on the content of the theorems of each of these groups

In practical research, it is very important to know in what cases it is possible to guarantee that the probability of an event will be either sufficiently small or as close to one as desired.

By the law of large numbers is understood the set of propositions stating that, with a probability as close to one (or to zero) as desired, an event will occur that depends on a very large, indefinitely increasing number of random events, each of which has only a small influence on it.

More precisely, the law of large numbers is understood as the set of propositions stating that, with a probability as close to unity as desired, the deviation of the arithmetic mean of a sufficiently large number of random variables from a constant value, the arithmetic mean of their mathematical expectations, will not exceed a given arbitrarily small number.

Individual, isolated phenomena that we observe in nature and in social life often appear as random (for example, a registered death, the gender of a child born, air temperature, etc.) due to the fact that such phenomena are influenced by many factors not related to the essence of the emergence or development of a phenomenon. It is impossible to predict their total effect on an observed phenomenon, and they manifest themselves differently in individual phenomena. Based on the results of one phenomenon, nothing can be said about the patterns inherent in many such phenomena.

However, it has long been noted that the arithmetic average of the numerical characteristics of some signs (relative frequencies of occurrence of an event, measurement results, etc.) with a large number of repetitions of the experiment is subject to very slight fluctuations. In the average, a pattern inherent in the essence of phenomena appears to be manifested; in it, the influence of individual factors that made the results of single observations random is cancelled. Theoretically, this behavior of the average can be explained using the law of large numbers. If some very general conditions regarding random variables are met, then the stability of the arithmetic mean will be an almost certain event. These conditions constitute the most important content of the law of large numbers.

The first example of the operation of this principle can be the convergence of the frequency of occurrence of a random event to its probability as the number of trials increases, the fact established in Bernoulli's theorem (the Swiss mathematician Jacob Bernoulli, 1654–1705). Bernoulli's theorem is one of the simplest forms of the law of large numbers and is often used in practice. For example, the frequency of occurrence of some quality of a respondent in a sample is taken as an estimate of the corresponding probability.

The outstanding French mathematician Siméon Denis Poisson (1781–1840) generalized this theorem and extended it to the case when the probability of the event in a trial varies independently of the results of the previous trials. He was also the first to use the term "law of large numbers".

The great Russian mathematician Pafnuty Lvovich Chebyshev (1821–1894) proved that the law of large numbers operates in phenomena with any variation and also extends to the law of averages.

A further generalization of the theorems of the law of large numbers is associated with the names of A. A. Markov, S. N. Bernstein, A. Ya. Khinchin and A. N. Kolmogorov.

The general modern formulation of the problem, the formulation of the law of large numbers, the development of ideas and methods for proving theorems related to this law belong to Russian scientists P. L. Chebyshev, A. A. Markov and A. M. Lyapunov.

CHEBYSHEV'S INEQUALITY

Let us first consider the auxiliary theorems: Chebyshev's lemma and inequality, with the help of which the law of large numbers in Chebyshev form can be easily proven.

Lemma (Chebyshev).

If among the values of a random variable X there are no negative ones, then the probability that it will take a value exceeding the positive number A is no more than a fraction whose numerator is the mathematical expectation of the random variable and whose denominator is the number A:

P(X > A) ≤ M(X)/A.

Proof. Let the distribution law of the random variable X be known: P(X = xi) = pi (i = 1, 2, …, n), the values of the random variable being listed in ascending order.

With respect to the number A, the values of the random variable split into two groups: some do not exceed A, and the others are greater than A. Suppose the first group consists of the first k values x1, x2, …, xk (xk ≤ A < xk+1).

Since all the xi are non-negative, all terms of the sum M(X) = x1p1 + x2p2 + … + xnpn are non-negative. Therefore, discarding the first k terms in this expression, we obtain the inequality

M(X) ≥ xk+1·pk+1 + … + xn·pn.

Because each of xk+1, …, xn is greater than A,

M(X) ≥ A·(pk+1 + … + pn) = A·P(X > A),

that is

P(X > A) ≤ M(X)/A.

Q.E.D.

Random variables can have different distributions with the same mathematical expectations. However, for them Chebyshev’s lemma will give the same estimate of the probability of one or another test result. This drawback of the lemma is related to its generality: it is impossible to achieve a better estimate for all random variables at once.

Chebyshev's inequality.

The probability that the deviation of a random variable from its mathematical expectation exceeds a positive number ε in absolute value is no more than a fraction whose numerator is the variance of the random variable and whose denominator is the square of ε:

P(|X − M(X)| > ε) ≤ D(X)/ε².

Proof. Since (X − M(X))² is a random variable that does not take negative values, we apply the inequality from Chebyshev's lemma to it with the number ε² in place of A:

P((X − M(X))² > ε²) ≤ M[(X − M(X))²]/ε² = D(X)/ε².

The event (X − M(X))² > ε² coincides with the event |X − M(X)| > ε, which proves the inequality.

Q.E.D.

Corollary. Since the events |X − M(X)| > ε and |X − M(X)| ≤ ε are opposite,

P(|X − M(X)| ≤ ε) ≥ 1 − D(X)/ε²

is another form of Chebyshev's inequality.

Let us accept without proof the fact that Chebyshev’s lemma and inequality are also true for continuous random variables.

Chebyshev's inequality underlies the qualitative and quantitative statements of the law of large numbers. It determines the upper bound on the probability that the deviation of the value of a random variable from its mathematical expectation is greater than a certain specified number. It is remarkable that Chebyshev’s inequality gives an estimate of the probability of an event for a random variable whose distribution is unknown, only its mathematical expectation and variance are known.

Theorem. (Law of large numbers in Chebyshev form)

If the variances of the independent random variables X1, X2, …, Xn are bounded by one constant C and their number is sufficiently large, then the probability that the deviation of the arithmetic mean of these random variables from the arithmetic mean of their mathematical expectations will not exceed in absolute value a given positive number ε, however small it may be, is arbitrarily close to unity:

P(|(X1 + … + Xn)/n − (M(X1) + … + M(Xn))/n| ≤ ε) → 1 as n → ∞.

We accept the theorem without proof.

Corollary 1. If the independent random variables have the same mathematical expectation a, their variances are bounded by the same constant C and their number is sufficiently large, then, however small the given positive number ε, the probability that the deviation of the arithmetic mean of these random variables from a does not exceed ε in absolute value is arbitrarily close to unity.

The fact that the arithmetic mean of the results of a sufficiently large number of measurements made under the same conditions is taken as an approximate value of an unknown quantity can be justified by this theorem. Indeed, the measurement results are random, since they are influenced by many random factors. The absence of systematic errors means that the mathematical expectations of the individual measurement results are the same and equal to the true value of the quantity being measured. Consequently, by the law of large numbers, the arithmetic mean of a sufficiently large number of measurements will differ as little as desired from the true value of the desired quantity.

(Recall that errors are called systematic if they distort the measurement result in the same direction according to a more or less clear law. These include errors that appear as a result of imperfect instruments (instrumental errors), due to the personal characteristics of the observer (personal errors) and etc.)

Corollary 2 . (Bernoulli's theorem.)

If the probability of the occurrence of event A in each of the independent trials is constant and their number is sufficiently large, then the probability that the frequency of occurrence of the event differs as little as desired from the probability of its occurrence is arbitrarily close to unity:

P(|m/n − p| ≤ ε) → 1 as n → ∞.

Bernoulli's theorem states that if the probability of an event is the same in all trials, then as the number of trials increases, the frequency of the event tends to the probability of the event and ceases to be random.

In practice, it is relatively rare to encounter experiments in which the probability of the occurrence of an event in any experiment is constant, more often it varies in different experiments. The Poisson theorem applies to a test scheme of this type:

Corollary 3 . (Poisson's theorem.)

If the probability of the occurrence of an event in the kth trial does not change when the results of the previous trials become known, and the number of trials is sufficiently large, then the probability that the frequency of occurrence of the event differs arbitrarily little from the arithmetic mean of the probabilities is arbitrarily close to unity:

P(|m/n − (p1 + p2 + … + pn)/n| ≤ ε) → 1 as n → ∞.

Poisson's theorem states that the frequency of an event in a series of independent trials tends to the arithmetic mean of its probabilities and ceases to be random.

In conclusion, we note that none of the theorems considered gives either an exact or even an approximate value of the desired probability, but only its lower or upper limit is indicated. Therefore, if it is necessary to establish the exact or at least approximate value of the probabilities of the corresponding events, the possibilities of these theorems are very limited.

Approximate probabilities for large values ​​can only be obtained using limit theorems. In them, additional restrictions are imposed on random variables (as is the case, for example, in Lyapunov’s theorem), or random variables of a certain type are considered (for example, in the Moivre-Laplace integral theorem).

The theoretical significance of Chebyshev's theorem, which is a very general formulation of the law of large numbers, is great. However, if we apply it to the question of whether the law of large numbers can be applied to a given sequence of independent random variables, then, when the answer is affirmative, the theorem will often require many more random variables than are actually necessary for the law of large numbers to take effect. This disadvantage of Chebyshev's theorem is explained by its general character. It is therefore desirable to have theorems that indicate more accurately the lower (or upper) bound of the desired probability. They can be obtained by imposing on the random variables some additional restrictions, which are usually satisfied for random variables encountered in practice.

NOTES ON THE CONTENT OF THE LAW OF LARGE NUMBERS

If the number of random variables is large enough and they satisfy some very general conditions, then no matter how they are distributed, it is almost certain that their arithmetic mean deviates as little as desired from a constant value - the arithmetic mean of their mathematical expectations, i.e. is an almost constant value. This is the content of the theorems related to the law of large numbers. Consequently, the law of large numbers is one of the expressions of the dialectical connection between chance and necessity.

One can give many examples of the emergence of new qualitative states as manifestations of the law of large numbers, primarily among physical phenomena. Let's consider one of them.

According to modern concepts, gases consist of individual particles, molecules, which are in chaotic motion, and it is impossible to say exactly where a given molecule will be at a given moment and at what speed it will move. However, observations show that the total effect of the molecules, for example the gas pressure on the wall of the vessel, manifests itself with amazing consistency. It is determined by the number of impacts and the strength of each of them. Although both are a matter of chance, instruments do not detect fluctuations of gas pressure under normal conditions. This is explained by the fact that, owing to the huge number of molecules even in the smallest volumes, a change of the pressure by a noticeable amount is practically impossible. Consequently, the physical law stating the constancy of gas pressure is a manifestation of the law of large numbers.

The constancy of pressure and some other characteristics of gas at one time served as a compelling argument against the molecular theory of the structure of matter. Subsequently, they learned to isolate a relatively small number of molecules, ensuring that the influence of individual molecules still remained, and thus the law of large numbers could not manifest itself to a sufficient extent. Then it was possible to observe fluctuations in gas pressure, confirming the hypothesis about the molecular structure of the substance.

The law of large numbers underlies various types of insurance (insurance of human life for all possible periods, property, livestock, crops, etc.).

When planning the range of consumer goods, the population's demand for them is taken into account. This demand reveals the effect of the law of large numbers.

The sampling method, widely used in statistics, finds its scientific basis in the law of large numbers. For example, the quality of wheat brought from a collective farm to a procurement point is judged by the quality of the grains accidentally caught in a small measure. There is not much grain in the measure compared to the whole batch, but in any case the measure is chosen large enough for the law of large numbers to manifest itself with an accuracy that satisfies the need. We have the right to take the corresponding indicators in the sample as indicators of the contamination, moisture content and average grain weight of the entire batch of incoming grain.

Further efforts of scientists to deepen the content of the law of large numbers were aimed at obtaining the most general conditions for the applicability of this law to a sequence of random variables. There have been no fundamental successes in this direction for a long time. After P. L. Chebyshev and A. A. Markov, only in 1926 did the Soviet academician A. N. Kolmogorov manage to obtain the conditions necessary and sufficient for the law of large numbers to be applicable to a sequence of independent random variables. In 1928, the Soviet scientist A. Ya. Khinchin showed that a sufficient condition for the applicability of the law of large numbers to a sequence of independent identically distributed random variables is the existence of their mathematical expectation.

For practice it is extremely important to clarify fully the question of the applicability of the law of large numbers to dependent random variables, since phenomena in nature and society are mutually dependent and mutually determine each other. Much work has been devoted to clarifying the restrictions that need to be imposed on dependent random variables so that the law of large numbers can be applied to them; the most important results belong to the outstanding Russian scientist A. A. Markov and the prominent Soviet scientists S. N. Bernstein and A. Ya. Khinchin.

The main result of these works is that the law of large numbers can be applied to dependent random variables only if a strong dependence exists between random variables with close numbers, and between random variables with distant numbers the dependence is sufficiently weak. Examples of random variables of this type are numerical characteristics of climate. The weather of each day is noticeably influenced by the weather of the previous days, and the influence noticeably weakens as the days move away from each other. Consequently, the long-term average temperature, pressure and other climate characteristics of a given area, in accordance with the law of large numbers, should practically be close to their mathematical expectations. The latter are objective characteristics of the climate of the area.

In order to experimentally test the law of large numbers, the following experiments were carried out at different times.

1. Buffon's experiment. A coin was tossed 4040 times; heads came up 2048 times. The frequency of heads turned out to be 2048/4040 ≈ 0.5069.

2. Pearson's experiment. A coin was tossed 12,000 and then 24,000 times. The frequency of heads in the first case was 0.5016, in the second 0.5005.

3. Westergaard's experiment. From an urn containing equal numbers of white and black balls, 5011 white and 4989 black balls were obtained in 10,000 draws (each drawn ball being returned to the urn). The frequency of white balls was 5011/10,000 = 0.5011, and of black balls 0.4989.

4. V. I. Romanovsky's experiment. Four coins were tossed 20,160 times. The counts and relative frequencies of the various combinations of heads and tails were distributed as follows:

Combination (heads and tails)   Count    Empirical frequency   Theoretical frequency
4 and 0                         1181     0.05858               0.0625
3 and 1                         4909     0.24350               0.2500
2 and 2                         7583     0.37614               0.3750
1 and 3                         5085     0.25224               0.2500
0 and 4                         1402     0.06954               0.0625
Total                           20160    1.0000                1.0000

The results of experimental tests of the law of large numbers convince us that experimental frequencies are very close to probabilities.

CENTRAL LIMIT THEOREM

It is not difficult to prove that the sum of any finite number of independent normally distributed random variables is also normally distributed.

If the independent random variables are not normally distributed, then under some very weak restrictions imposed on them their sum will still be approximately normally distributed.

This problem was posed and solved mainly by Russian scientists P. L. Chebyshev and his students A. A. Markov and A. M. Lyapunov.

Theorem (Lyapunov).

If the independent random variables X1, X2, …, Xn have finite mathematical expectations and finite variances, their number n is sufficiently large, and with unlimited increase of n

(b1 + b2 + … + bn) / (D(X1) + D(X2) + … + D(Xn))^(3/2) → 0,

where bk = M|Xk − M(Xk)|³ are the absolute central moments of the third order, then their sum X1 + X2 + … + Xn has, to a sufficient degree of accuracy, the normal distribution with mathematical expectation M(X1) + … + M(Xn) and variance D(X1) + … + D(Xn).

(In fact, we present not Lyapunov’s theorem, but one of its corollaries, since this corollary is quite sufficient for practical applications. Therefore, the condition, which is called Lyapunov’s condition, is a stronger requirement than is necessary to prove Lyapunov’s theorem itself.)

The meaning of the condition is that the effect of each term (random variable) is small compared to the total effect of all of them. Many random phenomena occurring in nature and in social life proceed precisely according to this pattern. In this regard, Lyapunov's theorem is of exceptionally great importance, and the normal distribution law is one of the basic laws in probability theory.
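A simulation (illustration only) of the mechanism just described: sums of many small independent terms, here uniform ones, behave approximately normally.

```python
# Sum of 50 independent U(0,1) terms, repeated many times; the sums are
# approximately normal with mean 50*0.5 and variance 50*(1/12).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(12)
k, trials = 50, 200_000
sums = rng.uniform(0.0, 1.0, size=(trials, k)).sum(axis=1)

standardized = (sums - k * 0.5) / np.sqrt(k / 12.0)
# Compare a few probabilities with the standard normal law:
for z in (-1.0, 0.0, 1.0):
    print(f"P(S* < {z:+.1f}): empirical {np.mean(standardized < z):.4f}, "
          f"normal {norm.cdf(z):.4f}")
```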

Suppose, for example, that some quantity is measured. The various deviations of the observed values from its true value (the mathematical expectation) arise as the result of the influence of a very large number of factors, each of which generates a small error, the total error being their sum. Then, by Lyapunov's theorem, the total measurement error is a random variable that should be distributed according to the normal law.

When a gun is fired, under the influence of a very large number of random causes the projectiles are scattered over a certain area. The random influences on the projectile trajectory can be considered independent, and each cause produces only a slight change of the trajectory compared with the total change under the influence of all the causes. Therefore we should expect that the deviation of the point of the projectile burst from the target will be a random variable distributed according to the normal law.

According to Lyapunov's theorem we can expect, for example, that adult male height is a random variable distributed according to the normal law. This hypothesis, as well as those considered in the previous two examples, agrees well with observations. To confirm it, we present the distribution by height of 1000 adult male workers together with the corresponding theoretical numbers of men, i.e. the numbers of men who should fall into these height groups under the assumption that men's height is distributed according to the normal law.

[Table: numbers of men in each 3 cm height interval from 143-146 cm to 185-188 cm; experimental data and the theoretical numbers predicted by the normal law.]

It would be difficult to expect a more accurate agreement between the experimental data and the theoretical data.
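A minimal Python sketch of how the theoretical column of such a table is computed (assuming SciPy is available): the expected number of men in each 3 cm interval under a normal law. The mean and standard deviation used below are illustrative assumptions, not the values estimated from the actual 1000-worker data.

from scipy.stats import norm

# Expected number of men in each 3 cm height interval under a fitted normal law.
mean_height = 165.0   # assumed sample mean, cm
sd_height = 7.0       # assumed sample standard deviation, cm
n_men = 1000

edges = list(range(143, 189, 3))  # 143-146, 146-149, ..., 185-188
for lo, hi in zip(edges[:-1], edges[1:]):
    p = norm.cdf(hi, mean_height, sd_height) - norm.cdf(lo, mean_height, sd_height)
    print(f"{lo}-{hi} cm: expected {n_men * p:6.1f} men")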

One can easily prove, as a consequence of Lyapunov's theorem, a proposition that will be needed later to justify the sampling method.

Proposition.

The sum of a sufficiently large number of identically distributed independent random variables having absolute central moments of the third order is distributed, to a sufficient degree of accuracy, according to the normal law.

The limit theorems of probability theory, in particular the de Moivre-Laplace theorem, explain the nature of the stability of the frequency of occurrence of an event. This stability consists in the fact that the limiting distribution of the number of occurrences of the event, as the number of trials increases without bound (the probability of the event being the same in all trials), is the normal distribution.
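A minimal Python sketch of the de Moivre-Laplace approximation (assuming SciPy is available): for a large number of trials, the binomial number of occurrences is approximately normal with mean np and variance np(1 − p). The trial parameters below are illustrative assumptions.

from math import sqrt
from scipy.stats import binom, norm

# Probability that the number of occurrences m deviates from n*p by at most 100,
# computed exactly (binomial) and via the normal approximation.
n, p = 10_000, 0.5
lo, hi = n * p - 100, n * p + 100

exact = binom.cdf(hi, n, p) - binom.cdf(lo - 1, n, p)
sigma = sqrt(n * p * (1 - p))
approx = norm.cdf(hi, n * p, sigma) - norm.cdf(lo, n * p, sigma)
print(f"exact binomial: {exact:.4f}, normal approximation: {approx:.4f}")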

System of random variables.

The random variables considered above were one-dimensional, i.e., determined by a single number; there are, however, random variables that are determined by two, three, etc. numbers. Such random variables are called two-dimensional, three-dimensional, and so on.

Depending on the type of random variables included in the system, systems can be discrete, continuous or mixed if the system includes different types of random variables.

Let's take a closer look at systems of two random variables.

Definition. The distribution law of a system of random variables is a relation that establishes a connection between the regions of possible values of the system of random variables and the probabilities of the system falling in these regions.

Example. From an urn containing 2 white and 3 black balls, two balls are drawn. Let X be the number of white balls drawn, and let the second random variable Y be defined as a given function of the outcome of the draw.

Let us compute the probabilities of the values of X and use them to build the distribution table of the system.

The probability that no white ball is drawn (and hence both drawn balls are black) is

P(X = 0) = C(3,2)/C(5,2) = 3/10.

The probability that exactly one white ball is drawn (and hence one black ball) is

P(X = 1) = C(2,1)·C(3,1)/C(5,2) = 6/10 = 3/5.

The probability that two white balls are drawn (and hence no black ones) is

P(X = 2) = C(2,2)/C(5,2) = 1/10.

Assigning each pair (x, y) its probability, we obtain the distribution series of the two-dimensional random variable.
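A minimal Python sketch of building such a table by direct enumeration of the equally likely draws. The particular rule defining Y used here (Y = 1 if at least one black ball is drawn, otherwise 0) is an illustrative assumption, not the rule from the example above.

from itertools import combinations
from collections import Counter
from fractions import Fraction

# Enumerate the 10 equally likely ways of drawing two balls from an urn with
# 2 white ("w") and 3 black ("b") balls, and tabulate the joint law of (X, Y).
balls = ["w", "w", "b", "b", "b"]
outcomes = list(combinations(range(len(balls)), 2))

table = Counter()
for i, j in outcomes:
    drawn = (balls[i], balls[j])
    x = drawn.count("w")                 # X = number of white balls drawn
    y = 1 if "b" in drawn else 0         # assumed illustrative definition of Y
    table[(x, y)] += 1

for (x, y), count in sorted(table.items()):
    print(f"P(X={x}, Y={y}) = {Fraction(count, len(outcomes))}")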

Definition. The distribution function of a system of two random variables is the function of two arguments F(x, y) equal to the probability of the joint fulfillment of the two inequalities X < x and Y < y:

F(x, y) = P(X < x, Y < y).


Let us note the following properties of the distribution function of a system of two random variables:

1) 0 ≤ F(x, y) ≤ 1;

2) the distribution function is a non-decreasing function of each argument;

3) F(−∞, y) = F(x, −∞) = F(−∞, −∞) = 0;

4) F(x, +∞) = F1(x), F(+∞, y) = F2(y), F(+∞, +∞) = 1, where F1(x) and F2(y) are the distribution functions of the components X and Y;

5) the probability that the random point (X, Y) falls into an arbitrary rectangle with sides parallel to the coordinate axes is calculated by the formula (a small numerical sketch follows):

P(x1 ≤ X < x2, y1 ≤ Y < y2) = F(x2, y2) − F(x1, y2) − F(x2, y1) + F(x1, y1).
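A minimal Python sketch of property 5. The joint distribution function used here (that of two independent exponentially distributed variables with rate 1) is an assumed stand-in, not a distribution taken from the text.

import math

# Joint CDF of two independent exponential(1) random variables (illustrative).
def F(x: float, y: float) -> float:
    fx = 1 - math.exp(-x) if x > 0 else 0.0
    fy = 1 - math.exp(-y) if y > 0 else 0.0
    return fx * fy

def rectangle_probability(x1, x2, y1, y2):
    # P(x1 <= X < x2, y1 <= Y < y2): alternating sum of F at the corners.
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)

print(rectangle_probability(0.0, 1.0, 0.0, 2.0))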


Distribution density of a system of two random variables.

Definition. The joint probability distribution density of a two-dimensional random variable (X, Y) is the second mixed partial derivative of the distribution function:

f(x, y) = ∂²F(x, y)/∂x∂y.

If the distribution density is known, then the distribution function can be found by the formula:

F(x, y) = ∫ from −∞ to x ∫ from −∞ to y f(u, v) dv du.

The two-dimensional distribution density is non-negative, f(x, y) ≥ 0, and its double integral over the whole plane is equal to one: ∫∫ f(x, y) dx dy = 1.

From the known density of the joint distribution, one can find the distribution density of each of the components of a two-dimensional random variable.

f1(x) = ∫ f(x, y) dy,   f2(y) = ∫ f(x, y) dx,

where each integral is taken over the other variable from −∞ to +∞.
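A minimal Python sketch of recovering a marginal density from a joint density by numerical integration. The joint density f(x, y) = x + y on the unit square is an illustrative assumption; its marginal is f1(x) = x + 1/2.

import numpy as np

# Joint density (illustrative): f(x, y) = x + y on [0, 1] x [0, 1].
def f(x, y):
    return x + y

dy = 0.001
ys = np.arange(dy / 2, 1.0, dy)   # midpoints of small subintervals of [0, 1]
for x in (0.25, 0.5, 0.75):
    f1 = np.sum(f(x, ys)) * dy    # numerically integrate over y
    print(f"f1({x}) ~= {f1:.3f} (exact {x + 0.5:.3f})")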

Conditional laws of distribution.

As shown above, knowing the joint distribution law, you can easily find the distribution laws of each random variable included in the system.

However, in practice, the inverse problem is often faced - using the known laws of distribution of random variables, find their joint distribution law.

In the general case, this problem is unsolvable, because the distribution law of a random variable does not say anything about the relationship of this variable with other random variables.

In addition, if the random variables are dependent on each other, then their joint distribution law cannot be expressed through the distribution laws of the components, because it must also reflect the connections between the components.

All this leads to the need to consider conditional distribution laws.

Definition. The distribution of one random variable included in the system, found under the condition that another random variable has taken a certain value, is called conditional distribution law.

The conditional distribution law can be specified both by the distribution function and by the distribution density.

The conditional distribution densities are calculated by the formulas:

f(y/x) = f(x, y)/f1(x),   f(x/y) = f(x, y)/f2(y).

The conditional distribution density has all the properties of the distribution density of one random variable.

Conditional mathematical expectation.

Definition. The conditional mathematical expectation of a discrete random variable Y at X = x (x being one of the possible values of X) is the sum of the products of all possible values of Y and their conditional probabilities:

M(Y/X = x) = Σ yj·p(yj/x).

For continuous random variables:

M(Y/X = x) = ∫ y·f(y/x) dy (the integral is taken from −∞ to +∞),

where f(y/x) is the conditional density of the random variable Y at X = x.

The conditional mathematical expectation M(Y/x) = f(x) is a function of x and is called the regression function of Y on X.

Example. Find the conditional mathematical expectation of the component Y at X = x1 = 1 for the discrete two-dimensional random variable given by the table:

          x1 = 1   x2 = 3   x3 = 4   x4 = 8
y1 = 3     0.15     0.06     0.25     0.04
y2 = 6     0.30     0.10     0.03     0.07
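Solution. By the definition above, first find P(X = 1) = 0.15 + 0.30 = 0.45; then the conditional probabilities are P(y1 = 3 / x1 = 1) = 0.15/0.45 = 1/3 and P(y2 = 6 / x1 = 1) = 0.30/0.45 = 2/3, so

M(Y/X = 1) = 3·(1/3) + 6·(2/3) = 5.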

The conditional variance and conditional moments of a system of random variables are determined similarly.

Dependent and independent random variables.

Definition. Random variables are called independent, if the distribution law of one of them does not depend on the value of the other random variable.

The concept of dependence of random variables is very important in probability theory.

Conditional distributions of independent random variables are equal to their unconditional distributions.

Let us determine the necessary and sufficient conditions for the independence of random variables.

Theorem. In order for the random variables X and Y to be independent, it is necessary and sufficient that the distribution function of the system (X, Y) be equal to the product of the distribution functions of the components:

F(x, y) = F1(x)·F2(y).

A similar theorem can be formulated for the distribution density:

Theorem. In order for the random variables X and Y to be independent, it is necessary and sufficient that the joint distribution density of the system (X, Y) be equal to the product of the distribution densities of the components:

f(x, y) = f1(x)·f2(y).

In practice the following formulas are used to check independence.

For discrete random variables: p(xi, yj) = p(xi)·p(yj) for all i and j.

For continuous random variables: f(x, y) = f1(x)·f2(y).

The correlation moment

μxy = M[(X − mx)(Y − my)], where mx = M(X) and my = M(Y),

serves to characterize the relationship between random variables. If the random variables are independent, their correlation moment is equal to zero.

The correlation moment has a dimension equal to the product of the dimensions of the random variables X and Y. This is a disadvantage of this numerical characteristic, because with different units of measurement different correlation moments are obtained, which makes it difficult to compare the correlation moments of different random variables.

In order to eliminate this drawback, another characteristic is used - the correlation coefficient.

Definition. The correlation coefficient rxy of random variables X and Y is the ratio of the correlation moment to the product of the standard deviations of these variables:

rxy = μxy/(σx·σy).

The correlation coefficient is a dimensionless quantity. For independent random variables, the correlation coefficient is zero.

Property: The absolute value of the correlation moment of two random variables X and Y does not exceed the geometric mean of their variances: |μxy| ≤ √(D(X)·D(Y)) = σx·σy.

Property: The absolute value of the correlation coefficient does not exceed one: |rxy| ≤ 1.

Random variables are called correlated, if their correlation moment is different from zero, and uncorrelated, if their correlation moment is zero.

If random variables are independent, then they are uncorrelated, but from uncorrelatedness one cannot conclude that they are independent.

If two quantities are dependent, then they can be either correlated or uncorrelated.
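A minimal Python sketch of this remark: the variables below are dependent (Y is completely determined by X) yet uncorrelated. The choice of X uniform on [−1, 1] and Y = X² is an illustrative assumption.

import numpy as np

# Dependent but uncorrelated: for X symmetric about zero and Y = X**2 the
# correlation moment is zero; the sample estimates below are close to zero
# up to simulation noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=100_000)
y = x ** 2

covariance = np.mean((x - x.mean()) * (y - y.mean()))
correlation = covariance / (x.std() * y.std())
print(f"sample covariance {covariance:+.4f}, sample correlation {correlation:+.4f}")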

Often, from a given distribution density of a system of random variables, one can determine the dependence or independence of these variables.

Along with the correlation coefficient, the degree of dependence of random variables can be characterized by another quantity, called the covariance coefficient; it is given by the formula

cov(X, Y) = M[(X − mx)(Y − my)] = M(XY) − mx·my.

Example. Suppose the distribution density of the system of random variables (X, Y) is given and it factors into a product f1(x)·f2(y); then X and Y are independent and, of course, they will also be uncorrelated.

Linear regression.

Consider a two-dimensional random variable ( X, Y), where X and Y are dependent random variables.

Let us approximately represent one random variable as a function of the other: Y ≈ g(X). An exact representation is, in general, impossible. We will assume that this function is linear: g(X) = aX + b.

To determine this function, it remains to find the constant values a and b.

Definition. The function g(X) is called the best approximation of the random variable Y in the sense of the least squares method if the mathematical expectation

M[(Y − g(X))²]

takes the smallest possible value; the function g(x) is then called the mean square regression of Y on X.

Theorem. The linear mean square regression of Y on X is calculated by the formula:

g(x) = my + r·(σy/σx)·(x − mx),

where mx = M(X), my = M(Y), σx = √D(X), σy = √D(Y), and r = μxy/(σx·σy) is the correlation coefficient of the variables X and Y.

The quantity σy²·(1 − r²) is called the residual variance of the random variable Y relative to the random variable X. It characterizes the magnitude of the error incurred when the random variable Y is replaced by the linear function g(X) = aX + b.

It is clear that if r = ±1, the residual variance is zero; therefore the error is zero, and the random variable Y is exactly represented by a linear function of the random variable X.

The mean square regression line of X on Y is determined similarly:

g(y) = mx + r·(σx/σy)·(y − my).

If X and Y have linear regression functions with respect to each other, then the quantities X and Y are said to be connected by a linear correlation dependence.
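A minimal Python sketch checking the regression formula on simulated data. The linear model used to generate the data (Y = 2X + 1 plus normal noise) is an illustrative assumption.

import numpy as np

# The linear mean square regression g(x) = m_y + r*(sigma_y/sigma_x)*(x - m_x)
# computed from the moments of simulated data.
rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=50_000)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=x.size)

m_x, m_y = x.mean(), y.mean()
s_x, s_y = x.std(), y.std()
r = np.mean((x - m_x) * (y - m_y)) / (s_x * s_y)

slope = r * s_y / s_x
intercept = m_y - slope * m_x
print(f"regression of Y on X: g(x) = {slope:.3f} * x + {intercept:.3f}")

# Residual variance sigma_y**2 * (1 - r**2): the error of replacing Y by g(X);
# here it should be close to the noise variance 0.5**2 = 0.25.
print(f"residual variance: {s_y**2 * (1 - r**2):.3f}")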

Theorem. If a two-dimensional random variable ( X, Y) is normally distributed, then X and Y are connected by a linear correlation.

E.G. Nikiforova

