Least squares method and finding a solution in Excel. Applying the method of least squares in Excel Calculation by the method of least squares xls

Well, at work they reported to the inspection, the article was written at home for the conference - now you can write to the blog. While I was processing my data, I realized that I could not help but write about a very cool and necessary add-in in Excel, which is called. So the article will be devoted to this particular add-in, and I will tell you about it using an example of use method least squares (OLS) to search for unknown equation coefficients when describing experimental data.

How to enable the Find Solution add-in

First, let's figure out how to enable this add-in.

1. Go to the "File" menu and select "Excel Options"

2. In the window that appears, select "Search for a solution" and click "go".

3. In the next window, put a tick in front of the item "search for a solution" and click "OK".

4. The add-on is activated - now it can be found in the "Data" menu item.

Least square method

Now briefly about least squares method (OLS) and where it can be applied.

Let's say we have a dataset after we did some experiment where we studied the effect of the X value on the Y value.

We want to describe this influence mathematically, so that later we can use this formula and know that if we change the value of X by so much, we get the value of Y so and so ...

I'll take a super-simple example (see fig.).

It’s clear that the points are located one after the other as if in a straight line, and therefore we safely assume that our dependence is described by a linear function y = kx + b. At the same time, we are definitely sure that when X is equal to zero, the value of Y is also equal to zero. This means that the function describing the dependence will be even simpler: y = kx (remember the school curriculum).

In general, we have to find the coefficient k. This is what we will do with OLS using the "search for a solution" add-in.

The method consists in the fact that (here - attention: you need to think about it) the sum of the squares of the differences between the experimentally obtained and the corresponding calculated values ​​was minimal. That is, when X1 = 1 the actually measured value Y1 = 4.6, and the calculated y1 = f (x1) is 4, the square of the difference will be (y1-Y1) ^ 2 = (4-4.6) ^ 2 = 0.36 ... With the following the same: when X2 = 2, the actually measured value Y2 = 8.1, and the calculated y2 is 8, the square of the difference will be (y2-Y2) ^ 2 = (8-8.1) ^ 2 = 0.01. And the sum of all these squares should be as small as possible.

So, let's start training on using the OLS and Find Solution Excel Add-ins .

Applying the solution search add-in

1. If you have not turned on the "search for a solution" add-on, then go back to point How to enable the search for a solution add-in and enable 🙂

2. In cell A1, enter the value "1". This unit will be the first approximation to the real value of the coefficient (k) of our functional dependence y = kx.

3. In column B, we have the values ​​of the X parameter, in column C - the values ​​of the Y parameter. In the cells of the D column, we enter the formula: "coefficient k multiplied by the value of X". For example, in cell D1 we enter "= A1 * B1", in cell D2 we enter "= A1 * B2", and so on.

4. We believe that the coefficient k is equal to one and the function f (x) = y = 1 * x is the first approximation to our solution. We can calculate the sum of the squares of the differences between the measured values ​​of Y and those calculated by the formula y = 1 * x. We can do all this manually by driving the appropriate cell references into the formula: "= (D2-C2) ^ 2 + (D3-C3) ^ 2 + (D4-C4) ^ 2 ... etc. Finally we are mistaken and understand that we have lost a lot of time. In Excel, for calculating the sum of squares of differences, there is a special formula, "SUMKVRAZN", which will do everything for us. Enter it in cell A2 and set the initial data: the range of measured values ​​Y (column C) and the range of calculated Y values ​​(column D).

4. The sum of the differences of the squares has been calculated - now we go to the “Data” tab and select “Search for a solution”.

5. In the menu that appears, select cell A1 (the one with the coefficient k) as the cell to be changed.

6. Select cell A2 as the target and set the condition “set equal to the minimum value”. Remember that this is the cell where we calculate the sum of the squares of the differences between the calculated and measured values, and this sum should be minimal. Click "execute".

7. Coefficient k is selected. Now you can verify that the calculated values ​​are now very close to the measured ones.

P.S.

In general, of course, to approximate experimental data in Excel, there are special tools that allow you to describe data using a linear, exponential, power and polynomial function, so you can often do without n solution search add-ons... I talked about all these approximation methods in my mine, so if you're interested, take a look. But when it comes to some exotic function with one unknown coefficient or optimization problems, here superstructure very opportunely.

Find solution add-in can be used for other tasks, the main thing is to understand the essence: there is a cell where we select a value, and there is a target cell in which a condition is set for selecting an unknown parameter.
That's all! In the next article I will tell you a fairy tale about a vacation, so in order not to miss the article,

Least square method is used to estimate the parameters of the regression equation.

One of the methods for studying stochastic relationships between features is regression analysis.
Regression analysis is the derivation of the regression equation, with the help of which the average value of a random variable (feature-result) is found, if the value of another (or other) variables (feature-factors) is known. It includes the following steps:

  1. choice of the form of communication (type of analytical regression equation);
  2. estimation of the parameters of the equation;
  3. assessment of the quality of the analytical regression equation.
Most often, a linear form is used to describe the statistical relationship of features. Attention to linear communication is explained by a clear economic interpretation of its parameters, limited variation of variables and the fact that, in most cases, nonlinear forms of communication for performing calculations are transformed (by logarithm or change of variables) into a linear form.
In the case of a linear pairwise relationship, the regression equation will take the form: y i = a + b x i + u i. Options this equation a and b are estimated from the data statistical observation x and y. The result of such an assessment is the equation:, where, are the estimates of the parameters a and b, is the value of the effective attribute (variable) obtained by the regression equation (calculated value).

The most often used to estimate parameters least squares method (OLS).
The least squares method gives the best (consistent, efficient and unbiased) estimates of the parameters of the regression equation. But only if certain prerequisites are met regarding the random term (u) and the independent variable (x) (see OLS prerequisites).

The problem of estimating the parameters of a linear paired equation by the least squares method consists in the following: to obtain such parameter estimates, at which the sum of the squares of the deviations of the actual values ​​of the effective indicator - y i from the calculated values ​​- is minimal.
Formally OLS criterion can be written like this: .

Least squares classification

  1. Least square method.
  2. Maximum likelihood method (for the normal classical linear regression model, the normality of the regression residuals is postulated).
  3. The generalized least squares OLS method is used in the case of autocorrelation of errors and in the case of heteroscedasticity.
  4. Weighted least squares method (a special case of OLS with heteroscedastic residuals).

Let's illustrate the essence the classical least squares method graphically... To do this, we will build a dot plot according to the observation data (x i, y i, i = 1; n) in a rectangular coordinate system (such a dot plot is called the correlation field). Let's try to find a straight line that is closest to the points of the correlation field. According to the method of least squares, the line is chosen so that the sum of the squares of the vertical distances between the points of the correlation field and this line would be minimal.

Mathematical record of this problem: .
We know the values ​​of y i and x i = 1 ... n, these are observational data. In the S function, they are constants. The variables in this function are the required parameter estimates -,. To find the minimum of a function of 2 variables, it is necessary to calculate the partial derivatives of this function with respect to each of the parameters and equate them to zero, i.e. .
As a result, we get a system of 2 normal linear equations:
Solving this system, we find the required parameter estimates:

The correctness of the calculation of the parameters of the regression equation can be checked by comparing the sums (there may be some discrepancy due to rounding of calculations).
To calculate the parameter estimates, you can build table 1.
The sign of the regression coefficient b indicates the direction of the relationship (if b> 0, the relationship is direct, if b<0, то связь обратная). Величина b показывает на сколько единиц изменится в среднем признак-результат -y при изменении признака-фактора - х на 1 единицу своего измерения.
Formally, the value of parameter a is the average value of y at x equal to zero. If the attribute factor does not and cannot have a zero value, then the above interpretation of the parameter a does not make sense.

Assessment of the tightness of the relationship between the signs is carried out using the coefficient of linear pair correlation - r x, y. It can be calculated using the formula: ... In addition, the linear pairwise correlation coefficient can be determined through the regression coefficient b: .
The range of admissible values ​​of the linear pair correlation coefficient is from –1 to +1. The sign of the correlation coefficient indicates the direction of the link. If r x, y> 0, then the connection is direct; if r x, y<0, то связь обратная.
If this coefficient is close to one in absolute value, then the relationship between the features can be interpreted as a rather close linear one. If its modulus is equal to one ê r x, y ê = 1, then the connection between the features is functional linear. If features x and y are linearly independent, then r x, y is close to 0.
To calculate r x, y, you can also use table 1.

To assess the quality of the obtained regression equation, the theoretical coefficient of determination is calculated - R 2 yx:

,
where d 2 is the variance y explained by the regression equation;
e 2 - residual (not explained by the regression equation) variance y;
s 2 y is the total (total) variance of y.
The coefficient of determination characterizes the proportion of the variation (variance) of the effective trait y, explained by the regression (and, consequently, the factor x), in the total variation (variance) y. The coefficient of determination R 2 yx takes values ​​from 0 to 1. Accordingly, the value 1-R 2 yx characterizes the proportion of variance y caused by the influence of other factors not taken into account in the model and specification errors.
With paired linear regression R 2 yx = r 2 yx.

4.1. Using built-in functions

Calculation regression coefficients carried out using the function

LINEST(Values_y; X-values; Konst; statistics),

Values_y- an array of y values,

X-values- optional array of values x if array NS is omitted, then it is assumed that it is an array (1; 2; 3; ...) the same size as Values_y,

Konst- a boolean value that indicates whether the constant is required b was equal to 0. If Konst has the meaning TRUE or omitted, then b is calculated in the usual way. If the argument Konst is FALSE, then b is set equal to 0 and the values a are selected so that the relation y = ax.

Statistics- a boolean value that indicates whether to return additional regression statistics. If the argument Statistics has the meaning TRUE, then the function LINEST returns additional regression statistics. If the argument Statistics has the meaning LYING or omitted, then the function LINEST only returns the coefficient a and constant b.

It must be remembered that the result of the functions LINEST () is a set of values ​​- an array.

For calculation correlation coefficient function is used

CORREL(Array1;Array2),

returning the values ​​of the correlation coefficient, where Array1- array of values y, Array2- array of values x. Array1 and Array2 must be of the same dimension.

EXAMPLE 1... Addiction y(x) is presented in the table. Build regression line and calculate correlation coefficient.

y 0.5 1.5 2.5 3.5
x 2.39 2.81 3.25 3.75 4.11 4.45 4.85 5.25

Let's enter a table of values ​​into a MS Excel sheet and build a scatter plot. The worksheet will take the form shown in fig. 2.

To calculate the values ​​of the regression coefficients a and b allocate cells A7: B7, turn to the function wizard and in the category Statistical choose the function LINEST... Fill in the appeared dialog box as shown in Fig. 3 and press Ok.


As a result, the calculated value will appear only in the cell A6(fig. 4). In order for the value to appear in the cell B6 it is necessary to enter the editing mode (key F2) and then press the key combination CTRL + SHIFT + ENTER.



To calculate the value of the correlation coefficient per cell C6 the following formula was introduced:

C7 = CORREL (B3: J3; B2: J2).


Knowing the regression coefficients a and b calculate the values ​​of the function y=ax+b for given x... To do this, we introduce the formula

B5 = $ A $ 7 * B2 + $ B $ 7

and copy it to the range C5: J5(fig. 5).

Let's draw the regression line on the diagram. Select the experimental points on the graph, right-click and select the command Initial data... In the dialog box that appears (Fig. 5), select the tab Row and click on the button Add... Fill in the input fields, as shown in Fig. 6 and press the button OK... A regression line is added to the experimental data plot. By default, its graph will be displayed as points not connected by smoothing lines.

Rice. 6

To change the appearance of the regression line, follow these steps. Right-click on the points representing the line graph, select the command Chart type and set the view of the scatter plot, as shown in Fig. 7.

You can change the line type, color and thickness as follows. Select a line in the diagram, right-click and select the command Data series format ... Next, make settings, for example, as shown in Fig. eight.

As a result of all transformations, we will receive a graph of experimental data and a regression line in one graphical area (Fig. 9).

4.2. Using a trend line.

The construction of various approximating dependencies in MS Excel is implemented as a chart property - trend line.

EXAMPLE 2... As a result of the experiment, some tabular dependence was determined.

0.15 0.16 0.17 0.18 0.19 0.20
4.4817 4.4930 5.4739 6.0496 6.6859 7.3891

Select and build an approximating dependence. Build graphs of tabular and selected analytical dependence.

The solution to the problem can be broken down into the following steps: inputting initial data, building a dot plot and adding a trend line to this plot.

Let's consider this process in detail. Let's enter the raw data into the worksheet and plot the experimental data. Next, select the experimental points on the graph, right-click and use the command Add l trend initiative(fig. 10).

The dialog box that appears allows you to build an approximating dependence.

The first tab (Fig. 11) of this window indicates the type of the approximating dependence.

On the second (Fig. 12), the construction parameters are determined:

· The name of the approximating dependence;

Forecast forward (backward) by n units (this parameter determines how many units forward (backward) it is necessary to extend the trend line);

Whether to show the point of intersection of a curve with a straight line y = const;

· Show the approximating function on the diagram or not (the option to show the equation on the diagram);

· Whether to place the value of the standard deviation on the diagram or not (place the parameter on the diagram the value of the approximation reliability).

Let us choose a polynomial of the second degree as an approximating dependence (Fig. 11) and derive the equation describing this polynomial on the graph (Fig. 12). The resulting diagram is shown in Fig. 13.

Similarly, using trend lines you can choose the parameters of such dependencies as

Linear y=a ∙ x+b,

Logarithmic y=a ∙ ln(x)+b,

Exponential y=a ∙ e b,

Power-law y=a ∙ x b,

Polynomial y=a ∙ x 2 +b ∙ x+c, y=a ∙ x 3 +b ∙ x 2 +c ∙ x + d and so on, up to a polynomial of the 6th degree inclusive,

· Linear filtration.

4.3. Using a decision block

Of considerable interest is the implementation in MS Excel of the selection of parameters by the least squares method using a decision block. This technique allows you to choose the parameters of a function of any kind. Let's consider this possibility using the example of the following problem.

EXAMPLE 3... As a result of the experiment, the dependence z (t) was obtained, presented in the table

0,66 0,9 1,17 1,47 1,7 1,74 2,08 2,63 3,12
38,9 68,8 64,4 66,5 64,95 59,36 82,6 90,63 113,5

Select dependency coefficients Z (t) = At ​​4 + Bt 3 + Ct 2 + Dt + K the least squares method.

This problem is equivalent to the problem of finding the minimum of a function of five variables

Let's consider the process of solving the optimization problem (Fig. 14).

Let the values A, V, WITH, D and TO stored in cells A7: E7... Let's calculate the theoretical values ​​of the function Z(t)=At 4 + Bt 3 + Ct 2 + Dt + K for given t(B2: J2). To do this, in the cell B4 enter the value of the function at the first point (cell B2):

B4 = $ A $ 7 * B2 ^ 4 + $ B $ 7 * B2 ^ 3 + $ C $ 7 * B2 ^ 2 + $ D $ 7 * B2 + $ E $ 7.

Let's copy this formula to the range C4: J4 and get the expected value of the function at the points, the abscissas of which are stored in the cells B2: J2.

Into the cell B5 we introduce a formula that calculates the square of the difference between the experimental and calculated points:

B5 = (B4-B3) ^ 2,

and copy it to the range C5: J5... In a cell F7 we will store the total squared error (10). To do this, we introduce the formula:

F7 = SUM (B5: J5).

Let's use the command Service®Search for a solution and solve the optimization problem without restrictions. Fill in the input fields in the dialog box shown in Fig. 14 and press the button Execute... If a solution is found, then the window shown in Fig. 15.

The result of the work of the decision block will be output to cells A7: E7parameter values functions Z(t)=At 4 + Bt 3 + Ct 2 + Dt + K... In cells B4: J4 get expected function value at the starting points. In a cell F7 will be kept total squared error.

You can draw experimental points and a fitted line in the same graphics area by selecting a range B2: J4, call Chart Wizard and then format the appearance of the resulting graphs.

Rice. 17 displays the MS Excel worksheet after the calculations.


5. REFERENCES

1. Alekseev ER, Chesnokova OV, Solution of problems of computational mathematics in the packages Mathcad12, MATLAB7, Maple9. - NT Press, 2006. – 596s. : ill. - (Tutorial)

2. Alekseev E.R., Chesnokova O.V., E.A. Rudchenko, Scilab, solving engineering and mathematical problems. –M., BINOM, 2008. – 260s.

3. Berezin IS, Zhidkov NP, Methods of computation.-Moscow: Nauka, 1966.-632p.

4. Garnaev A.Yu., Using MS EXCEL and VBA in economics and finance. - SPb .: BHV - Petersburg, 1999. – 332s.

5. Demidovich BP, Maron I A., Shuvalova VZ, Numerical methods of analysis. –M.: Nauka, 1967. – 368p.

6. Korn G., Korn T., Handbook of mathematics for scientists and engineers. –M., 1970, 720p.

7. Alekseev E.R., Chesnokova O.V. Methodical instructions for laboratory work in MS EXCEL. For students of all specialties. Donetsk, DonNTU, 2004.112 p.

It has many applications, since it allows an approximate representation of a given function by other simpler ones. OLS can be extremely useful in processing observations, and it is actively used to estimate some quantities from the results of measurements of others that contain random errors. This article will show you how to implement least squares calculations in Excel.

Statement of the problem using a specific example

Suppose there are two indicators X and Y. And Y depends on X. Since OLS is of interest to us from the point of view of regression analysis (in Excel, its methods are implemented using built-in functions), then you should immediately go on to consider a specific problem.

So, let X be the retail space of a grocery store, measured in square meters, and Y - the annual turnover, measured in millions of rubles.

It is required to make a forecast of what turnover (Y) the store will have if it has a particular retail space. Obviously, the function Y = f (X) is increasing, since the hypermarket sells more goods than the stall.

A few words about the correctness of the initial data used for prediction

Let's say we have a table built from data for n stores.

According to mathematical statistics, the results will be more or less correct if data on at least 5-6 objects are examined. In addition, you cannot use "abnormal" results. In particular, an elite small boutique can have a turnover many times greater than the turnover of large retail outlets of the "masmarket" class.

Method essence

The table data can be displayed on the Cartesian plane as points M 1 (x 1, y 1),… M n (x n, y n). Now the solution of the problem will be reduced to the selection of an approximating function y = f (x) with a graph passing as close as possible to the points M 1, M 2, .. M n.

Of course, you can use a high degree polynomial, but this option is not only difficult to implement, but also simply incorrect, since it will not reflect the main trend that needs to be detected. The most reasonable solution is to find the straight line y = ax + b, which best approximates the experimental data, or rather, the coefficients - a and b.

Accuracy assessment

For any approximation, an assessment of its accuracy is of particular importance. Let us denote by e i the difference (deviation) between the functional and experimental values ​​for the point x i, that is, e i = y i - f (x i).

Obviously, to estimate the accuracy of the approximation, the sum of deviations can be used, i.e., when choosing a straight line for an approximate representation of the dependence of X on Y, one should give preference to the one with the smallest value of the sum e i at all points under consideration. However, not everything is so simple, since along with positive deviations, negative deviations will practically be present.

The problem can be solved using the modules of deviations or their squares. The last method is the most widely used. It is used in many areas, including regression analysis (Excel implements two built-in functions), and has long proven its worth.

Least square method

In Excel, as you know, there is a built-in autosum function that allows you to calculate the values ​​of all values ​​located in the selected range. Thus, nothing prevents us from calculating the value of the expression (e 1 2 + e 2 2 + e 3 2 + ... e n 2).

In mathematical notation, it looks like:

Since the decision was initially made to approximate using a straight line, we have:

Thus, the problem of finding the straight line that best describes the specific dependence of the quantities X and Y is reduced to calculating the minimum of a function of two variables:

This requires equating to zero the partial derivatives with respect to the new variables a and b, and solving a primitive system consisting of two equations with 2 unknowns of the form:

After some simple transformations, including dividing by 2 and manipulating the sums, we get:

Solving it, for example, by Cramer's method, we obtain a stationary point with some coefficients a * and b *. This is the minimum, that is, to predict what turnover the store will have for a certain area, the straight line y = a * x + b * is suitable, which is a regression model for the example in question. Of course, it will not allow you to find the exact result, but it will help you get an idea of ​​whether the purchase on credit for a store of a particular area will pay off.

How to implement least squares method in Excel

Excel has a function for calculating the OLS value. It has the following form: "TREND" (known Y values; known X values; new X values; const.). Let's apply the formula for calculating the OLS in Excel to our table.

To do this, in the cell in which the result of the calculation by the least squares method in Excel should be displayed, enter the "=" sign and select the "TREND" function. In the window that opens, fill in the appropriate fields, highlighting:

  • the range of known values ​​for Y (in this case, data for turnover);
  • range x 1,… x n, ie the size of the retail space;
  • both known and unknown values ​​of x, for which you need to find out the size of the turnover (see below for information on their location on the worksheet).

In addition, the formula contains the Boolean variable "Const". If you enter 1 in the corresponding field, this will mean that calculations should be performed, assuming that b = 0.

If you need to know the forecast for more than one value of x, then after entering the formula, you should not press "Enter", but you need to type on the keyboard the combination "Shift" + "Control" + "Enter" ("Enter").

Some features

Regression analysis may even be available to dummies. The Excel formula for predicting the value of an array of unknown variables - "TREND" - can be used even by those who have never heard of the method of least squares. It is enough just to know some of the features of her work. In particular:

  • If you arrange the range of known values ​​of the y variable in one row or column, then each row (column) with known x values ​​will be perceived by the program as a separate variable.
  • If the "TREND" window does not contain a range with known x, then if the function is used in Excel, the program will consider it as an array consisting of integers, the number of which corresponds to the range with the given values ​​of the y variable.
  • To get an array of “predicted” values ​​as an output, the trend expression must be entered as an array formula.
  • If new x values ​​are not specified, then the TREND function considers them to be equal to known. If they are not specified, then array 1 is taken as an argument; 2; 3; 4;…, which is commensurate with the range with the already given parameters y.
  • The range containing the new x-values ​​must be the same or more rows or columns as the range with the given y-values. In other words, it should be commensurate with the independent variables.
  • An array with known x values ​​can contain multiple variables. However, if we are talking only about one, then it is required that the ranges with the given values ​​of x and y are commensurate. In the case of multiple variables, you want the range with the given y values ​​to fit in one column or one row.

FORECAST function

It is implemented with several functions. One of them is called "FORECAST". It is similar to "TREND", that is, it gives the result of calculations using the least squares method. However, only for one X, for which the Y value is unknown.

Now you know the formulas in Excel for dummies that allow you to predict the future value of a given indicator according to a linear trend.

4.1. Using built-in functions

Calculation regression coefficients carried out using the function

LINEST(Values_y; X-values; Konst; statistics),

Values_y- an array of y values,

X-values- optional array of values x if array NS is omitted, then it is assumed that it is an array (1; 2; 3; ...) the same size as Values_y,

Konst- a boolean value that indicates whether the constant is required b was equal to 0. If Konst has the meaning TRUE or omitted, then b is calculated in the usual way. If the argument Konst is FALSE, then b is set equal to 0 and the values a are selected so that the relation y = ax.

Statistics- a boolean value that indicates whether to return additional regression statistics. If the argument Statistics has the meaning TRUE, then the function LINEST returns additional regression statistics. If the argument Statistics has the meaning LYING or omitted, then the function LINEST only returns the coefficient a and constant b.

It must be remembered that the result of the functions LINEST () is a set of values ​​- an array.

For calculation correlation coefficient function is used

CORREL(Array1;Array2),

returning the values ​​of the correlation coefficient, where Array1- array of values y, Array2- array of values x. Array1 and Array2 must be of the same dimension.

EXAMPLE 1... Addiction y(x) is presented in the table. Build regression line and calculate correlation coefficient.

y 0.5 1.5 2.5 3.5
x 2.39 2.81 3.25 3.75 4.11 4.45 4.85 5.25

Let's enter a table of values ​​into a MS Excel sheet and build a scatter plot. The worksheet will take the form shown in fig. 2.

To calculate the values ​​of the regression coefficients a and b allocate cells A7: B7, turn to the function wizard and in the category Statistical choose the function LINEST... Fill in the appeared dialog box as shown in Fig. 3 and press Ok.


As a result, the calculated value will appear only in the cell A6(fig. 4). In order for the value to appear in the cell B6 it is necessary to enter the editing mode (key F2) and then press the key combination CTRL + SHIFT + ENTER.

To calculate the value of the correlation coefficient per cell C6 the following formula was introduced:

C7 = CORREL (B3: J3; B2: J2).

Knowing the regression coefficients a and b calculate the values ​​of the function y=ax+b for given x... To do this, we introduce the formula

B5 = $ A $ 7 * B2 + $ B $ 7

and copy it to the range C5: J5(fig. 5).

Let's draw the regression line on the diagram. Select the experimental points on the graph, right-click and select the command Initial data... In the dialog box that appears (Fig. 5), select the tab Row and click on the button Add... Fill in the input fields, as shown in Fig. 6 and press the button OK... A regression line is added to the experimental data plot. By default, its graph will be displayed as points not connected by smoothing lines.



To change the appearance of the regression line, follow these steps. Right-click on the points representing the line graph, select the command Chart type and set the view of the scatter plot, as shown in Fig. 7.

You can change the line type, color and thickness as follows. Select a line in the diagram, right-click and select the command Data series format ... Next, make settings, for example, as shown in Fig. eight.

As a result of all transformations, we will receive a graph of experimental data and a regression line in one graphical area (Fig. 9).

4.2. Using a trend line.

The construction of various approximating dependencies in MS Excel is implemented as a chart property - trend line.

EXAMPLE 2... As a result of the experiment, some tabular dependence was determined.

0.15 0.16 0.17 0.18 0.19 0.20
4.4817 4.4930 5.4739 6.0496 6.6859 7.3891

Select and build an approximating dependence. Build graphs of tabular and selected analytical dependence.

The solution to the problem can be broken down into the following steps: inputting initial data, building a dot plot and adding a trend line to this plot.

Let's consider this process in detail. Let's enter the raw data into the worksheet and plot the experimental data. Next, select the experimental points on the graph, right-click and use the command Add l trend initiative(fig. 10).

The dialog box that appears allows you to build an approximating dependence.

The first tab (Fig. 11) of this window indicates the type of the approximating dependence.

On the second (Fig. 12), the construction parameters are determined:

· The name of the approximating dependence;

Forecast forward (backward) by n units (this parameter determines how many units forward (backward) it is necessary to extend the trend line);

Whether to show the point of intersection of a curve with a straight line y = const;

· Show the approximating function on the diagram or not (the option to show the equation on the diagram);

· Whether to place the value of the standard deviation on the diagram or not (place the parameter on the diagram the value of the approximation reliability).

Let us choose a polynomial of the second degree as an approximating dependence (Fig. 11) and derive the equation describing this polynomial on the graph (Fig. 12). The resulting diagram is shown in Fig. 13.

Similarly, using trend lines you can choose the parameters of such dependencies as

Linear y=a ∙ x+b,

Logarithmic y=a ∙ ln(x)+b,

Exponential y=a ∙ e b,

Power-law y=a ∙ x b,

Polynomial y=a ∙ x 2 +b ∙ x+c, y=a ∙ x 3 +b ∙ x 2 +c ∙ x + d and so on, up to a polynomial of the 6th degree inclusive,

· Linear filtration.

4.3. Using the Option Analysis Tool: Finding a Solution.

Of considerable interest is the implementation in MS Excel of the selection of the parameters of the functional dependence by the least squares method using the option analysis tool: Search for a solution. This technique allows you to choose the parameters of a function of any kind. Let's consider this possibility using the example of the following problem.

EXAMPLE 3... As a result of the experiment, the dependence z (t) was obtained, presented in the table

0,66 0,9 1,17 1,47 1,7 1,74 2,08 2,63 3,12
38,9 68,8 64,4 66,5 64,95 59,36 82,6 90,63 113,5

Select dependency coefficients Z (t) = At ​​4 + Bt 3 + Ct 2 + Dt + K the least squares method.

This problem is equivalent to the problem of finding the minimum of a function of five variables

Let's consider the process of solving the optimization problem (Fig. 14).

Let the values A, V, WITH, D and TO stored in cells A7: E7... Let's calculate the theoretical values ​​of the function Z(t)=At 4 + Bt 3 + Ct 2 + Dt + K for given t(B2: J2). To do this, in the cell B4 enter the value of the function at the first point (cell B2):

B4 = $ A $ 7 * B2 ^ 4 + $ B $ 7 * B2 ^ 3 + $ C $ 7 * B2 ^ 2 + $ D $ 7 * B2 + $ E $ 7.

Let's copy this formula to the range C4: J4 and get the expected value of the function at the points, the abscissas of which are stored in the cells B2: J2.

Into the cell B5 we introduce a formula that calculates the square of the difference between the experimental and calculated points:

B5 = (B4-B3) ^ 2,

and copy it to the range C5: J5... In a cell F7 we will store the total squared error (10). To do this, we introduce the formula:

F7 = SUM (B5: J5).

Let's use the command Service®Search for a solution and solve the optimization problem without restrictions. Fill in the input fields in the dialog box shown in Fig. 14 and press the button Execute... If a solution is found, then the window shown in Fig. 15.

The result of the work of the decision block will be output to cells A7: E7parameter values functions Z(t)=At 4 + Bt 3 + Ct 2 + Dt + K... In cells B4: J4 get expected function value at the starting points. In a cell F7 will be kept total squared error.

You can draw experimental points and a fitted line in the same graphics area by selecting a range B2: J4, call Chart Wizard and then format the appearance of the resulting graphs.

Rice. 17 displays the MS Excel worksheet after the calculations.

Share with friends or save for yourself:

Loading...