About an Unbiased Estimate of the Gradient with Minimum Variance in the Planning of the Experiment

This paper considers extremum-search algorithms for solving the problem of planning an experiment by the gradient method. A feature of these algorithms is that the search (for a maximum) moves not in the direction of the gradient, which is unknown to us, but in the direction of its estimate. The estimate of the gradient at a point of the factor space is based on the results of measurements carried out in the neighborhood of that point. The researcher's task is to build a sensible plan centered at the point in order to determine the estimate of the gradient there.


Introduction
The task of finding the extremum of a mean characteristic is usually the task of finding the extremum of the response function $\eta = f(x_1, x_2, \ldots, x_k)$. The search for the extremum of the response function is carried out by studying the response surface, that is, by measuring it at different points of the factor space.
The known methods for finding the extremum of a function of many variables cannot be applied directly for this purpose, because the "measurement" of the response function at each point of the factor space where an experiment is run is subject to error.
1. One of the best-known classes of gradient methods for finding extrema of the response function in the practice of experimental planning is the method developed by Box and Wilson [1,2]. Its idea is to use the method of steepest ascent in conjunction with a series of planned factorial experiments that provide the gradient estimate. When gradient methods are applied to the search for extrema of the response function, one of the most important problems is the statistical estimation of the gradient components [3][4][5]. Therefore, in describing the method of Box and Wilson, this problem is considered in the most detail. The study of the statistical estimation of the gradient during the search is of great importance for understanding the use of gradient methods in planning the experiment. In general, the method of Box and Wilson consists of repeating the following procedure: (a) building a factorial experiment [3] in the neighborhood of a point; (b) calculating the gradient estimate at this point from the results of the experiment; (c) finding an estimate of the maximum (minimum) of the response function in this direction. In the method of Box and Wilson, as in other extremum-search methods based on the gradient method, it is not the gradient that is used but its estimate. We present a general formulation of the gradient estimation problem and its solution.
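The three-step procedure can be illustrated with a minimal Python sketch. It is a simplified illustration under stated assumptions, not the authors' exact procedure: the function names, the full $2^k$ design, the fixed move length, the stopping rule, and the test response `noisy_response` are all hypothetical choices made for the example.

```python
import itertools
import numpy as np


def full_factorial(k):
    """All 2**k runs of a two-level full factorial design in coded units (-1, +1)."""
    return np.array(list(itertools.product([-1.0, 1.0], repeat=k)))


def estimate_gradient(measure, center, step, rng):
    """OLS slopes of a first-order model fitted to a 2**k design around `center`."""
    design = full_factorial(len(center))
    y = np.array([measure(center + step * row, rng) for row in design])
    X = np.column_stack([np.ones(len(design)), design])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:] / step  # slope per coded unit -> slope per natural unit


def climb(measure, x, direction, move, rng, max_moves=50):
    """Move along `direction` while the measured response keeps improving."""
    best_x, best_y = x, measure(x, rng)
    for _ in range(max_moves):
        candidate = best_x + move * direction
        y = measure(candidate, rng)
        if y <= best_y:
            break
        best_x, best_y = candidate, y
    return best_x


def steepest_ascent_search(measure, start, step=0.5, move=0.1, cycles=20, seed=0):
    """Repeat: build design, estimate gradient, climb along the estimate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(start, dtype=float)
    for _ in range(cycles):
        g = estimate_gradient(measure, x, step, rng)
        norm = np.linalg.norm(g)
        if norm == 0.0:
            break
        x = climb(measure, x, g / norm, move, rng)
    return x


def noisy_response(x, rng, sigma=0.01):
    """Hypothetical response with its maximum at (2, -1), observed with error."""
    return 10.0 - (x[0] - 2.0) ** 2 - 2.0 * (x[1] + 1.0) ** 2 + rng.normal(0.0, sigma)
```

Calling `steepest_ascent_search(noisy_response, [0.0, 0.0])` moves the current point from the origin toward the neighborhood of the maximum. Note that every move uses the estimated gradient, never the true one.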
Let the response function  [3]. It is obvious that the encoded variable

The Task of Estimating the Gradient
By the task of estimating the gradient we mean finding some estimate of the gradient of the response function $f(x_1, x_2, \ldots, x_k)$. … we can write the response function …. It should be noted that grad … for the case when … estimates its gradient at this point.
In estimating the gradient for the problem of finding an extremum, it is assumed that near the center of the plan of the factorial experiment the response function is approximated accurately enough by a hyperplane. In real situations this condition is often not fulfilled, and the estimates of the gradient components may then be biased. There is a relationship between the estimates of the gradient components and the estimates of the parameters of the model used to describe the response surface. Knowing the bias of the estimates of the model parameters, we can find the bias of the gradient estimate. This implies a conclusion that is extremely important for practical research: even the use of crude models to describe the response surface need not bias the gradient estimate at the center of the plan. This fact partly explains the relatively successful application of the method of Box and Wilson in applied research with a relatively crude approximation of the response surface.
The problem of unbiased estimation of the gradient (4) near the center of the plan [6] is to find unbiased estimates of the components of the gradient …, where …. Consider the case where …. Obviously, … is the matrix of a factorial plan of type $2^{k-q}$, and conditions (8), (9) are also satisfied. Write the response function (7) as …. Then it is not hard to verify that the columns of the matrix of independent variables (10) satisfy the conditions …. It follows that the matrix $X$ can be represented as a block matrix [1] …. If … is the matrix of the full plan $2^k$, then conditions (11), (12) hold for any $k$, and therefore an unbiased OLS estimate of the gradient exists and is defined by (13).
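The link between the bias of the model parameter estimates and the bias of the gradient estimate can be made explicit with a standard least-squares identity. The notation here is an assumption introduced for illustration and is not taken from the numbered equations of the paper: $X_1$ holds the columns of the fitted first-order model, $X_2$ the columns of the terms omitted from it, and $\beta_1$, $\beta_2$ the corresponding coefficients.

```latex
\begin{aligned}
y &= X_1\beta_1 + X_2\beta_2 + \varepsilon, \qquad \mathbb{E}\,\varepsilon = 0,\\
\hat\beta_1 &= (X_1^{\top}X_1)^{-1}X_1^{\top}y,\\
\mathbb{E}\,\hat\beta_1 &= \beta_1 + (X_1^{\top}X_1)^{-1}X_1^{\top}X_2\,\beta_2 .
\end{aligned}
```

Under this reading, the slope estimates are unbiased for every $\beta_2$ exactly when the rows of $(X_1^{\top}X_1)^{-1}X_1^{\top}X_2$ that correspond to the slopes vanish, which holds in particular when the columns of $X_1$ and $X_2$ are mutually orthogonal.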
Corollary 2. If conditions (11) and (12) …. In this case the OLS estimates of the gradient for models (7) and (13) coincide.
Note that conditions (11), (12) hold only when the resolution of the fractional replicate $2^{k-q}$ is greater than three [2]. We therefore reformulate Lemma 1 with this in mind. Lemma 2. The use of a fractional replicate of type $2^{k-q}$ makes it possible to obtain an unbiased OLS estimate of the gradient … has a resolution equal to four [1], and therefore makes it possible to obtain an unbiased OLS estimate of the gradient at the center of the plan. Indeed, the defining contrasts of the fractional replicate [2,5] …. These examples illustrate a general approach to solving the problem of estimating the gradient in situations where the model describes the response surface only approximately [3][4][5]. It follows from this discussion that in some practically important cases [3,4] the gradient estimates for models of varying complexity coincide. Comparing the effectiveness of plans for estimating the gradient is a much more complex problem.
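The role of resolution can be checked numerically. The sketch below is an illustration under assumptions, not the paper's own derivation: it assumes that the true response contains two-factor interactions while only a first-order model is fitted, and it uses the standard alias matrix $(X_1^{\top}X_1)^{-1}X_1^{\top}X_2$ to test whether the slope estimates are biased. The generators $x_3 = x_1 x_2$ and $x_4 = x_1 x_2 x_3$ are conventional textbook choices, and all function names are hypothetical.

```python
import itertools
import numpy as np


def full_factorial(k):
    """All 2**k runs of a two-level full factorial design in coded units."""
    return np.array(list(itertools.product([-1, 1], repeat=k)))


def two_factor_interactions(design):
    """Columns x_i * x_j for all pairs i < j."""
    k = design.shape[1]
    cols = [design[:, i] * design[:, j]
            for i, j in itertools.combinations(range(k), 2)]
    return np.column_stack(cols)


def alias_matrix(design):
    """(X1'X1)^-1 X1'X2: fitted first-order terms vs. omitted interactions."""
    X1 = np.column_stack([np.ones(len(design)), design])
    X2 = two_factor_interactions(design)
    return np.linalg.solve(X1.T @ X1, X1.T @ X2)


def gradient_is_unbiased(design):
    """True if no two-factor interaction leaks into any slope estimate."""
    A = alias_matrix(design)
    return bool(np.allclose(A[1:], 0.0))  # row 0 is the intercept


base2 = full_factorial(2)
res3 = np.column_stack([base2, base2[:, 0] * base2[:, 1]])  # 2^(3-1), x3 = x1*x2

base3 = full_factorial(3)
res4 = np.column_stack([base3, base3.prod(axis=1)])         # 2^(4-1), x4 = x1*x2*x3

full3 = full_factorial(3)                                    # full 2^3 plan
```

Here `gradient_is_unbiased` is true for the full plan `full3` and for the resolution IV fraction `res4`, and false for the resolution III fraction `res3`, where each slope is aliased with a two-factor interaction. The check covers two-factor interactions only; higher-order terms would need additional columns in `X2`.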

Examples of Specific Implementations
2. Assume that the distribution of the directions of grad $f(x)$ is known to belong to some parametric family of distributions. For a continuous random variable this means that the form of the density $p(\operatorname{grad} f(x) \mid \theta)$ is known, but the value of the parameter $\theta$ that determines the specific density is not [5,7]. The parameter $\theta$ may be a vector; for example, for a normal distribution $\theta = (\mu, \sigma)$. … $(\ldots, x_n)$ such that, when the sample data on the gradient directions are substituted for the arguments $x_1, \ldots, x_n$, we obtain numbers "close" to $\tau(\theta)$. This closeness can hold only on average. Therefore the requirements on the quality of the gradient estimates are formulated in probabilistic terms, relating to the distribution of the estimates regarded as random variables [7,8]. The requirement that in most experiments the estimate be close to the value of the estimated parameter can be formulated as the following definition. As a rule, however, the unbiasedness requirement alone does not determine an estimate uniquely. The next desirable requirement is therefore that the variance of the estimate be minimal [7]. For a statistic to serve as a good estimate of $\tau(\theta)$, its distribution must be concentrated sufficiently close to the unknown value $\tau(\theta)$, so that the probability of large deviations of the statistic from $\tau(\theta)$ is small [7]. Then, with systematic repeated use of this statistic as an estimate of the characteristic, sufficient accuracy will be obtained on average. Large deviations will have small probability and will be rare.
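The two requirements, unbiasedness and small variance, can be observed in a small simulation. This is a sketch under assumptions that are not in the paper: a linear response, independent normal errors with standard deviation `sigma`, and a replicated full $2^k$ plan; the function name and parameters are hypothetical. For such an orthogonal plan with $N$ runs, the variance of each OLS slope estimate is $\sigma^2 / N$.

```python
import itertools
import numpy as np


def simulate_slopes(true_slopes, sigma, replicates, trials, seed=0):
    """Repeat the experiment `trials` times and return the OLS slope estimates."""
    rng = np.random.default_rng(seed)
    k = len(true_slopes)
    base = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    design = np.tile(base, (replicates, 1))             # N = replicates * 2**k runs
    X = np.column_stack([np.ones(len(design)), design])
    beta = np.concatenate([[1.0], true_slopes])          # intercept 1.0 is arbitrary
    noise = rng.normal(0.0, sigma, size=(trials, len(design)))
    Y = X @ beta + noise                                 # one row per repeated experiment
    coef = np.linalg.lstsq(X, Y.T, rcond=None)[0]        # shape (k + 1, trials)
    return coef[1:].T                                    # shape (trials, k)


estimates = simulate_slopes([2.0, -1.0], sigma=1.0, replicates=1, trials=4000)
mean_estimate = estimates.mean(axis=0)      # close to the true slopes
variance_estimate = estimates.var(axis=0)   # close to sigma**2 / 4
```

The sample mean of the estimates stays near the true slopes (unbiasedness), while the sample variance shrinks as `replicates` grows, so the distribution of the estimate concentrates around the true value.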
Thus, consistency of the gradient estimate means that, for a sufficiently large number of observations $n$, the deviation of the gradient estimate from the true parameter value is, with probability arbitrarily close to one, less than any pre-specified value.
For an unbiased estimate of the gradient to be consistent, it is sufficient that the variance of the estimate tend to zero as $n \to \infty$ (this follows from the Chebyshev inequality [7]). … grad $f(x)$ [7], for which the last equality would hold for all $\theta > 0$. … These examples allow us to move on to more complex tasks [11], namely to the problems of comparing the effectiveness of plans for estimating the gradient.
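The step from vanishing variance to consistency can be written out explicitly. The symbol $T_n$ is an assumption introduced here for an unbiased estimate built from $n$ observations, so that $\mathbb{E}_\theta T_n = \tau(\theta)$; it is not notation recovered from the paper.

```latex
P_\theta\bigl(\,|T_n - \tau(\theta)| \ge \varepsilon\,\bigr)
  \;\le\; \frac{\operatorname{Var}_\theta T_n}{\varepsilon^{2}}
  \;\xrightarrow[n \to \infty]{}\; 0
  \qquad \text{for every } \varepsilon > 0 .
```

The inequality is Chebyshev's, applied to $T_n$ with mean $\tau(\theta)$. The right-hand side tends to zero whenever $\operatorname{Var}_\theta T_n \to 0$, which is exactly the definition of consistency.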

Conclusions
Thus, an unbiased estimate of the gradient with minimum variance does not always exist, since the variance of such an estimate must be minimal uniformly in the parameter. Consistency of the gradient estimate means that, for a sufficiently large number of observations $n$, the deviation of the gradient estimate from the true parameter value is, with probability arbitrarily close to one, less than any pre-specified value. For an unbiased estimate of the gradient to be consistent, it is sufficient that the variance of the estimate tend to zero as $n \to \infty$.