A New Method to Estimate Parameters in the Simple Regression Linear Equation

Linear regression is widely used in various fields. Research on linear regression generally uses the OLS and ML methods to estimate its parameters. Both methods require many assumptions to be fulfilled, and it is frequently found that some assumption is not met, so that neither method can be used successfully. This paper proposes a new method that does not require any assumption as a precondition. The new method, called SAM (Simple Averaging Method), estimates the parameters in the simple linear regression model and may be used without fulfilling the assumptions of the regression model. Three new theorems are formulated to simplify the estimation of parameters in the simple linear regression model with SAM. Using the same data as for OLS, the parameters of the simple linear regression model are estimated with SAM. The results show that the regression parameters obtained do not differ greatly. To measure the accuracy of both methods, the errors made by each method are compared using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE). Based on the RMSE and MAE values for both methods, SAM may be used to estimate the parameters in the regression equation. The advantage of SAM is that it is free from the assumptions required by regression, such as the error normality assumption, under which the data should come from a normal distribution.


Introduction
Linear regression is widely used in various fields. Research on linear regression uses the OLS method in estimating its parameters. The OLS method requires many assumptions that need to be met. In this paper, a new method is offered that does not require the fulfillment of any assumptions. In terms of calculations, the new method is very practical and does not require high-level mathematics, making it suitable for use by researchers in various fields of research.
The relationship between two or more variables can be analyzed through association and through regression. Association is a tenuous or weak relationship (Sembiring, 2003), because in association the values of x and y are only paired, without knowing the form of the relationship between the two variables.
Two variables can be independent of one another or completely interdependent. If the two variables are mutually independent, the correlation is zero. If the two variables are completely interdependent and the relationship is linear (the two are then called collinear), the absolute value of the correlation is one.
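The collinear case can be checked numerically. The following is a minimal sketch, assuming the correlation meant here is the sample Pearson correlation coefficient; the function name pearson_r and the data values are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only (assumption, not data from the paper).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0 * v + 1.0 for v in x]     # exact increasing linear relation
y_down = [-3.0 * v + 5.0 for v in x]  # exact decreasing linear relation

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

For an exact linear relation the coefficient is +1 or -1 depending on the sign of the slope, so its absolute value is one in both cases.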
Research generally uses mathematical models, which are a simplification and abstraction of the actual natural state. The natural conditions under study are generally very complicated, and the ability to examine them as a whole is very limited. For this reason, simplifications are needed according to our ability.
According to Sembiring (2003), a model is used to understand, explain, control, and predict the behavior of the observed system. In prediction, a distinction is made between interpolation and extrapolation.
The created model illustrates a causal (cause-and-effect) relationship between two or more variables, so that the model shows a functional relationship between the variables from which it is constructed.
Predictions about the relationship between variables in the observed system are used to formulate the behavior of the system in various situations. Thus, the model created is a theory about the workings of the system under study.
Furthermore, the relationship is formulated as a hypothesis and tested against the statistical data collected. This approach is inductive, as opposed to the axiomatic (deductive) approach.
The model is a functional relationship between the variables that build it, so the model takes the form of a function, and regression is a powerful tool in its formation. Two types of data are used. The first is laboratory data, which are collected under control: interfering variables are held unchanged so that they have no effect, and the effect of the investigated variables can be observed more cleanly. As a result, such data no longer describe the natural state. The second is field data, which describe natural conditions and contain the influence of many variables that work together in a very complicated way.
The created model may take the form of a function, and regression is a powerful tool in its creation. Linear regression is frequently used in various fields. One model that relates the independent variable to the dependent variable is the simple linear regression model, which is used in many fields. Research on simple linear regression uses the OLS (Ordinary Least Squares) or ML (Maximum Likelihood) method to estimate the parameters. Some researchers, including Chioma (2009), note that the OLS and ML methods require many assumptions to be fulfilled. It is frequently found that some assumption is not met, so that neither method can be used successfully. This paper proposes a new method that does not require any assumption as a precondition. From the calculation point of view, the new method is highly practical and does not require advanced mathematics, so it is suitable for researchers conducting research in various fields.

Materials and Methods
In the initial stage, simple linear regression is explained, together with the estimation of its parameters using the OLS method and the assumptions that must be fulfilled for the method to be used. Twelve sample data pairs are used to obtain the simple linear regression equation with the OLS method.
Next, the SAM (Simple Averaging Method) developed by Cliff and Billy (2017) for estimating the parameters in the simple linear regression model is explained. Using the same data, the parameters of the simple linear regression model are estimated with SAM. The results show that the regression parameters obtained do not differ greatly. To measure the accuracy of both methods, the errors made by each method are compared using the RMSE (Root Mean Square Error) and MAE (Mean Absolute Error). There is no significant difference in the measured errors, so SAM may serve as an alternative method for simple linear regression parameter estimation.
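The error comparison described above can be sketched as follows. This is only an illustration, assuming the usual definitions in which RMSE is the square root of the mean squared error and MAE is the mean of the absolute errors, both divided by the number of observations n. The function names and all numbers are hypothetical; they are not the twelve data pairs or the fitted values of the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt of the mean of squared differences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Hypothetical observations and fitted values from two estimation methods.
y_obs = [2.0, 4.0, 6.0, 8.0]
fit_a = [2.5, 3.5, 6.5, 7.5]   # four small errors of size 0.5
fit_b = [2.0, 4.0, 6.0, 10.0]  # one large error of size 2.0

rmse_a, mae_a = rmse(y_obs, fit_a), mae(y_obs, fit_a)
rmse_b, mae_b = rmse(y_obs, fit_b), mae(y_obs, fit_b)
print(rmse_a, mae_a)
print(rmse_b, mae_b)
```

In this made-up case both fits have the same MAE, while the RMSE of the second fit is larger because squaring weights its single large error more heavily. This is one reason to report both measures when two estimation methods are compared.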
In the next stage, the method offered by Cliff and Billy (2017) is simplified in the form of a mathematical equation. Thus, three new theorems are constructed in order to estimate the simple linear regression parameters using SAM. The proofs for those theorems are given. The obtained three theorems may be used to estimate the simple linear regression parameters free from any assumption.

Results
The simplest linear regression model is a straight line. In this case there is an independent variable, named x, and a dependent variable that depends on x, named y.
The dependent variable y is sometimes called the response variable. The independent variable x is called the predictor variable.
An example of a problem that can be addressed with a simple linear regression model is a student's achievement in mathematics (variable y), which is determined by the length of education the teacher has received (variable x). In this model it is assumed that the length of a teacher's education affects the mathematics achievement of students. The hypothesis is that the longer the teacher's education, the better the student's achievement will be. As a result, the graph of the linear regression function will simply be an upward-sloping straight line. To obtain the equation and graph of the regression function, statistical data are then collected.
In this example, x denotes the education factor, where x = 0 means never having attended school and x = 16 means having attended 16 years of school (graduating with an undergraduate degree). Suppose student performance y is measured on a scale from 0 to 10 and a simple linear regression model is used.
The simple linear regression model represents the probability density function (pdf) of y if x is known. Thus, one or more samples are taken from a population with a normal distribution. In population theory, Equation (1) can be written as Equation (2), in which the intercept is not so important to hypothesize about, because it can always be obtained by a translation.

Linear Regression Parameter Estimation with OLS Method
Equation (2) will be estimated from the sample data shown in Table 1. The solution with the OLS method is obtained using Equations (4) and (5).
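As an illustration of OLS estimation for a simple linear regression, the following minimal sketch uses the standard closed-form least squares estimates: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. It is assumed, not confirmed by the text available here, that Equations (4) and (5) correspond to these textbook formulas. The function name ols_simple and the data are hypothetical and are not the values of Table 1.

```python
def ols_simple(x, y):
    """Closed-form OLS estimates (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x and sum of cross-products with y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: years of teacher education (x) and student score (y).
x = [0.0, 4.0, 8.0, 12.0, 16.0]
y = [2.0, 4.0, 5.0, 7.0, 9.0]

b0, b1 = ols_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
print(b0, b1)
```

For these made-up values the fitted line is y = 2.0 + 0.425 x, and the residuals sum to zero, which is a property of any OLS fit that includes an intercept.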

Linear Regression Parameter Estimation with Simple Averaging Method
Cliff and Billy (2017)

Similar Formula in SAM
This part presents two equations that may be used to estimate the parameter of Equation (8) using SAM: Equation (10) and Equation (11). Both may be used as alternatives to Equation (8). The following is an example calculation with Equations (10) and (11), using the data shown in Table 1. In this case, one parameter is still calculated using Equation (7), and the other is calculated using Equation (10); the results are presented in Table 2.
The obtained regression equation is

SAM Simplification
The estimation given by Cliff and Billy (2017)
Theorem 1:
The estimation of the parameter for SAM in Equation (8) may be stated in the form of

Proof:
It will be shown that the stated form holds. Both simple averaging methods obtain one of the estimates in the same way, namely with Equation (7).
However, the calculation of the other estimate differs. In the first method, it is calculated with Equation (9), while in Table 2 it is calculated with Equation (10). Furthermore, it will be shown that Equations (10) and (11) can be brought into the form of the equation in Theorem 1, so that Equations (10) and (11) are identical to Equation (8).

Theorem 2:
The estimation of parameter  for SAM in equation

Conclusion
Cliff and Billy (2017) have given a simple linear regression parameter estimation method in the equation (7) and (8)