Liu-Type Logistic Estimator under Stochastic Linear Restrictions

To overcome the multicollinearity problem in logistic regression, many alternative estimators have been proposed in the literature for the case where linear restrictions on the parameter space are available in addition to the sample model. In this paper, we propose a new two-parameter Liu-type estimator, called the Stochastic Restricted Liu-Type Logistic Estimator (SRLTLE), by combining the Liu-type estimator with the logistic model in the presence of stochastic linear restrictions. Further, a Monte Carlo simulation study is carried out to compare the performance of the proposed estimator with that of some existing estimators in the scalar mean squared error (SMSE) sense, and a numerical example is given to illustrate the theoretical results.


INTRODUCTION
It is well known that the Maximum Likelihood Estimator (MLE) of the parameters in the logistic regression model is highly affected by multicollinearity among the explanatory variables. As a consequence, the variance of the MLE is inflated, and hence inefficient estimates may be produced. To tackle this issue in logistic regression, many scholars have proposed biased estimators as alternatives to the MLE. These estimators fall into three types: (i) biased estimators based only on sample information, (ii) biased estimators based on sample information and exact linear restrictions as prior information, and (iii) biased estimators based on sample information and stochastic linear restrictions as prior information.
Some of the biased estimators proposed in the literature under the first type are the Logistic Ridge Estimator (LRE) (Schaefer et al., 1984), the Principal Component Logistic Estimator (PCLE) (Aguilera et al., 2006), the Modified Logistic Ridge Estimator (MLRE) (Nja et al., 2013), the Logistic Liu Estimator (LLE) (Mansson et al., 2012), the Liu-Type Logistic Estimator (LTLE) (Inan and Erdogan, 2013), the Almost Unbiased Ridge Logistic Estimator (AURLE) (Wu and Asar, 2016), the Almost Unbiased Liu Logistic Estimator (AULLE) (Xinfeng, 2015) and the Optimal Generalized Logistic Estimator (OGLE) (Varathan and Wijekoon, 2017). When exact linear restrictions are available in addition to the sample logistic model (second type), the Restricted Maximum Likelihood Estimator (RMLE) by Duffy and Santner (1989), the Restricted Logistic Liu Estimator (RLLE) by Siray et al. (2015), the Modified Restricted Liu Estimator by Wu (2015), and the Restricted Logistic Ridge Estimator (RLRE) and the Restricted Liu-Type Logistic Estimator (RLTLE) by Asar et al. (2016) have been proposed in the literature. When the restrictions on the parameters are stochastic (third type), Nagarajah and Wijekoon (2015) introduced the Stochastic Restricted Maximum Likelihood Estimator (SRMLE) and derived the conditions under which the SRMLE is superior to the LRE, LLE and RMLE. The LRE and LLE were further improved in the presence of stochastic restrictions by the introduction of the Stochastic Restricted Ridge Maximum Likelihood Estimator (SRRMLE) (Varathan and Wijekoon, 2016a) and the Stochastic Restricted Liu Maximum Likelihood Estimator (SRLMLE) (Varathan and Wijekoon, 2016b).
Comparing the above estimators, it can be noted that incorporating stochastic linear restrictions into the sample model (i.e. the third type) improves the estimators further (Nagarajah and Wijekoon, 2015; Varathan and Wijekoon, 2016a; Varathan and Wijekoon, 2016b). This observation motivated us to propose a new estimator under stochastic linear restrictions in order to improve estimation in the logistic model. Hence, by adding stochastic restrictions as prior information to the LTLE, a new estimator, namely the Stochastic Restricted Liu-Type Logistic Estimator (SRLTLE), is proposed in this research. The rest of the paper contains the model specification and estimators, the proposed SRLTLE and its stochastic properties, scalar mean square error comparisons, a Monte Carlo simulation study and a numerical example to discuss the performance of the new estimator, and finally some concluding remarks.

Model Specification and estimators
Consider the general logistic regression model
$$y_i = \pi_i + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$
which follows the Bernoulli distribution with parameter $\pi_i$ given by
$$\pi_i = \frac{\exp(x_i\beta)}{1 + \exp(x_i\beta)}, \tag{2}$$
where $x_i$ is the $i$th row of $X$, an $n \times p$ data matrix with $p$ explanatory variables, $\beta$ is a $p \times 1$ vector of coefficients, and the $\varepsilon_i$ are independent with mean zero and variance $\pi_i(1 - \pi_i)$ of the response $y_i$. The maximum likelihood estimate (MLE) of $\beta$ is given by (3); it is an asymptotically unbiased estimate of $\beta$, and its covariance matrix is given by (4). The MSE and SMSE of the MLE $\hat{\beta}$ are given by (5) and (6). Since $C$ is a positive definite matrix, there exists an orthogonal matrix that diagonalizes it.
Maximum likelihood is the preferred technique for estimating the parameters in logistic regression. However, the variance of the MLE becomes inflated when multicollinearity is present. As stated in the introduction, under the first type of estimators, the Logistic Ridge Estimator (LRE) (Schaefer et al., 1984), the Logistic Liu Estimator (LLE) (Mansson et al., 2012) and the Liu-Type Logistic Estimator (LTLE) (Inan and Erdogan, 2013) are defined in (7)-(9). As an alternative technique to stabilize the variance of the estimator under multicollinearity, one can use prior information, if available, in addition to the sample model (1), either as exact linear restrictions or as stochastic linear restrictions.
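A minimal sketch of these estimators, assuming the closed forms commonly given in the cited literature: with $C = X'\hat{W}X$ and $\hat{W} = \mathrm{diag}[\hat{\pi}_i(1 - \hat{\pi}_i)]$, the LRE is $(C + kI)^{-1}C\hat{\beta}_{MLE}$, the LLE is $(C + I)^{-1}(C + dI)\hat{\beta}_{MLE}$ and the LTLE is $(C + kI)^{-1}(C - dI)\hat{\beta}_{MLE}$. The function names, the no-intercept model, the synthetic data and the values of k and d are assumptions made for illustration only.

```python
import numpy as np

def logistic_mle(X, y, n_iter=100, tol=1e-10):
    """Logistic MLE (no intercept) via iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = pi * (1.0 - pi)                  # diagonal of W-hat
        z = eta + (y - pi) / w               # working response Z
        C = X.T @ (w[:, None] * X)           # C = X' W-hat X
        beta_new = np.linalg.solve(C, X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Recompute C at the final estimate so it matches the returned beta.
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    C = X.T @ ((pi * (1.0 - pi))[:, None] * X)
    return beta, C

def shrinkage_estimators(beta_mle, C, k, d):
    """Assumed closed forms of the LRE, LLE and LTLE built from the MLE."""
    I = np.eye(C.shape[0])
    lre = np.linalg.solve(C + k * I, C @ beta_mle)
    lle = np.linalg.solve(C + I, (C + d * I) @ beta_mle)
    ltle = np.linalg.solve(C + k * I, (C - d * I) @ beta_mle)
    return lre, lle, ltle

# Synthetic data (hypothetical, not the data used in the paper).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
beta_true = np.array([0.5, -0.5, 0.5])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))

beta_mle, C = logistic_mle(X, y)
lre, lle, ltle = shrinkage_estimators(beta_mle, C, k=0.5, d=0.5)
print(beta_mle, lre, lle, ltle)
```

With k = 0 and d = 0 the LRE and the LTLE both reduce to the MLE, which is a quick way to check an implementation.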
Suppose that the following stochastic linear prior information is given in addition to the general logistic regression model (1):
$$h = H\beta + \upsilon, \tag{10}$$
where $h$ is a $q \times 1$ stochastic known vector, $H$ is a $q \times p$ matrix of full rank ($q \le p$) with known elements, and $\upsilon$ is a $q \times 1$ random vector of disturbances with mean $0$ and dispersion matrix $\Psi$, which is assumed to be a known $q \times q$ positive definite matrix. Further, it is assumed that $\upsilon$ is stochastically independent of the disturbances $\varepsilon_i$ of model (1).
In the presence of exact linear restrictions on the regression coefficients ($\upsilon = 0$ in (10)) in addition to the logistic regression model (1) (second type), Duffy and Santner (1989) proposed the Restricted Maximum Likelihood Estimator (RMLE), given in (11). Later, following Duffy and Santner (1989), the Restricted Logistic Liu Estimator (RLLE) by Siray et al. (2015), and the Restricted Logistic Ridge Estimator (RLRE) and the Restricted Liu-Type Logistic Estimator (RLTLE) by Asar et al. (2016) were proposed in the presence of exact linear restrictions in addition to the sample model (1). These estimators are defined in (12), (13) and (14).
When the linear restrictions are stochastic as in (10), in addition to the logistic regression model (1) (third type), Nagarajah and Wijekoon (2015) proposed the Stochastic Restricted Maximum Likelihood Estimator (SRMLE).
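The way prior information of the form (10) is combined with the sample estimate can be sketched as follows, assuming the usual mixed-estimation form $\hat{\beta}_{MLE} + C^{-1}H'(\Psi + HC^{-1}H')^{-1}(h - H\hat{\beta}_{MLE})$ for the SRMLE, with the RMLE as the special case $\Psi = 0$. These closed forms, the function names and all numerical values below are assumptions for illustration, not expressions taken from the text above.

```python
import numpy as np

def srmle(beta_mle, C, H, h, Psi):
    """Assumed mixed-estimation form: correct the MLE towards h = H beta."""
    C_inv = np.linalg.inv(C)
    gain = C_inv @ H.T @ np.linalg.inv(Psi + H @ C_inv @ H.T)
    return beta_mle + gain @ (h - H @ beta_mle)

def rmle(beta_mle, C, H, h):
    """Exact restrictions correspond to a zero dispersion matrix."""
    q = H.shape[0]
    return srmle(beta_mle, C, H, h, np.zeros((q, q)))

# Hypothetical inputs: C plays the role of X' W-hat X (positive definite).
C = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
beta_mle = np.array([0.8, -0.3, 0.5])
H = np.array([[1.0, -1.0, 0.0],
              [1.0, 0.0, -1.0]])   # q = 2 restrictions on p = 3 coefficients
h = np.array([1.0, 0.0])
Psi = np.eye(2)

beta_srmle = srmle(beta_mle, C, H, h, Psi)
beta_rmle = rmle(beta_mle, C, H, h)
print(beta_srmle)
print(H @ beta_rmle)   # the exact restrictions hold for the RMLE
```

As $\Psi$ grows, the prior information carries less weight and the sketch returns values close to the MLE; as $\Psi$ shrinks to zero, the restrictions are imposed exactly.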
The asymptotic properties of SRRMLE:

RESULTS AND DISCUSSION
The New Proposed Estimator
Generally, estimators based on stochastic linear restrictions perform better than estimators based on exact linear restrictions. In addition, estimators based on two shrinkage parameters, k and d, improve on the performance of single-parameter estimators. An estimator based on two shrinkage parameters k and d under exact linear restrictions is available in the literature (Asar et al., 2016). Hence, in this research, we propose a new two-parameter Liu-type estimator for the case of stochastic linear restrictions, named the Stochastic Restricted Liu-Type Logistic Estimator (SRLTLE) and defined in (33). The asymptotic properties of the SRLTLE, and consequently its bias, MSE and SMSE, follow from (33).
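As a sketch of how the bias, MSE and SMSE arise, assume that the SRLTLE is obtained by applying the Liu-type shrinkage matrix of the LTLE to the SRMLE. This form, the symbols $Z_{k,d}$ and $R$, and the asymptotic moments of the SRMLE used below are assumptions made for illustration rather than expressions taken from the text above.

```latex
\begin{align*}
\hat{\beta}_{SRLTLE} &= Z_{k,d}\,\hat{\beta}_{SRMLE},
  \qquad Z_{k,d} = (C + kI)^{-1}(C - dI), \\
E(\hat{\beta}_{SRMLE}) &= \beta,
  \qquad \mathrm{Cov}(\hat{\beta}_{SRMLE}) = R = (C + H'\Psi^{-1}H)^{-1}, \\
\mathrm{Bias}(\hat{\beta}_{SRLTLE}) &= (Z_{k,d} - I)\beta
  = -(k + d)(C + kI)^{-1}\beta, \\
\mathrm{Cov}(\hat{\beta}_{SRLTLE}) &= Z_{k,d}\,R\,Z_{k,d}', \\
\mathrm{MSE}(\hat{\beta}_{SRLTLE}) &= Z_{k,d}\,R\,Z_{k,d}'
  + (k + d)^2 (C + kI)^{-1}\beta\beta'(C + kI)^{-1}, \\
\mathrm{SMSE}(\hat{\beta}_{SRLTLE}) &= \mathrm{tr}\,\mathrm{MSE}(\hat{\beta}_{SRLTLE}).
\end{align*}
```

The bias line uses $(C - dI) - (C + kI) = -(k + d)I$, so under this assumed form the bias vanishes when $d = -k$ and grows with $k + d$, while the covariance term shrinks as $k$ increases.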

Scalar Mean square error comparisons
In this section we compare the performance of the proposed estimator with respect to the SMSE criterion.
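For an estimator $\hat{\beta}$ of $\beta$, the Mean Square Error (MSE) matrix is $E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)']$, and the SMSE is its trace. Since the differences $\Delta$ between the mean square errors of the estimators cannot be easily examined analytically, we compare the performances of these estimators by using a simulation study in the next section.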

A Simulation study
To compare the performance of the proposed estimator with that of the existing estimators MLE, SRMLE, SRRMLE and SRLMLE, we perform a Monte Carlo simulation study at different levels of multicollinearity. The Scalar Mean Square Error (SMSE) criterion is used for the comparison. Following McDonald and Galarneau (1975) and Kibria (2003), we generate the explanatory variables as follows:
$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \quad i = 1, \ldots, n, \; j = 1, \ldots, p, \tag{42}$$
where the $z_{ij}$ are independent standard normal pseudo-random numbers and ρ is specified so that the theoretical correlation between any two explanatory variables is $\rho^2$. Four explanatory variables are generated using (42), and three values of ρ, namely 0.80, 0.90 and 0.99, are considered. Two sample sizes, n = 50 (large) and n = 15 (small), are considered. The dependent variable $y_i$ in (1) is obtained from the Bernoulli($\pi_i$) distribution, where $\pi_i$ is given by (2). The simulation is repeated 1000 times by generating new pseudo-random numbers, and the simulated SMSE values of the estimators are obtained using the following equation.
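The data-generating step can be sketched as below, assuming the McDonald and Galarneau construction in which each explanatory variable shares one common standard normal component. The function names, the seed and the coefficient vector `beta` are hypothetical choices for illustration; the coefficient values used in the study are not stated here.

```python
import numpy as np

def generate_explanatory(n, p, rho, rng):
    """x_ij = sqrt(1 - rho^2) * z_ij + rho * z_i,p+1 with z standard normal."""
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]

def generate_response(X, beta, rng):
    """Draw y_i from Bernoulli(pi_i) with the logistic link."""
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return rng.binomial(1, pi)

rng = np.random.default_rng(42)
beta = np.array([0.5, 0.5, 0.5, 0.5])   # hypothetical coefficients
X = generate_explanatory(n=50, p=4, rho=0.9, rng=rng)
y = generate_response(X, beta, rng)

# Sample correlations should be near rho^2 = 0.81, up to sampling noise.
print(np.round(np.corrcoef(X, rowvar=False), 2))
```

Each column has unit variance and any two columns share the term $\rho z_{i,p+1}$, which is why the pairwise correlation is $\rho^2$ rather than ρ.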
$$\widehat{\mathrm{SMSE}}(\hat{\beta}) = \frac{1}{1000} \sum_{r=1}^{1000} (\hat{\beta}_r - \beta)'(\hat{\beta}_r - \beta), \tag{44}$$
where $\hat{\beta}_r$ is the value of the estimator under consideration in the $r$th simulation.
The results of the simulation study are displayed in Tables A1-A6 (Appendix). In general, the estimated SMSE of every estimator increases as the correlation between the explanatory variables increases. For the sample size n = 50 (Tables A1-A3 in the Appendix), the proposed estimator SRLTLE performs well compared with the other estimators (MLE, SRMLE, SRRMLE and SRLMLE) for almost all values of k, and for d in the range 0.4 ≤ d ≤ 0.9 when ρ = 0.8 or 0.9, and 0.3 ≤ d ≤ 0.9 when ρ = 0.99. Further, for small values of k and d, SRLMLE performs better than the other estimators for all values of ρ. The specific ranges of k and d in which this occurs depend on ρ. For the small sample size n = 15 (Tables A4-A6 in the Appendix), the proposed estimator SRLTLE performs well compared with the other estimators when 0.3 ≤ d ≤ 0.9 and 0.1 ≤ k ≤ 0.9 for all of ρ = 0.8, 0.9 and 0.99. As in the large-sample case, for small values of k and d, SRLMLE performs better than the other estimators for all values of ρ, and again the specific ranges depend on ρ. Moreover, the MLE has the worst performance in all cases, with the largest SMSE values.
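The simulated SMSE is simply the average squared Euclidean distance between the replicated estimates and the true coefficient vector. A minimal sketch, with a hypothetical function name and toy numbers chosen only so the arithmetic is easy to check by hand:

```python
import numpy as np

def simulated_smse(estimates, beta):
    """Mean over replications of (beta_hat_r - beta)'(beta_hat_r - beta)."""
    diffs = np.asarray(estimates, dtype=float) - np.asarray(beta, dtype=float)
    return float(np.mean(np.sum(diffs**2, axis=1)))

# Two toy replications of a two-dimensional estimate.
beta = np.array([2.0, 2.0])
estimates = np.array([[1.0, 2.0],
                      [3.0, 2.0]])
print(simulated_smse(estimates, beta))  # 1.0
```

In a full study, `estimates` would hold one row per replication (1000 rows here) for a given estimator and a given pair (k, d).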

Numerical example
To examine the performance of the new estimator SRLTLE, we use a real data set taken from the Statistics Sweden website (http://www.scb.se/). This example was used in Mansson et al. (2012), Asar and Genc (2016), Wu and Asar (2016), and Varathan and Wijekoon (2016b) to illustrate the results of their papers. The data consist of information about 100 municipalities of Sweden. The explanatory variables considered in this study are Population ($x_1$), Number of unemployed people ($x_2$), Number of newly constructed buildings ($x_3$), and Number of bankrupt firms ($x_4$). The response variable ($y$) is defined as a binary indicator of net population change. The pairwise correlations of the explanatory variables $x_1$, $x_2$, $x_3$ and $x_4$ are very high (greater than 0.95). The corresponding VIF values for the data are 488.17, 344.26, 44.99 and 50.71; the VIF measures how much the variance of an estimated regression coefficient is inflated compared with the case in which the predictor variables are not linearly related. According to the literature, multicollinearity is high if VIF > 10. Hence, high multicollinearity clearly exists in this data set. Further, the condition number, another measure of multicollinearity, is 188, which indicates severe multicollinearity in this data set. Moreover, we use the same restrictions as in (43) for the prior information.
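The two diagnostics can be sketched as follows. The VIFs are computed as the diagonal of the inverse correlation matrix, and the condition number is taken here as the square root of the ratio of the largest to the smallest eigenvalue of the correlation matrix. The latter is one common definition and an assumption on our part, since the definition behind the value 188 is not stated. The data below are synthetic, not the Swedish municipality data.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

def condition_number(X):
    """sqrt(lambda_max / lambda_min) of the correlation matrix (assumed definition)."""
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))  # ascending
    return float(np.sqrt(eigenvalues[-1] / eigenvalues[0]))

# Synthetic, strongly collinear data: four noisy copies of one variable.
rng = np.random.default_rng(0)
base = rng.standard_normal(100)
X = np.column_stack([base + 0.1 * rng.standard_normal(100) for _ in range(4)])

print(np.round(vif(X), 2))
print(round(condition_number(X), 2))
```

For two variables with correlation r, this gives VIF = 1/(1 - r^2) for both, which makes the link between pairwise correlation and variance inflation explicit.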
The SMSE values of MLE, SRMLE, SRRMLE, SRLMLE, and SRLTLE for some selected values of

CONCLUSIONS
In this research, we proposed the Stochastic Restricted Liu-Type Logistic Estimator (SRLTLE) for the logistic regression model in the presence of stochastic linear restrictions when the multicollinearity problem exists. The conditions for the superiority of the proposed estimator over some existing estimators were derived with respect to the SMSE criterion. Moreover, the performance of the proposed estimator SRLTLE relative to the MLE, SRMLE, SRRMLE and SRLMLE was analyzed by conducting a Monte Carlo simulation study and a numerical example. It can be stated that, in the presence of multicollinearity, the proposed estimator is a better alternative to the other existing estimators under certain conditions.

Nagarajah Varathan and Pushpakanthie Wijekoon
et al. (2016), Wu and Asar (2015) and Mansson et al. (2012), the optimum values of the biasing parameters k and d can be obtained by minimizing the SMSE with respect to k and d. However, for simplicity, in this paper we select some values of k and d in the range 1
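Minimizing the SMSE over k and d can be sketched with a simple grid search. For tractability the example uses the theoretical SMSE of the unrestricted LTLE in canonical form as a stand-in, assuming the form $(C + kI)^{-1}(C - dI)\hat{\beta}_{MLE}$, with $\lambda_j$ the eigenvalues of $C$ and $\alpha$ the coefficient vector rotated into the eigenvector basis. The eigenvalues, the vector `alpha` and the grid are hypothetical; in practice $\alpha$ is unknown and would have to be replaced by an estimate.

```python
import numpy as np

def smse_ltle(lam, alpha, k, d):
    """Variance plus squared bias of the assumed LTLE form, in canonical coordinates."""
    variance = np.sum((lam - d) ** 2 / (lam * (lam + k) ** 2))
    squared_bias = (k + d) ** 2 * np.sum(alpha ** 2 / (lam + k) ** 2)
    return float(variance + squared_bias)

def grid_search(lam, alpha, grid):
    """Return (smallest SMSE, k, d) over all pairs from the grid."""
    candidates = ((smse_ltle(lam, alpha, k, d), k, d) for k in grid for d in grid)
    return min(candidates)

lam = np.array([2.0, 1.0, 0.01])      # one tiny eigenvalue mimics multicollinearity
alpha = np.array([0.5, 0.5, 0.5])     # hypothetical canonical coefficients
grid = np.linspace(0.1, 0.9, 9)

best_smse, best_k, best_d = grid_search(lam, alpha, grid)
print(best_smse, best_k, best_d)
print(smse_ltle(lam, alpha, 0.0, 0.0))  # k = d = 0 recovers the SMSE of the MLE
```

The comparison with k = d = 0 shows how much of the MLE's SMSE comes from the smallest eigenvalue, which is the term the shrinkage parameters mainly act on.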

APPENDIX

Table A1:
The estimated MSE values for different k, d when n = 50 and ρ = 0.80

Table A2:
The estimated MSE values for different k, d when n = 50 and ρ = 0.90

Table A3:
The estimated MSE values for different k, d when n = 50 and ρ = 0.99

Table A4:
The estimated MSE values for different k, d when n = 15 and ρ = 0.80

Table A5:
The estimated MSE values for different k, d when n = 15 and ρ = 0.90

Table A6:
The estimated MSE values for different k, d when n = 15 and ρ = 0.99