Missingness Mechanism that Incorporated Joint Modeling of Longitudinal Data with Monotone Dropout

We analyzed repeated measurements of continuous responses with monotone dropout. We are interested in reducing the bias associated with treatment effects, but the credibility of the results relies on the validity of the techniques applied to analyze the data and on the conditions under which those techniques give reliable answers. Furthermore, the robustness of the trial findings is assessed through sensitivity analysis, which verifies to what extent the results are affected by changes in techniques, in the values of unmeasured variables, and in model assumptions. Moreover, the results obtained under missing not at random (MNAR) are the same as their counterparts under missing at random (MAR). In addition, using multiple imputation (MI) in the analysis also improves the accuracy of the results.

Let D_i denote the occasion at which the first dropout occurs. It is essential to consider joint modeling of the measurement process together with the dropout process, especially when modeling the missing data. Therefore, we assume the full data density is given by

f(y_i, r_i | X_i, Z_i, θ, ψ), (1)

where X_i and Z_i represent the design matrices for the measurement process and the dropout mechanism respectively, while θ and ψ denote the parameter vectors of the joint distribution. Starting from (1), the factorization of the joint density function can be done in two ways to facilitate modeling. First, the selection and pattern mixture models are defined by the conditional factorizations of the joint distribution of Y and R, which are discussed in detail in [18] and briefly analyzed below. The factorization of the selection model is

f(y_i, r_i | X_i, Z_i, θ, ψ) = f(y_i | X_i, θ) f(r_i | y_i, Z_i, ψ), (2)

where the first factor represents the marginal density of the measurement process, while the second factor is the density of the dropout process, conditional on the measurements. The alternative factorization, based on the pattern mixture models [16,17], is

f(y_i, r_i | X_i, Z_i, θ, ψ) = f(y_i | r_i, X_i, θ) f(r_i | Z_i, ψ). (3)

These models are discussed excellently in [7,9,10,19]. Through the selection model framework, a taxonomy of missing data processes was developed by [30,19], who make distinctions among several such processes. Based on the second factor of model (2), these processes may be derived, that is,

f(r_i | y_i, X_i, ψ) = f(r_i | y_i^o, y_i^m, X_i, ψ), (4)

where y_i^o and y_i^m denote the observed and missing components of y_i.
A process is termed missing completely at random (MCAR) if the distribution of the missingness process reduces to f(r_i | y_i, X_i, ψ) = f(r_i | X_i, ψ), meaning that the process is independent of the measurements. When the missingness probability depends on the observed measurements y_i^o but not on the unobserved measurements y_i^m, so that f(r_i | y_i, X_i, ψ) = f(r_i | y_i^o, X_i, ψ), the process is called missing at random (MAR). Lastly, when data are missing not at random (MNAR), or exhibit an informative process, the missingness probability depends on the unobserved measurements y_i^m and possibly on the observed measurements y_i^o, so that f(r_i | y_i, X_i, ψ) = f(r_i | y_i^o, y_i^m, X_i, ψ). In this case, expression (4) is an informative process which cannot be reduced further.
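The practical consequence of this taxonomy can be illustrated with a small simulation (a hypothetical sketch, not part of the original analysis): only under MCAR is the naive mean of the observed responses unbiased, while MAR and MNAR selection both shift it.

```python
import numpy as np

rng = np.random.default_rng(2018)
n = 100_000
# Bivariate outcome: y1 is always observed, y2 is subject to missingness.
y1 = rng.normal(50.0, 10.0, n)
y2 = y1 + rng.normal(0.0, 5.0, n)

def observed_mean_y2(logit_keep):
    """Keep each y2 with probability sigmoid(logit_keep); mean of kept y2."""
    keep = rng.random(n) < 1 / (1 + np.exp(-logit_keep))
    return y2[keep].mean()

m_mcar = observed_mean_y2(0.5)                      # r independent of y
m_mar  = observed_mean_y2(0.5 - 0.05 * (y1 - 50))   # r depends on observed y1
m_mnar = observed_mean_y2(0.5 - 0.05 * (y2 - 50))   # r depends on missing y2
# m_mcar stays near the true mean of 50, while the MAR and MNAR selection
# mechanisms both pull the observed mean downward; only the MAR bias can
# be corrected using the observed y1.
```

Here only the simple observed-data mean is biased under MAR; likelihood-based analyses using y1 remain valid under MAR but not under MNAR.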

Modeling framework
In this section, we compare the selection model (SM) and the pattern mixture model (PMM) using identifying restrictions, which provide a useful form of sensitivity analysis. In addition, each model is discussed briefly.

Selection model
As mentioned earlier, the selection model factors the joint distribution into two parts: the marginal measurement model and the missingness model, where the former describes the complete measurements and the latter characterizes the conditional distribution of the missingness indicators given the observed and unobserved measurements. In this model, the measurement distribution is specified first, followed by the probability of being observed, which depends on the data. Thus, using a selection model as specified in equation (2), [5] combined a multivariate normal model for the measurement process, f(y_i | X_i, θ), with a logistic regression model for the dropout process, f(r_i | y_i, Z_i, ψ).
When we assume that the missingness is due to monotone dropout, the first measurement y_{i1} is observed for each individual. Recall that D_i is the dropout indicator, denoting the occasion at which dropout first occurs. Let D_i = d_i express the dropout time for individual i, with D_i = n + 1 if the measurement sequence is complete.
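In code, the dropout indicator D_i can be read directly off a response matrix with NaN marking missing values; a minimal sketch (assuming monotone dropout, with hypothetical acuity data):

```python
import numpy as np

def dropout_indicator(Y):
    """Occasion of first missing value per subject (1-based); n + 1 if the
    sequence is complete. Assumes monotone dropout, NaN = missing."""
    Y = np.asarray(Y, dtype=float)
    miss = np.isnan(Y)
    n_occ = Y.shape[1]
    # argmax on a boolean row gives the index of the first True.
    return np.where(miss.any(axis=1), miss.argmax(axis=1) + 1, n_occ + 1)

# Hypothetical scores for three subjects over n = 4 occasions:
Y = np.array([[52.0, 55.0, 49.0, 50.0],
              [48.0, 46.0, np.nan, np.nan],
              [60.0, np.nan, np.nan, np.nan]])
d = dropout_indicator(Y)   # -> [5, 3, 2]
```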
Thus, the selection model expressed in equation (2) arises when the joint likelihood of the measurement and dropout processes is factorized as

f(y_i, d_i | X_i, Z_i, θ, ψ) = f(y_i | X_i, θ) f(d_i | y_i, Z_i, ψ). (5)

The model for the dropout process is based on a logistic regression for the probability of dropout at occasion j, given that the subject was still present in the study at the previous occasion. This probability is denoted g_i(y_ij, h_ij), where h_ij denotes the history of the measurement process up to occasion j - 1. One may assume that g_i(y_ij, h_ij) satisfies a model with linear predictor η(h_ij, y_ij), depending on h_ij and y_ij. We model the dropout mechanism by assuming that η(h_ij, y_ij) depends only on the current measurement y_ij and the previous measurement y_{i,j-1}, but not on future measurements or higher-order history, with corresponding regression coefficients ψ_2 and ψ_1 [32]. Justifying dependence on future unobserved measurements is difficult, and such dependence is therefore not modeled here. The observed history of subject i up to time t_{i,j-1} could be included, but we assume a first-order history for simplicity. The logistic dropout model can then be written as

logit[g_i(y_ij, h_ij)] = logit[P(D_i = j | D_i ≥ j, y_{i,j-1}, y_ij)] = ψ_0 + ψ_1 y_{i,j-1} + ψ_2 y_ij. (6)

The model expressed in equation (6) is MAR when ψ_2 equals zero, since dropout then does not depend on the current, possibly unobserved, measurement; when in addition ψ_1 equals zero, the model becomes MCAR, since dropout does not depend on the outcome at all; and when ψ_2 differs from zero the dropout mechanism is MNAR. We use the likelihood ratio test (LRT) to compare model fit under a model that assumes the incomplete data due to monotone dropout are MCAR versus MAR [5]. The LRT statistic asymptotically follows a χ²₁ distribution under the null. The details of the derivation of this statistic are given in [5,21]. A significant LRT indicates that the less restrictive of the two models is preferred.
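The MCAR-versus-MAR likelihood ratio test can be sketched as follows on simulated data (hypothetical values throughout; the paper fits the analogous models to the ARMD data in SAS). The dropout model is a logistic regression on person-period records, and the LRT compares the intercept-only fit against the fit with the previous outcome.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def dropout_records(Y):
    """Person-period records under monotone dropout: for each subject still
    at risk at occasion j >= 2, record (previous outcome, dropped at j?)."""
    prev, event = [], []
    for row in Y:
        for j in range(1, len(row)):
            prev.append(row[j - 1])
            event.append(float(np.isnan(row[j])))
            if np.isnan(row[j]):
                break
    return np.array(prev), np.array(event)

def neg_loglik(psi, X, r):
    """Negative log-likelihood of a logistic regression for dropout."""
    eta = X @ psi
    return np.sum(np.logaddexp(0.0, eta) - r * eta)

# Simulated MAR data: dropout probability depends on the previous outcome.
rng = np.random.default_rng(1)
n, T = 400, 4
Y = rng.normal(50.0, 10.0, (n, T))
for i in range(n):
    for j in range(1, T):
        if rng.random() < 1 / (1 + np.exp(-(-4.0 + 0.05 * Y[i, j - 1]))):
            Y[i, j:] = np.nan
            break

prev, r = dropout_records(Y)
X_mcar = np.ones((len(prev), 1))                      # intercept only
X_mar = np.column_stack([np.ones(len(prev)), prev])   # + previous outcome

ll_mcar = -minimize(neg_loglik, np.zeros(1), args=(X_mcar, r)).fun
ll_mar = -minimize(neg_loglik, np.zeros(2), args=(X_mar, r)).fun
lrt = 2 * (ll_mar - ll_mcar)        # asymptotically chi^2, 1 df, under MCAR
p_value = chi2.sf(lrt, df=1)
```

With dropout genuinely driven by the previous outcome, the LRT rejects MCAR in favour of MAR.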
However, based on [5], the use of the LRT statistic to test MAR against MNAR is not recommended: the behaviour of the LRT statistic for the MNAR parameter ψ_2 is non-standard, because the information available on ψ_2 is scarce and interwoven with other features of both the measurement and dropout models [11]. Thus, a model of the Diggle and Kenward type is assumed to be true, but the tests are conditional on the alternative model. The distinction between the MAR and MNAR mechanisms can only be made using untestable modeling assumptions, such as the distributional form [22]. Such assumptions about the dropout in a sample are unverifiable from the observed measurements, and any test concerning the dropout process is thereby invalidated, as stated in [25]. The justification is that the parameters of the dropout process model depend partially on the dropout itself. Furthermore, the distinction between MAR and MNAR cannot be made unless one commits to the posited MNAR model [26]. Model identifiability therefore poses a complex issue when considering models for the dropout mechanism. In view of this, it is recommended to conduct a sensitivity analysis of the parameters of the measurement model across models that make different assumptions about the dropout process [25].

Identifying restrictions
In this study, patterns are defined by the dropout occasion, at every time point except baseline [2]. A subject whose last observed outcome is the t-th one, dropping out after that point, belongs to pattern t, t = 1, ..., T. The pattern mixture model (PMM) starts from a particular decomposition of the joint distribution of the response variable together with the dropout indicator [2]. The pattern-mixture distribution of the complete outcome values y_1, ..., y_T is written as

f(y_1, ..., y_T) = Σ_{t=1}^{T} α_t f_t(y_1, ..., y_T), (7)

where α_t is the proportion of pattern t and f_t(y_1, ..., y_T) stands for f(y_1, ..., y_T | t). In equation (7), the whole population distribution is expressed as a mixture of the pattern-specific distributions. Each pattern-specific distribution can in turn be decomposed as

f_t(y_1, ..., y_T) = f_t(y_1, ..., y_t) f_t(y_{t+1}, ..., y_T | y_1, ..., y_t). (8)

In equation (8), the first component is identified from the observed response values, whereas the second, conditional component is unidentified; identifying these conditional pattern distributions forms the basic framework of the imputation process. This problem is overcome by identifying restrictions: the unidentified parameters of the incomplete patterns are set equal to appropriate functions of the parameters describing the distributions of other patterns. Further descriptions of the identifying restrictions can be found in [35]. Under complete-case missing values (CCMV), identification is based on the completers' pattern (i.e., borrowed from subjects with a complete outcome profile), which is pattern T. This identification can be written as

f_t(y_s | y_1, ..., y_{s-1}) = f_T(y_s | y_1, ..., y_{s-1}), s = t + 1, ..., T. (9)

In the neighboring-case missing values (NCMV) approach, the closest neighboring pattern is used instead: the conditional distribution of an incomplete observation at a given time point is borrowed from the nearest pattern in which the response is observed at that time point but unobserved afterwards. Then we have

f_t(y_s | y_1, ..., y_{s-1}) = f_s(y_s | y_1, ..., y_{s-1}), s = t + 1, ..., T. (10)
More generally, identification can be based on all identified patterns, as suggested in the expression

f_t(y_s | y_1, ..., y_{s-1}) = Σ_{j=s}^{T} ω_{sj} f_j(y_s | y_1, ..., y_{s-1}), s = t + 1, ..., T, (11)

where the set of nonnegative ω_{sj}'s is denoted ω_s. Every ω_s whose components add up to 1 yields a valid identification scheme. Expression (11) reduces to the special case of equation (9) by setting ω_{sT} = 1 and all other ω_{sj} = 0, and to the special case of equation (10) by setting ω_{ss} = 1 and all other ω_{sj} = 0. Finally, the available-case missing values (ACMV) approach offers a compromise between CCMV and NCMV, as the missing information is borrowed from all available patterns, weighted by the occurrence of each pattern. The ω_s were determined in [23] such that equation (11) corresponds to ACMV; the coefficients are defined as

ω_{sj} = α_j f_j(y_1, ..., y_{s-1}) / Σ_{l=s}^{T} α_l f_l(y_1, ..., y_{s-1}), j = s, ..., T. (12)

The status of ACMV is unique because it is the natural counterpart of MAR in the PMM framework. In applications, analysis under MAR can therefore be the starting point for sensitivity analyses under MNAR.
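The ACMV weights just described are straightforward to compute once the pattern proportions α_j and the pattern-specific densities of the observed history are available; a minimal sketch with hypothetical numbers:

```python
import numpy as np

def acmv_weights(s, alpha, dens):
    """ACMV weights omega_{sj}, j = s..T:
    omega_{sj} = alpha_j f_j(y_1..y_{s-1}) / sum_{l=s}^T alpha_l f_l(...).
    alpha[j-1] is the proportion of pattern j; dens[j-1] is the density of
    the observed history y_1..y_{s-1} under the pattern-j model."""
    alpha = np.asarray(alpha, dtype=float)
    dens = np.asarray(dens, dtype=float)
    num = alpha[s - 1:] * dens[s - 1:]
    return num / num.sum()

# Hypothetical illustration with T = 4 patterns and s = 3:
alpha = [0.10, 0.15, 0.25, 0.50]   # pattern proportions alpha_t
dens = [0.02, 0.03, 0.05, 0.04]    # f_j evaluated at the observed history
w = acmv_weights(3, alpha, dens)
# w sums to 1; CCMV would instead put all weight on pattern T
# (omega_{sT} = 1) and NCMV all weight on pattern s (omega_{ss} = 1).
```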

Multiple imputation method
In this section, we explain the implementation of the PMM analysis for Gaussian outcome variables, following the standard three-stage MI approach described in [28].

Pattern parameter estimation
Separate models are fitted to the response variable within each pattern. Time is treated as a class variable and no random effects are specified.
Let Y_i = (y_{i,1}, ..., y_{i,T}) be the complete outcome vector for the i-th subject in pattern t and Y_{i,obs} = (y_{i,1}, ..., y_{i,t}) its observed part. The MMRM per pattern is written as

Y_i = X_i β_t + ε_i, (13)

where ε_i ~ N(0, Σ_t), Σ_t is unstructured, and the ε_i's are independent. The matrix X_i contains the known fixed-effects covariates, whereas β_t contains the unknown parameters. In the first stage, we obtain the estimates β̂_t, VAR(β̂_t), Σ̂_t and VAR(col(Σ̂_t)), where col(Σ̂_t) is the vector containing the coefficients of the diagonal and lower triangular part of Σ̂_t [2].
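As a simplified stand-in for the first-stage fits (the paper fits an MMRM with fixed-effects design matrix X_i and REML in SAS), the pattern-specific means and unstructured covariances can be estimated from sample moments; the data and pattern labels below are hypothetical:

```python
import numpy as np

def pattern_estimates(Y, patterns):
    """Pattern-specific sample mean and unstructured covariance of the
    observed outcomes y_1..y_t (pattern t = last observed occasion).
    A simplified stand-in for the per-pattern MMRM fit."""
    est = {}
    for t in np.unique(patterns):
        block = Y[patterns == t][:, :t]      # observed part of pattern t
        mean_t = block.mean(axis=0)
        cov_t = np.cov(block, rowvar=False)  # unstructured Sigma_t estimate
        est[t] = (mean_t, cov_t)
    return est

# Hypothetical data: 30 subjects, T = 4 occasions, three dropout patterns.
rng = np.random.default_rng(0)
Y = rng.normal(50.0, 10.0, (30, 4))
patterns = np.array([4] * 10 + [3] * 10 + [2] * 10)
est = pattern_estimates(Y, patterns)
# est[t] holds a length-t mean vector and a t-by-t covariance matrix.
```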

Imputation
The missing response values are imputed sequentially, value by value. We first describe how to obtain a run of M imputed values of y_{i,t+1}. Multiple imputation of y_{i,t+2}, ..., y_{i,T} follows the same process, treating previously imputed values as observed ones [2]. The imputed values of y_{i,t+1} are drawn from conditional pattern distributions, and the pattern from which to impute is selected through the chosen identifying restriction. For the purpose of explanation, suppose first that pattern r (t + 1 ≤ r ≤ T) is used.
Following [2], let µ_{i,r} = X_i β_r denote the pattern-r mean of Y_i. Partition µ_{i,r} and Σ_r according to (y_{i,1}, ..., y_{i,t}) and y_{i,t+1}, so that (y_{i,1}, ..., y_{i,t}) ~ N(µ_{i,r,1}, Σ_{r,11}) and y_{i,t+1} ~ N(µ_{i,r,2}, Σ_{r,22}), with covariance blocks Σ_{r,12} and Σ_{r,21}. Using 2|1 as notation for y_{i,t+1} | y_{i,1}, ..., y_{i,t}, the conditional pattern distribution of y_{i,t+1} given y_{i,1}, ..., y_{i,t} is

y_{i,t+1} | y_{i,1}, ..., y_{i,t} ~ N(µ_{2|1}, Σ_{2|1}),
µ_{2|1} = µ_{i,r,2} + Σ_{r,21} Σ_{r,11}^{-1} (Y_{i,obs} - µ_{i,r,1}),
Σ_{2|1} = Σ_{r,22} - Σ_{r,21} Σ_{r,11}^{-1} Σ_{r,12}.

The uncertainty pertaining to the pattern parameters is taken into account through Bayesian posterior predictive distributions, based on Gaussian distributions and non-informative Jeffreys priors, from which values of β_r and Σ_r are drawn for each imputation. Under ACMV, equation (11) suggests that the missing y_{i,t+1} value be imputed from a sum of conditional distributions of several patterns, weighted by the occurrence of each pattern. In accordance with the MI procedure, the weighted summation is handled via random pattern selection over imputations: the coefficients ω_{sj} in equation (11) are calculated for each imputation and characterize the pattern probabilities, and within each imputation the pattern is selected at random according to these coefficient values.
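The conditional draw above is a standard multivariate-normal computation; a sketch with hypothetical pattern parameters (conditioning on the observed history but, for brevity, omitting the extra Bayesian draw of β_r and Σ_r):

```python
import numpy as np

def conditional_draw(y_obs, mu, Sigma, rng):
    """Draw y_{t+1} from its conditional pattern distribution
        y_{t+1} | y_1..y_t ~ N(mu_{2|1}, Sigma_{2|1}),
        mu_{2|1}    = mu_2 + S_21 S_11^{-1} (y_obs - mu_1),
        Sigma_{2|1} = S_22 - S_21 S_11^{-1} S_12."""
    t = len(y_obs)
    mu1, mu2 = mu[:t], mu[t]
    S11 = Sigma[:t, :t]
    S12 = Sigma[:t, t]
    S22 = Sigma[t, t]
    w = np.linalg.solve(S11, S12)         # S_11^{-1} S_12
    m = mu2 + w @ (y_obs - mu1)           # conditional mean
    v = S22 - S12 @ w                     # conditional variance
    return rng.normal(m, np.sqrt(v))

# Hypothetical pattern-r parameters for T = 3 occasions:
mu = np.array([50.0, 52.0, 54.0])
Sigma = np.array([[100.0, 80.0, 60.0],
                  [80.0, 100.0, 80.0],
                  [60.0, 80.0, 100.0]])
rng = np.random.default_rng(7)
y_imputed = conditional_draw(np.array([45.0, 48.0]), mu, Sigma, rng)
```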

Pooled analysis
The completed datasets are fitted using the modeling strategy of equation (13) for pattern parameter estimation. The MMRM procedure incorporates the full group-by-time interaction at every time point for the fixed effects and an ARH(1) error covariance matrix. The analysis model can be expressed as

Y_i = X_i β* + ε*_i, (14)

where ε*_i ~ N(0, Σ), Σ has an ARH(1) structure and the ε*_i's are independent. The vector β* contains the unknown fixed-effects parameters. The MI approach combines the inferences from the M imputations into a single one. Let β̂*(m), m = 1, ..., M, be the estimators of β* per imputation and β̄* the pooled estimator of β*. According to Rubin's rules, β̄* is the average of the β̂*(m)'s, whereas the pooled variance V is

V = W + (1 + 1/M) B,

where W, the average of the VAR(β̂*(m))'s, is the within-imputation variance and B is the between-imputation variance:

W = (1/M) Σ_{m=1}^{M} VAR(β̂*(m)),  B = (1/(M-1)) Σ_{m=1}^{M} (β̂*(m) - β̄*)(β̂*(m) - β̄*)'.

Under MNAR, the pooled estimator β̄* is consistent. The F-distributions for the tests of fixed effects are described in [12].
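Rubin's combination rules can be sketched directly (hypothetical per-imputation estimates; elementwise variances rather than full covariance matrices, for simplicity):

```python
import numpy as np

def pool_rubin(betas, variances):
    """Rubin's rules: pool M point estimates and their variances.
    betas: (M, p) array of per-imputation estimates beta*_(m);
    variances: (M, p) within-imputation variances VAR(beta*_(m))."""
    betas = np.asarray(betas, dtype=float)
    M = betas.shape[0]
    beta_bar = betas.mean(axis=0)                         # pooled estimate
    W = np.asarray(variances, dtype=float).mean(axis=0)   # within-imputation
    B = betas.var(axis=0, ddof=1)                         # between-imputation
    V = W + (1 + 1 / M) * B                               # total variance
    return beta_bar, V

# Hypothetical estimates of one treatment effect from M = 5 imputations:
betas = np.array([[1.9], [2.1], [2.0], [2.2], [1.8]])
variances = np.array([[0.25], [0.24], [0.26], [0.25], [0.25]])
est, var = pool_rubin(betas, variances)
# est = [2.0]; var = W + 1.2 * B = 0.25 + 1.2 * 0.025 = 0.28
```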
Description of the case study

Age-related macular degeneration (ARMD) trial
The age-related macular degeneration (ARMD) data arise from a randomized multi-center clinical trial comparing an experimental treatment (Interferon-α) with placebo for patients diagnosed with ARMD [8]. Patients' visual acuity was assessed at baseline and at four follow-up visits at 4, 12, 24 and 52 weeks, respectively [2]. Treatment-effect inference on visual acuity at visit 4 is the primary focus of the statistical analysis. Each patient's visual acuity was evaluated through the ability to read lines of letters on standardized vision charts. The charts display lines of five letters of decreasing size, which the patient must read from top (largest letters) to bottom (smallest letters). When at least four letters on a line are read correctly, this is called one "line of vision". Interest lies in visual acuity defined as the total number of letters read correctly; another possible approach is to consider visual acuity measured by the number of lines read correctly. The two approaches are closely linked, as each line of vision contains five letters. A general overview of the different missingness patterns is presented in Table 1.

Fitting selection model
We fit the selection models to the ARMD data by combining the measurement model with the logistic regression dropout model, in line with [5], using a generic function-maximization routine. To obtain initial values for the parameters of the measurement model, we use the MMRM of equation (13). Fitting this model, we assume different intercepts and treatment effects for each of the four visits, with a (4 × 4) ARH(1)-structured variance-covariance matrix; that is, a multivariate normal model with an unconstrained time trend under placebo and an occasion-specific treatment effect, as in equation (13). From this, we obtain the parameter estimates and standard errors for the eight mean-model parameters. We fit this model using the SAS procedure MIXED with the REPEATED statement. Thereafter, we consider the dropout model. We fit it with an intercept only, with an additional effect of the previous response, and with effects of both the previous response and the current unobserved measurement, corresponding to MCAR, MAR and MNAR respectively. It would be possible in principle to allow dependence on future unobserved measurements, but to avoid complication we model dependence on the current unobserved measurement only.
The dropout process for the ARMD data is assumed to follow the logistic regression model in equation (6), the most commonly used model for the dropout process, as indicated in [25]. Three parameters are involved: an intercept (ψ_0), the effect of the measurement prior to dropout (ψ_1) and the effect of the measurement at the visit of dropout (ψ_2). For the four visits, the model can be written as

logit[g(y_{i,j-1}, y_ij)] = logit[p(D_i = j | y_{i,j-1}, y_ij)] = ψ_0 + ψ_1 y_{i,j-1} + ψ_2 y_ij, j = 2, 3, 4, 5. (18)
It remains a major challenge to estimate the selection model under MNAR because the dropout indicators depend on the unobserved measurements; in the selection model stated above, the dropout indicators depend in part on the unobserved measurement at the time of dropout. The resulting difficulty in evaluating the likelihood function can be handled as in [6], where the log-likelihood for the model is maximized using PROC IML. Table 2 displays the parameter estimates and standard errors for the fixed effects of the selection model, comprising the eight mean-model parameters of the marginal measurement model as well as the parameters of the logistic dropout model. Comparing them, the MCAR, MAR and MNAR fits produce results that are close to one another, with only negligible differences in the estimates and standard errors. Under the assumed model, the significance of the dropout-model results is examined using the LRT statistic, which compares the MCAR and MAR models. However, assessing whether the dropout mechanism is MNAR is problematic, because neither the LRT statistic comparing the model that assumes MAR dropout against MNAR nor the assessment of ψ_2 relative to its standard error is reliable [11]. In addition, an MNAR dropout mechanism is unverifiable from the observed data [26]. In this situation, the focus is on the marginal measurement model, from which the overall treatment effect is estimated.
Here the factors that influence dropout are briefly discussed. The logistic regression dropout models in the selection framework are obtained from equation (18). From Table 2, the maximum likelihood estimates of ψ_1 and ψ_2 are 0.02 and -0.04, which differ in sign as well as in magnitude. This outcome is not surprising, but confirms the argument raised by [24], where it is noted that when two successive measurements are positively correlated, the dropout model may rely on the increment, y_ij - y_{i,j-1}. The fitted dropout model under MNAR is

logit[p(D_i = j | y_{i,j-1}, y_ij)] = -1.81 + 0.02 y_{i,j-1} - 0.04 y_ij.

Further insight into the fitted model is obtained by re-parameterizing the dropout parameters in terms of the sum and the increment of the successive measurements. The dropout model in equation (18) is re-parametrized as

logit[p(D_i = j | y_{i,j-1}, y_ij)] = ϑ_0 + ϑ_1 (y_ij + y_{i,j-1}) + ϑ_2 (y_ij - y_{i,j-1}), j = 2, 3, 4, 5.
Thus, ϑ_1 = (ψ_1 + ψ_2)/2 and ϑ_2 = (ψ_2 - ψ_1)/2. In the ARMD data, these parameters represent the dependence on the level and on the increment, respectively, and these two quantities are likely to be much less strongly correlated than are y_ij and y_{i,j-1}. The fitted MNAR model then becomes

logit[p(D_i = j | y_{i,j-1}, y_ij)] = -1.81 - 0.01 (y_ij + y_{i,j-1}) - 0.03 (y_ij - y_{i,j-1}),

which indicates that the probability of dropout increases with larger negative increments: patients whose visual acuity drops from the previous visit have a higher probability of dropping out of the trial.
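The equivalence of the two parametrizations is easy to verify numerically; note that with ψ_1 = 0.02 and ψ_2 = -0.04, exact algebraic matching requires the increment coefficient ϑ_2 = (ψ_2 - ψ_1)/2 = -0.03, so that a drop in acuity raises the dropout logit. A quick sketch with the estimates quoted above (the six-letter drop is a hypothetical illustration):

```python
def dropout_logit(psi0, psi1, psi2, y_prev, y_curr):
    """Original parametrization: psi0 + psi1*y_prev + psi2*y_curr."""
    return psi0 + psi1 * y_prev + psi2 * y_curr

def dropout_logit_repar(th0, th1, th2, y_prev, y_curr):
    """Level/increment parametrization:
    th0 + th1*(y_curr + y_prev) + th2*(y_curr - y_prev)."""
    return th0 + th1 * (y_curr + y_prev) + th2 * (y_curr - y_prev)

psi0, psi1, psi2 = -1.81, 0.02, -0.04   # MNAR estimates quoted in the text
th0 = psi0
th1 = (psi1 + psi2) / 2                 # -0.01: dependence on the level
th2 = (psi2 - psi1) / 2                 # -0.03: dependence on the increment

y_prev, y_curr = 54.0, 48.0             # a six-letter drop in visual acuity
a = dropout_logit(psi0, psi1, psi2, y_prev, y_curr)
b = dropout_logit_repar(th0, th1, th2, y_prev, y_curr)
# a and b coincide, and th2*(y_curr - y_prev) > 0 here: the negative
# increment raises the dropout logit.
```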

Multiple imputation and sensitivity analysis of the pattern mixture models
Multiple imputation, at least in its basic form [29], requires the missingness mechanism to be MAR. Our investigation compares the results obtained under the different identifying restrictions, shown in Table 3. These comparisons cannot establish whether the mechanism is MNAR or not, because the underlying model assumptions differ [32]. The parameter estimates and standard errors obtained for the treatment effects under CCMV are smaller than those obtained under ACMV and NCMV, because many completers are available. Under ACMV, the PMMs use data from all available patterns to multiply impute new observations, whereas under NCMV the PMMs borrow information from the neighboring patterns only [32].

Discussion
In this study, we analyzed incomplete longitudinal data with responses missing across visits, using the SM and PMM frameworks, and used both models to illustrate and compare the results of the analysis. Our focus is on the scenario where the responses are continuous, and in particular on special cases of the SM and PMM: the model of [5] and the identifying-restrictions strategy [16,17]. In the selection model, logistic regression was used to model the dropout. The objective was to examine the influence dropout might have on the measurement part of the data and how to handle the incomplete observations. The results obtained from the models were compared and also gave insight into the ARMD data. Considerable confidence in the findings was established because the joint application of the models yielded similar inferences; in our comparison, we observed similar values across the results. A challenge always arises when dealing with dropout that is MNAR: the MNAR assumption concerns the unobserved data and can never be verified in a real application. Moreover, the various procedures proposed for handling MNAR dropout are likewise unverifiable, and one cannot be sure that any of the applied methods captures the dropout process accurately. For this reason, several methods were proposed by [25], and in this situation a sensitivity analysis of the parameter estimates across several assumed dropout mechanisms is examined [32].
If the parameter estimates are the same under different methods, this may be an indication that the dropout process can be ignored. On the other hand, if different parameter estimates are obtained under different methods, this likely means that the dropout process is an important element in describing the data.
The structure of the selection dropout model adopted for the ARMD trial [32] reveals that dropout is related to larger negative increments rather than to the level of the observations: patients with a decrease in overall visual acuity from the previous visit have a higher probability of dropping out of the trial.
In this research, we applied joint modeling to the ARMD study, a trial with a longitudinal continuous outcome in which some of the patients left the study, and modeled the dropout process jointly with the measurements. We extended the model by incorporating missingness in diverse ways to estimate the effect on the results. In several applications, the MNAR models gave essentially the same fit as their MAR versions, indicating that the MAR counterpart fits the observed data properly. The model formulation included only random intercepts [1].
Moreover, several modifications of the MI analysis were adopted because of the assumptions behind the posterior distribution of the dropouts. These assumptions are untestable, and conducting a sensitivity analysis is therefore paramount. Starting with the basic form of MI analysis [29] and extending it to MNAR under a suite of identifying restrictions [25,36,37], the sensitivity analysis of the ARMD trial shows that inferences about some parameters can be modified by the choice of identifying restriction. From the results, we conclude that the ACMV, CCMV and NCMV restrictions did not differ considerably. The approach can be extended to models with more complex data structures, an area that can be exploited in future research.