Analysis of continuous longitudinal data with non-ignorable data.
Abstract
Missing responses are very common in longitudinal data, and much research has been devoted to ways of handling this complication when analysing such data sets. The approaches range from simple remedies, such as complete case analysis, imputation of the missing data and available case analysis, to joint modelling of the measurement process and the missingness mechanism. The work of Rubin (1976) on the classification of missingness mechanisms contributed greatly to the development of research on joint models, as missingness could now be classified as ignorable or non-ignorable.
There are at least three joint modelling approaches. The three common forms, which are differentiated by the way they factorise the full data density, are selection models, pattern mixture models and shared parameter models.
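As a sketch of how these factorisations differ, the standard textbook forms can be written as follows. The notation is an assumption introduced here for illustration and is not taken from this work: y denotes the full response vector, r the missingness (dropout) indicators, b the subject-specific random effects, and \theta and \psi the parameters of the measurement and missingness processes.

```latex
\begin{align*}
\text{Selection model:}        \quad & f(y, r \mid \theta, \psi) = f(y \mid \theta)\, f(r \mid y, \psi) \\
\text{Pattern mixture model:}  \quad & f(y, r \mid \theta, \psi) = f(y \mid r, \theta)\, f(r \mid \psi) \\
\text{Shared parameter model:} \quad & f(y, r \mid \theta, \psi) = \int f(y \mid b, \theta)\, f(r \mid b, \psi)\, f(b)\, db
\end{align*}
```

In the shared parameter form, y and r are assumed to be conditionally independent given the random effects b.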
Missing data in longitudinal studies can be classified into two main categories. Intermittent missingness occurs when a subject has a missing value on one or more occasions but has observed values at a later stage of the study period. Dropout is a monotone missing pattern: once a subject has a missing value at a particular point in time, that subject continues to have missing values until the completion of the study. The focus of this research is on the latter. The missing pattern was simulated to produce informative dropouts. The major aim of this research was to compare estimates from different modelling approaches, with the main focus on comparing joint models to complete case analysis on the basis of how close their estimates were to those from the complete data model. The first part of this research focuses on linear mixed modelling of a complete longitudinal data set. This extensive modelling process, from exploration of the data to estimation of parameters, forms the baseline for the second part of the research, which is the joint modelling process.
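One way to picture the simulation of informative dropout is the minimal sketch below. It is not the procedure used in this research, which the abstract does not specify. It assumes a logistic dropout model in which the probability of dropping out at an occasion depends on the (possibly unobserved) response at that occasion. The function name `simulate_informative_dropout` and the parameters `beta0` and `beta1` are hypothetical.

```python
import numpy as np


def simulate_informative_dropout(y, beta0=-2.0, beta1=1.0, seed=0):
    """Impose monotone, informative dropout on a complete data matrix.

    y is an (n_subjects, n_occasions) array of complete responses.
    Assumption (not from the source): dropout at occasion j follows a
    logistic model in the current response, logit(p) = beta0 + beta1 * y[i, j].
    Returns a copy of y with NaN from the dropout occasion onwards.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n_subjects, n_occasions = y.shape
    observed = y.copy()
    for i in range(n_subjects):
        # The first occasion is always observed; dropout can start at the second.
        for j in range(1, n_occasions):
            p_drop = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * y[i, j])))
            if rng.random() < p_drop:
                # Monotone pattern: everything from occasion j onwards is missing.
                observed[i, j:] = np.nan
                break
    return observed
```

Because the dropout probability depends on the value that goes missing, the mechanism in this sketch is non-ignorable; setting `beta1` to zero would make dropout completely at random.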
The E-M algorithm (Dempster et al., 1977) formed the backbone of likelihood estimation under these approaches. At times convergence was slow or was not reached, and in such cases modified forms such as the stochastic E-M algorithm were used.
The complete case estimates were very close to the complete data estimates. However, it is difficult for this researcher to conclude that complete case analysis performs better than the other models. This researcher feels that a solid conclusion could be reached only after considering different proportions of dropouts and different patterns in which the dropouts are distributed throughout the study period.