Below is information about how to identify patterns of change in longitudinal data on behavior using Repeated Measures Latent Class Analysis (RMLCA).
Text enclosed in /* and */ (displayed in green) is not executable code; these are comments added to clarify what the code does.
The data file for this code needs to be in wide format, in which the repeated measures appear as separate columns rather than as separate rows. A long file can be converted to a wide file with any data management program; in SAS, PROC TRANSPOSE or PROC SQL can do this.
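As an illustration of the long-to-wide conversion, here is a minimal sketch using PROC TRANSPOSE. The input data set name (libname.long_file) and its column names (id, wave, smok) are hypothetical and would need to be replaced with your own; the sketch assumes wave is numbered 1 through 27.

```sas
/* Hypothetical long file: one row per person per wave,
   with columns id, wave (1-27), and smok */
PROC SORT DATA=libname.long_file OUT=work.long_sorted;
  BY id wave;
RUN;

/* One row per id; columns are named Smok1, Smok2, ... from PREFIX= plus the wave value */
PROC TRANSPOSE DATA=work.long_sorted
               OUT=libname.filename (DROP=_NAME_)
               PREFIX=Smok;
  BY id;
  ID wave;
  VAR smok;
RUN;
```

Persons with no record for a given wave will simply have a missing value in the corresponding column.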
Before modeling, decide which indicators to use and code them appropriately. Values of 0 are not allowed: categorical indicators must be coded starting at 1. In the example below, we used 27 binary indicators (named Smok1 through Smok27), each coded 1 (no smoking) or 2 (smoking). Substitute your own list of indicators for Smok1-Smok27.
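If your indicators are currently coded 0/1, one possible way to shift them to the required 1/2 coding is a DATA step with an array. This is only a sketch: the input data set name (libname.filename_raw) is hypothetical, and it assumes every indicator uses 0 = no smoking and 1 = smoking.

```sas
DATA libname.filename;
  SET libname.filename_raw;        /* hypothetical input with 0/1 coding */
  ARRAY smk {27} Smok1-Smok27;     /* the 27 repeated indicators */
  DO t = 1 TO 27;
    /* Shift 0/1 to 1/2; leave missing values as missing */
    IF NOT MISSING(smk{t}) THEN smk{t} = smk{t} + 1;
  END;
  DROP t;
RUN;
```

Running PROC FREQ on the recoded variables afterward is a quick way to confirm that only the values 1 and 2 (and missing) remain.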
The first step in selecting an unconditional model is to determine the number of classes to retain. Run models with an increasing number of classes (starting with 1 and adding 1 each time) until the model fit indices (G², AIC, BIC, BLRT) indicate that too many classes have been extracted. The syntax below runs the 1- through 8-class solutions.
%MACRO Num_class (num);
/* Macro name; num is the maximum number of classes to run */
%DO i = 1 %TO &num;
/* DO loop over the number of classes, from 1 to num */
PROC LCA DATA=libname.filename
/* Enter your library and filename, separated by . */
OUTPARAM=libname.parm_&i.Class;
/* Specify output file names; you will get 1 per number of classes run (e.g., parm_1Class, parm_2Class) */
TITLE "Unconditional RMLCA to determine number of classes in model of 27 repeats of binary smoking status with 200 starts and seed 314728";
NCLASS &i;
/* Specify the number of classes for this run (the current value of the loop index) */
ITEMS Smok1-Smok27;
/* Identify your RMLCA indicator variables; a dash can replace the full list when the variable names share a prefix and are numbered consecutively */
CATEGORIES 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2;
/* Specify the number of levels of each RMLCA indicator, in the same order as the ITEMS list; in our model, all indicators were binary (coded 1 or 2, must be non-zero). Indicators with more than two categories are allowed, but you are likely to run into problems with many repeats of multinomial indicators */
CORES 6;
/* Specify the number of processor cores to use for this analysis (check your hardware); using more cores will reduce computation time */
SEED 314728;
/* Specify a random number seed used to generate starting values for the rho estimates; different seeds can produce different solutions, some of which may be local rather than global maxima of the likelihood function */
NSTARTS 200;
/* Specify the number of sets of random starting values to run; you may want to start with 20 or 100, but it is important to increase this to 200 or even 1000 to make sure that your final models reach the true maximum of the likelihood, not just a local maximum, across a large number of starts */
RHO PRIOR = 1;
/* Set the prior for rho to 1 to stabilize the rho parameters and allow standard errors and confidence intervals to be generated */
RUN;
%END;
/* End loop. */
%MEND Num_class;
/* End macro. */
%Num_class (8);
/* Maximum number of classes to be tested. */
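When comparing the 1- through 8-class solutions, it can help to keep in mind how quickly the number of estimated parameters grows, since AIC and BIC penalize that count. The sketch below assumes the standard parameter count for an unrestricted latent class model: (number of classes - 1) class membership probabilities plus, for each class, one item-response probability per binary indicator. The data set name work.n_params and its variable names are hypothetical, chosen only for this illustration.

```sas
/* Parameter count for an unrestricted model with 27 binary indicators */
DATA work.n_params;
  n_items = 27;
  DO nclass = 1 TO 8;
    n_gamma  = nclass - 1;            /* class membership probabilities */
    n_rho    = nclass * n_items;      /* item-response probabilities (one per binary item per class) */
    n_params = n_gamma + n_rho;       /* total freely estimated parameters */
    OUTPUT;
  END;
RUN;

PROC PRINT DATA=work.n_params NOOBS;
  VAR nclass n_gamma n_rho n_params;
RUN;
```

Under these assumptions, each added class contributes 28 more parameters, so the 8-class model estimates 223 parameters from 27 binary items.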