This required a lot of changes but I made it. The project is attached.
Many things to learn from it
1: To avoid that the cumulative probabilities will be < than the marginal(p54 must be >p5 for example), you need to parametrize differently as I did (see the code)
2: It is not a must but I used a trick to avoid that the optimizer would randomly get the cumulative P54 for example to be less than P5 (see the code below)
e1=(e1<(-10)?(-10):e1)
e2=(e2<(-10+0.000001)?( -10+0.000001 ):e2)
e3=(e3<(-10+0.000002)?( -10+0.000002 ):e3)
e4=(e4<(-10+0.000003)?( -10+0.000003 ):e4)
If you use reasonable initial estimates this code is not necessary
Now you can see that CE go up to 140. Therefore you need to put for the product th6*Ce a value that is not too big (usually 0.1 to 1 is good). Therefore I put th6=0.001 to have the maximum product of 0.001 *140~0.2
Same trick for th5
I then completed the code for multi statement and LL statement and tried 2 completely different initial estimates. It converged immediately and no issue.
Note also I changed the scores that were for -1 to 3 to 0 to 4. This is needed when you use the multi statement (not the LL statement).
Thank you Simon and Serge for your insightful guidance.
I realized that the code should have cumulative probabilities, the probabilities of transition from one score to the other over time.
Also, the role of optimizer is important to converge the model.
Parameterizing coefficients is important and I lack knowledge in that area; are there any literature to follow to have some basic idea?
I know all MLL estimations are computer-intensive and can be done through MsExcel too but definitely a literature/slidedeck/lecture would have been helpful.
I was thinking as how it is different than hidden markov model (HMM), as both has probability of phase transitions accounted.
The data I used has pain relief scores rising from the baseline with max.score of 4 (no pain) and then transitions to 3 or 2 (mild to moderate pain) and even 1 (severe pain).
Any way the group would like to add to the knowledge of modelling this data in the form of HMM?
I wanted to simulate PR (pain relief) scores of a different formulation (with different “Ce” values coming from sigmoidal Emax model) from the logit-modeled data we earlier used.
The problem is:
I cannot internally validate the same formulation (with same “Ce” values) with which the model was built. A change in PR score over time for this formulation is not reflecting on simulation.
While simulating I found that only one of the PR scores is found to be dominating throughout the overall timeframe of 24 hours.
That means the model is misspecified or there’s a problem in the code, which I tried to change with initial estimates and difference in coding but could not breakthrough.
Tried accommodating the changing pattern of categorical scores but with no luck, maybe the method is not correct.
Then, I looked into the earlier data and found the PID scores’ frequency in terms of score category is: 4<3<0<2<1 (More 1’s than 2’s than 0’s and so on…).
The PR scores’ frequency in terms of score category is: 3<4<2<0<1.
I assume this difference in PR scores has a bearing on initial estimates, on thetas. Tried changing the initial estimates but of no use.
I doubt the transition of probabilities from low to high scores and vice versa over time is not coded properly.
I have attached the project with the same coding used for PID (pain intensity difference) though (not reflecting the exercises I did) with the new PR score data.