  • Is it possible to analyze 4 categorical dependent variables simultaneously? - Panel data

    Dear Statalist Forum,
    I have panel data, where individuals over 4 rounds (representing time) make 4 decisions, coded as 1 if the correct decision was made, 0 otherwise.

    I would like to simultaneously analyze these four categorical decisions as dependent variables.

    I have declared the panel structure with xtset and fitted models with xtreg. I cluster standard errors at the individual level and use random effects, because allocation to treatment or control is fixed (in a balanced way) at the beginning of the experiment and does not change.

    So far, I only managed to analyze each decision on its own.

    Previous posts suggested ologit, probit, and logit models, but these allow for only one dependent variable, and I would really like to analyze the four decisions per round at the same time.

    Multiple logistic regression (logistic) does not seem to work either, as it handles only one dependent variable with multiple independent ones.

    Instead of having 4 dependent categorical variables, do you think having one dependent variable (still categorical) that combines the possible decisions is better (below is a table for better understanding)? Would this change my analysis as compared to having four dependent variables?
    Decision 1   Decision 2   Decision 3   Decision 4   Overall code (dep. var.)
    1            1            1            1            1
    1            1            1            0            2
    1            1            0            0            3
    1            0            0            0            4
    0            0            0            0            5
    0            1            1            1            6
    0            0            1            1            7
    0            0            0            1            8
    1            0            1            0            9
    0            1            0            1            10
    Thank you very much in advance and kind regards!

  • #2
    It's not clear just what the objective of your analysis is, but depending upon that, you could consider anything from Item Response Theory (help irt) to reshaping long and using a categorical predictor for the four decisions (and its interaction with another categorical variable for the four rounds) with, say, melogit. In between those two, you could set up four equations, one for each decision, in gsem.
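
    As a rough sketch of the reshape-plus-melogit route (one possible specification, not a definitive one): the variable names id, round, decision1-decision4, treat, pastsuccess, risk and female are hypothetical, and the data are assumed to hold one row per participant per round with the four decisions in separate columns.

    ```stata
    * Assumed layout: one row per id-round, outcomes in decision1-decision4
    * (all variable names here are hypothetical)
    reshape long decision, i(id round) j(item)

    * Single binary outcome: item (which decision) and round enter as
    * categorical predictors with their interaction, plus a random
    * intercept for participant
    melogit decision i.item##i.round i.treat c.pastsuccess i.risk i.female ///
        || id:, vce(cluster id)
    ```

    The i.item##i.round terms let the success rate differ by decision and by round. Whether the other covariates should also be interacted with item depends on the research question.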

    • #3
      Hello Joseph, Thank you very much for your suggestions! The objective is to see how individuals decide on each of the four decisions in each round, and how my independent variables affect those decisions. Previously, I looked at each decision separately per round but would like deeper insights.

      I'm unsure whether the interaction effects between the dependent variables would truly capture the same thing as having all four dependent categorical variables on their own.

      My independent variables are: treatment or control group (categorical); past success rate of correct decisions in a previous experiment (continuous, in %); risk (categorical); and gender of the participant (categorical).

      I have already reshaped the data to the long format, such that one ID (or personal unique identifier) spans 4 rows, where each row holds the four decisions made in that round plus one aggregate (equal to 1 if and only if all individual decisions in that round were correct).

      Why would I need a predictor variable if I already have the answer if the correct decision was made in that round?

      It seems like you think it's not possible to have all four categorical decisions as the dependent variables and have risk, TG/CG, gender, etc. as the independent variables. Is this correct?

      If I understand it correctly, you suggest to have four independent equations, where the decision is always the dependent variable, have the same independent variables and then model all 4 together using gsem?

      Thank you so much!

      • #4
        Originally posted by Mary Burckhette
        Why would I need a predictor variable if I already have the answer if the correct decision was made in that round?
        It's to distinguish the four decisions in the round. I assume that the four decisions are distinguishable on some characteristic, for example, temporal sequence or level of difficulty. But if you consider the four decisions within the round to be exchangeable, then there is no need for a variable to distinguish them. In that case, you could consider them a binomial outcome of four Bernoulli trials and just sum the successes for each round for each participant.
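
        If the four decisions are treated as exchangeable, the summing step and a binomial model might be sketched as follows. The variable names (id, round, decision1-decision4, nsuccess, treat, pastsuccess, risk, female) are hypothetical, and the data are assumed to hold one row per participant per round.

        ```stata
        * Count correct decisions per participant-round (0 to 4).
        * Note: rowtotal() treats missing values as 0 by default.
        egen byte nsuccess = rowtotal(decision1 decision2 decision3 decision4)
        assert inrange(nsuccess, 0, 4)

        * Binomial outcome with 4 trials per round and a random
        * intercept for participant
        melogit nsuccess i.round i.treat c.pastsuccess i.risk i.female ///
            || id:, binomial(4)

        * A population-averaged alternative via GEE
        xtset id round
        xtgee nsuccess i.round i.treat c.pastsuccess i.risk i.female, ///
            family(binomial 4) link(logit) corr(exchangeable) vce(robust)
        ```

        If any decisions are missing, the denominator of 4 would no longer be right for those rounds, and a variable holding the number of non-missing decisions could be supplied instead of the constant.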

        It seems like you think it's not possible to have all four categorical decisions as the dependent variables and have risk, TG/CG, gender, etc. as the independent variables. Is this correct?
        If you're not using gsem, then you'll need to reshape long once more, because estimation commands such as xtlogit, melogit and xtgee can take only a single outcome variable. (The latter two can treat the outcome as binomial if you consider the four decisions within a round to be exchangeable: melogit through its binomial() option and xtgee through family(binomial #).)

        If I understand it correctly, you suggest to have four independent equations, where the decision is always the dependent variable, have the same independent variables and then model all 4 together using gsem?
        That's one approach, but it is more involved and it might not be necessary to attain your objective. You can really go to town modeling these things, with the round-within-participants having its own variance component or even longitudinally with autocorrelation, but your data might not be sufficient to fit these complicated models with all of their additional parameters. Even if you can, it might be overkill if all you want to do is explore potential relationships between participant characteristics, treatment condition and rate of success in making decisions.
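
        For reference, a minimal sketch of the four-equation gsem setup, assuming one row per participant per round with the decisions in separate columns. All variable names are hypothetical, and M1[id] is a participant-level random intercept that enters all four equations.

        ```stata
        * Four logit equations, one per decision, with the same predictors
        * and a shared participant-level latent variable M1[id]
        gsem (decision1 decision2 decision3 decision4 <-      ///
              i.round i.treat c.pastsuccess i.risk i.female   ///
              M1[id]), logit
        ```

        Each decision gets its own set of coefficients here, so the model has roughly four times as many parameters as the single-outcome approaches, which is part of why it may be more than the data can support.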

        • #5
          Thank you very much Joseph! You have been most helpful!
