  • Seeking help with unbalanced panel data analysis

    Hi, I have recently analyzed an unbalanced panel dataset. Since this is my first time working with such data, I have some questions and I hope you can help me.

    First of all, my data form a very unbalanced panel. Subjects were defined as those who participated in any of the 2nd, 3rd, 5th, and 7th waves of the survey (please see below). Each time a subject took the survey, the exposure variable (DHEA-S; continuous) and the outcome variable (fr_0_12; 0 or 1, dichotomous) were each measured once. I pooled all the data together and divided the exposure variable into tertiles (dhea_s_3cn; 1, 2, or 3, categorical). Can the random effects model analysis I performed with these data be called a longitudinal analysis?
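
    A minimal sketch of the setup described above, in Stata. The variable names id, wave, and dhea_s are assumptions (only dhea_s_3cn and fr_0_12 are named in the post), and the model actually fitted in the screenshots may differ:

    ```stata
    * Assumed names: id (subject), wave (survey wave), dhea_s (continuous exposure)
    xtset id wave

    * Tertiles computed on the pooled observations, as described in the post
    xtile dhea_s_3cn = dhea_s, nquantiles(3)

    * Random-effects logit of the outcome on exposure tertile, reported as odds ratios
    xtlogit fr_0_12 i.dhea_s_3cn, re or
    ```

    Because the tertile cut points come from the pooled data, a subject can fall into different tertiles at different waves.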
    [Attached image: 1.png]
    [Attached image: 2.png]


    Secondly, before proceeding to the formal analysis, what would you recommend in terms of data processing?

    Third, I analyzed the data using the following model. Do you see any problems with the results?
    I am interested in the association between dhea_s_3cn and fr_0_12.
    [Attached image: 3.png]
    [Attached image: 4.png]


    Finally, what would be a good way to characterize subjects by exposure group (dhea_s_3cn; 1, 2, or 3, categorical)? Because my data are unbalanced panel data, each subject has a different number of repeated measures of each characteristic. A general linear model or χ2 test is therefore not a good way to test between-group differences in characteristics across exposure groups.
    [Attached image: 5.png]


    Thank you all for your help.

  • #2
    Shu:
    1) Stata can handle both balanced and unbalanced panel datasets; therefore this is not an issue;
    2) that said, unbalancedness may have a bearing on the coefficients and their statistical significance (with the usual caveats about its relevance);
    3) data and results are what they are and we have to live with them (even when they disappoint us and/or let us down);
    4) I would stick with dhea_s_3cn; 1, 2, or 3, categorical.
    As an aside, please do not post screenshots, but use CODE delimiters to share what you typed and what Stata gave you back, as recommended by the FAQ. Thanks.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hi, Carlo

      Thank you for your comments.
      My apologies. This is my first time posting a question here, so I did not fully understand the posting rules. I will pay attention in the future.

      I always thought my analysis of the data was longitudinal, but some researchers consider it a cross-sectional analysis. May I ask what you think?

      Also, what method can be used to get the P-value in the last table? Since this involves repeated measurements, the test method labeled below the table may not be appropriate.

      Best,
      Shu



      • #4
        Shu:
        a panel data analysis implies that the same sample of units is measured on the same variable at equally spaced time intervals.
        Obviously, due to attrition the starting sample may decrease as the panel stretches over years.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Carlo:

          I have some basic understanding of panel data. However, among the participants I analyzed were both those who had taken multiple surveys and those who had taken just one survey. Among those who have taken multiple surveys, they may also have taken different waves of surveys (as seen in xtdescribe, patterns(1000)). So even if I specify the id and wave with xtset, my analysis cannot be considered longitudinal, right?
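
          The participation patterns mentioned here can be inspected with a sketch like the following. The names id and wave are the assumed identifiers passed to xtset, and n_waves and one_per_id are hypothetical helper variables:

          ```stata
          * Declare the panel structure; gaps between waves are allowed
          xtset id wave

          * List the participation patterns across waves
          xtdescribe, patterns(1000)

          * Count how many waves each subject contributes
          bysort id: generate n_waves = _N
          egen byte one_per_id = tag(id)
          tabulate n_waves if one_per_id
          ```

          The final tabulation shows how many subjects were observed once versus repeatedly, which is the information at stake in the longitudinal versus cross-sectional question.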

          Best,
          Shu



          • #6
            Shu:
            we should talk about data waves (not surveys, which are completely different beasts). As per -xtdescribe- output, your panel dataset stretches over 6 waves, whereas the maximum number of waves in which participants were observed is 4 (see the -xtdescribe- entry in the Stata .pdf manual for further details).
            Therefore, provided that the sample of participants observed through the panel waves is the same, I'd say that you have an unbalanced panel dataset.
            Ultimately, panel dataset and longitudinal study share the same meaning.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Carlo:

              Thank you very much for your reply. I can now confirm that my analysis is longitudinal (as I was confused by some researchers who pointed out earlier that my analysis was actually cross-sectional).

              I have one more question: for the last table (Table 1), what method do I need to use to obtain the correct p-values?

              Best,
              Shu



              • #8
                Shu:
                you ought to retrieve the p-values you're interested in directly from the regression output.
                As per footnote b, Table 1, the authors seemingly used something like -glm- but, since Stata provides you with an XT suite, I'd go -xtreg-.
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Carlo:

                  I used -mixed- for continuous variables, -xtlogit- for dichotomous variables, and -xtmlogit- for categorical variables. Are these appropriate?

                  Best,
                  Shu



                  • #10
                    Shu:
                    the selection of the estimator should be based on the type of regressand.
                    I was wrong in suggesting -xtreg-, as your dependent variable is categorical.
                    That said, the p-values are those reported in -xtlogit- outcome table.
                    Again, panel unbalancedness is not an issue.
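
                    A minimal sketch of how the three estimators mentioned in #9 might be used for the Table 1 comparisons, assuming each characteristic is the dependent variable and the exposure tertile is the predictor. The characteristic names bmi, smoker, and educ_cat are hypothetical, as are id and wave:

                    ```stata
                    * Hypothetical characteristics: bmi (continuous), smoker (0/1), educ_cat (categorical)
                    xtset id wave

                    * Continuous characteristic: linear mixed model with a random intercept per subject
                    mixed bmi i.dhea_s_3cn || id:
                    * Joint Wald test of the tertile indicators
                    testparm i.dhea_s_3cn

                    * Dichotomous characteristic: random-effects logit
                    xtlogit smoker i.dhea_s_3cn, re
                    testparm i.dhea_s_3cn

                    * Categorical characteristic: random-effects multinomial logit (Stata 17 or later)
                    xtmlogit educ_cat i.dhea_s_3cn
                    ```

                    The coefficient tables report a p-value for each tertile contrast against the base category; -testparm- adds a single joint p-value per characteristic, which may be closer to what a Table 1 column expects.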
                    Kind regards,
                    Carlo
                    (Stata 19.0)
