
  • Creating a change variable

    Hi,

    I have data where two continuous variables are cognitive scores measured on the same individuals two years apart: 'cog1' and 'cog2'. I am interested in the difference between these cognitive scores as a continuous outcome measure. As the mean cog2 score was lower, I created a change variable using the syntax: gen cog_change = cog1-cog2. I've since been looking online and am wondering whether I should have set my data as panel data (xtset). However, none of my other variables are longitudinal; they are all cross-sectional.

    Could anyone provide advice on what is the correct syntax to use?

    Thank you.

  • #2
    Penny:
    If individuals (N) are measured on the same set of variables at (theoretically) equally spaced time intervals (T), you have a panel dataset, and your dataset should be -reshape-d to -long- format, with (at least) one observation per -panelid-.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      There are several possibilities here. Rather than write an endless tome accounting for all of them, I ask you to post back showing example data. That will make it evident which of the possibilities is your situation and let me give you a straight answer on how to proceed. When posting back, be sure to use the -dataex- command to show the example data. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

      Note: If when using -dataex- you encounter an "output line too long" error, try again, specifying a list of variables in the -dataex- command, which list should include any individual identifier variables, any date or time variables, cog1, cog2, and a small but representative subset of the remaining variables.
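
      As a rough illustration of the note above, a restricted call might look like the following. The variable names are the ones mentioned in this thread; the -in 1/10- range is only an assumed choice to keep the output short, and any identifier or date variable should be added to the list if the dataset has one.

```stata
* Show the first 10 observations of the variables relevant to the question.
* Add any identifier or date variables to this list if they exist.
dataex cog1 cog2 cog_change brain_volume cog_reserve in 1/10
```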

      When asking for help with code, always show example data. When showing example data, always use -dataex-.

      Added: Crossed with #2, which proposes the solution that would cover what is probably the most common situation here.



      • #4
        Working backwards, the point of reshape long is to allow whatever commands need that layout; in particular, you can't declare panel data using xtset or tsset unless different times are represented by different observations. If you don't need any such commands, then your dataset sounds to me fit for purpose as it stands.
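
        For completeness, here is a minimal sketch of what the wide-to-long step would involve if a panel command were ever needed. It uses three rows from the example data in #5; the identifier -id- and the time index -wave- are hypothetical names introduced only for illustration, since the posted data contain neither.

```stata
clear
input float(cog1 cog2 brain_volume)
.5147248 .4377843 1152.34
.6211652 1.0542531 1180.68
1.3962865 1.4861206 1076.75
end

* Hypothetical person identifier: one row per person in the wide layout
generate long id = _n

* cog1 and cog2 become a single variable cog, indexed by wave = 1, 2
reshape long cog, i(id) j(wave)

* Only now can the data be declared as a panel
xtset id wave
list, sepby(id)
```

        Here -wave- counts measurement occasions rather than calendar years. None of this is required for an ordinary -regress- on the change score.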



        • #5
          Below is the -dataex- output requested in reply #3.
          The syntax I want to be able to run is: regress cog_change c.brain_volume##c.cog_reserve

          Code:
          *Example generated by -dataex-. For more info, type help dataex
          clear
          input float(cog1 cog2 cog_change brain_volume cog_reserve)
          .5147248 .4377843 -.07694048 1152.34 101.6528
          .6211652 1.0542531 .43308794 1180.68 111.74068
          1.3962865 1.4861206 .0898341 1076.75 135.16483
          1.2264864 1.0481688 -.17831767 . 121.72263
          .50860894 .22750993 -.28109902 1072.81 104.8262
          .069486396 .4582633 .3887779 1037.12 88.29156
          -1.225916 -.8031045 .4228114 1007.51 94.7383
          -.0735328 .3350031 .4085359 1045.48 100.49938
          -.2452323 -.5658025 -.3205702 1187.67 .
          -1.612533 -1.3282038 .2843293 1052.7 91.69584
          end



          • #6
            As you show no participant identifier variable, I am assuming that each observation (row) in your data represents a different person. If so, you can run that command on this data without any modification. You do not need to (and cannot) -xtset- the data.
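
            A sketch of that, using the example data from #5 exactly as posted. One thing worth checking first is the sign convention: in the posted rows cog_change agrees with cog2 - cog1 (follow-up minus baseline) up to float rounding, not cog1 - cog2 as written in #1, so negative values indicate decline. The variable -check- is a hypothetical name used only for this comparison.

```stata
clear
input float(cog1 cog2 cog_change brain_volume cog_reserve)
.5147248 .4377843 -.07694048 1152.34 101.6528
.6211652 1.0542531 .43308794 1180.68 111.74068
1.3962865 1.4861206 .0898341 1076.75 135.16483
1.2264864 1.0481688 -.17831767 . 121.72263
.50860894 .22750993 -.28109902 1072.81 104.8262
.069486396 .4582633 .3887779 1037.12 88.29156
-1.225916 -.8031045 .4228114 1007.51 94.7383
-.0735328 .3350031 .4085359 1045.48 100.49938
-.2452323 -.5658025 -.3205702 1187.67 .
-1.612533 -1.3282038 .2843293 1052.7 91.69584
end

* Confirm the direction of the change score (follow-up minus baseline)
generate double check = cog2 - cog1
assert abs(check - cog_change) < 1e-4

* One row per person, so no -xtset- or -reshape- is needed
regress cog_change c.brain_volume##c.cog_reserve
```

            Observations with a missing brain_volume or cog_reserve are dropped from the estimation sample automatically.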

            An alternative analysis of this, which can also be done without modifying the data, is to have cog2 as the outcome variable and include cog1 as a covariate.
            Code:
            regress cog2 c.brain_volume##c.cog_reserve cog1
            For a discussion of the pros and cons of these two approaches, and specific information about the circumstances in which each is best, see Frank Harrell's https://www.fharrell.com/post/errmed/#change.
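
            The two specifications are closely related: the change-score model is the covariate-adjusted model with the coefficient on cog1 fixed at 1. A minimal sketch of one way to look at that, assuming the example data from #5 are already in memory and that change is defined as cog2 - cog1 (the variable -change- is a hypothetical name):

```stata
* Assumes cog1, cog2, brain_volume and cog_reserve are in memory (see #5).
generate double change = cog2 - cog1

* Covariate-adjusted model: the coefficient on cog1 is estimated freely
regress cog2 c.brain_volume##c.cog_reserve cog1

* Wald test of the restriction implied by the change-score model
test cog1 = 1

* Change score with baseline adjustment: same coefficients on the other
* terms as above; the coefficient on cog1 is the one above minus 1
regress change c.brain_volume##c.cog_reserve cog1
```

            If the estimated coefficient on cog1 is far from 1, the two approaches can give noticeably different answers for the other terms.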



            • #7
              Thank you so much for all of your advice and help. It is great news that the command I already ran works (by creating a change score between the baseline and follow-up measures). Thanks for the link to Frank Harrell's website; it is very interesting.
