  • Predict using data from two Datasets

    Dear colleagues,

    I was wondering if anyone can help me with this one.

    I have two separate datasets: one with patient data ("patients.dta") and one with the intercept and slope for 100 linear models ("estimates.dta") derived from another (independent) dataset.
    This is what the datasets look like:

    patients.dta
    id  x   y   z   etc. (n=100)
    aa  12  14  15
    bb  14  15  15
    cc  11  13  14
    etc.

    estimates.dta
    variable  intercept  slope
    x         1          2
    y         1          1.5
    z         2          2
    etc. (n=100)

    I would like to generate the predictions for each variable in the patients dataset using the values from the estimates dataset:

    id  x   y   z   etc.  prediction_x  prediction_y  prediction_z  prediction_etc
    aa  12  14  15
    bb  14  15  15
    cc  11  13  14
    etc.

    Any suggestions?

    Thanks a lot!
    J

  • #2
    The information provided in this post is insufficient. Your estimates.dta file contains an intercept and slope for variables x, y, and z, but there is nothing that says what variable serves as the predictor here. In addition, whatever that variable is, it appears not to be included in the patients.dta data set.

    Please clarify.

    Also, when reposting with clarification, please repost the data examples using the -dataex- command, so that if somebody wants to help you, that person can readily import the data into Stata. If you are not yet familiar with -dataex-, run -ssc install dataex- and then run -help dataex- to learn how to use it. Use -dataex- whenever you show example data on this Forum.

    • #3
      Thanks, Clyde. Sorry for not being clear.
      The variables that serve as predictors are x, y, z, and so on from my patients dataset.
      That is, for example, I'd like to generate predictions for patient "aa" as follows:

      pred_x = 1 + 2*(12)
      pred_y = 1 + 1.5*(14)
      pred_z = 2 + 2*(15)

      and so on for all ids and all variables (x, y, z, ..., n=100).

      Here are the data examples using dataex:

      patients:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str4 id byte(x y z)
      "aa" 12 14 15
      "bb" 14 15 15
      "cc" 11 13 14
      end

      estimates:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str1 variable byte intercept float slope
      "x" 1   2
      "y" 1 1.5
      "z" 2   2
      end

      • #4
        Got it. The trick, as is so often the case, is to go to long layout. Then it's easy. In order to -reshape long-, you first have to give the variables x, y, and z a common prefix (the stub) that stays behind as the name of the new long variable.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str4 id byte(x y z)
        "aa" 12 14 15
        "bb" 14 15 15
        "cc" 11 13 14
        end
        tempfile patients
        save `patients'
        
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str1 variable byte intercept float slope
        "x" 1   2
        "y" 1 1.5
        "z" 2   2
        end
        tempfile estimates
        save `estimates'
        
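        * Rename x, y, z to valuex, valuey, valuez so -reshape- has a common stub
        use `patients', clear
        rename (x y z) value=
        reshape long value, i(id) j(variable) string
        
        * Attach each variable's intercept and slope; every observation must match
        merge m:1 variable using `estimates', assert(match) nogenerate
        
        gen predict_ = intercept + slope * value
        
        * Back to one observation per id, with x, y, z and predict_x, predict_y, predict_z
        drop intercept slope
        reshape wide value predict_, i(id) j(variable) string
        rename value* *
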
        Note: I have done the -reshape wide- to bring you back to your original layout as you asked for. But it is likely that whatever you are going to do with this data next, it will be easier to do if you keep it long. Most Stata commands are most effectively used with long data. So think seriously about skipping the -reshape wide- and subsequent -rename- commands.
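
        For illustration, here is a minimal sketch of the long-layout variant, which stops before the -reshape wide-. It assumes the two files are saved in the current working directory as patients.dta and estimates.dta (the names given in #1) with the variables shown in #3; the variable name prediction is a placeholder. The -assert- lines only spot-check the three hand calculations from #3 for patient "aa".

        ```stata
        * Sketch only: assumes patients.dta and estimates.dta exist in the
        * working directory with the variables shown in #3
        use patients, clear
        rename (x y z) value=
        reshape long value, i(id) j(variable) string

        * One observation per id-variable pair; bring in intercept and slope
        merge m:1 variable using estimates, assert(match) nogenerate

        * "prediction" is a placeholder name
        gen prediction = intercept + slope * value

        * Spot-check against the hand calculations in #3 for patient "aa"
        assert prediction == 25 if id == "aa" & variable == "x"
        assert prediction == 22 if id == "aa" & variable == "y"
        assert prediction == 32 if id == "aa" & variable == "z"

        sort id variable
        list id variable value prediction, sepby(id)
        ```

        With the full data, the -rename- line would need to list all 100 variables, and the spot-checks would have to be adapted to the real values.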

        • #5
          It worked great! Now I see what you mean by the reshape issues...
          Thank you very much.
