  • scatter plot with line of best fit

    Dear Stata users,

    I am interested in investigating the association between two measurements: baseline albuminuria (UACR) and eGFR change. To do this, I am trying to fit a cubic spline and create a scatter plot of baseline UACR against eGFR change, with a line of best fit and its 95% CI, adjusted for other covariates.

    My code is below, but I am not sure whether I am heading in the right direction.

    Code:
    by ID (Years_from_baseline), sort: gen gfr_change = gfr - gfr[1]  
    gen log_uacr=log(uacr)
    mkspline uacr_spline = log_uacr, nknots(5) cubic
    regress gfr_change uacr_spline* i.gender age sbp  
    predict gfr_change_predicted, xb
    by log_uacr, sort: egen gfr_change_margin = mean(gfr_change_predicted)
    
    * ??? not sure about this last step
    twoway (scatter gfr_change_margin log_uacr, msymbol(.) msize(tiny))

    Would anyone be able to help?

    Many thanks,
    Oyun

  • #2
    This will give you a "scatter plot" that is really more like the points on a curve. Each point will show the average predicted gfr change corresponding to each level of log_uacr in your data. It seems a reasonable approach. The main drawback to this approach is that the particular observations having different values of log_uacr will likely also differ in their distributions of gender, age, and sbp. Consequently the points on this graph are not fully adjusted for the other variables. It is a bit like using -margins, over()-. It's a legal thing to do, but you have to realize that you are not fully adjusting and are really almost doing a separate analysis for each value of log_uacr.

    A better, but more complicated, approach that fully adjusts for age, gender, and sbp involves calculating the values of each of the uacr_spline* variables at each of the values of log_uacr (or at enough of them to flesh out an adequate curve; you don't want to use too many). Then you use those values in the -at()- option of a -margins- command. The technique is illustrated in the following code using the built-in auto.dta dataset.

    Code:
    sysuse auto, clear
    
    mkspline mpg_ = mpg, nknots(5) cubic
    
    //    BUILD UP THE AT() OPTIONS FOR LATER USE WITH -margins-
    preserve
    keep mpg*
    sort mpg_1
    duplicates drop
    local ats
    forvalues i = 1/`=_N' {
        local atspec
        forvalues j = 1/4 {
            local atspec `atspec' mpg_`j' = `=mpg_`j'[`i']'
        }
        local ats `ats' at(`atspec')
    }
    restore
    display `"`ats'"'
    
    regress price mpg_* i.foreign displacement
    tempfile margins_results
    margins, `ats' nose saving(`margins_results')
    use `margins_results', clear
    graph twoway line _margin _at1
    Evidently you will need to adapt this code to use your own variables. You can also, after you -use `margins_results'-, rename the variables or change the variable labeling to get the axis titles that you want. Note that the use of _at1 as the variable for the horizontal axis comes from the fact that when you make a cubic spline, the first spline variable is equal to the original variable.

    The above code will give you the kind of graph you would get with -marginsplot-, were it possible to simply use -marginsplot- with a cubic spline.
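
    For what it is worth, here is a rough, untested sketch of how the same technique might be adapted to the variables in #1, this time dropping -nose- so that -margins- also saves confidence limits for a 95% CI band. The variable names are taken from #1. The thinning to roughly 25 points is my own assumption for keeping the number of -at()- options manageable, and the axis titles are placeholders. I am also assuming that _at1 holds the first spline variable, as in the auto.dta example, so it is worth running -describe- on the saved results to confirm.

    ```stata
    * Sketch only: variable names come from post #1; adjust to your data.
    by ID (Years_from_baseline), sort: gen gfr_change = gfr - gfr[1]
    gen log_uacr = log(uacr)
    mkspline uacr_spline = log_uacr, nknots(5) cubic

    * Build the at() options from the distinct spline values
    preserve
    keep uacr_spline*
    drop if missing(uacr_spline1)
    sort uacr_spline1
    duplicates drop
    * Assumption: about 25 points are enough to draw the curve
    local step = ceil(_N/25)
    keep if mod(_n - 1, `step') == 0
    local ats
    forvalues i = 1/`=_N' {
        local atspec
        forvalues j = 1/4 {    // nknots(5) creates 4 spline variables
            local atspec `atspec' uacr_spline`j' = `=uacr_spline`j'[`i']'
        }
        local ats `ats' at(`atspec')
    }
    restore

    regress gfr_change uacr_spline* i.gender age sbp
    tempfile margins_results
    * No -nose- here, so standard errors and confidence limits are saved
    margins, `ats' saving(`margins_results')
    use `margins_results', clear
    describe    // confirm which _at# variable holds uacr_spline1

    * Shaded 95% CI band with the adjusted prediction drawn on top
    twoway (rarea _ci_lb _ci_ub _at1, sort color(gs13)) ///
           (line _margin _at1, sort), ///
           xtitle("log(UACR)") ytitle("Adjusted eGFR change") legend(off)
    ```

    Because uacr_spline1 is identical to log_uacr, the horizontal axis is on the log scale. Computing standard errors makes -margins- slower than with -nose-, which is another reason to keep the number of -at()- points modest.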



    • #3
      As always, thank you so much for your help, Prof. Schechter. As suggested, I will try to adapt the code to my dataset.

      Oyun
