  • Interpretation of Cubic Spline interaction term. Can anyone help?

    Dear all, I have a question about using cubic splines in an interaction term within a linear regression model.

    In my case the outcome is mortality and I have only one predictor (province, variable "prov", 3 levels).

    I used restricted cubic spline functions of time (stub "time", 7 knots) in an interaction with province to model the different mortality trends over time in the 3 provinces.

    I'm having a hard time figuring out how to interpret the interaction coefficients.

    For example, my understanding is that, compared with time period 1 (the period before the first knot, at 18 days), the mortality of province 1 relative to the reference province decreases by 5.69.
    Is that correct?

    Thank you very much in advance!

    Code:
    prov#c.time2 |
              1  |  -5.692354   .3235686   -17.59   0.000    -6.326537   -5.058171
              2  |   .6234578   .3235686     1.93   0.054    -.0107251    1.257641
    The whole output below:

    Code:
    . mkspline time=period, cubic displayknots nknots(7)
    
                 |     knot1      knot2      knot3      knot4      knot5      knot6      knot7 
    -------------+-----------------------------------------------------------------------------
          period |        18    127.264        237        347        457    566.736        676 
    
    . mixed letpm ib3.prov##c.time*
    
    Mixed-effects ML regression                     Number of obs     =      2,079
                                                    Wald chi2(20)     =   30909.08
    Log likelihood = -6989.1022                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
           letpm | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            prov |
              1  |  -.0302228   1.790283    -0.02   0.987    -3.539114    3.478668
              2  |   3.961484   1.790283     2.21   0.027     .4525932    7.470375
                 |
           time1 |   .5075055    .017888    28.37   0.000     .4724457    .5425652
           time2 |  -8.232592   .2287976   -35.98   0.000    -8.681027   -7.784157
           time3 |   22.54454   .6930546    32.53   0.000     21.18618    23.90291
           time4 |  -21.72282   .9505297   -22.85   0.000    -23.58582   -19.85982
           time5 |   9.229315   .9854686     9.37   0.000     7.297832     11.1608
           time6 |  -2.798356   .9505318    -2.94   0.003    -4.661364    -.935348
                 |
    prov#c.time1 |
              1  |   .5111577   .0252974    20.21   0.000     .4615757    .5607396
              2  |   .0123322   .0252974     0.49   0.626    -.0372498    .0619142
                 |
    prov#c.time2 |
              1  |  -5.692354   .3235686   -17.59   0.000    -6.326537   -5.058171
              2  |   .6234578   .3235686     1.93   0.054    -.0107251    1.257641
                 |
    prov#c.time3 |
              1  |   12.81385   .9801272    13.07   0.000     10.89284    14.73487
              2  |  -2.354465   .9801272    -2.40   0.016     -4.27548   -.4334513
                 |
    prov#c.time4 |
              1  |  -7.144312   1.344252    -5.31   0.000    -9.778997   -4.509626
              2  |   3.172587   1.344252     2.36   0.018     .5379018    5.807273
                 |
    prov#c.time5 |
              1  |  -1.813644   1.393663    -1.30   0.193    -4.545173    .9178858
              2  |  -1.611369   1.393663    -1.16   0.248    -4.342899     1.12016
                 |
    prov#c.time6 |
              1  |   2.381864   1.344255     1.77   0.076    -.2528274    5.016555
              2  |  -.0992673   1.344255    -0.07   0.941    -2.733959    2.535424
                 |
           _cons |   29.61729   1.265921    23.40   0.000     27.13613    32.09845
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
                   var(Residual) |   48.69839   1.510436      45.82618    51.75062
    ------------------------------------------------------------------------------

  • #2
    In my opinion this is best done via graphs (and may only be possible using graphs); for an example, see my version of RCS in STB 10 (available at www.stata.com).


    • #3
      I agree with Rich that graphs (or -margins-, if Stata had a built-in way to recognize splines) are the only way to interpret the effects implied by restricted cubic splines (or indeed, any non-linear splines). You may also be interested in Maarten Buis' -postrcspline- package which will allow you to explore post-estimation results after using splines.
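
      A sketch of why the individual interaction coefficients are hard to read on their own, using the model from #1. It assumes the usual restricted cubic spline basis created by -mkspline, cubic-, in which time1 is period itself and time2 to time6 are nonlinear functions of period that are zero up to the first knot. The symbols below are illustrative notation, not Stata output. With province 3 as the reference, the fitted difference between province 1 and province 3 on day t is

      ```latex
      \Delta_1(t) \;=\; \beta_{1} \;+\; \sum_{j=1}^{6} \gamma_{1j}\,\mathrm{time}_j(t)
      ```

      Here \beta_1 is the coefficient on 1.prov (-0.030) and \gamma_{1j} is the coefficient on 1.prov#c.timej, so \gamma_{12} = -5.692. Under this assumption, -5.692 multiplies the basis function time2(t), which changes with t, rather than describing a constant shift in mortality after day 18. The difference at any given day depends on all the terms together, which is why a plot of the fitted curves is easier to read than the table.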


      • #4
        I haven't tried it with "mixed" regressions, but my package "f_able" (from SSC) also comes with its own spline functions.
        They allow you to estimate the effects of any nonlinear transformation, including splines, as long as the derivatives exist and the variable transformation is continuous.
        see https://www.stata.com/meeting/belgiu..._RiosAvila.pdf
        and https://journals.sagepub.com/doi/pdf...6867X211000005
        Fernando


        • #5
          Dear FernandoRios, Leonardo Guizzetti and Rich Goldstein, thanks for your suggestions.

          Since I have to write a paper: if I understand you correctly, you suggest reporting the model shown in my post without commenting on the coefficients.
          Instead, it is better to comment only on the margins, with any pairwise comparisons between the three provinces.

          Is that correct?

          Since the study is exploratory (there are no pre-established hypotheses), do you think the margins can be calculated at the spline knots?

          Thanks a lot again,
          I wish you a good day.
          Gianfranco


          • #6
            This is my opinion and others may have a different view. You can report the model, but when it comes to describing the model, using some kind of plot of margins would be helpful to understand what is actually happening. Margins should be calculated over the range of values of the variable that has been transformed so that you can show the nonlinear relationship being modeled.
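
            A minimal sketch of what "margins over the range" could look like for the model in #1. The names letpm, prov and period come from that post; it is assumed that period is an integer day counter and that the time1 to time6 variables do not exist yet. With one margin per day and province, the -margins- call may take a while to run.

            ```stata
            // Sketch only: letpm, prov and period are the names used in #1;
            // period is assumed to hold nonnegative integer days.
            mkspline time = period, cubic nknots(7)
            mixed letpm ib3.prov##c.(time1-time6)

            // Predicted mortality for every observed day within each province
            margins, over(period prov)

            // One fitted curve with a confidence band per province
            marginsplot, recastci(rarea) ciopts(astyle(ci)) ///
                         plotopts(msymbol(i)) ///
                         ylabel(, angle(0)) ytitle(predicted letpm)
            ```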

            You may also wish to consider showing some support that splines better fit your data than simple relationships as a reviewer could easily ask you about it.
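
            One possible way to provide that support, sketched under the assumption that the variables from #1 (letpm, prov, period, time1 to time6) are in memory. Because time1 equals period, the straight-line model is nested in the spline model, so a likelihood-ratio test is available when both are fitted by ML (the default for -mixed-). The stored names "linear" and "spline" are arbitrary.

            ```stata
            // Sketch only: variable names are taken from #1.
            // Straight-line time trend for each province
            mixed letpm ib3.prov##c.period
            estimates store linear

            // Restricted cubic spline trend for each province
            mixed letpm ib3.prov##c.(time1-time6)
            estimates store spline

            // Likelihood-ratio test of the linear against the spline model
            lrtest linear spline

            // AIC and BIC for both models side by side
            estimates stats linear spline

            // Wald test of the nonlinear province-by-time terms only
            testparm i.prov#c.time2 i.prov#c.time3 i.prov#c.time4 ///
                     i.prov#c.time5 i.prov#c.time6
            ```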


            • #7
              I would do this graphically. Here is an example. As a bonus, it shows how to have both the smoothness of a cubic spline and still include discrete events like the Spanish flu pandemic.

              Code:
              // general settings
              clear all
              set scheme s1color
              
              // open and prepare example dataset
              sysuse uslifeexp
              
              keep year le_wmale le_wfemale le_bmale le_bfemale
              reshape long le_, i(year) j(gender_race) string
              rename le_ le
              gen byte group:group_lb = 1 if gender_race == "bfemale"
              replace  group          = 2 if gender_race == "wfemale"
              replace  group          = 3 if gender_race == "bmale"
              replace  group          = 4 if gender_race == "wmale"
              
              label define group_lb 1 "black female" ///
                                    2 "white female" ///
                                    3 "black male"   ///
                                    4 "white male"
                                    
              gen byte spanish_flu = (year == 1918)                      
              
              // make the spline
              mkspline spyear = year, cubic nknots(5)
              
              reg le (c.spyear1 c.spyear2 c.spyear3 c.spyear4 i.spanish_flu)##i.group
              
              // graph the results
              margins , over(year group)
              marginsplot, recastci(rarea) ciopts(astyle(ci)) ///
                           plotopts(msymbol(i)) ///
                           ylabel(,angle(0)) ytitle(life expectancy)
              [Image: Graph.png]
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------


              • #8
                Dear Maarten Buis, thanks so much for your advice. I will try it.

                My data (as I said, it's a time series stratified by three geographic areas, namely three Italian provinces) are shown below.
                What I want from the margins is to know whether the difference between the three provinces is statistically significant at certain timepoints.
                [Image: Immagine1.png]
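
                A hedged sketch of how pairwise province comparisons at chosen days might be obtained with -margins-. It assumes the spline model from #1 is the active estimation result and that period takes integer values that include the days listed. The four days are only illustrative: they are the integer-valued knots displayed in #1 (the knots at 127.264 and 566.736 do not correspond to whole days).

                ```stata
                // Sketch only: run after
                //   mixed letpm ib3.prov##c.time*
                // The days below are illustrative choices, not recommendations.
                foreach t of numlist 18 237 457 676 {
                    display as text _newline "Day `t'"
                    // Predicted mortality per province on day `t', with all
                    // pairwise differences, z tests and confidence intervals
                    margins prov if period == `t', pwcompare(effects)
                }
                ```

                Since this produces several comparisons per day, an adjustment for multiple comparisons, for example adding mcompare(bonferroni) to the -margins- call, may be worth considering.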

                I also wonder whether a linear regression model with splines is appropriate.
                After all, these are correlated data, and they do not meet the independence assumption of linear regression.

                Maybe a model with an autoregressive correlation structure would be better?
                Something like this:
                mixed letpm ib3.prov##c.time* || prov:, residuals(ar 1, t(day))

                Thanks again for helping.
                Gianfranco.
