Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Graphing interaction between a categorical variable and restricted cubic spline

    Dear Statalist,

    I am having trouble producing an interaction plot between a (restricted cubic) spline variable and a nominal variable. I am using Stata 14.2 for Mac.

    My model is a poisson regression with robust standard errors producing relative risks. The dependent variable is non-receipt of surgery for breast cancer (no_surg). The independent variables are hormone receptor status (hr_status), which is a nominal variable with 3 levels and comorbidity severity score (c3), which is a restricted cubic spline with 3 knots. I ran a model with the main effects and interaction. I want to produce an interaction plot showing how the relationship between comorbidity severity and non-receipt of surgery varies by hormone receptor status. I would like to plot the range of C3 values from the 5th to the 95% centile, which is approximately 0 to 3 and includes non-integer values.

    I have explored using margins however this is problematic as margins does not recognise that the two variables that make up the spline belong together.

    Code:
    // Create a restricted cubic spline for C3
    mkspline c3spline = c3, knots(0 0.1 2.8) cubic
    
    // Estimate the regression model with main effects & interaction
    qui poisson no_surg c.c3spline*##hr_status, vce(robust) irr
    
    // Obtain predictions
    qui margins, at(c3=(0(0.1)3)) over(hr_status) expression(exp(xb()))
    
    // Error message
    c3 ambiguous abbreviation
    r(111);
    I then trialled an approach suggested by Maarten Buis1 and mentioned in a previous Statalist post (https://www.statalist.org/forums/for...-cubic-splines). This uses margins to graph predictions for a continuous variable (fixed at various functions of the mean and standard deviation) against a linear spline. The issue I had with this was that C3 contains non-integer values.

    Code:
    // Obtain predictions
    qui margins, at(hr_status=1 hr_status=2 hr_status=3) over(c3) expression(exp(xb()))
    
    // Error message
    only integer values are allowed in by() variables
    r(198);
    I also tried looking at the interaction between hormone receptor status and age (also a restricted cubic spline with 3 knots) using the same approach thinking this would work given age has only integers. However this produced a graph opposite to what I am after, that is, with hormone receptor status on the x axis and predictions for every value of age. I figure this is because I am using a categorical variable rather than continuous as was used in the example.

    Code:
    // Create a restricted cubic spline for age
    mkspline agespline = age, knots(38 57 83) cubic
    
    // Estimate the regression model with main effects & interaction
    qui poisson no_surg c.age*##hr_status, vce(robust) irr
    
    // Obtain predictions
    qui margins, at(hr_status=1 hr_status=2 hr_status=3) over(age) expression(exp(xb()))
    
    // Plot predictions
    marginsplot, noci
    Any advice on this problem would be greatly appreciated.

    Reference
    Buis, M. Additional examples for: Stata tip #: Exploring model consequences by plotting different predictions (Translated to Stata > 12). Cited [2018 Mar 25]. Available from http://maartenbuis.nl/wp/inter_quadr...quadr_new.html.

  • #2
    Unfortunately, this is complicated. You cannot use an -at(c3 = anything)- option in your -margins- command because c3 itself is not part of the regression model. You have to phrase it in terms of the values of the c3spline* variables that correspond to the values of c3 you are interested in. This means that you have to have Stata calculate those values, and then you have to write out a series of -at()- options, each specifying the values of the c3spline variables corresponding to a single value of c3.

    Regarding the age interaction graph not coming out as desired, -marginsplot- has an -xdimension()- option that allows you to specify which variable you want on the horizontal axis.

    Comment


    • #3
      Code:
      // prepare data (make sensible units)
      sysuse auto, clear
      gen wtonne = 0.000453592*weight
      label var wtonne "weight in metric tons"
      gen price1000 = price/1000
      label var price1000 "price in 1000 $s"
      
      // make cubic spline
      mkspline wei=wtonne, cubic nknots(3)
      
      // estimate model
      poisson price1000 foreign##(c.wei1 c.wei2) i.rep78, irr vce(robust)
      
      // we are changing the data in a way we don't want to store
      preserve
      
      // fix rep78 at 3
      replace rep78 = 3
      
      // predict the price
      predict pricehat , n
      
      // create a predicted price for every value of foreign
      // with nice labels
      separate pricehat, by(foreign) veryshortlabel
      
      // create the graph
      twoway line pricehat? wtonne, sort ///
         ytitle(price in 1000s dollars)  ///
         note(for cars with an average repair status)
         
      // undo the changes to the data
      restore
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------

      Comment


      • #4
        Thank you Clyde and Maarten for your very helpful suggestions.

        I have a couple of queries following from this.

        My example above was a simplification of my actual model which contains other covariates and interactions. Thank you Maarten for explaining how to fix the covariates at desired representative values. One of my covariates however is also a splined variable (age - a RCS with 3 knots - variable called agespline). I'd like to fix age at 60 years (approximately the median) but I'm not sure how to go about this given that the spline is made up of 2 variables. Fixing age itself at 60 results in an unusable graph (see attached). My model also contains another interaction, between C3spline and tumour grade (grade), a 3 level ordinal variable. I will fix grade at its' base category (1) but do I need to address the interaction term at all?

        Also I would be very grateful if you could clarify the meaning of the Y axis in the context of a Poisson model for a binary outcome - I've got myself very confused.

        Code:
        // Make cubic splines
        mkspline c3spline = c3, knots (0 0.1 2.8) cubic
        mkspline agespline = age, knots (38 57 83) cubic
        
        // Estimate model
        poisson no_surg c3spline* agespline* i.grade i.hr_status c.c3spline#hr_status c.c3spline#grade, vce(robust) irr 
        
        preserve
        
        // Fix covariates
        replace grade = 1
        replace age = 60                       // ??
        
        // Predict incidence rate
        predict no_surghat, ir
        
        // Create predicted incidence rate for every value of hr_status
        seperate no_surghat, by(hr_status) veryshortlabel
        
        // Create graph
        twoway line no_surghat? c3, sort, if inrange(c3,0,3)
        Attached Files

        Comment

        Working...
        X