  • Graph of predicted probabilities after xtset

    Hello everyone,

    I have panel data for two cities and I would like to present the predicted probabilities of my dependent variable (working from home) by observation age. I managed to do it (I think) for each city separately using the following code:

    xtset id _t
    xtlogit home age_observation if city==1, vce(cluster id)
    margins, predict(pr) at(age_observation=(15(1)60))
    marginsplot, noci ///
        xlabel(15(5)60)

    I can do the same for the other city but I would like to have the distributions of the predicted probabilities for both cities on the same graph.
    Any idea? Thanks.
    Anne

  • #2
    Originally posted by Anne Calves
    I have panel data for two cities and I would like to present the predicted probabilities of my dependent variable (working from home) by observation age. I managed to do it (I think) for each city separately . . . I can do the same for the other city but I would like to have the distributions of the predicted probabilities for both cities on the same graph. Any idea?
    Fit an omnibus logistic regression model to data for both cities at once. Use an interaction term to separate slopes for age by city. See an example below for how. (Begin at the "Begin here" comment. Artificial dataset for illustration constructed above that. Variable names kept short for clarity. I've introduced a quadratic term for age for interest.)
    Code:
    version 19
    
    clear *
    
    // seedem
    set seed 954165096
    
    quietly set obs 250
    generate `c(obs_t)' pid = _n // Survey Participant ID
    generate double pid_u = rnormal()
    generate byte age0 = runiformint(5, 60)
    
    generate byte cid = mod(_n, 2) // City ID
    
    quietly expand 2
    bysort pid: generate byte age = age0 + runiformint(0, 3)
    
    generate byte out = rbinomial(1, invlogit(pid_u))
    
    *
    * Begin here
    *
    xtlogit out i.cid##c.age##c.age, i(pid) nolog
    margins cid, at(age=(15(5)60)) predict(pr)
    #delimit ;
    marginsplot ,
        xdimension(age) plotdimension(cid)
        title("") scheme(s2color)
        plotopts(mcolor(black) lcolor(black) mfcolor(white) msize(medium))
            plot1opts(msymbol(O)) plot2opts(msymbol(S))
        noci
        xtitle(Participant Age (y)) xlabel(15(15)60)
        ytitle(Proportion Working from Home)
            ylabel( , format(%04.2f) angle(horizontal) nogrid)
        legend(off);
    #delimit cr
    
    exit
    Last edited by Joseph Coveney; 25 Jun 2025, 22:28.



    • #3
      Thanks Joseph!
      It took forever for my computer to run the margins command, and then I got an error message: "invalid syntax".
      My syntax is:

      xtset id _t
      xtlogit domicile age_observation##city, nolog
      margins city, at(age_observation=(15(5)60)) predict(pr)


      Is it because I'm using version 16?



      • #4
        Originally posted by Anne Calves
        It took forever for my computer to run the margins command and then I got an error message: "invalid syntax" . . . Is it because I'm using version 16?
        No, probably it's because when you do not specify the nature of the variables in an interaction term, Stata assumes by default that both are categorical. If age_observation has many distinct values, then it will take forever and will probably overload things.

        Try this instead.
        Code:
        xtlogit domicile c.age_observation##i.city, i(id) nolog
        margins city, at(age_observation=(15(5)60)) predict(pr)
        See the help file for factor variables for further advice.
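
        To see why the unprefixed interaction is so expensive, the following minimal sketch counts the terms that each specification expands to. It assumes the union example dataset that ships with Stata (not the poster's data), with age and south standing in for age_observation and city; the local macro names n_cat and n_cont are illustrative.

```stata
webuse union, clear

// No prefix: Stata reads age##south as i.age##i.south, so every
// distinct age gets its own indicator plus its own interaction term.
fvexpand age##south
local n_cat : word count `r(varlist)'

// The c. prefix marks age as continuous: one slope per level of south.
fvexpand c.age##i.south
local n_cont : word count `r(varlist)'

display "terms with categorical age: `n_cat'"
display "terms with continuous age:  `n_cont'"
```

        The categorical version grows with the number of distinct ages, which is why both estimation and margins slow down so sharply.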



        • #5
          It's worth noting that the results from running entirely separate regressions for the two cities will not usually be identical to results from a pooled model in which city is interacted with age (which allows for different intercepts and age slopes for each city, but will also use information from both cities in the error structure for each). If you want to run separate regressions and overlay the graphs from each, one way to do it might be to use the community-contributed command coefplot available via the Stata Journal.

          Consider this:
          Code:
          webuse union, clear
          xtset idcode year
          
          xtlogit union c.age if south == 0, vce(cluster idcode)
          margins , predict(pr) at(age = (16(1)45) ) post
          est sto South_0
          
          xtlogit union c.age if south == 1, vce(cluster idcode)
          margins, predict(pr) at(age = (16(1)45) ) post
          est sto South_1
              
          coefplot South_0 South_1, at ///
              xtitle("Age in current year") ytitle("Pr(union = 1)") ///
              noci lwidth(*1) connect(l) ///
              xlabel(15(5)45) ///
              scheme(stcolor)
          which produces:
          [Figure: Screenshot 2025-06-26 at 3.35.17 PM.png, predicted probabilities by age for South_0 and South_1]
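
          The difference between a pooled fit and separate fits can be checked directly. The sketch below assumes the same union dataset and a default random-effects xtlogit without clustered standard errors. It stores the estimated panel-level standard deviation, e(sigma_u), from a pooled interaction model and from each subsample; the scalar names su_pooled, su_0, and su_1 are illustrative.

```stata
webuse union, clear
xtset idcode year

// Pooled model: separate intercepts and age slopes by south,
// but a single random-effect variance shared by both groups.
xtlogit union i.south##c.age, nolog
scalar su_pooled = e(sigma_u)

// Separate models: each subsample gets its own random-effect variance.
xtlogit union c.age if south == 0, nolog
scalar su_0 = e(sigma_u)

xtlogit union c.age if south == 1, nolog
scalar su_1 = e(sigma_u)

display "sigma_u pooled:     " su_pooled
display "sigma_u south == 0: " su_0
display "sigma_u south == 1: " su_1
```

          If the two subsample values differ noticeably from the pooled one, the predicted probabilities from the two approaches will differ as well.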



          • #6
            Thank you Joseph. It worked! Hemanshu, thanks for your advice. I will look into the command coefplot as well.

