  • Collect Revising the Order of Coefficients

    I'm editing my paper for SJ, and I discovered something interesting that has been described similarly elsewhere on Statalist. In my case, a portion of syntax that uses "collect" changes the order of a matrix of coefficients that I wish to save as a collection. Consider this example:
    Code:
    clear *
    
    loc panel id
    loc depvar gdpcap
    loc int_time = 1975
    
    
    import delim "https://raw.githubusercontent.com/SucreRouge/synth_control/master/basque.csv", clear
    replace regionname = "Asturias" if region == "Principado De Asturias"
    
    replace regionname = "Madrid" if region == "Madrid (Comunidad De)"
    replace regionname = "La Rioja" if region == "Rioja (La)"
    replace regionname = "Murcia" if region == "Murcia (Region de)"
    replace regionname = "Comunidad Navarra" if region == "Navarra (Comunidad Foral De)"
    
    
    replace regionname = "Basque" if regionname == "Basque Country (Pais Vasco)"
    labvars year gdpcap "Year" "ln(GDP per 100,000)"
    egen id = group(regionname), label(regionname) // makes a unique ID
    
    order id, b(year)
    
    
    drop if inlist(id,18)
    drop region v1
    xtset id year, y
    
    local lbl: value label `r(panelvar)'
    
    
    loc unit ="Basque":`lbl'
    
    g treat = cond(`r(panelvar)'==`unit' & `r(timevar)' >=`int_time',1,0)
    
    cls
    
    mat b = .1067506 ,  .58899527 ,  .24047106 ,  .02650127 ,  .75729063
    
    mat colnames b = "gdpcap3"   "gdpcap10"   "gdpcap15"   "gdpcap16"      "_cons"
    
    loc weight_cols: colsof b
    
    mat W = b[1, 1..`weight_cols'-1]'
    
    loc we: rowfullnames W
    // getting the rownames
    
    local newrow : subinstr loc we " " ",", all
    // put commas between these elements
    
    foreach v of var `depvar' {
    
    local newrow : subinstr local newrow "`v'" "", all
    
    qui levelsof `panel' if inlist(`panel',`newrow'), l(labs2) sep(",")
    
    cap decode `panel', g(id2)
    
    qui levelsof id2 if inlist(`panel',`labs2'), l(labs)
    
    mat rownames W = `labs'
    mat colnames W = Weights
    }
    
    cls
    mat l W
    mata : st_matrix("r(W)", st_matrix("W"))
    mata : st_matrix("r(W)", sort(st_matrix("r(W)"), 1))
    mata : st_matrixrowstripe("r(W)", st_matrixrowstripe("W"))
    mata : st_matrixcolstripe("r(W)", st_matrixcolstripe("W"))
    collect clear
    collect get r(W)
    collect style cell colname, nformat(%4.3f)
    
    collect layout (rowname)(colname)
    Now, let's break this down. Our original matrix is
    Code:
    . mat l b
    b[1,5]
          gdpcap3   gdpcap10   gdpcap15   gdpcap16      _cons
    r1   .1067506  .58899527  .24047106  .02650127  .75729063
    where the weights are, from left to right, for Asturias (value label 3), Cataluna (value label 10), La Rioja (value label 15), and Madrid (value label 16). After I grab the labels and transpose, we get
    Code:
    mat l W
    W[4,1]
                Weights
    Asturias   .1067506
    Cataluna  .58899527
    La Rioja  .24047106
      Madrid  .02650127
    But the collection table lists it as
    Code:
    collect layout (rowname)(colname)
    
    Collection: default
          Rows: rowname
       Columns: colname
       Table 1: 4 x 1
    
    ------------------
             | Weights
    ---------+--------
    Asturias |   0.027
    Cataluna |   0.107
    La Rioja |   0.240
    Madrid   |   0.589
    ------------------
    This is not merely out of order: the weights are now attached to the wrong units (Asturias is shown with 0.027, which is Madrid's weight), and that has drastic implications for how the analysis is interpreted in the first place. I still want the collection to be an option for myself and others, but collect needs to show the correct ordering and coefficients for it to be useful. In other words, I need each unit ID value label to match up with its respective row. How might I fix this? I should note that the basics of this approach are not something I came up with myself; they were suggested to me by Bjarte Aagnes and (I think) daniel klein, both of whom have been quite helpful in my endeavors so far.

  • #2
    You should take a look at John Mullahy's post in this thread: https://www.statalist.org/forums/for...ne-of-its-rows. It doesn't seem straightforward in Mata.

    • #3
      Here is how you can incorporate my suggestion from that thread using Stata instead of Mata. The manipulations are done in a new frame.

      Code:
      clear *
      
      loc panel id
      loc depvar gdpcap
      loc int_time = 1975
      
      
      import delim "https://raw.githubusercontent.com/SucreRouge/synth_control/master/basque.csv", clear
      replace regionname = "Asturias" if region == "Principado De Asturias"
      
      replace regionname = "Madrid" if region == "Madrid (Comunidad De)"
      replace regionname = "La Rioja" if region == "Rioja (La)"
      replace regionname = "Murcia" if region == "Murcia (Region de)"
      replace regionname = "Comunidad Navarra" if region == "Navarra (Comunidad Foral De)"
      
      
      replace regionname = "Basque" if regionname == "Basque Country (Pais Vasco)"
      labvars year gdpcap "Year" "ln(GDP per 100,000)"
      egen id = group(regionname), label(regionname) // makes a unique ID
      
      order id, b(year)
      
      
      drop if inlist(id,18)
      drop region v1
      xtset id year, y
      
      local lbl: value label `r(panelvar)'
      
      
      loc unit ="Basque":`lbl'
      
      g treat = cond(`r(panelvar)'==`unit' & `r(timevar)' >=`int_time',1,0)
      
      cls
      
      mat b = .1067506 ,  .58899527 ,  .24047106 ,  .02650127 ,  .75729063
      
      mat colnames b = "gdpcap3"   "gdpcap10"   "gdpcap15"   "gdpcap16"      "_cons"
      
      loc weight_cols: colsof b
      
      mat W = b[1, 1..`weight_cols'-1]'
      
      loc we: rowfullnames W
      // getting the rownames
      
      local newrow : subinstr loc we " " ",", all
      // put commas between these elements
      
      foreach v of var `depvar' {
      
      local newrow : subinstr local newrow "`v'" "", all
      
      qui levelsof `panel' if inlist(`panel',`newrow'), l(labs2) sep(",")
      
      cap decode `panel', g(id2)
      
      qui levelsof id2 if inlist(`panel',`labs2'), l(labs)
      
      mat rownames W = `labs'
      mat colnames W = Weights
      }
      
      frame create sort
      frame sort{
          set obs `=max(`=colsof(W)', `=rowsof(W)')'
          svmat W, names(col)
          gen row=""
          local i 1
          foreach row of local labs{
              quietly replace row = "`row'" in `i'
              local ++i
          }
          sort Weights row
          ds row, not
          mkmat `r(varlist)', rown(row)  mat("W_sorted")
          local rows: di `"`:rowname W_sorted, quoted'"'
          local rows: subinstr local rows "_" " ", all
          mat rown W_sorted= `rows'
      }
      frame drop sort
      
      mat l W
      mat l W_sorted
      Res.:

      Code:
      . mat l W
      
      W[4,1]
                  Weights
      Asturias   .1067506
      Cataluna  .58899527
      La Rioja  .24047106
        Madrid  .02650127
      
      . mat l W_sorted
      
      W_sorted[4,1]
                  Weights
        Madrid  .02650127
      Asturias   .1067506
      La Rioja  .24047107
      Cataluna  .58899528
      Last edited by Andrew Musau; 09 Mar 2023, 17:07.

      • #4
        The problem is with the last two lines below

        Code:
        mata : st_matrix("r(W)", st_matrix("W"))
        mata : st_matrix("r(W)", sort(st_matrix("r(W)"), 1))
        mata : st_matrixrowstripe("r(W)", st_matrixrowstripe("W")) // <- problem here
        mata : st_matrixcolstripe("r(W)", st_matrixcolstripe("W")) // <- problem here
        You are sorting the matrix values but then reattaching the original (unsorted) row and column names, so the labels no longer line up with their values. You need to sort the row names as well.
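
        To see the difference on a small scale, here is a minimal sketch in Mata using the four weights from #1 (the names W, names, and p are just for illustration and are not part of the original code). sort() returns only the sorted values, whereas order() returns the permutation, which can then be applied to both the values and the names:

        ```stata
        mata:
        // weights and their labels, in the original (unsorted) order
        W     = (.1067506 \ .58899527 \ .24047106 \ .02650127)
        names = ("Asturias" \ "Cataluna" \ "La Rioja" \ "Madrid")

        // sort() gives back the sorted values only; the labels are left behind
        sort(W, 1)

        // order() gives the permutation that would sort W: p = (4 \ 1 \ 3 \ 2)
        p = order(W, 1)

        // applying the same permutation to both keeps labels and values paired
        W[p, .]
        names[p, .]
        end
        ```

        With the permutation in hand, Madrid stays with .02650127 and Cataluna with .58899527, which is what the collection should show.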

        Here is a simple function that I have posted elsewhere to sort a Stata matrix preserving row and column names:

        Code:
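        mata:
        
        // Sort the Stata matrix named in matname by the given column(s),
        // carrying the row names along with the rows.
        void sort_st_matrix(
            
            string scalar    matname,
            real   rowvector columns
            
            )
        {
            string matrix   rownames
            real  colvector sort_order
            
            // row stripe of the matrix (equation and name for each row)
            rownames = st_matrixrowstripe(matname)
            
            // permutation that sorts the rows by the requested column(s)
            sort_order = order(st_matrix(matname), columns)
            
            // apply the same permutation to the values and to the row names
            st_replacematrix(matname, st_matrix(matname)[sort_order,])
            
            st_matrixrowstripe(matname, rownames[sort_order,])
        }
        
        end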

        The only necessary change in your code is then

        Code:
        mata : st_matrix("r(W)", st_matrix("W"))
        // mata : st_matrix("r(W)", sort(st_matrix("r(W)"), 1)) // <- no; do not sort here
        mata : st_matrixrowstripe("r(W)", st_matrixrowstripe("W"))
        mata : st_matrixcolstripe("r(W)", st_matrixcolstripe("W"))
        mata : sort_st_matrix("r(W)", 1) // <- instead, sort here, preserving row and column names
        Here is the relevant output

        Code:
        . collect layout (rowname)(colname)
        
        Collection: default
              Rows: rowname
           Columns: colname
           Table 1: 4 x 1
        
        ------------------
                 | Weights
        ---------+--------
        Madrid   |   0.027
        Asturias |   0.107
        La Rioja |   0.240
        Cataluna |   0.589
        ------------------
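
        As a quick check outside of r(), the same function can be tried on an ordinary named matrix. This is only a sketch: it assumes sort_st_matrix() has already been defined in Mata as above, and it uses LaRioja (no space) as a simplified row name purely for illustration.

        ```stata
        * build a small named matrix with the weights from #1
        mat W = (.1067506 \ .58899527 \ .24047106 \ .02650127)
        mat rownames W = Asturias Cataluna LaRioja Madrid
        mat colnames W = Weights

        * sort by column 1, moving the row names together with the values
        mata : sort_st_matrix("W", 1)

        mat l W
        ```

        If all goes well, mat l W should list the rows in ascending order of weight, with Madrid first and Cataluna last, each label next to its own value.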
        Last edited by daniel klein; 10 Mar 2023, 01:21.
