  • How to rename around 200 rows of a matrix simultaneously or in iteration

    Hi, I'm very new to Stata, and I've spent three hours searching for and trying answers to this question. I'm using Stata 16. I have COVID-related data (daily cumulative and new cases and deaths) for each country, and I would like to set each row's name to the corresponding country's name. To examine each country's data, I first split the dataset into 195 .dta files:
    levelsof location, local(level1)
    foreach x of local level1 {
        preserve
        keep if location=="`x'"
        * quotes keep the command valid when a location contains spaces
        save "file`x'", replace
        restore
    }


    Then I repeat a regression on each country's data using foreach, which produces a matrix of R^2 values:
    [Image: 2021-08-02 155102.png]

    I've renamed the columns. However, I don't know how to rename around 200 rows easily.

    I have two ideas. I first tried to use levelsof to extract the location names from the original COVID dataset and set them as the row names:
    frame owid: levelsof location, local(rownames)
    matrix rownames R2 = `rownames'

    However, the row names were not changed. Is it because the dataset is in another frame, so the list of locations cannot be used as row names?

    Then I tried to build the list of row names in a loop over the files:
    local myfilelist: dir . files "*.dta"
    local rownames
    foreach filename of local myfilelist {
        use `filename', clear
        local rownames `rownames' `filename'
    }
    matrix rownames R2 = `rownames'

    But I got the error message: fileafghanistan: operator invalid

    I'm sorry if my question is unclear. I previously used other software, and Stata is quite new to me. I would appreciate any advice on the code.

    Thank you for your time and help!


  • #2
    You are better off using ISO country codes to identify countries. Country names often violate Stata's naming conventions, most prominently the rule that names cannot contain spaces. Otherwise, see

    Code:
    help strtoname()
    which may help with converting strings that contain invalid characters to valid names. The bigger problem with your example is that you do not appear to have control over what is stored in your local "rownames". With valid names, -levelsof- can create the local for you, and it sorts the names alphabetically. So when creating the matrix, make sure that the rows follow the same alphabetical sort order. Here is an example:

    Code:
    sysuse census, clear
    *STATE2 IS THE 2 LETTER STATE IDENTIFIER
    sort state2
    mkmat pop popurban, mat(P)
    levelsof state2, local(states) clean
    mat rownames P= `states'
    mat l P
    Res.:

    Code:
    . mat l P
    
    P[50,2]
             pop  popurban
    AK    401851    258567
    AL   3893888   2337713
    AR   2286435   1179556
    AZ   2718215   2278728
    CA  23667902  21607606
    CO   2889964   2329869
    CT   3107576   2449774
    DE    594338    419819
    FL   9746324   8212385
    GA   5463105   3409081
    HI    964691    834592
    IA   2913808   1708232
    ID    943935    509702
    IL  11426518   9518039
    IN   5490224   3525298
    KS   2363679   1575899
    KY   3660777   1862183
    LA   4205900   2887309
    MA   5737037   4808339
    MD   4216975   3386555
    ME   1124660    534072
    MI   9262078   6551551
    MN   4075970   2725202
    MO   4916686   3349588
    MS   2520638   1192805
    MT    786690    416402
    NC   5881766   2822852
    ND    652717    318310
    NE   1569825    987859
    NH    920610    480325
    NJ   7364823   6557377
    NM   1302894    939963
    NV    800493    682947
    NY  17558072  14858068
    OH  10797630   7918259
    OK   3025290   2035082
    OR   2633105   1788354
    PA  11863895   8220851
    RI    947154    824004
    SC   3121820   1689253
    SD    690768    320777
    TN   4591120   2773573
    TX  14229191  11333017
    UT   1461037   1233060
    VA   5346818   3529423
    VT    511456    172735
    WA   4132156   3037014
    WI   4705767   3020732
    WV   1949644    705319
    WY    469557    294639
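
    The same idea can be applied to a matrix of R^2 values. What follows is only a minimal sketch, assuming the regressions can be run from a single dataset with an -if- condition rather than from separate files. The auto dataset stands in for the COVID data, and the names origin, gname, groups, and r2 are hypothetical. Because the loop runs over the same sorted list that later supplies the row names, the rows and names cannot fall out of step.

    ```stata
    sysuse auto, clear
    * string group variable standing in for the country name (assumption)
    decode foreign, gen(origin)
    * make every group value a legal Stata name
    gen gname = strtoname(origin)
    * sorted list of distinct group names
    levelsof gname, local(groups) clean
    local k : word count `groups'
    matrix R2 = J(`k', 1, .)
    matrix colnames R2 = r2
    local i = 0
    foreach g of local groups {
        local ++i
        quietly regress price mpg if gname == "`g'"
        * store this group's R-squared in row i
        matrix R2[`i', 1] = e(r2)
    }
    * rows were filled in the order of `groups', so the names line up
    matrix rownames R2 = `groups'
    matrix list R2
    ```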
    Last edited by Andrew Musau; 02 Aug 2021, 03:03.



    • #3
      Hi Andrew,

      Many thanks for your reply. I will try to use ISO country codes instead of country names. I noticed that you mentioned in a reply to someone else that -levelsof- sorts alphabetically, so I checked that this works for my dataset. What should I do to control the local rownames? I am quite confused about this part. As I understand it, all the R-squared values are calculated from .dta files in a directory, while the intended row names are extracted from another .dta file named owid. Can a local rownames defined in frame owid be used for the R2 matrix constructed from the other .dta files? I'm struggling with separate frames in Stata. I think a variable m defined in frame A cannot be used in operations on frame B, right? Sorry, I'm quite confused about the process.



      • #4
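        I do not think that is the problem. A local macro defined while working in one frame can still be used after switching back to another. This works: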

        Code:
        clear
        mat A= 1\2\3\4
        mat l A
        frame create new
        frame new{
            sysuse auto, clear
            gen make2= strtoname(make)
            levelsof make2 if _n<5, local(names) clean
        }
        mat rownames A= `names'
        frame drop new
        mat list A
        As I stated, you probably need to pass the strings through -strtoname()- before running -levelsof-. If this does not help, use the dataex command to post a data example of the variable containing the names that you want to retrieve (see FAQ Advice #12 for details).

        Res.:

        Code:
        . mat list A
        
        A[4,1]
                      c1
         AMC_Concord   1
           AMC_Pacer   2
          AMC_Spirit   3
        Buick_Cent~y   4
        
        .
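
        For the file-name approach in #1, the "operator invalid" error is probably caused by the period in names such as fileafghanistan.dta, which -matrix rownames- appears to read as an operator prefix. Below is a minimal sketch, assuming the files are named file<country>.dta as in #1. The file list is typed in by hand here (hypothetical names) in place of the result of -local myfilelist: dir . files "*.dta"-, and stub is a hypothetical helper macro.

        ```stata
        * hypothetical file list standing in for: local myfilelist : dir . files "*.dta"
        local myfilelist `" "fileafghanistan.dta" "filealbania.dta" "filesouth africa.dta" "'
        local rownames
        foreach filename of local myfilelist {
            * drop the ".dta" extension
            local stub = subinstr("`filename'", ".dta", "", .)
            * drop the leading "file" prefix (4 characters)
            local stub = substr("`stub'", 5, .)
            * replace illegal characters such as spaces with underscores
            local stub = strtoname("`stub'")
            local rownames `rownames' `stub'
        }
        * placeholder matrix with one row per file
        matrix R2 = J(3, 1, .)
        matrix rownames R2 = `rownames'
        matrix list R2
        ```

        The row names only match the results if the R^2 matrix was filled in the same file order as this loop.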
        Last edited by Andrew Musau; 02 Aug 2021, 09:54.



        • #5
          Andrew,

          Sure! I will try it. Many thanks!
