  • District recode issue for three province data for 3 different years

    I am using Stata 14 with MICS data for 3 provinces. The problem is that the variables are coded differently in each province.
    For example, the district variable is coded 1-36 in one province because it has 36 districts; in the second province the district codes run from 1-29, and in the third from 1-32. When I append the data sets for the 3 provinces (from different years), the district variable only ranges from 1-36.
    Analyzing the data this way with fixed effects at the district level gives meaningful results. However, when I recode the districts so that they range from 1-97, all my results collapse.
    I would like to know how to fix this issue.
    . sum HH7 district div division

        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             HH7 |     11,587     16.6383     9.422488         1         36
        district |     11,587     16.6383     9.422488         1         36
             div |     11,587    4.302839     2.198942         1          9
        division |     11,587    4.302839     2.198942         1          9

  • #2
    Originally posted by Chanda Moon
    Like District variable is coded from 1-36 as there are 36 districts in 1 province. In the province district codes are from 1-29 and in the 3rd province 1-32. When I append these data sets of 3 provinces in different years, only districts are 1-36. But analyzing and using fixed effects at the district level gives meaningful results. However, when I try to recode districts, it ranges from 1-97 and all my results collapse.

    The implication is that if the districts are in different provinces, then they are not the same. Therefore, it will not be correct to append the datasets and use the dataset-specific codes as district 1 in province 1 is different from district 1 in province 2. That being the case, having 36+29+32= 97 districts in the appended dataset appears to be the correct approach. I do not understand what you mean when you say "the results collapse".
    Last edited by Andrew Musau; 21 Jun 2022, 07:04.


    • #3
      Thank you. I did the same, 36+29+32 = 97: before appending I recoded the districts so that they range from 1-97, and then appended.

      . sum HH7 district

          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
               HH7 |     11,550    16.64277     9.422681         1         36
          district |     11,550     43.3181     29.05817         1         97

      But if I append without recoding the districts, I get meaningful results (with the mean of district at 16.64, as in the table above) using i.district in a regression with 11500 observations.

      However, when I recode and then append the datasets, with the same 11500 observations, my results change completely using i.district (now the mean of district is 43.31, as in the table above).

      code: for province 2
      gen district = HH7
      recode district (1 = 66)(2 = 67) (3 = 68)(4 = 69) (5 = 70)(6 = 71)(7 = 72)(8 = 73)(9 = 74)(10 = 75) (11 = 76) (12 = 77)(13 = 78) (14 = 79) (15 = 80)(16 = 81) (17 = 82) (18 = 83)(19 = 84)(20 = 85)(21 = 86) (22 = 87)(23 = 88) (24 = 89) (25 = 90)(26 = 91) (27 = 92) (28 = 93)(29 = 94)(30 = 95)(31 = 96)(32 = 97)
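
A shorter equivalent of the recode above, assuming (as in this thread) that the 65 districts of the other two provinces are numbered first, is to add a fixed offset to HH7. The toy data below is only a stand-in for the real province file.

```stata
clear
set obs 32                      // toy stand-in: one row per district of the 32-district province
generate HH7 = _n               // original province-specific codes 1-32
generate district = HH7 + 65    // 36 + 29 = 65 districts are numbered before this province
summarize district
```

The offset has to be changed by hand for each province, so it is only convenient when the number of districts per province is known in advance.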

      Is there another way to fix this issue?


      • #4
        But if I append without recoding the districts, i get meaningful results
        These results are not meaningful because as Andrew pointed out District 1 in Province 1, District 1 in Province 2, and District 1 in Province 3 are not the same district - they are three different districts and must be analyzed as such. If the results of doing so are not what you want, you have to improve your model, not combine districts that have nothing to do with each other.
        Last edited by William Lisowski; 21 Jun 2022, 07:51.


        • #5
          Originally posted by William Lisowski

          These results are not meaningful because as Andrew pointed out District 1 in Province 1, District 1 in Province 2, and District 1 in Province 3 are not the same district - they are three different districts and must be analyzed as such. If the results of doing so are not what you want, you have to improve your model, not combine districts that have nothing to do with each other.
          Yes. The results are meaningful without recoding, but when I recode so that district 1 is a different district in each of the 3 provinces, the results become meaningless.


          • #6
            Perhaps we have a difference in terminology here.

            The results may be statistically significant, but they are without substantial meaning because they can combine two or three districts that have no relation to each other other than by coincidence having the same district code. If your district codes were assigned in a different order, the results would change, even though no data has changed.
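
The point about code order can be illustrated with a small simulation. This is a sketch on invented data: every variable name other than HH7, the sample sizes, and the data-generating process are assumptions made for illustration, not taken from the MICS files.

```stata
clear
set seed 2022
set obs 600
generate province = ceil(_n/200)        // 3 provinces, 200 observations each
generate HH7 = mod(_n, 4) + 1           // codes 1-4 reused in every province
egen district = group(province HH7)     // 12 genuinely distinct districts

* One random effect per true district, shared by all of its observations
bysort district: generate effect = rnormal() if _n == 1
by district: replace effect = effect[1]
generate x = rnormal() + effect
generate y = 1 + 2*x + 3*effect + rnormal()

* Same data, but province 2's districts are numbered in reverse order
generate HH7_alt = cond(province == 2, 5 - HH7, HH7)
egen district_alt = group(province HH7_alt)

* Fixed effects on the reused codes depend on the arbitrary numbering
quietly areg y x, absorb(HH7)
scalar b_reused = _b[x]
quietly areg y x, absorb(HH7_alt)
scalar b_reused_alt = _b[x]

* Fixed effects on province-by-district identifiers do not
quietly areg y x, absorb(district)
scalar b_unique = _b[x]
quietly areg y x, absorb(district_alt)
scalar b_unique_alt = _b[x]

display "reused codes:  " b_reused "  vs  " b_reused_alt
display "unique codes:  " b_unique "  vs  " b_unique_alt
```

Renumbering province 2 changes which unrelated districts are pooled under each reused code, so the first pair of coefficients differs. The second pair agrees because the grouping of observations into districts is unchanged.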


            • #7
              Originally posted by William Lisowski
              Perhaps we have a difference in terminology here.

              The results may be statistically significant, but they are without substantial meaning because they can combine two or three districts that have no relation to each other other than by coincidence having the same district code. If your district codes were assigned in a different order, the results would change, even though no data has changed.
              Thank you for your guidance. How can I recode correctly, if I am doing something wrong?


              • #8
                You are not doing anything wrong - the recode into 97 values of district was the correct approach.

                The analysis with 36 districts was incorrect - the i.district effect with 36 districts assumed the effect of district 1 was the same in each of the three provinces. That cannot be meaningful.

                Here is an example of an easier way to accomplish the recoding, using example data with three provinces.
                Code:
                clear all
                cls
                * Example generated by -dataex-. For more info, type help dataex
                clear
                input float(province district)
                1 1
                1 2
                1 3
                1 4
                2 1
                2 2
                2 3
                3 1
                3 2
                end
                * Number each distinct province-district pair 1, 2, 3, ...
                egen prov_dis = group(province district)
                list, clean
                Code:
                . list, clean
                
                       province   district   prov_dis  
                  1.          1          1          1  
                  2.          1          2          2  
                  3.          1          3          3  
                  4.          1          4          4  
                  5.          2          1          5  
                  6.          2          2          6  
                  7.          2          3          7  
                  8.          3          1          8  
                  9.          3          2          9
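
Applied to the appended MICS files, the same idea might look like the sketch below. The three toy files, the `province` variable, and the tempfile names are assumptions made for illustration; only HH7 and the district counts 36, 29 and 32 come from this thread.

```stata
clear all
* Build three toy province files (hypothetical stand-ins for the real MICS extracts)
tempfile f1 f2 f3
local sizes 36 29 32
forvalues i = 1/3 {
    clear
    local n : word `i' of `sizes'
    set obs `n'
    generate HH7 = _n               // province-specific district code
    generate province = `i'         // province identifier, added before appending
    save "`f`i''"
}

* Append, then give every province-district pair its own number
use "`f1'", clear
append using "`f2'" "`f3'"
egen district = group(province HH7)
summarize HH7 district
```

With the real data, the key step is creating a province identifier in each file before appending, so that group() can tell the reused HH7 codes apart.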
