Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Coding variables in Sata

    Hi. My dataset is panel data organized from year 2010 to 2015. However, some of the data from 2015 is subject to low quality due various reasons. However, I am able to fix/correct some of the problems myself with the help of you guys, hopefully.

    The data is organized by
    Code:
     sort company year
    Code:
    xtset company year
    Some of the variables have no observations in 2015, but some of my data have not changed over the given timeperiod. For instance, a company which is in the health sector is unlikely to suddenly have changed sector to manufacturing, this is an assumption I have to make. Based on this reasoning, I want to give the variable SICCODE the same value in 2015 as in 2014(or 2013 for that matter, but 2014 is enough). So the values for company x is equal to each other in 2014 and 2015.

    I have tried with the command.


    Code:
    gen SICCODEDUMMY =
    Code:
    replace SICCODEDUMMY=1 if year==2012 & SICCODE==1 
    Code:
    replace SICCODEDUMMY=1 in year==2015 if BIG4==1 in year==2014
    I`m not sure how to go about it other than using the replace command. But there must be one way I just could say that SICCODE in year 2014 for company Y = SICCODE in year 2015 for company Y. Have anyone been involved in something similar? It should be an easy command, but I have used some hours now without getting any way.

    Best regards, Rune.

  • #2
    neither your question nor your data are clear; however, at a guess, the "carryforward" command from SSC will help

    Comment


    • #3
      Alternatively, check out mipolate (SSC) See announcement in http://www.statalist.org/forums/foru...-interpolation

      Comment


      • #4
        I will check in to the carryforward command. But to simplify the question : How can one copy/duplicates a variable in 2014 to 2015. In my case with (company and year), I want the SIC code for company 4444 to be the same in 2015 as in 2014.

        Comment


        • #5
          That doesn't simplify the question: it just gives an example of your problem. It is not fact helpful to you to be thinking of replacing values in each observation individually, which would just commit you to many, many operations.

          Commenting specifically on code like

          Code:
           replace SICCODEDUMMY=1 in year==2015 if BIG4==1 in year==2014
          I note that there are several problems here:

          1. in is not a synonym for if: it refers only to observation numbers and not observations defined by some logical condition. So your in conditions would make sense only if translated to if.

          2. But if is allowed at most once in a command (like this).

          3. The if conditions could be combined into one, but a bigger problem is that nothing in your code instructs Stata to look in any observation but the present observation. (That's the intent, clearly, but it won't be achieved by this syntax.)

          4. Nothing in your code uses the panel structure either directly or indirectly.

          Trying to work out the problem from first principles is admirable but in this case it is better to start with programs dedicated to the task. Not surprisingly, I prefer my own program here....

          There is a tutorial FAQ at http://www.stata.com/support/faqs/da...issing-values/ with some relevance to your problem. Reading that would give more insight into technique here, but I recommend that you first identify a program that works for you and then try to understand how it does that.
          Last edited by Nick Cox; 12 Oct 2015, 04:37.

          Comment


          • #6
            Thanks for the insightful feedback, Nick. This was highly appreciated.

            After installing mipolate I made several attempts to copy the observations from year 2012 to 2013. This was the closest I arrived with the following command:

            Code:
             mipolate NEGEQUITYyear, forward gen (new variable)
            NEGEQUITY2013 variable is a binary variable with the value 1 if the company is has negative equity and 0 otherwise.

            By using the mipolate command, I get that hat all companies in both 2012 and 2013 achieve the number “1” on my new variable (both positive and negative equity firms).

            How can I solve this problem and assign the the correct value (0,1) to to the designated company`? Now, all of my companies in 2013 tboth with negative EQ and positive EW ake the value 1.


            Best regards, Rune =)

            Comment


            • #7
              Sorry, but I can't see your dataset and don't know anything about it that you don't. The syntax you give is quite illegal as a new variable name can't include a space. Presumably you're rewriting it, but the FAQ warns absolutely that you should give the exact syntax you used.

              It's a clear implication of the help for mipolate that you must specify the panel structure if there is one using by:

              Comment


              • #8
                I will try to describe my dataset once again.

                My dataset includes year, id number of the company and negative equity dummy.

                Year goes from 2010-2013.

                So it will be like this:



                +------------------------------------+
                pcid year BIG4DU~Y ly80
                ------------------------------------
                1. 686910169 2010 0 .
                2. 686910169 2011 0 .
                3. 686910169 2012 0 1
                4. 686910169 2013 0 1
                5. 686913889 2010 0 .
                ------------------------------------
                6. 686913889 2011 0 .
                7. 686913889 2012 0 1
                8. 686913889 2013 0 1
                9. 686936219 2010 0 .
                10. 686936219 2011 0 .
                ------------------------------------
                11. 686936219 2012 0 1
                12. 686936219 2013 0 1
                13. 686974799 2010 1 .
                14. 686974799 2011 1 .
                15. 686974799 2012 1 1
                ------------------------------------
                16. 686974799 2013 0 1
                17. 686981919 2010 1 .
                18. 686981919 2011 1 .
                19. 686981919 2012 0 1
                20. 686981919 2013 0 1
                +------------------------------------+




                I have now used the BIG4DUMMY instead of the NEGEQUITY dummy, but it is exactly the same problem with this variable. My dataset includes many companies, but I give here an example of three. The observations from all companies in 2013 is equal to missing. My goal is to set the 2013 observations equal to the 2012 numbers. By using the code:



                Code:
                mipolate NEGEQUITY year, forward gen(new variable)
                Then, I achieve the "new variable" as you can see from the table above. I have to many compnanies in mye dataset to use the "replace" command. Will this help you to understand my data and find the right command to achieve my goal?

                Comment


                • #9
                  What I am guessing youre attempting to achieve is the following

                  Code:
                  gen SICCODEDUMMY = SICCODE
                  bys company: replace SICCODEDUMMY = SICCODE[_n-1] if Year == 2015

                  Comment


                  • #10
                    Jorrit Gosens: That fixed all of my problems for both variables. Thanks, highly appreciated.

                    Nick Cox : Looks like you don`t always need to overcomplicate things. Atleast, I have learned something about Mipolate, so the effort was not in vain.

                    Best regards, Rune.

                    Comment


                    • #11
                      Rune:

                      Thanks for the listing, which is helpful. There are no examples in the data given of values missing in 2013, so I won't use it below. Note that anyone using your listing would need to edit the output In any future posts, so please note the advice in the Statalist FAQ and in the "New topic" text to use dataex (SSC) .

                      The interpolation method you ended up using is precisely that explained in the FAQ I wrote and then referenced in #5. (However, an important detail is flagged in my next post.) Did you read it?

                      mipolate is a general interpolation tool offering a variety of techniques. For completeness, and because others may be interested, I show how to use it for your problem. Far from being an attempt to over-complicate matters, it is, I believe, the most versatile interpolation tool available for Stata.

                      Code:
                      bysort pcid: mipolate BIG4DUMMY year, forward gen(BIG4DUMMY_1)
                      
                      by pcid: mipolate BIG4DUMMY year, groupwise gen(BIG4DUMMY_2)
                      The forward method is the one you tried and is equivalent to the method you used. However, as flagged in #5 and again in #7 it is essential in any interpolation syntax to respect panel structure. In the case of mipolate, and as already pointed out, by: is essential.

                      The forward interpolation method is popular but not necessarily appropriate unless the evidence is consistent with invariant properties. If you try what I have called the "groupwise" method you will see that it fails whenever the indicator variable (in this example) is not constant within groups when non-missing.

                      See also http://www.stata.com/support/faqs/da...ions-in-group/


                      Last edited by Nick Cox; 13 Oct 2015, 08:11.

                      Comment


                      • #12
                        Notice that the code in #9 will overwrite non-missing values if they exist.

                        Code:
                         
                         bys company: replace SICCODEDUMMY = SICCODE[_n-1] if Year == 2015 & missing(SICCODEDUMMY)

                        Comment

                        Working...
                        X