  • Creating a dummy variable - New User of Stata 15

    Hello everyone, undergraduate BSc Economics student here. I am struggling to use Stata 15. This may be a very simple question for those of you who are experienced in using Stata, but it has me confused.

    I have a dataset (Argentinian enterprise panel survey) containing 527 variables from the years 2015, 2016, 2017 and 2018.

    I am trying to
    1. generate a new variable exp_dum (export dummy) which equals 1 if sales_exp is positive, otherwise 0.
    2. generate new variable foreign_dum (foreign ownership dummy) which equals 1 if foreign is positive, otherwise 0.
    3. generate new variable soe_dum (state ownership dummy) which equals 1 if soe is positive, otherwise 0.
    Additionally, I want to produce a summary statistics table of all variables, including the three dummy variables, showing the number of observations, mean, standard deviation, minimum and maximum values, etc.

    How would I go about generating these three dummy variables in Stata 15?

    Many thanks in advance

    Kieran


  • #2

    A general recipe for an indicator (you say dummy) that is 1 for positive values of a given variable, 0 for zero or negative values, and missing wherever that variable is missing is


    Code:
    gen wanted  = myvar > 0 if myvar < .
    See documentation at

    https://www.stata.com/support/faqs/d...mmy-variables/

    https://www.stata.com/support/faqs/d...rue-and-false/

    https://www.stata-journal.com/articl...article=dm0099 (if you have access)
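
    Applied to the three variables named in the question, the recipe above might look like the following. This is only a sketch: the names sales_exp, foreign and soe come from post #1, and it assumes all three are numeric.

    ```stata
    * Sketch: one indicator per source variable, using the recipe above.
    * Assumes sales_exp, foreign and soe are numeric (names taken from post #1).
    * Each result is 1 if positive, 0 if zero or negative, missing if missing.
    gen exp_dum     = sales_exp > 0 if sales_exp < .
    gen foreign_dum = foreign   > 0 if foreign   < .
    gen soe_dum     = soe       > 0 if soe       < .

    * Quick check: each indicator should take only the values 0, 1 or missing.
    tab1 exp_dum foreign_dum soe_dum, missing
    ```

    The qualifier if myvar < . matters because Stata treats missing values as larger than any number, so without it a missing value would be coded as 1.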



    • #3
      Kieran (but seemingly subscribed as Keiran):
      as an aside to Nick's helpful advice, you can create a table collecting all the descriptive statistics you're interested in via -tabstat-:
      Code:
      . use "C:\Program Files\Stata16\ado\base\a\auto.dta"
      (1978 Automobile Data)
      
      . tabstat foreign rep78, stat(count mean sd p50 min max)
      
         stats |   foreign     rep78
      ---------+--------------------
             N |        74        69
          mean |  .2972973  3.405797
            sd |  .4601885  .9899323
           p50 |         0         3
           min |         0         1
           max |         1         5
      ------------------------------
      Moreover, if you're dealing with panel data, take a look at the -xt- suite in the Stata .pdf manual.

      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Hello,

        Thank you, that has helped a lot. I hope you do not mind if I ask one more question. The data I am using for my analysis is panel data. Which command can I use with panel data to produce a summary statistics table that contains the number of observations, mean, sd, min and max values of all variables, including the new indicator variables I have now generated?

        Kindest Regards

        Keiran
        (Stata 15)



        • #5
          tabstat works fine with panel data. Did you try it?



          • #6
            Keiran:
            as Nick points out, you can easily use -tabstat- with panel data, too:
            Code:
            . webuse nlswork
            . tabstat age race , stat(count mean sd p50 min max)
            
               stats |       age      race
            ---------+--------------------
                   N |     28510     28534
                mean |  29.04511  1.303392
                  sd |  6.700584  .4822773
                 p50 |        28         1
                 min |        14         1
                 max |        46         3
            ------------------------------
            That said (but possibly off topic here), the -xt- suite provides other commands to explore some features of your data that may be good to know before challenging yourself with panel data regression:
            Code:
            . xtsum age
            
            Variable         |      Mean   Std. Dev.       Min        Max |    Observations
            -----------------+--------------------------------------------+----------------
            age      overall |  29.04511   6.700584         14         46 |     N =   28510
                     between |             5.485756         14         45 |     n =    4710
                     within  |              5.16945   14.79511   43.79511 | T-bar = 6.05308
            
            . xtsum race
            
            Variable         |      Mean   Std. Dev.       Min        Max |    Observations
            -----------------+--------------------------------------------+----------------
            race     overall |  1.303392   .4822773          1          3 |     N =   28534
                     between |             .4862111          1          3 |     n =    4711
                     within  |                    0   1.303392   1.303392 | T-bar = 6.05689
            
            .
            Kind regards,
            Carlo
            (Stata 19.0)
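
            For the dataset in post #1, the same steps might be sketched as follows. The identifier names firm_id and year are hypothetical placeholders, not names from the survey; -xtsum- needs the panel structure declared first, here with -xtset-.

            ```stata
            * Sketch only: firm_id (panel unit) and year (time) are hypothetical
            * names; replace them with the identifiers in your own dataset.
            xtset firm_id year

            * Pattern of observations per firm across the survey years.
            xtdescribe

            * Overall, between and within variation of the new indicators.
            xtsum exp_dum foreign_dum soe_dum
            ```

            A within standard deviation of 0, as for race above, would mean the variable never changes over time within a panel unit.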



            • #7
              Hello Nick & Carlo,

              Thank you for the help. tabstat has worked fine, and I am now using the asdoc package to save the summary statistics table to a Word document. However, is there a way I can reduce the values to 3 decimal places?

              The command I am using is:

              asdoc tabstat 'variables', stat(count mean sd p50 min max), save(ICC)

              This method works, but I would like to reduce the values on the table to 3 decimals.

              kindest regards

              Keiran



              • #8
                I've never used asdoc (which you should please explain as community-contributed). Attaullah Shah, the program author, or one of its users is likely to help.
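
                Independently of asdoc, -tabstat- itself has a format() option that controls how many decimals are displayed. A minimal sketch, using the indicator names from post #1; whether asdoc carries this format through to the Word file is an assumption that should be checked against asdoc's own help file.

                ```stata
                * Sketch: show all statistics with 3 decimal places in Stata's output.
                * format() and columns() are official -tabstat- options.
                tabstat exp_dum foreign_dum soe_dum, ///
                    stat(count mean sd p50 min max) format(%9.3f) columns(statistics)

                * Unverified assumption: asdoc may offer its own option for decimals;
                * see -help asdoc- after installing it before relying on that.
                ```

                columns(statistics) puts one variable per row, which tends to be easier to read when many variables are listed.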
