  • Dummy variable with 2 conditions

    Hi everyone!

    I'm currently working with a panel dataset covering the years 2010-2020 that holds data on residential property values and distribution centres. I am trying to create a dummy variable for each year in the dataset that indicates whether a distribution centre has been built. Thus, the dummy variable must satisfy two conditions: the given year (2010, 2011, 2012, etc.) and a distribution centre surface area greater than 1. So far I have tried the following commands, but these haven't led to the desired results. I am hoping someone can help me, or at least tell me what I'm doing wrong.

    sum surface_area
    gen d.construction2010=0
    replace d.construction2010=1 if surface_area>1 & year==2010

    ______________________________________________________________

    gen d.construction2010=1 if surface_area>1 & year==2010


    Your help is much appreciated, thanks!

  • #2
    First of all, you can't have a Stata variable name with a . in the middle of it. So there's that.

    Code:
    forvalues y = 2010/2020 {
        gen d_construction`y' = (surface_area > 1) & (year == `y')
    }
    Note: As no example data was provided, this code is untested and may prove unworkable in your actual data. In the future, when asking for help with code, always show example data. And always use the -dataex- command to do that. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Also, it is not helpful to say that something "didn't yield the desired results." After all, there are infinitely many ways that could happen, and you have provided no clue what results you got, nor how they depart from what you were looking for. In the future, when asking for troubleshooting of code you have tried, show the results you got (including any error messages) and explain what you don't like about those results unless it is blatantly obvious to anyone who can read.
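
    One caveat on the loop above, offered as a hedged sketch rather than tested advice: Stata treats numeric missing values as larger than any number, so surface_area > 1 is also true when surface_area is missing. If the real data contain missing surface areas, a variant like the following leaves the indicator missing for those observations. The three-row toy dataset is invented purely for illustration.

    ```stata
    * Toy data, invented for illustration only
    clear
    input int year long surface_area
    2010 0
    2010 56327
    2011 .
    end

    forvalues y = 2010/2011 {
        * 1 if both conditions hold, 0 if not, missing if surface_area is missing
        gen byte d_construction`y' = (surface_area > 1) & (year == `y') if !missing(surface_area)
    }
    list
    ```

    Whether a missing indicator is the right choice depends on what a missing surface area means in your data; if it means "no distribution centre", recoding it to 0 first may be more appropriate.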


    • #3
      Sophie:
      what if you code:
      Code:
      d_construction
      instead of:
      Code:
      d.construction
      ETA: Clyde was already at the party when I reached it!! 😀
      Last edited by Carlo Lazzaro; 09 Jul 2022, 13:39.
      Kind regards,
      Carlo
      (Stata 19.0)


      • #4
        Please note that Clyde's advice on Statalist expectations repeats advice given in your initial topic on Statalist a month ago.

        In preparation for your next topic, please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It is particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE] (which you had done successfully in your initial topic!), and to use the dataex command to provide sample data, as described in section 12 of the FAQ. There is also an expectation that if advice solves your problem, you will post one last time to the topic confirming that the advice led to a solution, so that others who are led to the topic by a search engine sometime later will know that the advice was tested and found helpful.

        The more you help others understand your problem, the more likely others are to be able to help you solve your problem.
        Last edited by William Lisowski; 09 Jul 2022, 14:39.


        • #5
          Thank you all for your responses. As well as using "d." in my variable name, I tried using "d_" in my variable name. Nevertheless, this didn't make any difference.

          Below I have included a snippet of my dataset, which might help.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input long(municipality_code year) byte construction_year long(surface_area jobs_DC roadlength_km TOTALJOBS AVERAGE_WOZ)
          484 2010 0 0 0  324  32300 241000
          484 2011 0 0 0  326  32600 238000
          484 2012 0 0 0  326  32400 234000
          484 2013 0 0 0  326  32000 225000
          484 2014 0 0 0  533  40500 220000
          484 2015 0 0 0  544  40400 217000
          484 2016 0 0 0  544  40700 215000
          484 2017 0 0 0  543  42500 219000
          484 2018 0 0 0  549  43300 231000
          484 2019 0 0 0  547  44200 245000
          484 2020 0 0 0  551  41400 264000
          502 2010 0 0 0  211  39600 203000
          502 2011 0 0 0  211  39900 196000
          502 2012 0 0 0  209  39400 193000
          502 2013 0 0 0  209  35400 188000
          502 2014 0 0 0  212  34100 176000
          502 2015 0 0 0  212  34300 171000
          502 2016 0 0 0  211  35900 172000
          502 2017 0 0 0  211  36000 178000
          502 2018 0 0 0  211  37200 189000
          502 2019 0 0 0  214  37500 202000
          502 2020 0 0 0  217  36000 228000
          503 2010 0 0 0  313  51500 198000
          503 2011 0 0 0  313  52900 195000
          503 2012 0 0 0  309  52000 196000
          503 2013 0 0 0  309  50600 190000
          503 2014 0 0 0  298  52000 181000
          503 2015 0 0 0  298  52300 177000
          503 2016 0 0 0  296  53400 179000
          503 2017 0 0 0  296  55300 192000
          503 2018 0 0 0  285  56500 191000
          503 2019 0 0 0  284  58000 210000
          503 2020 0 0 0  283  58100 240000
          513 2010 0 0 0  235  39300 210000
          513 2011 0 0 0  241  38700 206000
          513 2012 0 0 0  240  38600 197000
          513 2013 0 0 0  240  38600 192000
          513 2014 0 0 0  242  37800 182000
          513 2015 0 0 0  237  34200 174000
          513 2016 0 0 0  235  36400 176000
          513 2017 0 0 0  235  36900 180000
          513 2018 0 0 0  237  36800 188000
          513 2019 0 0 0  238  36100 206000
          513 2020 0 0 0  237  36900 222000
          518 2010 0 0 0 1118 276300 213000
          518 2011 0 0 0 1128 271800 210000
          518 2012 0 0 0 1115 268200 207000
          518 2013 0 0 0 1121 260500 199000
          518 2014 0 0 0 1118 261700 188000
          518 2015 0 0 0 1118 272000 184000
          518 2016 0 0 0 1125 266500 188000
          518 2017 0 0 0 1127 275700 197000
          518 2018 0 0 0 1133 281400 212000
          518 2019 0 0 0 1131 288100 242000
          518 2020 0 0 0 1134 283600 271000
          546 2010 0 0 0  350  65600 237000
          546 2011 0 0 0  349  68400 231000
          546 2012 0 0 0  347  68100 226000
          546 2013 0 0 0  350  68600 220000
          546 2014 0 0 0  357  67600 211000
          546 2015 0 0 0  356  69900 205000
          546 2016 0 0 0  357  69200 208000
          546 2017 0 0 0  354  69300 217000
          546 2018 0 0 0  349  69100 234000
          546 2019 0 0 0  348  70600 259000
          546 2020 0 0 0  349  72300 287000
          547 2010 0 0 0  112  10900 273000
          547 2011 0 0 0  112  11300 262000
          547 2012 0 0 0  112  10000 258000
          547 2013 0 0 0  112   9400 248000
          547 2014 0 0 0  108   9500 239000
          547 2015 0 0 0  109   9300 235000
          547 2016 0 0 0  109   9400 239000
          547 2017 0 0 0  109   9600 243000
          547 2018 0 0 0  110  10000 258000
          547 2019 0 0 0  114  10400 281000
          547 2020 0 0 0  115  10600 302000
          599 2010 0 0 0 2004 376400 164000
          599 2011 0 0 0 2061 371900 162000
          599 2012 0 0 0 2035 363300 159000
          599 2013 0 0 0 2052 357500 153000
          599 2014 0 0 0 2111 365000 148000
          599 2015 0 0 0 2120 377800 146000
          599 2016 0 0 0 2127 379000 147000
          599 2017 0 0 0 2123 388900 154000
          599 2018 0 0 0 2126 399600 166000
          599 2019 0 0 0 2125 408100 192000
          599 2020 0 0 0 2048 398100 222000
          1621 2010 0 0 0  374  20800 305000
          1621 2011 0 0 0  375  22300 295000
          1621 2012 0 0 0  374  23000 289000
          1621 2013 0 0 0  376  23900 282000
          1621 2014 0 0 0  376  23900 268000
          1621 2015 1 56327 5576373  377  25100 257000
          1621 2016 0 0 0  377  26200 264000
          1621 2017 0 0 0  376  26400 273000
          1621 2018 0 0 0  374  27100 287000
          1621 2019 0 0 0  386  26600 318000
          1621 2020 0 0 0  390  28800 349000
          end
          The commands I used to create a dummy variable for the 2 conditions specified are:
          Code:
          sum surface_area
          gen d_construction2010=0
          replace d_construction2010=1 if surface_area>1 & year==2010
          I hope this is of some help. Thanks in advance!


          • #6
            You can go

            Code:
            gen d_construction2020 = surface_area > 1 & year, == 2010
            as explained at e.g.

            https://www.stata.com/support/faqs/d...rue-and-false/

            https://www.stata.com/support/faqs/d...mmy-variables/

            https://www.stata-journal.com/articl...article=dm0099

            As your example data for 2010 has surface area zero for all observations, the indicator variable should be 0 whenever that is true.

            A check would be

            Code:
            tab year if surface_area > 1
            which should show how many values of 1 you expect for the corresponding indicators.
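
            A hedged sketch of the same check done year by year, assuming the indicators are named d_construction2010, d_construction2011, and so on, with the suffix matching the year tested (as in #2). The toy data below are invented for illustration. It counts the observations that satisfy both conditions and confirms that each indicator holds that many 1s.

            ```stata
            * Toy data, invented for illustration only
            clear
            input int year long surface_area
            2010 0
            2011 0
            2011 56327
            end

            forvalues y = 2010/2011 {
                gen byte d_construction`y' = surface_area > 1 & year == `y'
                * number of observations meeting both conditions
                quietly count if surface_area > 1 & year == `y'
                local expected = r(N)
                * number of 1s in the indicator; assert stops with an error on a mismatch
                quietly count if d_construction`y' == 1
                assert r(N) == `expected'
            }
            ```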
            Last edited by Nick Cox; 10 Jul 2022, 03:19.


            • #7
              Thank you very much, much appreciated!


              • #8
                Sophie:
                Code:
                . gen d_construction2020 = surface_area > 1 & year== 2010
                
                . bysort surface_area: tab d_construction2020
                
                ------------------------------------------------------------------------------------------------------------------------------------------
                -> surface_area = 0
                
                d_construct |
                    ion2020 |      Freq.     Percent        Cum.
                ------------+-----------------------------------
                          0 |         98      100.00      100.00
                ------------+-----------------------------------
                      Total |         98      100.00
                
                ------------------------------------------------------------------------------------------------------------------------------------------
                -> surface_area = 56327
                
                d_construct |
                    ion2020 |      Freq.     Percent        Cum.
                ------------+-----------------------------------
                          0 |          1      100.00      100.00
                ------------+-----------------------------------
                      Total |          1      100.00
                
                
                .
                Kind regards,
                Carlo
                (Stata 19.0)


                • #9
                  Please ignore the comma in the first line of code in #6; the condition should read year == 2010.
