Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Generating variables based on other variables

    Hello everyone,
    For my bachelor thesis I examine the effect of business terrorism and non-business terrorism on cross-border M&A. I am struggling with making and combining the dataset for this research for a week now, so I hope someone can help me.

    For M&A deals, I extract data from Thomson One. I have a dataset with all the deals per row, from 1980 till 2016. The variables are target country, acquirer country, deal value and deal date.

    For terrorism data I extract data from GTD (global terrorism database). First, I adjusted the excel file, and then I uploaded it to Stata.
    In the rows I have single terror attack since 1980 till 2016, with variables of date, country, number of deaths, and target type. Target type consists of 16 categories, the first is business and the other 15 are non-business.

    I tried to work with the data in excel. I wanted to count all terror attacks specified by target type, year and country. The results should be the sum of all business and non-business terror attacks and the sum of all casualties per country per year. But to do this manually in excel would take a whole week with the amount of data I have.

    So I want to create 4 variables:
    1. BUSSTERR: number of business terrorist attacks per country per year
    2. NONBUSSTERR: number of non-business terrorist attacks per country per year
    3. BUSCAS: number of business casualties per country per year
    4. NONBUSCAS: number of non-business casualties per country per year.
    I hope someone can help me, if you have more questions ask them!



  • #2
    Asking for code without showing example data (descriptions never have enough information, now matter how hard you try) is like asking for driving directions without saying where you are starting from. Nobody can possibly answer. So, use the -dataex- command to post an example of your terrorism data here, and I'm confident you will get a helpful response in a timely manner.

    If you are running Stata version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Also, even if it were easy to do, you should not be doing data management and analysis in Excel. For any serious purpose, and I count a bachelor's thesis as a serious purpose, you need a complete beginning-to-end audit trail of everything you do. Excel does not create any records of what you do with the data. So it is not a poper data management or analysis tool. It is fine for sharing data among people who do not have a common statistical analysis program, or for creating displays of final results for viewing by human eyes. But it should never be used for actual analysis of any kind, unless you are just "playing around." Not only should you not use it for data management or analysis, you should refuse to accept from anybody else results which lack a complete audit trail.

    Comment


    • #3
      Thank you for your quick response!
      I will insert my example data a soon as possible. I also will start over with my data management in Stata.

      I run Stata 14.0, I installed Dataex. I will figure out how to use this to upload an example.
      Last edited by Eszti Bambacht; 26 Apr 2018, 12:48.

      Comment


      • #4
        Thank you for the example. I assume the real data set is not restricted to a single year's worth of data.

        I also assume that by casualties you mean the total of killed plus wounded; I believe that is the standard definition.

        I notice that some observations have missing values for killed or wounded, which makes calculating those totals not, strictly speaking, possible. In the code I show below, I assume that you want to treat those missing values as if they were zeroes.

        In your example, at least, targtype1 is never missing. The code I show below depends crucially on this being true, and so I test for it in the beginning of the code. If that is not true in your real data, you will have to explain how you want to treat observations with missing targtype1 to get code that accomplishes your goal.

        You indicated you want to create four new variables in the data:
        Code:
        assert !missing(targtype1)
        by country iyear, sort: egen busterr = total(targtype1 == 1)
        by country iyear: egen nonbusterr = total(targtype1 != 1)
        gen casualties = nkill + nwound
        by country iyear: egen buscas = total(cond(targtype == 1, casualties, .))
        by country iyear: egen nonbuscas = total(cond(targtype != 1, casualties, .))
        But from context, I'm guessing that what you really want to do is create a new data set that contains the country and year along with those four variables but nothing else, and then perhaps go on to merge that with your other data set. While you can get to that point from the above code, a more direct route would be:

        Code:
        assert !missing(targtype1)
        gen bus_target = cond(targtype == 1, "bus", "nonbus")
        gen casualties = nkill + nwound
        collapse (count) terr = targtype1 (sum) cas = casualties, by(country_txt iyear bus_target)
        reshape wide @terr @cas, i(country_txt iyear) j(bus_target) string

        Comment


        • #5
          Thank you very much!
          Yes, casualties is wounded plus killed. Real data set is indeed from 1980 to 2016.
          There are no missing values for targtype1.

          I run the code in Stata, first there was an error message. But one variable name was incorrect. It worked!
          Last edited by Eszti Bambacht; 26 Apr 2018, 14:23.

          Comment


          • #6
            I want indeed the merge these dataset with another dataset. Can you explain the difference between the two options?

            Comment


            • #7
              Did you run both of the codes shown in #5? While the numeric results they produce are the same, the layout and organization of the results is different. Perhaps open up two instances of Stata and run one of them in each. Then look at the two resulting data sets side-by-side and you will see what the difference is.

              Comment


              • #8
                Yes I did, thank you. The second one is indeed better. I do have a lot of missing variables in the new dataset..

                Comment


                • #9
                  When I run my dataset, the new variable busterr and nonbusterr have a lot of missing values, whereas the original dataset didnt have any missing values on those variables


                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int year str32 tarnatname int tarnatnum long busterr double buscas long nonbusterr double nonbuscas
                  1987 "Afghanistan"          4  .   .    1     2
                  1988 "Afghanistan"          4  .   .   11   234
                  1989 "Afghanistan"          4  4  71    6    23
                  1990 "Afghanistan"          4  .   .    2   134
                  1991 "Afghanistan"          4  2   4   28   191
                  1992 "Afghanistan"          4  1  81   35   153
                  1994 "Afghanistan"          4  .   .    9    52
                  1995 "Afghanistan"          4  .   .    6    15
                  1996 "Afghanistan"          4  .   .    4    61
                  1997 "Afghanistan"          4  .   .    1    41
                  1998 "Afghanistan"          4  .   .    1    38
                  1999 "Afghanistan"          4  .   .    9    70
                  2000 "Afghanistan"          4  1   0   13    24
                  2001 "Afghanistan"          4  1   0   13    83
                  2002 "Afghanistan"          4  4  77   34   289
                  2003 "Afghanistan"          4  3   9   97   329
                  2004 "Afghanistan"          4  2  27   86   477
                  2005 "Afghanistan"          4 11  40  144   644
                  2006 "Afghanistan"          4 12  25  270  1507
                  2007 "Afghanistan"          4 25  83  315  2133
                  2008 "Afghanistan"          4 36 147  379  2229
                  2009 "Afghanistan"          4 70 291  432  2363
                  2010 "Afghanistan"          4 57 306  484  2411
                  2011 "Afghanistan"          4 29 557  392  2799
                  2012 "Afghanistan"          4 37 238 1431  7827
                  2013 "Afghanistan"          4 32 129 1409  7645
                  2014 "Afghanistan"          4 30 190 1791  9604
                  2015 "Afghanistan"          4 48 301 1879 12159
                  2016 "Afghanistan"          4 26 107 1589 12088
                  1981 "Albania"              5  .   .    1     0
                  1990 "Albania"              5  .   .    1     0
                  1991 "Albania"              5  .   .    1     0
                  1992 "Albania"              5  .   .    3     1
                  1994 "Albania"              5  .   .    2     1
                  1996 "Albania"              5  1  35    5     9
                  1997 "Albania"              5  4  20   37    83
                  1998 "Albania"              5  .   .    7     5
                  1999 "Albania"              5  .   .    3    11
                  2000 "Albania"              5  .   .    2     0
                  2001 "Albania"              5  .   .    1     0
                  2003 "Albania"              5  .   .    1     0
                  2009 "Albania"              5  .   .    1     0
                  2013 "Albania"              5  .   .    1     0
                  2014 "Albania"              5  .   .    2     3
                  2015 "Albania"              5  1   0    3     0
                  2016 "Albania"              5  .   .    2     0
                  1990 "Algeria"              6  .   .    2     4
                  1991 "Algeria"              6  1   0   29   115
                  1992 "Algeria"              6  8   5  207   505
                  1994 "Algeria"              6 20  36  207   597
                  1995 "Algeria"              6 22 193  163  1285
                  1996 "Algeria"              6 27 330  102  1291
                  1997 "Algeria"              6 27 236  317  5812
                  1998 "Algeria"              6 10  27  141  1398
                  1999 "Algeria"              6  3  46  103   690
                  2000 "Algeria"              6  7  76  131   691
                  2001 "Algeria"              6 10  73  103   733
                  2002 "Algeria"              6  8  33  124   825
                  2003 "Algeria"              6  3   2   72   348
                  2004 "Algeria"              6  2   4   65   343
                  2005 "Algeria"              6  2   3  102   440
                  2006 "Algeria"              6  4   6  148   445
                  2007 "Algeria"              6  7  13  117   879
                  2008 "Algeria"              6  5  32  102   545
                  2009 "Algeria"              6  6  20  102   386
                  2010 "Algeria"              6  3   9   97   307
                  2011 "Algeria"              6  .   .   15    59
                  2012 "Algeria"              6  1   0   40   135
                  2013 "Algeria"              6  2  80   20    69
                  2014 "Algeria"              6  .   .   13    67
                  2015 "Algeria"              6  1   0   15    30
                  2016 "Algeria"              6  .   .    9    11
                  1981 "Angola"               8  .   .    1     3
                  1982 "Angola"               8  .   .    2     2
                  1983 "Angola"               8  .   .    7   601
                  1984 "Angola"               8  1   3   10   246
                  1985 "Angola"               8  1   0    5     5
                  1986 "Angola"               8  1   2    4   140
                  1987 "Angola"               8  .   .    3    15
                  1988 "Angola"               8  2   0   10   266
                  1989 "Angola"               8  2   1   10    97
                  1990 "Angola"               8 11  19  194   345
                  1991 "Angola"               8  2   4   14   140
                  1992 "Angola"               8 10   8   40   145
                  1994 "Angola"               8  3   4    6   179
                  1995 "Angola"               8  1  10    9    68
                  1996 "Angola"               8  1   9    3    15
                  1997 "Angola"               8  .   .    7    66
                  1998 "Angola"               8  2  53   18   102
                  1999 "Angola"               8  4   8   30   456
                  2000 "Angola"               8  2   1   20   107
                  2001 "Angola"               8  3  22   37  1162
                  2002 "Angola"               8  .   .    6    52
                  2009 "Angola"               8  .   .    1     2
                  2010 "Angola"               8  1   2    1     9
                  2016 "Angola"               8  1   0    1     0
                  1992 "Antigua and Barbuda" 10  .   .    1     0
                  1980 "Argentina"           11  1   0    5     1
                  1981 "Argentina"           11  .   .    2     0
                  1982 "Argentina"           11  4   0    5     1
                  1983 "Argentina"           11  .   .   18     4
                  1984 "Argentina"           11  5   0   41     5
                  1985 "Argentina"           11  6   3   37     9
                  1986 "Argentina"           11  3   0   30    14
                  1987 "Argentina"           11  9   0   71    14
                  1988 "Argentina"           11 12  12   21     9
                  1989 "Argentina"           11  4   0   28    43
                  1990 "Argentina"           11 20   1   11     1
                  1991 "Argentina"           11  3   0   24     8
                  1992 "Argentina"           11 14   2   27   254
                  1994 "Argentina"           11  2   1   12   351
                  1995 "Argentina"           11  .   .   16     1
                  1996 "Argentina"           11  8   0   11    13
                  1997 "Argentina"           11  3   1    8     4
                  1999 "Argentina"           11  .   .    1     0
                  2001 "Argentina"           11  .   .    2     1
                  2003 "Argentina"           11  1   0    .     .
                  2005 "Argentina"           11  3   0    .     .
                  2009 "Argentina"           11  .   .    1     0
                  2010 "Argentina"           11  2   0    3     0
                  2011 "Argentina"           11  .   .    1     0
                  2012 "Argentina"           11  .   .    2     0
                  2013 "Argentina"           11  .   .    2     2
                  2015 "Argentina"           11  .   .    1     0
                  2016 "Argentina"           11  1   0    1     2
                  1991 "Armenia"             12  .   .    1     0
                  1992 "Armenia"             12  .   .    2     3
                  1994 "Armenia"             12  1   0    3    61
                  1995 "Armenia"             12  .   .    1    22
                  1996 "Armenia"             12  1   2    .     .
                  1998 "Armenia"             12  1   0    1     1
                  1999 "Armenia"             12  .   .    2     0
                  2000 "Armenia"             12  .   .    2     0
                  2001 "Armenia"             12  .   .    2     1
                  2007 "Armenia"             12  .   .    1     0
                  2009 "Armenia"             12  .   .    1     1
                  2013 "Armenia"             12  .   .    1     1
                  2015 "Armenia"             12  .   .    2     1
                  2016 "Armenia"             12  .   .    2     7
                  1980 "Australia"           14  4   2    2     3
                  1981 "Australia"           14  .   .    1     1
                  1982 "Australia"           14  .   .    2     3
                  1986 "Australia"           14  .   .    2    23
                  1988 "Australia"           14  .   .    3     0
                  1989 "Australia"           14  .   .    2     0
                  1991 "Australia"           14  1   0    3     0
                  1992 "Australia"           14  .   .    4     5
                  1994 "Australia"           14  .   .    9    11
                  1995 "Australia"           14  2   0    3     0
                  1996 "Australia"           14  1   3    4     0
                  end

                  Comment


                  • #10
                    Ah, yes. -collapse-'s (count) operator returns missing values rather than zeroes when it encounters no qualifying observations. So you need to add one more line of code at the end:
                    Code:
                    mvencode busterr nonbusterr, mv(0)
                    Sorry, I should have noticed that earlier.

                    Comment


                    • #11
                      Thank you Clyde that totally worked.
                      And for business casaulties (buscas). When there is no business-terrorism related attack, the value at business casaulties is missing. This is logical. But can I change that to zero without problems?

                      Code:
                      replace buscas=0 if buscas==.

                      Comment


                      • #12
                        and the same for nonbusiness casualties ofcourse.

                        Comment


                        • #13
                          Yes.

                          Comment

                          Working...
                          X