Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Replacing observations for variables within dataset

    currently my data set looks like this:

    . dataex p_wine_on gor year

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float p_wine_on byte gor int year
    2.385 3 2012
    2.385 3 2012
    2.385 2 2012
    2.385 3 2012
    2.385 3 2012
    2.385 2 2012
    2.385 4 2012
    2.385 4 2012
    2.385 3 2012
    2.385 2 2012
    2.385 3 2012
    2.385 2 2012
    2.385 3 2012
    2.385 4 2012
    2.385 4 2012
    2.385 5 2012
    2.385 4 2012
    2.385 2 2012
    2.385 6 2012
    2.385 4 2012
    2.385 2 2012
    2.385 6 2012
    2.385 3 2012
    2.385 3 2012
    2.385 3 2012
    2.385 4 2012
    2.385 2 2012
    2.385 3 2012
    2.385 4 2012
    2.385 3 2012
    2.385 2 2012
    2.385 3 2012
    2.385 6 2012
    2.385 6 2012
    2.385 2 2012
    2.385 3 2012
    2.385 2 2012
    2.385 4 2012
    2.385 4 2012
    2.385 3 2012
    2.385 4 2012
    2.385 6 2012
    2.385 3 2012
    2.385 3 2012
    2.385 3 2012
    2.385 3 2012
    2.385 3 2012
    2.385 4 2012
    2.385 3 2012
    2.385 4 2012
    2.385 7 2012
    2.385 3 2012
    2.385 4 2012
    2.385 2 2012
    2.385 2 2012
    2.385 3 2012
    2.385 2 2012
    2.385 4 2012
    2.385 6 2012
    2.385 7 2012
    2.385 4 2012
    2.385 6 2012
    2.385 2 2012
    2.385 1 2012
    2.385 2 2012
    2.385 1 2012
    2.385 5 2012
    2.385 3 2012
    2.385 3 2012
    2.385 3 2012
    2.385 4 2012
    2.385 3 2012
    2.385 4 2012
    2.385 2 2012
    2.385 2 2012
    2.385 5 2012
    2.385 3 2012
    2.385 4 2012
    2.385 3 2012
    2.385 1 2012
    2.385 3 2012
    2.385 3 2012
    2.385 4 2012
    2.385 3 2012
    2.385 3 2012
    2.385 2 2012
    2.385 3 2012
    2.385 1 2012
    2.385 1 2012
    2.385 1 2012
    2.385 2 2012
    2.385 2 2012
    2.385 2 2012
    2.385 4 2012
    2.385 3 2012
    2.385 4 2012
    2.385 3 2012
    2.385 3 2012
    2.385 4 2012
    2.385 2 2012
    end
    label values gor Gorx
    label def Gorx 1 "North East", modify
    label def Gorx 2 "North West & Merseyside", modify
    label def Gorx 3 "Yorkshire and the Humber", modify
    label def Gorx 4 "East Midlands", modify
    label def Gorx 5 "West Midlands", modify
    label def Gorx 6 "Eastern", modify
    label def Gorx 7 "London", modify

    There are 7 Gor groups which stand for government region. At the moment my p_wine_on (price for wine on trade) variable only varies with year (there are 6 years). However i want to replace the p_wine_on for different gor groups within the years
    e.g at the moment for all Gor groups in 2007 price of wine on trade is 1.825 but now i want it to be different for each 7 groups and this is the case for every year. There are 6 years in the study

    How shall i do this?

    I have tried using the following code:
    Code:
    replace p_wine_on = 1.9564 if gor=1 & year=2007
    but it says invalid syntax
    Last edited by Anya hewertson; 09 May 2019, 12:41.

  • #2
    You may use "==" for the if clauses.
    Best regards,

    Marcos

    Comment


    • #3
      Hi all.

      I'm stuck with something that is probably very simple.

      I have a list of firms, years, and the count of specific actions per year in the following format:

      Firm Year Count
      A 1999 1
      A 2000 0
      B 1999 2
      B 2001 3

      What I am trying to do is get to this format:

      Firm Year Count Count_1999 Count_2000 Count 2001
      A 1999 1 1 0 0
      A 2000 0 1 0 0
      B 1999 2 2 0 3
      B 2001 3 2 0 3

      To then collapse it to:

      Firm Count_1999 Count_2000 Count 2001
      A 1 0 0
      B 2 0 3

      I tried:

      gen Count_1999==.
      bysort Firm Year: replace Count_1999 = Count if year==2000

      That produces the following.

      Firm Year Count Count_1999
      A 1999 1 1
      A 2000 0 .
      B 1999 2 .
      B 2001 3 .

      Any ideas how to fill in 1 for the second A firm? I have thousands of firms in the sample, so I can't set it equal to a single value. I need to set all values, in the Count_1999 variable equal to a single specific count per year, in an automated way.

      Thanks in advance,

      Killian

      Comment


      • #4
        You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output and sample data using dataex. We don't even know exactly what you have run.

        You first should xtset your data letting you use leads and lags. Stata then knows not to lead or lag across panels. Then


        g Count_1999 = L.Count if year==2000
        replace Count_1999 = Count if year==1999

        ​​​​​​​g Count_2000 = F.Count if year==1999
        replace Count_2000 = Count if year==2000

        Then you can collapse or (if you really only have 2 years per firm) just keep the first .or second observation
        keep if year==1999

        Comment


        • #5
          Hi killian mccarthy ,

          It looks to me like you want to reshape your data from long to wide (help reshape). I will give you an example below, although it turns out that most things in Stata are actually easier if you keep them in long format. (You can search this forum for reshape). For example, see here and here.

          reshape is a *very* useful command in Stata, but it takes some getting used to. Also, even though I have used it many times, I still have to look up the help (either in Stata or in Google) every time to remember what to use for i() and j().

          I created some toy data to illustrate:
          Code:
          dataex firm year count // Data shared using -dataex-. To install: ssc install dataex
          clear
          input str9 firm int year byte count
          "Google"    2005 1
          "Google"    2006 2
          "Google"    2007 5
          "Google"    2008 3
          "Google"    2009 5
          "Google"    2010 6
          "Google"    2011 1
          "Google"    2012 2
          "Google"    2013 1
          "Microsoft" 2005 2
          "Microsoft" 2006 3
          "Microsoft" 2007 1
          "Microsoft" 2008 6
          "Microsoft" 2009 1
          "Microsoft" 2010 4
          "Microsoft" 2011 2
          "Microsoft" 2012 1
          "Microsoft" 2013 5
          "IBM"       2005 3
          "IBM"       2006 5
          "IBM"       2007 1
          "IBM"       2008 1
          "IBM"       2009 5
          "IBM"       2010 6
          "IBM"       2011 1
          "IBM"       2012 6
          "IBM"       2013 5
          end
          ------------------ copy up to and including the previous line ------------------


          Code:
          . list, sepby(firm) noobs
          
            +--------------------------+
            |      firm   year   count |
            |--------------------------|
            |    Google   2005       1 |
            |    Google   2006       2 |
            |    Google   2007       5 |
            |    Google   2008       3 |
            |    Google   2009       5 |
            |    Google   2010       6 |
            |    Google   2011       1 |
            |    Google   2012       2 |
            |    Google   2013       1 |
            |--------------------------|
            | Microsoft   2005       2 |
            | Microsoft   2006       3 |
            | Microsoft   2007       1 |
            | Microsoft   2008       6 |
            | Microsoft   2009       1 |
            | Microsoft   2010       4 |
            | Microsoft   2011       2 |
            | Microsoft   2012       1 |
            | Microsoft   2013       5 |
            |--------------------------|
            |       IBM   2005       3 |
            |       IBM   2006       5 |
            |       IBM   2007       1 |
            |       IBM   2008       1 |
            |       IBM   2009       5 |
            |       IBM   2010       6 |
            |       IBM   2011       1 |
            |       IBM   2012       6 |
            |       IBM   2013       5 |
            +--------------------------+
          
          
          help reshape
          reshape wide count, i(firm) j(year)
          * Count is the name of the variable that Stata will use as a "Stub", and so will create count2005, count2006, etc
          * The variable you list in j() tells Stata what to put at the end of the stub. (In this case 2005, 2006, etc)
          
          rename count* c_*  // just shortening the variable name to make it easier to print here (help rename group for more on this)
          list, noobs
          
            +--------------------------------------------------------------------------------------------+
            |      firm   c_2005   c_2006   c_2007   c_2008   c_2009   c_2010   c_2011   c_2012   c_2013 |
            |--------------------------------------------------------------------------------------------|
            |    Google        1        2        5        3        5        6        1        2        1 |
            |       IBM        3        5        1        1        5        6        1        6        5 |
            | Microsoft        2        3        1        6        1        4        2        1        5 |
            +--------------------------------------------------------------------------------------------+
          Last edited by David Benson; 12 May 2019, 00:04.

          Comment

          Working...
          X