  • Data preparation - Store value of variable as local macro in loop

    Dear Statalisters,

    I am quite new to Stata and I am preparing my data for an event study.

    I have a problem creating a loop that stores the value of a variable for a specific row in my dataset as a local macro. This stored macro will later be used in the loop in my preparation.

    I have created an example to illustrate what I mean:

    If I have a dataset looking like this:
    var1 var2
    1 3
    2 6
    3 7
    4 1
    5 12
    6 7
    7 12
    8 2
    9 4
    10 8
    And I have created the following code:

    foreach l of var1 {
    local v = var2
    display `v'
    }

    I would in this case like to get the output displayed:
    3
    6
    7
    1
    12
    7
    12
    2
    4
    8

    The numbers above are just an example and a simplification of my dataset, but the main problem is the same: Stata stores only the value of var2 in the first row as v, in this case 3.

    I would appreciate any help I can get

    Regards

  • #2
    Your loop is a loop over one case, which is harmless but not what you want. Stata doesn't interpret that syntax as "look inside this variable and loop over its values in turn".

    What is inside the loop does not vary with the loop index, so it's the same instruction.

    You have found out that

    display varname

    is always interpreted as

    display varname[1]

    which is documented. http://www.stata.com/support/faqs/pr...-if-qualifier/

    In order to display the value of each observation in turn, you must loop explicitly over all the observations and refer to each value in turn. This is a general way to do it:

    Code:
    forval i = 1/`=_N' {
        display varname[`i']
    }
    The code `=_N' evaluates the number of observations on the fly, so that the loop code sees an actual number 42, or 666, or 7 billion, or whatever.

    Of course that's dopey, as list varname gets you there, but it's understood that your example is proxy for your real problem.

    Notice that there is no gain in just putting a value of a variable into a local macro and then taking it out again. That's like: here I have a pen, now I put it in a box, now I take it out again. If you just want the pen, the business with the box can be dispensed with. There is a local macro in the code, and it's what controls the loop and (here, as is typical although not always necessary) it is also used inside the loop.
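For readers who do need the value in a local macro, as the original question asks, the loop above can be adapted as in the following minimal sketch. It assumes the ten-row example data from the first post (var1, var2); the macro name v is taken from the question, and the subscript var2[`i'] picks out observation `i'.

```stata
* Recreate the example data from the first post
clear
input var1 var2
1 3
2 6
3 7
4 1
5 12
6 7
7 12
8 2
9 4
10 8
end

* Loop over observation numbers 1 to _N; i is the local macro that controls the loop
forvalues i = 1/`=_N' {
    * Copy the value of var2 in observation i into the local macro v
    local v = var2[`i']
    display `v'
}
```

This should display the ten values of var2 in observation order. As the post points out, the detour through v gains nothing if the value is only displayed; it matters only if v is reused later inside the loop.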
    Last edited by Nick Cox; 12 Apr 2016, 03:18.


    • #3
      You mention that you're quite new to Stata, that you're doing data preparation and that you're looping over observations to do it.

      In addition to Nick's excellent advice, I would like to point out something that might help you in case you're not already aware of it. In general, Stata's functions are "vectorized". Because of this, it's not often that you will need to explicitly loop over observations in order to accomplish most data-preparation objectives.

      I'm not sure what you're ultimately trying to do, but there is liable to be a more "Stata-ish" approach (more elegant and faster) available to you than one that requires explicitly looping over observations.
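As a small illustration of what "vectorized" means here, the sketch below again assumes the ten-row example data from the first post. The new variable names var2_doubled and big are hypothetical and chosen only for this example.

```stata
* Recreate the example data from the first post
clear
input var1 var2
1 3
2 6
3 7
4 1
5 12
6 7
7 12
8 2
9 4
10 8
end

* One statement fills the new variable for every observation at once
generate var2_doubled = 2 * var2

* An if qualifier restricts the same kind of statement to some observations
generate byte big = 0
replace big = 1 if var2 > 6

list
```

Neither statement needs forvalues or a subscript: generate and replace evaluate the expression once per observation on their own.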


      • #4
        Thank you very much for your answer. However, I am not able to make the loop work as I want it to. Below I describe my dataset in more detail.


        I have a large dataset that looks like this:

        company id date industry offer
        1 2015.01.01 10
        1 2015.01.02 10 1
        1 2015.01.03 10
        1 2015.01.04 10
        2 2015.01.01 10
        2 2015.01.02 10
        2 2015.01.03 10
        2 2015.01.04 10
        3 2015.01.01 12
        3 2015.01.02 12
        3 2015.01.03 12
        3 2015.01.04 12 1
        4 2015.01.01 12
        4 2015.01.02 12
        4 2015.01.03 12
        4 2015.01.04 12
        5 2015.01.01 16
        5 2015.01.02 16
        5 2015.01.03 16
        5 2015.01.04 16
        I want to make a new variable that equals 1 for all companies within the same industry at the date of the offer in that industry. In other words I would like it to look like this:
        company id date industry offer offer_in_industry
        1 2015.01.01 10
        1 2015.01.02 10 1 1
        1 2015.01.03 10
        1 2015.01.04 10
        2 2015.01.01 10
        2 2015.01.02 10 1
        2 2015.01.03 10
        2 2015.01.04 10
        3 2015.01.01 12
        3 2015.01.02 12
        3 2015.01.03 12
        3 2015.01.04 12 1 1
        4 2015.01.01 12
        4 2015.01.02 12
        4 2015.01.03 12
        4 2015.01.04 12 1
        5 2015.01.01 16
        5 2015.01.02 16
        5 2015.01.03 16
        5 2015.01.04 16
        My real dataset contains several hundred offers, some on the same day for different industries. One company can also have several offers on different days.

        I was trying to code a loop that goes through all offers in the dataset and, for each offer, saves the date and industry for that particular line as local macros. The code then goes through the entire dataset and sets the variable offer_in_industry to 1 if the local macro that saves the industry equals industry and the local macro that saves the date equals date.

        Maybe this is not the right way to do it?

        Regards


        • #5
          I realised that the tables above became quite hard to read when the post was made. Below they are formatted so they should be easier to read:
          company id  date        industry  offer
          1           2015.01.01  10
          1           2015.01.02  10        1
          1           2015.01.03  10
          1           2015.01.04  10
          2           2015.01.01  10
          2           2015.01.02  10
          2           2015.01.03  10
          2           2015.01.04  10
          3           2015.01.01  12
          3           2015.01.02  12
          3           2015.01.03  12
          3           2015.01.04  12        1
          4           2015.01.01  12
          4           2015.01.02  12
          4           2015.01.03  12
          4           2015.01.04  12
          5           2015.01.01  16
          5           2015.01.02  16
          5           2015.01.03  16
          5           2015.01.04  16

          company id  date        industry  offer  offer_in_industry
          1           2015.01.01  10
          1           2015.01.02  10        1      1
          1           2015.01.03  10
          1           2015.01.04  10
          2           2015.01.01  10
          2           2015.01.02  10               1
          2           2015.01.03  10
          2           2015.01.04  10
          3           2015.01.01  12
          3           2015.01.02  12
          3           2015.01.03  12
          3           2015.01.04  12        1      1
          4           2015.01.01  12
          4           2015.01.02  12
          4           2015.01.03  12
          4           2015.01.04  12               1
          5           2015.01.01  16
          5           2015.01.02  16
          5           2015.01.03  16
          5           2015.01.04  16
          Thank you again for your help.


          • #6
            Joseph is absolutely right. Looping over observations is the absolute last resort. In your case, code could be, as I understand it,

            Code:
             
            bysort industry date (offer) : gen offer_in_industry = offer[_N]
            assuming that whatever is not 1 should be 0 in the final two variables.
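The assumption stated above matters because Stata sorts numeric missing values after every nonmissing number: if offer were missing rather than 0 wherever there is no offer, offer[_N] would pick up a missing value in any group that contains one. The sketch below is one possible alternative for that case, not the method given in the post. It assumes offer is 1 or missing, stores date as a string for simplicity, and uses the hypothetical variable name company for the company identifier.

```stata
* Example data from post #4, with offer missing (.) where there is no offer
clear
input company str10 date industry offer
1 "2015.01.01" 10 .
1 "2015.01.02" 10 1
1 "2015.01.03" 10 .
1 "2015.01.04" 10 .
2 "2015.01.01" 10 .
2 "2015.01.02" 10 .
2 "2015.01.03" 10 .
2 "2015.01.04" 10 .
3 "2015.01.01" 12 .
3 "2015.01.02" 12 .
3 "2015.01.03" 12 .
3 "2015.01.04" 12 1
4 "2015.01.01" 12 .
4 "2015.01.02" 12 .
4 "2015.01.03" 12 .
4 "2015.01.04" 12 .
5 "2015.01.01" 16 .
5 "2015.01.02" 16 .
5 "2015.01.03" 16 .
5 "2015.01.04" 16 .
end

* (offer == 1) is 1 where there is an offer and 0 otherwise, including missing;
* its maximum within each industry-date group flags groups with any offer
egen offer_in_industry = max(offer == 1), by(industry date)

list, sepby(company)
```

With these data, offer_in_industry is 1 for companies 1 and 2 on 2015.01.02 and for companies 3 and 4 on 2015.01.04, and 0 everywhere else.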


            • #7
              Thank you very much for your answer. It worked just as it was supposed to.
