  • Tagging lagging variable based on time

    Hello,

    I'm trying to create a lagging flag based on the following day's observations. The variable I'm trying to create is Flag(t-1), as shown below.
    Flag(t-1) should equal the value of Flag from the next day. E.g. the first cell in Flag(t-1) should be the value of Flag for Classification aaa on Day 2.

    Would appreciate some advice on the code required to achieve this.

    I have tried sorting and creating a group, but it's not working at all.

    Thanks


    [Image: Lagging var example.PNG]

  • #2
    The two screenshots you show are quite different. The first one seems to match what you wrote in words; the second is nothing like it. I'll assume you want what you said and showed in the first screenshot; I'll ignore the second--I don't know what it has to do with this.

    Code:
    xtset Var Time
    gen wanted = F1.Flag
    Note: In order to even talk about "from the next day" it must be the case that for any value of Var, there is at most one observation on any given day. If there are two or more observations for, say, day 3, then it is meaningless to speak of "from the next day" at day 2, as the choice among them is indeterminate. The -xtset- command will verify this condition, and will halt execution with an error message if it is not met. If that happens, it means that your data are not appropriate for this concept--usually due to errors in the data itself. In that case, -duplicates list Var Time- will show you the offending pairings of Var and Time. You can then investigate further to understand what is going on, and, hopefully, fix the data errors. Or, you might realize that in fact the data are correct and the concept of "from the next day" is just not applicable--in which case you need a new plan.
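
    As a rough illustration of the check and the lead operator together, here is a minimal sketch on made-up data. The variable names id, day and flag are hypothetical stand-ins for Var, Time and Flag. Note that -xtset- needs a numeric panel identifier, so if Var is a string you would first need something like -egen id = group(Var)-.

```stata
clear
input byte(id day flag)
1 1 0
1 2 1
1 3 1
2 1 1
2 2 0
2 3 0
end

* Check there is at most one observation per id/day pair
duplicates report id day
isid id day            // halts with an error if id/day is not unique

* Declare the panel structure, then take the one-period lead of flag
xtset id day
gen wanted = F1.flag   // missing on the last day of each id

list, sepby(id)
```

    With these toy data, wanted is missing on day 3 for both panels because there is no following day to draw from.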

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. (Screenshots are the least helpful way to show data.) It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
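
    For instance, a minimal sketch using the auto dataset that ships with Stata (chosen here purely for demonstration; substitute your own variables and observation range):

```stata
sysuse auto, clear

* Produce a copy-and-paste data example for three variables,
* restricted to the first five observations
dataex make price mpg in 1/5
```

    The output in the Results window can then be copied and pasted directly into a forum post.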


    • #3
      Thanks Clyde. I have a very large dataset (29m rows) and the screenshot sample was the quickest option for an absolute Stata novice. I tried the randomtag option but that also didn't work, given that I needed a few observations for 5 different stocks on different days.

      Here is the revised picture (again an Excel screenshot, since I need to learn more Stata coding before I can use dataex). I have now included the variable type in the label field and included the table below (excluding my arrows). The picture is still the clearest depiction of what I need to do.

      Description of data:
      - Each row represents a trade for a particular stock at a time through a day
      - On some days there is more than one trade (e.g. day 2 for stock aaa)
      - The Flag field will always be the same for that stock on the day irrespective of how many trades occur

      The field I would like to generate is called Flag(t-1) and this field is derived as follows:
      - It simply takes the value of the Flag field from the following day for that same stock
      - eg for aaa on day 2, the Flag(t-1) field takes on the value of the Flag field for aaa on day 3, for all three trades on day 2




      [Image: Lagging var example 2.PNG]
      Stock (str6)  Trades in day  Day (long %tdD_m_Y)  Flag (byte %8.0g)  Flag (t-1)
      aaa           1              1                    0                  1
      aaa           3              2                    1                  1
      aaa           3              2                    1                  1
      aaa           3              2                    1                  1
      aaa           1              3                    1                  -
      bbb           1              1                    1                  0
      bbb           1              2                    0                  0
      bbb           1              3                    0                  -
      ccc           1              1                    0                  1
      ccc           1              2                    1                  1
      ccc           2              3                    1                  1
      ccc           2              3                    1                  1
      ccc           1              4                    1                  1
      ccc           1              5                    1
      ddd           1              1                    1                  0
      ddd           1              2                    0                  0
      ddd           1              3                    0                  0


      • #4
        OK, the presence of multiple observations for the same stock/day makes it a little more complicated, but not that much.

        Code:
        clear*
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str3 stock byte(tradesinday day flag)
        "aaa" 1 1 0
        "aaa" 3 2 1
        "aaa" 3 2 1
        "aaa" 3 2 1
        "aaa" 1 3 1
        "bbb" 1 1 1
        "bbb" 1 2 0
        "bbb" 1 3 0
        "ccc" 1 1 0
        "ccc" 1 2 1
        "ccc" 2 3 1
        "ccc" 2 3 1
        "ccc" 1 4 1
        "ccc" 1 5 1
        "ddd" 1 1 1
        "ddd" 1 2 0
        "ddd" 1 3 0
        end
        
        //  VERIFY FLAG IS CONSTANT WITHIN DAY
        by stock day (flag), sort: assert flag[1] == flag[_N]
        
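        //  TAKE FLAG FROM THE NEXT RECORDED DAY FOR THE SAME STOCK
        sort stock day, stable
        // The last observation of each stock/day looks ahead to the first
        // observation of the next day (missing on the stock's final day)
        by stock: gen wanted = flag[_n+1] if day != day[_n+1]
        // Spread that value to every trade on the same stock/day
        by stock day: replace wanted = wanted[_N]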
        The first "paragraph" verifies your claim that flag is always the same for a given stock on a given day. That is critical--if it isn't true then the problem is ill-posed and can't be solved. So we verify it first. The second paragraph actually does the work.
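
        If it helps to see the same idea from another angle, here is an alternative sketch (not the code above, just one possible variant): reduce the data to one row per stock and day, take the next row's flag, and merge the result back onto every trade. The name wanted2 and the shortened example data are assumptions made for illustration. As with the code above, "next day" here means the next day that appears in the data for that stock, not necessarily the next calendar day.

```stata
clear
input str3 stock byte(day flag)
"aaa" 1 0
"aaa" 2 1
"aaa" 2 1
"aaa" 2 1
"aaa" 3 1
"bbb" 1 1
"bbb" 2 0
"bbb" 3 0
end

* Build a one-row-per-stock/day table holding that day's flag
preserve
contract stock day flag
drop _freq
* Flag from the next recorded day within each stock; missing on the last day
bysort stock (day): gen wanted2 = flag[_n+1]
keep stock day wanted2
tempfile nextflag
save `nextflag'
restore

* Attach the per-day result to every trade on that day.
* m:1 fails if flag varied within a stock/day, since the lookup
* table would then have more than one row for that pair.
merge m:1 stock day using `nextflag', assert(match) nogenerate
sort stock day
list, sepby(stock)
```

        This takes more steps than the two -by- lines above, but it keeps the per-day logic separate from the per-trade data, which some people find easier to check by eye.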

        Concerning -dataex-, it is one of the easiest commands in all of Stata. If you need to learn more Stata programming to use it, then you are not ready to do anything useful in Stata. Given that you are a beginner, I think you should step back from this project or other production work and take some time to learn the basics. Access the PDF manuals that come installed with your Stata. (You can reach them from the Help menu in Stata.) Read the Getting Started [GS] volume appropriate to your setup, and then read the User's Guide [U]. It's a lot of reading, and you won't remember everything. But it will give you an exposure to the basics of using Stata--it covers the commands that everyone needs to be able to use in order to be productive. From there, in most situations, you will have a picture of how to approach your problem, and then the online help files will enable you to fill in the details. The time you invest in this will, I assure you, be amply repaid.
