  • Looping through multiple levels of variables not working

    Hello, I am working on a project but I am struggling with my Stata code. I have a data set that I created using public FARS data. The file has information on fatal car crashes. I am trying to analyze and reorganize the variables that I currently have. There are variables for county, case number (unique by county), vehicle number (unique by case), and person number (unique by vehicle). There are also variables for whether the driver was drunk (dr_drink), demographic variables (sex, age, etc.), and whether the person died (inj_sev).

    My task is to put together city/date panel data on fatal car accidents. First I need to create variables for whether an accident involved a drunk driver, how many fatalities there were, how many cars were involved, etc.

    Starting off, I created a variable of unique case numbers by combining the county and case codes into a single code called incidentcode. Now I need to look at each case to make my new variables. Starting with drunk_involved this is what I have so far.

    gen drunk_involved = 0
    tempvar drunk
    gen `drunk' = 0
    quietly levelsof unique_code, local(incidentcode)
    foreach unique_code of local incidentcode {
    `drunk' = 0
    foreach veh of varlist veh_no {
    foreach per of varlist per_no {
    `drunk' = 1 if dr_drink == 1
    dis `drunk'
    }
    }
    drunk_involved = 1 if `drunk' = 1
    }

    Obviously my code is not working. I have coding experience, although I am somewhat new to Stata and I always struggle to figure out how to make loops work. If I could get help making the variable drunk_involved work, I am sure I can figure out the other variables.

    Another question: when I try to make panel data I run into issues. I think this is because of the multiple levels in the data. Because there are multiple vehicles and people in each case, when I try to create the panel I get an error saying I have repeated time values within my panel. I simply don't know how to deal with this issue. Thanks so much for your help!

  • #2
    Hi Eli,

    Welcome to Statalist. The good news is this is an eminently solvable problem. In order to help people on the forum answer your question though, could you first install dataex (if you don't already have it) by typing ssc install dataex in Stata, then use dataex to provide a snippet of your data? You can copy and paste it into this thread.

    With that in hand, I'm sure you'll get an answer to your question very soon!



    • #3
      Thanks Chris for the help. I'm really new to this forum. Here is a snippet of my data:
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(veh_no per_no) long st_case float incidentdate byte dr_drink
      1 3 390001 17898 1
      1 2 390001 17898 1
      1 1 390001 17898 1
      1 4 390001 17898 1
      1 5 390001 17898 1
      2 3 390001 18267 0
      2 1 390001 18267 0
      2 2 390001 18267 0
      1 2 390001 18267 0
      1 1 390001 18267 0
      end
      format %d incidentdate



      • #4
        Sure thing - I'm happy to help. I notice in your code you're trying to create a local called incidentcode that contains the distinct levels of a variable unique_code (with the levelsof command). You don't have a variable called incidentcode or unique_code in the data snippet you shared, though. I can only see incidentdate.



        • #5
          There are various bugs here and even more puzzling details. It would help to know what you want to do more precisely.

          I suspect that your code so far boils down to

          Code:
          * code so far?
          gen drunk_involved = dr_drink == 1
          What is certainly wrong includes command lines like

          Code:
          `drunk' = 0
          
          `drunk' = 1 if dr_drink == 1
          as no Stata command starts with a variable name. I guess that something like replace is intended as the command name. If so, then the first command of those two is still pointless as the variable in question is already 0 in all observations. The second of those commands together with your first command does what I suggest as your code so far.
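
          The equivalence claimed above can be checked directly. This is a minimal sketch on made-up data, assuming replace is indeed the intended command; the names drunk_a and drunk_b and the toy values are hypothetical, and only dr_drink comes from the thread.

```stata
* Toy data: the values are hypothetical; dr_drink is the variable from the thread
clear
input byte dr_drink
1
0
1
.
end

* Two-step version: the original lines with -replace- supplied as the command
generate byte drunk_a = 0
replace drunk_a = 1 if dr_drink == 1

* One-line version: the logical expression evaluates to 1 or 0 in each observation
generate byte drunk_b = dr_drink == 1

* -assert- stops with an error if the two versions differ in any observation
assert drunk_a == drunk_b
```

          Both versions give 0 where dr_drink is missing, because a missing value is not equal to 1.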

          I can't see why a temporary variable is being used here. You want to create a new variable as extra in the dataset. But the intent to create a variable for a drunk driver seems to be already satisfied by dr_drink that already exists.


          You have three nested loops, starting

          Code:
          foreach unique_code of local incidentcode {
          
          foreach veh of varlist veh_no {
          
          foreach per of varlist per_no {
          but nowhere inside any of those loops do you refer to the local macros declared in the foreach statements (namely unique_code, veh and per). That's not illegal, but it's unusual.

          The first loop looks intended to do something different for each distinct value of unique_code, but as Chris Larkin points out we can't see any such variable, and your code inside the loop, so far, does nothing dependent on those distinct values. So it's not clear why you have a loop there.

          The second and third loops are loops over one item and as such have no obvious point.
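
          To see that a loop of this form runs over variable names rather than over the values of the variable, here is a small sketch. The toy values of veh_no and the counter n_iter are hypothetical and serve only as an illustration.

```stata
* Toy data: three observations with hypothetical vehicle numbers
clear
set obs 3
generate byte veh_no = _n

* -foreach ... of varlist- iterates over the variable NAMES listed,
* so a list of one variable gives exactly one iteration
local n_iter = 0
foreach veh of varlist veh_no {
    display "loop body sees: `veh'"
    local ++n_iter
}
display "iterations: `n_iter'"
```

          The body runs once, with the local macro veh holding the name veh_no, whatever the number of observations.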

          Previous programming experience can mislead because many things in Stata are done automatically with an implied loop over observations, so for example


          Code:
          gen foo = 42 
          is an implied loop setting a variable equal to 42 in every observation.
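
          The same implied loop works within groups. The sketch below uses the data example from #3 and rests on an assumption that the thread does not confirm: that st_case together with incidentdate identifies one incident. The names veh_tag and n_vehicles are hypothetical. It is one possible way to get incident-level variables without explicit loops, not necessarily what is wanted here.

```stata
* Data example from #3
clear
input byte(veh_no per_no) long st_case float incidentdate byte dr_drink
1 3 390001 17898 1
1 2 390001 17898 1
1 1 390001 17898 1
1 4 390001 17898 1
1 5 390001 17898 1
2 3 390001 18267 0
2 1 390001 18267 0
2 2 390001 18267 0
1 2 390001 18267 0
1 1 390001 18267 0
end
format %d incidentdate

* ASSUMPTION: st_case plus incidentdate identifies an incident
* Flag every observation of an incident if any of its records has dr_drink == 1
bysort st_case incidentdate: egen byte drunk_involved = max(dr_drink == 1)

* Tag one observation per distinct vehicle within an incident, then count the tags
egen byte veh_tag = tag(st_case incidentdate veh_no)
bysort st_case incidentdate: egen n_vehicles = total(veh_tag)

list st_case incidentdate veh_no per_no drunk_involved n_vehicles, sepby(incidentdate)
```

          Once every observation carries its incident-level values, keeping one observation per incident would remove the repeated records that a panel declaration objects to.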

          The first few programs in any language are very hard, and lots of mistakes are common.
          Last edited by Nick Cox; 18 Sep 2020, 19:00.
