  • For Loop Syntax Error - Subsetting Data

    Hello,

    I have two datasets. I would like to split the main dataset (Maindata) into around 40 datasets based on criteria held in a separate dataset. Both datasets contain longitudes and latitudes; the criteria dataset has around 40 observations, each specifying the latitude and longitude range for one group. This is the main dataset.

    Code:
    Name     Census2001_Lat    Census2001_Lon
    "ABC"    12.747113         79.847343
    "DEF"    12.874169         79.653198
    "GHI"    12.87979          79.675159
    "JKL"    12.867902         79.66732
    "MNO"    12.721048         79.753166
    This is the criteria dataset.

    Code:
    clear
    Groups       maxlat                minlon               minlat                maxlon
    "GroupA"     12.95                 79.18333333333334    12.633333333333333    79.75
    "GroupB "    12.816666666666666    79.11666666666666    12.483333333333333    79.36666666666666
    "GroupC "    13.783333333333333    78.96666666666667    13.4                  79.58333333333333
    "GroupD "    13.516666666666668    78.8                 13                    79.31666666666666
    A variable id=_n has been defined in the criteria dataset. I am trying to run a for loop that iterates over each of the 40 observations (i.e. criteria) in the criteria dataset and creates a separate dataset from Maindata based on the longitudes and latitudes given there. This is the code I used:

    Code:
    use Criteria
    forval i=1/`id'{
    preserve
    use if (inrange(Census2001_Lon,`minlon',`maxlon') & inrange(Census2001_Lat,`minlat',`maxlat')) using MainData
    save data_`id', replace
    restore
    }
    However, I get a syntax error with the for loop. It says invalid syntax. Could someone please explain what's wrong? Thanks!
    Last edited by Minnie Smith; 15 Jul 2018, 11:38.

  • #2
    I'm sure others will devise more efficient ways of doing what you want, but based on your process, the following code works. The main problem with your code is that the local macros it refers to (`id', `minlon', `maxlon', `minlat', `maxlat') are never defined. Also, please use dataex (see FAQ #12.2) when providing data examples.

    Code:
    clear
    input str5 Name float(Census2001_Lat Census2001_Lon)
    "ABC" 12.747113 79.84734
    "DEF"  12.87417  79.6532
    "GHI"  12.87979 79.67516
    "JKL" 12.867902 79.66732
    "MNO" 12.721048 79.75317
    end
    save  temp_main.dta, replace
    
    clear
    input str10 Groups float(maxlat minlon minlat maxlon)
    "GroupA"     12.95 79.18333 12.633333    79.75
    "GroupB" 12.816667 79.11667 12.483334 79.36667
    "GroupC" 13.783334 78.96667      13.4 79.58334
    "GroupD" 13.516666     78.8        13 79.31667
    end
    save  temp_crit.dta, replace
    
    clear
    use temp_crit
    local id=_N
    forval i=1/`id'{
        local minlon=minlon[`i']
        local maxlon=maxlon[`i']
        local minlat=minlat[`i']
        local maxlat=maxlat[`i']
        preserve
        use if (inrange(Census2001_Lon,`minlon',`maxlon') & inrange(Census2001_Lat,`minlat',`maxlat')) using temp_main
        di "this is the data based on criteria #`i'"
        list
        save data_`i', replace
        restore
    }
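
    To illustrate why the original loop was rejected, here is a minimal sketch using only built-in commands. The three-observation dataset is a made-up stand-in for the criteria data, not your actual file. The point it tries to show: a variable named id is not a local macro, so `id' expands to nothing until a local of that name is defined, and forval i=1/`id' is then read as forval i=1/ with no upper bound.

    ```stata
    clear
    set obs 3                      // hypothetical 3-row stand-in for the criteria data
    generate id = _n               // a variable called id, as in the original post

    * No local macro named id exists yet, so `id' expands to an empty string
    display "upper bound is [`id']"

    * Define the local from the observation count; now `id' expands to 3
    local id = _N
    display "upper bound is [`id']"

    * With the macro defined, the loop header is complete and the loop runs
    forvalues i = 1/`id' {
        display "criterion `i': id = " id[`i']
    }
    ```

    The same reasoning applies to `minlon', `maxlon', `minlat' and `maxlat': each has to be filled from the current observation inside the loop before the use command can refer to it.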
    Stata/MP 14.1 (64-bit x86-64)
    Revision 19 May 2016
    Win 8.1


  • #3
    This worked! Thank you. Apologies about dataex - I tried to use it, but my output looked a bit strange and I wasn't sure why. I will look into it again.
