  • Missing lat/lon data

    Hi Statalist,

    I have a dataset in which latitude and longitude were collected twice for each household, so each is stored in 2 variables (lat1/lon1 and lat2/lon2). Due to cloud cover, many of the lat/lon values are missing, and not in a consistent pattern. That is, some observations have data for all 4 variables, some only for lat1/lon1, some only for lat2/lon2. What's the most efficient loop that will:
    1. If all 4 variables !=. --> take the average of the two latitude variables and of the two longitude variables and place each in a new variable
    2. If only lat1/lon1 data !=. --> populate the new variables generated above with this data
    3. If only lat2/lon2 data !=. --> same as above
    Appreciate any tips. Let me know if more information would be useful.

  • #2
    Erin:
    I recommend taking a look at http://www.missingdata.org.uk/ which is maintained by Jeremy Bartlett, whose posts appear on this list from time to time.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      One statement for latitude suffices, and longitude can be handled in exactly the same way:

      Code:
       
      gen  LAT = cond(missing(lat1, lat2), min(lat1, lat2), (lat1 + lat2)/2)
      If either value is missing, we can take the minimum, because min() ignores missing arguments and so returns whichever value is present. If both values are missing, we'll get missing anyway, but we can expect nothing else. (Actually, the maximum works fine too.)

      If no value is missing, we can average.
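
      Since the original question asked for a loop, here is a minimal sketch that applies the same cond() statement to both latitude and longitude. The four-row toy dataset and the new variable names lat_avg and lon_avg are assumptions made for illustration, not part of the original data. The last line cross-checks latitude against egen's rowmean() function, which also ignores missing values.

```stata
* Toy data (hypothetical): one row per missingness pattern
clear
input lat1 lon1 lat2 lon2
10 20 12 22
10 20  .  .
 .  . 12 22
 .  .  .  .
end

* Same logic as the one-line statement, looped over both coordinates
foreach v in lat lon {
    gen double `v'_avg = cond(missing(`v'1, `v'2), min(`v'1, `v'2), (`v'1 + `v'2)/2)
}

* Cross-check: rowmean() averages the nonmissing values in each row
egen double lat_chk = rowmean(lat1 lat2)
assert lat_avg == lat_chk

list
```

      If the variables in the real dataset follow a different naming pattern, only the list after "in" and the suffixes inside the loop need to change.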

      For more on cond() see http://www.stata-journal.com/sjpdf.h...iclenum=pr0016

      For more on missing() see http://www.stata-journal.com/sjpdf.h...iclenum=dm0049
