  • 'No observations' error message after 'collapse'

    Dear Statalists,

    My goal is to create daily temperature data for several regions.
    I have daily temperature data for several cities and villages, which I am trying to aggregate to the regional level. I have 100 regions. To accomplish that, I have calculated the distance between the center of each region and every city or village, which resulted in 100 variables, each containing the distances between one regional center and every village or city. Now I would like to compute regional temperature averages to which every village or city that lies within a certain distance contributes. I also require that the further away a place is from the regional centroid, the lower its contribution to the average regional temperature should be.

    That worked fine and the results are stored in the temp_cleaned_2009.dta file, which I then use to compute the inverse weights and the inverse-weighted temperatures. So far, everything worked fine. When I try to collapse the weighted temperatures and the inverse weights in order to obtain their sums, and hence eventually their daily averages, Stata sometimes responds that 'no observations' exist.

    In particular, if I apply collapse to only one variable, it works fine. Sometimes it also works for two variables. I have attached my code to this post.
    Maybe it also helps to know that the variables ContributionTemp and InverseWeight contain many missing values due to the distance truncation. Moreover, I have already checked that none of the variables are actually strings. All of the relevant variables are floats.

    Any helpful comments would be greatly appreciated.
    Martin



    use temp_cleaned_2009, clear

    forvalues i = 1/75 {
        generate DistanceDummy`i' = 0
        replace DistanceDummy`i' = 1 if Distance`i' < 100

        generate WithinDistance`i' = DistanceDummy`i' * Distance`i'
        generate InverseWeight`i' = 1/WithinDistance`i'

        generate ContributionTemp`i' = temperature * InverseWeight`i'

        drop WithinDistance`i'
        drop DistanceDummy`i'
    }

    collapse (sum) ContributionTemp1-ContributionTemp75 InverseWeight1-InverseWeight75, cw by(date)


  • #2
    I am sorry: I forgot to mention that I am using Stata 15.


    • #3
      Welcome to Statalist.

      The option "cw" on your collapse command instructs Stata to ignore any observation for which any of the 150 variables being collapsed is missing. Thus, for any region for which there are fewer than 75 cities or villages, Distance75 will be missing in every observation, and thus InverseWeight75 and ContributionTemp75 will be missing in every observation, so every observation will be excluded from collapse, and you will be told by Stata that no observations exist.
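
      To make the effect of casewise deletion concrete, here is a minimal sketch on made-up toy data. The variable names x1 and x2 and all values are hypothetical and only stand in for the 150 collapsed variables; the point is that every row has at least one missing value.

      ```stata
      clear
      input date x1 x2
      1 10  .
      1  . 20
      2 30  .
      end

      * Without cw, each sum uses every nonmissing value of its own variable.
      * (A sum over only missing values is reported as 0 by collapse (sum).)
      preserve
      collapse (sum) x1 x2, by(date)
      list
      restore

      * With cw, an observation is dropped if ANY variable being collapsed is
      * missing. Every row above has a missing value, so nothing is left and
      * collapse stops with an error instead of producing a dataset.
      capture noisily collapse (sum) x1 x2, cw by(date)
      display "return code: " _rc
      ```

      The more variables are listed together under cw, the more likely it is that each observation is missing on at least one of them, which would be consistent with the command working for a single variable and failing for larger sets.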


      • #4
        Dear William,

        thank you for your answer.
        I am not so sure whether what you say is correct, and I think this was caused by a mistake on my side: the variable names correspond to 75 different regions, so Temp75 is a vector that contains the temperature in each city/village multiplied by the inverse weight of the respective city/village relative to its regional center. Moreover, all 75 regions contain more than 75 villages or cities.

        I also mentioned in my post that the command worked perfectly well for some variable combinations and did not work for others: ContributionTemp1-ContributionTemp3 did not work either.
        However, I managed to get around this problem by looping over the variable pairs InverseWeight`i' and ContributionTemp`i', collapsing the pairs individually, writing them to individual files, and eventually combining these files.
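
        The workaround described here could look roughly like the following sketch. It is not the poster's actual code: the toy data, the two-region loop, and the tempfile names full and combined are assumptions made so that the example runs on its own.

        ```stata
        clear
        input date temperature Distance1 Distance2
        1 10 50 150
        1 20 25  40
        2 12 50 150
        2 14 25  40
        end

        * Inverse weights only for places within 100 distance units.
        forvalues i = 1/2 {
            generate InverseWeight`i' = 1/Distance`i' if Distance`i' < 100
            generate ContributionTemp`i' = temperature * InverseWeight`i'
        }

        tempfile full combined
        save `full'

        * Collapse one region's pair at a time, then merge the daily sums.
        forvalues i = 1/2 {
            use date ContributionTemp`i' InverseWeight`i' using `full', clear
            collapse (sum) ContributionTemp`i' InverseWeight`i', cw by(date)
            if `i' > 1 {
                merge 1:1 date using `combined', nogenerate
            }
            save `combined', replace
        }

        sort date
        list
        ```

        Here cw only compares the two variables of one region with each other, so a place that is too far from region 2 no longer removes that observation from the sums for region 1.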


        • #5
          Your success by treating each variable pair separately confirms that the source of your problem was the use of the "cw" option on your collapse command.

          It wasn't for the reason I guessed, having misunderstood your problem description, but I hope you do understand that by collapsing each pair separately, you have no longer done casewise deletion in response to missing values as your initial command required.
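
          As an alternative to separate files, a single collapse without cw may be enough. The sketch below is an assumption-laden illustration, not the original poster's method: it uses toy data with two regions, ties each weight to a nonmissing temperature so that both members of a pair are missing together, and introduces the hypothetical name RegionTemp for the final average. It also uses wildcards rather than hyphenated ranges, because the loop creates the variables in interleaved order, so a range such as ContributionTemp1-ContributionTemp3 would also pick up the InverseWeight variables stored between them.

          ```stata
          clear
          input date temperature Distance1 Distance2
          1 10 50 150
          1 20 25  40
          2 12 50 150
          2  . 25  40
          end

          forvalues i = 1/2 {
              * Weight exists only within 100 distance units and only where
              * temperature is observed, so each pair is missing together.
              generate InverseWeight`i' = 1/Distance`i' ///
                  if Distance`i' < 100 & !missing(temperature)
              generate ContributionTemp`i' = temperature * InverseWeight`i'
          }

          * No cw: each sum skips missing values variable by variable.
          collapse (sum) ContributionTemp* InverseWeight*, by(date)

          * Weighted daily average; 0/0 becomes missing when no place qualifies.
          forvalues i = 1/2 {
              generate RegionTemp`i' = ContributionTemp`i' / InverseWeight`i'
          }
          list date RegionTemp1 RegionTemp2
          ```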


          • #6
            Dear STATALISTS,

            I am currently working on a project in environmental economics. I have a dataset of all the observations (interviewees) that are less than 50km away from a valid temperature monitoring station. Basically, it looks like the following:

            Interviewee Number   StationID   Distance between Obs and Station
            1                    A           18km
            2                    B           24km
            3                    C           12km
            3                    B           14km
            3                    D           16km

            Since interviewee No. 3 has more than one station within 50km, I want to put all three stations in one row, like this:

            Interviewee Number   SID1   Distance1   SID2   Distance2   SID3   Distance3
            1                    A      18km
            2                    B      24km
            3                    C      12km        B      14km        D      16km

            (Distance1 to Distance3: distance between Obs and Station)

            I tried to copy and paste this dataset and use "merge", but it always said "Interviewee Number is not uniquely identified". Is it because I have observations like No. 3 that have several stations at the same time?

            Also, I have the temperature of each station, and I want to calculate the inverse distance weighted average temperature for each observation. I have already installed gwtmean, but it still doesn't work.

            I would be very grateful if someone could help me with the commands. Thanks in advance!
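
            One possible approach, sketched on the example rows above: compute the inverse distance weighted mean while the data are still in long form, then use reshape wide instead of merge to get one row per interviewee. The variable names id, sid, distance, temp, j and idw_temp are hypothetical, and the temperature values are invented purely for illustration.

            ```stata
            clear
            input id str1 sid distance temp
            1 "A" 18 21.0
            2 "B" 24 19.5
            3 "C" 12 22.0
            3 "B" 14 19.5
            3 "D" 16 20.0
            end

            * Inverse distance weighted mean temperature per interviewee.
            generate w = 1/distance
            bysort id: egen wsum = total(w)
            bysort id: egen wtemp = total(w * temp)
            generate idw_temp = wtemp / wsum
            drop w wsum wtemp

            * Number the stations within each interviewee, nearest first,
            * then spread them across columns: sid1 distance1 temp1 sid2 ...
            bysort id (distance): generate j = _n
            reshape wide sid distance temp, i(id) j(j)
            list
            ```

            The merge message most likely arises because interviewee 3 appears in several rows, so the interviewee number alone does not identify a single observation; reshape wide avoids the merge entirely by using the within-interviewee counter j.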