
  • Missing values for year 1996: trying to use the average of the values for 1995 and 1997

    Hi,

    My data are missing values for every day of 1996. I would therefore like to replace each missing 1996 value with the average of the values for the same day in 1995 and 1997. However, my code does not work.


    Code:
    destring year, replace
    generate avg_mean = .
    
    unab varlist : debe001-debe126 deni136
    
    foreach var of varlist `varlist' {
        egen daymonthavg_`var' = mean(`var'), by(day month year)
    }
    
    tempfile data95_97
    save `data95_97' if year == 1995 | year == 1997, replace daymonthavg_*
    
    use `data95_97', clear
    collapse (mean) mean_value = daymonthavg_* 
    
    display "Mean of variables for 1995 and 1997: " `mean_value'
    
    tempfile data96
    save `data96', replace: daymonthavg_* if year == 1996
    
    use `data96', clear
    replace avg_mean = `mean_value' if year == 1996
    I have tried many different versions of this code, and none of them work. In this version, the problem is that the if qualifier in the line "save `data95_97' if year == 1995 | year == 1997, replace daymonthavg_*" is not allowed.

    Does anyone have a good idea of how I could improve my code?

    Kind regards.



  • #2
    There is no data example here and I can't easily follow what you are trying to do with this elaborate code. But several points are quite illegal or at least unlikely to give you useful results.

    collapse won't work with a variable wildcard in the form newvar = wildcard: each target = source pair takes a single source variable.

    You never define a local macro mean_value, so it is empty: there is nothing inside to display and nothing for replace to use.

    daymonthavg_* can't be an option or options to save.

    There are other wild guesses and confusions here, but no point in trying to add further comments in that vein.
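
    Two of the points above can be illustrated with a small sketch. It runs on a made-up three-row dataset with a single variable daymonthavg_debe001 (the values are invented for illustration, not taken from the original poster's data). It assumes the aim was to save only the 1995 and 1997 rows and then hold a mean in a local macro: save accepts neither an if qualifier nor a variable list, so the subsetting is done with keep first, and the local macro is defined explicitly from r(mean).

```stata
* toy data, invented for illustration only
clear
input year month day daymonthavg_debe001
1995 1 1 10
1996 1 1 .
1997 1 1 20
end

* save takes no if qualifier and no varlist: subset first, then save
preserve
keep if inlist(year, 1995, 1997)
keep year month day daymonthavg_*
tempfile data95_97
save `data95_97'
restore

* a local macro holds a value only once it has been defined
summarize daymonthavg_debe001 if inlist(year, 1995, 1997), meanonly
local mean_value = r(mean)
display "Mean for 1995 and 1997: " `mean_value'
```

    Because of preserve and restore, the full dataset is back in memory after the subset has been written to the temporary file.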


    The goal of using averages for 1995 and 1997 as imputed values for 1996 I think I follow and this may work as a more direct solution:

    Code:
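    foreach v of varlist debe001-debe126 deni136 {
        // mean of the current variable over all 1995 and 1997 observations
        summarize `v' if inlist(year, 1995, 1997), meanonly
        // every 1996 observation gets that single mean
        replace `v' = r(mean) if year == 1996
    }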
    On the face of it this strategy has no real advantages and several downsides. But what you want to do doesn't need any heaving around of different datasets, if I understand the goal correctly. There is one dataset and no need to create others.

    Still, your goal is not really clear as you seem to move back and forth between averages for each of several variables and averages across all of those variables.
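
    If the goal is instead a separate value for each calendar day, one possible variation is sketched here. It is only a sketch: it assumes the dataset has numeric year, month and day variables and that the 1996 observations exist as rows with missing values. The toy data and the helper variable tmp_mean are invented for illustration. With the real data the loop would run over debe001-debe126 deni136 rather than debe001 alone.

```stata
* toy data, invented for illustration only
clear
input year month day debe001
1995 1 1 10
1997 1 1 20
1996 1 1 .
1995 2 28 4
1997 2 28 6
1996 2 28 .
1996 2 29 .
end

foreach v of varlist debe001 {
    // mean of the 1995 and 1997 values that share the same month and day
    egen double tmp_mean = mean(cond(inlist(year, 1995, 1997), `v', .)), by(month day)
    // fill only the 1996 values that are missing
    replace `v' = tmp_mean if year == 1996 & missing(`v')
    drop tmp_mean
}

list, sepby(month day)
```

    Note that 1996 was a leap year, so 29 February has no counterpart in 1995 or 1997 and stays missing under this approach.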
    Last edited by Nick Cox; 02 Apr 2023, 09:08.
