  • What kind of interpolating should I use?

    I have a daily dataset of a global aggregate index and a monthly, country-specific version of that index. Because the daily values are sporadic and volatile, I can't use linear interpolation, and my supervisor agrees. So I am trying to find a way to interpolate the country-specific monthly figures to daily figures based on the changes in the global aggregate daily figure. To justify this approach, I am assuming that the two series are very highly correlated. How should I approach this, and with which Stata commands? I am not familiar with the different interpolation methods and their pros and cons.

    Example of the dataset for the first month, where GPRD is the daily variable and GPRC_DEU the monthly one:
    date GPRD GPRC_DEU
    01/01/2022 62.95 1.26
    02/01/2022 37.86
    03/01/2022 55.74
    04/01/2022 93.54
    05/01/2022 95.46
    06/01/2022 73.92
    07/01/2022 119.16
    08/01/2022 38.28
    09/01/2022 63.70
    10/01/2022 135.54
    11/01/2022 192.97
    12/01/2022 81.38
    13/01/2022 177.09
    14/01/2022 134.17
    15/01/2022 71.06
    16/01/2022 45.75
    17/01/2022 115.06
    18/01/2022 144.46
    19/01/2022 155.56
    20/01/2022 138.22
    21/01/2022 194.31
    22/01/2022 145.81
    23/01/2022 106.86
    24/01/2022 184.03
    25/01/2022 296.21
    26/01/2022 271.75
    27/01/2022 267.22
    28/01/2022 189.41
    29/01/2022 217.44
    30/01/2022 63.53
    31/01/2022 149.44
    01/02/2022 192.65 2.62
    Many thanks!

    Steven
    Last edited by Steven Zhao; 21 Feb 2023, 15:27. Reason: Additional text

  • #2
    This sounds like an insoluble problem to me. If you assume a relationship between the daily variable and the monthly variable, then about 29/30 of your values will be interpolated, or rather estimated. You will no longer have two variables but two versions of one variable that are strongly correlated. At worst there will be weird artefacts, and at best you'll just discover the correlation you assumed in the first place.
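
    To make the circularity concrete, here is a minimal sketch that uses the first five days of the example data. The rescaling rule (monthly value times the daily index divided by that month's mean of the daily index) is only an assumption chosen for illustration, not a method proposed in this thread, and the names ddate, mdate, monthly, meanD and GPRC_scaled are hypothetical.

    ```stata
    clear
    input str10 date GPRD GPRC_DEU
    "01/01/2022"  62.95 1.26
    "02/01/2022"  37.86 .
    "03/01/2022"  55.74 .
    "04/01/2022"  93.54 .
    "05/01/2022"  95.46 .
    end

    * Convert the DD/MM/YYYY string to a Stata daily date and a monthly date
    gen ddate = daily(date, "DMY")
    format ddate %td
    gen mdate = mofd(ddate)
    format mdate %tm

    * Monthly value, taken from the first day of each month
    bysort mdate (ddate): gen monthly = GPRC_DEU[1]

    * ASSUMED rule: scale the monthly value by the daily index relative
    * to its own mean within the month
    by mdate: egen meanD = mean(GPRD)
    gen GPRC_scaled = monthly * GPRD / meanD

    * Within a month the new series is a constant multiple of GPRD
    correlate GPRD GPRC_scaled
    ```

    Within a single month the constructed series is an exact rescaling of GPRD, so the reported correlation is 1 by construction. It carries no country-specific daily information.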

    Much depends on whether the monthly variable is a snapshot at the beginning of the month or some kind of overall average or representative value that refers to the whole month. If the latter, then it seems to me that the only defensible action is to spread the values for the first day of each month to all days in the same month.
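
    Spreading the first-of-month value to every day in the same month could be sketched as follows. This assumes the date is held as a string in DD/MM/YYYY form, as in the example, and that the monthly value always sits on the first day of the month. The names ddate, mdate and GPRC_DEU_m are hypothetical.

    ```stata
    clear
    input str10 date GPRD GPRC_DEU
    "01/01/2022"  62.95 1.26
    "02/01/2022"  37.86 .
    "31/01/2022" 149.44 .
    "01/02/2022" 192.65 2.62
    end

    * Daily and monthly dates from the string
    gen ddate = daily(date, "DMY")
    format ddate %td
    gen mdate = mofd(ddate)
    format mdate %tm

    * Within each month, sorted by day, copy the first day's value to all days
    bysort mdate (ddate): gen GPRC_DEU_m = GPRC_DEU[1]

    list date GPRD GPRC_DEU GPRC_DEU_m, sepby(mdate)
    ```

    If the first day of a month were absent from the data, GPRC_DEU[1] would pick up a missing value, so that case would need checking first.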

    This is a delicate point on which researchers may disagree, so different views might be expressed.
