Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • count, tabulate, *browse or list variables based on date variable with an elapsed date format

    How can I count, tabulate, browse or list variables based on the value of a date variable registered in STATA numeric date format?

    Below is my example using _dataex-
    The data variable registered in STATA numeric date format is “Birthdate”

    I want to browse or count my data by the "Birthdate" variable. However, the STATA outputs are empty values. Even the output of "Birthdate" for -_dataex- is also empty as you see below.
    However, the variables are not empty. Below, I also copy and past the "Birthdate" values for your reference.

    Birthdate
    6/5/2001
    10/24/2001
    10/24/2001
    10/5/2001
    6/7/2001
    6/7/2001
    6/7/2001
    3/16/2001
    2/19/2001
    2/23/2001
    2/28/2001
    1/22/2001
    1/22/2001
    1/22/2001
    1/15/2001
    1/15/2001
    1/15/2001
    10/11/2001
    10/11/2001
    10/11/2001
    10/19/2001
    10/19/2001
    1/8/2001
    1/8/2001
    1/8/2001
    1/8/2001
    1/29/2001
    1/29/2001
    1/29/2001
    1/29/2001


    If I randomly select one value of the "Birthdate" variable and try to count all observations below as presented in the command here (count if Birthdate <2/28/2001), I get 0. However, when I browse and look into the data, there are observations below that values. It seems the function cannot work with this date value. When I do count if Birthdate > 2/28/2001 ( it gives me 30, the number of all observations). This is strange because all observations do not have Birthdate values above 2/28/2001. Moreover, when I do count if Birthdate == 2/28/2001, I get 0 again. This is also strange because as you see in the list of values for the "Birthdate" variable above, one observation has Birthdate== 2/28/2001.

    Would you please give me your suggestion,

    Thank you.

    Below is my example using _dataex-


    . dataex ID Birthdate Weight

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long ID int Birthdate double Weight
    1960010821 15131 .
    1960010821 15272 .
    1960010821 15272 .
    1960010821 15253 .
    1960010821 15133 .
    1960010821 15133 .
    1960010821 15133 .
    1960010821 15050 .
    1960010821 15025 .
    1960010821 15029 .
    1960010821 15034 .
    1960010821 14997 .
    1960010821 14997 .
    1960010821 14997 .
    1960010821 14990 .
    1960010821 14990 .
    1960010821 14990 .
    1960010821 15259 .
    1960010821 15259 .
    1960010821 15259 .
    1960010821 15267 .
    1960010821 15267 .
    1960010821 14983 .
    1960010821 14983 .
    1960010821 14983 .
    1960010821 14983 .
    1960010821 15004 .
    1960010821 15004 .
    1960010821 15004 .
    1960010821 15004 .
    end
    format %tdnn/dd/CCYY Birthdate
    ------------------ copy up to and including the previous line ------------------

    Listed 30 out of 30 observations

    .
    . des Birthdate

    Variable Storage Display Value
    name type format label Variable label
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Birthdate int %td..

    .
    . br Birthdate

    .
    . count if Birthdate < 2/28/2001
    0

    .
    . count if Birthdate == 2/28/2001
    0

    .
    . count if Birthdate > 2/28/2001
    30

    .
    end of do-file

    .
    Last edited by tig som; 14 Nov 2022, 12:19.

  • #2
    Stata dates are numbers (or referred to as SIF values). So 2/28/2001 is just how the value of such a variable may be displayed. You have convenience functions for what you want, see

    Code:
    help td()
    for example. Therefore, you want something like

    Code:
    count if Birthdate < td(28feb2001)

    Comment


    • #3
      To take Andrew Musau's point further, to Stata 2/28/2001 is just the result of two divisions and will not be interpreted as you wish.

      Code:
      . di 2/28/2001
      .0000357
      Now 1 January 1960 is 0 as a daily date, so all dates in your data that are 2 January 1960 or later will be greater than this number and none will be equal and none will be less, as you found.

      Comment


      • #4
        Thank you very much, Andrew Musau and Nick Cox!!

        Comment

        Working...
        X