
  • codebook differences between %td and %tc formats

    Dear Stata community:

    I am wondering whether there is a bug in the Stata 17 codebook command that affects how it reports variables with the %tc date-time format, or whether I am misunderstanding date-time formats.

    Background: When I apply the %td format to a numeric variable, codebook tells me that the variable is a daily date, as shown with date2 in the example below. This makes sense to me: I have a numeric variable, and because I applied the %td format, Stata knows it is a daily date, and codebook tells me so.

    . codebook date2 // Starting with a numeric variable.

    ------------------------------------------------------------------------------------------------------------
    date2 (unlabeled)
    ------------------------------------------------------------------------------------------------------------

                      Type: Numeric (float)

                     Range: [22471,22986]                 Units: 1
             Unique values: 49                        Missing .: 0/76

                      Mean: 22900.8
                 Std. dev.: 103.416

               Percentiles:       10%       25%       50%       75%       90%
                                22883     22894   22922.5   22953.5     22978

    . format date2 %td // Applying daily date formatting so the variable is human-readable.

    . codebook date2 // Output is as I expect--Yay!

    ------------------------------------------------------------------------------------------------------------
    date2 (unlabeled)
    ------------------------------------------------------------------------------------------------------------

                      Type: Numeric daily date (float)

                     Range: [22471,22986]                 Units: 1
           Or equivalently: [10jul2021,07dec2022]         Units: days
             Unique values: 49                        Missing .: 0/76

                      Mean: 22900.8 = 12sep2022(+ 18 hours)
                 Std. dev.: 103.416

               Percentiles:       10%       25%       50%       75%       90%
                                22883     22894   22922.5   22953.5     22978
                            26aug2022 06sep2022 04oct2022 04nov2022 29nov2022



    In contrast, when I apply %tc formatting to a numeric variable, the codebook does NOT indicate it is a date-time variable, even though the variable displays as a date-time the way I want it to when using the -browse- or -list- commands, as shown with date1 below. Why doesn't the -codebook- command produce output with a human-readable date1?

    . codebook date1 // Starting with a numeric variable.

    ------------------------------------------------------------------------------------------------------------
    date1 (unlabeled)
    ------------------------------------------------------------------------------------------------------------

                      Type: Numeric (float)

                     Range: [1.977e+12,1.987e+12]         Units: 100000
             Unique values: 65                        Missing .: 10/76

                      Mean: 2.0e+12
                 Std. dev.: 3.0e+09

               Percentiles:       10%       25%       50%       75%       90%
                              2.0e+12   2.0e+12   2.0e+12   2.0e+12   2.0e+12

    . format date1 %tc // Applying date-time formatting so the variable is human-readable.

    . codebook date1 // Output is not as I expect, there is no indication that date1 is formatted as a date-time variable.

    ------------------------------------------------------------------------------------------------------------
    date1 (unlabeled)
    ------------------------------------------------------------------------------------------------------------

                      Type: Numeric (float)

                     Range: [1.977e+12,1.987e+12]         Units: 100000
             Unique values: 65                        Missing .: 10/76

                      Mean: 2.0e+12
                 Std. dev.: 3.0e+09

               Percentiles:       10%       25%       50%       75%       90%
                              2.0e+12   2.0e+12   2.0e+12   2.0e+12   2.0e+12


    . list date1 // However this output IS as I expect! Yay! (But why doesn't the codebook reflect the %tc formatting?)

         +--------------------+
         |              date1 |
         |--------------------|
      1. | 29aug2022 10:28:25 |
      2. | 30aug2022 09:57:27 |
         .
         .
         .


    I would appreciate any insight into the differences in how the codebook displays %tc vs. %td variables!
    Melissa

  • #2
    I agree that codebook does not indicate that your date1 variable is a clock date variable in the same way that it indicates date2 is a daily date variable.

    I initially thought the cause might be that your date1 variable is stored as a float, which is not appropriate for a clock date variable: help datetime tells us that clock values must be stored in double variables, because a float cannot hold millisecond counts of this size exactly. But the following example shows that codebook does not identify a %tc variable as a time variable even when it is stored as a double.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float var2
    22471
    22986
    end
    
    gen date2 = var2
    format date2 %td
    
    generate var1 = dhms(date2,8,20,30) // incorrect: stored as float by default, so precision is lost
    generate date1 = var1
    format date1 %tc
    
    generate double date1d = dhms(date2,8,20,30)
    format %tc date1d
    
    list
    describe
    codebook
    Code:
    . list
    
         +------------------------------------------------------------------------+
         |  var2       date2       var1                date1               date1d |
         |------------------------------------------------------------------------|
      1. | 22471   10jul2021   1.94e+12   10jul2021 08:20:47   10jul2021 08:20:30 |
      2. | 22986   07dec2022   1.99e+12   07dec2022 08:19:36   07dec2022 08:20:30 |
         +------------------------------------------------------------------------+
    
    . describe
    
    Contains data
     Observations:             2                  
        Variables:             5                  
    ------------------------------------------------------------------------------------------------
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    ------------------------------------------------------------------------------------------------
    var2            float   %9.0g                 
    date2           float   %td                   
    var1            float   %9.0g                 
    date1           float   %tc                   
    date1d          double  %tc                   
    ------------------------------------------------------------------------------------------------
    Sorted by: 
         Note: Dataset has changed since last saved.
    
    . codebook
    
    ------------------------------------------------------------------------------------------------
    var2                                                                                 (unlabeled)
    ------------------------------------------------------------------------------------------------
    
                      Type: Numeric (float)
    
                     Range: [22471,22986]                 Units: 1
             Unique values: 2                         Missing .: 0/2
    
                Tabulation: Freq.  Value
                                1  22471
                                1  22986
    
    ------------------------------------------------------------------------------------------------
    date2                                                                                (unlabeled)
    ------------------------------------------------------------------------------------------------
    
                      Type: Numeric daily date (float)
    
                     Range: [22471,22986]                 Units: 1
           Or equivalently: [10jul2021,07dec2022]         Units: days
             Unique values: 2                         Missing .: 0/2
    
                Tabulation: Freq.  Value
                                1  22471  10jul2021
                                1  22986  07dec2022
    
    ------------------------------------------------------------------------------------------------
    var1                                                                                 (unlabeled)
    ------------------------------------------------------------------------------------------------
    
                      Type: Numeric (float)
    
                     Range: [1.942e+12,1.986e+12]         Units: 100000
             Unique values: 2                         Missing .: 0/2
    
                Tabulation: Freq.  Value
                                1  1.942e+12
                                1  1.986e+12
    
    ------------------------------------------------------------------------------------------------
    date1                                                                                (unlabeled)
    ------------------------------------------------------------------------------------------------
    
                      Type: Numeric (float)
    
                     Range: [1.942e+12,1.986e+12]         Units: 100000
             Unique values: 2                         Missing .: 0/2
    
                Tabulation: Freq.  Value
                                1  1.942e+12
                                1  1.986e+12
    
    ------------------------------------------------------------------------------------------------
    date1d                                                                               (unlabeled)
    ------------------------------------------------------------------------------------------------
    
                      Type: Numeric (double)
    
                     Range: [1.942e+12,1.986e+12]         Units: 100000
             Unique values: 2                         Missing .: 0/2
    
                Tabulation: Freq.  Value
                                1  1.942e+12
                                1  1.986e+12
    
    .
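
    A side note on the storage-type point, since the list output above shows date1 off by 17 seconds in the first observation and by nearly a minute in the second, while date1d is exact. A float carries a 24-bit significand, so adjacent floats near 1.9e+12 are 2^17 = 131,072 milliseconds apart, and a clock value stored as a float can land up to about 65 seconds from the intended time. The lines below are only a minimal sketch of that rounding, using the first date from the example; they assume the built-in float() function, which rounds its argument to float precision, as a stand-in for storing the value in a float variable.

    Code:
    * Sketch only: float() mimics storing a clock value in a float variable.
    display %tc dhms(mdy(7,10,2021), 8, 20, 30)         // full double precision
    display %tc float(dhms(mdy(7,10,2021), 8, 20, 30))  // rounded to float precision
    * Spacing between adjacent floats near 1.9e+12, in milliseconds
    display 2^17

    Applying recast double to a float clock variable will not bring the lost milliseconds back; the variable has to be regenerated from its source as a double, as was done for date1d.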
