  • Date formatting

    Hi,
    I have daily data (about six months, from 1st January 2016 until 30th June 2016) for about 1000 companies. The dates are held in numeric form, as shown under "Original date" below. Here is the command I am using to declare the data as time series: "tsset ISIN1 date, format(%tdNN/DD/CCYY)". As shown below, the formatted date differs from the original date I started with: the original date of 1st January 2016 (Original date) became 1st April 2016 (Formatted date). I am not sure which command I should choose in order to get the correct date.
    I would be grateful if someone could guide me in this regard.

    Original date              Formatted date
    ISIN          date         ISIN          Date
    GB0001771426  1/1/2016     AT0000785407  04/01/2016
    GB0001771426  1/4/2016     AT0000785407  04/04/2016
    GB0001771426  1/5/2016     AT0000785407  04/05/2016
    GB0001771426  1/6/2016     AT0000785407  04/06/2016
    GB0001771426  1/7/2016     AT0000785407  04/07/2016
    GB0001771426  1/8/2016     AT0000785407  04/08/2016
    GB0001771426  1/11/2016    AT0000785407  04/11/2016
    GB0001771426  1/12/2016    AT0000785407  04/12/2016
    GB0001771426  1/13/2016    AT0000785407  04/13/2016
    GB0001771426  1/14/2016    AT0000785407  04/14/2016
    GB0001771426  1/15/2016    AT0000785407  04/15/2016
    GB0001771426  1/18/2016    AT0000785407  04/18/2016
    GB0001771426  1/19/2016    AT0000785407  04/19/2016
    GB0001771426  1/20/2016    AT0000785407  04/20/2016
    GB0001771426  1/21/2016    AT0000785407  04/21/2016
    GB0001771426  1/22/2016    AT0000785407  04/22/2016
    GB0001771426  1/25/2016    AT0000785407  04/25/2016
    GB0001771426  1/26/2016    AT0000785407  04/26/2016
    GB0001771426  1/27/2016    AT0000785407  04/27/2016
    GB0001771426  1/28/2016    AT0000785407  04/28/2016
    GB0001771426  1/29/2016    AT0000785407  04/29/2016
    GB0001771426  2/1/2016     AT0000785407  05/02/2016
    GB0001771426  2/2/2016     AT0000785407  05/03/2016

    Thanks in Advance,
    Mahmoud

  • #2
    Please use dataex to show your data. A display like that in #1 leaves it ambiguous how the original dates were stored.

    Extracts below from https://www.statalist.org/forums/help#stata with emphasis added.

    12.2 What to say about your data

    We can understand your dataset only to the extent that you explain it clearly.

    The best way to explain it is to show an example. The community-contributed command dataex makes it easy to give simple example datasets in postings. It was written to support Statalist and its use is strongly recommended. Usually a copy of 20 or so observations from your dataset is enough to show your problem. See help dataex for details.

    As from Stata 15.1 (and 14.2 from 19 December 2017), dataex is included with the official Stata distribution. Users of Stata 15 (or 14) must update to benefit from this.

    Users of earlier versions of Stata must install dataex from SSC before they can use it. Type ssc install dataex in your Stata.

    The merits of dataex are that we see your data as you do in your Stata. We see whether variables are numeric or string, whether you have value labels defined and what is a consequence of a particular display format. This is especially important if you have date variables. We can copy and paste easily into our own Stata to work with your data.


    • #3
      Hi Nick,

      The original data are shown below (produced by dataex).

      input long ISIN1 int date double AskPrice
      46 20457 11.414373583684
      46 20458 11.4680789882926
      46 20459 11.5348173933393
      46 20460 11.3615743244231
      46 20461 11.2905567249764
      46 20464 11.3798793341572
      46 20465 11.3105808948843
      46 20466 11.3943557681226
      46 20467 11.4093597763109
      46 20468 11.2317549461201
      46 20471 11.2462574765024
      46 20472 11.1631033604167
      46 20473 11.2071404245658
      46 20474 11.2459751485683
      46 20475 11.3639571536563
      46 20478 11.295663199886
      46 20479 11.3544040409838
      46 20480 11.2358371633062
      46 20481 11.289321196807
      46 20482 11.3076524216524
      end
      format %tdnn/dd/CCYY date
      label values ISIN1 ISIN1
      label def ISIN1 46 "GB0001771426", modify

      I ran the command "tsset ISIN1 date, format(%tdNN/DD/CCYY)" to declare the data as time series. Afterwards, in the data, I see that both the display format and the dates themselves have changed, whereas I only want to change the display format.

      Kind regards,
      Mahmoud


      • #4
        Thanks for the data example. If I input the data but then follow with

        Code:
        . label values ISIN1 ISIN1
        
        . label def ISIN1 46 "GB0001771426", modify
         
        . list date in 1/5
        
             +-------+
             |  date |
             |-------|
          1. | 20457 |
          2. | 20458 |
          3. | 20459 |
          4. | 20460 |
          5. | 20461 |
             +-------+
        
        .
        . format %tdnn/dd/CCYY date
        
        .
        . tsset ISIN1 date, format(%tdNN/DD/CCYY)
               panel variable:  ISIN1 (strongly balanced)
                time variable:  date, 01/04/2016 to 01/29/2016, but with gaps
                        delta:  1 day
        
        .
        . list date in 1/5
        
             +------------+
             |       date |
             |------------|
          1. | 01/04/2016 |
          2. | 01/05/2016 |
          3. | 01/06/2016 |
          4. | 01/07/2016 |
          5. | 01/08/2016 |
             +------------+
        
        .
        . format date %6.0f
        
        .
        . list date in 1/5
        
             +-------+
             |  date |
             |-------|
          1. | 20457 |
          2. | 20458 |
          3. | 20459 |
          4. | 20460 |
          5. | 20461 |
             +-------+
        I see precisely no problem. The underlying numeric values, to be interpreted as dates, are unaffected by any change of display format, as is standard. That is shown by undoing the date display format.

        I don't have an explanation of why it appears otherwise in #1.


        • #5
          Thanks for the solution, Nick. I tried again with an example of only 3 companies (identified by ISIN, as shown below) and got the same result as yours. However, when I import my full data, which include about 1500 companies, the dates change after I run the command "tsset ISIN1 date, format(%tdNN/DD/CCYY)". I am not sure why this is happening.

          input str12 ISIN int date double AskPrice

          "GB0001771426" 20457 11.414373583684
          "GB0001771426" 20458 11.4680789882926
          "GB0001771426" 20459 11.5348173933393
          "GB0001859296" 20457 13.3099749109744
          "GB0001859296" 20458 13.4886452862299
          "GB0001961035" 20457 9.78374878601489
          "GB0001961035" 20458 9.80247703999296
          "GB0001961035" 20459 9.90637258486785
          end



          . format %tdNN/DD/CCYY date

          . tsset ISIN1 date, daily
          panel variable: ISIN1 (unbalanced)
          time variable: date, 01/04/2016 to 01/06/2016
          delta: 1 day


          • #6
            Sorry, but you're just repeating an implausible (no, incredible) claim and not backing it up. For example, 20457 is a Stata daily date of 4 January 2016 or however else you want to write it. That's true before and after you have changed the display format via tsset. The underlying numeric values are quite unchanged.

            Code:
            . di %td 20457
            04jan2016

            . di %tdNN/DD/CCYY 20457
            01/04/2016
            tsset never changes the data. If it ever did that would be a bug so horrendous it would have been noticed long, long since.
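
As a cross-check on the arithmetic, here is a minimal sketch, assuming only that Stata daily dates count days elapsed since 01jan1960. It rebuilds the value 20457 from its month, day and year parts and shows it under both display formats; nothing here touches a dataset.

```stata
* Stata daily dates are days elapsed since 01jan1960
display mdy(1, 4, 2016)
display %td mdy(1, 4, 2016)
display %tdNN/DD/CCYY mdy(1, 4, 2016)

* A display format changes how a value is shown, not the value itself
assert mdy(1, 4, 2016) == 20457
assert mdy(1, 1, 2016) == 20454
```

By the same count, 1 January 2016 would be stored as 20454, so a series whose first value is 20457 starts on 4 January 2016, whatever display format is applied.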


            • #7
              I agree with you. I do not think it is a bug in Stata. I just realised that there might be something wrong with my ISIN numbers. When I run tsset I get the output shown below, which warns about unbalanced ISIN. Do you think the unbalanced ISIN is the problem here? I do not understand what "unbalanced" means here!

              . tsset ISIN1 date, daily
              panel variable: ISIN1 (unbalanced)
              time variable: date, 01/03/2016 to 05/31/2017, but with gaps
              delta: 1 day




              • #8
                You don't tell us much about ISIN1. All we can say from what you've told us is that it is numeric (otherwise tsset would not work) and in the example of #3 it happens to be 46.

                I can't even tell why you think there might be something wrong with ISIN1, let alone explain whether there is.

                But my answer is the same as before: tsset doesn't change your data. Lack of balance doesn't change that fact.

                You should, at a minimum, read the help for tsset, which gives these definitions:

                a set of panels are strongly balanced if they all have the same time values, otherwise balanced if same number of time values, otherwise unbalanced
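
The definitions above can be checked directly on a small example. What follows is only a sketch: the panel identifiers 1, 2 and 3 are hypothetical numeric stand-ins for the three ISINs in #5, and n_obs and first_obs are illustrative variable names. It counts the observations in each panel and then lets tsset report the balance.

```stata
clear
input long ISIN1 int date
1 20457
1 20458
1 20459
2 20457
2 20458
3 20457
3 20458
3 20459
end

* Number of observations in each panel
bysort ISIN1 (date): generate long n_obs = _N

* Tag one observation per panel, then tabulate the panel lengths
egen byte first_obs = tag(ISIN1)
tabulate n_obs if first_obs

* Panels of different lengths should be reported as unbalanced
tsset ISIN1 date, format(%tdNN/DD/CCYY)
```

Panel 2 has two dates while the others have three, so the panels differ in their number of time values. That is a statement about coverage only; it says nothing about the date values being altered.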


                • #9
                  In this context, "unbalanced" means that different values of ISIN1 have data for different sets of dates. In your case, ISIN "GB0001859296" does not have a value for date 20459, while the other two ISINs do.

                  Let me suggest the following.
                  Code:
                  clonevar date_save = date
                  tsset ISIN1 date, format(%tdNN/DD/CCYY)
                  assert date==date_save
                  If the value of date is changed by the tsset, the assert command will halt the do-file and tell you so. Like Nick, I find it inconceivable that the date is being changed. Perhaps when tsset sorts your dataset, the order of the data is changing and you are seeing observations that were not in the same place as before it was sorted.
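
The sorting explanation can be tested in the same spirit. This is a hedged sketch on a deliberately shuffled toy dataset: the panel identifier 47 and the variable name orig_order are hypothetical, and the sketch assumes, as suggested above, that tsset sorts the data by panel and time. It records each observation's original position before tsset and compares afterwards.

```stata
clear
input long ISIN1 int date
47 20458
46 20459
46 20457
47 20457
46 20458
end

* Remember where each observation sat before tsset
generate long orig_order = _n
clonevar date_save = date

tsset ISIN1 date, format(%tdNN/DD/CCYY)

* The stored dates are untouched ...
assert date == date_save

* ... but observations may now sit in different rows
count if orig_order != _n
list ISIN1 date orig_order, sepby(ISIN1)
```

A nonzero count with the assert still passing would mean that rows moved while every stored date stayed as it was, which is consistent with reading a different observation in the same position of the Data Editor.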


                  • #10
                    Further, looking again at #1 what I see are (or appear to be) two different panels with different dates, aligned side by side. The alignment is spurious as the dates don't line up in any "row" (for once, the spreadsheet terminology is better).

                    I guess you're flipping back and forth between different ways of holding the data, perhaps looking at some spreadsheet or other original.


                    • #11
                      Thank you so much Nick.

                      It is true that tsset did not change any value. Unfortunately, I had misunderstood my data; the mistake was mine in not reading the data properly. I now understand what balanced and unbalanced mean.
                      I have no more questions. Have a nice day!

                      Kind regards,
                      Mahmoud


                      • #12
                        Thanks William!
                        You are right! tsset sorted my dataset, so the order of the data changed and I couldn't read it properly. Now my problem is resolved.


                        • #13
                          Thanks for the closure here.
