  • Assistance with date stored as double

    Hi everyone,

    I need help extracting only the year from a date variable stored as double. For instance, one date is displayed as 26.12.16 but its underlying value is 26dec2016 00:00:00. Another is 01.01.17 (01jan2017 00:00:00).

    After extracting the years into a new variable (Year_1), I want to generate another variable, Year_2, and set it to 2016 if Year_1==2016, then do the same for 2017, i.e. replace Year_2 = 2017 if Year_1==2017.

    I use Stata 14. Kindly walk me through the required steps. Thank you for your assistance.

  • #2
    You really should show example data using the -dataex- command. From what you describe, I will guess that your variable is a Stata clock variable. If my guess is correct, then you want:
    Code:
    gen year_1 = year(dofc(original_double_date_variable))
    As for your year_2 variable, it seems like you just want it to be the same as the year_1 variable. Simple enough: clonevar year_2 = year_1. But why do you want two variables with the exact same information? What is the point of that?
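
    To make the clock-variable guess concrete, here is a minimal sketch. It assumes the variable really is a %tc clock value (milliseconds since 01jan1960 00:00:00); the variable name example_dt is hypothetical and the single observation is built from the 26dec2016 example in #1.

    ```stata
    clear
    set obs 1
    * clock() returns milliseconds since 01jan1960 00:00:00; it must be stored as double
    generate double example_dt = clock("26dec2016 00:00:00", "DMYhms")
    format example_dt %tcDD.NN.YY
    * dofc() converts the clock value to a daily date, which is what year() expects
    generate int year_1 = year(dofc(example_dt))
    list
    ```

    The intermediate dofc() step matters because year() works on daily dates, not on clock values.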

    Finally, if my guess about your date variable being a clock variable is wrong, then we have both wasted our time. So in the future, always use the -dataex- command and show actual example data when you have a programming problem. No exceptions. If your version 14 Stata is fully updated, -dataex- is already part of your official Stata installation. If not, either update your Stata by running -update all-, or run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.



    • #3
      Thank you, Clyde. I do not want two variables with the exact same information. Here is a sample of the data.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double DATEINTR
      1.7982432e+12
      1.7982432e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7981568e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7983296e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      1.7982432e+12
      end
      format %tcDD.NN.YY DATEINTR
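
      A quick sanity check on the listing, as a sketch: the example contains only three distinct values, and decoding them as clock values suggests they all fall in late December 2016, so this particular extract would yield a single year.

      ```stata
      * The three distinct values in the listing, shown in the default %tc display format
      display %tc 1.7981568e+12
      display %tc 1.7982432e+12
      display %tc 1.7983296e+12
      * Year implied by the latest of the three values
      display year(dofc(1.7983296e+12))
      ```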



      All I want is to extract the year from the date into a new variable called Year_1, and then use Year_1 to create a categorical variable called Year_2. In Year_2, the first group (1) will be 2016 (in Year_1), the next group (2) will be 2017, and the last group (3) will be 2018. When I tabulate Year_2, I should see the frequency and percentage distributions for 2016, 2017 and 2018. Thereafter, I will use i.Year_2 in a regression model.



      • #4
        Originally posted by David Okunlola
        All I want is to extract the year from the date into a new variable called Year_1, and then use Year_1 to create a categorical variable called Year_2. In Year_2, the first group (1) will be 2016 (in Year_1), the next group (2) will be 2017, and the last group (3) will be 2018.
        Code:
        generate int Year_1 = year(dofc(DATEINTR))
        egen byte Year_2 = group(Year_1) //, label
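
        For a version that can be run end to end, here is a small sketch with the label option switched on. The three input dates are illustrative only (the 2018 date in particular is made up, and the string variable raw is hypothetical); DATEINTR is rebuilt from them with clock() so that all three years appear.

        ```stata
        clear
        input str18 raw
        "26dec2016 00:00:00"
        "01jan2017 00:00:00"
        "15mar2018 00:00:00"
        end
        * Rebuild a clock variable like the one in #3 from the illustrative strings
        generate double DATEINTR = clock(raw, "DMYhms")
        format DATEINTR %tcDD.NN.YY

        generate int Year_1 = year(dofc(DATEINTR))
        * group() numbers the distinct years 1, 2, 3 in ascending order;
        * the label option attaches the year as the value label of each group
        egen byte Year_2 = group(Year_1), label
        tabulate Year_2
        ```

        With the label option, the tabulation should show the years rather than 1, 2, 3, while the stored values remain 1, 2, 3.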



        • #5
          Joseph Coveney provides the code to do as you explained in #3. That said, I feel the need to point out that using i.Year_2 in a regression will not produce anything different from what you would get just using i.Year_1.

          In fact, i.Year_1 will be more convenient because the results will be labeled with the actual year, whereas using i.Year_2 you will get results labeled 1, 2, 3,..., which then requires you to either calculate or look up what years those correspond to when examining the results. While you are actively engaged in working with your results that may seem like a minuscule problem. But if you have to go back to your results and look at them again many months from now, you will likely find it a truly annoying nuisance.
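
          A minimal simulated sketch of this point follows. The outcome y, the seed, and the sample size are all hypothetical and chosen only for illustration; the two regressions should report the same fit and the same coefficients, differing only in how the levels are labeled.

          ```stata
          clear
          set seed 12345
          set obs 300
          * Simulated years 2016-2018 and a made-up outcome that differs by year
          generate int Year_1 = 2016 + mod(_n, 3)
          egen byte Year_2 = group(Year_1)
          generate double y = rnormal() + 0.5*(Year_1 == 2017) + 1.0*(Year_1 == 2018)

          * Levels are reported as 2017 and 2018 (base 2016)
          regress y i.Year_1
          scalar r2_year = e(r2)
          scalar b_year  = _b[2017.Year_1]

          * Levels are reported as 2 and 3 (base 1)
          regress y i.Year_2
          scalar r2_group = e(r2)
          scalar b_group  = _b[2.Year_2]

          display r2_year - r2_group
          display b_year - b_group
          ```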



          • #6
            Originally posted by Joseph Coveney
            Code:
            generate int Year_1 = year(dofc(DATEINTR))
            egen byte Year_2 = group(Year_1) //, label
            Many thanks Joseph



            • #7
              Originally posted by Clyde Schechter
              Joseph Coveney provides the code to do as you explained in #3. That said, I feel the need to point out that using i.Year_2 in a regression will not produce anything different from what you would get just using i.Year_1.

              In fact, i.Year_1 will be more convenient because the results will be labeled with the actual year, whereas using i.Year_2 you will get results labeled 1, 2, 3,..., which then requires you to either calculate or look up what years those correspond to when examining the results. While you are actively engaged in working with your results that may seem like a minuscule problem. But if you have to go back to your results and look at them again many months from now, you will likely find it a truly annoying nuisance.
              Thank you for the advice, Clyde. It is well noted.



              • #8
                Originally posted by David Okunlola
                Thank you for the advice, Clyde. It is well noted.
                And if it bothers you, you can always uncomment the , label option in #4 above. You'll see the years then.

                Depending upon your application, you might actually want the 1, 2, 3, . . . though.
