Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • half-yearly data importing to Stata

    I have half-yearly data starting from June 2017 to June 2019. In excel wide format data it is written as x1 2017/06 x1 2017/12.....x12019/06 for example for x1 variable. When I import it to Stata and reshape to long, the time variable is not what it is actually. For example, 2017/06 is read as 1960h2 (second half of 1960). How do I write it in Excel so that it gets imported correctly or, anyway to fix it after import? Many thanks.

  • #2
    Welcome to Statalist.

    I am having trouble imagining what your data is like after you have imported it into Stata.

    Even the best descriptions of data are no substitute for an actual example of the data. There are many ways your data might be organized that are consistent with your description, and each would require a somewhat different approach. In order to get a helpful response, you need to show an example of your data after you have imported it into Stata, before you try reshaping it or doing anything else to it. That will help us understand if you need to do something in Excel, or if your problem can be handled entirely in Stata.

    Be sure to use the dataex command to do this. If you are running version 15.1 or a fully updated version 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use dataex.

    Comment


    • #3
      [CODE]
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double(OL201706 OL201712 OL201806)
      245332 279413.30000000005 309043.30000000005
      14688.1 14419 14861
      139469.1 144122.7 147920.9
      381036.5 412656 450032.9
      Is this okay to understand the problem?

      Comment


      • #4
        Assuming the 2017/06, 2017/12, 2018/06 are in different cells in Excel, why not use Find/Replace to change 2017/06 to either 2017_06 or 2017h1, 2017/12 to 2017_12 or 2017h2. Then when you import it into Stata, Stata won't read them in as dates. You can then reshape and then convert them into the appropriate dates.

        Comment


        • #5



          Here is #3 cleaned up. I can't see that you have included any date variables at all.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input double(OL201706 OL201712 OL201806)
          245332 279413.30000000005 309043.30000000005
          14688.1 14419 14861
          139469.1 144122.7 147920.9
          381036.5 412656 450032.9

          Comment


          • #6
            Nick Cox, apparently, the variable name defines the period here. Here is one way:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input double(OL201706 OL201712 OL201806)
              245332 279413.30000000005 309043.30000000005
             14688.1              14419              14861
            139469.1           144122.7           147920.9
            381036.5             412656           450032.9
            end
            
            gen id=_n
            reshape long OL, i(id) j(time)
            gen yh= cond(substr(string(time), -2, 2)=="06",substr(string(time), 1, 4)+"-"+"1", substr(string(time), 1, 4)+"-"+"2")
            gen period= halfyearly(yh, "yh")
            format period %th
            Res.:

            Code:
            . drop id yh
            
            . l, sep(12)
            
                 +----------------------------+
                 |   time         OL   period |
                 |----------------------------|
              1. | 201706     245332   2017h1 |
              2. | 201712   279413.3   2017h2 |
              3. | 201806   309043.3   2018h1 |
              4. | 201706    14688.1   2017h1 |
              5. | 201712      14419   2017h2 |
              6. | 201806      14861   2018h1 |
              7. | 201706   139469.1   2017h1 |
              8. | 201712   144122.7   2017h2 |
              9. | 201806   147920.9   2018h1 |
             10. | 201706   381036.5   2017h1 |
             11. | 201712     412656   2017h2 |
             12. | 201806   450032.9   2018h1 |
                 +----------------------------+

            Comment


            • #7
              Andrew Musau Well spotted.

              Comment


              • #8
                Andrew Musau Thanks a lot for sharing the code. The date variable works fine now. Also thanks to David Benson.

                Comment

                Working...
                X