Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshape

    I need help in reshaping my data. The desired variable are date, name, code, and return.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str10 name str19(AVIC PSP ZEGONA HEALTHCARE ULKER CRE)
    "Code"       "CNE000001JY5(RI)~U$" "CH0018294154(RI)~U$" "GB00BVGBY890(RI)~U$" "US42225P5017(RI)~U$" "TREULKR00015(RI)~U$" "JP3048680007(RI)~U$"
    "20/04/2018" "225.17"              "480.76"              "131.5"               "162.21"              "230.28"              "0.95"               
    "23/04/2018" "213.22"              "479.75"              "131.45"              "162.41"              "228.27"              "0.94"               
    "24/04/2018" "220.69"              "475.64"              "131.6"               "163.59"              "232.75"              "0.94"               
    "25/04/2018" "216.4"               "472.13"              "135.19"              "160.56"              "228.58"              "0.94"               
    "26/04/2018" "215.8"               "475.24"              "140.95"              "160.95"              "225.26"              "0.94"               
    "27/04/2018" "209.17"              "477.98"              "139.58"              "163.46"              "231.71"              "0.95"               
    end

  • #2
    The main obstacle here is that the first observation is unlike any of the others. In addition the variable called name has nothing to do with names. So step 1 is to safely store away the information in observation 1, and then drop that observation. From there a bit of renaming makes it simple to apply -reshape long-. Nice touches include creating a real Stata internal format numeric date variable to hold the dates. The last step is to restore the information originally in observation 1 to a proper variable.

    Code:
    ds name, not
    local firms `r(varlist)'
    
    foreach f of local firms {
        local `f'_code = `f'[1]
    }
    
    drop in 1
    gen int date = daily(name, "DMY")
    assert missing(date) == missing(name)
    drop name
    format date %td
    
    isid date, sort
    
    rename (`firms') ret=
    
    reshape long ret, i(date) j(firm) string
    destring ret, replace
    
    gen code = ""
    foreach f of local firms {
        replace code = `"``f'_code'"' if firm == `"`f'"'
    }

    Comment


    • #3
      Many thanks Clyde Schechter for the efficacious code!

      Comment


      • #4
        I am working on a similar reshape problem.
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int name double BANJALUKASEBIRSPRICEINDEX str45 ERROR double BEL20PRICEINDEX str45 G double OMXAFFARSVARLDENSGENERALPRI
        20454 658.99 "$$ER: E100,INVALID CODE OR EXPRESSION ENTERED"  3700.3 "$$ER: E100,INVALID CODE OR EXPRESSION ENTERED" 490.46
        20457 658.92 ""                                              3642.79 ""                                              474.87
        20458 658.66 ""                                              3653.18 ""                                               472.3
        20459 658.66 ""                                              3613.69 ""                                               472.3
        20460 658.66 ""                                              3550.71 ""                                              461.58
        20461 658.66 ""                                              3493.98 ""                                              458.15
        20464 656.61 ""                                              3494.14 ""                                              456.99
        20465 659.32 ""                                              3518.44 ""                                              458.64
        20466 660.16 ""                                               3528.1 ""                                              460.31
        20467 660.27 ""                                              3455.01 ""                                              451.89
        20468 664.49 ""                                               3384.2 ""                                              440.12
        20471 662.69 ""                                              3339.75 ""                                              439.39
        20472 663.92 ""                                              3382.48 ""                                              447.51
        20473 667.31 ""                                              3285.15 ""                                              434.03
        20474 663.07 ""                                              3355.89 ""                                              440.24
        end
        format %tdnn/dd/CCYY name
        I get the error _OMXAFFARSVARLDENSGENERALPRI_code invalid name. Please have a look at the Stata and Excel files here: https://drive.google.com/open?id=1U_...Lg5eAoM4H8yyXF . Thank you.

        Comment


        • #5
          The maximum length of a Stata variable name is 32 characters. _OMXAFFARSVARLDENSGENERALPRI_code is 46 characters. So you need to abbreviate the values of these variables if you want to use them as the -j()- option in a -reshape wide- command.

          Comment


          • #6
            I renamed the variable as OMX. Now I get the error type mismatch.

            Comment


            • #7
              In addition, when I try to apply the codes to my actual data(https://drive.google.com/open?id=1U_...Lg5eAoM4H8yyXF), I get the error assertion is false.
              Click image for larger version

Name:	Statalist Reshape.png
Views:	1
Size:	7.0 KB
ID:	1447047

              Last edited by Budu Gulo; 31 May 2018, 18:31.

              Comment


              • #8
                Re #6: without seeing the actual -reshape- command, I cannot comment on this. It may be in the files you attached, but, like many others here, I do not risk downloading attachments from strangers. Please show the command directly here in the Forum, along with a -dataex- example of the code that demonstrates this problem.

                Re #7. This means that some, 368 out of 607, of your date variables are not valid dates in DMY format. You need to review your data and fix that. This often arises with data imported from text files where the dates are sometimes entered as DMY and other times as MDY, and perhaps other arrangements, too, and sometimes they spell out four-digit years but other times give only 2-digit years. That's always a mess, but clearly you have to fix it before you can make use of the data. What can be particularly problematic is if there are entries like 06/01/2018 which might be June 1 or 1 January--it often requires some digging around to figure out what is what. The other possibility is that everything is in DMY format but some of the dates are impossible, such as 31/06/2018. My condolences for having to deal with this.

                Comment


                • #9
                  Thanks for the helpful comments Clyde Schechter. I have managed to solve the problem. The culprit was indeed the date variable.

                  Comment

                  Working...
                  X