Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating new variables repeatedly

    Dear Statalisters,

    I have 1696 variables for which there daily values available for a period of 16 years. For instance, I have 252 values for first year for first variable, 252 values for second year for first variable and so on. I want to calculate the year-wise standard deviation of each variable and store it one by one separately for all 1696 variables. I have done this for one variable by using the following command.

    bysort Year: egen SDVariable1 = sd( Variable1 )

    A new variable gets created which has the standard deviation for each year for Variable1. Now, I would like to repeat this process for all 1696 variables so that 1696 new variables are created which contain standard deviation for all years for a particular variable.

    Please guide me with a way to create new variables repeatedly

    Thanks!


  • #2
    Code:
    sort Year
    forvalues j = 1/1696 {
        by Year: egen SDVariable`j' = sd(Variable`j')
    }
    This is basic looping--do read the PDF manual sections that come installed with your Stata covering -forvalues- and -foreach-.

    In the future, when asking for help with code, it is best to post example data. In this case, it seems the question could be answered without that, but usually a data description is not adequate to determine how to code a solution. Data examples should be posted using the -dataex- command. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


    When asking for help with code, always show example data. When showing example data, always use -dataex-.

    Comment


    • #3
      I didn't understand this


      For instance, I have 252 values for first year for first variable, 252 values for second year for first variable and so on
      unless you mean

      For instance, I have 252 values for first year for first variable, 252 values for second year for second variable and so on
      On that understanding: Your data layout isn't good for Stata purposes. You should reshape long: the layout you have is wide, and only rarely is that a good idea in Stata.

      Similarly, wanting to create thousands more variables will not help. It is going further in the wrong direction, What strategy do you have for future analysis?

      After 124 posts here it is time to read and act on the FAQ Advice. Show us a data example. Give the big picture: Are your prices for different products, different places or what?

      If your variable names are really like
      Variable1 you should think in terms of something more evocative.


      Comment


      • #4
        Clyde Schechter and Nick Cox: Thanks a lot for your response. Here's the data description: I have 1696 companies (these are my variables). For each company, I have daily stock return data for 16 years. So for each company, I have approximately 252 daily returns for each year since there are approx. 252 trading days in a year. This means that for each company, I have almost 252*16 = 4032 observations. My objective is to calculate standard deviation of daily returns for each company, year-wise. In this way, I would like to have a standard deviation of daily returns for each year for each company. Finally, I should have only 16 different values of standard deviation for 16 different years respectively for each company. As required, below is the example of my dataset.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int FinancialYear double(MoonsTechnologiesLtd ZensarTechnologiesLtd ZicomElectronicSecuritySystem)
        2001   -.07981829980532129   -.07991913746630737   -.07741935483870976
        2001   -.07992477668077104   -.08012304086714515   -.07925407925407918
        2001   -.07996934082779754   -.07977707006369424   -.07257383966244721
        2001    -.0799777839489031   -.07942550614293124   -.07916287534121931
        2001    .07968608511922722    .07988721804511278    .07905138339920949
        2001    .07967570589879788    -.0798955613577024    .07967032967032969
        2001    .07975142413257383   -.03518728717366628    .06191687871077181
        2001   .025179856115107913    .07392156862745104    .07907348242811495
        2001   -.07976608187134508   -.07997078692714994   -.07994078460399702
        2001     -.079816980172852   -.07997618575114113   -.07964601769911497
        2001   -.07983425414364635   -.05996548748921487   -.07954545454545461
        2001   -.07985589912939063   -.08008260670032114   -.07977207977207969
        2001    -.0799347471451876   -.07982040409079577  -.045407636738906146
        2001    -.0797872340425532   -.07996747085931154   -.07243243243243247
        2001   -.07976878612716759   -.07984678845020617    -.0745920745920745
        2001  -.007537688442211103    .07973102785782896    .07934508816120903
        2001     -.079746835443038   .034994068801897975   .023337222870478413
        2001   -.07977991746905093  -.004584527220630275  -.007981755986317021
        2001   -.07972097658196313   -.11974668969487633   -.07931034482758627
        2001   -.07958852192744986   -.05166775670372788   -.07990012484394497
        2001    .07941176470588235    .07931034482758623    .07869742198100403
        2001    .07956403269754765    .02428115015974446    .07924528301886789
        2001     .0797576981322565   -.11946350592638807    .07925407925407922
        2001    .07947639083683965    -.0825363088912505    .07883369330453577
        2001   -.07968817669987009   -.07335907335907331    .07907907907907899
        2001    -.0795294117647059   -.11999999999999995    .07421150278293136
        2001   -.07975460122699384   -.11979166666666671   -.07944732297063906
        2001                     .   -.11941904249596551   -.05534709193245771
        2001                     .    .11912034208918748   -.03674280039721949
        2001                     .    .12008733624454145  -.009278350515463977
        2001                     0   -.03508771929824556    -.0707596253902185
        2001                     .                     0 -.0044792833146695575
        2001                     .   -.03484848484848491   -.04274465691788539
        2001                     .  -.026687598116169442   .010575793184488903
        2001                     0    .04838709677419347   .025581395348837244
        2001                     .                     0  -.043083900226757336
        2001                     .   -.02051282051282044  .0035545023696682125
        2001                     .  -.005235602094240912                     0
        2001                     .    .05789473684210534    .07910271546635186
        2001                     .    .11990049751243781    .06455142231947474
        2001                     .    .08840515326521538   -.04316546762589931
        2001    .07969723953695461    .12000000000000005  -.007518796992481081
        2001    .03958762886597936    .11989795918367344    .07900432900432897
        2001   .039666798889329745    .10608525870484874    .02908726178535598
        2001    .03967951163677976    .10679611650485422  -.030214424951267003
        2001    .03963302752293582    .04944178628389154  -.006030150753768787
        2001   .039887045534768836  -.060283687943262304  -.039433771486349904
        2001   .039375424304141086   -.11886792452830199   .010526315789473684
        2001                     0   .015295197308045275 -.0031249999999999702
        2001     .0398432397126061   -.11991563724013249  .0031347962382444845
        2001    .03988693467336698   -.07463197535090721   .005208333333333333
        2001    .03986710963455146    .10173880873103958  -.031088082901554404
        2001   .039790880046471064   -.03626595030221631    .07914438502673803
        2001   .039944134078212325   -.04668989547038327    .01883052527254699
        2001   .039752887456352436   -.01096491228070178   -.03599221789883272
        2001     .0397830018083182   .022912047302291242   -.05852674066599392
        2001   .011428571428571486  -.060332369942196463   -.07609860664523038
        2001    .02505526897568162  -.030757400999615558    .01160092807424594
        2001    .03738317757009337  -.024990083300277768  -.008027522935779848
        2001 -.0020790020790020266    .04841334418226206  -.028901734104046242
        2001   -.03981481481481479  -.005044625533566124 -.0035714285714285375
        2001                     .  -.026521060842433684   -.03703703703703714
        2001                     .    .02443910256410254   .031017369727047148
        2001  .0017495626093476345   -.03597966366836143   .010830324909747361
        2001 -.0029940119760478758   .004868154158215051                     0
        2001  .0027527527527526673  -.017359709325797324  -.027380952380952346
        2001  -.011729473421512326  -.039852095316351636   .003671970624234971
        2001   -.07979797979797985   -.05434317501069746   -.06951219512195125
        2001   -.07985729967069145   -.05656108597285067   -.07994757536041933
        2001   -.07992842230838058   .056594724220623484     .0256410256410256
        2001   -.09270664505672617    -.0467544257830232   -.00138888888888881
        2001   -.07824222936763123  -.020476190476190464   -.07093184979137703
        2001  -.015891472868217033   .013125911521633426    .04790419161676651
        2001  .0003938558487593317  -.043186180422264804  -.022857142857142777
        2001   -.06299212598425197   .007021063189568556   -.07163742690058488
        2001  -.007983193277310948   .015438247011952306   .020472440944881844
        2001  -.004235493434985176  -.031878371750858364   -.06790123456790122
        2001   -.07996597192683957   -.12056737588652479  -.006622516556291367
        2001    -.0060101710587148    .06336405529953926    .07833333333333338
        2001  .0018604651162790961   -.04983748645720486   -.05718701700154564
        2001    .06778087279480034    .04104903078677323   -.07868852459016389
        2001   -.07652173913043475   -.04381161007667035    .02669039145907473
        2001  -.020715630885122436  .0005727376861396341  -.006932409012131814
        2001   .002403846153846154  -.007441327990841386   -.07853403141361257
        2001     .1597122302158274 -.0017301038062284392   -.07954545454545448
        2001    .11373035566583953   -.03177354130560354   -.07818930041152272
        2001    -.1578165614556257  -.001789976133651619   -.06249999999999994
        2001    .04894179894179891  -.020920502092050295    .00952380952380949
        2001    .13408995376208493   -.02808302808302803   .014150943396226448
        2001   .013343217197924261   .045226130653266375    .05116279069767449
        2001   -.03328456474030712   -.04807692307692312  -.028761061946902748
        2001   .019296254256526545   -.05050505050505055   -.07972665148063782
        2001  -.049740163325909345  .0033244680851064303   -.07178217821782175
        2001  -.038281250000000044   .057654075546719745     .0773333333333333
        2001   -.07311129163281885  -.030701754385964924    .07920792079207928
        2001    .05740578439964953   .036199095022624465   .027522935779816415
        2001  .0037297969332779827    .03119151590767311               .078125
        2001    .15978530140379849   .029643073200241888    .07867494824016573
        2001   -.07653969384122464  .0023501762632196915    .07869481765834935
        2001    .07787201233616055 -.0035169988276669826    .07829181494661919
        end
        Please provide your valuable guidance.

        P.S.: I sincerely admire and respect both of you for the fantastic work you are doing on this platform.

        Thanks!

        Comment


        • #5
          Well, it isn't quite clear what you want the end result to look like. Do you still want to retain the daily observations, or do you just want a data set with one observation for each year?

          In either case, this data set clearly needs to be reshaped into long layout in order to make it usable for analysis in Stata. The firm names as variable names complicate the matter a bit because some of them are long. If the -rename- command below fails, you may have to abbreviate some of the long firm names first to make this work. Also, you need something to distinguish the days within each year. As you don't have such a variable, the code below creates one. (If your real data set has such a variable, don't make a new one, and use it's name wherever day_in_year appears in the code below:

          Code:
          sort FinancialYear, stable
          by FinancialYear: gen day_in_year = _n
          
          quietly ds FinancialYear day_in_year, not
          rename (`r(varlist)') r=
          reshape long r, i(FinancialYear day_in_year) j(firm) string
          rename r return
          With that behind you, if you need to retain the detail of daily observations for whatever you'll be doing next, do this:
          Code:
          by firm FinancialYear, sort: egen yearly_sd = sd(return)
          Notice that in long layout this is a simple one-liner because there is now a single return variable, instead of a myriad of them.

          If you do not need to retain daily observations and just need the yearly aggregates, then instead of the -egen- command do this:
          Code:
          collapse (sd) return, by(firm FinancialYear)

          Comment


          • #6
            The command
            Code:
            rename r return
            is not working. Nevertheless, I would like to give you a glimpse of the kind of variables I would like generate for all firms. Below is my dataset in this regard. The variable 'SDMoonsTechnologiesLtd' has values of standard deviation for MoonsTechnologiesLtd for all years (although the value of standard deviation for each year gets repeated as many times the number of days a year has). I would like to create such a variable for each of my 1696 companies.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int(Date FinancialYear) double MoonsTechnologiesLtd float SDMoonsTechnologiesLtd
            14703 2001   -.07981829980532129 .06677212
            14704 2001   -.07992477668077104 .06677212
            14705 2001   -.07996934082779754 .06677212
            14706 2001    -.0799777839489031 .06677212
            14707 2001    .07968608511922722 .06677212
            14710 2001    .07967570589879788 .06677212
            14711 2001    .07975142413257383 .06677212
            14712 2001   .025179856115107913 .06677212
            14713 2001   -.07976608187134508 .06677212
            14717 2001     -.079816980172852 .06677212
            14718 2001   -.07983425414364635 .06677212
            14719 2001   -.07985589912939063 .06677212
            14720 2001    -.0799347471451876 .06677212
            14724 2001    -.0797872340425532 .06677212
            14725 2001   -.07976878612716759 .06677212
            14726 2001  -.007537688442211103 .06677212
            14727 2001     -.079746835443038 .06677212
            14728 2001   -.07977991746905093 .06677212
            14732 2001   -.07972097658196313 .06677212
            14733 2001   -.07958852192744986 .06677212
            14734 2001    .07941176470588235 .06677212
            14735 2001    .07956403269754765 .06677212
            14738 2001     .0797576981322565 .06677212
            14739 2001    .07947639083683965 .06677212
            14740 2001   -.07968817669987009 .06677212
            14741 2001    -.0795294117647059 .06677212
            14742 2001   -.07975460122699384 .06677212
            14745 2001                     . .06677212
            14746 2001                     . .06677212
            14747 2001                     . .06677212
            14748 2001                     0 .06677212
            14749 2001                     . .06677212
            14752 2001                     . .06677212
            14753 2001                     . .06677212
            14754 2001                     0 .06677212
            14755 2001                     . .06677212
            14756 2001                     . .06677212
            14759 2001                     . .06677212
            14760 2001                     . .06677212
            14761 2001                     . .06677212
            14762 2001                     . .06677212
            14763 2001    .07969723953695461 .06677212
            14766 2001    .03958762886597936 .06677212
            14767 2001   .039666798889329745 .06677212
            14768 2001    .03967951163677976 .06677212
            14769 2001    .03963302752293582 .06677212
            14770 2001   .039887045534768836 .06677212
            14773 2001   .039375424304141086 .06677212
            14774 2001                     0 .06677212
            14775 2001     .0398432397126061 .06677212
            14776 2001    .03988693467336698 .06677212
            14777 2001    .03986710963455146 .06677212
            14780 2001   .039790880046471064 .06677212
            14781 2001   .039944134078212325 .06677212
            14782 2001   .039752887456352436 .06677212
            14783 2001     .0397830018083182 .06677212
            14784 2001   .011428571428571486 .06677212
            14787 2001    .02505526897568162 .06677212
            14788 2001    .03738317757009337 .06677212
            14789 2001 -.0020790020790020266 .06677212
            14790 2001   -.03981481481481479 .06677212
            14791 2001                     . .06677212
            14794 2001                     . .06677212
            14795 2001  .0017495626093476345 .06677212
            14796 2001 -.0029940119760478758 .06677212
            14797 2001  .0027527527527526673 .06677212
            14798 2001  -.011729473421512326 .06677212
            14801 2001   -.07979797979797985 .06677212
            14802 2001   -.07985729967069145 .06677212
            14803 2001   -.07992842230838058 .06677212
            14804 2001   -.09270664505672617 .06677212
            14805 2001   -.07824222936763123 .06677212
            14808 2001  -.015891472868217033 .06677212
            14809 2001  .0003938558487593317 .06677212
            14810 2001   -.06299212598425197 .06677212
            14811 2001  -.007983193277310948 .06677212
            14812 2001  -.004235493434985176 .06677212
            14815 2001   -.07996597192683957 .06677212
            14816 2001    -.0060101710587148 .06677212
            14817 2001  .0018604651162790961 .06677212
            14818 2001    .06778087279480034 .06677212
            14819 2001   -.07652173913043475 .06677212
            14822 2001  -.020715630885122436 .06677212
            14823 2001   .002403846153846154 .06677212
            14824 2001     .1597122302158274 .06677212
            14825 2001    .11373035566583953 .06677212
            14826 2001    -.1578165614556257 .06677212
            14829 2001    .04894179894179891 .06677212
            14830 2001    .13408995376208493 .06677212
            14831 2001   .013343217197924261 .06677212
            14832 2001   -.03328456474030712 .06677212
            14833 2001   .019296254256526545 .06677212
            14836 2001  -.049740163325909345 .06677212
            14838 2001  -.038281250000000044 .06677212
            14839 2001   -.07311129163281885 .06677212
            14840 2001    .05740578439964953 .06677212
            14843 2001  .0037297969332779827 .06677212
            14844 2001    .15978530140379849 .06677212
            14845 2001   -.07653969384122464 .06677212
            14846 2001    .07787201233616055 .06677212
            end
            format %tddd-Mon-YY Date
            Waiting for your reply!

            Comment


            • #7
              There is very little that is more useless than to say that a command is "not working" without providing any information about in what way, specifically, it did not work. Did Stata crash? Did it hang? Did you get error messages--if so, what were they? Show the exact and complete output you got from Stata so we can see what is going on.

              In addition, since -rename r return- is a simple and clearly syntactically correct command, it seems likely that the problem lies with something having gone wrong prior to that point such that perhaps the -reshape- failed and variable r does not exist. Another possibility is that the dataset already contained a variable called return, so that this renaming would not be possible. So, can you verify that the preceding -reshape- command executed without any error messages? If it did produce error messages, what were they? Can you verify that there is no pre-existing variable called return in your data set?

              For the kind of output you are looking for, as shown in #6, you should use the -egen- command, not the -collapse- command from #5.

              Comment


              • #8
                Dear Clyde Schechter: The commands you mentioned in #5 have worked without any errors now. In fact, I have matched the output of standard deviation for some years with my manual calculation in Excel and there does not seem to be any mismatch. Earlier, I had actually stopped the command 'reshape long r, i(FinancialYear day_in_year) j(firm) string' from running in between since I thought it was taking too much time. Apologies for not sharing with you the exact description of the error.

                And, thanks a lot for your help! Really! Also, I would like to know how should I go about learning coding in Stata? Given your experience, what is the step-by-step strategy to learn programming in Stata?

                Comment


                • #9
                  Further, please consider the structure of my resultant dataset below.

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int(Date FinancialYear) str30 firm float yearly_sd
                  14845 2001 "KanoriaChemicalsIndsLtd"           .05524299
                  14705 2001 "CentenialSurgicalSutureLtd"       .020611877
                  15013 2001 "GujaratHotelsLtd"                  .11089911
                  15057 2001 "MunjalAutoIndsLtd"                 .04327252
                  14894 2001 "WestCoastPaperMillsLtd"           .036983907
                  15005 2001 "MohotaIndustriesLtd"            .00005265689
                  14916 2001 "BrightBrothersLtd"                 .04988215
                  14903 2001 "VenkySIndiaLtd"                     .0459145
                  14725 2001 "AkshOptifibreLtd"                  .05059998
                  15048 2001 "HindRectifiersLtd"                 .06200729
                  14773 2001 "TapariaToolsLtd"                   .08790273
                  14908 2001 "JRFoodsLtd"                                .
                  14880 2001 "StandardIndustriesLtd"             .08004004
                  14886 2001 "MountShivalikIndsLtd"              .12162056
                  14761 2001 "UnitechLtd"                        .04881753
                  14801 2001 "AbanOffshoreLtd"                   .03950319
                  15000 2001 "ECEIndustriesLtd"                          .
                  14984 2001 "WeizmannLtd"                       .15032548
                  14903 2001 "LaOpalaRGLtd"                      .05619229
                  14971 2001 "ManorEstatesIndsLtd"                .1122986
                  14703 2001 "PratikshaChemicalsLtd"                     .
                  14843 2001 "AparIndustriesLtd"                 .06636721
                  14825 2001 "AimcoPesticidesLtd"                .10044735
                  15006 2001 "ITLIndustriesLtd"                          .
                  14703 2001 "GangotriTextilesLtd"               .04821774
                  15053 2001 "TVTodayNetworkLtd"                         .
                  14710 2001 "BombayOxygenInvestmentsLtd"         .0555854
                  14895 2001 "KatareSpinningMillsLtd"                    .
                  14921 2001 "GoodluckIndiaLtd"                          .
                  15064 2001 "KeltechEnergiesLtd"                        .
                  14889 2001 "MangalamOrganicsLtd"               .10726316
                  14838 2001 "RavalgaonSugarFarmLtd"             .04773302
                  15036 2001 "HiranOrgochemLtd"                  .10835334
                  14748 2001 "MonnetIspatEnergyLtd"              .07529701
                  15015 2001 "OrientPressLtd"                    .09901819
                  14829 2001 "BioconLtd"                                 .
                  14802 2001 "SampreNutritionsLtd"               .13359454
                  15046 2001 "StoneIndiaLtd"                     .08855326
                  14775 2001 "GenusPowerInfrastructuresLtd"      .06657097
                  14795 2001 "SubexLtd"                          .04649145
                  14741 2001 "AlokIndustriesLtd"                 .04598932
                  14844 2001 "VikramThermoIndiaLtd"                      .
                  14857 2001 "MaharashtraSeamlessLtd"           .035894528
                  14742 2001 "ZenithExportsLtd"                          .
                  14910 2001 "VijayShanthiBuildersLtd"           .20903945
                  14857 2001 "SplendidMetalProductsLtd"           .0892134
                  14921 2001 "FrontlineCorporationLtd"                   .
                  14725 2001 "ModernSteelsLtd"                   .13072562
                  14845 2001 "MPSLtd"                                    .
                  14875 2001 "SandurManganeseIronOresLtd"       .020288056
                  14903 2001 "AdvancePowerinfraTechLtd"                  .
                  14902 2001 "AlembicLtd"                        .03939192
                  14902 2001 "SonataSoftwareLtd"                 .05706235
                  14763 2001 "CranexLtd"                         .06363508
                  14768 2001 "KRBLLtd"                           .05292131
                  15021 2001 "HindAluminiumIndsLtd"              .09286046
                  14948 2001 "ShreeSteelWireRopesLtd"                    .
                  14728 2001 "MuktaArtsLtd"                      .04528559
                  15013 2001 "SaharaOneMediaEntertainment"      .025268244
                  14712 2001 "GreenplyIndustriesLtd"            .032673284
                  14732 2001 "MachinoPlasticsLtd"                .05104965
                  14724 2001 "UPLLtd"                            .08186691
                  14971 2001 "BergerPaintsIndiaLtd"             .025340656
                  14867 2001 "BasilInfrastructureProjectsLt"             .
                  14762 2001 "PanamaPetrochemLtd"                .05951063
                  14725 2001 "FineLineCircuitsLtd"                .1383549
                  14823 2001 "ShyamTelecomLtd"                  .067131214
                  14990 2001 "ShreeRajasthanSyntexLtd"           .09668606
                  15034 2001 "NeulandLaboratoriesLtd"            .04074075
                  14705 2001 "MahindraMahindraLtd"               .03930141
                  14896 2001 "ForceMotorsLtd"                    .03742669
                  14822 2001 "HaryanaLeatherChemicalsLtd"         .1665071
                  14956 2001 "FineLineCircuitsLtd"                .1383549
                  14868 2001 "EIHLtd"                           .028215474
                  14983 2001 "BergerPaintsIndiaLtd"             .025340656
                  14880 2001 "KamatHotelsIndiaLtd"               .04887861
                  14742 2001 "GoraniIndustriesLtd"                       .
                  15042 2001 "PAELtd"                            .13027866
                  14979 2001 "IonExchangeIndiaLtd"               .04186457
                  15047 2001 "GlobalOffshoreServicesLtd"         .11202559
                  14704 2001 "DharaniFinanceLtd"                         .
                  14965 2001 "KilburnEngineeringLtd"             .10341997
                  15013 2001 "SuryajyotiSpinningMillsLtd"        .14448695
                  14895 2001 "ButterflyGandhimathiAppliances"    .13754557
                  14966 2001 "FDCLtd"                           .034416594
                  14907 2001 "AjwaFunWorldResortLtd"                     .
                  15026 2001 "MountShivalikIndsLtd"              .12162056
                  14815 2001 "SancoTransLtd"                     .06564457
                  14738 2001 "CiplaLtd"                         .031726513
                  14822 2001 "JKPaperLtd"                       .026114456
                  15014 2001 "HimadriSpecialityChemicalLtd"      .12061264
                  15053 2001 "GoraniIndustriesLtd"                       .
                  14843 2001 "NehaInternationalLtd"              .11009697
                  14826 2001 "JetkingInfotrainLtd"               .05032428
                  14962 2001 "KiranSyntexLtd"                     .1542038
                  15060 2001 "YorkExportsLtd"                    .10501102
                  14860 2001 "MachinoPlasticsLtd"                .05104965
                  14802 2001 "JhaveriFlexoIndiaPvtLtd"           .14072178
                  15012 2001 "PerfectpacLtd"                     .04715546
                  14875 2001 "SwissGlascoatEquipmentsLtd"         .0518464
                  end
                  format %tddd-Mon-YY Date
                  Now, since there are almost 252 trading days in a year, the value of standard deviation for a given company gets repeated almost 250 times for a particular year (although the value of this standard deviation remains same for a given company for a given year). This has lead to generation of 7656251 rows in my dataset and I am unable to transport the same to Excel for further analysis. Is there a way by which I can trim my dataset in such a manner so that I have value of standard deviation of a particular firm for a given year only once. In this way, I shall have only 16 different values of standard deviation corresponding to 16 different years for each company. And number of rows shall reduce to 1696*16 = 27136. Below is the data structure I intend to obtain.

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int FinancialYear str23 firm double yearly_sd
                  2001 "ABCGasInternationalLtd"   .14819550223658096
                  2002 "ABCGasInternationalLtd"   .22640233403794863
                  2003 "ABCGasInternationalLtd"   .04043564512429687
                  2004 "ABCGasInternationalLtd"    .2710073735357711
                  2005 "ABCGasInternationalLtd"    .3229529979988134
                  2006 "ABCGasInternationalLtd"    .2776193742474138
                  2007 "ABCGasInternationalLtd"    .8817123925177446
                  2008 "ABCGasInternationalLtd"    .4161479030609633
                  2009 "ABCGasInternationalLtd"     .598790001176269
                  2010 "ABCGasInternationalLtd"    .9180727778310465
                  2011 "ABCGasInternationalLtd"  .013638578756989372
                  2012 "ABCGasInternationalLtd"   .22641783060623055
                  2013 "ABCGasInternationalLtd"   .19574076087116643
                  2014 "ABCGasInternationalLtd"    .5889549234929253
                  2015 "ABCGasInternationalLtd"     .291809916420402
                  2016 "ABCGasInternationalLtd"    .3702289838306456
                  2001 "BombaySwadeshiStoresLtd" .012364558679375248
                  2002 "BombaySwadeshiStoresLtd"     .33163962865411
                  2003 "BombaySwadeshiStoresLtd"   .9930125734139239
                  2004 "BombaySwadeshiStoresLtd"  .40302889603442693
                  2005 "BombaySwadeshiStoresLtd"  .47051710288343684
                  2006 "BombaySwadeshiStoresLtd"   .4022417062332093
                  2007 "BombaySwadeshiStoresLtd"   .2831090769847442
                  2008 "BombaySwadeshiStoresLtd" .020733555508002266
                  2009 "BombaySwadeshiStoresLtd"  .44979157296224836
                  2010 "BombaySwadeshiStoresLtd" .013689528080304325
                  2011 "BombaySwadeshiStoresLtd"  .12022009900099029
                  2012 "BombaySwadeshiStoresLtd"   .1417826172896326
                  2013 "BombaySwadeshiStoresLtd"    .976248705364949
                  2014 "BombaySwadeshiStoresLtd"   .4614835903542206
                  2015 "BombaySwadeshiStoresLtd"  .40985741446817425
                  2016 "BombaySwadeshiStoresLtd"   .5089623909098224
                  2001 "Arex Industries Ltd."       .912610995743385
                  2002 "Arex Industries Ltd."     .41653196926149516
                  2003 "Arex Industries Ltd."      .0423727280452213
                  2004 "Arex Industries Ltd."      .2713336337437606
                  2005 "Arex Industries Ltd."      .2109170159993896
                  2006 "Arex Industries Ltd."      .5807524366354153
                  2007 "Arex Industries Ltd."      .7934648236752433
                  2008 "Arex Industries Ltd."      .4899784501444904
                  2009 "Arex Industries Ltd."      .6475779121843187
                  2010 "Arex Industries Ltd."      .9750762612935926
                  2011 "Arex Industries Ltd."      .5021898791830434
                  2012 "Arex Industries Ltd."      .9834999880869138
                  2013 "Arex Industries Ltd."      .8288562004983742
                  2014 "Arex Industries Ltd."      .6732698620749004
                  2015 "Arex Industries Ltd."     .16223395232488302
                  2016 "Arex Industries Ltd."      .7633566238849268
                  end
                  Waiting for your valuable response.

                  Comment


                  • #10
                    Code:
                    collapse (first) yearly_sd, by(FinancialYear firm)
                    Concerning how to learn to code in Stata, #5 at https://www.statalist.org/forums/for...uplicate-dates, where William Lisowski explains what he and I regard as the first stage. When I was a Stata beginner, I followed that up by taking a series of 3 Stata NetCourses on programming (see the www.stata.com for information). By the end of that, I was pretty proficient. The rest just came from experience. And being willing to try things out to see what they do (and figure out why when what they do isn't what I expected.)

                    Comment


                    • #11
                      The command you have mentioned works absolutely fine. And thanks a lot for your guidance on learning Stata. People like you inspire me to work hard, help others and grow together.

                      Comment


                      • #12
                        Hi Statalist,

                        I am working with rainfall data and i would like to generate two shock variables:
                        Flood and drought shocks are defined as +/- standard deviations with respect to the historical monthly mean of each region.

                        I have used the codes below to create the flood shock, however when i tried to generate the drought shock, it is capturing the flood as well.

                        bys reg: egen stdr=sd(meanrain)
                        gen flood=0 if raininyear!=.
                        replace flood=1 if raininyear > meanrain + stdr & raininyear!=.

                        Thank for your help.

                        Regards!

                        Comment


                        • #13
                          Well, you don't show the code you used to try to generate the drought shock, so how is anyone supposed to help you fix it?

                          By the way, your flood code can be simplified to one line:
                          Code:
                          gen flood = (rainyear > meanrain + stdr) if !missing(rainyear, meanrain, stdr)
                          I would imagine that the drought code should, similarly, be:

                          Code:
                          gen drought = (rainyear < meanrain - stdr) if !missing(rainyear, meanrain, stdr)
                          By the way, are flood and drought really defined as a deviation of 1 standard deviation above and below mean rainfall respectively? That seems to be a pretty loose criterion. If rainfall has a Gaussian distribution, that implies that more than 1 in 3 years will be either a flood year or a drought year.

                          Comment


                          • #14
                            Dear Clyde,

                            Much thanks for your reply.

                            I have used the codes below for the drought variable:

                            gen drought=o if raininyear!=.
                            gen drought=1 if raininyear > mean rain - stdr & raininyear!=. (When i did this, the drought was coded as 1 when there was a flood)

                            I am also trying to generate a “normal rainfall” variable for a given month if rainfall fell within one standard deviation of historical mean for the calendar month within regions.

                            In this instance, I have generated normal rainfall by using the code:

                            gen normal rain =1 if flood=0
                            replace normal rain =o if normalrain =.

                            Can you tell me if this correct?



                            Many thanks!

                            Regards

                            Comment


                            • #15
                              gen drought=o if raininyear!=.
                              gen drought=1 if raininyear > mean rain - stdr & raininyear!=. (When i did this, the drought was coded as 1 when there was a flood)

                              Well, this is, first of all, a syntax error. The blank space between mean and rain is illegal. So clearly this is not the code you actually ran. I'll guess that except for that blank space it is what you ran. On that assumption, and based on my understanding is that droughts are periods of decreased rainfall, the code should not be -...if rainyear > meanrain - stdr…- but -if rainyear
                              <
                              meanrain-stdr…-.

                              gen normal rain =1 if flood=0
                              replace normal rain =o if normalrain =.

                              Again, due to illegal blank spaces this code generates nothing but syntax error messages. Also, is that lower case o supposed to be a zero? If there is another variable named o in your data set, and if it is numeric, then it is legal to have -replace normalrain = o if normalrain=.-, but I', guessing that's not the case, and even if it is, I'm guessing it isn't what you want or intend. It seems to me that once you get your drought variable straightened out, the code for normal rain would just be:

                              Code:
                              gen normal_rain = !flood & !drought if !missing(flood, drought)
                              Before posting further in this thread, please read the Forum FAQ and follow its advice regarding:

                              A. When posting code post exactly what you ran. Don't edit it. In fact, the best way to do it is to copy it from your do-file, log-file, or Results window to your computer's clipboard and then paste that in to the Forum editor.

                              B. And then surround it by code delimiters to maximize its readability.

                              Comment

                              Working...
                              X