  • Function for highest and lowest increase in a variable

    Hi,

    I am currently writing my thesis on the impact of Robinhood users on stock price volatility. I want to see whether firms that experienced a large increase in Robinhood users also experienced an increase in stock price volatility, compared with firms that did not experience such an increase. For this analysis, I want to compare two groups: Group 1 is the decile of firms with the smallest increase in Robinhood users, and Group 2 is the decile of firms with the greatest increase in Robinhood users.

    Is there a function in Stata that can find the increase in the level of a given variable (in this case Robin_user)?

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long date str9 TICKER float Robin_user double RETX_sd_21days float(E_vol Asset_growth Long_term_debt Size)
    21306 "A"       590  .024704669955529814 .004914847 .04267474 .2168521 10.130047
    21307 "A"  586.4737   .02496701844494897 .004914847 .04267474 .2168521 10.130047
    21308 "A"  587.4583  .025027515252421216 .004914847 .04267474 .2168521 10.130047
    21311 "A"     585.9   .02490161882111235 .004914847 .04267474 .2168521 10.130047
    21312 "A"   577.125  .025622605385144024 .004914847 .04267474 .2168521 10.130047
    21313 "A"  581.2083  .025625072442753808 .004914847 .04267474 .2168521 10.130047
    21314 "A"  583.0417  .025548991795362092 .004914847 .04267474 .2168521 10.130047
    21315 "A"   595.625   .02545921972439692 .004914847 .04267474 .2168521 10.130047
    21318 "A"  661.9167  .025375674797751033 .004914847 .04267474 .2168521 10.130047
    21319 "A"  910.1667  .025373823067314984 .004914847 .04267474 .2168521 10.130047
    21320 "A" 1186.4584  .012971544203729387 .004914847 .04267474 .2168521 10.130047
    21321 "A" 1225.4375  .012755792037015906 .004914847 .04267474 .2168521 10.130047
    21322 "A" 1146.5294  .012332868317013777 .004914847 .04267474 .2168521 10.130047
    21325 "A" 1082.5834  .013091860825230424 .004914847 .04267474 .2168521 10.130047
    21326 "A" 1001.3333  .012486383993162544 .004914847 .04267474 .2168521 10.130047
    21327 "A"  939.4583   .01237881906090757 .004914847 .04267474 .2168521 10.130047
    21328 "A"  921.7059   .01236852128382127 .004914847 .04267474 .2168521 10.130047
    21329 "A"   905.875  .013410349475215385 .004914847 .04267474 .2168521 10.130047
    21333 "A"       904  .013439015279697474 .004914847 .04267474 .2168521 10.130047
    21334 "A"     909.6  .013001905520743713 .004914847 .04267474 .2168521 10.130047
    21335 "A"  905.8333  .012621857295848906 .004914847 .04267474 .2168521 10.130047
    21336 "A"  895.2083  .012523655122617675 .004914847 .04267474 .2168521 10.130047
    21339 "A"  882.4583  .012168525698544297 .004914847 .04267474 .2168521 10.130047
    21340 "A"  875.1667  .011937756780267936 .004914847 .04267474 .2168521 10.130047
    21341 "A"  862.0833  .012134404918243163 .004914847 .04267474 .2168521 10.130047
    21342 "A"  828.2083   .01061856243625401 .004914847 .04267474 .2168521 10.130047
    21343 "A"   813.375  .011131506125058354 .004914847 .04267474 .2168521 10.130047
    21346 "A"  796.7826  .011176947470294222 .004914847 .04267474 .2168521 10.130047
    21347 "A"  779.7083  .011295382294606197 .004914847 .04267474 .2168521 10.130047
    21348 "A"  776.0417   .01153289893858018 .004914847 .04267474 .2168521 10.130047
    21349 "A"  767.9583   .01153200235438123 .004914847 .04267474 .2168521 10.130047
    21350 "A"       758  .011653067059779672 .004914847 .04267474 .2168521 10.130047
    21353 "A"  750.4583  .011772842312328433 .004914847 .04267474 .2168521 10.130047
    21354 "A"  750.5833  .011237425044435619 .004914847 .04267474 .2168521 10.130047
    21355 "A"   752.625  .010960555351483168 .004914847 .04267474 .2168521 10.130047
    21356 "A"  752.2917  .010684257958031636 .004914847 .04267474 .2168521 10.130047
    21357 "A"  747.2917  .010676820606914777 .004914847 .04267474 .2168521 10.130047
    21360 "A"  742.2174  .011185137263403989 .004914847 .04267474 .2168521 10.130047
    21361 "A"   739.375   .00990601093982708 .004914847 .04267474 .2168521 10.130047
    21362 "A"   739.125  .009871991730532175 .004914847 .04267474 .2168521 10.130047
    21363 "A"    765.25  .010000063895312309 .004914847 .04267474 .2168521 10.130047
    21364 "A"  817.6667  .010961502868554667 .004914847 .04267474 .2168521 10.130047
    21367 "A"    838.25  .011670944449788922 .004914847 .04267474 .2168521 10.130047
    21368 "A"    840.75  .012258936231664424 .004914847 .04267474 .2168521 10.130047
    21370 "A"   834.375  .012179594288376101 .004914847 .04267474 .2168521 10.130047
    21371 "A"   826.625  .012135405427447432 .004914847 .04267474 .2168521 10.130047
    21374 "A"  819.0417   .01224513724268452 .004914847 .04267474 .2168521 10.130047
    21375 "A"   811.375  .012438193392232616 .004914847 .04267474 .2168521 10.130047
    21376 "A"  807.8333  .012479095720895696 .004914847 .04267474 .2168521 10.130047
    21377 "A"  800.7083   .01233595111914223 .004914847 .04267474 .2168521 10.130047
    21378 "A"  791.2917  .012402939763123942 .004914847 .04267474 .2168521 10.130047
    21381 "A"  791.3333  .012478065908349892 .004914847 .04267474 .2168521 10.130047
    21382 "A"   785.875   .01233477422095114 .004914847 .04267474 .2168521 10.130047
    21383 "A"    777.25  .014297394053910848 .004914847 .04267474 .2168521 10.130047
    21384 "A"    769.75  .014469021234493664 .004914847 .04267474 .2168521 10.130047
    21385 "A"  761.0833  .014297753285104114 .004914847 .04267474 .2168521 10.130047
    21388 "A"       757  .014271448963207489 .004914847 .04267474 .2168521 10.130047
    21389 "A"  752.3333   .01432832617966895 .004914847 .04267474 .2168521 10.130047
    21390 "A"  741.7083  .014065552309581333 .004914847 .04267474 .2168521 10.130047
    21391 "A"   733.087  .013494042048833756 .004914847 .04267474 .2168521 10.130047
    21392 "A"  725.1667  .013254176921549187 .004914847 .04267474 .2168521 10.130047
    21395 "A"   719.625  .012833193685748324 .004914847 .04267474 .2168521 10.130047
    21396 "A"   712.625  .012327518573474392 .004914847 .04267474 .2168521 10.130047
    21397 "A"  712.2083  .011458601390482364 .004914847 .04267474 .2168521 10.130047
    21398 "A"  713.2917  .010802028275263269 .004914847 .04267474 .2168521 10.130047
    21399 "A"  715.1667  .010794200241529796 .004914847 .04267474 .2168521 10.130047
    21402 "A"  714.2917  .010796396703504855 .004914847 .04267474 .2168521 10.130047
    21403 "A"  707.1667  .010575760454405767 .004914847 .04267474 .2168521 10.130047
    21404 "A"  699.7826  .009939910726452516 .004914847 .04267474 .2168521 10.130047
    21406 "A"    723.92  .009743962426818655 .004914847 .04267474 .2168521 10.130047
    21409 "A"  760.5833  .009681166005387647 .004914847 .04267474 .2168521 10.130047
    21410 "A"  794.2083  .009610244348013781 .004914847 .04267474 .2168521 10.130047
    21411 "A"  843.9583  .010259044237580056 .004914847 .04267474 .2168521 10.130047
    21412 "A"     789.5  .007498114760397694 .004914847 .04267474 .2168521 10.130047
    21413 "A"   765.875  .007215992326041391 .004914847 .04267474 .2168521 10.130047
    21416 "A"  754.4583 .0064096497229258935 .004914847 .04267474 .2168521 10.130047
    21417 "A"   752.375  .006385461347471412 .004914847 .04267474 .2168521 10.130047
    21418 "A"  743.0833 .0075833596737493285 .004914847 .04267474 .2168521 10.130047
    21419 "A"  732.4167   .00772817923294989 .004914847 .04267474 .2168521 10.130047
    21420 "A"  719.6667   .00833255412334665 .004914847 .04267474 .2168521 10.130047
    21423 "A"   714.625    .0083555016791896 .004914847 .04267474 .2168521 10.130047
    21424 "A"  713.1667   .00839080976977848 .004914847 .04267474 .2168521 10.130047
    21425 "A"   708.625  .008314352829244233 .004914847 .04267474 .2168521 10.130047
    21426 "A"  708.3043  .008421647941021607 .004914847 .04267474 .2168521 10.130047
    21427 "A"     706.5  .008586653345913646 .004914847 .04267474 .2168521 10.130047
    21431 "A"  699.0833   .00860877523068432 .004914847 .04267474 .2168521 10.130047
    21432 "A"  695.5833  .008487650903677098 .004914847 .04267474 .2168521 10.130047
    21433 "A"   691.375  .009804726534148074 .004914847 .04267474 .2168521 10.130047
    21434 "A"       691  .009911790257500516 .004914847 .04267474 .2168521 10.130047
    21437 "A"  691.4167  .010337012160953424 .004914847 .04267474 .2168521 10.130047
    21438 "A"    692.25  .010248501746247796 .004914847 .04267474 .2168521 10.130047
    21439 "A"  688.3333  .013519025713368245 .004914847 .04267474 .2168521 10.130047
    21440 "A"  683.0435   .01460767195318382 .004914847 .04267474 .2168521 10.130047
    21441 "A"     678.5  .014599385812976824 .004914847 .04267474 .2168521 10.130047
    21444 "A"     682.5  .014645848949743344 .004914847 .04267474 .2168521 10.130047
    21445 "A"    689.25  .016022547768937433 .004914847 .04267474 .2168521 10.130047
    21446 "A"   687.875   .01603286970705724 .004914847 .04267474 .2168521 10.130047
    21447 "A"  688.2174  .016449046264918842 .004914847 .04267474 .2168521 10.130047
    21448 "A"  687.2609  .015783693242935083 .004914847 .04267474 .2168521 10.130047
    21451 "A"   688.875  .015784839472585386 .004914847 .04267474 .2168521 10.130047
    end
    format %tdCCYY-NN-DD date
    If you need more information, please let me know!

    Kind regards,
    Fabien

  • #2
    You need to explain how you want to operationalize "increase in Robinhood users." Among the possibilities are: the difference between the number of users on the last date and the number on the first date, or the difference between the number on the date with the lowest number and the number on the date with the highest number after that. In both of these possibilities, you might also consider the ratio instead of the difference. Or maybe you want to fit a regression line showing the growth trend in Robinhood users over time and look at the slope of that. These are just the things that popped into my head. Probably if I spent a little time on it I could come up with several other possible definitions. Anyway, you need to specify what you want here before anybody can help you code it.
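
    As an illustration of the first possibility (first date versus last date, as a difference and as a ratio), a minimal sketch might look like the following. It assumes the variable names from the -dataex- example above and that Robin_user is non-missing on each firm's first and last date; the new variable names (first_robin, last_robin, diff_first_last, ratio_first_last) are hypothetical and not from the thread.

    ```stata
    * Sketch only: first-to-last change in Robin_user within each firm.
    * All generated variable names here are illustrative assumptions.
    bysort TICKER (date): gen first_robin = Robin_user[1]   // value on earliest date
    bysort TICKER (date): gen last_robin  = Robin_user[_N]  // value on latest date

    gen diff_first_last  = last_robin - first_robin
    gen ratio_first_last = last_robin / first_robin         // missing if first_robin == 0
    ```

    Each new variable is constant within TICKER, so one observation per firm is enough when ranking firms on it.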

    • #3
      Dear Clyde,

      Many thanks for your response!

      Looking at your answer, I think I want to measure this using two different approaches. The first approach is the difference between the lowest number and the number on the date with the highest number after that, and the second approach is to fit a regression line showing the growth trend over time. For the second approach, is there any code that can put the coefficient of the growth trend in a column for each firm?

      Now that I have specified how I want to operationalize this increase, I will have another look at the Statalist forum to find similar posts. However, if anyone can provide me with code or offer assistance regarding my post, please don't hesitate to help!

      Kind regards,
      Fabien

      • #4
        Code:
        //  FIND DATE WITH LOWEST NUMBER OF ROBINHOOD USERS; IF TIED, CHOOSE EARLIEST
        by TICKER (Robin_user date), sort: gen date_lowest = date[1]
        by TICKER (Robin_user date): gen lowest_robin = Robin_user[1]
        format date_lowest %tdCCYY-NN-DD
        
        //  NOW FIND THE HIGHEST NUMBER OF ROBINHOOD USERS AFTER THAT DATE
        by TICKER: egen highest_robin = max(cond(date > date_lowest, Robin_user, .))
        
        gen increase_in_robinhood_users = highest_robin - lowest_robin
        sort TICKER date
        
        //  REGRESSION SLOPE
        rangestat (reg) Robin_user date, by(TICKER) interval(date . .)
        At the end of this code, the new variable increase_in_robinhood_users will give you the first definition. And the regression slope will be in the variable b_date.

        -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer, and is available from SSC. You can also get this regression result from official Stata commands, but doing it with -rangestat- is a one-liner, so I like doing it that way. It's very easy, and -rangestat- is so useful in other ways.
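
        For readers who prefer to stay with official commands, one possible way to get the same per-firm slope is -statsby-. This is only a sketch of that alternative, not the code used in this thread; the variable name slope_robin and the tempfile name slopes are hypothetical. It needs to be run in one go from a do-file, because it relies on -preserve- and a temporary file.

        ```stata
        * Sketch: per-firm slope of Robin_user on date using official commands only.
        * slope_robin and the tempfile name are illustrative assumptions.
        preserve
        statsby slope_robin = _b[date], by(TICKER) clear: regress Robin_user date
        tempfile slopes
        save `slopes'
        restore

        * Attach the firm-level slope to every observation of that firm
        merge m:1 TICKER using `slopes', nogenerate
        ```

        The result plays the same role as b_date from -rangestat-, at the cost of a few more lines.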

        • #5
          Hi Clyde,

          Many thanks again for your response! For my analysis, I used the best of both worlds from your code. Namely, I ran a regression over the period between the date with the lowest number of Robinhood users and the date with the highest number of Robinhood users after that. Thanks a lot for providing me with this code!!

          My last question on this topic is about the actual analysis, where I am trying to find evidence that firms with the highest increase in Robinhood users had a higher increase in stock price volatility than firms with a small increase in Robinhood users. I have divided my dataset into deciles based on the b_date slope, and I only use deciles 1 and 10 for my analysis. To find the regression slope of the stock price volatility, I made a slight change to your code:

          Code:
          rangestat (reg) RETX_sd_21days date, by(TICKER) interval(date . .)
          I ran a t-test to see whether the mean slope in decile 1 is significantly lower than the mean slope in decile 10; however, I do not find any support for this in the t-test. Is there maybe a more "sophisticated" analysis to see if the firms with the highest increase in Robinhood users also observed a higher/stronger increase in stock price volatility?

          Many thanks!

          With kind regards,
          Fabien

          • #6
            Breaking the data into deciles degrades the quality of the Robinhood usage increase variable. Continuous variables should be kept as continuous in nearly all circumstances. Making categories out of them introduces noise and discards information. Moreover, there could be different relationships in the 2nd through 9th deciles than there are in deciles 1 and 10. You need to capture that as well.

            I would do some graphical exploration first. Look at a scatterplot of the volatility increase measure vs the Robinhood user increase measure. What, if anything, does it suggest? Try using the -lowess- command to see what a locally weighted best fit curve looks like. That might help you see what is going on and guide you in selecting a suitable model/analysis.
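
            A rough sketch of that graphical exploration could look like this. It assumes the two per-firm slopes have already been stored in variables that are constant within TICKER, here called slope_robin and slope_vol (hypothetical names, for instance obtained by renaming b_date after each regression); the tagging variable one_per_firm is also just an illustrative name.

            ```stata
            * Sketch: one point per firm, with a lowess smooth overlaid.
            * slope_robin, slope_vol and one_per_firm are assumed/hypothetical names.
            egen byte one_per_firm = tag(TICKER)   // flags a single observation per firm

            twoway (scatter slope_vol slope_robin if one_per_firm)     ///
                   (lowess  slope_vol slope_robin if one_per_firm),    ///
                   ytitle("Slope of 21-day return volatility")         ///
                   xtitle("Slope of Robinhood users") legend(off)
            ```

            Restricting to one observation per firm keeps firms with longer series from being overplotted and from dominating the smooth.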

            Added: Also, your model does not adjust for any other variables that might be influencing volatility and that are also associated with Robinhood users coming to, or fleeing from, the stock. You need to think about those. As I'm not in the finance field, I can't help you with identifying what those might be. But in observational studies, you always have to worry about measured and unmeasured confounders. Are there attributes of stocks that are associated with high volatility that might also tend to draw in Robinhood users (or, conversely, drive them away)? Such confounders need to be included in the analysis to adjust for their effects.
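
            Purely as a sketch of what an adjusted analysis on the continuous measure might look like: the covariates below are simply the other firm-level variables in the posted -dataex- example, used as placeholders. Whether they are the right confounders is a subject-matter question this code does not answer. The names slope_robin, slope_vol and first_obs are hypothetical, and the covariates are assumed to be constant within firm, as they are in the posted sample.

            ```stata
            * Sketch: firm-level regression with the usage increase kept continuous.
            * slope_vol, slope_robin and first_obs are assumed/hypothetical names;
            * the covariates are placeholders taken from the example dataset.
            bysort TICKER (date): gen byte first_obs = (_n == 1)   // one row per firm

            regress slope_vol slope_robin Asset_growth Long_term_debt Size ///
                if first_obs, vce(robust)
            ```

            The coefficient on slope_robin then uses all firms, not only deciles 1 and 10.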

            Finally, always bear in mind that your hypothesis simply may not be correct.
            Last edited by Clyde Schechter; 08 Jun 2022, 14:32.

            • #7
              Hi Clyde,

              Many thanks for all your help and information!

              With kind regards,
              Fabien
