
  • Research design question regarding impact of leadership on business performance during crisis (covid-19)

    Hello Everyone,

    I am currently conducting the analysis for my master's thesis. My topic is the influence of leadership attributes, as defined by the GLOBE database, on the business performance of hotels during a crisis period. Since my DV data are at the company level and the GLOBE leadership dimensions are at the country level, I am planning to conduct a multilevel analysis, but before doing so I would welcome some input on my chosen design. My supervisor is unwilling to spar and discuss potential variables, so I was hoping some more experienced researchers here would be willing to voice their opinion, as it is all still relatively new to me (yet I am excited to learn and become better at it).

    My design is as follows:
    IV: Leadership dimensions (Charismatic/Value-based, Team-Oriented, and Self-protective, as defined and scored by GLOBE).
    DV: Business performance (measured using revenue, EBIT, Current ratio)
    Control:
    • Size (Country GDP)
    • Market competition (I calculated HHI)
    • Economic Condition (calculated using principal component analysis for GDP growth, unemployment rate, inflation rate)
    • Political stability index
    • Hofstede's cultural dimensions
    • GLOBE leadership attributes (I want to control for the attributes not in focus to isolate effects)
    Time period:
    • Pre-crisis 2015-2019
    • Crisis 2020-2022
    Due to data availability, 23 countries with complete values for all variables are included in my study. My main question is whether anyone sees any obstacles with my proposed design, and whether my chosen controls seem appropriate.

    Thanks for taking the time to read this.

    Best regards,
    Jesse Nooijen
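
For readers unfamiliar with the HHI control mentioned in the post, here is a minimal sketch of the standard Herfindahl-Hirschman Index: the sum of squared market shares, with shares in percent so the index runs from near 0 to 10,000. The function name `hhi` and the revenue figures are hypothetical and are not taken from the poster's data; how the poster actually defined the market (for example per country-year) is not stated.

```python
def hhi(revenues):
    """Herfindahl-Hirschman Index on the 0-10,000 scale.

    revenues: firm revenues within one market (assumed here to be
    one country-year; that market definition is an assumption).
    """
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    # Share of each firm in percent, squared, then summed
    return sum((100.0 * r / total) ** 2 for r in revenues)


# Hypothetical revenues of three hotel firms in one country-year
example_revenues = [50.0, 30.0, 20.0]
print(hhi(example_revenues))  # 3800.0
```

A single firm holding the whole market gives 10,000, while many equally small firms push the index toward 0.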

  • #2
    Jesse:
    1) DV: I guess that you used PCA for creating one regressand from a set of three DVs;
    2) the correctness of your predictors and controls depends on the literature in your research field (that I'm not familiar with);
    3) as far as the -Crisis- variable is concerned, a 0/1 categorical variable (see -fvvarlist- notation for further details) is the way to go;
    4) from your description, I assume that you're not planning a DID (otherwise, you'd need a control group);
    5) the only issue that I can see so far is the complete case analysis, as you run the risk of performing a regression on a sample that may differ from the original one.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo,

      thanks for your reply.

      1. I actually plan to run three separate regressions to see the influence on the respective measures. I did plan to use PCA for my control variables, but after some trial and error I could not get it to work. Further investigation led me to believe that my unemployment rate variable had too many missing values, so I proceeded with two separate controls for GDP growth and inflation rate rather than a PCA of the three.

      2. I struggled to determine which controls to include, but I feel I have a solid list now and do not want to overcomplicate my model. So I will most likely keep Country GDP, Company Size (total assets), Market Competition, GDP Growth, inflation rate, political stability index, and the GLOBE leadership dimensions.

      3. I planned to create two distinct periods and use the non-crisis period to determine expected outcomes in terms of revenue, profitability and current ratio for the years 2020, 2021 and 2022, and then compare them with the actual values during the crisis to see the differences. Reading up on the 0/1 categorical variable, my interpretation is that it allows me to do the same, as it takes two values, crisis 0 and non-crisis 1, for the periods I assign. Do I read that correctly?

      4. You are correct that I did not plan a DiD, as I do not have a control group. I plan to use time series to determine the expected values in the crisis years and then run a comparison as I explained above. From the literature I gathered that this is a solid approach to creating a counterfactual and comparing, but I am not sure, so please correct me if you believe otherwise.

      5. I wanted to refrain from using incomplete data, so I have reduced my country sample to 23 countries that have data on all my variables for all years. I am not sure I follow the point entirely, as I was under the assumption that avoiding listwise deletion would benefit my study, but again please correct me if I am mistaken.


      Thanks once again for taking the time to answer my questions, I really appreciate it.
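
On the 0/1 coding asked about in point 3, a minimal sketch may help. It assumes the common convention of coding the crisis years (2020-2022) as 1 and the pre-crisis years (2015-2019) as 0; either direction works as long as it is interpreted consistently. The revenue figures and the names `crisis_dummy` and `revenue_by_year` are hypothetical. In a regression containing only a constant and this dummy, the dummy's coefficient equals the difference in mean outcome between the two periods, which is what the last lines compute.

```python
CRISIS_YEARS = range(2020, 2023)  # 2020, 2021, 2022


def crisis_dummy(year):
    """Return 1 for crisis years, 0 otherwise (assumed coding convention)."""
    return 1 if year in CRISIS_YEARS else 0


# Hypothetical revenue (million EUR) for one firm, 2015-2022
revenue_by_year = {
    2015: 100.0, 2016: 110.0, 2017: 105.0, 2018: 115.0, 2019: 120.0,
    2020: 60.0, 2021: 80.0, 2022: 100.0,
}

pre = [v for y, v in revenue_by_year.items() if crisis_dummy(y) == 0]
crisis = [v for y, v in revenue_by_year.items() if crisis_dummy(y) == 1]

# Difference in period means = coefficient on the dummy in a
# regression of revenue on a constant and the dummy alone
mean_difference = sum(crisis) / len(crisis) - sum(pre) / len(pre)
print(mean_difference)  # -30.0
```

Note that this compares period averages; it is not the same as the forecast-based counterfactual described in point 4, which extrapolates the pre-crisis trend.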



      • #4
        Jesse:
        thanks for clarifying.
        My two cents on two points of your list:
        1) I'd thoroughly discuss with your supervisor your choice of running three separate regressions. Usually, when the coefficients of different regressions are compared, it is the right-hand side that changes.
        5) Let me try to make my previous point clearer. Usually, there are informative reasons why some observations (oftentimes sharing some characteristics) have missing values. When detected, the mechanism (and the pattern) underlying the missingness should be diagnosed. Rarely, values are missing completely at random (put differently, the resulting sample is a random sub-sample of the original one; it's simply less efficient because you lost observations for your analysis due to listwise deletion), but (much) more often your unobserved data are missing at random or not at random (and the way they should be managed is increasingly tricky); see -mi- entries in the Stata .pdf manual (and related references). That said, if you decide to go with complete case analysis, your justification for this approach should be, methodologically speaking, "discussantproof". I'd bring this issue to your supervisor's attention too, just to avoid painful issues when the runway is in sight.
        Kind regards,
        Carlo
        (Stata 19.0)
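
One informal first look at the missingness mechanism described in point 5 is to compare an observed variable between rows where another variable is missing and rows where it is observed. The sketch below does this with hypothetical firm records; the field names `assets` and `ebit` and the helper `mean_by_missingness` are assumptions for illustration. A clear gap between the two group means speaks against missing completely at random, but it cannot by itself separate missing at random from missing not at random.

```python
from statistics import mean

# Hypothetical firm records: total assets (million EUR), EBIT (None = missing)
firms = [
    {"assets": 2.0, "ebit": None},
    {"assets": 3.0, "ebit": None},
    {"assets": 4.0, "ebit": None},
    {"assets": 10.0, "ebit": 1.2},
    {"assets": 12.0, "ebit": 0.8},
    {"assets": 14.0, "ebit": 2.1},
    {"assets": 20.0, "ebit": 3.5},
]


def mean_by_missingness(rows, target="ebit", by="assets"):
    """Mean of `by` for rows where `target` is missing vs. observed."""
    missing = [r[by] for r in rows if r[target] is None]
    observed = [r[by] for r in rows if r[target] is not None]
    return mean(missing), mean(observed)


mean_missing, mean_observed = mean_by_missingness(firms)
print(mean_missing, mean_observed)  # 3.0 14.0
```

In this made-up example the firms without EBIT are much smaller on average, so dropping them would change the composition of the sample.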



        • #5
          I see. Sadly, I don't have much conversation with my supervisor, which is why I reached out here. He is not open to sparring about or discussing potential approaches, as that is "up to me". I'm not here to put him in a bad light though, just trying to gain some more insights from experienced researchers like you. What would your view be on running two separate regressions instead of three, focusing on revenue and current ratio (important in a crisis), and seeing how results might differ?


          I have been looking at missing at random and not at random, and since my variable Total Assets appears to have quite a few missing values, I'll look into how to proceed from here, as I do not want to reduce my sample by much more. Again, I can't really discuss this with my supervisor, sadly, but I will look into discussantproof if I cannot work my way around the missing data issues and have to reduce further.

          I appreciate your quick reply once again!



          • #6
            Jesse:
            probably two regressions with two different dependent variables are easier to justify than three. Again, much depends on the tribal rules in your research field.
            As an aside, when I wrote "discussantproof" I did not mean any source you should delve into, but simply that your research hypotheses/assumptions should be robustly justified, so that they can resist a discussant's possible criticisms.
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              I understand now, thanks for clarifying. After running tests I decided to create a composite instead, consisting of Revenue and Current ratio, as my EBIT data appeared to be MNAR and the missing values are correlated with my other predictors. Furthermore, when I tried to regress the missingness of EBIT on the other predictors, the model did not converge. While I think a composite of the three separate measures would have been stronger, I am leaving EBIT out due to data availability and complexity.

              I have attached an image of my EBIT plot below, which suggests that smaller companies or certain types of firms are not reporting EBIT. Total Assets is displayed as a mean, as I had to collapse my data when multiple companies operated under the same name in the same country in the same year. But now I am not sure how I should proceed in creating my composite, given that my data for Revenue and Current ratio are MNAR.

              [Attached image: stata.png]


              Last edited by Jesse Nooijen; 07 Nov 2023, 12:03.



              • #8
                Jesse:
                one of my favourite papers on MNAR management is "van Buuren S, Boshuizen HC, Knook DL. Multiple imputation of missing blood pressure covariates in survival analysis. Stat Med. 1999 Mar 30;18(6):681-94." Take a look at their methods.
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Thanks Carlo, I will look into it!



                  • #10
                    I decided to run a Heckman model, but I had a lot of trouble getting it to work this afternoon. My statistics show that I am only missing total assets on 1.59% of my data set (7010 observations). Would it be appropriate to simply exclude these companies, given that the missing data appear to belong to companies that also score low on either revenue or current ratio?



                    • #11
                      Jesse:
                      the issue is more theoretical than practical.
                      While 1.59% missing values is clearly not an issue, the reason why companies with the characteristics you described present missing values might be worth investigating.
                      I would skip -heckman- here and follow one of the following two (opposite) roads:
                      1) managing MNAR;
                      2) being really convincing (think of a mean discussant when planning how to write this block of text in the Method section of your master's thesis) in justifying why you decided to get rid of all the observations with missing values and go with complete case analysis. Although the relationship with your supervisor does not seem to be that wonderful, I would drop her/him some lines explaining what you're planning to do with your data and asking for her/his opinion.

                      Kind regards,
                      Carlo
                      (Stata 19.0)



                      • #12
                        Thanks Carlo,

                        I have read up on it, and tomorrow I am going to pursue a censored regression model (Tobit), as most of my values are missing on the low end. If that does not work for me, I will pursue option 2.



                        • #13
                          Jesse:
                          but what is the censoring value in your dataset?
                          Kind regards,
                          Carlo
                          (Stata 19.0)



                          • #14
                            I expected, and found some evidence, that relatively many of the missing values lie below the 1 million euro threshold, but I have been having many issues, so I decided to drop those cases. After I messaged my supervisor about my intentions and the two options I might pursue, his reply to my request for his opinion was "This is a question you can deal with yourself." Hence, after a few hours of trying, I decided to go with complete case analysis, as I could not get it to work and I feel that dropping the 1.59% won't hurt the core of my research objective.

                            I am currently struggling with collinearity issues: when I run my regression to test for fe/re, it omits my IV CharismaticValuebased. Running a reg command and checking the VIF gives me the following:

                            commands:
                            reg Revenue CharismaticValuebased TeamOriented SelfProtective HumaneOriented Participative Autonomous HHI GDPG Inflation ln_GDP Political ln_Assets
                            vif

                            VIF results:

                            . vif

                                Variable |       VIF       1/VIF
                            -------------+----------------------
                            TeamOriented |     16.89    0.059217
                            Charismati~d |     12.47    0.080216
                            SelfProtec~e |      6.67    0.149928
                            HumaneOrie~d |      5.72    0.174754
                            Participat~e |      3.01    0.332722
                               Political |      2.61    0.382852
                                  ln_GDP |      2.54    0.394413
                                     HHI |      2.47    0.405499
                              Autonomous |      2.34    0.427405
                               Inflation |      1.48    0.675859
                               ln_Assets |      1.26    0.793884
                                    GDPG |      1.10    0.908914
                            -------------+----------------------
                                Mean VIF |      4.88
                            I understand that this indicates multicollinearity issues for 4 leadership dimensions. My approach has changed quite a few times this week, but right now I aim to run 12 regressions (3 IVs, 2 DVs) without interaction terms and the same with interaction terms. I might continue down this road and apply either LASSO or PCR, but I am not familiar with those, so I will read up on them. However, I am contemplating whether it would be best to exclude my GLOBE leadership dimensions as controls; that way I could run 4 regressions, keeping the IVs together and running them for my 2 DVs with and without interaction terms. What would your opinion be on this approach?
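
To make the VIF output above easier to read: the VIF of predictor j equals 1 / (1 - R²_j), where R²_j comes from regressing predictor j on all other predictors, so R²_j = 1 - 1/VIF_j. The sketch below applies that identity to the posted values. The cutoff of 5 is an assumed rule of thumb (10 is also commonly used), not something stated in the thread; with it, the same four leadership dimensions are flagged.

```python
# VIF values copied from the posted -vif- output
vif = {
    "TeamOriented": 16.89,
    "CharismaticValuebased": 12.47,
    "SelfProtective": 6.67,
    "HumaneOriented": 5.72,
    "Participative": 3.01,
    "Political": 2.61,
    "ln_GDP": 2.54,
    "HHI": 2.47,
    "Autonomous": 2.34,
    "Inflation": 1.48,
    "ln_Assets": 1.26,
    "GDPG": 1.10,
}

# Share of each predictor's variance explained by the other predictors
r_squared = {name: 1.0 - 1.0 / value for name, value in vif.items()}

THRESHOLD = 5.0  # assumed rule-of-thumb cutoff
flagged = [name for name, value in vif.items() if value > THRESHOLD]

for name in flagged:
    print(f"{name}: VIF={vif[name]:.2f}, R^2 on other predictors={r_squared[name]:.3f}")

mean_vif = sum(vif.values()) / len(vif)
```

For TeamOriented, roughly 94% of its variance is explained by the other regressors, which is why its coefficient is estimated so imprecisely.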



                            • #15
                              Jesse:
                              the 4 leadership dimensions are highly correlated, and unavoidably so.
                              You may want to use PCA to sum them up and use the resulting vector as your new predictor.
                              Kind regards,
                              Carlo
                              (Stata 19.0)
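
A rough sketch of what the PCA suggestion amounts to: standardize the correlated leadership scores, find the first principal component of their correlation matrix, and use each country's score on that component as a single summary variable. The version below uses plain power iteration rather than a statistics library, and the country scores and the names `standardize` and `first_component` are hypothetical; it is an illustration under those assumptions, not the procedure any particular package uses.

```python
import math


def standardize(col):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(col)
    m = sum(col) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
    return [(x - m) / sd for x in col]


def first_component(columns, iterations=500):
    """Loadings and scores of the first principal component of the
    correlation matrix, found by power iteration."""
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    v = [1.0 / math.sqrt(k)] * k  # starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(v[j] * z[j][r] for j in range(k)) for r in range(n)]
    return v, scores


# Hypothetical country-level scores for three leadership dimensions
charismatic = [5.8, 6.0, 5.5, 6.1, 5.7, 5.9]
team = [5.7, 5.9, 5.5, 6.0, 5.6, 5.8]
humane = [4.8, 5.0, 4.6, 5.2, 4.7, 5.1]

loadings, scores = first_component([charismatic, team, humane])
print([round(x, 3) for x in loadings])
```

The trade-off is interpretability: the component summarizes the shared variation, so the separate effects of the individual dimensions can no longer be distinguished.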

