  • Small sample size - regression

    Hi, I have a small sample size. My outcome variable is a proportion, and it is skewed, as are some of my independent variables.
    I have far too many independent variables. I have 15 variables of interest, four of which are novel and of particular interest to test. I also have around 15 other variables with a theoretical association with the outcome. I need to decide how to choose among them (of the 30, I probably want fewer than 10 in the model).
    I keep testing different models and variables become significant and non-significant with each change. I have no idea how to choose the best one and it feels like if I did pick one, I would be picking it out of the sky and just making it up! Is there any way of making a justified decision?
    I want to report odds ratios and was thinking either fractional or beta regression.
    Last edited by Jane Smith; 27 Sep 2018, 02:54.

  • #2
    I think independent variables should be chosen first and foremost on theoretical grounds. In other words, you have to make an argument as to why it is important to include each variable. I do not think that simply including the independent variables with the largest beta coefficients or the lowest p-values is a good approach when constructing a model.

    With that being said, some researchers choose independent variables based on correlation to the dependent variable or largest standardized coefficient.
    If you choose to take this path, then try the command "corr":

    Code:
    help corr
    
    * depv = outcome; indpv1-indpv3 = candidate predictors (placeholder names)
    corr depv indpv1 indpv2 indpv3

    and look at which independent variables have the highest correlations with the dependent variable.
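
    Since the outcome and some predictors are skewed, it may also be worth looking at pairwise and rank-based correlations. This is only a rough sketch: depv and indpv1-indpv3 are placeholder names, not variables from the actual data set.

    ```stata
    * Placeholder names: depv = outcome, indpv1-indpv3 = candidate predictors
    * Pairwise Pearson correlations with p-values and the number of observations used
    pwcorr depv indpv1 indpv2 indpv3, sig obs

    * Rank-based (Spearman) correlations, which are less sensitive to skewness
    spearman depv indpv1 indpv2 indpv3, stats(rho p)
    ```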

    Here is a link that can help as well:

    http://blog.minitab.com/blog/adventu...ression-models

    Another way is to look at standardized coefficients by running a regression and adding the beta option at the end.

    Code:
    reg depv indpv1 indpv2, beta

    Also look at listcoef:

    Code:
    search listcoef
    
    help listcoef

    I also found this presentation by one of our regular contributors, Maarten Buis, which can help you with having a proportion as a dependent variable:

    http://www.maartenbuis.nl/presentations/UKsug06.pdf
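
    For the two models mentioned in the original question, a minimal sketch could look like the following. The names prop and x1-x3 are placeholders for the outcome proportion and whichever predictors end up being chosen; fracreg and betareg assume Stata 14 or newer.

    ```stata
    * Placeholder names: prop = outcome proportion, x1-x3 = chosen predictors
    * Fractional logit; the or option reports exponentiated coefficients (odds ratios)
    fracreg logit prop x1 x2 x3, or

    * Quasi-likelihood alternative that also runs in older versions of Stata
    glm prop x1 x2 x3, family(binomial) link(logit) vce(robust) eform

    * Beta regression; requires prop to lie strictly between 0 and 1
    betareg prop x1 x2 x3
    ```

    If the proportion contains exact zeros or ones, betareg will not accept those observations, so the fractional logit is probably the more convenient starting point in that case.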

    Best,
    Last edited by Paolo Velasquez; 27 Sep 2018, 03:24.


    • #3
      Jane:
      As an aside to Paolo's helpful advice, your model should first give a fair and true view of the data-generating process. Explore the literature in your research field and see what other researchers did in the past when presented with the same research topic.
      Kind regards,
      Carlo
      (Stata 19.0)
