Panzar and Rosse Model Code in Stata

roman garcia

Join Date: Nov 2018

Posts: 3
#1


25 Nov 2018, 04:52

Dear Forum members,

I am a beginner in Stata and am working on my thesis. I am trying to implement the following Panzar and Rosse model in Stata:

ln II = α + β ln AFR + γ ln PPE + δ ln PCE + ∑ ξj ln BSFj + η ln OI + error, where II is interest income, AFR the annual funding rate, PPE personnel expenses, PCE the price of capital expenditure, BSFj the bank-specific exogenous factors, and OI the ratio of other income to total assets.

I would really appreciate any tips or advice on how to implement this regression model, or a similar one, in Stata. I understand that I should 1) use the command regress, 2) label my data in Excel, including the independent variables of this equation (AFR, PPE, etc.), and 3) import the data file into Stata, but I am not really sure how.
Thank you
Roman
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

25 Nov 2018, 10:32

It is premature to discuss coding anything until you have brought your data into Stata. To do that, read -help import excel- and click on the link near the top of the page to the complete PDF documentation for that command. Read that chapter and you will learn how to bring data into Stata from Excel.
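As a rough sketch of that import step, the following builds a tiny made-up dataset, writes it to an Excel file, and reads it back, so it runs without any external file. The file name pr_example.xlsx, the variable names, and all values are hypothetical; a real workbook would be read with the same -import excel- line, assuming its first row holds the variable names.

```stata
* Sketch only: hypothetical data and file name, not the poster's real data.
clear
input bank year ii afr
1 2008 120.5 0.031
1 2009 131.2 0.028
2 2008  88.4 0.035
2 2009  90.1 0.033
end

* Write the toy data to Excel so there is something to import.
export excel using "pr_example.xlsx", firstrow(variables) replace

* The import itself: -firstrow- treats the first row as variable names,
* -clear- replaces whatever data are currently in memory.
import excel using "pr_example.xlsx", firstrow clear
describe
list, sepby(bank)
```

After the import, -describe- is a quick way to check that numeric columns did not come in as strings.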

Once you have a Stata data set, you can start working with your data. If you need additional help, post back with additional specific questions. Be sure, when asking for help with code, to show example data that is relevant to your question, and be sure to use the -dataex- command to do that. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

If you are new to Stata, before proceeding with a consequential project, it is best to invest some time in learning the basics of Stata. For that, open the PDF documentation (select PDF Documentation on the Help menu) and read the Getting Started [GS] and User's Guide [U] volumes. This is a rather lengthy read, but it will take you through the fundamentals of Stata syntax and introduce you to the most important commands that are used routinely in Stata data management and analysis. You will also learn the overall approach to and conception of data that underlies Stata. (It is very different from the way you think about data in a spreadsheet.) You won't remember everything you read, but by the time you are done, you will, in most situations, have a sense of which commands are likely to play a role in solving your problem, and if you don't recall the syntax or other details, you will be comfortable with referring back to the help files or the PDF documentation for that.
roman garcia

Join Date: Nov 2018

Posts: 3
#3

30 Nov 2018, 03:45

Dear Clyde,
Many thanks for the comprehensive answer. I managed to create a dataset, import it into Stata, and run a linear regression with my model. However, I have now read in the paper I am trying to replicate that I need to obtain maximum likelihood estimates for a pooled panel model of the regression above (ln II = α + β ln AFR + γ ln PPE + δ ln PCE + ∑ ξj ln BSFj + η ln OI + error).
Can you please help me understand the commands or menu items that I should use to do this? Do I need to declare the data to be panel data first? I know that I am supposed to use "ml model method progname eq [eq ...] [if] [in] [weight] [,model_options svy diparm_options]". Do you perhaps have some code that could serve as an example?
(I have already read the chapter you indicated, plus the Getting Started [GS] and User's Guide [U] volumes.)
Many thanks again for any kind of guidance. I am working full time, and it is not easy to teach myself in my free time.
Roman
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#4

30 Nov 2018, 06:11

There is no need to go through -ml- for this. That is a level of complexity that you should bypass and just use -regress-. The model you propose is a straightforward linear regression. The -generate- command will enable you to create the log-transformed variables that are in the equation. The -regress- command will then enable you to estimate the coefficients in the model. (-regress- uses least squares estimation rather than maximum likelihood, but in the case of a simple linear model like this one the two methods produce the same results, and least squares is so much easier!)
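To make that concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the variable names, the made-up coefficients, and the omission of the BSF terms. It creates the logs with -generate-, fits the model with -regress-, and then fits the same linear model by maximum likelihood with -glm- (Gaussian family, identity link) to show that the coefficient estimates agree.

```stata
* Sketch only: simulated, hypothetical data standing in for real bank data.
clear
set seed 12345
set obs 182
generate afr = 0.02  + 0.03*runiform()
generate ppe = 0.01  + 0.02*runiform()
generate pce = 0.10  + 0.40*runiform()
generate oi  = 0.005 + 0.02*runiform()
generate ii  = exp(1 + 0.5*ln(afr) + 0.2*ln(ppe) + 0.1*ln(pce) ///
                   + 0.05*ln(oi) + rnormal(0, 0.1))

* Log-transform every variable that enters the equation.
foreach v of varlist ii afr ppe pce oi {
    generate ln_`v' = ln(`v')
}

* Least squares estimates.
regress ln_ii ln_afr ln_ppe ln_pce ln_oi
scalar b_ols = _b[ln_afr]

* The same linear model fitted by maximum likelihood.
glm ln_ii ln_afr ln_ppe ln_pce ln_oi, family(gaussian) link(identity)
scalar b_ml = _b[ln_afr]

display "OLS coefficient on ln_afr: " b_ols
display "ML  coefficient on ln_afr: " b_ml
```

The point estimates match up to numerical precision; the reported standard errors can differ slightly because the two commands use different reference distributions and degrees-of-freedom conventions.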

Note: I cannot tell from anything in this thread so far if you have just one observation per person or if this is panel data. If it is panel data, then you need to use -xtset- with the variable that identifies people to declare panel data, and then use -xtreg- instead of -regress-. Given, however, that your model does not include a person-level error term, I'm guessing it is not panel data, so you can just stick with -regress- and skip -xtset-.
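If the data do turn out to be a panel, the declaration and estimation steps might look like the sketch below. The layout (26 panel units observed over 2008 to 2014), the variable names bank and year, and the simulated values are all assumptions made so the example runs on its own. The -mle- option of -xtreg- fits the random-effects model by maximum likelihood.

```stata
* Sketch only: simulated panel, hypothetical variable names.
clear
set seed 2018
set obs 26
generate bank = _n
generate u = rnormal(0, 0.3)            // unit-level effect, constant within bank
expand 7                                // 7 yearly rows per bank
bysort bank: generate year = 2007 + _n  // 2008, ..., 2014
generate ln_afr = rnormal()
generate ln_ppe = rnormal()
generate ln_ii  = 1 + 0.5*ln_afr + 0.2*ln_ppe + u + rnormal(0, 0.1)

* Declare the panel structure: panel identifier first, then the time variable.
xtset bank year

* Random-effects model by GLS, then by maximum likelihood.
xtreg ln_ii ln_afr ln_ppe, re
xtreg ln_ii ln_afr ln_ppe, mle
```

Which estimator (-fe-, -re-, or -mle-) is appropriate depends on the model in the paper being replicated; the sketch only shows the syntax.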
roman garcia

Join Date: Nov 2018

Posts: 3
#5

03 Dec 2018, 13:36

Many thanks again for your response. It was indeed very useful. I need to work on it a bit more, but I feel I have already improved a lot thanks to your messages. I think I have panel data, because I have 26 banks with 7 different variables for each bank from 2008 to 2014 (meaning 26 x 7 = 182 observations?), so I suppose that I need to use xtset and xtreg, as you mentioned. However, I am now facing a new challenge before starting my regressions. Some values are randomly missing for some years on some variables; I think this is what is called unbalanced panel data. I am not really sure how I should treat it. Should I delete the years where I don't have all the data? Should I just leave the missing data as blanks?
I read that in the case of randomly missing data, most Stata commands can be applied to unbalanced panels without causing inconsistency of the estimators. Are there any special Stata commands or a special procedure for unbalanced panels?
Thank you so much
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#6

03 Dec 2018, 14:15

There are some commands that will only work with balanced panel data, but -xtreg- does not have this limitation. And while there are sometimes things that have to be calculated differently in unbalanced data, Stata's commands all take that into account. So you can go ahead with -xtreg- and not worry about the unbalancedness of the data.
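A small sketch of what that looks like in practice, using a simulated panel with hypothetical variable names in which three bank-year rows are deliberately absent: -xtset- reports the panel as unbalanced, -xtdescribe- shows the participation pattern, and -xtreg- runs without any special handling.

```stata
* Sketch only: simulated panel with a few bank-years removed on purpose.
clear
set seed 2018
set obs 26
generate bank = _n
generate u = rnormal(0, 0.3)
expand 7
bysort bank: generate year = 2007 + _n
generate ln_afr = rnormal()
generate ln_ii  = 1 + 0.5*ln_afr + u + rnormal(0, 0.1)

* Remove three bank-year rows so the panel is no longer balanced.
drop if bank == 3 & year == 2010
drop if bank == 7 & inlist(year, 2008, 2009)

* -xtset- detects the imbalance and stores it in r(balanced).
xtset bank year
display "Panel is: `r(balanced)'"

* Pattern of which years each bank contributes.
xtdescribe

* Estimation proceeds as usual on the unbalanced panel.
xtreg ln_ii ln_afr, fe
```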

You should definitely not remove years where the data are missing for some banks. Nor should you remove banks where the data are missing for some years. I take it you have not yet actually brought your data into Stata--it is in some other type of data file. Whatever software you use for importing it to Stata, be it a native Stata command like -import excel- or a package like StatTransfer, the missing values will be appropriately handled for you, so you don't have to worry about "putting" anything there. A Stata missing value will already be there for you.

But the real issue here is why the missing data are missing. You refer to it as being "random." But I wonder if that is really true. That is seldom the case in the real world. There may be no obvious pattern or system to the missingness, but that does not qualify as true randomness. It is better described as "haphazard." It is not feasible to give a full discussion of the treatment of missing data in one of these posts. You can read https://statisticalhorizons.com/wp-c...-Materials.pdf to get an introduction to this area. (It gets a bit technical towards the end, and some of the slides are oriented towards SAS rather than Stata, but those parts are not important for present purposes.) You should make efforts to understand why the values that you are missing are missing. Sometimes data are suppressed because the values in question reflect poorly on the entity producing the data: that's not missing at random, that's very biased missingness, and even raises ethical questions. Sometimes data are missing because of events that interfered with the ability to gather the data, but those events themselves may have been related to the values that would have been observed. For example, a firm that is going broke might devote fewer resources to the production of comprehensive data on its financial performance--again biased data results. So one has to look into such matters.

Suffice it to say that -xtreg- will omit from the analysis any observation where any variable mentioned in the command is missing (but only those observations, not other observations from the same year or same bank). This is complete cases analysis. It is subject to biased results even when the data are missing at random (in the technical sense of the term), but not completely at random (in the technical sense of the term). I have the impression that this is the approach most commonly taken in financial work--I'm not sure why. I also have the impression that most financial data sets are really rather complete. If you really only have a very small number of missing observations, I wouldn't worry about it. If missing data is a pervasive problem, I would be very concerned about why and make sure to a) have a good understanding of why that happened, and b) make best efforts to analyze the data in ways that minimize resulting bias.
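The complete cases behavior can be checked directly. In this sketch (simulated data, hypothetical variable names), all 182 rows stay in the dataset, but three of them have a missing value in one regressor. -misstable summarize- counts the missing values, and e(N) after -xtreg- shows how many observations actually entered the estimation.

```stata
* Sketch only: simulated panel with missing values inserted on purpose.
clear
set seed 2018
set obs 26
generate bank = _n
generate u = rnormal(0, 0.3)
expand 7
bysort bank: generate year = 2007 + _n
generate ln_afr = rnormal()
generate ln_ppe = rnormal()
generate ln_ii  = 1 + 0.5*ln_afr + 0.2*ln_ppe + u + rnormal(0, 0.1)

* Set three values to missing; the rows themselves remain in the data.
replace ln_ppe = . if bank == 3 & year == 2010
replace ln_afr = . if bank == 7 & inlist(year, 2008, 2009)

* Count missing values per variable.
misstable summarize ln_ii ln_afr ln_ppe

xtset bank year
xtreg ln_ii ln_afr ln_ppe, fe

* Only rows with all three variables observed are used.
display "Rows in data: " _N "   Rows used in estimation: " e(N)
```

Only the three affected bank-year rows are excluded; the other years of banks 3 and 7 remain in the estimation sample.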