Hello,

I have a panel of banks, each with their unique ids, from the years 1905 to 1910. The panel is unbalanced, so some banks may only have observations for a few consecutive years between 1905 and 1910 (e.g. 1907-1909).

Each bank reports losses each year, but some values in the variable "loss" are missing. For example, for Bank A in year 1906 "loss" = 123, but for the same bank, "loss" = . in year 1907.

I want to count how many banks have missing values for "loss" - this would mean checking, for each unique id_bank, whether there is at least one observation with "loss" = .

So far I have managed to create a dummy ("count") to count all the missing values of "loss", but this doesn't work since some banks may have missing values for more than one year.

I know how to create a variable that counts how many id_banks there are by counting either the first or the last occurrence of each id_bank, e.g.:

bysort Id_Bank: gen n_bank = _n == 1

which flags the first occurrence of each Id_Bank, but I can't use this together with the dummy "count" to create a new dummy = 1 when a bank has a missing value for "loss", because it will disregard cases where "loss" = . in an observation that is *not* the first one for that bank.

So my question is:

**How do I create a dummy that accounts for missing values of "loss" only once for each id_bank?**

Apologies if the question is less than clear, but I am not quite sure how else to express it. Thank you very much for any help with this!

Best wishes,

Beatrice

## Comment
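
One possible approach, offered as a minimal sketch rather than a definitive answer: flag every observation of a bank that has at least one missing "loss", then tag a single observation per bank so each bank is counted only once. The variable names `any_miss` and `one_per_bank` are hypothetical, and the sketch assumes the identifier is spelled `Id_Bank` as in the command above (Stata is case-sensitive, so adjust if the variable is actually `id_bank`).

```stata
* 1 for every observation of a bank with at least one missing "loss", 0 otherwise
bysort Id_Bank: egen any_miss = max(missing(loss))

* Tag exactly one observation per bank so each bank is counted once
egen one_per_bank = tag(Id_Bank)

* Number of banks with at least one missing value of "loss"
count if one_per_bank & any_miss
```

Because `any_miss` is constant within each bank, it does not matter which observation `tag()` picks, which avoids the problem of the missing value not falling on the first observation.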