  • Getting a value of the string variable from a given point in time

    Dear Stata-users,

    I am struggling to create a new variable. I work with panel data on a few firms, each identified by a firm id. For every id at each point in time t, I have a string variable name. Unfortunately, the names recorded for the same id differ slightly across points in time. E.g., for id=1 I have "companyA ltd" for 01.10.2016 and "company A ltd London" for 31.12.2021. I would like a column that holds, for every id, the name from 31.10.2021, so that the name is the same in all observations of that id. I have no idea how to do this, and my searching has not been fruitful.

    Please help me.

  • #2
    As you do not show example data, I have made a toy data set out of the StataCorp -grunfeld- file that illustrates the approach. Suppose we want to use the name as reported in 1945 as the uniform name:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte company int year str8 name
    1 1935 "abc co"  
    1 1936 "abc co"  
    1 1937 "abc corp"
    1 1938 "abc corp"
    1 1939 "abc co"  
    1 1940 "abc co"  
    1 1941 "abc corp"
    1 1942 "abc co"  
    1 1943 "abc co"  
    1 1944 "abc co"  
    1 1945 "abc corp"
    1 1946 "abc co"  
    1 1947 "abc co"  
    1 1948 "abc co"  
    1 1949 "abc co"  
    1 1950 "abc co"  
    1 1951 "abc corp"
    1 1952 "abc co"  
    1 1953 "abc co"  
    1 1954 "abc co"  
    2 1935 "def co"  
    2 1936 "def co"  
    2 1937 "def corp"
    2 1938 "def co"  
    2 1939 "def co"  
    2 1940 "def co"  
    2 1941 "def corp"
    2 1942 "def co"  
    2 1943 "def co"  
    2 1944 "def co"  
    2 1945 "def co"  
    2 1946 "def corp"
    2 1947 "def co"  
    2 1948 "def co"  
    2 1949 "def co"  
    2 1950 "def co"  
    2 1951 "def corp"
    2 1952 "def co"  
    2 1953 "def co"  
    2 1954 "def co"  
    end
    format %ty year
    
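    * Sort first so that obs_no matches the observation numbers used in the subscript below
    sort company year
    gen long obs_no = _n
    * Within each company, find the observation number of the 1945 record
    by company: egen long obs_no_for_1945 = max(cond(year == 1945, obs_no, .))
    gen uniform_name = name[obs_no_for_1945]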
    You can modify the code to adapt it to your actual situation.
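
As a rough sketch of such an adaptation, the same approach can be applied to a daily date instead of a year. The variable names id, date_str, and ref_obs, the second firm, and the choice of 31.12.2021 as the reference date are all assumptions made for illustration. They are not taken from the original poster's actual data, which was not shown.

```stata
* Hypothetical toy data: id, date_str and the firm 2 names are invented for illustration
clear
input byte id str10 date_str str22 name
1 "01.10.2016" "companyA ltd"
1 "31.12.2021" "company A ltd London"
2 "01.10.2016" "companyB"
2 "31.12.2021" "company B plc"
end

* Convert the day.month.year string to a Stata daily date
gen long date = daily(date_str, "DMY")
format date %td

* Same logic as above: locate the reference observation, then copy its name
sort id date
gen long obs_no = _n
by id: egen long ref_obs = max(cond(date == td(31dec2021), obs_no, .))
gen uniform_name = name[ref_obs]
```

If an id has no observation on the reference date, ref_obs is missing and uniform_name is left empty for that id, so it may be worth checking for empty results before relying on the new variable.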

    In the future, when asking for help with code, always show example data, and always use the -dataex- command to do that, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    • #3
      You could also go

      Code:
      gen wanted = cond(year == 1945, name, "")
      bysort company (wanted) : replace wanted = wanted[_N]
      This works because cond() returns a string result here, and sorting on a string variable places empty strings before non-empty ones, so the 1945 name ends up in the last observation of each company.
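
The sort-based approach relies on each company having exactly one observation in the reference year. A minimal sketch of how that could be checked, using a shortened version of the toy data (the helper variable n_ref is a name introduced here for illustration):

```stata
* Shortened toy data in the same layout as the example in #2
clear
input byte company int year str8 name
1 1944 "abc co"
1 1945 "abc corp"
1 1946 "abc co"
2 1944 "def corp"
2 1945 "def co"
2 1946 "def corp"
end

* Each company should contribute exactly one 1945 observation
bysort company: egen byte n_ref = total(year == 1945)
assert n_ref == 1

gen wanted = cond(year == 1945, name, "")
bysort company (wanted): replace wanted = wanted[_N]

* After the replace, wanted should be non-empty and constant within company
bysort company (wanted): assert wanted[1] == wanted[_N] & wanted != ""
sort company year
```

With two or more 1945 observations carrying different names, the result would be whichever name sorts last alphabetically, which is why the first assert is there.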
      Last edited by Nick Cox; 14 Jun 2022, 09:59.



      • #4
        Thank you so much Clyde and Nick for your kind help!
