  • Tagging the same line of observations

    I created the following subset of my dataset to illustrate my question.

    I have a variable, tag_organizations_year, which I created some days ago. It tags one observation in each Organizations and year group.
    I created it with the egen tag() function as follows:
    egen tag_organizations_year = tag(Organizations year)
    There are 16 distinct Organizations-year combinations, as the command (distinct Organizations year, joint) confirms.

    The number of 1s in tag_organizations_year also equals 16.

    My issue is that I now need to create another tag variable,
    tag_organization_year_Producttyp, which should tag one observation in each group defined by Organizations, year, and product_type.
    When creating the new tag, I want it to fall on the same lines that tag_organizations_year already tags. The problem is that I don't remember how the data were sorted when I created tag_organizations_year, so I can't re-sort the data into the same sequence before creating tag_organization_year_Producttyp. Since my dataset has many variables, it may also have been sorted by variables other than the three shown in this dataex example.


    . dataex Organizations year product_type tag_organizations_year tag_organization_year_Producttyp

    copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float Organizations int year str30 product_type byte(tag_organizations_year tag_organization_year_Producttyp)
    1 2003 "A" 0 0
    1 2003 "A" 1 1
    1 2003 "A" 0 0
    1 2003 "A" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 1
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 1 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2002 "B" 1 1
    2 2003 "B" 0 1
    2 2003 "B" 0 0
    2 2003 "B" 0 0
    2 2003 "B" 1 0
    2 2003 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 1 1
    2 2005 "C" 1 1
    3 2001 "A" 0 1
    3 2001 "A" 0 0
    3 2001 "A" 1 0
    3 2002 "A" 1 1
    3 2003 "A" 1 1
    3 2004 "A" 0 0
    3 2004 "A" 0 1
    3 2004 "A" 0 0
    3 2004 "A" 1 0
    3 2005 "C" 0 1
    3 2005 "A" 1 1
    4 2002 "A" 1 1
    4 2003 "A" 1 1
    4 2005 "C" 1 1
    4 2005 "A" 0 1
    5 2002 "B" 0 0
    5 2002 "B" 0 1
    5 2002 "A" 1 1
    6 2001 "A" 1 0
    6 2001 "A" 0 1
    6 2001 "C" 0 1
    6 2001 "A" 0 0
    6 2001 "A" 0 0
    6 2001 "B" 0 1
    end


    The dataex output above shows what tag_organizations_year and tag_organization_year_Producttyp currently look like.

    As you can see, the two variables do not tag the same lines. Of course, an Organizations-year group that contains more than one product_type will get additional tags. However, where product_type does not vary within an Organizations-year group, I want the tag to fall on the same line that tag_organizations_year tags.

    Is there a way to tag the same lines in both variables?

    It would have been easier if I had created tag_organization_year_Producttyp before tag_organizations_year.
    In that case, I would have used egen tag_organizations_year = tag(Organizations year) if tag_organization_year_Producttyp==1, which would have tagged the same lines in both variables.
    However, I have already generated tag_organizations_year and used it in many of my essential statistics. Therefore, when I create tag_organization_year_Producttyp, I want it to tag the same lines that tag_organizations_year tags.


    Put simply, I want to create tag_organization_year_Producttyp so that the following count is 0:
    count if tag_organization_year_Producttyp !=1 & tag_organizations_year ==1

    With my current variables, however, the count is 5:


    . count if tag_organization_year_Producttyp !=1 & tag_organizations_year ==1
    5
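
    One way to see which lines cause the nonzero count is to flag and list them. The following is only a minimal sketch: it uses four rows copied from the example above, and the variable name mismatch is made up here for illustration.

    ```stata
    clear
    input float Organizations int year str30 product_type byte(tag_organizations_year tag_organization_year_Producttyp)
    3 2001 "A" 0 1
    3 2001 "A" 0 0
    3 2001 "A" 1 0
    3 2002 "A" 1 1
    end

    * Flag rows tagged at the Organizations-year level but not at the
    * Organizations-year-product_type level (mismatch is a hypothetical name)
    gen byte mismatch = tag_organizations_year == 1 & tag_organization_year_Producttyp != 1

    * Show the offending rows, then count them
    list Organizations year product_type if mismatch, noobs
    count if mismatch
    ```

    In this four-row subset only the third row (3 2001 "A") is flagged, because its group's three-variable tag sits on a different line.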


    To make my question clearer, here is what I would have done had I created the three-variable tag before the two-variable tag, so that the same lines are tagged in both:
    egen tag_Firstorga_year_Producttyp = tag(Organizations year product_type)
    egen tag_Secondorga_year = tag(Organizations year) if tag_Firstorga_year_Producttyp==1
    Tabulating the results shows that the number of 1s in each tag variable equals the number of distinct groups (21 and 16, respectively), so the tagging is correct.

    tab1 tag_Firstorga_year_Producttyp tag_Secondorga_year

    -> tabulation of tag_Firstorga_year_Producttyp

    tag(Organiz |
    ations year |
    product_typ |
             e) |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         31       59.62       59.62
              1 |         21       40.38      100.00
    ------------+-----------------------------------
          Total |         52      100.00

    -> tabulation of tag_Secondorga_year

    tag(Organiz |
         ations |
          year) |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |         36       69.23       69.23
              1 |         16       30.77      100.00
    ------------+-----------------------------------
          Total |         52      100.00


    The following count is also 0:

    count if tag_Firstorga_year_Producttyp!=1 & tag_Secondorga_year==1
    0
    The dataex output for the new variables is as follows:


    . dataex Organizations year product_type tag_Secondorga_year tag_Firstorga_year_Producttyp

    copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float Organizations int year str30 product_type byte(tag_Secondorga_year tag_Firstorga_year_Producttyp)
    1 2003 "A" 1 1
    1 2003 "A" 0 0
    1 2003 "A" 0 0
    1 2003 "A" 0 0
    2 2001 "B" 1 1
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2002 "B" 1 1
    2 2003 "B" 1 1
    2 2003 "B" 0 0
    2 2003 "B" 0 0
    2 2003 "B" 0 0
    2 2003 "B" 0 0
    2 2004 "B" 1 1
    2 2004 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 0 0
    2 2005 "C" 1 1
    3 2001 "A" 1 1
    3 2001 "A" 0 0
    3 2001 "A" 0 0
    3 2002 "A" 1 1
    3 2003 "A" 1 1
    3 2004 "A" 1 1
    3 2004 "A" 0 0
    3 2004 "A" 0 0
    3 2004 "A" 0 0
    3 2005 "C" 1 1
    3 2005 "A" 0 1
    4 2002 "A" 1 1
    4 2003 "A" 1 1
    4 2005 "C" 1 1
    4 2005 "A" 0 1
    5 2002 "B" 1 1
    5 2002 "B" 0 0
    5 2002 "A" 0 1
    6 2001 "A" 1 1
    6 2001 "A" 0 0
    6 2001 "C" 0 1
    6 2001 "A" 0 0
    6 2001 "A" 0 0
    6 2001 "B" 0 1
    end
    The same Organizations-year lines are tagged in both variables. Of course, tag_Firstorga_year_Producttyp has more lines tagged, but that is because of the differences in product type, which is not my concern.


    So how can I get the same alignment of tags as between tag_Firstorga_year_Producttyp and tag_Secondorga_year, given that I created the two-variable tag (tag_organizations_year) before the three-variable one?


    I would appreciate your suggestions.
    Thank you in advance.

  • #2
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float Organizations int year str30 product_type byte(tag_organizations_year tag_organization_year_Producttyp)
    1 2003 "A" 0 0
    1 2003 "A" 1 1
    1 2003 "A" 0 0
    1 2003 "A" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 1
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 1 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2001 "B" 0 0
    2 2002 "B" 1 1
    2 2003 "B" 0 1
    2 2003 "B" 0 0
    2 2003 "B" 0 0
    2 2003 "B" 1 0
    2 2003 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 0 0
    2 2004 "B" 1 1
    2 2005 "C" 1 1
    3 2001 "A" 0 1
    3 2001 "A" 0 0
    3 2001 "A" 1 0
    3 2002 "A" 1 1
    3 2003 "A" 1 1
    3 2004 "A" 0 0
    3 2004 "A" 0 1
    3 2004 "A" 0 0
    3 2004 "A" 1 0
    3 2005 "C" 0 1
    3 2005 "A" 1 1
    4 2002 "A" 1 1
    4 2003 "A" 1 1
    4 2005 "C" 1 1
    4 2005 "A" 0 1
    5 2002 "B" 0 0
    5 2002 "B" 0 1
    5 2002 "A" 1 1
    6 2001 "A" 1 0
    6 2001 "A" 0 1
    6 2001 "C" 0 1
    6 2001 "A" 0 0
    6 2001 "A" 0 0
    6 2001 "B" 0 1
    end
    
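    * Discard the existing three-variable tag and rebuild it.
    drop tag_organization_year_Producttyp
    * Within each Organizations-year-product_type group, sort on the existing
    * two-variable tag so that a row with tag_organizations_year == 1 comes last,
    * then tag the last row of each group.
    by Organizations year product_type (tag_organizations_year), sort: ///
        gen tag_organization_year_Producttyp = (_n == _N)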
    Last edited by Clyde Schechter; 31 Mar 2023, 12:05.
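
    As a rough check of the approach in #2, the sketch below applies the same idea to six rows copied from the example data and then runs the count from #1. The variable name tag3 is made up here for illustration, and bysort is used as shorthand for by ..., sort.

    ```stata
    clear
    input float Organizations int year str30 product_type byte tag_organizations_year
    3 2005 "C" 0
    3 2005 "A" 1
    6 2001 "A" 1
    6 2001 "A" 0
    6 2001 "C" 0
    6 2001 "B" 0
    end

    * Sort each three-variable group so that an already-tagged row comes last,
    * then tag the last row of every group (tag3 is a hypothetical name)
    bysort Organizations year product_type (tag_organizations_year): ///
        gen byte tag3 = (_n == _N)

    * Every row tagged by tag_organizations_year should carry the new tag too
    count if tag3 != 1 & tag_organizations_year == 1
    assert r(N) == 0

    * One tag per distinct Organizations-year-product_type group (5 here)
    count if tag3 == 1
    ```

    Note that this re-sorts the data in memory. If the original row order matters, it is probably worth storing it first (for example with gen long obs_no = _n) so it can be restored with sort obs_no afterwards.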

    • #3
      Thank you very much, Clyde Schechter!
      This perfectly helped me. With kind regards,
