  • Reclink variable not found

    Hi guys,

    I am trying to merge two datasets with reclink, but I always get the error message "variable experience_title not found", although the variable clearly is in the active dataset (the list command just before reclink works). Does anybody know what the mistake could be? If so, that would be great.


    _______________
    * Step 1: Load the datasets and ensure unique identifiers are added correctly
    import delimited using "test.csv", varnames(1) clear
    cap assert unique_id != .
    if _rc {
    gen unique_id = _n
    }
    save test_temp.dta, replace

    import delimited using "salary_data.csv", varnames(1) clear
    cap assert unique_id != .
    if _rc {
    gen unique_id = _n
    }
    save salary_data_temp.dta, replace


    * Load the test dataset
    use test_temp.dta, clear

    * Convert experience_title from strL to str and check the first few values
    gen experience_title_str = substr(experience_title, 1, 255)
    list experience_title_str in 1/5

    * Drop the original experience_title and rename the new one
    drop experience_title
    rename experience_title_str experience_title

    * Verify the new variable
    describe experience_title
    list experience_title in 1/5


    * Perform fuzzy matching for job titles using reclink
    reclink experience_title using salary_data_temp.dta, idmaster(unique_id) idusing(job_title_id) gen(match_score_title)

    * Filter the matched records
    keep if match_score_title > 0.8


    ____________________________


    I also get the error when I don't perform the transformation below (I only added it because ChatGPT recommended converting the variable to str):
    * Convert experience_title from strL to str and check the first few values
    gen experience_title_str = substr(experience_title, 1, 255)
    list experience_title_str in 1/5

    * Drop the original experience_title and rename the new one
    drop experience_title
    rename experience_title_str experience_title



  • #2
    The variable experience_title needs to be in both data sets, and it needs to be str, not strL, in both. -reclink- is an old command, written long before strLs were introduced in Stata. From the code you show, either it doesn't appear in the salary data set, or perhaps it does but you didn't change it from strL to str.
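
    A minimal sketch of that check, assuming the two temporary files saved in #1. It only reports whether experience_title exists in each file and, if it is stored as strL, converts it with the same substr() approach used in #1. If the variable is missing from the salary file, the matching column there would first have to be renamed to experience_title; the sketch does not guess which column that is.

    ```stata
    * Sketch only: check that experience_title exists and is str# (not strL) in each file
    foreach f in test_temp salary_data_temp {
        use `f'.dta, clear

        * Does the variable exist under exactly this name?
        capture confirm variable experience_title, exact
        if _rc {
            display as error "`f'.dta: experience_title not found"
            describe, simple          // show which variables the file does contain
            continue                  // nothing to convert in this file
        }

        * Convert strL to str#, keeping the first 255 characters as in #1
        local vtype : type experience_title
        if "`vtype'" == "strL" {
            generate experience_title_str = substr(experience_title, 1, 255)
            drop experience_title
            rename experience_title_str experience_title
            save `f'.dta, replace
        }

        describe experience_title
    }
    ```

    Once both files pass this check, the reclink call from #1 can be rerun with test_temp.dta in memory as the master data set.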



    • #3
      Originally posted by Clyde Schechter
      The variable experience_title needs to be in both data sets, and it needs to be str, not strL, in both. -reclink- is an old command, written long before strLs were introduced in Stata. From the code you show, either it doesn't appear in the salary data set, or perhaps it does but you didn't change it from strL to str.
      Thank you very much!
