I came here from the thread about Stata and AI. My question is about the feasibility of using AI to help collate data in some circumstances.
The dataset I recently processed includes a variable for each client's country of birth, which was retrieved from an old database. The database allowed users to type the country of birth manually rather than select it from a drop-down menu, and this has led to many incorrect country names, for example Maori (should be New Zealand) and Wales (should be United Kingdom). The 5% of entries that were typed in manually appear random, with no rules to follow.
Because every country name needs to be recoded to a numeric code with a value label (e.g., Australia -> 1101, New Zealand -> 1201, Vietnam -> 5105, ...), those irregular entries cannot be coded as they stand.
Perhaps this goes beyond what Stata is designed for, but after reading the thread above I wondered whether AI could help by "guessing" which country each of these irregular entries actually refers to.
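To make the idea concrete, here is a minimal sketch of the kind of "guessing" I have in mind, written in Python rather than Stata purely for illustration. It is not AI in any real sense: it combines a hand-made alias table with approximate string matching from the standard library's difflib. The three codes are the ones quoted above; the alias table, the function names, and the 0.8 similarity cutoff are my own assumptions, and any guess would still need checking by hand.

```python
import difflib

# Codes quoted in the post; a real code frame would list every country.
COUNTRY_CODES = {
    "Australia": 1101,
    "New Zealand": 1201,
    "Vietnam": 5105,
}

# Hypothetical hand-made aliases for entries that are not misspellings
# and so cannot be recovered by string similarity alone.
ALIASES = {
    "maori": "New Zealand",
}


def guess_country(raw, cutoff=0.8):
    """Return the best-guess official country name, or None if nothing is close.

    The cutoff of 0.8 is an assumed starting point, not a recommended value.
    """
    key = raw.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    # Map lower-cased official names back to their original spelling.
    lookup = {name.lower(): name for name in COUNTRY_CODES}
    matches = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


def code_for(raw):
    """Return the numeric code for a raw entry, or None if it needs manual review."""
    name = guess_country(raw)
    return COUNTRY_CODES[name] if name is not None else None


if __name__ == "__main__":
    for entry in ["Maori", "Austalia", "Viet Nam", "New Zeland", "xyz"]:
        print(entry, "->", code_for(entry))
```

Entries that come back as None would still have to be reviewed manually, and a case like Wales would need both an alias and the United Kingdom code, which I have not listed here.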
Of course, this question may not be worth spending much time on. I don't know whether there is a smarter way to do this, and I would appreciate any guidance.