  • Unicode decode error while using Data.getAsDict in sfi.Data

    Dear Stata Forum,

    While trying to import data from a Stata dataset into a Python dictionary with the Data.getAsDict method of the sfi module, I received the following error:

    Code:
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf3 in position 1: invalid continuation byte
    The code I am currently using is:

    Code:
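    python
    
    from sfi import Data
    import numpy as np
    import pandas as pd
    
    # var=None requests every variable in the dataset currently in memory;
    # Stata missing values are returned as NaN.
    dataraw = Data.getAsDict(None, valuelabel=False, missingval=np.nan)
    
    end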
    I am not very familiar with this topic, but I think the error may be related to the characters in the string variable of the Stata dataset. Indeed, the code runs fine with the auto dataset but triggers the error with my dataset. A sample of the dataset is shown below:

    Variable 1 [long]   Variable 2 [double]   Variable 3 [str12]
    20211231            45411111              NIF / NIPC
    20211231            45411112              NIF / NIPC
    20211231            45411113              NIF / NIPC

    Please note that if I limit the dataset to Variable 1 and Variable 2, no error is triggered.
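
    To illustrate what I suspect is happening, here is a minimal sketch in plain Python (outside Stata, no sfi involved). It assumes that some value in Variable 3 was stored in a legacy single-byte encoding such as Latin-1, where the byte 0xf3 stands for "ó". The string "Código" is hypothetical and is not taken from my dataset; I have not confirmed which observation actually holds the offending byte.

```python
# Hypothetical value: "Código" is NOT taken from the dataset. It only
# illustrates a string stored in a legacy single-byte encoding.
raw = "Código".encode("latin-1")  # b'C\xf3digo'; 0xf3 is "ó" in Latin-1

# Decoding those bytes as UTF-8 fails, because 0xf3 announces a
# multi-byte sequence and the next byte ("d") is not a continuation byte.
try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)

# The same bytes decode cleanly once the legacy encoding is named.
recovered = raw.decode("latin-1")
print(recovered)
```

    If this is indeed the cause, the string variable would need to be converted to UTF-8 before Data.getAsDict can read it.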

    I am currently using Stata 17 on Windows.

    How may I solve this issue?

    Please let me know if there is any other information I can provide that may help.

    Thanks in advance,
    Francisco

