  • Scraping Unemployment Data at the Quarterly Level, for Stata Users

    People often get unemployment data at the yearly level, but quarterly data can be hard to come by when that is the level your analysis needs. Here is some Python code to scrape quarterly-level unemployment data for the OECD countries (and some others) from 2012 through 2019q4.
    Code:
    import pandas as pd
    import requests
    
    # URL of the CSV file with ISO country codes
    codes_url = "https://gist.githubusercontent.com/tadast/8827699/raw/f5cac3d42d16b78348610fc4ec301e9234f82821/countries_codes_and_coordinates.csv"
    
    # Read only the 3rd column from the CSV file
    column_name = "Alpha-3 code"
    codes_frame = pd.read_csv(codes_url, usecols=[column_name])
    
    # Strip the stray quotes and spaces around each code
    country_codes = [code.strip('" ') for code in codes_frame[column_name].tolist()]
    
    # Initialize an empty list to store one row per country and period
    data_list = []
    for country_code in country_codes:
        # One request per country for series LRHUTTTT.STSA.M, 2012-01 to 2019-12
        data_url = f"https://stats.oecd.org/SDMX-JSON/data/MEI/{country_code}.LRHUTTTT.STSA.M/all?startTime=2012-01&endTime=2019-12"
        try:
            response = requests.get(data_url, timeout=30)
            response.raise_for_status()
            data = response.json()
    
            time_periods = data["structure"]["dimensions"]["observation"][0]["values"]
            observations = data["dataSets"][0]["series"]["0:0:0:0"]["observations"]
            country_name = data["structure"]["dimensions"]["series"][0]["values"][0]["name"]
    
            # Observation keys are positions in time_periods, so look each one up;
            # zipping the two would misalign dates if a period is missing
            country_data = [
                {
                    "Date": time_periods[int(index)]["name"],
                    "Country": country_name,
                    "unemployment": values[0],
                }
                for index, values in observations.items()
            ]
    
            data_list.extend(country_data)
        except Exception as e:
            # Country codes the dataset does not cover end up here
            print(f"Error processing data for country code {country_code}: {e}")
            continue  # Skip to the next iteration if there's an error
    
    # Create a DataFrame
    df = pd.DataFrame(data_list)
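
One thing worth checking: the request key ends in `.M` and the time range is given in months, so the rows may come back monthly rather than quarterly. If that is what you get, here is a minimal sketch of averaging months into quarters with pandas. The rows below are made up, and the `"2012-01"` style date strings are an assumption; adjust the `format` argument to whatever labels the API actually returns.

```python
import pandas as pd

# Made-up monthly rows shaped like the scraped frame (not real OECD values)
monthly = pd.DataFrame({
    "Date": ["2012-01", "2012-02", "2012-03", "2012-04", "2012-05", "2012-06"],
    "Country": ["Testland"] * 6,
    "unemployment": [8.0, 8.2, 8.4, 8.1, 7.9, 8.0],
})

# Assumption: Date parses as year-month; map each month to its calendar quarter
monthly["quarter"] = pd.to_datetime(monthly["Date"], format="%Y-%m").dt.to_period("Q")

# Average the monthly rates within each country and quarter
quarterly = monthly.groupby(["Country", "quarter"], as_index=False)["unemployment"].mean()

# Plain strings such as "2012Q1" are easier to carry into other software
quarterly["quarter"] = quarterly["quarter"].astype(str)
print(quarterly)
```

Averaging is only one way to collapse months into a quarter; taking the last month of each quarter is another, depending on what your analysis calls for.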
    Trust me, scraping this was an absolute nightmare. SO MANY NESTED DICTIONARIES. So, if you work in econ and ever need quarterly data for a few nations, perhaps this will be a useful guide for you. We can also do this with quarterly GDP data and the other indicators the OECD keeps handy in its API.
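
For anyone untangling the nesting, here is a minimal sketch of the lookup on a hand-made payload. The payload is made up and only mimics the keys the code above reads, and `parse_series` is a hypothetical helper name, not part of any library. It shows that each observation key is a position in the list of time periods, so looking dates up by key keeps them aligned with their values even when a period is missing.

```python
def parse_series(payload, series_key="0:0:0:0"):
    """Return (period name, value) pairs from an SDMX-JSON-style payload."""
    periods = payload["structure"]["dimensions"]["observation"][0]["values"]
    observations = payload["dataSets"][0]["series"][series_key]["observations"]
    # Sort by numeric position so the rows come out in time order
    ordered = sorted(observations.items(), key=lambda item: int(item[0]))
    # Each key is the position of its period in `periods`
    return [(periods[int(index)]["name"], values[0]) for index, values in ordered]


# Made-up payload: three periods, with no observation for the second one
sample = {
    "structure": {"dimensions": {"observation": [{"values": [
        {"id": "2012-01", "name": "Jan-2012"},
        {"id": "2012-02", "name": "Feb-2012"},
        {"id": "2012-03", "name": "Mar-2012"},
    ]}]}},
    "dataSets": [{"series": {"0:0:0:0": {"observations": {
        "0": [8.0],
        "2": [8.4],
    }}}}],
}

rows = parse_series(sample)
print(rows)
```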