
  • ANN: wbopendata v18 — Celebrating 15 Years of World Bank Open Data Access from Stata

    Dear Statalist,

    I am pleased to announce the release of wbopendata v18, a major update to the Stata module for accessing the World Bank Open Data API.

    This release also marks 15 years since wbopendata was first released in February 2011 — almost as old as my daughters, which I can hardly believe. What started as a simple bridge between Stata and the newly launched World Bank Open Data Initiative has grown into a tool used by researchers, students, and policy analysts worldwide. I am grateful to everyone who has used it, reported bugs, and contributed over the years.

    This is the first announcement I am making on Statalist since v16 in July 2020. A lot has changed.

    What's New Since v16 (July 2020)


    By the numbers:

                          v16 (July 2020)    v18
    Indicators            ~16,000            29,323
    Data sources          ~45                71
    Countries/regions     ~260               296
    .ado files            6                  34
    Automated tests       0                  89
    Metadata format       89 .sthlp files    2 YAML files

    1. Discovery commands — browse the data catalog offline from Stata

    After an initial sync, you can search, browse, and explore the entire World Bank data catalog without a network connection:

    . wbopendata, search(learning+poverty)
    . wbopendata, info(SE.LPV.PRIM)
    . wbopendata, sources
    . wbopendata, alltopics

    Search supports multi-keyword queries, wildcards (NY.GDP.*), regex patterns, and filters by source, topic, or field. On Stata 16+, results return in under half a second after the first call.

    2. YAML metadata architecture

    The 89 per-indicator .sthlp help files have been replaced by two compact YAML files containing full metadata for all 29,323 indicators. This makes the package smaller, faster to update, and easier to maintain.

    3. Redesigned sync system

    . wbopendata, sync                  // safe preview (dry run)
    . wbopendata, sync detail           // detailed breakdown
    . wbopendata, sync replace          // apply changes
    . wbopendata, sync replace force    // force full re-download

    The sync command defaults to a safe dry-run preview. Nothing is changed unless you add replace, which acts as an explicit safety gate.

    4. Self-documenting datasets (char metadata)

    Every downloaded dataset now embeds provenance information as Stata char characteristics — indicator codes, query parameters, timestamps, and version — following the pattern established by Drukker (2006) in freduse. This metadata persists across save/use cycles. Suppress with nochar.
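
    A minimal sketch of how one might inspect the embedded provenance after a download. The indicator code SP.POP.TOTL and the clear and long options are illustrative choices, not taken from this post, and the names of the individual characteristics are not listed here, so the sketch simply prints whatever is stored:

    ```stata
    * Download one indicator in long format (indicator code is illustrative)
    wbopendata, indicator(SP.POP.TOTL) clear long

    * List every characteristic stored on the dataset and its variables
    char list

    * Check that the metadata survives a save/use cycle
    tempfile wb
    save `wb'
    use `wb', clear
    char list
    ```

    Calling char list with no arguments avoids having to guess characteristic names; once the names are known, a single entry can be listed by naming it explicitly.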

    5. Country context by default

    Downloads now automatically include region, income level, admin region, and lending type variables (8 additional variables). Suppress with nobasic.
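
    A short sketch for comparing the default download with a nobasic download. The indicator code and the clear and long options are illustrative assumptions; only nobasic comes from the description above:

    ```stata
    * Default request: the indicator plus the country context variables
    wbopendata, indicator(SP.POP.TOTL) clear long
    describe

    * Same request with the country context variables suppressed
    wbopendata, indicator(SP.POP.TOTL) clear long nobasic
    describe
    ```

    Comparing the two describe listings shows which variables were added by default.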

    6. Graph-ready metadata

    New linewrap(), maxlength(), and linewrapformat() options format indicator names and descriptions for use in graph titles and notes. New return values provide dynamic subtitles with country counts and average data year.
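
    The names of the new return values are not listed in this post. A minimal sketch for discovering them, assuming they are stored in r() and using an illustrative indicator code:

    ```stata
    * Run a download, then list whatever the command left in r()
    wbopendata, indicator(SP.POP.TOTL) clear long
    return list
    ```

    return list must be run immediately after the download, before another command overwrites the stored results.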

    7. Community bug fixes

    Thanks to @lucaslindoso, @daniel-klein, @ckrf, @randrescastaneda, and @zhaowill for bug reports and contributions that improved latest, country metadata, URL construction, and varlist handling.

    8. Quality assurance

    The package now includes 89 automated tests across 17 categories, including offline deterministic tests using CSV fixtures following Gould's (2001) certification methodology.

    Installation


    * From SSC (stable)
    ssc install wbopendata, replace

    * From GitHub (latest)
    net install wbopendata, from("https://raw.githubusercontent.com/jpazvd/wbopendata/main") replace
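
    To confirm which copy and version ended up on the ado-path after installing, a sketch using Stata's built-in commands (nothing here is specific to wbopendata):

    ```stata
    * Show the path of the installed wbopendata.ado and its version header lines
    which wbopendata

    * Show the installed package record, including where it was installed from
    ado describe wbopendata
    ```

    This is useful when both the SSC and GitHub versions have been installed at different times.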


    Acknowledgments

    I would like to thank Kit Baum for maintaining the SSC Archive and for uploading wbopendata over all these years. The SSC infrastructure has been essential for making user-written Stata packages accessible to the community, and Kit's sustained dedication to this service is deeply appreciated.

    Best,

    Joao Pedro Azevedo
    Last edited by JPAzevedo; 18 Feb 2026, 15:20.

  • #2
    I am running version 18.3.2, and this does not work for me: "Downloads now automatically include region, income level, admin region, and lending type variables (8 additional variables)."

    I get no metadata variables, whether I use the options such as "regions" or "basic" or not.

    I get only the indicators I requested, plus: year, countrycode, and countryname.

    Any advice welcome, thanks!



    • #3
      If you're coming across this post, FYI: The problem I reported has been fixed in 18.4. Thanks!
