Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Converting data in matrix format (input-output table) to a Stata friendly format

    Hi everyone,

    I have input output table data (by countries and sectors for a specific year) in a matrix format that I would like to convert to a stata friendly format. I am sorry I can't share the original data but I will share a hypothetical example.

    The data comes in two separate .txt files, one for the labels and one for the data itself. When I import them on Stata they would look like this:

    Labels:
    v1 v2
    Country 1 Sector 1
    Country 1 Sector 2
    Country 2 Sector 1
    Country 2 Sector 2

    Data
    v1 v2 v3 v4
    346 156 819 154
    354 443 42 92
    547 466 518 149
    752 936 611 496

    This is eventually the matrix format of the data if we combine the data with the labels manually (just to give you an understanding of what it means)
    Country 1 Country 1 Country 2 Country 2
    Sector 1 Sector 2 Sector 1 Sector 2
    Country 1 Sector 1 346 156 819 154
    Country 1 Sector 2 354 443 42 92
    Country 2 Sector 1 547 466 518 149
    Country 2 Sector 2 752 936 611 496

    What I want to do is that I would like to convert this matrix to a Stata friendly format like the one below for example. I would be grateful if you could advise.
    country origin sector origin country destination sector destination value
    Country 1 Sector 1 Country 1 Sector 1 346
    Country 1 Sector 1 Country 1 Sector 2 156
    Country 1 Sector 1 Country 2 Sector 1 819
    Country 1 Sector 1 Country 2 Sector 2 154
    Country 1 Sector 2 Country 1 Sector 1 354
    Country 1 Sector 2 Country 1 Sector 2 443
    Country 1 Sector 2 Country 2 Sector 1 42
    Country 1 Sector 2 Country 2 Sector 2 92
    Country 2 Sector 1 Country 1 Sector 1 547
    Country 2 Sector 1 Country 1 Sector 2 466
    Country 2 Sector 1 Country 2 Sector 1 518
    Country 2 Sector 1 Country 2 Sector 2 149
    Country 2 Sector 2 Country 1 Sector 1 752
    Country 2 Sector 2 Country 1 Sector 2 936
    Country 2 Sector 2 Country 2 Sector 1 611
    Country 2 Sector 2 Country 2 Sector 2 496
    Many thanks in advance.

    Jala


  • #2
    Well, Stata variable names cannot contain blanks, nor line breaks. So the result you ask for literally is impossible. However, the following will prove, I trust, a workable alternative:
    Code:
    use LABELS, clear
    gen `c(obs_t)' index = _n
    tempfile holding
    save `holding'
    drop index
    rename v1 country_origin
    rename v2 sector_origin
    
    //  MERGE IN THE DATA
    merge 1:1 _n using DATA, assert(match) nogenerate
    
    //  CONVERT TO LONG LAYOUT
    reshape long v, i(country_origin sector_origin) j(index)
    
    //  PARSE INDEX INTO DESTINATION COUNTRY & SECTOR
    merge m:1 index using `holding', assert(match) nogenerate
    drop index
    
    rename v1 country_destination
    rename v2 sector_destination
    rename v value
    order value, last
    sort country_origin sector_origin country_destination sector_destination
    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment

    Working...
    X