  • Reshape?

    Hello,

    I'm trying to restructure a single variable. My data currently looks like this:

    Code:
    clear
    input str459 v1 float id
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/j.1540-5907.2008.00320.x" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12094" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12220" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00263.x" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/1540-5907.00037" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12312" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12315" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12290" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12318" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00293.x" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12001" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00267.x" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00284.x" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/ajps.12118" 1
    "JO - American Journal of Political Science" 2
    "UR - http://dx.doi.org/10.1111/1540-5907.00028" 1
    end

    Whereas I'd like to have one column for journal name and another column for URL (I don't care about id; I just created it in case it might be useful). This seems like a problem I might be able to resolve with -reshape-, but it's not immediately clear to me whether my data is currently wide, long, or something else...

    Does anyone have any thoughts?

  • #2
    I have a rather crude solution, but a solution nonetheless, so not to worry! Thanks, all.

    Code:
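    * Tag each observation by its prefix ("JO" or "UR")
    tempvar sub order
    gen `sub' = trim(substr(v1, 1, strpos(v1, " ")))
    * Record the original order so that JO/UR pairs stay aligned after sorting
    gen long `order' = _n
    bysort `sub' (`order'): gen bycount = _n
    
    gen journal = v1 if `sub'=="JO"
    gen link = v1 if `sub'=="UR"
    
    * JO observations now come first; pull each link up from the UR block
    qui summ bycount
    replace link = link[_n+`r(max)']
    
    * Drop the leftover UR observations, which are now empty
    drop if missing(journal)
    keep journal link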



    • #3
      Well, your initial instinct to do this with -reshape- is correct. Your data is almost in long layout. It's missing both an i() and a j() for -reshape- purposes! To get the i() you have to create some identifier (in the code below, I call it publication) for each JO/UR pair. Then for the j() part you have to separate out the UR or JO from the rest of the title. The -split- command is good for doing that, except that the hyphen that separates JO or UR from the rest can also appear elsewhere in v1, creating problems. So first, I replace that first hyphen (and only that one) with another character (#) and tell -split- to look for that.
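
      To see why only the first hyphen should be replaced, here is a minimal sketch. The second line of data, with a hyphen inside the title, is made up for illustration and does not appear in the original data; the variable names v2 and piece are likewise just for this example.

```stata
clear
input str60 v1
"JO - American Journal of Political Science"
"JO - A Made-Up Journal"
end

* Replace only the first hyphen; the final argument (1) caps the number of replacements
gen v2 = subinstr(v1, "-", "#", 1)

* Splitting on "#" now yields exactly two pieces per observation,
* even though the second title contains a hyphen of its own
split v2, parse("#") gen(piece)
replace piece1 = trim(piece1)
replace piece2 = trim(piece2)
list piece1 piece2
```

      Splitting directly on "-" would instead cut the made-up title into extra pieces.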

      Putting it all together:
      Code:
      clear
      input str459 v1 float id
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/j.1540-5907.2008.00320.x" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12094" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12220" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00263.x" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/1540-5907.00037" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12312" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12315" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12290" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12318" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00293.x" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12001" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00267.x" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/j.1540-5907.2007.00284.x" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/ajps.12118" 1
      "JO - American Journal of Political Science" 2
      "UR - http://dx.doi.org/10.1111/1540-5907.00028" 1
      end
       
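      * Replace only the first hyphen (the JO/UR separator) with "#"
      replace v1 = subinstr(v1, "-", "#", 1)
      split v1, parse("#") gen(piece)
      rename piece1 jo_ur
      rename piece2 title_
      replace jo_ur = trim(itrim(jo_ur))
      * Remove the space left over after the separator
      replace title_ = trim(title_)
      drop v1
      
      * Each consecutive JO/UR pair is one publication: this is the i() variable
      gen long publication = ceil(_n/2)
      reshape wide title_ id, i(publication) j(jo_ur) string
      rename title_JO journal
      rename title_UR link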
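
      The line that creates publication with ceil(_n/2) assumes the observations strictly alternate between JO and UR. A minimal sketch of a check for that assumption, on a two-publication toy data set, might look like this. The check is an addition for illustration and is not part of the code above.

```stata
clear
input str60 v1
"JO - American Journal of Political Science"
"UR - http://dx.doi.org/10.1111/ajps.12094"
"JO - American Journal of Political Science"
"UR - http://dx.doi.org/10.1111/ajps.12220"
end

gen jo_ur = substr(v1, 1, 2)
gen long publication = ceil(_n/2)

* Each publication must hold exactly one JO and one UR observation;
* assert stops with an error if any pair is incomplete or doubled up
bysort publication (jo_ur): assert _N == 2 & jo_ur[1] == "JO" & jo_ur[2] == "UR"
```

      If the assertion fails, -reshape- would either fail too or pair a journal with the wrong link, so it is worth running before reshaping.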

      As an aside, when talking about Stata, don't refer to rows and columns. Stata is not a spreadsheet: it has observations and variables, not rows and columns. I make this somewhat pedantic point because to become a successful Stata user you have to stop thinking in spreadsheet terms when you use it. Your habits and instincts acquired from using spreadsheets will seldom be helpful and they will frequently lead you in the wrong direction with Stata. To help your mind keep the distinction between a spreadsheet and a Stata data set vivid, it is best to drop the row/column terminology when speaking of Stata.



      • #4
        Thank you, Clyde. This is a much neater solution than mine. It struck me as an odd form of data, not really long or wide.

        Thanks too for your comments on lexicon; I like your reasoning and it makes good sense to me. I usually do talk about variables and observations, but will be more conscientious about it moving forward.
