Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Export to Excel applying format when the range of the cells change

    Dear all,
    I have a question related to how to export to Excel different datasets (specifically, resultssets) giving format to a set of cells when the range of these cells changes on every dataset. I am using Stata 13 (although my Uni has access to Stata 14) in Windows 10.

    To give a background of what I am doing, I have a comparable (sampling design-wise) dataset for 30 countries. For each of these, I am estimating proportions and associated standard errors for the same set of K=20 categorical variables. My goal is to create, for every country, an Excel sheet (I must stick to this program for this project) with 3 colums (category, estimated proportion, estimated standard error) and 20 consecutive blocks of rows (one block for each of the 20 variables). Each of these 20 blocks is made up of 1+q_k rows:
    1) 1 row indicating the label of the k-th variable (column 1) and nothing on columns 2 and 3
    2) q_k additional rows showing the name of the categories for this k-th variable (in the column 1) and its associated proportions (column 2) and standard errors (column 3).
    I am sticking to the resultssets approach from Newson, R. (2004) From datasets to resultssets in Stata (http://www.rogernewsonresources.org....4/overhed2.pdf) which, basically, consists on turning the dataset into the desired statistical table. So, for each of the 30 countries, I apply a nested loop through the 20 variables which provides me the variable-especific proportions and standard errors for each category using -parmby- (the q_k rows above). After this, I use -insob- (from SSC) to insert a new observation which will have the variable label at the beginning of this block (the first row of the block). Iterating this procedure I get each of 20 the blocks described above. After appending these, I export them with -export excel- with the sheet() option. A MWE of this is
    Code:
    local countries_list country_1 country_2 ... country_30
    local vars_list var_1 var_2 ... var_20
    foreach country of local countries_list {
        use "Dataset `country'.dta", clear
        foreach var of local vars_list {
            local var_label: variable label `var'
            preserve
            parmby "proportion `var'", label rename(parm category estimate proportion stderr se) norestore
            insob 1 1
            replace category="Variable `var': `var_label'" in 1
            
            save "`country' `var'.dta", replace
            restore
        }
    
        clear
        foreach var of local vars_list {
            append using "`country' `var'.dta"
        }
    
        export excel category proportion se using "Descriptives.xlsx", sheet("`country'") firstrow(variables) sheetmodify
    }
    As expected, the structure of the resultsset for each country looks like
    Click image for larger version

Name:	Untitled.png
Views:	2
Size:	24.4 KB
ID:	1453394





    My problem is that I want to apply, for each block of each of these country-specific sheets, italics to the first row (which contains the label of the k-th variable) and indentation the q_k rows (which shows the categories of the k-th variable) in the first column. The complication arises because the range of the cells that I want to format changes in a non-uniform way between countries. This is because the number of categories for a given variable (say, var_k) might be different for every country, given a country-specific codification (e.g. geographical regions differ between countries) or, even in the case when a variable have theorethically a unique codification, a given category might be absent on particular countries. Hence, row i on a given sheet (country) might not refer to the same thing (variable label or variable categories) on different sheets. I am pretty sure that specifying these cells manually is not the way to go.

    So, my question is: is there any way that I can apply the desired formatting on an Excel file (using export excel or other command) from within Stata that accomodates to this resultssets approach when the range of cells to be formatted change? More specifically, I was wondering if there is solution similar to what can be done in LaTeX; namely to apply between parmby and insob commands
    replace category="\hspace{3bp} " + category
    which would give me what I want. Is there a similar way to do this in Excel? I am aware that the -putexcel- command allows formatting cells but this solution turns out to be inneficient since the range of the cells I want to format changes, as decribed above.

    Thank you for your time
    Last edited by Juan del Pozo; 14 Jul 2018, 05:58.

  • #2
    Juan,

    as per https://www.stata.com/stata14/excel-cell-formatting/
    Stata 14 gives you much more control over how your results are formatted in Excel than you had in Stata 13. Before, you could insert string and numeric values into specific cells in worksheets. Now, you can format cells, format the contents of the cells, and even add graphs and formulas.
    See also here: https://www.stata.com/features/overv...import-export/

    Best, Sergiy

    Comment


    • #3
      Dear Segiy
      Thank you for the insight. Despite that the links you posted suggest that the within-Stata Excel-formatting only can be done specifiying the range of the observation of interest (as opposed to LaTeX -what I hoped- where one can format a cell by modifying its content as, e.g. adding "\textit{" and "}" to the cell we want to compile as italics). Given this, and based on a previous post (https://www.stata.com/statalist/arch.../msg00070.html) and the links you suggested, I arrived to a solution to my problem using Stata 14 (I realised that the syntax of -putexcel- is different in Stata 15 although it can be easily accomodated) by exploiting the fact that the rows I want to turn into italics begin with the word "Variable" and those which I want to indent do not. I added the following to my code above

      Code:
          quietly: export excel using "Descriptives.xlsx", sheet("`country") firstrow(variables) sheetmodify    
      
          putexcel set "Descriptives.xlsx", sheet("`country_name'") modify    
      
          quietly: gen n=_n
          quietly: gen type=word(category,1)
      
          quietly: levelsof n if type=="Variable", local(selected)
          foreach row of local selected {
              quietly: putexcel A`=`row'+1'=italic("on")
          }
          quietly: levelsof n if type!="Variable"), local(selected)
          foreach row of local selected {
              quietly: putexcel A`=`row'+1'=txtindent(2)
          }
      
          quietly: desc, f
          local obs=`=`r(N)'+1'
          foreach column of numlist 2/`r(k)' {
              quietly: excelcol `column'
              quietly: putexcel `r(column)'2:`r(column)'`obs'=nformat("number_d2")
          }
      The last block of code exploits the -excelcol- command (from SSC) to give numeric format to a whole column, which cannot be done with -export excel-. Also, I add 1 to the local row in the loop since my first line in the excel File is a the variable names line.

      Thank you again!
      Last edited by Juan del Pozo; 16 Jul 2018, 17:20. Reason: typo

      Comment

      Working...
      X