How to make arrays in Stata from HTML JavaScript arrays?

Lisa Spoelstra

Join Date: Jan 2018

Posts: 24
#1


29 May 2018, 07:50

Hi all!

I'm translating some JavaScript code from an HTML page into Stata. However, I'm new to Stata and I'm stuck on translating an array from JavaScript to Stata.
The JavaScript code:
var balloons = new Array(0.06, 0.066, 0.05, 0.0612, 0.0582, 0.0546, 0.0522, 0.0475);

What I did in Stata:
foreach i of balloons 0.06, 0.066, 0.05, 0.0612, 0.0582, 0.0546, 0.0522, 0.0475{
}

But this doesn't work. Can someone help me?

Thanks in advance!
Baptiste Ottino

Join Date: May 2018

Posts: 21
#2

30 May 2018, 00:54

Hello. What does the "balloons" array represent? Is it a series of observations for a single variable, or is it an observation, for different variables?
Comment
Lisa Spoelstra

Join Date: Jan 2018

Posts: 24
#3

30 May 2018, 02:08

Hello, it is a series of observations for a single variable.
Comment
Baptiste Ottino

Join Date: May 2018

Posts: 21
#4

30 May 2018, 02:58

Ok, first, you really need to show us a dataset example using -dataex-. Without one, I'll assume you already have other variables in your dataset, with the same number of observations (here, 8). Here's an example:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(a b)
1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
end

If you REALLY want to do what you want to do with code, you can do:

Code:

* Stores the values in a local macro, and tokenizes them into the local macros 1, 2, 3, etc.
local thevalues 0.06 0.066 0.05 0.0612 0.0582 0.0546 0.0522 0.0475
tokenize `thevalues'

* Generates balloons (initially missing), and for each observation fills it with the corresponding value
gen balloons = .
local N = _N
forvalues i = 1/`N' {
    replace balloons = ``i'' in `i'
}

Notice the difference in quotes in the -replace- line: `i' returns the value stored in local macro i, for example 1, and ``i'' returns the value stored in the macro called 1, in that case.
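To see the two levels of quoting in isolation, here is a minimal sketch. The macro name demo and its three values are made up for illustration, and the lines are assumed to be run together in one go (local macros do not survive between separately executed do-file selections).

```stata
* Hypothetical three-value list, used only for this illustration
local demo 0.06 0.066 0.05

* tokenize copies the values into the local macros 1, 2 and 3
tokenize `demo'

local i = 2

* Single quoting: shows the contents of local macro i
display "`i'"

* Double quoting: shows the contents of the local macro whose name is stored in i
display "``i''"
```

The inner `i' is expanded first, and the result is then treated as the name of another macro.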

I should add that I don't know what you're doing, but if it's just inputting a list of numbers by hand, you could just use

Code:

edit
Comment
Lisa Spoelstra

Join Date: Jan 2018

Posts: 24
#5

04 Jun 2018, 02:16

Thank you! Maybe I need to be clearer, because I can't share my data. I have a series of numbers that I need to store in a variable and then use in a formula inside a loop. Part of the code I have in JavaScript is: var y = new Array(0.06, 0.066, 0.05, 0.0612, 0.0582, 0.0546, 0.0522, 0.0475) and the same code in R is: y <- c(0.06, 0.066, 0.05, 0.0612, 0.0582, 0.0546, 0.0522, 0.0475). I have more of these variables; how do I code this in Stata?

Afterwards I need to loop over formulas. In JavaScript the code is:
for(i=1;i<=15;i++) {

x [i]=y [i-1]+z [i];

and code in R: for (i in 2:16){
x [i] <- y [i-1] + z [i]

How to code this properly in Stata?
Comment
Stephen Jenkins

Join Date: Apr 2014

Posts: 1432
#6

04 Jun 2018, 02:35

This observer still has no idea what you're trying to achieve, Lisa. Are x, y and z what Stata calls variables? And do they already exist in your dataset, but you wish to modify x on the basis of values of y and z? The answer appears affirmative from

Code:

and code in R: for (i in 2:16){ x [i] <- y [i-1] + z [i]

If that is so, then you may not require looping at all. Perhaps something like the following would suffice

Code:

replace x = y[_n-1] + z

Note the use of "indexing" (see manuals), and no loops at all.
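As a minimal sketch of that indexing, the following uses a small made-up dataset; the values of y and z are purely illustrative. It uses -generate- rather than -replace- on the assumption that x does not exist yet.

```stata
* Made-up data, purely for illustration
clear
input float(y z)
0.06  1
0.066 2
0.05  3
end

* In every row, x is y from the previous row plus z from the current row
generate x = y[_n-1] + z

* y[0] does not exist, so x is missing in the first observation
list
```

The missing value in the first observation is the Stata counterpart of the loops starting at the second element in the R and JavaScript versions.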

To make progress, I suggest you explain more clearly what you're trying to achieve, and the nature of the data (variables, observations) that you have to start with. Make up a little dataset if that helps, and share it with us. (Use CODE delimiters, as per FAQ)
Comment
Lisa Spoelstra

Join Date: Jan 2018

Posts: 24
#7

04 Jun 2018, 03:13

What I want to achieve is to reproduce in Stata the code that I have in R and JavaScript. But I don't know how to write it in Stata, because I'm not familiar with it. x, y and z are variables which do not exist in my dataset. In R and JavaScript they are created as I already said:
JavaScript is: var y = new Array(0.06, 0.066, 0.05, 0.0612, 0.0582, 0.0546, 0.0522, 0.0475) and this same code in R: y <- c (0.06, 0.066, 0.05, 0.0612, 0.0582, 0.0546, 0.0522, 0.0475).

Then they make formulas with the variables x as already described: in JavaScript the code is:
for(i=1;i<=15;i++) {

x [i]=y [i-1]+z [i];

and code in R: for (i in 2:16){
x [i] <- y [i-1] + z [i]

The only thing I want to know is how to write this code in Stata. I cannot share other things, because they are confidential. However, what I want to do is calculate survival over 15 years, and all the numbers in y represent the survival per year. With the formulas I can then calculate it.
Comment
Stephen Jenkins

Join Date: Apr 2014

Posts: 1432
#8

04 Jun 2018, 06:44

Lisa: you've simply repeated what you wrote before. Put differently, you're assuming that the best way in Stata to achieve your goal is to mimic the procedure you have in JavaScript or R. Stata is different. I continue to maintain that if you answered my questions, things would be clearer to all readers. And you do not have to post an extract of your real data; I said "make up" some data.
Comment

Lisa Spoelstra

Join Date: Jan 2018
Posts: 24

06 Jun 2018, 03:20

Dear all, I am sorry. I had to discuss with my supervisor whether I was allowed to share things from my research. What I need to do is translate this code: view-source:http://www.lifemath.net/cancer/breas...rapy/index.php into Stata. I especially have problems with coding things like this:

var nvsr_death_prob_yearly = new Array(0, 0.006083, 0.000414, 0.000301, 0.000218, 0.000172, 0.000158, 0.000141, 0.000128, 0.000117, 0.000108, 0.000105, 0.000110, 0.000132, 0.000173, 0.000228, 0.000292, 0.000355, 0.000407, 0.000440, 0.000456, 0.000471, 0.000489, 0.000501, 0.000509, 0.000515, 0.000506, 0.000516, 0.000531, 0.000552, 0.000578, 0.000609, 0.000646, 0.000693, 0.000753, 0.000827, 0.000864, 0.000949, 0.001047, 0.001157, 0.001273, 0.001393, 0.001514, 0.001639, 0.001774, 0.001919, 0.002045, 0.002211, 0.002383, 0.002560, 0.002746, 0.002949, 0.003176, 0.003431, 0.003718, 0.004039, 0.004462, 0.004859, 0.005304, 0.005809, 0.006384, 0.007060, 0.007817, 0.008596, 0.009350, 0.010100, 0.011305, 0.012297, 0.013426, 0.014706, 0.016123, 0.017603, 0.019196, 0.021033, 0.023175, 0.025585, 0.028661, 0.031295, 0.034315, 0.037906, 0.042094, 0.046670, 0.051554, 0.057062, 0.063411, 0.070761, 0.079054, 0.087065, 0.095796, 0.105294, 0.115605, 0.126771, 0.138833, 0.151829, 0.165787, 0.180734, 0.196684, 0.213644, 0.231608, 0.250560, 0.270467, 0.295466253, 0.317857963, 0.341365799, 0.365989493, 0.39172287, 0.418553461, 0.44646215, 0.475422854, 0.50540224, 0.53635949, 0.568246115, 0.601005825, 0.634574456, 0.66887997, 0.703842512, 0.73937455, 0.775381082, 0.811759916, 0.848402039, 0.88519205, 0.92200867, 0.958725335, 0.995210852);

And making formulas like these:

for (i=1; i<=15; i++) {
	//STEP 2.c calculates cancer death distribution by multiplying 15yr KM cancer death rate by expected BRCA yearly lethality
	//percentage of overall cancer deaths occuring in the given year is computed, and cumulatively summed
	cancer_death_dist_cumm[i] = cancer_death_dist_cumm[i-1] + L_breastcancer_distribution[i-1]*L_breastcancer_KM;
	//cancer-specific hazard is computed as the chance of cancer death divided by cancer-specific survival to that point
	cancer_death_hazard[i] = L_breastcancer_distribution[i-1]*L_breastcancer_KM / (1-cancer_death_dist_cumm[i-1]);

I hope this makes it more clear and someone knows what to do. Thanks!

Last edited by Lisa Spoelstra; 06 Jun 2018, 03:29.

Comment

David Fisher

Join Date: Apr 2014

Posts: 407
#10

06 Jun 2018, 05:18

Hi Lisa,

If I understand you correctly, I think the solution is given in posts 4 and 6.

Unlike R or JS, Stata works with variables (columns of data) within a single dataset/data-frame. Lists of values may be stored as Stata variables, if appropriate; or if not, be temporarily defined (but not stored) as "local macros" (roughly equivalent, for your purposes, to temporary vector arrays). If you have many such arrays of the same length and with the same ordering, this suggests that it would be appropriate to store them as Stata variables. Post #4 gives one solution to how to code this, given existing R/JS code. (you could e.g. use find/replace in a text editor to remove the commas between each value.)

Once you have stored all your arrays as Stata variables, you can perform calculations with them. Post #6 gives an example of this. The important thing to note here is that you do not loop over observations (that is, values within a Stata variable) in Stata. Instead, commands apply to *all* observations simultaneously. Hence, replace x = y[_n-1] + z replaces *all* values within variable x with the sum of the corresponding value in variable z and the immediately preceding value in variable y.
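As a rough sketch of the "array as a Stata variable" idea, the eight values from post #1 can also be typed straight into a variable with -input-, without any loop. The variable name js_index is made up here; it is only meant to show how JavaScript's zero-based positions line up with Stata's observation numbers, which start at 1.

```stata
* Warning: clear empties the dataset in memory, so try this in a scratch session
clear
input double balloons
0.06
0.066
0.05
0.0612
0.0582
0.0546
0.0522
0.0475
end

* One command fills every observation: the position each value had in the JavaScript array
generate js_index = _n - 1

list
```

With the values stored this way, balloons[1] in Stata corresponds to balloons[0] in JavaScript, which matters when translating index arithmetic such as y[i-1].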

Hope that helps a little!

Thanks,

David.

Last edited by David Fisher; 06 Jun 2018, 05:20. Reason: minor wording change
1 like
Comment
David Fisher

Join Date: Apr 2014

Posts: 407
#11

06 Jun 2018, 06:23

It has also just occurred to me that, if you already have your data stored in an R dataframe, you could simply export it to Stata wholesale, using write.dta (from the foreign R package).
Comment

Lisa Spoelstra

Join Date: Jan 2018
Posts: 24

#12

06 Jun 2018, 06:30

Thanks for your addition, David. I tried doing what you and the others suggested. So I tokenized like this, for all variables which needed to be tokenized:

Code:

local nvsr_death_prob_yearly 0 0.006083 0.000414 0.000301 0.000218 0.000172 0.000158 0.000141 0.000128 0.000117 0.000108 0.000105 0.000110 0.000132 0.000173 0.000228 ///
0.000292 0.000355 0.000407 0.000440 0.000456 0.000471 0.000489 0.000501 0.000509 0.000515 0.000506 0.000516 0.000531 0.000552 0.000578 0.000609 0.000646 0.000693 ///
0.000753 0.000827 0.000864 0.000949 0.001047 0.001157 0.001273 0.001393 0.001514 0.001639 0.001774 0.001919 0.002045 0.002211 0.002383 0.002560 0.002746 0.002949 ///
0.003176 0.003431 0.003718 0.004039 0.004462 0.004859 0.005304 0.005809 0.006384 0.007060 0.007817 0.008596 0.009350 0.010100 0.011305 0.012297 0.013426 0.014706 ///
0.016123 0.017603 0.019196 0.021033 0.023175 0.025585 0.028661 0.031295 0.034315 0.037906 0.042094 0.046670 0.051554 0.057062 0.063411 0.070761 0.079054 0.087065 ///
0.095796 0.105294 0.115605 0.126771 0.138833 0.151829 0.165787 0.180734 0.196684 0.213644 0.231608 0.250560 0.270467 0.295466253 0.317857963 0.341365799 0.365989493 ///
0.39172287 0.418553461 0.44646215 0.475422854 0.50540224 0.53635949 0.568246115 0.601005825 0.634574456 0.66887997 0.703842512 0.73937455 0.775381082 0.811759916 0.848402039 ///
0.88519205 0.92200867 0.958725335 0.995210852
tokenize `nvsr_death_prob_yearly'

And then for the first formula I tried this:

Code:

gen cancer_death_dist_cumm
local N = _N
forvalues 1/`N' {
replace cancer_death_dist_cumm = cancer_death_dist_cumm [_N-1] + L_breastcancer_distribution [_N-1] * L_breastcancer_KM
}

However, for the first row I get the error =exp. I am not sure what the problem is.

I tokenized cancer_death_dist_cumm also like this:

Code:

 local cancer_death_dist_cumm 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
tokenize `cancer_death_dist_cumm'

Last edited by Lisa Spoelstra; 06 Jun 2018, 06:34.

Comment

David Fisher

Join Date: Apr 2014
Posts: 407

#13

06 Jun 2018, 07:15

Hi Lisa,

Again, if I understand correctly what you are doing:

Using tokenize does not by itself read the values into a Stata variable. It merely indexes them. Hence, when you run the code for the formula, you will see an error. There are other things wrong with your code, too. Here is a pseudo-rewrite, bearing in mind that I don't know your full list of array/variable names:

Code:

* Assume you have 100 values in each array.  If not, change "100" to the number of values you actually do have
clear
set obs 100

* Create a new Stata variable
* Store the values in a local macro
* then, using the results of tokenize (i.e. indexed values), fill observations with the corresponding values
* (again, replace 100 with your true number of values)
gen nvsr_death_prob_yearly = .
local nvsr_death_prob_yearly 0 0.006083 .... // [and so on]
tokenize `nvsr_death_prob_yearly'
forvalues i = 1/100 {
    replace nvsr_death_prob_yearly = ``i'' in `i'
}

* Repeat the above for each of your variables, including cancer_death_dist_cumm and L_breastcancer_KM
* (not the "clear" and "set obs" commands, they only need doing once)
// code goes here

* Having done so, examine your data using Stata commands "describe", "browse" or "list".

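* Now, if the data look OK, we can proceed with the formulae
* (note the lower-case _n here, NOT upper-case _N;
*   the former means "index of current observation"; the latter means "total number of observations")
* ("in 2/l" skips the first observation, which has no predecessor and keeps its starting value;
*   without it, the first observation would become missing and so would every one after it)
replace cancer_death_dist_cumm = cancer_death_dist_cumm[_n-1] + L_breastcancer_distribution[_n-1] * L_breastcancer_KM in 2/l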

* Again, examine your data to check that Stata has done what you expected it to.
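* A toy run of the same recursion may make it easier to check by eye. Everything below is made up for illustration: the short names dist, cumm and hazard, the four rows, and the scalar km standing in for L_breastcancer_KM (assumed here to be a single number rather than a variable).

```stata
* Made-up inputs: 4 rows stand in for years 0 to 3
clear
set obs 4
generate double dist = 0.25

* Hypothetical single value standing in for L_breastcancer_KM
scalar km = 0.2

* Year 0 starts at zero, as in the JavaScript array
generate double cumm = 0

* Rows are updated in order, so cumm[_n-1] is already the updated value of the previous row
replace cumm = cumm[_n-1] + dist[_n-1] * scalar(km) in 2/l

* Cancer-specific hazard for each year after year 0
generate double hazard = dist[_n-1] * scalar(km) / (1 - cumm[_n-1]) in 2/l

list
```

* Because -replace- works through the observations from first to last, the running total builds up row by row without an explicit loop, provided the first observation is left out of the update.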

Hope that helps,

David.

Last edited by David Fisher; 06 Jun 2018, 07:17.

Comment

Lisa Spoelstra

Join Date: Jan 2018
Posts: 24

#14

06 Jun 2018, 07:40

Hello David, if I do this:

Code:

 
* Store the values in a local macro, and tokenize it in the local macros 1, 2, 3, etc.
local nvsr_death_prob_yearly 0 0.006083 .... // [and so on]
tokenize `nvsr_death_prob_yearly'

* Assume you have 100 values in each array.  If not, change "100" to the number of values you actually do have
clear
set obs 100

will all the other variables and observations in my data be deleted?

Comment

Lisa Spoelstra

Join Date: Jan 2018
Posts: 24

#15

06 Jun 2018, 09:08

If I do exactly this:

Code:

 
gen nvsr_death_prob_yearly = .
local nvsr_death_prob_yearly 0 0.006083 .... // [and so on]
tokenize `nvsr_death_prob_yearly'
forvalues 1/100 {
    replace nvsr_death_prob_yearly = ``i'' in `i'
}

Stata says that it is an invalid syntax. What is wrong?
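
As far as can be told without the full do-file, two things in the snippet above would each produce a syntax error: -forvalues- needs a named loop macro (forvalues i = 1/100, not forvalues 1/100), and the literal .... placeholder has to be replaced by the actual values. It is also worth checking that the loop does not run past the number of values in the list, because an empty ``i'' leaves -replace- without an expression. A minimal sketch, with three made-up stand-in values and a hypothetical macro name values:

```stata
* Warning: clear empties the dataset in memory, so try this in a scratch session
clear
set obs 3
generate double nvsr_death_prob_yearly = .

* Three stand-in values; the real list would go here in full
local values 0 0.006083 0.000414
tokenize `values'

* Count the values so the loop cannot run past the end of the list
local n : word count `values'

* The loop macro must be named: forvalues i = 1/..., not forvalues 1/...
forvalues i = 1/`n' {
    replace nvsr_death_prob_yearly = ``i'' in `i'
}

list
```

Letting Stata count the values avoids hard-coding 100 when the list has a different length.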
