How to work on a subsample of my data set, and have changes into whole sample?

Michael Duarte Goncalves

Join Date: Oct 2022
Posts: 500

05 Jan 2024, 02:14

Hi everyone,

I would like to know whether there is a way to work in Stata on a smaller subsample and have the changes made there carried over to the whole sample.
My computer has limited RAM and SSD capacity, and it crashes when I work on the whole sample.

Here's my simplified dataset:

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input str23 description str31 model
"ALFA ROMEO" "146"                
"ALFA ROMEO" "147"                
"ALFA ROMEO" "147"                
"ALFA ROMEO" "156"                
"ALFA ROMEO" "156"                
"ALFA ROMEO" "156 1 9 JTD FAMILIAR"
"ALFA ROMEO" "159 SPORTWAGON"      
"ALFA ROMEO" "164 2.0 TS"          
"ALFA ROMEO" "166"                
"ALFA ROMEO" "166"                
"ALFA ROMEO" "2000GT VELOCE"      
"ALFA ROMEO" "2000 GT VELOCE"      
"ALFA ROMEO" "2000 SPIDER VELOCE"  
"ALFA ROMEO" "4C"                  
"ALFA ROMEO" "4C"                  
"ALFA ROMEO" "4C"                  
"ALFA ROMEO" "4C"                  
"ALFA ROMEO" "4C SPIDER"          
"ALFA ROMEO" "4C SPIDER"          
"ALFA ROMEO" "500 ABARTH"          
end

And here is the code I tried, without success:

Code:

preserve

keep if description == "ALFA ROMEO"
gen num = 1
collapse (sum) num, by(description model COD_PROPULSION cilindrada potenciafiscal weight_max)

strgroup model, generate(similar_model) threshold(0.15) first normalize(shorter) force

restore

I have to do this for the entire sample, which contains 67 different brand names. If anyone knows a way to speed up the process, I would love to see it.
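
To make the goal concrete, here is a rough, untested sketch of the brand-by-brand workflow I have in mind. As far as I understand, restore brings back the original data and discards anything created after preserve, so the sketch saves each brand's result to a temporary file and merges everything back at the end instead. The file name cars.dta is hypothetical, and I use only the two variables from the example above. strgroup is user-written (ssc install strgroup).

```stata
* Untested sketch. "cars.dta" is a hypothetical name for the full data set.
* strgroup is user-written: ssc install strgroup

* Start an empty results file to collect the brand-level output
tempfile results
clear
save `results', emptyok

* Read only the brand variable to build the list of brands
use description using "cars.dta", clear
levelsof description, local(brands)

foreach b of local brands {
    * Load one brand at a time, and only the variables needed
    use description model if description == `"`b'"' using "cars.dta", clear

    * One row per distinct model; num counts the original observations
    contract description model, freq(num)

    * Same strgroup call as in my attempt above
    strgroup model, generate(similar_model) threshold(0.15) first normalize(shorter) force

    * Add this brand's rows to the results collected so far
    append using `results'
    save `results', replace
}

* Bring the group ids back into the full data set
use "cars.dta", clear
merge m:1 description model using `results', nogenerate
```

In this sketch the similar_model ids restart for each brand, so a group would be identified by description and similar_model together. I do not know whether the final merge on the full data set would fit in my memory either.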
Thank you in advance for your help.

Best,

Michael
