Dear Sir,
We are running Stata on the AWS cloud service. Our setup is very simple:
- Each user runs the Stata application installed on an EC2 instance.
- A large dataset (8 TB in total) is stored in an S3 bucket.
- The S3 bucket and the EC2 instance are connected through a storage gateway.
We have constantly faced memory issues with this setup, and the large cost involved has been a serious obstacle to advancing our research projects.
I am now looking for ways to reduce memory usage in Stata, though we have not yet tested any alternatives to the standard use of the do command. What I am thinking of trying is to separate the user interface from the in-memory database structures in Stata. I have no idea what the best way to achieve this would be, and there may be no simple way as long as we use Stata with such a huge amount of data. Below is a list of some very simple ideas I would like to test:
- Call Stata commands from JupyterLab.
- Convert the existing Stata commands to the Mata language.
- Set up an RDBMS on the storage side and connect Stata to it through the ODBC connector.
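To make the third idea a little more concrete, here is a minimal sketch of what I have in mind, written in Python since it could also be run from JupyterLab. The point is only to show the pattern: the database does the aggregation, and the client receives one row per group instead of the full data. Everything in it is an assumption for illustration: SQLite from the standard library stands in for the real RDBMS, and the table `survey` and its columns `region` and `income` are hypothetical, not our actual data.

```python
import sqlite3


def summarize_in_database(conn):
    """Aggregate inside the database and return one row per group."""
    query = (
        "SELECT region, COUNT(*) AS n, AVG(income) AS mean_income "
        "FROM survey GROUP BY region ORDER BY region"
    )
    # Only the small summary crosses into client memory, not the raw rows.
    return conn.execute(query).fetchall()


# Hypothetical toy data standing in for the real dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (region TEXT, income REAL)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [("east", 10.0), ("east", 20.0), ("west", 30.0)],
)

summary = summarize_in_database(conn)
print(summary)
```

If this pattern works, I suppose the same kind of SELECT statement could be passed from Stata with `odbc load, exec("...") dsn("...")`, so that only the aggregated result is loaded into Stata's memory. I have not tested this with our data.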
Best regards,
Tatsuru Kikuchi