
  • #16
    In my Stata, using the first line as the variable names is automatic, so there is no problem with that. I have some sample code that does initial cleaning after the data are set up in Stata format, but I'm not sure whether it indicates anything I need to do to come up with variable names. I'm providing that sample code anyway, in case it helps.
    Last edited by Tariq Abdullah; 04 Jul 2022, 18:47.



    • #17
      Tariq Abdullah -

      I didn't see your post #16 as I was typing this post.

      In post #3 the command
      Code:
      import delimited using abcd96.TXT, delimiters("^") varnames(1) rowrange(3)
      bears no relationship to your data file - you apparently copied the command from the topic you linked to in that post, and that data file is not at all like yours, and you did not attempt to understand the options on the command.

      In particular, what we can deduce from the dataex output you show is that your dataset apparently looks something like this, with commas as the delimiters between data items, not "^".
      Code:
      . type ~/Downloads/example.txt
      1996,1,0,1,"C04000026200",0,0,0,0,"000000000000",0.000,0.000,2,1,0,0,0,7,4,0,0,"00000",5,0,"   C0400",2,0,2,4,1,16.280,0,1,50,0,2,0,0,0,0.0,0.00,0.0,
      1996,1,0,1,"C05000003170",0,0,0,0,"000000000000",0.000,0.000,2,1,0,0,0,7,4,0,0,"00000",7,0,"   C0500",4,0,2,4,1,1.970,0,1,1100,0,2,0,0,0,0.0,0.00,0.0,
      1996,1,0,1,"C05000003400",0,0,0,0,"000000000000",0.000,0.000,2,1,0,0,0,7,4,0,0,"00000",7,0,"   C0500",4,0,2,4,1,2.113,0,1,1100,0,2,0,0,0,0.0,0.00,0.0,
      1996,1,0,1,"C05000015550",0,0,0,0,"000000000000",0.000,0.000,2,1,0,0,0,7,4,0,0,"00000",5,0,"   C0500",2,0,2,4,1,9.662,0,1,350,0,2,0,0,0,0.0,0.00,0.0,
      1996,1,0,1,"C05000019960",0,0,0,0,"000000000000",0.000,0.000,2,4,0,115,0,17,5,0,0,"00000",5,0,"   C0500",4,0,2,4,1,12.403,0,2,1100,0,2,0,0,0,0.0,0.00,0.0,
      and after
      Code:
      . import delimited using ~/Downloads/example.txt
      (encoding automatically selected: UTF-8)
      (43 vars, 5 obs)
      
      . dataex
      the results are
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input int v1 byte(v2 v3 v4) str12 v5 byte(v6 v7 v8 v9 v10 v11 v12 v13 v14 v15) int v16 byte(v17 v18 v19 v20 v21 v22 v23 v24) str8 v25 byte(v26 v27 v28 v29 v30) float v31 byte(v32 v33) int v34 byte(v35 v36 v37 v38 v39 v40 v41 v42 v43)
      1996 1 0 1 "C04000026200" 0 0 0 0 0 0 0 2 1 0   0 0  7 4 0 0 0 5 0 "   C0400" 2 0 2 4 1  16.28 0 1   50 0 2 0 0 0 0 0 0 .
      1996 1 0 1 "C05000003170" 0 0 0 0 0 0 0 2 1 0   0 0  7 4 0 0 0 7 0 "   C0500" 4 0 2 4 1   1.97 0 1 1100 0 2 0 0 0 0 0 0 .
      1996 1 0 1 "C05000003400" 0 0 0 0 0 0 0 2 1 0   0 0  7 4 0 0 0 7 0 "   C0500" 4 0 2 4 1  2.113 0 1 1100 0 2 0 0 0 0 0 0 .
      1996 1 0 1 "C05000015550" 0 0 0 0 0 0 0 2 1 0   0 0  7 4 0 0 0 5 0 "   C0500" 2 0 2 4 1  9.662 0 1  350 0 2 0 0 0 0 0 0 .
      1996 1 0 1 "C05000019960" 0 0 0 0 0 0 0 2 4 0 115 0 17 5 0 0 0 5 0 "   C0500" 4 0 2 4 1 12.403 0 2 1100 0 2 0 0 0 0 0 0 .
      end
      It is clear from the screenshot in post #14 that your import delimited attempted to use the first line of data as variable names. That apparently was a bad guess on Stata's part.

      Try
      Code:
      import delimited using abcd96.TXT, varnames(nonames)
      and see what you get.

      If it doesn't work for you please try
      Code:
      type abcd96.TXT, lines(5)
      and post the command and its output here again using CODE delimiters (thank you for that!).

      In particular, it's important to see whether abcd96.TXT actually contains one line with variable names - at the moment that doesn't seem likely.
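
      If the plain import works, a more explicit variant may be worth trying. The sketch below is based only on the five example lines shown above, not on the full file: it assumes commas are the delimiter, that there is no header row, and that columns 5, 10, 22 and 25 (the quoted fields) are identifiers that should stay as strings so their leading zeros survive. The stringcols() choice is an assumption about what you want, not something the file itself confirms.

      ```stata
      * Sketch only: options inferred from the five example lines, not the full file
      import delimited using abcd96.TXT, delimiters(",") varnames(nonames) stringcols(5 10 22 25) clear

      * Check the variable types and the first few observations
      describe
      list in 1/5

      * The trailing comma on each line produces an extra last variable (v43 in the
      * example); count its nonmissing values before deciding whether to drop it
      count if !missing(v43)
      ```

      In the example above, v10 and v22 were read as numeric zeros, which is why forcing them to string may matter if they are identifiers.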
      Last edited by William Lisowski; 04 Jul 2022, 19:34.



      • #18
        You may have to experiment with what the first row is, in the dialogue box. Seriously, when you import it from the dialogue box and see it how Stata does, you'll see things differently.

        Either way, you have string variables, so the fact that they're imported how they are is normal.



        • #19
          I apologize to everyone for putting you through this incredibly painstaking dataset. After trying all the available options and getting all this very thoughtful feedback, I've come to realize that I need to assign the variable names using a different set of code (as you guessed, variable names are not given in the file). I need to include that along with some additional code.

          But thanks so much for taking the time and giving me so much useful feedback. I'm really grateful to you, Mr. Greathouse, for taking the time and interest in my problem. I cannot thank you and the Statalist community enough for your effort.



          • #20
            I have sample code in which the author assigns the variable names with the following code, looping over multiple years. In my case, though, I'm working with a single year, like hpms1996.dta. This is my last ask in this thread: could you kindly let me know how I can change the following code to work on a single dataset (hpms_1996.dta) instead of looping over multiple years (1993-1998)?

            Code:
            local cols93_98 " year state is_metric county section_id has_sample has_hov has_surveil is_donut lrs_id begin_lr "
            
            #delimit ;
            *clean up 1993-98;
            forvalues year=93(1)98{;
                use temp/hpms_univ_19`year',clear;
                compress;
                local i=0;
                foreach varname in `cols93_98'{;
                    local i=`i'+1;
                    rename Col`i' `varname';
                    *strip out quotes and right and left blanks from strings;
                    capture replace `varname' = subinstr(`varname',`"""',"",2);
                    capture replace `varname' =trim(`varname');
                };
                save temp/hpms_univ_19`year'_clean,replace;
            };
            #delimit cr
            Last edited by Tariq Abdullah; 04 Jul 2022, 20:00.



            • #21
              If I had a group of variables "var1 var2 var3" from import delimited that I wanted to rename to "id year country" I would do
              Code:
              rename (var1 var2 var3) (id year country)
              I strongly encourage you to cease trying to modify code you got somewhere that uses parts of Stata you do not understand to do things where you do not understand what it is doing.

              Instead work to write your own code a step at a time with the help of Statalist. It's a lot more attractive for us to figure out your attempts than it is to figure out some obscure code and then try to figure out what is needed to make it work for you. And you will learn Stata that way.

              You'd have been a lot farther along at this point if you hadn't started with all the stuff in post #1. Don't go back to that approach now; don't copy random code that doesn't even apply to your data.

              Get your import delimited to work. Then get your renaming done using the most basic features described in
              Code:
              help rename group
              Then work to make the necessary changes to any of your variables.

              And whatever else you do, don't use semicolon delimiters. Write standard Stata.
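
              A minimal sketch of those steps in standard Stata, to be run from a do-file. It assumes import delimited has left the variables named v1, v2, ... as in post #17. The eleven names come from the list in post #20, which covers only the first eleven columns, and the output file name hpms_1996_clean is a placeholder.

              ```stata
              * Sketch only: assumes v1-v11 exist after import delimited
              import delimited using abcd96.TXT, varnames(nonames) clear

              * Rename the first eleven variables in one step (see help rename group)
              rename (v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11) ///
                     (year state is_metric county section_id has_sample has_hov has_surveil is_donut lrs_id begin_lr)

              * Trim leading and trailing blanks from every string variable
              ds, has(type string)
              foreach v in `r(varlist)' {
                  replace `v' = strtrim(`v')
              }

              compress
              save hpms_1996_clean, replace
              ```

              Because there is only one file, no loop over years is needed. import delimited already removes the enclosing double quotes from string fields, so only the blanks are left to trim.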

              Finally, some more general advice.

              I expect you're new to Stata. If so, I'm sympathetic to you - there is quite a lot to absorb. And even worse if perhaps you are under pressure to produce some output quickly. Nevertheless, I'd like to encourage you to take a step back from your immediate tasks.

              When I began using Stata in a serious way, I started, as have others here, by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. All of these manuals are included as PDFs in the Stata installation and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

              The objective in doing the reading was not so much to master Stata - I'm still far from that goal - as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax, and know how to find out more about them in the help files and PDF manuals.

              Stata supplies exceptionally good documentation that amply repays the time spent studying it - there's just a lot of it. The path I followed surfaces the things you need to know to get started in a hurry and to work effectively.

              Stata also supplies YouTube videos, if that's your thing.



              • #22
                Mr. Lisowski,

                I cannot thank you enough for putting so much thought into guiding me in the right direction. Though I'm not new to Stata, as you guessed, I have to get something done within a very limited time, and apparently the way I was trying to execute it is way out of my league. From now on, I'll try to follow your suggestion and keep in mind what you kindly suggested I do.

                Again, I much appreciate the commendable effort of the Statalist community in taking your precious time to solve the issue I was having. It makes my academic life a little better!
