
  • Frget won't merge in a variable

    EDIT: Originally, I asked why the code failed when I tried to merge _Y_synthetic into the default frame. The code below works, but I still don't know why.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear *
    input float(cigsale year) double cf
      123 1970  122.1653883469584
      121 1971 121.62029507356708
    123.5 1972 123.65802631119759
    124.4 1973  124.1292205815946
    126.7 1974 126.80849818612906
    127.1 1975  126.6901995266301
      128 1976 128.01968539874275
    126.4 1977   126.562557509494
    126.1 1978 125.43148186489528
    121.9 1979 122.38722170060387
    120.2 1980 119.90265624361437
    118.6 1981 119.11404085777194
    115.4 1982 115.64496634374335
    110.8 1983 111.00762282380802
    104.8 1984 104.55046091349114
    102.8 1985 102.69997984676596
     99.7 1986  99.30180693683437
     97.5 1987  97.31873278154471
     90.1 1988  90.98715875261351
     82.4 1989  88.49250142078927
     77.8 1990   85.0805673074476
     68.7 1991  80.71134376817638
     67.5 1992  78.99709312992486
     63.4 1993  79.32651527790523
     58.6 1994  76.78981514810624
     56.4 1995  74.29330974832727
     54.5 1996  73.51829128134756
     53.8 1997   73.5041003462229
     52.3 1998   71.4293293399428
     47.2 1999  70.77370328388332
     41.6 2000  65.60763767351816
    end
    format %ty year
    
    frame create adhframe
    
    frame change adhframe
    
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double(_Y_synthetic year)
     117.1224998779297 1970
    118.92689856719971 1971
    124.31680255126952 1972
    125.48379851531983 1973
     127.0066996688843 1974
    127.11689948272705 1975
    127.89249899291993 1976
    125.75760092926026 1977
    125.00309838867187 1978
    122.94150092315672 1979
    120.48190029907228 1980
    120.21059833526611 1981
    116.87670102691649 1982
    111.32929905700684 1983
    103.35900167083742 1984
    103.23869921874999 1985
     99.83780280303955 1986
      99.7549978942871 1987
     91.67349821472168 1988
      90.0025983352661 1989
      87.5086008682251 1990
     82.15799961090087 1991
     81.58240018463135 1992
     81.16239954757691 1993
     80.69319806289674 1994
     78.45580081176757 1995
     77.44370036315918 1996
     77.66760053634644 1997
      74.3454993095398 1998
     73.53380042648314 1999
      67.3186996383667 2000
    end
    
    rename _Y_synthetic ysyn
    
    frame change default
    cls
    frlink 1:1 year, frame(adhframe)
    frget ysyn, from(adhframe)
    br
    The underscores are obviously the offending characters that must be removed, but is there any reason why that should be? Perhaps it's a question for someone from StataCorp.
    Last edited by Jared Greathouse; 02 Jan 2023, 08:29.

  • #2
    Code:
    . frlink 1:1 year, frame(adhframe)
      (all observations in frame default matched)
    
    . frget _Y_synthetic = _Y_synthetic, from(adhframe)
      (1 variable copied from linked frame)
    Apparently frget does not want to create a variable whose name begins with an underscore without being told to do so explicitly.

    You might direct this question to Stata Technical Services, since this limitation is not documented.

    Added in edit: Aha! I found the documentation! In the output of help frget, in the discussion of the exclude() option, we see (italics added for emphasis)
    Code:
        exclude(varlist) specifies variables that are not to be copied.  An
            example of the option is
    
                frget *, from(counties) exclude(emp*)
    
            All variables except variables starting with emp would be copied.
    
            More correctly, all variables except emp*, _*, and the match
            variables would be copied because frget always omits the underscore
            and match variables.  See the explanation below.
    There is no explanation below in the help output. There is a long explanation in the linked PDF documentation. TL;DR: read the fine manual for a more complete understanding.
    Last edited by William Lisowski; 02 Jan 2023, 08:45.



    • #3
      Okay. When I read the help file from h frget, I think I misread, or maybe misunderstood, what was meant by
      "frget always omits the underscore and match variables."
      I now know that what they meant was "frget does not merge in variables that have underscores in their names". But I guess this is why the help says
      Links to PDF documentation

      Quick start

      Remarks and examples

      Methods and formulas

      The above sections are not included in this help file.
      since that is typically where the explanations for curious issues like this are found.



      • #4
        "frget does not merge in variables that have underscores in their names"
        That's not quite right. -frget- does not merge in variables whose names begin with underscore.
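
        To make the distinction concrete, here is a minimal sketch. The frame name src and the variable names y_syn, _ysyn, and ysyn are made up for illustration and are not from the original post. I have wrapped the second frget in capture noisily because the thread only establishes that this form does not bring the variable over; I am not certain what message, if any, it prints.

        ```stata
        clear all

        * Hypothetical source frame holding two made-up variables
        frame create src
        frame src {
            set obs 3
            generate year  = 1969 + _n
            generate y_syn = _n          // underscore inside the name
            generate _ysyn = _n * 10     // name begins with an underscore
        }

        * Default frame holds the matching key
        set obs 3
        generate year = 1969 + _n
        frlink 1:1 year, frame(src)

        frget y_syn, from(src)                   // copied as usual
        capture noisily frget _ysyn, from(src)   // per this thread, not copied in this form
        frget ysyn = _ysyn, from(src)            // explicit newvar = varname form from post #2
        describe
        ```

        After this runs, describe should list y_syn and ysyn in the default frame, which is the point: only the leading underscore matters, not underscores elsewhere in the name.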



        • #5
          -frget- does not merge in variables whose names begin with underscore.
          for which the PDF documentation (read the fine manual) explains:

          frget omits _* variables because they tend to be Stata system variables that are valid only in the dataset in which they appear. You do not want them.
          and then shows the workaround used in the code in post #2.



          • #6
            Also, [U]11.3 Naming conventions:

            The first character of a name must be a letter or an underscore (macro names are an exception;
            they may also begin with a digit). We recommend, however, that you not begin your variable names
            with an underscore. All of Stata’s built-in variables begin with an underscore, and we reserve the
            right to incorporate new variables freely.



            • #7
              The advice in post #6 would have been well-followed by the community authors of the synth package at SSC, which with the keep() option creates a dataset containing 5 variables all of which begin with an underscore. But a batch rename might be well advised whenever such a file is used.
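
              A batch rename along those lines might look like the sketch below. The small dataset is made up so the example runs on its own: _Y_synthetic is the name from post #1, and _other_var is a hypothetical stand-in for whatever else the file contains. The loop simply drops the first character of every variable whose name begins with an underscore.

              ```stata
              clear
              set obs 2
              generate _Y_synthetic = _n     // name taken from post #1
              generate _other_var   = _n     // hypothetical second variable

              * Strip the leading underscore from every _* variable
              foreach v of varlist _* {
                  local new = substr("`v'", 2, .)   // everything after the first character
                  rename `v' `new'
              }
              describe
              ```

              Two cautions: rename will stop with an error if the stripped name already exists in the dataset, and the _* pattern also matches variables Stata itself created (such as _merge), so it is safest to run this right after loading the file and before doing anything else with it.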



              • #8
                I agree. I never got the complicated naming conventions the authors used, so I made sure not to do that for my estimator. For my purposes, I just did
                Code:
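                * Confirm the name contains an underscore before stripping
                capture assert strpos("_Y_synthetic", "_")
                if !_rc {
                    * Remove every underscore: "_Y_synthetic" becomes "Ysynthetic"
                    local reny = subinstr("_Y_synthetic", "_", "", .)
                }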
                ...
                qui frlink 1:1 year, frame(adhframe)
                
                frget `reny'=_Y_synthetic, from(adhframe)
                so that way, the integrity of the original frame isn't changed.
