Automatic variables in SAS
What are automatic variables in SAS?
Within a DATA step, SAS automatically creates several reserved variables. These automatic variables are created by the DATA step itself or by DATA step statements. They are added to the program data vector (PDV) but are not written to the data set being created. Their values are retained from one iteration of the DATA step to the next rather than being reset to missing.
_N_ AUTOMATIC VARIABLE IN SAS
SAS automatically creates the _N_ automatic variable in the program data vector when it processes a DATA step.
_N_ is initially set to 1. Each time the DATA step loops past the DATA statement, _N_ is incremented by 1. The value of _N_ therefore represents the number of times the DATA step has iterated.
data one;
   input Characters $ @@;
   datalines;
A B C
;
run;

data two;
   set one;
   put _N_=;
run;
The SAS log prints the value of _N_ as follows:
_N_=1
_N_=2
_N_=3
Next, let's take a look at when the value of _N_ increments. A common misunderstanding is that _N_ gives the observation number of the current observation in the DATA step.
That isn't true. As the documentation says: "The _N_ variable is initially set to 1". This means that _N_ equals 1 before any observation is read into the PDV, i.e. before any SET statement executes. _N_ then increments by one each time the DATA step loops past the DATA statement.
data three;
   put _N_=;
   set one;
   put _N_=;
run;
Below is the log of the above program.
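_N_=1
_N_=1
_N_=2
_N_=2
_N_=3
_N_=3
_N_=4
The final _N_=4 comes from the first PUT statement on the fourth iteration: the DATA step starts that iteration and writes _N_ before the SET statement finds no more observations and ends the step.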
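As a minimal sketch of why _N_ is not an observation number, the step below reads data set one starting at its second observation. The FIRSTOBS= data set option and the CUROBS= option of the SET statement are not used elsewhere in this article, and obsnum is a hypothetical variable name chosen for illustration.

```sas
data _null_;
   /* Skip the first observation; CUROBS= stores the number of the observation just read */
   set one(firstobs=2) curobs=obsnum;
   /* _N_ counts iterations, obsnum is the position within data set one */
   put _N_= obsnum= Characters=;
run;
```

On the first iteration _N_ is 1 while the observation being read is the second one (Characters=B), so the two numbers differ.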
_ERROR_ AUTOMATIC VARIABLE IN SAS
_ERROR_ is set to 0 by default and is set to 1 whenever an error is encountered, such as an input data error, a conversion error, or a math error (for example, division by 0 or a floating-point overflow).
You can use the value of this variable to help locate errors in the data and to print an error message to the SAS log.
For instance, either of the following two statements writes to the SAS log, during each iteration of the DATA step, the contents of any input record in which an input error is encountered:
if _error_=1 then put _infile_;
if _error_ then put _infile_;
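The statement above can be placed in a complete DATA step. The following is a small self-contained sketch; the data set name scores, the variables, and the sample records are made up for illustration, and the second record deliberately contains a non-numeric score.

```sas
data scores;
   input name $ score;
   /* _ERROR_ becomes 1 when 'abc' cannot be read as a number */
   if _error_ then put 'Bad record: ' _infile_;
   datalines;
Ann 90
Bob abc
Cal 75
;
run;
```

Only the record for Bob should trigger the PUT statement, in addition to the invalid-data note that SAS writes to the log on its own.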
_INFILE_ AUTOMATIC VARIABLE IN SAS
When you use a DATA step with an INPUT statement, SAS creates an input buffer where it holds your data before moving the values into the program data vector. If your DATA step reads a SAS data set with a SET statement and no INPUT statement is present, no input buffer is created.
If there is an input buffer, you can access its contents using the variable name _INFILE_. _INFILE_ is an automatic variable whose value is accessible within a DATA step but is not written to any data set created by that step.
For example, suppose you want to read some data and want all of the character variables in uppercase. You can use the $UPCASE informat, or you can read each record into the input buffer and hold it there with a trailing @ sign.
The contents of the input buffer are available as _INFILE_, and you can use the UPCASE function to convert them to uppercase. The subsequent INPUT statement then moves the variable values from the input buffer into the program data vector.
data names;
   input @;
   _infile_ = upcase(_infile_);
   put _INFILE_;
   input name :$10. city :$10.;
   datalines;
abc Mumbai
def Pune
;
run;
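For comparison, the $UPCASE informat route mentioned above can be sketched as follows. The data set name names_upcase is hypothetical; the data lines are the same as in the previous example.

```sas
data names_upcase;
   /* $UPCASEw. converts each value to uppercase as it is read */
   input name :$upcase10. city :$upcase10.;
   datalines;
abc Mumbai
def Pune
;
run;
```

The informat approach converts only the variables it is applied to, whereas modifying _INFILE_ changes the whole record before any variable is read.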
_FILE_ AUTOMATIC VARIABLE IN SAS
Just as the contents of the input buffer are available through _INFILE_, the contents of the output buffer can be accessed through _FILE_.
In the following example, formats are created first and then used in the subsequent DATA step to flag whether the age, height, and weight of each male in SASHELP.CLASS are below (L) or at/above (H) a given value.
The FILENAME statement sets up a fileref named class2. The special physical file DUMMY specifies that no actual file will be produced, but you can still write to that file using the FILE statement and PUT statements in the DATA step.
The PUT statement writes the formatted values (L and H) of the three variables to the output buffer, and a trailing @ sign holds them there.
A new variable named AGHTWT is created from the contents of the output buffer (the formatted values of age, height, and weight).
A second PUT statement then clears the output buffer by writing its contents to the file DUMMY.
In the resulting data set, the variable AGHTWT shows which values of age, height, and weight are low (L) or at/above the median (H).
Without the second PUT statement, the trailing @ sign on the first PUT statement would cause the formatted values of age, height, and weight for each observation to be appended to the string of values from all earlier observations.
A single trailing @ sign on a PUT statement persists across multiple iterations of the DATA step, unlike an INPUT statement, where the input buffer is released at each iteration if there is only a single trailing @.
proc format;
   value ag low-<13.5 = 'L' other = 'H';
   value ht low-<64.15 = 'L' other = 'H';
   value wt low-<107.25 = 'L' other = 'H';
run;

filename class2 dummy;

data Class_males (drop=sex);
   file class2;
   set sashelp.class (where=(sex eq 'M'));
   put age ag. height ht. weight wt. @;
   aghtwt = _file_;
   put;
run;
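_FILE_ can also be assigned to, which changes the output buffer before it is written. The step below is a minimal sketch of that idea, assuming output is directed to the SAS log with FILE LOG; the text 'abc def' is arbitrary sample data.

```sas
data _null_;
   file log;
   /* Hold the text in the output buffer instead of writing it immediately */
   put 'abc def' @;
   /* Modify the held buffer contents through _FILE_ */
   _file_ = upcase(_file_);
   /* Release the buffer: the modified line is written to the log */
   put;
run;
```

This mirrors the _INFILE_ example earlier: the buffer is held with a trailing @, edited through the automatic variable, and then released.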
_IORC_ AUTOMATIC VARIABLE IN SAS
IORC stands for Input/Output Return Code. SAS programmers generally use the _IORC_ variable together with a MODIFY statement, or with a SET statement where the KEY= option is used on a SAS data set index. In this case, SAS returns a numeric value in _IORC_ that indicates whether or not the index search was successful.
data class;
   set sashelp.class;
   put _IORC_=;
run;
_IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0 _IORC_=0
Below are some guidelines that will help you understand _IORC_ in detail.
1. _IORC_ is initialized to zero.
As the log from the code example above shows, _IORC_ is set to 0 by default.
2. SAS automatically retains the variable.
If you assign a value to _IORC_, it is retained across iterations until you explicitly change it. This is in contrast to the _N_ variable, which SAS resets each time the DATA step loops past the DATA statement.
data class2;
   set sashelp.class;
   if _N_ = 3 then _IORC_ = 10;
   if _N_ = 3 then _N_ = 10;
   put _IORC_= _N_=;
run;
_IORC_=0 _N_=1
_IORC_=0 _N_=2
_IORC_=10 _N_=10
_IORC_=10 _N_=4
_IORC_=10 _N_=5
_IORC_=10 _N_=6
_IORC_=10 _N_=7
_IORC_=10 _N_=8
_IORC_=10 _N_=9
_IORC_=10 _N_=10
_IORC_=10 _N_=11
_IORC_=10 _N_=12
_IORC_=10 _N_=13
_IORC_=10 _N_=14
_IORC_=10 _N_=15
_IORC_=10 _N_=16
_IORC_=10 _N_=17
_IORC_=10 _N_=18
_IORC_=10 _N_=19
3. SAS does not change the value of _IORC_ unless it performs an index search.
SAS will not change the value of _IORC_ unless an index search is carried out, for example with the KEY= option in the SET statement. This makes _IORC_ usable in different contexts than _N_, because in most circumstances _IORC_ is zero by default, retained, and not modified by SAS.