In SAS, the order of variables in a data set is based on the order in which the variables were created. You may want to reorder variable names in a SAS data set for several reasons.
One of the reasons can be to logically order the variables prior to exporting the data into an external file.
ATTRIB, ARRAY, FORMAT, INFORMAT, LENGTH, and RETAIN.
In this article, we will demonstrate how to reorder the position of variables with a Data Step, as well as with PROC SQL.
In the SASHELP.RETAIL dataset, you want to reorder variables as day, month, year, date, and sales. The ATTRIB statement associates a format, informat, label, and/or length with one or more variables.data attrib_method; attrib day month year date sales length=3; set sashelp.retail(obs=5); run;
Using Attrib Statement to reorder Variables in SAS Data SetUsing Attrib Statement to Reorder Variables in the SAS Data Set
The Length statement is used to specify the number of bytes for storing characters and numeric variables. The default length of numeric variables is 8 and the length of a character variable is set at the first occurrence of the variable. In the example dataset, all variables are numeric so we can set the length of these variables to 3.
This is the most common method, that SAS programmers often use to reorder variables. Do remember that, the retain statement needs to be placed before the [SET statement](https://www.9to5sas.com/set-statement-sas/).
For more details, see our article on [Retain](https://www.9to5sas.com/retain-in-sas/) Statement.
```sas
data retain_method;
retain day month year date sales;
set sashelp.retail;
run;
However, unless the sole purpose of a data step is to re-order variables, the use of RETAIN for this purpose can lead to unwanted results when the value of any ‘retained’ variable is carried over from one data step iteration to another.
SAS arranges columns in a dataset according to the order in which the variables are declared or defined, so any variables declared prior to the SET statement will be placed first in the resulting output dataset.
In the below example, we have created a Category variable to hold the weight Category of a person based on BMI. By default, the category variable will be placed first in the output dataset.
In the right, table we created a Frequency table using the PROC FREQ procedure which shows the count of persons in each of these categories. Notice that, Category is missing if BMI = 18.5.
reorder Variables in SAS Data Set
We can see the counts under each category and the number of missing values using the Proc Freq procedure.
Using Attrib Statement to reorder Variables in SAS Data Set
Now, we will use the retain statement to place the category variable at the end of the output dataset and reorder variables such as age, BMI, and category.
data bmimen1 (keep=age bmi Category);
retain age bmi category;
Length category $20.;
set bmi;
age=int(age);
if (bmi 18.5) AND (bmi > 25) then
Category='healthy weight ';
else if (bmi >= 25) AND (bmi = 30) then
category="obesity";
run;
Using Retain Statement to reorder Variables in SAS Data SetUsing Retain Statement to Reorder Variables in SAS Data Set
As you can see using the retain statement, the count of the Underweight person has become 2 and there are no missing values. So, what went wrong?
In the original logic, due to the non-exhaustive IF/ELSE conditions, individuals with a BMI of 18.5 are not explicitly assigned a value for “Category”.
The inclusion of the RETAIN statement, however, means that BMI of 18.5 and 25 will be assigned to a category and they are assigned the value of “Category” that was retained from the previous data step iteration. which is “Underweight” where instead we might have expected a missing value.
This has led to erroneous results.
Of course, the above error would be corrected by updating the second IF condition as – else if (bmi >= 18.5) AND (bmi < 25) then, but this is only a simple case; with several levels of IF conditions and variable assignments in a single data step, it’s likely possible to have misleading results, especially if you rely on missing values for OUTPUT statements or WHERE conditions later in the code.
The FORMAT statement is another alternative to reorder variables in a SAS dataset. Place the format statement before the SET statement and follow by a variable list (no formats need be specified), and SAS will arrange columns in the dataset.
****.
“Note”: The FORMAT statement must be placed before the SET statement. Placing FORMAT after the SET statement will reset the format of any variables specified in the ordering list.
data bmimen1 (keep=age bmi Category);
format age bmi category;
Length category $20.;
set bmi;
age=int(age);
if (bmi < 18.5) then Category='underweight'; else if (bmi > 18.5) AND (bmi < 25) then Category='healthy weight '; else if (bmi >= 25) AND (bmi < 30) then Category='overweight'; else if (bmi >= 30) then
category="obesity";
run;
Using Format Statement to reorder Variables in SAS Data Set
PROC SQL in the context of variable reordering only REQUIRES a reordered list of variables but allows the specification of additional attributes, such as type, length, format, etc. PROC SQL also unique in that it allows you to rename or “alias” variables.
When using the SQL procedure, the order in which variables are listed in the SELECT statement will establish the variable order for the resulting output dataset.
proc sql outobs=5;
create table bmi as select day, month, year, date, sales
Though the RETAIN statement can be used to order columns in a SAS dataset, its intended function of retaining values across data step iterations can make it a poor choice for this purpose.
Instead, you can either use the FORMAT statement for data steps or the SELECT statement for PROC SQL as it has the same effect without compromising on the data. I would personally use and prefer the FORMAT statement to reorder the variable list.
So, this was our side to reorder variables in a SAS dataset. We really hope that you have found it useful. Moreover, if you have any other suggestions regarding other plagiarism tools then suggest us in the comment section. We would really take those lists in our further blog post.