How to transpose a SAS dataset using the Proc Transpose procedure
February 02, 2020

Table Of Contents

1. Transposing Wide to Long Datasets
2. Options available in Proc Transpose
3. Transposing multiple variables - Double Transpose

Most SAS procedures, such as PROC MEANS and PROC FREQ, prefer normalised data, which tends to be tall and narrow.

Since we often have no control over the form of the data when we receive it, we need to be able to convert it in both directions: from normalised (long) to non-normalised (wide) and from wide back to long.

This process is known as transposing the data, and the operations are commonly performed by **PROC TRANSPOSE**.

Variables become observations, and observations become variables.

The general syntax of the Transpose procedure:

```sas
PROC TRANSPOSE DATA=input-dataset OUT=output-dataset;
    BY variable(s);
    COPY variable(s);
    ID variable(s);
    VAR variable(s);
RUN;
```
- The transposition is performed within each combination of values of the variables in the **`BY`** statement. The BY variables themselves aren’t transposed but are used to determine the row structure of the transposed dataset.
  - The data needs to be sorted by the BY variables before running `PROC TRANSPOSE`, unless you specify the `NOTSORTED` option on the BY statement, which only requires the rows of each BY group to be adjacent.
  - For long-to-wide transposes, the BY variables should uniquely identify each row of the output.
  - For wide-to-long transposes, the BY variables determine the row structure of the long data; that is, they determine how the rows repeat.
- The **`ID`** statement names the transposed columns. The new columns are named after the values of the [variables specified](https://www.9to5sas.com/specify-list-of-variables-in-sas/) in the ID statement.
  - The ID statement also ties a value in a specific row to a specific new column.
  - For long-to-wide transposes, the column structure is determined by the ID variable. There will be one column for each unique value of the ID variable (or, if multiple ID variables are present, one column for each unique combination of values).
  - For wide-to-long transposes, you typically do not need an ID variable. However, if you do supply one, it will determine the column structure.
  - Together, the variables in the BY and ID statements must uniquely identify each row of the input; a repeated ID value within a BY group is an error unless the `LET` option is specified.
- The variables in the **`VAR`** statement are transposed. If the VAR statement is not included, PROC TRANSPOSE will transpose all numeric variables that are not included in a BY statement or an ID statement. Character variables are transposed only if they are listed in a VAR statement.
  - Usually, one variable is specified for a long-to-wide transpose, whereas multiple variables are specified for a wide-to-long transpose.
  - The output dataset returns one row for each variable in the VAR statement (within each BY group, if a BY statement is used).
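To see how the three statements fit together, here is a minimal sketch. It assumes the SASHELP.CLASS sample dataset is available in your installation; the output names `class_sorted` and `height_wide` are hypothetical and chosen only for illustration.

```sas
/* Sort first so the BY groups are in order for PROC TRANSPOSE */
proc sort data=sashelp.class out=class_sorted;
    by sex;
run;

/* One output row per value of SEX, one new column per student NAME */
proc transpose data=class_sorted out=height_wide;
    by sex;        /* defines the rows of the result; not transposed */
    id name;       /* values of NAME become the new column names     */
    var height;    /* the variable whose values fill the new columns */
run;

proc print data=height_wide;
run;
```

Because each name belongs to only one BY group, the columns for students in the other group are left missing in that row.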
## Transposing Long to Wide Datasets
PROC TRANSPOSE provides the ability to go from a long dataset to a wide dataset. Below is an example of a long dataset (SASHELP.ORSALES).
```sas
proc transpose data=sashelp.orsales out=sales;
    var quantity profit total_retail_price;
run;
```

The output dataset SALES contains one row for each variable in the VAR statement. Each observation of the input becomes a column, named COL1 through COLn.

## Transposing Wide to Long Datasets

The syntax for transposing wide to long datasets is identical. The objective, however, is to reduce the number of columns and create a structure in which each attribute of a variable is stored in its own row.


```sas
proc transpose data=sashelp.library out=column1;
    id libref;
    var _all_;
run;
```

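A wide-to-long transpose is often easier to follow on a small dataset. The sketch below uses a hypothetical dataset `scores_wide`, with one row per subject and one column per test, and turns each test column into its own row. The dataset and the output names `scores_long`, `test`, and `score` are assumptions made for this illustration.

```sas
/* Hypothetical wide dataset: one row per subject, one column per test */
data scores_wide;
    input subject pre post;
    datalines;
1 90 92
2 77 88
3 51 50
;
run;

/* Each VAR variable becomes its own row within every BY group */
proc transpose data=scores_wide
               out=scores_long(rename=(col1=score))  /* COL1 holds the values */
               name=test;                            /* replaces _NAME_       */
    by subject;
    var pre post;
run;

proc print data=scores_long;
run;
```

The result has two rows per subject: the variable `test` holds the name of the original column and `score` holds its value.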

## Options available in Proc Transpose

NAME= specifies the name of the output variable that contains the names of the variables being transposed. By default this variable is called _NAME_. Unless an ID statement is used, the transposed variables are named COL1 through COLn.

DELIMITER= specifies a delimiter to use when constructing the names of the transposed variables in the output data set. The delimiter is inserted between the ID values if more than one variable is given in the ID statement.

You can use the PREFIX= or SUFFIX= option to specify a prefix or suffix for each new variable name.
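The options can be combined in a single step. The following is a small sketch, assuming a hypothetical dataset `visits` with two identifying variables (`test` and `visit`); the output names `visits_wide` and `measure` are also made up for this example.

```sas
/* Hypothetical long dataset: two identifying variables per score */
data visits;
    input subject test $ visit $ score;
    datalines;
1 pre v1 90
1 post v1 92
1 pre v2 85
1 post v2 89
2 pre v1 77
2 post v1 88
2 pre v2 70
2 post v2 81
;
run;

proc transpose data=visits
               out=visits_wide
               name=measure     /* use MEASURE instead of the default _NAME_ */
               prefix=s_        /* put s_ in front of every new column name  */
               delimiter=_;     /* separate the ID values with an underscore */
    by subject;
    id test visit;
    var score;
run;

proc print data=visits_wide;
run;
```

With these settings the new columns are named from the prefix and the two ID values, for example `s_pre_v1` and `s_post_v2`.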

```sas
data exa;
    input subject test $ score;
    datalines;
1 post 92
1 pre 90
2 post 88
2 pre 77
3 post 50
3 pre 51
4 post 77
4 pre 72
5 post 69
5 pre 60
;
run;
```


```sas
proc transpose data=exa out=exa1 prefix=score;
    by subject;
    id test;
    var score;
run;
```


## Transposing multiple variables - Double Transpose

Double Transpose helps us to transpose multiple variables and reshape long data to a wide format.

Below is the original format of the data we want to convert to a wide format.

```sas
data subj;
    input subject Month $ potassium sodium;
    datalines;
210 JAN 5.0 14.0
210 FEB 3.0 11.0
210 MAR 2.0 12.0
211 JAN 1.0 11.0
211 FEB 5.0 10.0
211 MAR 3.0 19.0
212 JUN 3.0 12.0
;
run;
```

We want one row per subject, with a separate column for each combination of month and measurement.

**1st Transpose**

The first PROC TRANSPOSE step creates one row for each Potassium and Sodium value within every subject and month. All the values are stored in a single variable, COL1, and the variable _NAME_ records which measurement each row holds.

```sas
proc transpose data=subj out=labtran;
    by subject Month notsorted;
    var sodium potassium;
run;
```
**2nd Transpose**
The second PROC TRANSPOSE step turns those rows back into columns. The data now has one row per subject, with every month represented as a column for each Potassium and Sodium value.
```sas
proc transpose data=labtran out=sparsed(drop=_name_);
    by subject;
    var col1;
    id Month _name_;
run;

proc print data=sparsed;
run;
```
