Most SAS procedures, such as PROC MEANS and PROC FREQ, prefer normalised data, which tends to be tall and narrow.
Since we often have no control over the form of the data we receive, we need to be able to convert it from normalised to non-normalised form and back, that is, from long to wide and vice versa.
This process is known as transposing the data, and the operations are commonly performed by **PROC TRANSPOSE**.
    
    
The general syntax of the TRANSPOSE procedure is:

```sas
PROC TRANSPOSE <DATA=input-data-set> <OUT=output-data-set> <options>;
    BY variable(s);
    COPY variable(s);
    ID variable;
    VAR variable(s);
RUN;
```

- The data is transposed within each combination of values of the variables in the **`BY`** statement. The BY variables themselves aren’t transposed but are used to determine the row structure of the transposed dataset.
  - The data must be sorted by the BY variables before running `PROC TRANSPOSE`, unless you specify the `NOTSORTED` option in the BY statement.
  - For long-to-wide transposes, the BY variables should uniquely identify each row of the transposed (wide) dataset.
  - For wide-to-long transposes, the BY variables determine the row structure of the long data; that is, they determine how the rows repeat.
- The **`ID`** statement names the transposed columns. Each new column is named after a value of the [variables specified](https://www.9to5sas.com/specify-list-of-variables-in-sas/) in the ID statement.
  - The ID statement also ties a value in a specific row to a specified new column.
  - In the case of long-to-wide transposes, the structure of the columns is determined by the ID variable. There will be one column for each unique value of the ID variable (or, if multiple ID variables are present, one column for each unique combination of values).
  - For wide-to-long transposes, you typically do not need an ID variable. However, if you do supply an ID variable, it will determine the column structure.
  - Together, the variables in the BY and ID statements must uniquely identify each row of the input data.
- The variables in the **`VAR`** statement are transposed. If the VAR statement is not included, PROC TRANSPOSE will transpose all numeric variables that are not included in a BY statement or an ID statement. Character variables are transposed only if they are listed in a VAR statement.
  - Usually, one variable is specified for a long-to-wide transpose, whereas multiple variables are specified for a wide-to-long transpose.
  - The output dataset contains one row for each variable in the VAR statement (within each BY group, if a BY statement is used).

## Transposing Long to Wide Datasets

PROC TRANSPOSE provides the ability to go from a long dataset to a wide dataset. The example below uses the long dataset SASHELP.ORSALES.

```sas
proc transpose data=sashelp.orsales out=sales;
    var quantity profit total_retail_price;
run;
```
The syntax for transposing wide to long datasets is identical, but the objective is to reduce the number of columns and create a data structure in which multiple rows describe the different attributes of a variable.
```sas
proc transpose data=sashelp.library out=column1;
    id libref;
    var _all_;
run;
```
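As a minimal sketch of a wide-to-long transpose that uses a BY variable, the step below turns two score columns into one row per subject and original column. The dataset `wide`, its variables, and the output names `long`, `test`, and `score` are hypothetical and chosen only for illustration. `NAME=` is the PROC TRANSPOSE option that renames the `_NAME_` variable, and `RENAME=` is a standard dataset option.

```sas
/* Hypothetical wide data: one row per subject, one column per test */
data wide;
    input subject scorepre scorepost;
    datalines;
1 90 92
2 77 88
;
run;

/* Wide to long: each VAR variable becomes a row within each subject.  */
/* NAME=test stores the original column name in TEST instead of _NAME_ */
/* and RENAME= gives the value column COL1 a more descriptive name.    */
proc transpose data=wide out=long(rename=(col1=score)) name=test;
    by subject;
    var scorepre scorepost;
run;

proc print data=long;
run;
```

The input is already in `subject` order, so no PROC SORT step is needed before the BY statement.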
- **`NAME=`** specifies the name of the output variable that holds the name of each transposed variable. If you omit it, this variable is called `_NAME_`. The transposed columns themselves are named `COL1` through `COLn` unless an ID statement or the `PREFIX=` option supplies other names.
- **`DELIMITER=`** specifies a delimiter to use when constructing names for transposed variables in the output data set. The delimiter is inserted between the variable values if more than one variable is given in the ID statement.
- **`PREFIX=`** and **`SUFFIX=`** specify a prefix or suffix for each new variable name.
```sas
data exa;
    input subject test $ score;
    datalines;
1 post 92
1 pre 90
2 post 88
2 pre 77
3 post 50
3 pre 51
4 post 77
4 pre 72
5 post 69
5 pre 60
;
run;
```
```sas
proc transpose data=exa out=exa1 prefix=score;
    by subject;
    id test;
    var score;
run;
```
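The `DELIMITER=` option described above only has an effect when the ID statement lists more than one variable. The following is a minimal sketch under that assumption; the dataset `scores` and its variables `test`, `period`, and `score` are hypothetical and not part of the original examples.

```sas
/* Hypothetical long data with two identifying variables per score */
data scores;
    input subject test $ period $ score;
    datalines;
1 math pre 70
1 math post 85
2 math pre 60
2 math post 75
;
run;

/* Two ID variables: the new column names are built from the values of */
/* TEST and PERIOD, with the delimiter (an underscore) placed between. */
proc transpose data=scores out=scores_wide(drop=_name_) delimiter=_;
    by subject;
    id test period;
    var score;
run;

proc print data=scores_wide;
run;
```

With these inputs the transposed columns should be named from the combined ID values, such as `math_pre` and `math_post`, which is easier to read than the same values run together.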
A double transpose lets us transpose multiple variables while reshaping long data to a wide format.
Below is the original format of the data we want to convert to a wide format.
```sas
data subj;
    input subject Month $ potassium sodium;
    datalines;
210 JAN 5.0 14.0
210 FEB 3.0 11.0
210 MAR 2.0 12.0
211 JAN 1.0 11.0
211 FEB 5.0 10.0
211 MAR 3.0 19.0
212 JUN 3.0 12.0
;
run;
```
The goal is a wide dataset with one row per subject and one column for each combination of month and measurement.

**1st Transpose**
The first PROC TRANSPOSE step turns the Potassium and Sodium columns into rows: for each subject and month there is one row per variable, with the variable name stored in `_NAME_` and all the values stored in a single variable, `COL1`.

```sas
/* Reads the subj dataset created above; the output dataset labtran */
/* is the input to the second transpose.                            */
proc transpose data=subj out=labtran;
    by subject Month notsorted;
    var sodium potassium;
run;
```

**2nd Transpose**

The second PROC TRANSPOSE step converts those rows back into columns. The data now has one row per subject, with every month represented as a column for each Potassium and Sodium value.

```sas
proc transpose data=labtran out=sparsed(drop=_name_);
    by subject;
    var col1;
    id Month _name_;
run;

proc print;
run;
```