The column Input in SAS is one of the methods to read raw data files with spaces or delimiters between all the values or periods for missing numeric data.
What is column input in SAS?
Column input lets you read different values from each record in the same column.
To use column input, list the names of the variables in the INPUT statement. Then, right after each variable name, write the column position in each data line that goes with that variable. (Of course, you’ll need to put a dollar sign ($) before each character variable.)
Column input can be used when your raw data is in fixed columns and has a standard format for characters or numbers. Column input reads data values until it reaches the last column for the specified field.
Advantages of Column input in SAS
Column input in SAS has the following advantages over list input:
- Spaces are not required between values
- Missing values can be left blank
- Character data can have embedded spaces
- You can skip unwanted variables.
Syntax
INPUT variable <$> start-column<–end-column> <.decimals> <@ | @@>;
- $ – It indicates that the variable has character values. The $ is not required if the variable is previously defined as a character.
- Start-column – It specifies the first column of the input record that contains the value to read.
- End-column -specifies the last column of the input record that contains the value to read. If the variable value occupies only one column, omit the end column.
- .decimals – specifies the power of 10 by which to divide the value. If the data contain decimal points, the .decimals value is ignored. An explicit decimal point in the input value overrides a decimal specification.
- @ it holds the input record for executing the following INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
- @@ holds the input record for executing the following INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
The important points to note about column input are:
- When using column input, you don’t have to use a placeholder like a period to show that a value is missing. That is, values that aren’t there can be left blank.
- Column input uses the specified columns to figure out how long character variables are, so character values can be longer than the default eight characters and include spaces.
- Column input makes it possible to skip fields or read them in any order.
- Column input lets you read only part of a value and lets you reread it.
- There is no need to put a space between the data values.
- The trailing @ and @@ must be the last item in the INPUT statement.
- The trailing @ prevents the following INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
- The double trailing @ is useful when each input line contains values for several observations.
Example 1: Reading Input Records with Column Input in SAS
data Students;
input name $ 1-18 age 25-27 weight 30-32;
datalines;
Joseph 11 32
Mitchel 13 29
Sue Ellen 14 27
;
run;
Example 2: Read Input Records Using Decimals
The .decimals argument is used to read values from input lines and insert a decimal place into data that does not have an explicitly defined decimal.
If the data contains an explicit decimal, it is not changed. Instead, the data is padded to match the greatest number of significant digits occurring in any output data after conversion.
data product;
input cost 1-5 .2;
datalines;
26.77
21.00
400
3.2
12.56
9
;
run;
Example 3: Reading two types of Input data
This is an example that reads a file that contains two types of input data records and creates a SAS data set from these records.
One type of data record contains information about a particular college course. The second type of record contains information about the students enrolled in the course.
You will need two INPUT statements to read the two records and to assign the values to different variables that use different formats.
Records that contain class information have a Course in column 1, and records that contain student information have a Student in column 1, as shown below:
----+----1----+----2----+ Course HISTORY Watson Student Williams 0459 Student Flores 5423 Course MATHS Sen Student Lee 7085
data students(drop=type);
retain Course Professor;
input type $7. @;
if type='Course'
then input course $ professor $;
else if type='Student' then do;
input Student $10. Id;
output students;
end;
datalines;
Course HISTORY Watson
Student Williams 0459 Student Flores 5423
Course MATHS Sen
Student Lee 7085
;
run;
Example 4: Holding a Record across Iterations of the DATA Step
This example shows how to create multiple observations for each input data record. Each record contains several NAME and AGE values.
The DATA step reads a NAME value and an AGE value, writes an observation, and then reads another set of NAME and AGE values to output until all the recorded input values are processed.
data test;
input name $ age @@;
datalines;
Joseph 13
Mitchel 12
Sue 15
Stephen 10
Marc 22
Lily 17
;
run;
When to use Column Input in SAS?
If each variable’s values are always arranged in the same place in the data line, then you use column input in SAS as long as all the values are character or standard numeric.
Standard numeric data contains only numerals, decimal points, plus and minus signs, and E representing scientific notation.
If the numbers have embedded commas or dates are not standard numeric data.
If you liked this article, you might also want to read Importing Data using PROC IMPORTas well.
Do you have any tips to add? Let us know in the comments.
Please subscribe to our mailing list for weekly updates. You can also find us on Instagram and Facebook.
Hello,
Thank you again!
Just to notice that this code does not work:
data Students;
input name $ 1-18 age 25-27 weight 30-32;
datalines;
Joseph 11 32
Mitchel 13 29
Sue Ellen 14 27
;
run;
the start_columns and end_columns are not the good ones and it’s really painful to deal with the name separated by a space. Would you have a simple solution to read this lines ? Thanks.
Hi there!
You need to adjust the variable lengths and spaces accordingly. For example, the name contains the first and last names, so you need to ensure that the length of 1-10 includes the spaces and all the variable’s values need to be arranged in a column format.
Column input uses the columns specified in the input statement to determine the length of character variables.
The other alternative is to use the ampersand (&) modifier when you have spaces between values.
Suppose you use blanks as delimiters and have a character value containing a blank, such as a first and last name. By replacing the colon modifier with an ampersand (&), SAS will continue reading a character value, even if it contains single blanks.
An important point to note is that when you use the ampersand modifier, follow the variable with two or more spaces.
I have used two spaces after “Sue Ellen “.
Thank you Subhro !
I had built this which works also without adjusting the spaces.
As you can see, we can add any name, whatever its length, it is read correctly.
A question : do you have a valid email address where we can write to you directly?
Thanks