SAS numeric functions handle tasks such as rounding numbers, computing dates from month-day-year values, summing and averaging the values of SAS variables, and many more.
For Character functions in SAS, read the “The Ultimate Guide To Character Functions In SAS” post.
Operator Hierarchy of SAS Operators
Before we begin with the numerical functions, it is essential to understand the Operator hierarchy structure in SAS.
Expressions are used in several ways within the DATA and PROC steps. It is important to remember that the expression evaluation will follow specific rules regardless of their use.
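The list below groups the SAS operators according to their evaluation priority.
Group 1: exponentiation (**), prefix + and -, NOT (^), minimum (><), and maximum (<>), evaluated from right to left
Group 2: multiplication (*) and division (/)
Group 3: addition (+) and subtraction (-)
Group 4: concatenation (||)
Group 5: comparison operators (<, <=, =, ^=, >=, >, IN)
Group 6: AND (&)
Group 7: OR (|)
Parentheses are used to group operands. Operators within parentheses are evaluated first.
Evaluation starts with Group 1, proceeds to Group 2, and continues down to Group 7.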
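As a minimal sketch of how these priorities play out, the DATA step below evaluates a few expressions and writes the results to the log. The variable names a, b, c, and d are made up for illustration.

```sas
data _null_;
    a = 2 + 3 * 4;       /* multiplication before addition: 14 */
    b = (2 + 3) * 4;     /* parentheses are evaluated first: 20 */
    c = 2 ** 3 ** 2;     /* exponentiation runs right to left: 2 ** 9 = 512 */
    d = 5 > 3 & 2 > 4;   /* comparisons before AND: 1 & 0 = 0 */
    put a= b= c= d=;
run;
```

The log should show a=14 b=20 c=512 d=0.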
Logical and Comparison Operators in Assignment Statements
Logical and comparison operators can be combined to eliminate multiple IF ELSE statements.
For example, suppose we want to assign each person to one of four groups, based on sex (male or female) and on whether the weight is greater than 90. Using IF-ELSE statements, we would write the logic as below.
if sex = 'M' and weight > 90 then group=1;
else if sex = 'M' and weight le 90 then group=2;
else if sex = 'F' and weight > 90 then group=3;
else if sex = 'F' and weight le 90 then group=4;
Or, we could rewrite the assignment as a single expression. Each comparison in parentheses evaluates to 1 (true) or 0 (false), so only the matching term contributes to the sum.
data class;
set sashelp.class;
group = 1*(Sex = 'M' and weight > 90) +
        2*(Sex = 'M' and weight le 90) +
        3*(Sex = 'F' and weight > 90) +
        4*(Sex = 'F' and weight le 90);
run;
Compound Inequalities
An expression with a compound inequality usually places a variable between two values, which form the lower and upper limits of a range.
In the DATA step, this compound expression is interpreted as two distinct inequalities joined by AND.
if (weight > 80) and (weight <= 100);
The above condition can be rewritten as
if 80 lt weight le 100;
The weight value must satisfy both conditions (greater than 80 and less than or equal to 100) for the overall expression to be evaluated as true.
Misplacing the parentheses changes the way that the expression is evaluated.
if (80 lt weight) le 100;
The inequality inside the parentheses is evaluated as true or false (1 or 0), and the result is compared to 100. This expression will be true for all values of weight (both 0 and 1 are less than 100).
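The difference between the two forms can be checked with a short DATA step. This is only a sketch: the variable names correct and misplaced and the three test weights are chosen for illustration.

```sas
data _null_;
    do weight = 52, 85, 120;
        /* Compound inequality: true only when 80 < weight <= 100 */
        correct = (80 lt weight le 100);
        /* Misplaced parentheses: compares 0 or 1 with 100, so always true */
        misplaced = ((80 lt weight) le 100);
        put weight= correct= misplaced=;
    end;
run;
```

The log should show correct=1 only for weight=85, while misplaced=1 for all three weights.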
In the macro language, compound inequalities are not evaluated the same way as in the DATA step.
Even when the resolved value of the macro variable lies outside the range, the expression below evaluates as TRUE.
%if 80 lt &weight le 100 %then %put weight is between 80 and 100;
This happens because the macro language does not break the expression into two distinct inequalities. Instead, it is evaluated as if there were parentheses around the first portion of the comparison.
%if (80 lt &weight) le 100 %then %put weight is between 80 and 100;
The expression is evaluated from left to right, and (80 lt &weight) will be either TRUE or FALSE (1 or 0). Because both 1 and 0 are less than 100, the overall expression is TRUE whatever the value of &weight.
assuming weight = 52
80 < 52 evaluates to 0 or False
0 <= 100 evaluates to 1 or True
assuming weight = 85
80 < 85 evaluates to 1 or True
1 <= 100 evaluates to 1 or True
Hence, the expression always evaluates to True.
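One way to get the intended range check in the macro language is to write the two comparisons explicitly and join them with AND. The sketch below contrasts both forms; the macro name check_weight is hypothetical and used only for illustration.

```sas
/* Hypothetical macro, for illustration only */
%macro check_weight(weight);
    /* Compound form: evaluated left to right, so it is always true */
    %if 80 lt &weight le 100 %then %put compound form: &weight is in range;
    %else %put compound form: &weight is out of range;

    /* Explicit form: two separate comparisons joined by AND */
    %if 80 lt &weight and &weight le 100 %then %put explicit form: &weight is in range;
    %else %put explicit form: &weight is out of range;
%mend check_weight;

%check_weight(52)
%check_weight(85)
```

For 52, the compound form should report "in range" while the explicit form reports "out of range". For 85, both forms report "in range".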
Numeric Expressions and Boolean Transformations
Sometimes, you may need to transform numeric values to Boolean (0 or 1) values.
The double negation (NOT) is used to transform numeric values to 0 or 1. Since negation is a Boolean operator, it converts the original value to either a zero or a one.
SAS treats 0 and missing as false and every other numeric value as true, so a missing value also maps to 0 under double negation.
x = ^age; y = ^^age;
In the above example, X is 0 and Y is 1 whenever age is non-missing and nonzero. If age is missing (or 0), the single negation returns 1 and the double negation returns 0.
You can filter missing values or non-missing values by writing a condition as
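if age = .; /* To keep only missing ages */
if age ^= .; /* To keep non-missing ages */
A similar result can be achieved using negation operators as below. Note that these forms also treat an age of 0 as false, so they match the comparisons above only when age is never 0.
if ^age; /* To keep only missing (or zero) ages */
if ^^age; /* To keep non-missing, nonzero ages */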
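The behaviour of single and double negation for missing, zero, and nonzero values can be sketched as follows. The variable names single and double are illustrative.

```sas
data _null_;
    do age = ., 0, 12;
        single = ^age;    /* 1 when age is missing or 0, otherwise 0 */
        double = ^^age;   /* 0 when age is missing or 0, otherwise 1 */
        put age= single= double=;
    end;
run;
```

The log should show single=1 double=0 for the missing and zero ages, and single=0 double=1 for age=12.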
SAS Numeric functions for replacing missing values with 0
The COALESCE function returns the first non-missing value among its arguments, so passing 0 as the second argument replaces a missing value with 0.
If you want to convert all missing values to 0 and all other values to 1 (including 0), you can use the negation of the MISSING function.
data class;
set sashelp.class;
if age=12 then
age=.;
z=coalesce(age, 0);
m=^missing(age);
run;
SAS Numeric functions to Round and Truncate Numeric Values
Two of the most useful SAS numeric functions in this category are the ROUND and INT functions.
ROUND
ROUND rounds a number to the nearest multiple of a rounding unit, which is given as an optional second argument. If no rounding unit is provided, a default value of 1 is used, and the argument is rounded to the nearest integer.
data values;
x1=round(331.456,.01);
x2=round(331.456);
x3=round(331.456,10);
x4=round(331.456,50);
put x1= x2= x3= x4=;
run;
Output:
x1=331.46 x2=331 x3=330 x4=350
INT, FLOOR, and CEIL functions
The INT function returns the integer part of a numeric value. For a positive argument, it returns the same result as the FLOOR function, and for a negative argument, it returns the same result as the CEIL function.
The FLOOR function returns the largest integer less than or equal to its argument, and the CEIL function returns the smallest integer greater than or equal to its argument.
data values;
x1=int(23.45);
x2=int(-23.45);
x3=ceil(23.45);
x4=floor(-23.45);
x5=floor(23.45);
put x1= x2= x3= x4= x5=;
run;
Output:
x1=23 x2=-23 x3=24 x4=-24 x5=23
SAS Numeric functions that work with the Missing Values
A period indicates a missing numeric value in SAS, and a blank signifies a missing character value.
MISSING Function in SAS
You can use the MISSING function to find missing values in your data. It returns true or 1 if a value is missing, and false or 0 if it is not.
data values;
input @1 var1 3. @5 var2 3.;
x1=missing(var1);
x2=missing(var2);
datalines;
127
988 195
;
run;
proc print;
run;
For more information, see our guide on Working with Missing Values in SAS.
Setting Character and Numeric Values to Missing
CALL MISSING
With the CALL MISSING routine, you can set one or more character and numeric variables to missing. If you use a variable list such as A1-A10, you must precede the list with the keyword OF.
data class;
set sashelp.class;
call missing(name,age);
run;
Alternatively, you can use call missing(of _all_); to set all variables to missing.
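As a small sketch of the OF keyword with a variable list, the step below sets three numeric variables and one character variable to missing in a single call. The variables a1-a3 and city are invented for this example.

```sas
data _null_;
    length city $ 10;
    a1 = 1; a2 = 2; a3 = 3;
    city = 'Boston';
    /* OF is required in front of the variable list a1-a3 */
    call missing(of a1-a3, city);
    put a1= a2= a3= city=;
run;
```

After the call, the numeric variables should print as periods and city as blank.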
SAS Numeric Functions for Descriptive Statistics
There are several SAS numeric functions that you can use to compute statistical results such as means, standard deviation and many other statistical calculations.
Also Read: Descriptive Statistics in SAS with Examples
The N function in SAS
The N function returns the number of non-missing numeric values among its arguments.
data UsingN;
x1=n(10, 0, ., 20, 50, .);
x2=n(10, 0);
x3=n(of x1-x2);
put x1= x2= x3=;
run;
X1 and X2 count the non-missing values among their constant arguments, and X3 counts the non-missing values among the variables X1 and X2.
x1=4 x2=2 x3=2
NMISS function
The NMISS function returns the number of missing values in the list. NMISS requires its arguments to be numeric, whereas CMISS works for both numeric and character values.
data UsingNmiss;
x1=nmiss(10, 0, ., 20, 50, .);
x2=nmiss(10, 0);
x3=nmiss(of x1-x2);
put x1= x2= x3=;
run;
The output produced is as below.
x1=2 x2=0 x3=0
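To illustrate the difference between NMISS and CMISS, here is a minimal sketch with one character and two numeric variables. The variable names are made up for illustration.

```sas
data _null_;
    length name $ 8;
    name = ' ';       /* missing character value */
    age = .;          /* missing numeric value */
    height = 60;
    numericMissing = nmiss(age, height);     /* numeric arguments only */
    anyMissing = cmiss(name, age, height);   /* character and numeric */
    put numericMissing= anyMissing=;
run;
```

The log should show numericMissing=1 and anyMissing=2.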
MEAN
The MEAN function computes the mean of its non-missing arguments.
data UsingMean;
x1=mean(2, ., ., 6);
put x1=;
run;
Output:
x1=4
MAX and MIN functions in SAS
MAX and MIN are the two SAS numeric functions that return the largest and smallest values of their arguments.
libname files '/folders/myfolders';
data files.scores;
input ID $ Score1-Score10;
if n(of Score1-Score10) ge 7 then
Score=mean(of Score1-Score10);
MaxScore=max(of Score1-Score10);
MinScore=min(of Score1-Score10);
datalines;
A001 4 1 3 9 1 2 3 5 . 3
A002 3 5 4 2 . . . 2 4 .
A003 9 8 7 6 5 4 3 2 1 5
;
run;
proc print data=files.scores;
run;
The above example calculates the mean score only if at least seven of the ten scores are non-missing. A002 has only six non-missing values, so the mean is not calculated for this ID.
A001 has nine non-missing values, so the mean is computed by adding the nine values and dividing by 9.
In the example code below, the numbers of non-missing and missing values are counted for each ID.
data scores(keep=ID NonMissingScores_N MissingScores_NMISS);
set files.scores;
NonMissingScores_N=n(of Score1-Score10);
MissingScores_NMISS=nmiss(of Score1-Score10);
run;
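Output:
A001: NonMissingScores_N=9, MissingScores_NMISS=1
A002: NonMissingScores_N=6, MissingScores_NMISS=4
A003: NonMissingScores_N=10, MissingScores_NMISS=0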
LARGEST Function in SAS
The LARGEST function extracts the nth largest value, given a list of variables.
The first argument of the LARGEST function tells SAS which value you want—1 gives you the largest value, 2 gives you the second largest, and so on.
To find the second-largest score from the above example, you could use the LARGEST function as below.
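The original listing for this call is not shown here. A minimal sketch, reusing the three score records from the earlier example and an assumed variable name SecondLargest, could look like this:

```sas
data _null_;
    input ID $ Score1-Score10;
    /* The first argument, 2, asks for the second-largest non-missing score */
    SecondLargest = largest(2, of Score1-Score10);
    put ID= SecondLargest=;
    datalines;
A001 4 1 3 9 1 2 3 5 . 3
A002 3 5 4 2 . . . 2 4 .
A003 9 8 7 6 5 4 3 2 1 5
;
run;
```

The log should show 5 for A001, 4 for A002, and 8 for A003.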
SMALLEST Function in SAS
The SMALLEST function works the same way and returns the nth smallest non-missing value in the list.
Note that duplicate values in the list are not collapsed: each occurrence takes its own position in the ranking used by LARGEST and SMALLEST.
For example, if the values in the list are (4, 1, 3, 9, 1, 2, 3, 5), smallest(2,<list>) would return 1, not 2.
Similarly, the second largest from the below example would return 789.
largest2=largest(2, 456,789,789,123);
Function for calculating the SUM of observations
You could calculate the sum of observations using Score1+Score2+Score3. However, there is a more efficient way of computing the sum of observations in SAS.
SUM Function in SAS
The SUM function calculates the sum of the non-missing values provided in its arguments.
sumScore = sum(of score1-score3);
The SUM function has the added advantage of ignoring missing values.
For example, if score2 is missing, then using the + operator would result in a missing value, whereas the SUM function calculates and returns the sum of score1 and score3.
In addition, you could set a value that will be returned in case all values are missing. In the below example, 0 will be returned if all the scores are missing.
sumScore = sum(0,of score1-score3);
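The step below is a short sketch of these three cases: the + operator with a missing value, SUM with a missing value, and SUM with a leading 0 when every score is missing. The result variable names are invented for illustration.

```sas
data _null_;
    score1 = 10; score2 = .; score3 = 5;
    plusTotal = score1 + score2 + score3;   /* missing, because score2 is missing */
    sumTotal = sum(of score1-score3);       /* 15, the missing value is ignored */

    call missing(of score1-score3);         /* now every score is missing */
    allMissing = sum(of score1-score3);     /* missing */
    withZero = sum(0, of score1-score3);    /* 0 */
    put plusTotal= sumTotal= allMissing= withZero=;
run;
```

The log should show plusTotal=. sumTotal=15 allMissing=. withZero=0, along with a note that missing values were generated by the + operation.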
SAS numeric functions for performing mathematical operations
You can use mathematical functions in SAS to perform operations such as finding the absolute values, square root, log, etc.
ABS
The ABS function returns the absolute value of any number. In other words, it removes the negative sign from the value.
SQRT
As the name suggests, the SQRT function returns the square root of its argument.
LOG
The LOG function takes the natural logarithm of its argument. You can use the LOG10 function to return a base 10 log.
EXP
The EXP function raises e (the base of natural logarithms) to the value provided in its argument.
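A quick sketch of these functions, using arguments whose results are easy to verify by hand (the variable names are illustrative):

```sas
data _null_;
    absValue = abs(-7.5);      /* 7.5 */
    rootValue = sqrt(16);      /* 4 */
    naturalLog = log(1);       /* 0 */
    commonLog = log10(100);    /* 2 */
    expValue = exp(0);         /* 1 */
    put absValue= rootValue= naturalLog= commonLog= expValue=;
run;
```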
Computing Constants
The CONSTANT function returns values of commonly used mathematical constants such as pi and e.
You can find the list of all valid constants on the SAS Documentation website.
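As a minimal sketch, the step below retrieves pi and e and uses pi to compute the area of a circle. The radius of 2 and the variable names are assumptions made for this example.

```sas
data _null_;
    pi = constant('pi');
    e = constant('e');
    radius = 2;                  /* illustrative value */
    area = pi * radius ** 2;     /* area of a circle with that radius */
    put pi= e= area=;
run;
```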
Another important use of this function is to find the largest integer that can be stored exactly in a numeric variable of a given length in bytes.
Integer = constant('exactint',3);
On Windows and UNIX hosts, the function returns 8192, which means that for a numeric variable of length 3, the largest integer you can represent without losing accuracy is 8,192.
SAS numeric functions for Generating Random Numbers
Random-number functions generate streams of random numbers starting from an initial point called the seed.
RANUNI Function
This function generates random numbers ranging between 0 and 1.
First, you must provide a seed number to generate the first number in the random sequence.
If the seed is 0 or a negative number, SAS uses the computer’s clock to supply the seed.
If you choose any positive integer, that number is used as a seed.
SAS recommends that you choose a seed of at least seven digits. If you supply the seed, the program generates the same sequence of random numbers every time you run.
With a 0 seed, the program’s sequence is different every time you run.
To generate random integers in the range from 1 to 10, you could use:
RandomInteger = int(ranuni(0)*10 + 1);
data random_numbers;
call streaminit(3);
do i=1 to 10;
simple_random=ranuni(2);
uniform_dist=rand('uniform');
normal_dist=rand('normal');
x=rand("Integer", 1, 10);
/* For generating random integer between 1 to 5*/
random_int=int(ranuni(0) * 5 + 1);
/* For generating random integer between 100 to 500*/
random_int2=int(ranuni(0) * 401 + 100);
output;
end;
drop i;
run;
The STREAMINIT subroutine is used to define the random number seed for the RAND function in the DATA step. This seed value controls the sequence of random numbers. If you don't use it, SAS takes the initial value from the system clock.
The UNIFORM argument specifies that the numbers generated are uniformly distributed.
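To see that a fixed seed reproduces the same stream, the sketch below generates two data sets with the same STREAMINIT seed and compares them. The data set names run1 and run2 and the seed 12345 are arbitrary choices for illustration.

```sas
data run1;
    call streaminit(12345);     /* fixed seed for RAND */
    do i = 1 to 5;
        u = rand('uniform');
        output;
    end;
run;

data run2;
    call streaminit(12345);     /* same seed, so the same sequence is expected */
    do i = 1 to 5;
        u = rand('uniform');
        output;
    end;
run;

/* PROC COMPARE should report that all compared values are equal */
proc compare base=run1 compare=run2;
run;
```

Changing the seed in one of the steps, or removing the STREAMINIT call, should make the two data sets differ.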
You can read more about random numbers distribution from the SAS documentation website.