Use Where Statement In SAS To Your Advantage5 min read

You can use a WHERE statement in SAS to subset the data if you are reading data from a SAS dataset.

Example:

data class;
set sashelp.class;
where sex='F';
run;

where statement

In this example, you may use either a WHERE or a subsetting IF statement. There are generally advantages to using a WHERE statement in SAS instead of a subsetting IF statement.

You have a larger alternative of operators that can be used with a WHERE statement and, if the input data set is indexed, the WHERE statement in SAS is likely to be more efficient. You may also use a WHERE statement in a SAS procedure to subset the data being processed.

Note: IF statements are not allowed inside SAS procedures.

Using Operators with WHERE statement in SAS

Below is the list of operators that you can use with the WHERE statement in SAS.

IS MISSING and IS NULL

The IS MISSING and IS NULL operators, are used with a WHERE, ON, and HAVING expressions and it can handle character or numeric variables.

It results in true if the expression results are missing and false if it is not missing.

They also work with the NOT operator.

data class;
    set sashelp.class;
    if age le 13 then call missing(age);
run;
*** use missing;
proc print data=class;
    where age is missing;
run;
 
*** using IS NULL;
proc print data=class;
    where age is null;
run;

*** using NOT MISSING;
proc print data=class;
    where age is not missing;
run;


*** using NOT NULL;
proc print data=class;
    where age is not NULL;
run;

BETWEEN AND

The BETWEEN AND expression matches all the values greater than or equal to the first value and less than or equal to the second value. This works with character in addition to numeric variables.

data class;
set sashelp.class;
where weight between 60 and 90;
run;

Between And

NOT

The NOT operator can be used with the WHERE statement in many other ways:

where the section is not missing and the score is missing;

  • where not (score in (34, 44, 84));
  • where not (Score between 50 and 90);
  • where NOT(Section EQ “A”);

SAME AND

Multiple WHERE statements in SAS cannot be used in a DATA step like the IF statements. When SAS encounters a second WHERE statement in a DATA step, the first is replaced by the second.

The WHERE AND also known as WHERE ALSO will let you use multiple where statements in SAS to add more restrictions.

data class;
set sashelp.class;
  where sex='M';
  where also age > 12 ;
run;

WHERE SAME AND

IN and NOT IN

The IN operator is a comparison operator, searches for character and numeric values that are equal to one of the values from a given list of values. The list of values must be in parentheses, with each character value in quotation marks and separated by either a comma or blank.

For example, suppose you want to filter for Cars which are  SUV, Sedan or Wagon. You could specify the values as:

where strip(upcase(type)) in ('SUV','SEDAN','WAGON');

Note the use of strip and Upcase function in the type variable. This is to ensure that any leading or trailing spaces to be removed and the values will be converted to Uppercase before comparison.

Also, read the article on The Ultimate Guide To SAS Character Functions

In addition to this, you can use the logical operator NOT to exclude values that are on the list.

where strip(upcase(type)) not in ('TRUCK', 'HYBRID', 'SPORTS');

You can also use a shorthand notation to specify a range of numbers to search. The range is specified by using the syntax M: N as a value in the list to search, where M is the lower bound and N is the upper bound.

You can replace the below statement in the form of WHERE IN shorthand notation.

where age in(13,14,15,16);
where age in(13:16);

CONTAINS (?)

The CONTAINS expression matches any character value containing the given string.

data class;
set sashelp.class;
where name ? 'Al';
run;

Contains Operator

NOT CONTAINS (^?)

data class;
set sashelp.class;
where name ^? 'J';
run;

Not Contains

The “contains” operator (?) and the “not contains” operator (^?) match a substring that appears anywhere in the target character variable.

LIKE

LIKE operator is frequently used for pattern matching, that is, evaluating whether a variable value equals, begins with a specified character, or sounds like a specified value or pattern.

The LIKE expression uses two wildcard operators. When using the LIKE operator, the underscore(_) wildcard takes the place of a single character, whereas the % sign might be substituted for a string of any length (including a null string).

Simple use of Like operator

data womens_product;
 set sashelp.shoes;
  where product like "Women's Dress";
  where also region like 'Pacific';
run;

like operator

Double quotes (“) are used here to include the apostrophe in “Women’s Dress”. Single quotes can be used when the text string does not contain an apostrophe or other single quotes.

Selecting Values that Begins with a Character String

The percentage sign (%) wildcard is used to search for character values that begin with, ends with or even contain certain character strings. Below are some of the examples using wildcards.

where upcase(name) like "A%"

The above expression returns all names that begin with “a” and followed by any characters of any lengths

like operator

where name like "_a%";

_ indicates that any character can contain at the beginning of the string followed by “a” as the second character and ending with any character of any length.

where wild cards

where name like "__n%";

Two underscores at the beginning indicate that the search should filter for any two characters at the beginning followed by “n”.

Like Operator in SAS

where name like "__n_";

The above expression would return character values which have 2 characters in the beginning then followed by n and any one character at the end.

Like Operator in SAS

Sounds Like Operator (=*)

The SOUNDS-LIKE operator is based on the SOUNDEX algorithm for identifying words that sound alike. The SOUNDEX algorithm is English-biased so its not useful for languages other than English.

data employees;
  set datasets.employees;
  where lname=*"PATTERSON";
run;

Sounds Like Operator

You can download the dataset employees.sas7bdat used in the program.

Also, read 4 Functions for fuzzy matching in SAS

Using a WHERE statement and a WHERE=option in a DATA Step.

Using the WHERE statement in SAS and a WHERE= dataset option in a DATA step is not a good practice. When both are used in the same dataset, SAS ignores the WHERE statement.

Additionally, a WARNING appears in the SAS log indicating that the WHERE statement cannot be applied.

by Subhro Kar
Been in the realm with the professionals of the IT industry. I am passionate about Coding, Blogging, Web Designing and deliver creative and useful content for a wide array of audience.

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Share via
Copy link
Powered by Social Snap