May 23, 2020

Most SAS® programmers depend on the COMPRESS function for cleaning up troublesome string data. The third ‘modifier’ argument, added in Version 9, might not be as familiar.

If you’ve been programming in SAS® for any length of time, you’ve most likely used the COMPRESS function in SAS to clean up input data or extract a helpful tidbit from a string or variable name.

Unless you’re the kind of person who rigorously reads the documentation for each update, or browses through many conference papers, you may not be aware of the hidden superpowers you can now use.

With Version 9, SAS added a third “modifier” argument to the COMPRESS function, and it can do some remarkable things:

Compress Function in SAS with the third Argument

Most SAS programmers first learn the COMPRESS function when they need to remove extraneous spaces or other troublesome characters from strings.

We attempt to work with a string, find things we don’t need and remove them:

Example: COMPRESS('SAS-is-wonderful&','-&')

This works until the next batch of data arrives with new extraneous characters and our code complains or crashes once more. We add further characters to the second argument:

Example: COMPRESS('?SAS-is-wonderful&','-&?')

Now things run again. However, new data might introduce further issues, and the cycle repeats.

The third argument in the Compress function, if used correctly, can reduce this repeated modification. There are various options that cover entire classes of characters, so we are able to write generalized compression statements.

Using these options, alone or in combination, as the third argument generalizes the conventional COMPRESS behaviour: each option removes an entire class of characters from the string.

For instance, the code we had above could be generalized as:

COMPRESS('?SAS-is-wonderful&', ,'P') – The P modifier removes all punctuation (note the omitted second argument)
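As a minimal sketch, the DATA step below runs both approaches on the same string so the results can be compared in the log. The variable names raw, listed and generic are made up for this illustration.

```sas
data _null_;
    length raw listed generic $ 40;
    raw = '?SAS-is-wonderful&';

    /* Two arguments: every unwanted character is listed by hand */
    listed = compress(raw, '-&?');

    /* Three arguments: P removes any punctuation, second argument left empty */
    generic = compress(raw, , 'P');

    /* Both values should print as SASiswonderful */
    put listed= generic=;
run;
```

The second call should keep working if a new punctuation character such as # turns up in the data, whereas the first would need another edit.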

You can use as many options as you like in combination. You can leave the second argument blank if the general option is sufficient, or you can include specific characters in the second argument.

COMPRESS('?SAS-is-wonderful&','0','AP') – The A and P modifiers are used together here, removing all alphabetic characters, all punctuation AND the digit “0”
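To illustrate how the second argument and the modifiers stack, here is a small sketch using an invented identifier string. The value and the variable names code and digits are assumptions for the example, not taken from any real data set.

```sas
data _null_;
    length code digits $ 20;
    code = 'ID:00A-471-B0';

    /* A removes letters, P removes punctuation, and '0' is removed explicitly */
    digits = compress(code, '0', 'AP');

    put digits=;   /* expected: digits=471 */
run;
```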

Below is a list of some of the modifiers that you can use to extend the COMPRESS function.

Argument  Meaning                                                         Example                            Result
A         Removes alphabetic characters                                   compress('A_B vC,D|m&@',,'A');     '_ ,|&@'
D         Removes digits                                                  compress('A_B B5vC,D|m&@',,'D');   'A_B BvC,D|m&@'
F         Removes the underscore character and English letters            compress('A_2B vC,D|m&@',,'F');    '2 ,|&@'
H         Removes horizontal tab                                          compress('A BvC,D|m&@',,'H');      'ABvC,D|m&@'
I         Ignores the case of the characters to be kept or removed        compress('ABCcDcd','cd','I');      'AB'
K         Keeps the characters in the list instead of removing them       compress('ABCcDcd','cd','K');      'ccd'
L         Removes lowercase letters                                       compress('ABCcDcd',,'L');          'ABCD'
N         Removes digits, the underscore character, and English letters   compress('A_2B vC,D|m&@',,'N');    ' ,|&@'
P         Removes punctuation marks                                       compress('A_2B vC,D|m&@',,'P');    'A2B vCDm'
S         Removes space characters (blank, horizontal tab, vertical tab,  compress('A_2B v C,D|m&@',,'S');   'A_2BvC,D|m&@'
          carriage return, line feed, form feed, and NBSP ('A0'x, or 160
          decimal ASCII))
T         Trims trailing blanks from the first and second arguments       compress(' abcd',,'T');            ' abcd'
U         Removes uppercase letters                                       compress('ABvC,D|m&&@@',,'U');     'v,|m&&@@'
X         Removes hexadecimal characters                                  compress('ABvC,D|m&&@@',,'X');     'v,|m&&@@'

Results are shown in quotes so that blanks are visible. When a third argument is supplied, COMPRESS no longer removes blanks by default, so blanks in the input survive unless the S modifier or the second argument removes them. In the H example, the gap between A and B is a horizontal tab character.
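In practice the K modifier is often paired with a class modifier to say "keep only this kind of character". The following is a minimal sketch with invented sample values; phone, amount, digits and number are illustrative names only.

```sas
data _null_;
    length phone digits amount number $ 20;
    phone  = '(555) 010-4477';
    amount = 'USD 1,234.50';

    /* K flips the logic: keep digits (D), drop everything else */
    digits = compress(phone, , 'KD');

    /* The second argument adds the decimal point to the keep list */
    number = compress(amount, '.', 'KD');

    put digits= number=;   /* expected: digits=5550104477 number=1234.50 */
run;
```

Keeping a class is usually more robust than removing one, because unexpected characters in new data are dropped without any change to the code.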

 

How to use the COMPRESS function in a SAS macro?

You can use the COMPRESS function in SAS macro code by wrapping it in %sysfunc. For example, to keep only the digits in a macro variable, you can use the code below.

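/* Macro variable values are plain text, so no quotes are needed */
%let fname = JAN2020_012020;
/* K (keep) combined with D (digits) keeps only the digits */
%let onlydigits = %sysfunc(compress(&fname,,kd));
%put &onlydigits;   /* writes 2020012020 to the log */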
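One detail worth illustrating: inside %sysfunc the function arguments are plain macro text, so they are written without quotation marks. The sketch below uses a sample value similar to the one above. The macro variable names nounderscore and lettersonly are invented for the example.

```sas
%let fname = JAN2020_012020;

/* Second argument lists the character to drop: the underscore */
%let nounderscore = %sysfunc(compress(&fname,_));

/* K with A keeps only alphabetic characters */
%let lettersonly = %sysfunc(compress(&fname,,ka));

/* expected log line: nounderscore=JAN2020012020 lettersonly=JAN */
%put nounderscore=&nounderscore lettersonly=&lettersonly;
```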

Using the %CMPRES, %QCMPRES Autocall Macros

Two SAS autocall macros, %CMPRES and %QCMPRES, compress multiple blanks into a single blank and remove leading and trailing blanks.

If the argument might contain a special character or mnemonic operator, listed below, use %QCMPRES.

& % ' " ( ) + − * / < > = ¬ ^ ~ ; , # blank
AND OR NOT EQ NE LE LT GE GT IN

%CMPRES returns an unquoted result, even if the argument is quoted. %QCMPRES returns a result in which the special characters and mnemonic operators listed above are masked, so the macro processor interprets them as text instead of as elements of the macro language. For example:

%let a=15;
%let b=5;
%let sum=%nrstr(%eval(&a   +   &b));
%put QCMPRES: %qcmpres(&sum);
%put CMPRES: %cmpres(&sum);

These statements write the following lines to the log:

QCMPRES: %eval(&a + &b)
CMPRES: 20
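A typical everyday use is tidying a list that was assembled with uneven spacing. This is a sketch built on a made-up variable list; varlist and tidy are illustrative names.

```sas
/* %LET strips leading and trailing blanks, but the inner runs of blanks remain */
%let varlist = age     height        weight;

/* %CMPRES collapses each run of blanks into a single blank */
%let tidy = %cmpres(&varlist);

%put [&tidy];   /* expected log line: [age height weight] */
```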

 


Subhro Kar
