Using the Compare function in SAS for comparing strings

The COMPARE function in SAS lets you compare two character values. With optional modifiers, you can ignore case and truncate the longer value to the length of the shorter value before making the comparison.

To demonstrate the COMPARE function, suppose you need to identify analysis codes that begin with C450.

One complication is that some of the data may have the C in lowercase.

You need to match codes that begin with C450 and may be followed by a period and further digits, such as C450.100.

While this is a fairly simple task using conventional DATA step programming, the COMPARE function lets you make the comparison in a single statement.
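
For contrast, the conventional approach might look something like the sketch below. It is only an illustration: the data set name conventional and the two sample codes are assumptions, and UPCASE and SUBSTR stand in for the work that the COMPARE modifiers do.

```sas
/* Illustrative sketch: prefix matching without COMPARE.        */
/* The data set name and the sample values are assumptions.     */
data conventional;
   input code $10.;
   /* Upper-case the first four characters, then test the prefix */
   if upcase(substr(code, 1, 4)) = 'C450' then Match = 'Yes';
   else Match = 'No';
   datalines;
c450.100
C900
;
run;
```

Here the first code is flagged Yes and the second No, at the cost of hard-coding the prefix length in the SUBSTR call.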

Take a look at the following program:

data test1;
input code $10.; 
datalines; 
V450 
c450 
c450.100 
C900 
;
run;

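data test;
set test1;
/* i ignores case; the colon truncates the longer value to the length of the shorter one */
compareValue=compare(code,'C450','i:');
/* A result of 0 means the code begins with C450 in either case */
if compareValue eq 0 then Match = 'Yes';
else Match = 'No';
run;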
  • The first two arguments of the COMPARE function are the two character values you
    want to compare.
  • The optional third argument specifies one or more modifiers.
  • The i modifier ignores case.
  • The colon (:) modifier is used to truncate the longer string to the length of the shorter string before making the comparison.
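
To see what each modifier contributes, the following sketch applies them one at a time to literal values. The variable names are made up for illustration, and the results go to the log rather than to a data set.

```sas
/* Illustrative sketch: the effect of each modifier on literal values */
data _null_;
   NoModifiers = compare('c450', 'C450');            /* case differs: non-zero */
   IgnoreCase  = compare('c450', 'C450', 'i');       /* 0 */
   NoTruncate  = compare('c450.100', 'C450', 'i');   /* shorter value is blank-padded: non-zero */
   Both        = compare('c450.100', 'C450', 'i:');  /* 0 */
   put NoModifiers= IgnoreCase= NoTruncate= Both=;
run;
```

Without the colon, SAS pads the shorter value with blanks instead of truncating the longer one, which is why c450.100 fails to match unless both modifiers are present.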

COMPARE returns a 0 if there’s a match (after applying the modifiers) and a non-zero value if the two values differ.

The absolute value of the result is the position of the first character at which the two strings differ. Look at compareValue for observations 1 and 4: the value 1 for observation 1 indicates that the first character differs (V versus C), whereas the value 2 for observation 4 indicates that the second character differs (9 versus 4).

The sign of this value tells you which of the two values comes first in the collating sequence: it is negative when the first argument precedes the second and positive when it follows it.
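
A minimal sketch of how the sign and the magnitude combine, using made-up literal values and variable names:

```sas
/* Illustrative sketch: sign and magnitude of the COMPARE result */
data _null_;
   Earlier = compare('ABC', 'ABD');  /* -3: C precedes D, first difference at position 3 */
   Later   = compare('ABD', 'ABC');  /*  3: same position, opposite order */
   Same    = compare('ABC', 'ABC');  /*  0: the values match */
   put Earlier= Later= Same=;
run;
```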

In practice, you usually only need to know whether the function returns zero.

Be cautious whenever you use the colon modifier. When SAS computes the shorter string length, it includes trailing blanks.

Here is an example:

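data test2; 
/* String1 carries three trailing blanks, so both strings have length 6 */
String1 = 'ABC   '; 
String2 = 'ABCXYZ'; 
/* Equal lengths: the colon modifier truncates nothing */
Compare1 = compare(String1,String2,':'); 
/* TRIM removes the trailing blanks, so String2 is truncated to length 3 */
Compare2 = compare(trim(String1),String2,':'); 
run;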
  • String1 is ABC followed by trailing blanks. When you use the colon modifier to compare this value to String2, SAS sees the length of both strings as equal to 6.
  • Using the TRIM function to remove trailing blanks before the comparison is good practice whenever you use the colon modifier.
  • For the value of Compare2, SAS truncates String2 to a length of 3 (the length of String1 after the trailing blanks are removed) before making the comparison, so the result is 0.

If you are curious why the value of Compare1 is -4, here is the reason: the two strings differ in the fourth character, and the value is negative because a blank comes before an X in the collating sequence.
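
The same pitfall appears when the prefix itself is stored in a variable rather than written as a literal. The sketch below assumes two hypothetical variables, code and prefix, both declared with a length of 10:

```sas
/* Illustrative sketch: a prefix held in a variable keeps its trailing blanks */
data _null_;
   length code prefix $10;
   code   = 'c450.100';
   prefix = 'C450';
   /* Both values have length 10, so nothing is truncated: non-zero */
   Untrimmed = compare(code, prefix, 'i:');
   /* TRIM shortens prefix to 4 characters, so code is truncated to c450: 0 */
   Trimmed   = compare(code, trim(prefix), 'i:');
   put Untrimmed= Trimmed=;
run;
```

In the earlier program this problem did not arise because the prefix was the literal 'C450', which has no trailing blanks.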

Subhro

Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.