The COMPARE function in SAS lets you compare two character values. With optional modifiers, you can ignore case and truncate the longer value to the length of the shorter value before making the comparison.
To demonstrate the COMPARE function, suppose you must verify analysis codes that begin with C450.
One complication is that some values may have the C in lowercase.
You need to match codes that begin with C450, optionally followed by a period and further digits, such as C450.100.
Although this is fairly simple to do with ordinary DATA step programming, the COMPARE function lets you make the comparison in a single statement.
Take a look at the following program:
data test1;
input code $10.;
datalines;
V450
c450
c450.100
C900
;
run;
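data test;
   set test1;
   /* 'i' ignores case; ':' truncates the longer value to the length of the shorter one */
   compareValue = compare(code, 'C450', 'i:');
   /* A return value of 0 means the values match after the modifiers are applied */
   if compareValue eq 0 then Match = 'Yes';
   else Match = 'No';
run;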
- The first two arguments of the COMPARE function are the two character values you need to compare.
- The third argument is optional and lets you specify modifiers.
- The i modifier is used to ignore case.
- The colon (:) modifier is used to truncate the longer string to the length of the shorter string before making the comparison.
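To see what each modifier contributes, the following minimal sketch compares the same pair of values four ways and writes the results to the log. The variable names are only illustrative. Only the last call, which combines both modifiers, produces a match (a return value of 0); each of the other calls returns a nonzero value.

```sas
data _null_;
   /* No modifiers: case matters, and the shorter value is padded with blanks */
   NoModifier = compare('c450.100', 'C450');
   /* Case is ignored, but the extra characters .100 still cause a difference */
   IgnoreCase = compare('c450.100', 'C450', 'i');
   /* The longer value is truncated to 4 characters, but c and C still differ */
   Truncate   = compare('c450.100', 'C450', ':');
   /* Both modifiers together: the values match */
   Both       = compare('c450.100', 'C450', 'i:');
   put NoModifier= IgnoreCase= Truncate= Both=;
run;
```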
COMPARE returns a 0 if there’s a match (after applying the modifiers) and a non-zero value if the two values differ.
The magnitude of the returned value is the position of the first character at which the two strings differ. Look at the compare value for observations 1 and 4. The value 1 for observation 1 tells you that the first character is different, whereas the value 2 for observation 4 tells you that the second character is different.
The sign of this value tells you which of the two values comes first in the collating sequence: it is negative when the first argument comes first and positive when the second argument comes first.
In practice, you usually only need to know whether the function returns zero.
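Because a zero result is usually all that matters, the comparison can also be used directly in a subsetting IF statement. The sketch below assumes the test1 data set created earlier; the output data set name matches is hypothetical.

```sas
data matches;
   set test1;
   /* Keep only the observations whose code begins with C450, ignoring case */
   if compare(code, 'C450', 'i:') = 0;
run;
```

With the four sample codes, this should keep c450 and c450.100 and drop V450 and C900.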
Be cautious whenever you use the colon modifier. When SAS computes the shorter string length, it includes trailing blanks.
Here is an example:
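data test2;
   /* String1 ends with three trailing blanks, so its length is 6 */
   String1 = 'ABC   ';
   String2 = 'ABCXYZ';
   /* The trailing blanks count toward the length, so nothing is truncated */
   Compare1 = compare(String1,String2,':');
   /* TRIM removes the trailing blanks, so String2 is truncated to 3 characters */
   Compare2 = compare(trim(String1),String2,':');
run;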
- String1 is ABC followed by trailing blanks. When you use the colon modifier to compare this value to String2, SAS sees both strings as having a length of 6, so nothing is truncated.
- Using the TRIM function to remove trailing blanks is good practice whenever you use the colon modifier.
- For the value of Compare2, SAS truncates String2 to a length of 3 (the length of String1 after you strip off the trailing blanks) before making the comparison, so the result is 0.
If you are curious why the value of Compare1 is -4, here is the reason: the two strings differ in the fourth character. The value is negative because a blank comes before an X in the collating sequence.
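The same issue arises when both values are variables with declared lengths, because each one is padded with blanks to its full length. The following sketch uses hypothetical variables Prefix and Code to show the effect. Trimming both arguments is an assumption about what is safest when you do not know which value will be shorter.

```sas
data test3;
   length Prefix $ 8 Code $ 10;
   Prefix = 'C450';       /* stored as C450 plus 4 trailing blanks */
   Code   = 'c450.100';   /* stored as c450.100 plus 2 trailing blanks */
   /* Without TRIM, Code is truncated to 8 characters and compared to the */
   /* padded Prefix, so the values first differ at position 5 */
   NoTrim   = compare(Prefix, Code, 'i:');
   /* With TRIM, Code is truncated to 4 characters and the values match */
   WithTrim = compare(trim(Prefix), trim(Code), 'i:');
   put NoTrim= WithTrim=;
run;
```

Here WithTrim is 0, while NoTrim is nonzero because the blank in position 5 of Prefix is compared with the period in Code.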