8 Ways to count the number of observations in a SAS dataset

8 Ways to count the number of observations in a SAS dataset and pass it into a macro variable

A SAS data set is a crucial component in data management, made up of two parts: the descriptor portion and the data portion. Specifically, the descriptor portion is integral as it carries essential information about the data set itself. This includes specific details such as the type of data, its structure, and characteristics. For instance, it may hold information such as: whether the data is numerical or categorical, its arrangement in the dataset, and any unique traits it may possess. Understanding the descriptor portion not only reveals more about our data but is also foundational in the upcoming exploration of gathering observation counts in a SAS data set. Simple yet effective navigation of this process, particularly for those new to the topic, can move us progressively closer to mastering SAS data management.

  • The descriptor part of a SAS dataset includes the dataset's name along with its associated type
  • The moment or timestamp when the dataset was initially developed is also recorded
  • Crucially, it lists the total number of collected observations within the dataset
  • Furthermore, it specifies the quantity of distinct variables included in the dataset
  • Lastly, it denotes the type of SAS engine employed to manage or access this dataset

Often, it may become necessary to count the number of observations within a SAS dataset and subsequently assign this figure to a macro variable. An effective method to achieve this is to read the descriptor portion of the said dataset. This technique is efficient as it only necessitates the processing of the metadata, consequently enhancing the speed of the overall operation. This approach is distinctive from the other methods primarily due to its efficiency and promptness, thereby illustrating it as a highly effective strategy.

"Join us as we navigate through eight different ways to count the number of rows in a SAS dataset, a fundamental task frequently undertaken in data manipulation. We will explore methods ranging from the basic PROC SQL to the more advanced, such as the use of Data Access Functions. Understanding and mastering these approaches will significantly enhance your efficiency in handling SAS datasets."

8 Ways to count the number of observations in a SAS dataset and pass it into a macro variable

1. Using PROC SQL

proc sql;
	select count(*) into :cnt from sashelp.class;
quit;

%put &cnt.;

Using the PROC SQL method is not considered to be an efficient way as it does not use metadata information of the SAS dataset. Instead, it reads through each record (row) of your SAS dataset which requires processing power. However, it is simple to understand and develop and can be used for smaller datasets.

 2. Using END= Statement

data _null_;
		set sashelp.class end=eof;
		count+1;
		if eof then
			call symput("nobs", count);
	run;
%put &nobs;

This method is also not efficient. It reads the entire data set and increments a counter to pick up the last value of the dataset. The END option returns true or 1 if the observation is the last observation in the dataset.

3. Using the Data Step

data _null_;
	set sashelp.class nobs=obs;
	call symputx('nobs', obs);
run;

%put &nobs.;

NOBS is a SAS automatic variable that contains the number of rows in a dataset and NOBS = obs holds the count of records in the variable obs.

CALL SYMPUTX  is a DATA Step call routine that assigns a value produced in a DATA step to a macro variable.

 4. Using IF 0 and STOP statement

data _NULL_;
	if 0 then
		set sashelp.class nobs=n;
	call symputx('totobs', n);
	stop;
run;
%put no. of observations = &totobs;

During the compilation phase, the data step reads in variables from the data set in the Set Statement. During execution, SAS reads in the observations from the input data set in sequential order. By using if 0 then set, we can bypass the execution part of the Set Statement which is conditional logic that always fails.

The ‘if 0‘ statement is not processed at all because the IF statement does not hold TRUE. The whole IF-THEN statement is used to pull the header information of the data set and later pass it to the compiler to adjust it to the PDV.

If 0 Then Set can also be coded as If (1=2) Then Set

The STOP statement is used to stop an endless loop.

Read – Exploring the SET Statement in SAS

5. Proc SQL Dictionary Method

proc sql noprint;
 select nobs into :totobs separated by ' ' from dictionary.tables
 where libname='SASHELP' and memname='CARS';
quit;
%put total records = &totobs.;

The metadata information of a dataset can be accessed with PROC SQL Dictionary.Tables.It is an efficient method as it does not look into each value of a dataset to determine the count.

The LIBNAME= refers to the name of the library in which data is stored. The MEMNAME= refers to the SAS table (dataset). The separated by ‘ ‘ is used in this case to left-align the numeric value.

6. Using the Library table – SASHELP.VTABLE

data _null_;
	set sashelp.vtable;
	where libname="SASHELP" and memname="CLASS";
	Num_obs=Nobs-Delobs;
	put "Nobs - Delobs: num_obs = " num_obs;
	call symputx('obs_cnt',num_obs);
run;
%put &obs_cnt;

As we know that SAS library tables contain metadata relating to your data sets, and we can get this information from the SASHELP.VTABLE by specifying the LIBNAME and the name of the data set.

The two variables you need from the table are Nobs which contains the total number of observations including ones marked for deletion and Delobs contains only the number of observations marked for deletion.

This method of determining the number of observations in a SAS data set has an advantage over the previous methods described so far. That is if you (or other people) are modifying a data set, you need to know the total number of observations in a data set as well as the number of observations that have been marked for deletion (but are still counted when you use the NOBS= SET option).

7. Using PROC SQL automatic variable – SQLOBS

proc sql noprint;
		select * from sashelp.class;
		run;
%put &sqlobs.;

Proc SQL automatically creates SALOBS macro variable, when it runs a SQL Step. SQLOBS macro variable will have the number of observations count of the last proc SQL statement executed.

 8. Using Data Access Functions

%macro totobs(mydata);
    %let mydataID=%sysfunc(OPEN(&mydata.,IN));
    %let NOBS=%sysfunc(ATTRN(&mydataID,NOBS));
    %let RC=%sysfunc(CLOSE(&mydataID));
    &NOBS
%mend;
%put %totobs(sashelp.cars);

References

How Many Observations Are In My Data Set?

Every week we'll send you SAS tips and in-depth tutorials

JOIN OUR COMMUNITY OF SAS Programmers!

Subhro

Subhro provides valuable and informative content on SAS, offering a comprehensive understanding of SAS concepts. We have been creating SAS tutorials since 2019, and 9to5sas has become one of the leading free SAS resources available on the internet.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.