Count of Missing Values in SAS – Row wise & column wise

In order to get the Count of missing values in SAS we will be using multiple methods in SAS. Get count of missing values of all the columns. Get count of missing values of numeric and character column in SAS. Count column wise missing values in SAS. Count row wise missing values in SAS. Count of missing values of group in SAS with example of each.

  • Count of missing values in SAS – Missing values of column in SAS.
  • Count of missing values of specific column in SAS
  • Count Column wise missing value in SAS
  • Count row wise missing values in SAS – cmiss()
  • Count of missing values of numeric Column in SAS – PROC MEANS
  • Count of missing values of character Column in SAS
  • Count of missing values of the group in SAS

So we will be using EMP_DET Table in our example

Count of Missing Values in SAS – Row wise & column wise 1

 

 

Count Missing value of all the column in table in SAS using PROC FREQ

In order to get the count of missing values of all the columns in SAS we will be using PROC FREQ with  table _all_/missing keyword as shown below


proc freq data = EMP_DET; 
   table _all_ /missing; 
   format _character_ $missing_char. _numeric_ missing_num.; 
run;

As the result we will be able to get the frequency of missing values along with the frequencies as shown below
Count of Missing Values in SAS – Row wise & column wise 2

 

 

Count Row wise Missing values in SAS:

In order to count row wise missing values we will be using cmiss() function which is shown below

Step 1:

We will be using symputx()  function get the total variables


/* row wise missing */ 
proc contents data=EMP_DET out=cols noprint; 
run; 
 
data _null_; 
   set cols nobs=total; 
   call symputx('totvar', total); 
run;  

 

Step 2:

  • cmiss() function gets the row wise count of missing values
  • Total variables – count of missing values will give count of total row wise non missing values in SAS

data EMP_DET1; 
set EMP_DET; 

totalvar=&totvar; 
totmiss=cmiss(of Employee--state); 
totnonmiss=totalvar- cmiss(of Employee--state); 
 
run; 

So the resultant table with total variables , row wise total missing values and total non missing values will be

Count of Missing Values in SAS – Row wise & column wise 3

 

 

Count Column wise Missing values in SAS:

In order to count the column wise missing values in SAS we will be using roundabout method which is explained below

Step 1:

Specify a format for the variables so that the missing values all have one value and the nonmissing values have another value.


proc format; 
   value $missfmt ' '='Missing' other='Not Missing'; 
   value missfmt .  ='Missing' other='Not Missing'; 
run;

Step 2:

  • PROC FREQ groups a variable’s values according to the formatted values.
  • Specify the MISSING AND MISSPRINT options on the TABLES statement.
  • Use the _CHAR_ and _NUMERIC_ keywords on the TABLES statement to specify that the FREQ procedure should compute statistics for all character or all numeric variables.

proc freq data=EMP_DET; 
format _CHAR_ $missfmt.; 
table _CHAR_ / missing missprint nocum nopercent; 
format _NUMERIC_ missfmt.; 
tables _NUMERIC_ / missing missprint nocum nopercent; 
run; 

Count of Missing Values in SAS – Row wise & column wise 4

 

 

Count of Missing Values of Numeric column SAS

PROC MEANS with NMISS N is used to get the count of missing values of numeric column in SAS

/* get the count of missing values of a column- numeric */ 

proc means data=EMP_DET 
NMISS N; 
var salary_in_USD; 
run;

Total missing values of “Salary_in_USD” column will be 2 and Total values of the column will be 7
Count of Missing Values in SAS – Row wise & column wise 5

 

 

Count of Missing Values of Character column SAS:

nmiss() function with PROC SQL gives count of missing values of character columninSAS


/* get the count of missing values of a column- Character */ 
proc sql; 
select nmiss(state) as n_missing_state from EMP_DET; 
quit;

count of missing values of character column “state” will be

Count of Missing Values in SAS – Row wise & column wise 6

 

 

Count of Missing Values of a column by Group: Method 1 using CLASS

count of missing values of a column by group is obtained using PROC MEANS procedure by specifying CLASS DISTRICT (group).


/* Missing value of A COLUMN by group */ 

proc means data=EMP_DET NMISS; 
class DISTRICT; 
var salary_in_USD; 
RUN;

so will be grouping the count of missing values by DISTRICT column

Count of Missing Values in SAS – Row wise & column wise 7

 

Difference between CLASS and BY statement

The CLASS statement returns analysis for a grouping (classification) variable in a single table whereas BY statement returns the analysis for a grouping variable in separate tables. Another difference is CLASS statement does not require the classification variable to be pre-sorted whereas BY statement demands sorting.

 

Count of Missing Values of a column by Group: Method 2 using BY

Count of missing values of a column by group is obtained using PROC MEANS procedure by specifying BY DISTRICT (group).


/* Missing value of a COLUMN by group but in seperate table*/
proc sort data= EMP_DET; 
by DISTRICT; 
run; 
 
 
proc means data=EMP_DET NMISS; 
by DISTRICT; 
var salary_in_USD; 
RUN;

Count of Missing Values in SAS – Row wise & column wise 8

 

Count of Missing Values of a column by Group: Two or more Group

Count of missing values of a column by group is obtained using PROC MEANS procedure by specifying CLASS DISTRICT and EXP_IN_YEARS (multiple groups).


/* Missing value of A COLUMN by group - 2 or more group */ 

proc means data=EMP_DET NMISS; 
class DISTRICT EXP_IN_YEARS; 
var salary_in_USD; 
RUN;

 

so will be grouping the count of missing values by DISTRICT and EXP_IN_YEARS columns.

Count of Missing Values in SAS – Row wise & column wise 9

 

 

                                                                                                         

Author

  • Sridhar Venkatachalam

    With close to 10 years on Experience in data science and machine learning Have extensively worked on programming languages like R, Python (Pandas), SAS, Pyspark.