Variance Function in Python pandas (Dataframe, Row and column wise Variance)

The var() function in pandas calculates the variance of a data frame, of a column (column wise variance) and of rows (row wise variance), while NumPy's np.var() calculates the variance of a given set of numbers. Let's see an example of each. The examples use the numpy and pandas packages. In this tutorial we will learn,

  • How to find the variance of a given set of numbers
  • How to find variance of a dataframe in pandas python
  • How to find the variance of a column in pandas dataframe
  • How to find row wise variance of a pandas dataframe

Syntax of variance Function in python

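DataFrame.var(axis=0, skipna=True, ddof=1, numeric_only=False)

Parameters :

axis : {index (0), columns (1)}. axis=0 calculates the variance of each column; axis=1 calculates the variance of each row.

skipna : Exclude NA/null values when computing the result. Defaults to True.

level : Available only in older pandas versions (removed in pandas 2.0). If the axis is a MultiIndex (hierarchical), compute along a particular level, collapsing into a Series.

ddof : Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. The default ddof=1 gives the sample variance; ddof=0 gives the population variance.

numeric_only : Include only float, int, boolean columns. Defaults to False in pandas 2.0 and later, so a dataframe with non-numeric columns needs numeric_only=True or an explicit column selection.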
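The effect of ddof and skipna can be seen on a small example. This is a minimal sketch on made-up data; the frame `df_demo` and its columns `a` and `b` are hypothetical and are not part of the tutorial's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: column 'b' contains one missing value
df_demo = pd.DataFrame({
    'a': [2.0, 4.0, 4.0, 6.0],
    'b': [2.0, 4.0, np.nan, 6.0]})

var_sample = df_demo.var()                # ddof=1 (default): divisor is N - 1
var_population = df_demo.var(ddof=0)      # divisor is N
var_keep_nan = df_demo.var(skipna=False)  # missing values propagate as NaN

print(var_sample)
print(var_population)
print(var_keep_nan)
```

With skipna left at its default, N counts only the non-missing values of each column, so column `b` is treated as three observations.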

Variance Function in Python pandas

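A simple variance calculation using NumPy is shown below. Note that np.var() divides by N by default (population variance, ddof=0), whereas the pandas var() divides by N - 1 (sample variance, ddof=1).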


# calculate variance
import numpy as np

print(np.var([1,9,5,6,8,7]))
print(np.var([4,-11,-5,16,5,7,9]))

output (shown rounded to four decimal places):

6.6667
69.1020
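To make the definition concrete, the following sketch computes the variance of the first list by hand and compares it with np.var(). The variable names are illustrative only.

```python
import numpy as np

data = [1, 9, 5, 6, 8, 7]

# Variance from the definition: average squared deviation from the mean
mean = sum(data) / len(data)
squared_dev = [(x - mean) ** 2 for x in data]

population_var = sum(squared_dev) / len(data)    # divisor N
sample_var = sum(squared_dev) / (len(data) - 1)  # divisor N - 1

print(population_var, np.var(data))      # population variance, two ways
print(sample_var, np.var(data, ddof=1))  # sample variance, two ways
```

Passing ddof=1 to np.var() makes it agree with the pandas default.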

 

 

Variance of a dataframe in pandas python:

Create dataframe


import pandas as pd
import numpy as np

#Create a DataFrame
d = {
    'Name':['Alisa','Bobby','Cathrine','Madonna','Rocky','Sebastian','Jaqluine',
   'Rahul','David','Andrew','Ajay','Teresa'],
   'Score1':[62,47,55,74,31,77,85,63,42,32,71,57],
   'Score2':[89,87,67,55,47,72,76,79,44,92,99,69],
   'Score3':[56,86,77,45,73,62,74,89,71,67,97,68]}



df = pd.DataFrame(d)
print(df)

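So the resultant dataframe will be

         Name  Score1  Score2  Score3
0       Alisa      62      89      56
1       Bobby      47      87      86
2    Cathrine      55      67      77
3     Madonna      74      55      45
4       Rocky      31      47      73
5   Sebastian      77      72      62
6    Jaqluine      85      76      74
7       Rahul      63      79      89
8       David      42      44      71
9      Andrew      32      92      67
10       Ajay      71      99      97
11     Teresa      57      69      68
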
Variance of the dataframe in pandas python:


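# variance of the dataframe (numeric columns only)
df.var(numeric_only=True)

This calculates the variance of each numeric column. Passing numeric_only=True skips the non-numeric “Name” column, which pandas 2.0 and later would otherwise reject with a TypeError. The output will be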

Score1   304.363636
Score2   311.636364
Score3   206.083333
dtype: float64
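Because the dataframe also holds the text column “Name”, it can help to restrict the calculation to the score columns explicitly. The sketch below shows two common ways of doing that; it repeats the tutorial's data so that it runs on its own, and the result names `var_by_dtype` and `var_by_name` are illustrative only.

```python
import pandas as pd

# Same data as the tutorial's dataframe, repeated so the sketch is self-contained
d = {
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Madonna', 'Rocky', 'Sebastian',
             'Jaqluine', 'Rahul', 'David', 'Andrew', 'Ajay', 'Teresa'],
    'Score1': [62, 47, 55, 74, 31, 77, 85, 63, 42, 32, 71, 57],
    'Score2': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
    'Score3': [56, 86, 77, 45, 73, 62, 74, 89, 71, 67, 97, 68]}
df = pd.DataFrame(d)

# Option 1: keep only the numeric columns, selected by dtype
var_by_dtype = df.select_dtypes(include='number').var()

# Option 2: name the columns explicitly
var_by_name = df[['Score1', 'Score2', 'Score3']].var()

print(var_by_dtype)
print(var_by_name)
```

Both options should give the same three values as the output above.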

 

 

Column variance of the dataframe in pandas:


# column variance of the dataframe

df.var(axis=0, numeric_only=True)

The axis=0 argument, which is also the default, calculates the column wise variance of the dataframe, so the result will be

Score1   304.363636
Score2   311.636364
Score3   206.083333
dtype: float64

 

 

Row variance of the dataframe in pandas:


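# Row variance of the dataframe

df.var(axis=1, numeric_only=True)

The axis=1 argument calculates the row wise variance of the dataframe, here across the three score columns of each row, so the result will be

0     309.000000
1     520.333333
2     121.333333
3     217.000000
4     449.333333
5      58.333333
6      34.333333
7     172.000000
8     262.333333
9     908.333333
10    244.000000
11     44.333333
dtype: float64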
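A row wise result is often stored back in the dataframe as a new column. The following sketch does this for the first three rows of the tutorial's data; the column name `RowVar` is a hypothetical choice.

```python
import pandas as pd

# First three rows of the tutorial's data, repeated so the sketch is self-contained
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'],
    'Score1': [62, 47, 55],
    'Score2': [89, 87, 67],
    'Score3': [56, 86, 77]})

score_cols = ['Score1', 'Score2', 'Score3']

# Sample variance (ddof=1) of the three scores in each row
df['RowVar'] = df[score_cols].var(axis=1)

print(df)
```

Selecting the score columns first keeps the “Name” column out of the calculation without relying on numeric_only.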

 

 

Calculate the variance of the specific Column in pandas


# variance of the specific column
df.loc[:,"Score1"].var()

The above code calculates the sample variance of the “Score1” column, so the result will be

304.36363636363637
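The same column can be selected in other ways, and the result relates directly to the standard deviation. This is a minimal sketch using only the “Score1” values from the tutorial; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Score1': [62, 47, 55, 74, 31, 77, 85, 63, 42, 32, 71, 57]})

var_loc = df.loc[:, 'Score1'].var()         # selection as used in the tutorial
var_bracket = df['Score1'].var()            # equivalent bracket selection
var_population = df['Score1'].var(ddof=0)   # divisor N instead of N - 1
std_sample = df['Score1'].std()             # square root of the sample variance

print(var_loc, var_bracket, var_population, std_sample)
```

Squaring `std_sample` should recover `var_loc` up to floating point rounding.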

 


Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.