Variance Function in Python pandas (Dataframe, Row and column wise Variance)

The var() function in Python pandas is used to calculate the variance of a data frame, the variance of a column and the variance of rows. For a plain set of numbers we use the var() function of the NumPy package. Let’s see an example of each. In this tutorial we will learn,

  • How to find the variance of a given set of numbers
  • How to find variance of a dataframe
  • How to find the variance of a column in dataframe
  • How to find row wise variance of a dataframe

Variance Function in Python pandas

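A simple variance calculation with NumPy is shown below. Note that np.var() returns the population variance (divisor n, ddof=0) by default, whereas pandas var() returns the sample variance (divisor n-1, ddof=1) by default.


# calculate variance
import numpy as np

print(np.var([1,9,5,6,8,7]))
print(np.var([4,-11,-5,16,5,7,9]))

output (rounded to four decimal places):

6.6667
69.1020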
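The two variance conventions differ only in the divisor. As a minimal sketch (the variable names are illustrative and not part of the original tutorial), the snippet below compares NumPy's default population variance with the sample variance obtained through ddof=1, and cross-checks both against Python's standard statistics module.

```python
import numpy as np
import statistics

data = [1, 9, 5, 6, 8, 7]

# Population variance: squared deviations divided by n (NumPy default, ddof=0)
pop_var = np.var(data)

# Sample variance: squared deviations divided by n - 1 (ddof=1)
sample_var = np.var(data, ddof=1)

# The standard library exposes the same two conventions
std_pop = statistics.pvariance(data)
std_sample = statistics.variance(data)

print(round(pop_var, 4), round(sample_var, 4))
print(std_pop, std_sample)
```

The sample variance is always the larger of the two for the same data, because it divides the same sum of squared deviations by a smaller number.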

 

Variance of a dataframe:

Create dataframe


import pandas as pd
import numpy as np

#Create a DataFrame
d = {
    'Name':['Alisa','Bobby','Cathrine','Madonna','Rocky','Sebastian','Jaqluine',
   'Rahul','David','Andrew','Ajay','Teresa'],
   'Score1':[62,47,55,74,31,77,85,63,42,32,71,57],
   'Score2':[89,87,67,55,47,72,76,79,44,92,99,69],
   'Score3':[56,86,77,45,73,62,74,89,71,67,97,68]}



df = pd.DataFrame(d)
print(df)

So the resultant dataframe will be

         Name  Score1  Score2  Score3
0       Alisa      62      89      56
1       Bobby      47      87      86
2    Cathrine      55      67      77
3     Madonna      74      55      45
4       Rocky      31      47      73
5   Sebastian      77      72      62
6    Jaqluine      85      76      74
7       Rahul      63      79      89
8       David      42      44      71
9      Andrew      32      92      67
10       Ajay      71      99      97
11     Teresa      57      69      68

 

Variance of the dataframe:


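# variance of the dataframe
df.var(numeric_only=True)

This calculates the sample variance of each numeric column of the dataframe (numeric_only=True skips the text column “Name”, which recent pandas versions would otherwise reject with an error), so the output will be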

Score1   304.363636
Score2   311.636364
Score3   206.083333
dtype: float64
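To make explicit what var() computes for each column, here is a minimal sketch that reproduces it by hand from the sample variance formula. The frame name scores and the variables manual, builtin and population are illustrative; only two of the score columns are used to keep it short.

```python
import pandas as pd

# Two of the score columns from the tutorial data
scores = pd.DataFrame({
    'Score1': [62, 47, 55, 74, 31, 77, 85, 63, 42, 32, 71, 57],
    'Score2': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
})

n = len(scores)

# Sample variance by hand: squared deviations from the column mean, divided by n - 1
manual = ((scores - scores.mean()) ** 2).sum() / (n - 1)

# pandas default (ddof=1) gives the same numbers
builtin = scores.var()

# Passing ddof=0 switches to the population variance (divisor n)
population = scores.var(ddof=0)

print(manual)
print(builtin)
print(population)
```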

 

Column variance of the dataframe:


# column variance of the dataframe

df.var(axis=0, numeric_only=True)

The axis=0 argument (the default) calculates the column wise variance of the dataframe, so the result will be

Score1   304.363636
Score2   311.636364
Score3   206.083333
dtype: float64

 

Row variance of the dataframe:


# Row variance of the dataframe

df.var(axis=1, numeric_only=True)

The axis=1 argument calculates the row wise variance of the dataframe, that is, the variance of the three scores of each person, so the result will be

0     309.000000
1     520.333333
2     121.333333
3     217.000000
4     449.333333
5      58.333333
6      34.333333
7     172.000000
8     262.333333
9     908.333333
10    244.000000
11     44.333333
dtype: float64
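A common follow-up is to keep the row wise variance next to the data it was computed from. The sketch below is one possible way to do that, assuming a shortened copy of the tutorial data; the names df_small, score_cols and Score_var are illustrative.

```python
import pandas as pd

# First three rows of the tutorial data
df_small = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'],
    'Score1': [62, 47, 55],
    'Score2': [89, 87, 67],
    'Score3': [56, 86, 77],
})

# Select the numeric columns explicitly so the text column is never involved
score_cols = ['Score1', 'Score2', 'Score3']

# Sample variance across the three scores of each row, stored as a new column
df_small['Score_var'] = df_small[score_cols].var(axis=1)

print(df_small)
```

Selecting the score columns by name, rather than relying on numeric_only, also keeps any unrelated numeric columns (an ID column, for instance) out of the calculation.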

 

Calculate the variance of the specific Column


# variance of the specific column
df.loc[:,"Score1"].var()

The above code calculates the sample variance of the “Score1” column, so the result will be

304.36363636363637
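The same column can be selected in other ways as well. As a hedged sketch (the variable names are illustrative), the snippet below uses plain bracket selection, asks for the population variance with ddof=0, and computes the variance of several chosen columns at once.

```python
import pandas as pd

df_scores = pd.DataFrame({
    'Score1': [62, 47, 55, 74, 31, 77, 85, 63, 42, 32, 71, 57],
    'Score2': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
    'Score3': [56, 86, 77, 45, 73, 62, 74, 89, 71, 67, 97, 68],
})

# Bracket selection returns the same Series as df.loc[:, "Score1"]
sample_var = df_scores['Score1'].var()

# Population variance of the same column (divisor n instead of n - 1)
population_var = df_scores['Score1'].var(ddof=0)

# Variance of a chosen subset of columns, returned as a Series
subset_var = df_scores[['Score1', 'Score3']].var()

print(sample_var)
print(population_var)
print(subset_var)
```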

 
