Cumulative product in pandas python – cumprod()

The cumulative product of a column in pandas is computed with the cumprod() function. The same function also gives the cumulative product of a column by group and the row wise cumulative product. Let’s see how to

  • Get the cumulative product of a column in pandas dataframe in python
  • Row wise cumulative product of the column in pandas python
  • Cumulative product of the column by group in pandas
  • Cumulative product of the column with NA values in pandas

[Figure: desired result, the cumulative product of a column]

 

Syntax of cumprod() function in pandas:

cumprod(axis=0|1, skipna=True, *args, **kwargs)

Parameters:
axis: 0 (or 'index') computes the product down each column; 1 (or 'columns') computes it across each row
skipna: Exclude NA/null values. If an entire row/column is NA, the result will be NA
Returns: a Series or DataFrame of the same size holding the cumulative product
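
To make the axis parameter concrete, here is a minimal sketch on a small made-up dataframe. The name `demo` and its values are assumptions chosen for illustration and are not part of the tutorial data.

```python
import pandas as pd

# Hypothetical 3x2 frame, used only to illustrate the axis parameter
demo = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# axis=0: running product down each column
down = demo.cumprod(axis=0)

# axis=1: running product across each row, left to right
across = demo.cumprod(axis=1)

print(down)    # column b becomes 4, 20, 120
print(across)  # column b becomes 4, 10, 18
```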

 

First let’s create a dataframe



import pandas as pd
import numpy as np

data = {'Product':['Box','Bottles','Pen','Markers','Bottles','Pen','Markers','Bottles','Box','Markers','Markers','Pen'],
       'State':['Alaska','California','Texas','North Carolina','California','Texas','Alaska','Texas','North Carolina','Alaska','California','Texas'],
       'Tax':[14,24,31,12,13,7,9,31,18,16,18,14],
       'Revenue':[240,300,340,180,230,332,345,560,430,np.nan,320,410]}

df1=pd.DataFrame(data, columns=['Product','State','Tax','Revenue'])
df1

df1 will be a dataframe of 12 rows with the columns Product, State, Tax and Revenue, where one Revenue value is missing (NaN).

 

Cumulative product of a column in a pandas dataframe python:

The cumulative product of a column is computed using the cumprod() function and stored in a new column named “cumulative_Tax”, as shown below. axis=0 means the product runs down the column, i.e. column wise cumulative product.


### Cumulative product of a dataframe column

df1['cumulative_Tax']=df1['Tax'].cumprod(axis = 0) 
df1

The resultant dataframe gains a cumulative_Tax column whose first three values are 14, 336 and 10416.
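
As a quick sanity check, the sketch below runs cumprod() on just the first three Tax values from the dataframe above. The variable names `tax` and `running` are assumptions introduced for this example only.

```python
import pandas as pd

# First three Tax values from df1, in a standalone Series
tax = pd.Series([14, 24, 31])

# Each entry is the product of all values up to and including that row
running = tax.cumprod()
print(running.tolist())  # [14, 336, 10416]

# The last cumulative value equals the plain product of the column
print(running.iloc[-1] == tax.prod())
```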

 

Cumulative product of a column by group in pandas:

The cumulative product of a column by group is computed by combining groupby() with cumprod(): the dataframe is grouped by “Product” and the running product restarts within each group. The results are stored in a new column named “cumulative_Tax_group”, as shown below.


### Cumulative product of the column by group

df1['cumulative_Tax_group']=df1.groupby(['Product'])['Tax'].cumprod()
df1

In the resultant dataframe the running product restarts for each product. For example, the two Box rows (Tax 14 and 18) get 14 and 252.
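
The grouping behaviour is easier to see on a handful of rows. The following is a small sketch on a made-up four-row frame; the name `sample` is an assumption for illustration, and the Tax values are borrowed from the Box and Pen rows above.

```python
import pandas as pd

# Hypothetical frame with two products interleaved
sample = pd.DataFrame({
    'Product': ['Box', 'Pen', 'Box', 'Pen'],
    'Tax': [14, 31, 18, 7],
})

# The running product is tracked separately for each Product,
# and the result keeps the original row order
sample['cumulative_Tax_group'] = sample.groupby('Product')['Tax'].cumprod()
print(sample)
# Box rows: 14, then 14 * 18 = 252
# Pen rows: 31, then 31 * 7 = 217
```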

 

 

Cumulative product of a column with NA values in a pandas dataframe python:

The cumulative product of a column with NA values is computed and stored in a new column named “cumulative_Revenue”, as shown below. By default (skipna=True) NA values are skipped: the row holding the NA stays NA in the result, and the cumulative product carries on with the remaining values.


### cumulative product of the column with NA

df1['cumulative_Revenue']=df1.Revenue.cumprod(axis = 0) 
df1

In the resultant dataframe, the row with the missing Revenue (index 9) has NaN in cumulative_Revenue, and the rows after it continue the running product.
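
To see what the skipna parameter changes, here is a minimal sketch on a short made-up Series. The names `revenue`, `skipped` and `propagated` are assumptions for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value in the middle
revenue = pd.Series([2.0, np.nan, 3.0])

# Default skipna=True: the NaN row stays NaN, later rows continue the product
skipped = revenue.cumprod()
print(skipped.tolist())     # [2.0, nan, 6.0]

# skipna=False: once a NaN is met, every later value is NaN as well
propagated = revenue.cumprod(skipna=False)
print(propagated.tolist())  # [2.0, nan, nan]
```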

 

 

[Figure: desired result, the row wise cumulative product]

Row wise cumulative product of dataframe in pandas:

The row wise cumulative product is computed by calling cumprod() with axis=1, which multiplies across the selected columns from left to right. The “Tax” column is left as it is and the “Revenue” column holds Tax multiplied by Revenue. The call returns a new dataframe; df1 itself is not modified.


### Row wise cumulative product

df1[['Tax','Revenue']].cumprod(axis=1) 

In the resultant dataframe the first row, for example, holds Tax 14 and Revenue 14 × 240 = 3360; the row with the missing Revenue stays NaN in the Revenue column.
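
If the row wise result should be kept, it can be assigned back to the dataframe. The sketch below does this on a made-up three-row frame; the name `sample` and the column name `Tax_x_Revenue` are assumptions for illustration, not part of the original tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical frame reusing a few Tax and Revenue values from df1
sample = pd.DataFrame({'Tax': [14, 24, 16],
                       'Revenue': [240.0, 300.0, np.nan]})

# axis=1 multiplies left to right: Tax stays, Revenue becomes Tax * Revenue
row_wise = sample[['Tax', 'Revenue']].cumprod(axis=1)

# Store the last column of the row wise result under a new name
sample['Tax_x_Revenue'] = row_wise['Revenue']
print(sample)
# Tax_x_Revenue is 3360.0 and 7200.0 for the first two rows, NaN for the third
```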

 


Other Related Topics:

Further details about the cumprod() function are in the pandas documentation.


Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, he has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.