Get the index or position of substring in a column of python dataframe – pandas

In this tutorial we will learn how to get the index (position) of a substring in a column of a dataframe in python – pandas.

We will use Python's find() string method to get the position of the substring.

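Syntax of the find() function:

str.find(sub, beg=0, end=len(string))

Here sub is the substring to search for. beg and end are optional and limit the search to string[beg:end]; both are passed by position, not by keyword.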
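As a quick illustration of the return values on a single string, here is a minimal sketch. The sample string is one of the values used in the dataframe of this tutorial, and the variable names are only illustrative.

```python
text = "quarter3 Revenue"

# Substring present: find() returns the index of its first character.
pos_found = text.find("3 Rev")

# Substring absent: find() returns -1 instead of raising an error.
pos_missing = text.find("4 Rev")

# beg and end are positional; the search is limited to text[0:5] ("quart").
pos_bounded = text.find("Rev", 0, 5)

print(pos_found, pos_missing, pos_bounded)  # 7 -1 -1
```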

 

Example of indexing a substring in a column:

Create a dataframe:

# Create dataframe

import pandas as pd
d = {'Quarters': ['quarter1 Revenue', 'quarter2 Revenue', 'quarter3 Revenue', 'quarter4 Revenue'],
     'Revenue': [23400344.567, 54363744.678, 56789117.456, 4132454.987]}
df = pd.DataFrame(d)
print(df)

The resultant dataframe will be:

[Image: index of a substring of dataframe in python 1]

 

Indexing a substring of a column in dataframe Example:

# Index of a substring of dataframe in Python

# list() is needed in Python 3, where map() returns a lazy iterator
df['Index'] = list(map(lambda x: x.find('3 Rev'), df['Quarters']))
print(df)

With the find() function we look for the position of the substring “3 Rev” in the Quarters column of the df dataframe and store the result in an Index column.

When the substring is found, its starting position is returned.

When the substring is not found, -1 is returned. Here only the third row contains “3 Rev”, at position 7, so every other row gets -1. The resultant dataframe will be:

[Image: index of a substring of dataframe in python 2]
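The same result can also be obtained without map() and a lambda, using the vectorized string accessor of pandas. The following is a minimal sketch, assuming a reasonably recent pandas version; it rebuilds only the Quarters column so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    "Quarters": ["quarter1 Revenue", "quarter2 Revenue",
                 "quarter3 Revenue", "quarter4 Revenue"],
})

# Series.str.find applies str.find to every element of the column.
df["Index"] = df["Quarters"].str.find("3 Rev")

print(df["Index"].tolist())  # [-1, -1, 7, -1]
```

This form tends to be easier to read than map() with a lambda.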

 

Indexing a substring of a column in dataframe with beg and end:

# Index of a substring of dataframe in Python with beginning and end

# list() is needed in Python 3, where map() returns a lazy iterator
df['Index'] = list(map(lambda x: x.find('quar', 0, 5), df['Quarters']))
print(df)

With the find() function we look for the position of the substring “quar”, with the beg and end parameters set to 0 and 5, in the Quarters column of the df dataframe and store the result in an Index column.

When the substring is found within that range, its starting position is returned.

When the substring is not found within that range, -1 is returned. The resultant dataframe will be:

[Image: index of a substring of dataframe in python 3]

Here the substring “quar” is found at position 0 in every row of the Quarters column, so 0 is returned for all the rows.
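To see how the beg and end bounds affect the result, the sketch below searches the same column values with and without a window. It uses Series.str.find rather than map(), and the variable names are only illustrative.

```python
import pandas as pd

quarters = pd.Series(["quarter1 Revenue", "quarter2 Revenue",
                      "quarter3 Revenue", "quarter4 Revenue"])

# "quar" starts at position 0, inside the window [0, 5).
in_window = quarters.str.find("quar", 0, 5)

# "Rev" starts at position 9, outside the window, so every row gets -1.
out_of_window = quarters.str.find("Rev", 0, 5)

# Without bounds the same substring is found at position 9.
unbounded = quarters.str.find("Rev")

print(in_window.tolist())      # [0, 0, 0, 0]
print(out_of_window.tolist())  # [-1, -1, -1, -1]
print(unbounded.tolist())      # [9, 9, 9, 9]
```

A substring that exists in the full string is still reported as -1 when it lies outside the searched range.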


Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.