Extract or keep only numeric values in a pandas column

In this section we focus on how to extract or keep only the numeric values of a pandas column, thereby removing all the character values from it. We will keep only the numeric values in a specific column, using several methods, which are listed below.

  • Extract only numeric values in a pandas column using the extract() function
  • Keep only numeric values in a pandas column using the replace() function
  • Keep only numeric values in a pandas column using the replace() function with a regular expression
  • Extract only numeric values in a pandas column using the isdigit() function

 

Create Dataframe

 

## create dataframe

import pandas as pd

# Create a DataFrame with an alphanumeric StudentID column
d = {
    'StudentID': ['Alisa819', 'Bobby212', 'Cathrine891', 'Jodha982', 'Raghu453', 'Ram834'],
    'Maths': [76, 73, 83, 93, 89, 94],
    'Science': [85, 41, 55, 75, 81, 97],
    'Geography': [78, 65, 55, 88, 87, 98],
}

df = pd.DataFrame(d, columns=['StudentID', 'Maths', 'Science', 'Geography'])
df

Dataframe:

[Image: the created dataframe, with columns StudentID, Maths, Science and Geography]

 

 

Keep only numeric values in a specific pandas column using the extract() function

In the method below, we use the str.extract() function with a regular expression that captures the first run of digits in each value, i.e. the numeric part, and then convert the resulting column to integer.

 

### Method 1 : extract() function

# raw string, so that \d is passed to the regex engine unchanged
df['StudentID'] = df['StudentID'].str.extract(r'(\d+)', expand=False)
df['StudentID'] = df['StudentID'].astype(int)

df

So, the resultant dataframe will be:

[Image: resultant dataframe, with StudentID reduced to its numeric part]
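The astype(int) step above assumes that every StudentID contains at least one digit. When a value has none, extract() returns a missing value and the integer conversion raises an error. Below is a minimal sketch of one way to handle that case. It assumes a made-up extra ID, 'Guest', and uses pd.to_numeric() with pandas' nullable Int64 dtype, neither of which is part of the original example.

```python
import pandas as pd

# 'Guest' is a hypothetical ID with no digits, added only for illustration
ids = pd.Series(['Alisa819', 'Bobby212', 'Guest'])

# extract() yields a missing value where the pattern does not match
digits = ids.str.extract(r'(\d+)', expand=False)

# errors='coerce' keeps the gap as a missing value; 'Int64' is the nullable
# integer dtype, so the column stays integer-typed despite the gap
numeric_ids = pd.to_numeric(digits, errors='coerce').astype('Int64')
print(numeric_ids)
```

Whether a missing ID should be kept, dropped or filled depends on the data, so treat this as a starting point only.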

 

 

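Keep only numeric values in a pandas column using the replace() function: Method 1

In the method below, we use the str.replace() function with a regular expression that replaces every run of non-digit characters with the empty string "", and then convert the resulting column to integer. Each method is shown on its own, so recreate the dataframe before running it: once StudentID has been converted to integer, the .str accessor no longer works on it.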

 

 

#### Method 1 : replace() function with regex()

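# regex=True is needed in pandas 2.0 and later, where str.replace() matches literally by default
df['StudentID'] = df['StudentID'].str.replace(r'\D+', '', regex=True)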
df['StudentID'] = df['StudentID'].astype(int)

df

So, the resultant dataframe will be:

[Image: resultant dataframe, with StudentID reduced to its numeric part]
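Whether the pattern is treated as a regular expression depends on the regex argument of str.replace(); as of pandas 2.0 its default is False. The short sketch below, run on a standalone Series rather than on the dataframe above, shows the difference between the two settings.

```python
import pandas as pd

ids = pd.Series(['Alisa819', 'Bobby212'])

# regex=False: r'\D+' is searched for as literal text, so nothing changes
literal = ids.str.replace(r'\D+', '', regex=False)

# regex=True: every run of non-digit characters is removed
cleaned = ids.str.replace(r'\D+', '', regex=True)

print(literal.tolist())
print(cleaned.tolist())
```

Passing regex explicitly keeps the behaviour the same across pandas versions.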

 

Keep only numeric values in a pandas column using the replace() function: Method 2

In the method below, we use the str.replace() function with the character class [^0-9]+, which matches every run of characters that are not digits, i.e. the non-numeric values, and replaces it with the empty string "". We then convert the resulting column to integer.

 

### Method 2 : replace() function with regex()


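# regex=True is needed in pandas 2.0 and later, where str.replace() matches literally by default
df['StudentID'] = df['StudentID'].str.replace(r'[^0-9]+', '', regex=True)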
df['StudentID'] = df['StudentID'].astype(int)

df

So, the resultant dataframe will be:

[Image: resultant dataframe, with StudentID reduced to its numeric part]
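One side effect of the final astype(int) step is that leading zeros are lost. If the numeric part is really an identifier rather than a quantity, keeping the cleaned column as text may be preferable. A small sketch, assuming a made-up ID 'Bobby007' that does not appear in the dataframe above:

```python
import pandas as pd

# 'Bobby007' is a hypothetical ID chosen to show the effect
ids = pd.Series(['Alisa819', 'Bobby007'])

# cleaned values are still strings at this point
as_text = ids.str.replace(r'[^0-9]+', '', regex=True)

# converting to integer drops the leading zeros of '007'
as_int = as_text.astype(int)

print(as_text.tolist())
print(as_int.tolist())
```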

 

 

 

Keep only numeric values in a pandas column using the replace() function: Method 3

In the method below, we use the str.replace() function with a regular expression that replaces the letters a-z and A-Z with the empty string "", and then convert the resulting column to integer.

 

 

### Method 3 : replace() function with regex()

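# regex=True is needed in pandas 2.0 and later, where str.replace() matches literally by default
df['StudentID'] = df['StudentID'].str.replace(r'[a-zA-Z]', '', regex=True)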
df['StudentID'] = df['StudentID'].astype(int)

df

So, the resultant dataframe will be:

[Image: resultant dataframe, with StudentID reduced to its numeric part]
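Unlike the two earlier patterns, [a-zA-Z] removes only letters, so separators and other symbols stay in the value and can break or alter the integer conversion. The sketch below compares the two approaches on made-up IDs containing a hyphen and an underscore; these values are assumptions for illustration and are not in the dataframe above.

```python
import pandas as pd

# hypothetical IDs containing a hyphen and an underscore
ids = pd.Series(['Ram-834', 'Raghu_453'])

# removes letters only: the hyphen and the underscore remain
letters_removed = ids.str.replace(r'[a-zA-Z]', '', regex=True)

# removes every non-digit character, separators included
non_digits_removed = ids.str.replace(r'\D+', '', regex=True)

print(letters_removed.tolist())
print(non_digits_removed.tolist())
```

For the IDs used in this section both patterns give the same result, because those IDs contain only letters and digits.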

 

 

Keep only numeric values in a pandas column using the isdigit() function

In the method below, we apply the isdigit() function to each character of every value in the column. The characters that are digits are joined back together to form the new value, and the resulting column is then converted to integer.

 
### Method 4 : isdigit() function


df['StudentID'] = df['StudentID'].map(lambda x: ''.join([i for i in x if i.isdigit()]))
df['StudentID'] = df['StudentID'].astype(int)

df

So, the resultant dataframe will be:

[Image: resultant dataframe, with StudentID reduced to its numeric part]
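Python's str.isdigit() also returns True for some non-ASCII digit characters, such as the superscript '²', which int() cannot parse. If the data may contain such characters, one option is to test against the characters 0-9 explicitly. The sketch below assumes a made-up ID 'Ram²834' and a hypothetical helper keep_ascii_digits(); neither is part of the original example.

```python
import pandas as pd

def keep_ascii_digits(value):
    """Hypothetical helper: keep only the characters 0-9 of a string."""
    return ''.join(ch for ch in value if ch in '0123456789')

# 'Ram²834' is a made-up ID containing a superscript two
ids = pd.Series(['Alisa819', 'Ram²834'])

# isdigit() keeps the superscript, which would make astype(int) fail
with_isdigit = ids.map(lambda x: ''.join(ch for ch in x if ch.isdigit()))

# the explicit 0-9 test drops it, so the conversion succeeds
ascii_only = ids.map(keep_ascii_digits).astype(int)

print(with_isdigit.tolist())
print(ascii_only.tolist())
```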

 

Author

  • Sridhar Venkatachalam

    Close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages such as R, Python (Pandas), SAS and PySpark.