Index, Select and Filter dataframe in pandas python

In this section we will learn how to index a dataframe in pandas python with examples, and how to select and filter rows and columns by column name and by position using .loc[] and .iloc[]. The older .ix[] indexer is also shown; it was deprecated in pandas 0.20 and removed in pandas 1.0.

Create dataframe:


import pandas as pd
import numpy as np

#Create a DataFrame
d = {
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine',
             'Alisa', 'Bobby', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine'],
    'Exam': ['Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1',
             'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2'],
    'Subject': ['Mathematics', 'Mathematics', 'Mathematics', 'Science', 'Science', 'Science',
                'Mathematics', 'Mathematics', 'Mathematics', 'Science', 'Science', 'Science'],
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81]}

df = pd.DataFrame(d, columns=['Name', 'Exam', 'Subject', 'Score'])
df

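so the resultant dataframe will be

        Name        Exam      Subject  Score
0      Alisa  Semester 1  Mathematics     62
1      Bobby  Semester 1  Mathematics     47
2   Cathrine  Semester 1  Mathematics     55
3      Alisa  Semester 1      Science     74
4      Bobby  Semester 1      Science     31
5   Cathrine  Semester 1      Science     77
6      Alisa  Semester 2  Mathematics     85
7      Bobby  Semester 2  Mathematics     63
8   Cathrine  Semester 2  Mathematics     42
9      Alisa  Semester 2      Science     67
10     Bobby  Semester 2      Science     89
11  Cathrine  Semester 2      Science     81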

View a column of the dataframe in pandas python:

df['Name']


View two columns of the dataframe in pandas:

df[['Name', 'Score']]
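The two selections above return different types, which matters for what you can do with the result. The following is a minimal sketch that rebuilds the same dataframe in compact form (the variable names name_series, name_frame and name_and_score are only illustrative):

```python
import pandas as pd

# Rebuild the dataframe from the "Create dataframe" step in compact form
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

name_series = df['Name']               # single label -> pandas Series
name_frame = df[['Name']]              # list with one label -> DataFrame
name_and_score = df[['Name', 'Score']] # list of labels -> DataFrame

print(type(name_series).__name__)
print(type(name_frame).__name__)
print(name_and_score.shape)
```

Passing a list of column names always gives back a dataframe, even when the list has a single entry.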


View first two rows of the dataframe in pandas:

df[:2]


Filter in Pandas dataframe:

View all rows where the score is greater than 70:


df[df['Score'] > 70]


View all rows where the score is greater than 70 and less than 85:


df[(df['Score'] > 70) & (df['Score'] < 85)]
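Each condition in a combined filter needs its own parentheses, because & and | bind more tightly than comparison operators in Python. The sketch below repeats the filters above on the same data and adds a few variations (OR, NOT, isin and DataFrame.query); the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the dataframe from the "Create dataframe" step in compact form
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# AND: both conditions must hold (rows 3, 5 and 11)
mid_scores = df[(df['Score'] > 70) & (df['Score'] < 85)]

# The same filter written as a query string
mid_scores_query = df.query('Score > 70 and Score < 85')

# OR: Science rows, or any row with a score below 50
science_or_low = df[(df['Subject'] == 'Science') | (df['Score'] < 50)]

# NOT: every row except Alisa's
not_alisa = df[~(df['Name'] == 'Alisa')]

# Membership test against a list of values
alisa_or_bobby = df[df['Name'].isin(['Alisa', 'Bobby'])]

print(mid_scores)
print(len(science_or_low), len(not_alisa), len(alisa_or_bobby))
```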


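Indexing with .ix:

.ix[] indexes a dataframe by both label and position. It was deprecated in pandas 0.20 and removed in pandas 1.0, so the .ix examples here run only on older versions. On current versions use .loc[] for labels and .iloc[] for positions.

View a column in pandas


df.ix[:,'Score']    # pandas < 1.0 only; current equivalent: df.loc[:,'Score']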


View the value based on row and column


df.ix[3,2]    # pandas < 1.0 only; current equivalent: df.iloc[3,2]

Output:

'Science'
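On pandas versions without .ix, the two lookups above can be written with .loc and .iloc. This is a minimal sketch on the same data; the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the dataframe from the "Create dataframe" step in compact form
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Label-based: all rows, column named 'Score'
score_column = df.loc[:, 'Score']

# Position-based: 4th row, 3rd column (both zero-based)
value_by_position = df.iloc[3, 2]

# Mixed lookup: row label 3, column at position 2 translated to its label
value_mixed = df.loc[3, df.columns[2]]

print(value_by_position)  # Science
print(value_mixed)        # Science
```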

 

Select rows by row number in pandas with .iloc:

.iloc[i:m, j:n] selects rows and columns by integer position. Positions are zero-based and the end of each slice is excluded, so this returns rows i to m-1 and columns j to n-1.

# select first 2 rows

df.iloc[:2]

# or

df.iloc[:2,]


# select 3rd to 5th rows


df.iloc[2:5]

# or

df.iloc[2:5,]


# select all rows starting from third row

df.iloc[2:]

# or

df.iloc[2:,]
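The row slices above count positions from zero and leave out the end position, just like Python list slicing. The sketch below checks which row labels each slice returns and shows that .iloc keeps using positions even after the rows are reordered (the sort by Score is an added illustration, not part of the original examples).

```python
import pandas as pd

# Rebuild the dataframe from the "Create dataframe" step in compact form
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

first_two = df.iloc[:2]        # positions 0 and 1
third_to_fifth = df.iloc[2:5]  # positions 2, 3 and 4 (5 is excluded)
last_two = df.iloc[-2:]        # negative positions count from the end
single_row = df.iloc[2]        # a single position returns a Series

# After sorting, positions no longer match the index labels
by_score = df.sort_values('Score')
lowest_two = by_score.iloc[:2]  # the two lowest scores, whatever their labels

print(list(third_to_fifth.index))  # [2, 3, 4]
print(list(lowest_two.index))      # [4, 8]
```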


Select columns by column number in pandas with .iloc:


# select first 2 columns
df.iloc[:,:2]


# select 1st and 4th columns

df.iloc[:,[0,3]]
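Row and column positions can also be combined in a single .iloc call. The following sketch, on the same data, selects a block of rows and columns and a single cell; the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the dataframe from the "Create dataframe" step in compact form
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# First 3 rows, 1st and 4th columns (Name and Score)
names_and_scores = df.iloc[:3, [0, 3]]

# 3rd to 5th rows, 2nd and 3rd columns (Exam and Subject)
exam_and_subject = df.iloc[2:5, 1:3]

# A single cell: first row, 4th column
first_score = df.iloc[0, 3]

print(names_and_scores)
print(list(exam_and_subject.columns))  # ['Exam', 'Subject']
print(first_score)                     # 62
```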


Select values by row label and column name in pandas with .loc:

.loc[[row_labels], [column_names]] selects rows and columns by label. In this dataframe the row labels are the default integer index 0 to 11, so they happen to coincide with the row positions.

# select rows and columns by label using loc

df.loc[[1,2,3,4,5],['Name','Score']]
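Unlike .iloc, a label slice in .loc includes its end label, and .loc also accepts a boolean condition for the rows. The sketch below illustrates both on the same data; setting Name as the index is an added illustration to show non-integer row labels, and the variable names are only illustrative.

```python
import pandas as pd

# Rebuild the dataframe from the "Create dataframe" step in compact form
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Label slice 1:5 includes label 5, so it matches the list [1, 2, 3, 4, 5]
by_slice = df.loc[1:5, ['Name', 'Score']]

# Boolean condition for the rows, column names for the columns
high_scores = df.loc[df['Score'] > 70, ['Name', 'Score']]

# With names as row labels, .loc looks rows up by name
by_name = df.set_index('Name')
alisa_scores = by_name.loc['Alisa', 'Score']

print(list(by_slice.index))   # [1, 2, 3, 4, 5]
print(high_scores)
print(alisa_scores.tolist())  # [62, 74, 85, 67]
```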


Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning, Sridhar has worked extensively with programming languages such as R, Python (pandas), SAS and PySpark.