Index, Select and Filter dataframe in pandas python: In this section we will learn how to index a dataframe in pandas with examples, and how to select and filter rows and columns by column name and column position using .loc[] and .iloc[]. The older .ix[] indexer is also covered; it was deprecated in pandas 0.20 and removed in pandas 1.0.
Create dataframe:
import pandas as pd
import numpy as np

# Create a DataFrame
d = {
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine',
             'Alisa', 'Bobby', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine'],
    'Exam': ['Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1',
             'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2'],
    'Subject': ['Mathematics', 'Mathematics', 'Mathematics', 'Science', 'Science', 'Science',
                'Mathematics', 'Mathematics', 'Mathematics', 'Science', 'Science', 'Science'],
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81]}
df = pd.DataFrame(d, columns=['Name', 'Exam', 'Subject', 'Score'])
df
The resulting dataframe has 12 rows (index 0 to 11) and four columns: Name, Exam, Subject and Score.
View a column of the dataframe in pandas python:
df['Name']
View two columns of the dataframe in pandas:
df[['Name', 'Score']]
Output: a dataframe with all 12 rows and only the Name and Score columns.
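The two selections above return different types, which matters for later steps. The sketch below rebuilds the same data in a shorter form (the compact constructor is an illustration, not the original code) and checks what each selection returns.

```python
import pandas as pd

# Same data as the tutorial's dataframe, built more compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

name_series = df['Name']           # a single label returns a Series
name_frame = df[['Name']]          # a list of labels returns a DataFrame
two_cols = df[['Name', 'Score']]   # two columns, all rows

print(type(name_series).__name__)  # Series
print(type(name_frame).__name__)   # DataFrame
print(two_cols.shape)              # (12, 2)
```

Passing a list, even a one-element list, is the usual way to keep the result two-dimensional.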
View first two rows of the dataframe in pandas:
df[:2]
Output: the rows with index 0 and 1 (Alisa with a score of 62 and Bobby with a score of 47).
Filter in Pandas dataframe:
View all rows where the score is greater than 70
df[df['Score'] > 70]
Output: the five rows with index 3, 5, 6, 10 and 11 (scores 74, 77, 85, 89 and 81).
View all the rows where the score is greater than 70 and less than 85
df[(df['Score'] > 70) & (df['Score'] < 85)]
Output: the three rows with index 3, 5 and 11 (scores 74, 77 and 81).
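The filters above work because a comparison on a column produces a boolean Series, which is then used as a row mask. The following sketch, which rebuilds the same data compactly and uses illustrative variable names, shows the masks on their own. The parentheses around each comparison are needed because `&` binds more tightly than `>` and `<` in Python.

```python
import pandas as pd

# Same data as the tutorial's dataframe, built more compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

high = df['Score'] > 70               # boolean Series, one value per row
in_band = high & (df['Score'] < 85)   # element-wise AND of two masks
not_high = ~high                      # element-wise NOT

print(high.sum())                     # 5 rows score above 70
print(df[in_band].index.tolist())     # [3, 5, 11]
print(df[not_high].shape[0])          # 7 rows score 70 or below
```

Use `|` for element-wise OR; the Python keywords `and` and `or` do not work on Series.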
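Indexing with .ix:
.ix[] indexed a dataframe by both label and position. It was deprecated in pandas 0.20 and removed in pandas 1.0. On current versions use .loc[] (label) or .iloc[] (position) instead; calls written with .ix raise an AttributeError.
View a column in pandas
df.loc[:, 'Score']  # formerly df.ix[:, 'Score']
Output: the Score column as a Series of 12 values, index 0 to 11.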
View the value based on row and column
df.iloc[3, 2]  # formerly df.ix[3, 2]; .ix was removed in pandas 1.0
Output:
'Science'
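Since .ix is unavailable from pandas 1.0 onward, code that mixed labels and positions has to be rewritten. One possible translation is sketched below on the same data, rebuilt compactly; the variable names are illustrative.

```python
import pandas as pd

# Same data as the tutorial's dataframe, built more compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Column by label (replaces df.ix[:, 'Score']).
scores = df.loc[:, 'Score']

# Single value by row position and column position (replaces df.ix[3, 2]).
by_position = df.iloc[3, 2]

# Row by position, column by label: convert one side so both match.
via_loc = df.loc[df.index[3], 'Subject']
via_iloc = df.iloc[3, df.columns.get_loc('Subject')]

print(by_position, via_loc, via_iloc)  # Science Science Science
```

Converting explicitly with `df.index[...]` or `df.columns.get_loc(...)` avoids the ambiguity that led to .ix being removed.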
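Select rows by row number in pandas with .iloc
.iloc[:m, :n] selects rows and columns by integer position: here the first m rows and the first n columns. Positions are zero-based and the end of a slice is excluded, so .iloc[0:2] returns the rows at positions 0 and 1.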
# select first 2 rows
df.iloc[:2]
# or
df.iloc[:2,]
Output: the rows at positions 0 and 1.
# select 3rd to 5th rows
df.iloc[2:5]
# or
df.iloc[2:5,]
Output: the rows at positions 2, 3 and 4 (scores 55, 74 and 31).
# select all rows starting from third row
df.iloc[2:]
# or
df.iloc[2:,]
Output: the ten rows at positions 2 to 11.
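With the default 0 to 11 index, positions and labels happen to coincide, which can hide what .iloc actually does. The sketch below sorts the same data by score (an illustrative step that is not part of the original walkthrough) so that the two no longer line up.

```python
import pandas as pd

# Same data as the tutorial's dataframe, built more compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

# Sorting keeps the original index labels but changes the row order.
ranked = df.sort_values('Score', ascending=False)

print(ranked['Score'].iloc[0])            # 89, the first row by position
print(ranked['Score'].loc[0])             # 62, the row labelled 0
print(ranked.iloc[2:5].index.tolist())    # [11, 5, 3]
print(len(df.iloc[2:5]))                  # 3, the end position 5 is excluded
```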
Select column by using column number in pandas with .iloc
# select first 2 columns
df.iloc[:,:2]
Output: the Name and Exam columns for all 12 rows.
# select 1st and 4th columns
df.iloc[:,[0,3]]
Output: the Name and Score columns for all 12 rows.
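To check which names a set of column positions refers to, the positions can be looked up in `df.columns`. The sketch below, on the same data rebuilt compactly, confirms that positional and name-based column selection give the same result here.

```python
import pandas as pd

# Same data as the tutorial's dataframe, built more compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

first_two = df.iloc[:, :2]            # columns at positions 0 and 1
first_and_fourth = df.iloc[:, [0, 3]] # columns at positions 0 and 3

print(first_two.columns.tolist())         # ['Name', 'Exam']
print(df.columns[[0, 3]].tolist())        # ['Name', 'Score']
print(first_and_fourth.equals(df[['Name', 'Score']]))  # True
```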
Select values by row label and column label in pandas with .loc:
.loc[[row_labels], [column_labels]] selects rows and columns by label. Unlike .iloc, a label slice such as .loc[1:5] includes its end label.
# select values by row label and column label using loc
df.loc[[1,2,3,4,5],['Name','Score']]
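The selection above returns the five rows labelled 1 to 5 with the Name and Score columns. As a minimal sketch on the same data (rebuilt compactly, with illustrative variable names), the example below compares the label list with a label slice and shows that .loc also accepts a boolean mask for the rows.

```python
import pandas as pd

# Same data as the tutorial's dataframe, built more compactly.
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine'] * 4,
    'Exam': ['Semester 1'] * 6 + ['Semester 2'] * 6,
    'Subject': (['Mathematics'] * 3 + ['Science'] * 3) * 2,
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81],
})

by_list = df.loc[[1, 2, 3, 4, 5], ['Name', 'Score']]
by_slice = df.loc[1:5, ['Name', 'Score']]   # label slice, end label 5 included

print(by_list.equals(by_slice))   # True
print(len(by_slice))              # 5 rows with .loc[1:5]
print(len(df.iloc[1:5]))          # 4 rows with .iloc[1:5]

# Filter rows with a mask and pick a column in one step.
top_names = df.loc[df['Score'] > 70, 'Name']
print(top_names.tolist())  # ['Alisa', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine']
```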