Additional Handy Functions
A few functions and accessors apply to both DataFrames and Series. Here is a rundown:
iloc on DataFrame
On a DataFrame, the iloc accessor takes both a row selection and a column selection, so you can retrieve the values in specific row(s) and column(s):
dataFrame.iloc[<row selection>, <column selection>] selects rows and columns by integer position, in the order in which they appear in the DataFrame.
Row and column indexes start at 0 and end at the number of rows (or columns) minus 1. Therefore, the last row index is:
df.shape[0] - 1
And to calculate the last column index, it would be:
df.shape[1] - 1
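As a quick check of the arithmetic above, the sketch below computes both values from df.shape and compares them with negative positions, which count from the end. The variable names (n_rows, last_row_pos, and so on) are illustrative only, and the sample data is the same small table used in the examples that follow.

```python
import pandas as pd

# Sample data matching the examples in this section.
df = pd.DataFrame(
    {
        "marks": [70, 66, 100, 88],
        "age": [29, 32, 31, 28],
        "sex": ["F", "M", "F", "F"],
        "name": ["Jane", "John", "Sally", "Sandy"],
        "ssn": ["1234", "3456", "4567", "5678"],
    }
)

n_rows, n_cols = df.shape      # shape is (number of rows, number of columns)
last_row_pos = n_rows - 1      # 3
last_col_pos = n_cols - 1      # 4

# Negative positions count from the end, so each pair selects the same data.
same_row = df.iloc[last_row_pos].equals(df.iloc[-1])
same_col = df.iloc[:, last_col_pos].equals(df.iloc[:, -1])
print(last_row_pos, last_col_pos, same_row, same_col)
```

In practice the negative form is usually shorter to write, but the explicit calculation is handy when the position has to be stored or compared.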
Get the entire first row
import pandas as pd
df = pd.DataFrame(
{
"marks": [70, 66, 100, 88],
"age": [29, 32, 31, 28],
"sex": ["F", "M", "F", "F"],
"name": ["Jane", "John", "Sally", "Sandy"],
"ssn": ["1234", "3456", "4567", "5678"],
}
)
row = df.iloc[0] # Gets the first row's values as a Series
print(type(row))
print(row)
Output:
<class 'pandas.core.series.Series'>
marks      70
age        29
sex         F
name     Jane
ssn      1234
Name: 0, dtype: object
df.iloc[-1] - Gets all the values in the last row.
df.iloc[:, 0] - Gets all the values in the first column.
df.iloc[:, -1] - Gets all the values in the last column.
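The three selections listed above can be tried together. This is a minimal sketch that rebuilds the same sample DataFrame so it runs on its own; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "marks": [70, 66, 100, 88],
        "age": [29, 32, 31, 28],
        "sex": ["F", "M", "F", "F"],
        "name": ["Jane", "John", "Sally", "Sandy"],
        "ssn": ["1234", "3456", "4567", "5678"],
    }
)

last_row = df.iloc[-1]       # Series indexed by the column labels
first_col = df.iloc[:, 0]    # the "marks" column, all rows
last_col = df.iloc[:, -1]    # the "ssn" column, all rows

print(last_row["name"])      # Sandy
print(first_col.tolist())    # [70, 66, 100, 88]
print(last_col.tolist())     # ['1234', '3456', '4567', '5678']
```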
More Examples
Get the first row, second column value only
single_value = df.iloc[0, 1] # Gets the value as a scalar
print(type(single_value))
print(single_value)
Output:
<class 'numpy.int64'>
29
Get multiple row values of a single column
To get all the rows from 0 to 2:
row_0_to_2_of_2nd_column = df.iloc[0:3, 1] # index 3 is excluded
print(type(row_0_to_2_of_2nd_column)) # The returned type is a Series
print(row_0_to_2_of_2nd_column)
Output:
<class 'pandas.core.series.Series'>
0    29
1    32
2    31
Name: age, dtype: int64
Get multiple rows and multiple columns
# row index 2 and column index 3 are excluded
row_0_2_and_column_0_3 = df.iloc[0:2, 0:3]
print(type(row_0_2_and_column_0_3))
print(row_0_2_and_column_0_3)
Output:
<class 'pandas.core.frame.DataFrame'>
   marks  age sex
0     70   29   F
1     66   32   M
iloc returns a scalar, a Series, or a DataFrame depending on the selection. A single value comes back as a scalar (for numeric columns, a NumPy type such as numpy.int64). A one-dimensional result, such as a single row or a single column, is a Series. A two-dimensional result is a DataFrame.
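The three return types can be confirmed side by side. The sketch below reuses the sample data and also adds one case not shown in the examples above: passing a list such as [0] as the row selection keeps the result two-dimensional, so a single row comes back as a one-row DataFrame rather than a Series.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "marks": [70, 66, 100, 88],
        "age": [29, 32, 31, 28],
        "sex": ["F", "M", "F", "F"],
        "name": ["Jane", "John", "Sally", "Sandy"],
        "ssn": ["1234", "3456", "4567", "5678"],
    }
)

scalar = df.iloc[0, 1]         # one row, one column -> scalar
series = df.iloc[0]            # one row, all columns -> Series
frame = df.iloc[0:2, 0:3]      # several rows and columns -> DataFrame
one_row_frame = df.iloc[[0]]   # list selector -> still a DataFrame

print(isinstance(scalar, np.integer))          # True
print(isinstance(series, pd.Series))           # True
print(isinstance(frame, pd.DataFrame))         # True
print(one_row_frame.shape)                     # (1, 5)
```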
loc
loc selects rows and columns by label rather than by position. When the DataFrame still has its default integer index, the row labels are the integers 0, 1, 2, and so on, so loc accepts the same row numbers as iloc. If the index has been replaced with custom labels, you must use those labels instead.
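One difference is worth seeing even while the default integer index is in place: a loc slice includes its end label, whereas an iloc slice excludes its end position. The sketch below uses a two-column version of the sample data to keep it short.

```python
import pandas as pd

df = pd.DataFrame({"marks": [70, 66, 100, 88], "age": [29, 32, 31, 28]})

by_label = df.loc[0:2, "age"]    # labels 0, 1 and 2: the end label is included
by_position = df.iloc[0:2, 1]    # positions 0 and 1: the end position is excluded

print(by_label.tolist())         # [29, 32, 31]
print(by_position.tolist())      # [29, 32]
```

So the same-looking slice 0:2 returns three rows with loc and two rows with iloc.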
loc on DataFrame
Here are the various ways you can use loc on a DataFrame:
- Use the index number, just like iloc, to retrieve row values:
df = pd.DataFrame(
{
"marks": [70, 66, 100, 88],
"age": [29, 32, 31, 28],
"sex": ["F", "M", "F", "F"],
"name": ["Jane", "John", "Sally", "Sandy"],
"ssn": ["1234", "3456", "4567", "5678"],
}
)
row_values = df.loc[0]
print(type(row_values))
Output:
<class 'pandas.core.series.Series'>
Notice that the returned object is a Series. Printing it shows the first row's values indexed by column label, the same result that df.iloc[0] gives.
- Use multiple specific row indexes and a column label to get values:
row_column = df.loc[[1, 3], "name"]
print(row_column) # A Series is returned
Output:
1     John
3    Sandy
Name: name, dtype: object
- Use multiple specific column labels with a single row index:
row_column = df.loc[1, ["name", "age"]]
print(row_column) # A Series is returned
Output:
name    John
age       32
Name: 1, dtype: object
- Use multiple specific row indexes and multiple column labels:
row_column = df.loc[[1, 3], ["name", "age"]]
print(type(row_column)) # A DataFrame is returned
print(row_column)
Output:
<class 'pandas.core.frame.DataFrame'>
    name  age
1   John   32
3  Sandy   28
loc will not work with the old index numbers once the default integer index is replaced with a custom one. You can only use the custom labels in that case (iloc still works by position). In the example below, we replace the existing integer index with the ssn column:
df.set_index("ssn", drop=True, inplace=True)
row_values = df.loc["1234"]
print(row_values)
Output:
marks      70
age        29
sex         F
name     Jane
Name: 1234, dtype: object
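To see the restriction in action, the sketch below sets ssn as the index and then tries the old integer label 0 with loc, which raises a KeyError because no row carries that label. The flag name lookup_failed is just an illustrative helper. Selecting by position with iloc still works and returns the same row as the label lookup.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "marks": [70, 66, 100, 88],
        "age": [29, 32, 31, 28],
        "sex": ["F", "M", "F", "F"],
        "name": ["Jane", "John", "Sally", "Sandy"],
        "ssn": ["1234", "3456", "4567", "5678"],
    }
).set_index("ssn")  # non-inplace form; returns a new DataFrame

# The integer 0 is no longer a row label, so loc cannot find it.
try:
    df.loc[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

by_label = df.loc["1234"]    # custom label
by_position = df.iloc[0]     # first row by position

print(lookup_failed)                   # True
print(by_label.equals(by_position))    # True
```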