This lesson is still being designed and assembled (Pre-Alpha version)

Dataframes in Pandas

Overview

Time: min
Objectives

What are Dataframes?

A DataFrame is a 2 dimensional data structure. It is similar to a 2 dimensional array.

import pandas as pd

data = {
  "quantity_available": [420, 380, 390],
  "quantity_needed_for_full_inventory": [50, 40, 45]
}

#load data into a DataFrame object:
df = pd.DataFrame(data)

print(df) 

Accessing a row element

Pandas uses the .loc attribute to return one or more specified row(s).

NOTE: If you use loc to retrieve a single row, the result is a series. If we retrieve multiple rows, the result is a dataframe.

print(df.loc[0]) # returns single row
print(df.loc[[0, 1]]) # returns multiple rows in a dataframe

Named Indexes

We can create custom labels by using the index argument while creating a data frame.

data = {
  "quantity_available": [420, 380, 390],
  "quantity_needed_for_full_inventory": [50, 40, 45]
}

#load data into a DataFrame object:
df_labeled = pd.DataFrame(data, index = ['Sugar', 'Salt', 'Pepper'])

print(df_labeled) 

Accessing Rows Through Named Indexes

The loc attribute can be used to access rows with the custom labels.

print(df_labeled.loc['Sugar'])
print(df_labeled.loc[['Sugar','Salt']])

Key Points