Data Structures

(Original version copied from https://github.com/blutterfly/python/blob/main/docs/examples/data_structures.md on 2025-10-21 - thank you "Butterfly", then edited over time...)

An introduction to a few Python data structures:


1. Lists

Lists are used to store multiple items in a single variable.
List items can be of any data type, and a single list can mix different data types.
Lists are ordered and changeable, and they allow duplicate values.
List items are indexed: the first item has index [0], the second item has index [1], and so on.
New items added with append() are placed at the end of the list.
Lists are created using square brackets.

Creating a List

fruits = ["apple", "banana", "cherry"]
print(fruits)

or use the list() constructor

fruits = list(("apple", "banana", "cherry"))
print(fruits)

Accessing Items

print(fruits[0])  # First item
print(fruits[-1])  # Last item

Modifying Lists

fruits.append("orange")  # Add an item
fruits[1] = "blueberry"  # Change an item
print(fruits)

Iterating Through a List

for fruit in fruits:
    print(fruit)

Common List Methods

  • append(item): Add an item to the end of the list
  • remove(item): Remove the first occurrence of an item
  • len(list): Get the number of items (a built-in function rather than a method)
  • sort(): Sort the list in place
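The operations listed above can be sketched together as follows. The sample data is hypothetical and chosen only for illustration; note that len() is called as a function on the list, while the others are called as methods.

```python
# Hypothetical sample data, chosen only for illustration.
fruits = ["cherry", "apple", "banana"]

fruits.append("orange")   # add an item to the end of the list
fruits.remove("apple")    # remove the first matching item
count = len(fruits)       # len() is a built-in function, not a list method
fruits.sort()             # sort the list in place (alphabetically for strings)

print(fruits)  # ['banana', 'cherry', 'orange']
print(count)   # 3
```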

Exercise:

  1. Create a list of your favorite hobbies.
  2. Add a new hobby to the list.
  3. Print each hobby using a loop.

Additional Resources:


2. Dictionaries

Dictionaries store data in key-value pairs.
A dictionary is a collection that is ordered, changeable, and does not allow duplicate keys. As of Python 3.7, dictionary items keep their insertion order, and that order will not change. Items can be changed, added, or removed after the dictionary has been created. A dictionary cannot have two items with the same key.
Values in dictionary items can be of any data type.
Dictionaries are written with curly brackets, and have keys and values.
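As a minimal sketch of the ordering and duplicate-key rules described above (the `settings` dictionary and its contents are hypothetical): when a key is repeated, the later value replaces the earlier one, and new keys are added at the end.

```python
# Hypothetical data for illustration only; "theme" is deliberately repeated.
settings = {"theme": "dark", "font": "serif", "theme": "light"}

print(settings)       # {'theme': 'light', 'font': 'serif'}
print(len(settings))  # 2

settings["size"] = 12        # values can be of any type; new keys go at the end
key_order = list(settings)   # iterating a dictionary yields keys in insertion order
print(key_order)             # ['theme', 'font', 'size']
```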

Creating a Dictionary

student = {"name": "Alex", "age": 16, "grade": "A"}
print(student)

Accessing Items

print(student["name"])  # Access value by key

Adding/Updating Keys

student["school"] = "High School"  # Add a new key
student["grade"] = "A+"  # Update value
print(student)

Iterating Through a Dictionary

for key, value in student.items():
    print(key, ":", value)

Common Dictionary Methods

  • keys(): Get all keys

  • values(): Get all values

  • items(): Get all key-value pairs
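The three methods above can be illustrated with the student dictionary from earlier in this section. Each method returns a view object, so this sketch wraps the results in list() to make them easy to print.

```python
student = {"name": "Alex", "age": 16, "grade": "A"}

keys = list(student.keys())      # all keys
values = list(student.values())  # all values
pairs = list(student.items())    # all (key, value) pairs as tuples

print(keys)    # ['name', 'age', 'grade']
print(values)  # ['Alex', 16, 'A']
print(pairs)   # [('name', 'Alex'), ('age', 16), ('grade', 'A')]
```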

Exercise:

  1. Create a dictionary with details about your favorite book (title, author, year).
  2. Add a new key for the genre.
  3. Print all the keys and values.

Additional Resources:


3. Sets

Sets are used to store multiple items in a single variable.
Sets are unordered, so items have no index.
Items in a set cannot be changed in place, but you can remove items and add new ones.
A set cannot contain duplicate members.
Set items must be hashable (for example strings, numbers, or tuples), and a given set can contain different data types.
Sets are written with curly brackets.

Creating a Set

fruits = {"apple", "banana", "cherry"}
print(fruits)

or we can use the set() constructor

fruits = set(("apple", "banana", "cherry"))
print(fruits)

Because there is no index, we cannot access items in a set via an index or a key.
That said, we can loop through the set items via a for loop.
We can also check whether a specified value exists in a set via the "in" keyword.

fruits = {"apple", "banana", "cherry"}

for x in fruits:
    print(x)

print("apple" not in fruits)  # prints False, because "apple" is in the set
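The set properties described in this section (no duplicates, items can be added and removed) can be sketched as follows. The extra items are hypothetical, and sorted() is used only to get a predictable print order from an unordered set.

```python
fruits = {"apple", "banana", "cherry", "apple"}  # the duplicate "apple" is dropped
initial_size = len(fruits)
print(initial_size)  # 3

fruits.add("orange")     # add a new item
fruits.remove("banana")  # remove an item (raises KeyError if it is missing)
fruits.discard("kiwi")   # discard() does nothing if the item is missing

print(sorted(fruits))     # ['apple', 'cherry', 'orange']
print("apple" in fruits)  # True
```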

Additional Resources:


4. DataFrames (Using pandas)

What is a DataFrame?
A DataFrame is a 2-dimensional table-like data structure in the pandas library. Think of it as a spreadsheet.

Setting Up pandas
Make sure you have pandas installed:

pip install pandas

Creating a DataFrame

import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [16, 17, 16],
    "Grade": ["A", "B", "A"]
}

df = pd.DataFrame(data)
print(df)

Accessing Columns

print(df["Name"])  # Access a single column
print(df[["Name", "Age"]])  # Access multiple columns

Filtering Rows

print(df[df["Age"] > 16])  # Students older than 16

Adding a New Column

df["Passed"] = [True, False, True]
print(df)

Iterating Through Rows

for index, row in df.iterrows():
    print(row["Name"], "is", row["Age"], "years old.")

Exercise:

  1. Create a DataFrame with data about your favorite movies (columns: Title, Year, Genre).
  2. Add a new column for Rating.
  3. Filter the movies to show only those released after 2010.

5. Tuples

Tuples are used to store multiple items in a single variable.
A tuple is a collection which is ordered and unchangeable.

  • Tuple items have a defined order, and that order will not change.
  • We cannot change, add or remove items after the tuple has been created.
  • Tuple items are indexed: the first item has index [0], the second item has index [1], and so on.
  • Because tuples are indexed, they can contain duplicate values.
  • Tuples are written with round brackets.
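A minimal sketch of what "unchangeable" means in practice: assigning to a tuple item raises a TypeError, so the usual workaround is to build a new tuple. The sample values are hypothetical.

```python
fruits = ("apple", "banana", "cherry", "apple")  # duplicates are allowed

try:
    fruits[1] = "blueberry"  # tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print("Cannot modify a tuple:", message)

# To "change" a tuple, build a new one from slices of the old one.
updated = fruits[:1] + ("blueberry",) + fruits[2:]
print(updated)  # ('apple', 'blueberry', 'cherry', 'apple')
```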

Creating a Tuple

fruits = ("apple", "banana", "cherry", "blueberry")
print(fruits)

or we can use the tuple() constructor

fruits = tuple(("apple", "banana", "cherry", "blueberry"))
print(fruits)

Accessing Items

print(fruits[0])   # First item
print(fruits[-1])  # Last item
print(fruits[0:2]) # Range = first 2 items (the end index is excluded)
print(fruits[:2])  # Range = first 2 items
print(fruits[1:])  # Range = last 3 items

Unpacking Tuples

fruits = tuple(("apple", "banana", "cherry", "blueberry"))
(Central_Asia, New_Guinea, Northern_Hemisphere, North_America) = fruits  # Unpack Tuple items
print(Central_Asia)  # apple

Iterating Through a Tuple

for fruit in fruits:
    print(fruit)

or use tuple index numbers

for i in range(len(fruits)):
  print(fruits[i])

Tuple Methods

count() # Returns the number of times a specified value occurs in a tuple
index() # Searches the tuple for a specified value and returns the position of where it was found
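Both methods can be tried on a small tuple. This is only an illustrative sketch with hypothetical data; note that index() returns the position of the first match and raises a ValueError if the value is not present.

```python
fruits = ("apple", "banana", "cherry", "apple")

apple_count = fruits.count("apple")   # number of times "apple" occurs
cherry_pos = fruits.index("cherry")   # position of "cherry"
first_apple = fruits.index("apple")   # position of the first "apple"

print(apple_count)  # 2
print(cherry_pos)   # 2
print(first_apple)  # 0
```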

Additional Resources:


Comparing Lists, Dictionaries, DataFrames, Sets and Tuples

| Feature | List | Dictionary | DataFrame | Set | Tuple |
| --- | --- | --- | --- | --- | --- |
| Data Organization | Ordered, items by index | Key-value pairs | Rows and columns | Unordered, no duplicates | Ordered, immutable |
| Access Method | By index | By key | By row/column | Loop through | By index |
| Ideal Use Case | Simple collections | Mapping relationships | Tabular data | Preventing duplicates | Collections that should not change |
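To make the comparison concrete, here is a small sketch that holds similar data in each of the built-in structures and accesses it in the way described above. The variable names and stock counts are hypothetical, and the DataFrame is left out so the example needs only the standard library.

```python
fruit_list = ["apple", "banana", "apple"]  # list: ordered, duplicates kept
fruit_stock = {"apple": 5, "banana": 2}    # dict: hypothetical stock counts by key
fruit_set = set(fruit_list)                # set: duplicates removed, no index
fruit_tuple = tuple(fruit_list)            # tuple: ordered, cannot be changed

print(fruit_list[0])          # apple  (access by index)
print(fruit_stock["banana"])  # 2      (access by key)
print("apple" in fruit_set)   # True   (membership test instead of an index)
print(fruit_tuple[-1])        # apple  (access by index)
print(len(fruit_list), len(fruit_set))  # 3 2
```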

6. Final Project Idea: Student Report System

Build a system that:

  1. Stores student data in a DataFrame.
  2. Allows adding a new student (Name, Age, Grade).
  3. Filters students by a minimum grade.
  4. Prints all student data.

Example Code for the System:

import pandas as pd

# Initial data
data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [16, 17, 16],
    "Grade": ["A", "B", "A"]
}
df = pd.DataFrame(data)

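# Add a new student (DataFrame.append was removed in pandas 2.0, so use pd.concat)
new_student = {"Name": "Daisy", "Age": 17, "Grade": "A+"}
df = pd.concat([df, pd.DataFrame([new_student])], ignore_index=True)

# Filter by grade. Comparing letter grades as strings would rank "B" above "A",
# so map each grade to a numeric rank first (this ranking is an assumption).
grade_rank = {"B": 1, "A": 2, "A+": 3}
print("Students with grade A or higher:")
print(df[df["Grade"].map(grade_rank) >= grade_rank["A"]])

# Print all student data
print(df)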

Additional Resources: