SHEDS Data Analysis Scripts

Scripts for processing and analyzing the Swiss Household Energy Demand Survey (SHEDS) data in R and Python. SHEDS data is stored in SPSS format (.sav), which includes value labels and variable descriptions. Question descriptions and further information are available at SHEDS - Sweet Cross.

Project Structure

The project provides example scripts demonstrating how to work with SHEDS data in R and Python. It also includes a CSV file listing all question identifiers across survey years, indicating when each question was used, to improve transparency and facilitate longitudinal analysis.

sheds_data_scripts/
├── sheds_questions_up2025.csv # Table of question identifiers over the years
├── README.md
├── .gitignore
└── src/
    ├── python/
    │   ├── utils.py # Contains useful functions
    │   ├── sheds_explore.ipynb
    │   ├── longitudinal_exploration.ipynb
    │   └── read_sav_example.ipynb
    └── R/
        ├── utils.R # Contains useful functions
        ├── sheds_explore.Rmd
        └── longitudinal_exploration.rmd
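
One way to use the question table is to check which identifiers are available in every wave of interest before starting a longitudinal analysis. The sketch below assumes a hypothetical layout (a `question_id` column plus one 0/1 column per survey year); the real column names in `sheds_questions_up2025.csv` may differ, so adjust accordingly. The inline sample data and the ids `q_a`, `q_b`, `q_c` are invented for illustration.

```python
import csv
import io

# Invented sample standing in for sheds_questions_up2025.csv.
# ASSUMPTION: one row per question, one 0/1 column per survey year.
SAMPLE_CSV = """question_id,2019,2020,2021
q_a,1,1,1
q_b,1,0,1
q_c,0,1,1
"""

def questions_in_all_years(csv_file, years):
    """Return question ids flagged as asked in every year listed."""
    reader = csv.DictReader(csv_file)
    return [
        row["question_id"]
        for row in reader
        if all(row[str(year)] == "1" for year in years)
    ]

common = questions_in_all_years(io.StringIO(SAMPLE_CSV), [2019, 2020, 2021])
print(common)
```

With the real file, `open("sheds_questions_up2025.csv", newline="")` would replace the `io.StringIO` object, provided the layout matches the assumption above.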

Setup

Python

pip install pandas numpy pyreadstat matplotlib seaborn

R

install.packages(c("haven", "tidyverse", "zoo", "scales"))

Loading SHEDS Data

R

source("utils.R")

# Load single wave (filters out screen == 3)
sheds <- read_clean_sheds("/path/to/SHEDS2025.sav")

# Get summary statistics
summary <- get_data_summary(sheds)

Python

from utils import read_clean_sheds, get_data_summary

# Load single wave (filters out screen == 3)
sheds = read_clean_sheds("/path/to/SHEDS2025.sav")

# Get summary statistics
summary = get_data_summary(sheds)

Working with SPSS Metadata

Python - Accessing Labels

import pyreadstat

# Load with metadata
df, meta = pyreadstat.read_sav("/path/to/SHEDS2025.sav", encoding="UTF-8")

# Get variable label (question text)
meta.column_names_to_labels['accom11_1']
# -> "How satisfied are you with your current heating system?"

# Get value labels (response options)
meta.variable_value_labels['accom11_1']
# -> {1: 'Very dissatisfied', 2: 'Dissatisfied', ..., 5: 'Very satisfied'}

# Apply labels to create readable values
df['accom11_1_label'] = df['accom11_1'].map(meta.variable_value_labels['accom11_1'])
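
Because `meta.column_names_to_labels` is a plain dictionary, it can also be searched to locate variables by question text. The helper below is a minimal sketch; `find_variables` is a hypothetical name, not a function from `utils.py`, and the example dictionary is a small stand-in so the snippet runs without a data file (the `mob2_1` text is illustrative, not the actual SPSS label).

```python
def find_variables(column_labels, keyword):
    """Return {variable: label} for labels containing keyword (case-insensitive)."""
    needle = keyword.lower()
    return {
        name: label
        for name, label in column_labels.items()
        if label is not None and needle in label.lower()
    }

# Stand-in for meta.column_names_to_labels; with a real file, pass that dict instead.
example_labels = {
    "accom11_1": "How satisfied are you with your current heating system?",
    "mob2_1": "Number of cars in household",  # illustrative text only
    "id": None,  # guard against variables that carry no label
}

print(find_variables(example_labels, "heating"))
```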

R - Accessing Labels

library(haven)

sheds <- read_sav("/path/to/SHEDS2025.sav")

# Get variable label
attr(sheds$accom11_1, "label")

# Get value labels
attr(sheds$accom11_1, "labels")

# Apply labels (as_factor() converts labelled values to a factor)
library(dplyr)
sheds <- sheds %>%
  mutate(accom11_1_label = as_factor(accom11_1))

Functions

Function	Description
`read_clean_sheds(filepath)`	Read SPSS file and drop screened-out respondents (keeps rows where `screen != 3`)
`get_data_summary(data)`	Returns n_respondents, n_variables, completion_rate, avg_duration
`build_car_history(all_waves_dict)`	Combine waves, carry forward car data for longitudinal analysis
`analyze_ev_ownership_data(data_history, year)`	Analyze EV/hybrid ownership for a specific year
`save_plot(plot, path, filename)`	Save figure in PDF and EPS formats
`check_finished(data, year)`	Report completion statistics for a wave
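
For orientation, the first two helpers could behave roughly like the sketch below. This is not the code in `utils.py`; it is a guess at the behaviour based on the table above and on the variables `screen`, `finished`, and `q_totalduration` described in this README. The function names `drop_screened_out` and `summarize` are hypothetical, the formulas for completion rate and average duration are assumptions, and the toy data is invented.

```python
import pandas as pd

def drop_screened_out(df):
    """Keep respondents who were not screened out (screen != 3)."""
    return df[df["screen"] != 3].reset_index(drop=True)

def summarize(df):
    """Compute the four summary figures named in the table above."""
    return {
        "n_respondents": len(df),
        "n_variables": df.shape[1],
        # ASSUMPTION: completion rate = share of rows with finished == 1
        "completion_rate": float((df["finished"] == 1).mean()),
        # ASSUMPTION: average duration = mean of q_totalduration
        "avg_duration": float(df["q_totalduration"].mean()),
    }

# Invented toy data using the variable names from this README.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "screen": [1, 3, 1, 2],
    "finished": [1, 0, 0, 1],
    "q_totalduration": [20.0, 2.0, 10.0, 30.0],
})

clean = drop_screened_out(toy)
summary = summarize(clean)
print(summary)
```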

Example longitudinal analysis of EVs

Respondents are not asked about their car type in every wave. Instead, each wave asks whether they have changed their car since the previous survey: if so, the new car type is recorded; if not, the question is skipped. To reconstruct a complete car-ownership history, the most recently reported car type must therefore be carried forward ("rolled forward") through every wave in which no change is reported, and updated only in waves where a change is reported.

Python

from utils import read_clean_sheds, build_car_history, analyze_ev_ownership_data
import pandas as pd

# Load all waves
years = [2016, 2017, 2018, 2019, 2020, 2021, 2023, 2025]
waves = {}
for year in years:
    waves[str(year)] = read_clean_sheds(f"/path/to/SHEDS{year}.sav")

# Build car history with forward-fill
car_history = build_car_history(waves)

# Analyze each year
results = pd.concat([
    analyze_ev_ownership_data(car_history, year)
    for year in [2019, 2020, 2021, 2023, 2025]
])
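
The carry-forward logic described in this section can be illustrated on a toy long-format table. The sketch below shows the general technique (a per-respondent forward fill with pandas) and is not the actual body of `build_car_history`; the `year` column, the `mob3_3_filled` column, and the sample values are invented for illustration, while `id` and `mob3_3` follow the variable table in this README.

```python
import pandas as pd

# One row per respondent and wave; mob3_3 is missing when no car change was reported.
long = pd.DataFrame({
    "id":     [1, 1, 1, 2, 2, 2],
    "year":   [2019, 2020, 2021, 2019, 2020, 2021],
    "mob3_3": [1, None, 8, 2, None, None],
})

# Sort so that each respondent's waves are in chronological order,
# then fill gaps with the last reported value within each respondent.
long = long.sort_values(["id", "year"])
long["mob3_3_filled"] = long.groupby("id")["mob3_3"].ffill()

print(long)
```

Grouping by `id` before filling keeps one respondent's car type from leaking into the next respondent's rows. A respondent who skipped a wave entirely has no row for that year in this layout, so such gaps would need separate handling.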

R

library(dplyr)  # provides bind_rows()
source("utils.R")

years <- c(2016, 2017, 2018, 2019, 2020, 2021, 2023, 2025)
waves <- list()

for (year in years) {
  waves[[as.character(year)]] <- read_clean_sheds(paste0("/path/to/SHEDS", year, ".sav"))
}

car_history <- build_car_history(waves)

results <- bind_rows(
  analyze_ev_ownership_data(car_history, 2019),
  analyze_ev_ownership_data(car_history, 2020),
  analyze_ev_ownership_data(car_history, 2021),
  analyze_ev_ownership_data(car_history, 2023),
  analyze_ev_ownership_data(car_history, 2025)
)

Used Variables

Variable	Description
`id`	Respondent ID (consistent across waves)
`finished`	Survey completion (1 = finished)
`screen`	Screening status (3 = screened out)
`mob2_1`	Number of cars in household
`mob3_3`	Fuel type of main car (8 = Electric)
`mob2_e`	Has electric vehicle as secondary car (1 = yes)
`q_totalduration`	Survey duration in minutes

mob3_3 Fuel Type Codes

Code	Fuel Type
1	Gasoline
2	Diesel
3	Natural Gas
4	LPG
5	Hybrid gasoline
6	Plug-in Hybrid
7	Hybrid diesel
8	Electric
9	Other
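
The code table above translates directly into a lookup dictionary. The sketch below also groups codes into broad categories; that grouping and the names `FUEL_TYPES`, `GROUPS`, and `fuel_group` are assumptions made for illustration and may not match how `analyze_ev_ownership_data` classifies vehicles.

```python
# mob3_3 codes as listed in the table above.
FUEL_TYPES = {
    1: "Gasoline", 2: "Diesel", 3: "Natural Gas", 4: "LPG",
    5: "Hybrid gasoline", 6: "Plug-in Hybrid", 7: "Hybrid diesel",
    8: "Electric", 9: "Other",
}

# ASSUMPTION: this grouping is illustrative, not the repository's definition.
GROUPS = {5: "hybrid", 6: "hybrid", 7: "hybrid", 8: "electric"}

def fuel_group(code):
    """Map a mob3_3 code to 'electric', 'hybrid', 'conventional/other', or None."""
    if code not in FUEL_TYPES:
        return None  # missing or unexpected code
    return GROUPS.get(code, "conventional/other")

print(FUEL_TYPES[8], fuel_group(8))
```

On a DataFrame, `df["mob3_3"].map(FUEL_TYPES)` would attach the readable labels in the same way as the value-label example earlier in this README.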
