DataHaskell/dataframe

User guide | Discord

DataFrame

A fast, safe, and intuitive DataFrame library.

Why use this DataFrame library?

  • Encourages concise, declarative, and composable data pipelines.
  • Lets you opt into your preferred level of type safety: keep it lightweight for rapid exploration or lock it down completely for robust production pipelines.
  • Delivers high performance thanks to Haskell’s optimizing compiler and efficient memory model.
  • Designed for interactivity: expressive syntax, helpful error messages, and sensible defaults.
  • Works seamlessly in both command-line and notebook environments, so it suits exploration and scripting alike.

Features

  • Type-safe column operations with compile-time guarantees.
  • Familiar, approachable API designed to feel natural to users coming from other languages.
  • Interactive REPL for data exploration and plotting.
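
To give a rough feel for what compile-time guarantees on column operations can mean, here is a minimal, self-contained sketch. The `Expr` type, its constructors, and the `describe` helper are hypothetical stand-ins written for illustration; they only borrow the name `Expr` from the column declarations shown in the Example section and are not the library's actual definitions.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Hypothetical typed column expression (an assumption for illustration,
-- not the library's real type). The type parameter records the element
-- type of the column the expression evaluates to.
data Expr a where
  Col :: String -> Expr a                     -- reference to a named column
  Lit :: a -> Expr a                          -- constant value
  Add :: Num a => Expr a -> Expr a -> Expr a  -- element-wise addition

-- Hypothetical column declarations, similar in spirit to "sales :: Expr Int".
sales :: Expr Int
sales = Col "sales"

productName :: Expr String
productName = Col "product_name"

-- Render an expression as text so the sketch has something to print.
describe :: Show a => Expr a -> String
describe (Col name) = name
describe (Lit v)    = show v
describe (Add l r)  = "(" ++ describe l ++ " + " ++ describe r ++ ")"

main :: IO ()
main = do
  putStrLn (describe (Add sales (Lit 10)))  -- prints "(sales + 10)"
  -- Uncommenting the next line is a type error, caught before the program runs,
  -- because an Expr String cannot be added to an Expr Int:
  -- putStrLn (describe (Add productName sales))
```

The point of the design is that a mismatch between column types surfaces when the pipeline is compiled rather than when it runs against real data.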

Quick start

Browse through some examples in Binder.

Install

See the Quick Start guide for setup and installation instructions.

Example

dataframe> df = D.fromNamedColumns [("product_id", D.fromList [1,1,2,2,3,3]), ("sales", D.fromList [100,120,50,20,40,30])]
dataframe> df
------------------
product_id | sales
-----------|------
   Int     |  Int 
-----------|------
1          | 100  
1          | 120  
2          | 50   
2          | 20   
3          | 40   
3          | 30   

dataframe> :declareColumns df
"product_id :: Expr Int"
"sales :: Expr Int"
dataframe> df |> D.groupBy [F.name product_id] |> D.aggregate [F.sum sales `as` "total_sales"]
------------------------
product_id | total_sales
-----------|------------
   Int     |     Int    
-----------|------------
1          | 220        
2          | 70         
3          | 70         
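
For readers who want to see what the group-and-sum step computes, the sketch below reproduces the same totals with plain lists and `Data.Map` from the `containers` package. It does not use the dataframe API at all; the names `rows` and `totalSales` are made up for this illustration, and the data is copied from the session above.

```haskell
module Main where

import qualified Data.Map.Strict as Map

-- (product_id, sales) pairs, copied from the example session.
rows :: [(Int, Int)]
rows = [(1, 100), (1, 120), (2, 50), (2, 20), (3, 40), (3, 30)]

-- Group rows by product_id and sum the sales within each group.
-- fromListWith combines values that share a key using (+);
-- toAscList returns the groups ordered by key.
totalSales :: [(Int, Int)] -> [(Int, Int)]
totalSales = Map.toAscList . Map.fromListWith (+)

main :: IO ()
main = mapM_ print (totalSales rows)
-- prints:
-- (1,220)
-- (2,70)
-- (3,70)
```

The result matches the `total_sales` column above: 100 + 120 = 220 for product 1, 50 + 20 = 70 for product 2, and 40 + 30 = 70 for product 3.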

Documentation
