# MiniDB

A miniature in-memory database with SQL-like query support, built from scratch using only the Python 3.11+ standard library.

## Features
- Typed Columns: INTEGER, STRING, FLOAT, BOOLEAN
- CRUD Operations: INSERT, SELECT, UPDATE, DELETE
- SQL-like Query Language:
  - WHERE with AND/OR and comparison operators (=, >, <, >=, <=, !=, LIKE, IN)
  - ORDER BY (ASC/DESC)
  - GROUP BY with aggregations
  - LIMIT clause
- Aggregations: COUNT, SUM, AVG, MIN, MAX
- JOINs: INNER JOIN between tables
- Indexing: Automatic hash-based indexing on primary keys
- Query Planner: Chooses between index scans and table scans
- Persistence: Save/load database to JSON files with versioning
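The query planner's index-versus-scan choice mentioned above boils down to one question: is the filtered column indexed? The sketch below is illustrative only; the function and parameter names are hypothetical, not MiniDB's actual internals:

```python
# Hypothetical sketch of a query planner's core decision.
# Names here are illustrative, not MiniDB's actual API.
def choose_plan(where_column: str, indexed_columns: set[str]) -> str:
    """Pick an index scan when the filtered column is indexed, else a table scan."""
    return "index_scan" if where_column in indexed_columns else "table_scan"

choose_plan("id", {"id"})    # index scan: 'id' is the primary key
choose_plan("name", {"id"})  # table scan: 'name' has no index
```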
## Installation

No installation required! MiniDB uses only the Python standard library.

```bash
# Just clone and use
git clone <repo-url>
cd MiniDB
python -m pytest tests/ -v   # Run tests
python main.py               # Run demo
```

## Quick Start

```python
from minidb import MiniDB, Column, ColumnType

# Create a database
db = MiniDB()
```
```python
# Create a table using SQL
db.execute('''
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name STRING,
        age INTEGER,
        salary FLOAT,
        active BOOLEAN
    )
''')
```
```python
# Or create a table programmatically
db.create_table('orders', [
    Column('id', ColumnType.INTEGER, primary_key=True),
    Column('user_id', ColumnType.INTEGER),
    Column('product', ColumnType.STRING),
    Column('total', ColumnType.FLOAT),
])
```
```python
# Insert data
db.execute("INSERT INTO users (id, name, age, salary, active) VALUES (1, 'Alice', 30, 75000.0, true)")
db.execute("INSERT INTO users (id, name, age, salary, active) VALUES (2, 'Bob', 25, 55000.0, false)")
```
```python
# Query data
results = db.query("SELECT * FROM users WHERE age > 28")

# Complex queries
results = db.query('''
    SELECT name, salary FROM users
    WHERE active = true AND salary > 60000
    ORDER BY salary DESC
''')
```
```python
# Aggregations
results = db.query("SELECT COUNT(*), AVG(salary) FROM users")

# GROUP BY
results = db.query('''
    SELECT active, COUNT(*), AVG(salary)
    FROM users
    GROUP BY active
''')
```
```python
# JOINs
results = db.query('''
    SELECT users.name, orders.product, orders.total
    FROM users
    JOIN orders ON users.id = orders.user_id
    WHERE orders.total > 50
''')
```
```python
# UPDATE
affected = db.execute("UPDATE users SET salary = 80000.0 WHERE id = 1")

# DELETE
affected = db.execute("DELETE FROM users WHERE active = false")
```
```python
# Persistence
db.save('my_database.json')
db = MiniDB.load('my_database.json')
```

## Query Language Reference

### CREATE TABLE

```sql
CREATE TABLE table_name (
    column1 INTEGER PRIMARY KEY,
    column2 STRING,
    column3 FLOAT,
    column4 BOOLEAN
)
```

### INSERT

```sql
INSERT INTO table_name (col1, col2, col3) VALUES (1, 'value', 3.14)
```

### SELECT

```sql
SELECT * FROM table_name
SELECT col1, col2 FROM table_name
SELECT col1, COUNT(*), AVG(col2) FROM table_name GROUP BY col1
```

### WHERE

```sql
WHERE col = value
WHERE col > value
WHERE col >= value
WHERE col < value
WHERE col <= value
WHERE col != value
WHERE col LIKE 'pattern%'   -- % matches any sequence
WHERE col IN (1, 2, 3)
WHERE cond1 AND cond2
WHERE cond1 OR cond2
```

### ORDER BY

```sql
ORDER BY col ASC
ORDER BY col DESC
ORDER BY col1 ASC, col2 DESC
```

### LIMIT

```sql
LIMIT 10
```

### JOIN

```sql
SELECT t1.col, t2.col
FROM t1
JOIN t2 ON t1.id = t2.t1_id
```

### UPDATE

```sql
UPDATE table_name SET col1 = value1, col2 = value2 WHERE condition
```

### DELETE

```sql
DELETE FROM table_name WHERE condition
```

## Project Structure

```
minidb/
├── __init__.py      # Public API exports
├── database.py      # MiniDB main class
├── table.py         # Table storage
├── column.py        # Column and schema definitions
├── index.py         # Hash-based indexing
├── parser.py        # SQL lexer and parser
├── query.py         # Query execution engine
├── planner.py       # Query planner
├── persistence.py   # JSON serialization
├── types.py         # Type definitions
└── errors.py        # Custom exceptions
```
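To make the `index.py` entry concrete: a hash-based primary-key index can be as little as a dict mapping key values to row positions. This is a minimal sketch under assumed names, not MiniDB's actual implementation:

```python
# Minimal sketch of hash-based primary-key indexing, roughly what index.py
# provides. Class and method names here are hypothetical, not MiniDB's API.

class HashIndex:
    """Maps indexed-column values to row positions for O(1) equality lookups."""

    def __init__(self):
        self._map = {}  # column value -> row position

    def add(self, value, row_pos):
        self._map[value] = row_pos

    def remove(self, value):
        self._map.pop(value, None)

    def lookup(self, value):
        """Return the row position for `value`, or None if absent."""
        return self._map.get(value)

# Usage: index primary keys as rows are inserted
rows = [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
index = HashIndex()
for pos, row in enumerate(rows):
    index.add(row['id'], pos)

assert rows[index.lookup(2)]['name'] == 'Bob'
```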
## Testing

MiniDB includes a comprehensive test suite with 29+ test cases:

```bash
# Run all tests
python -m pytest tests/ -v

# Run a specific test file
python -m pytest tests/test_queries.py -v

# Run with coverage
python -m pytest tests/ -v --cov=minidb
```

- test_database.py: Database lifecycle and table management
- test_crud.py: INSERT, SELECT, UPDATE, DELETE operations
- test_queries.py: WHERE, ORDER BY, LIMIT
- test_aggregations.py: COUNT, SUM, AVG, MIN, MAX, GROUP BY
- test_joins.py: JOIN operations
- test_index.py: Indexing and query planning
- test_persistence.py: Save/load functionality
- test_performance.py: Large dataset tests (10,000+ rows)
## Performance

MiniDB is designed for small to medium datasets:
- 10,000 rows: All operations complete in under 30 seconds
- Indexed lookups: O(1) for primary key equality
- Range queries: O(n) with index optimization
- Table scans: O(n) for non-indexed columns
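The O(1) vs O(n) distinction above falls out of the data structures: a hash index is a single dict lookup, while a table scan touches every row. A standalone sketch of that difference, using plain dicts and lists rather than MiniDB itself:

```python
import timeit

# 10,000 rows, matching the dataset size quoted above
rows = [{"id": i, "name": f"user{i}"} for i in range(10_000)]

def table_scan(target):
    # O(n): inspect rows one by one until the id matches
    return next(r for r in rows if r["id"] == target)

# O(1): a hash index is essentially a dict from primary key to row
index = {r["id"]: r for r in rows}

def index_lookup(target):
    return index[target]

# Worst case for the scan: the target is the last row
scan_time = timeit.timeit(lambda: table_scan(9_999), number=100)
index_time = timeit.timeit(lambda: index_lookup(9_999), number=100)
# The indexed lookup is typically orders of magnitude faster than the scan
```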
## Limitations

- In-memory only (no disk-based storage during operation)
- Single-threaded
- No transactions
- No foreign key constraints
- Incomplete NULL handling in some edge cases
- Limited JOIN support (INNER JOIN only)
## License

MIT License. Use freely for any purpose.
## Contributing

Contributions welcome! Areas for improvement:
- B-tree indexes for range queries
- LEFT/RIGHT OUTER JOIN
- Subqueries
- HAVING clause
- DISTINCT
- More aggregate functions
- Query optimization
- Concurrent access
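As a stepping stone toward the B-tree item, the standard library's `bisect` module already supports O(log n) range lookups over a sorted index. A sketch with hypothetical names, not an implementation commitment:

```python
import bisect

class SortedIndex:
    """Keeps (value, row_pos) pairs sorted by value for O(log n) range queries.

    Illustrative sketch only; class and method names are hypothetical.
    """

    def __init__(self):
        self._keys = []       # sorted column values
        self._positions = []  # row positions, parallel to _keys

    def add(self, value, row_pos):
        i = bisect.bisect_left(self._keys, value)
        self._keys.insert(i, value)
        self._positions.insert(i, row_pos)

    def range(self, low, high):
        """Row positions of all values with low <= value <= high."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_right(self._keys, high)
        return self._positions[lo:hi]

# Usage: index an 'age' column, then answer "WHERE age BETWEEN 26 AND 36"
idx = SortedIndex()
for pos, age in enumerate([30, 25, 41, 35]):
    idx.add(age, pos)
idx.range(26, 36)  # positions of the rows with ages 30 and 35
```

A B-tree improves on this by making inserts O(log n) as well (list.insert is O(n)), but the query-side API would look much the same.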