From 84f765d699856abf51ab5052caf6ce15b2f251d2 Mon Sep 17 00:00:00 2001 From: MD JUBER QURAISHI Date: Sat, 16 May 2026 07:45:55 +0530 Subject: [PATCH 1/2] docs: add DBMS chapter 4 - normalization and chapter 5 - transactions --- app/sem4/dbms/content/chapter4.tsx | 146 +++++++++++++++++++++++++++++ app/sem4/dbms/content/chapter5.tsx | 126 +++++++++++++++++++++++++ 2 files changed, 272 insertions(+) create mode 100644 app/sem4/dbms/content/chapter4.tsx create mode 100644 app/sem4/dbms/content/chapter5.tsx diff --git a/app/sem4/dbms/content/chapter4.tsx b/app/sem4/dbms/content/chapter4.tsx new file mode 100644 index 0000000..f73052d --- /dev/null +++ b/app/sem4/dbms/content/chapter4.tsx @@ -0,0 +1,146 @@ +export const Ch4Content = () => { + return ( +
+

+ Normalization is the process of + organizing a database to reduce redundancy and improve data integrity by + breaking large tables into smaller, well-structured ones. +

+ +
+ +
+

+ Why Normalize? +

+
    +
  • Insertion Anomaly: certain facts cannot be inserted without also storing unrelated data. Example: can't add a course unless a student is enrolled.
  • +
  • Deletion Anomaly: deleting one record accidentally removes other useful data.
  • +
  • Update Anomaly: updating one value requires changing it in many places.
  • +
  • Normalization solves all three by removing redundant data dependencies.
  • +
+
+ +
+ +
+

+ Functional Dependency +

+
    +
  • A Functional Dependency (FD) means one attribute determines another. Written as A → B.
  • +
  • Example: StudentID → StudentName (knowing the ID tells you the name).
  • +
  • Partial Dependency: a non-key attribute depends on part of a composite primary key.
  • +
  • Transitive Dependency: a non-key attribute depends on another non-key attribute.
  • +
+
+

Exam Tip: FDs are the foundation of all normal forms. Master them first.

+
+
+ +
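The FD definition above can be made concrete with a small runnable check (a minimal Python sketch, separate from the chapter's TSX; the `fd_holds` helper and the sample rows are illustrative, not a DBMS API): A → B holds if no two rows agree on A but differ on B.

```python
# Check whether a functional dependency A -> B holds over a set of rows:
# no two rows may agree on A while disagreeing on B.
def fd_holds(rows, a, b):
    seen = {}
    for row in rows:
        if row[a] in seen and seen[row[a]] != row[b]:
            return False  # same A value, different B value: FD violated
        seen[row[a]] = row[b]
    return True

students = [
    {"StudentID": 101, "StudentName": "Zubair"},
    {"StudentID": 102, "StudentName": "Asha"},
    {"StudentID": 101, "StudentName": "Zubair"},  # duplicate row is fine
]

# StudentID -> StudentName holds: knowing the ID determines the name.
print(fd_holds(students, "StudentID", "StudentName"))  # True
```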
+ +
+

+ 1NF — First Normal Form +

+
    +
  • A table is in 1NF if every column contains atomic (indivisible) values.
  • +
  • No repeating groups or arrays in a single column.
  • +
  • Each row must be unique (has a primary key).
  • +
+
+

Violation → Fix

+
{`❌ Not 1NF:
+StudentID | Courses
+101       | DBMS, OS, CN
+
+✅ 1NF:
+StudentID | Course
+101       | DBMS
+101       | OS
+101       | CN`}
+
+
+ +
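The 1NF fix above (one atomic course per row) can be sketched as a small transformation (illustrative Python, not part of the chapter; `to_1nf` is a made-up helper name):

```python
# Split a non-1NF comma-separated "Courses" column into atomic rows.
raw = [(101, "DBMS, OS, CN"), (102, "DBMS")]  # (StudentID, Courses)

def to_1nf(rows):
    atomic = []
    for student_id, courses in rows:
        for course in courses.split(","):
            atomic.append((student_id, course.strip()))  # one value per row
    return atomic

print(to_1nf(raw))
# [(101, 'DBMS'), (101, 'OS'), (101, 'CN'), (102, 'DBMS')]
```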
+ +
+

+ 2NF — Second Normal Form +

+
    +
  • Must be in 1NF.
  • +
  • No partial dependencies — every non-key attribute must depend on the entire primary key.
  • +
  • Applies only when the primary key is composite.
  • +
+
+

Violation → Fix

+
{`❌ Not 2NF (PK = StudentID + CourseID):
+StudentID | CourseID | StudentName | Grade
+101       | CS401    | Zubair      | A
+
+StudentName depends only on StudentID (partial dependency).
+
+✅ 2NF: Split into two tables:
+Students(StudentID, StudentName)
+Enrollments(StudentID, CourseID, Grade)`}
+
+
+ +
+ +
+

+ 3NF — Third Normal Form +

+
    +
  • Must be in 2NF.
  • +
  • No transitive dependencies — non-key attributes must not depend on other non-key attributes.
  • +
+
+

Violation → Fix

+
{`❌ Not 3NF:
+StudentID | DeptID | DeptName
+101       | D01    | CSE
+
+DeptName depends on DeptID (not on StudentID) — transitive.
+
+✅ 3NF: Split:
+Students(StudentID, DeptID)
+Departments(DeptID, DeptName)`}
+
+
+ +
+ +
+

+ BCNF — Boyce-Codd Normal Form +

+
    +
  • A stricter version of 3NF.
  • +
  • For every non-trivial FD A → B, A must be a superkey of the relation.
  • +
  • Handles cases where 3NF still has anomalies due to overlapping candidate keys.
  • +
+
+

If a table is in BCNF, it is also in 3NF — but not vice versa. BCNF is stronger.

+
+
+ +
+ +
+

+ Decomposition +

+
    +
  • Lossless Join: splitting a table and rejoining gives back the original data exactly.
  • +
  • Dependency Preservation: all original FDs can still be checked in the decomposed tables.
  • +
  • BCNF decomposition always gives lossless joins but may lose dependency preservation.
  • +
  • 3NF decomposition can always achieve both a lossless join and dependency preservation.
  • +
+
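The lossless-join property can be demonstrated on the 3NF example above: decompose, natural-join back, and compare with the original (a minimal Python sketch under illustrative data, not the chapter's own code):

```python
# Lossless-join check by example: decompose Students(StudentID, DeptID, DeptName),
# natural-join the pieces back on DeptID, and verify nothing was lost or invented.
original = [
    (101, "D01", "CSE"),
    (102, "D01", "CSE"),
    (103, "D02", "ECE"),
]  # (StudentID, DeptID, DeptName)

students = sorted({(sid, dept) for sid, dept, _ in original})       # Students(StudentID, DeptID)
departments = sorted({(dept, name) for _, dept, name in original})  # Departments(DeptID, DeptName)

# Natural join on the common attribute DeptID:
rejoined = sorted(
    (sid, dept, name)
    for sid, dept in students
    for d, name in departments
    if d == dept
)

print(rejoined == sorted(original))  # True: the decomposition is lossless
```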
+
+ ); +}; \ No newline at end of file diff --git a/app/sem4/dbms/content/chapter5.tsx b/app/sem4/dbms/content/chapter5.tsx new file mode 100644 index 0000000..aa54577 --- /dev/null +++ b/app/sem4/dbms/content/chapter5.tsx @@ -0,0 +1,126 @@ +export const Ch5Content = () => { + return ( +
+

+ A transaction is a sequence of + database operations treated as a single unit. Concurrency control ensures + multiple transactions run simultaneously without conflicts. +

+ +
+ +
+

+ ACID Properties +

+
    +
  • Atomicity: all operations succeed, or none do. No partial execution.
  • +
  • Consistency: the database moves from one valid state to another.
  • +
  • Isolation: concurrent transactions do not interfere with each other.
  • +
  • Durability: once committed, changes persist even after a crash.
  • +
+
+

Exam Tip: ACID is one of the most asked topics. Remember all four with the mnemonic: All Changes Isolated & Durable.

+
+
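Atomicity can be observed directly with Python's stdlib `sqlite3` module (a minimal sketch, separate from the chapter's TSX; the `accounts` table and the simulated crash are illustrative). A failure mid-transfer rolls the whole transaction back:

```python
import sqlite3

# Atomicity demo: a transfer that fails halfway leaves no partial update behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

try:
    with conn:  # the with-block is one transaction: commit on success, rollback on error
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'A'")
        raise RuntimeError("crash mid-transfer")  # simulated failure before crediting B
except RuntimeError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'A': 100, 'B': 50}: the debit was rolled back, all or nothing
```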
+ +
+ +
+

+ Transaction States +

+
    +
  • Active: transaction is currently executing.
  • +
  • Partially Committed: last operation executed, not yet written to disk.
  • +
  • Committed: all changes written permanently to the database.
  • +
  • Failed: normal execution cannot continue.
  • +
  • Aborted: transaction rolled back; database restored to previous state.
  • +
+
+

State Flow

+
{`Active → Partially Committed → Committed
+Active → Failed → Aborted`}
+
+
+ +
+ +
+

+ Concurrency Problems +

+
    +
  • Lost Update: two transactions read and update the same data; one update overwrites the other.
  • +
  • Dirty Read: a transaction reads data written by another transaction that has not committed yet.
  • +
  • Unrepeatable Read: a transaction reads the same row twice and gets different values because another transaction updated it.
  • +
  • Phantom Read: a transaction re-executes a query and finds new rows added by another transaction.
  • +
+
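The Lost Update problem above can be simulated in a few lines (plain Python, an illustrative interleaving rather than real concurrent execution):

```python
# Lost Update sketch: two interleaved read-modify-write "transactions" on one balance.
balance = 100

t1_read = balance          # T1 reads 100
t2_read = balance          # T2 reads 100, before T1 has written
balance = t1_read + 50     # T1 writes 150
balance = t2_read - 30     # T2 writes 70, silently overwriting T1's update

print(balance)  # 70: T1's +50 is lost (the correct serial result would be 120)
```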
+ +
+ +
+

+ Locking +

+
    +
  • Shared Lock (S): allows reading. Multiple transactions can hold a shared lock simultaneously.
  • +
  • Exclusive Lock (X): allows reading and writing. Only one transaction can hold it; no other locks allowed.
  • +
  • A transaction must acquire the appropriate lock before accessing data.
  • +
+
+ +
+ +
+

+ Two-Phase Locking (2PL) +

+
    +
  • A protocol to ensure serializability (correct concurrent execution).
  • +
  • Growing Phase: locks are acquired; no lock is released.
  • +
  • Shrinking Phase: locks are released; no new lock is acquired.
  • +
  • 2PL guarantees conflict serializability but can cause deadlocks.
  • +
+
+

Once a transaction releases its first lock, it enters the shrinking phase and cannot acquire new locks.

+
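The growing/shrinking rule can be expressed as a tiny checker (an illustrative Python sketch; `is_two_phase` and the op-list format are assumptions for this example): a schedule of one transaction's lock operations is two-phase only if no lock is acquired after the first release.

```python
# Two-phase locking check: once a transaction releases any lock (shrinking phase),
# it must not acquire another.
def is_two_phase(ops):
    """ops: list of ("lock", item) / ("unlock", item) for one transaction."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True
        elif shrinking:  # a lock acquired after the first unlock violates 2PL
            return False
    return True

print(is_two_phase([("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]))  # True
print(is_two_phase([("lock", "A"), ("unlock", "A"), ("lock", "B")]))                   # False
```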
+
+ +
+ +
+

+ Deadlock +

+
    +
  • A deadlock occurs when two or more transactions wait for each other to release locks — forming a cycle.
  • +
  • Detection: use a wait-for graph; a cycle means deadlock.
  • +
  • Prevention: use timestamps (wait-die or wound-wait schemes).
  • +
  • Recovery: abort one transaction (the victim) to break the cycle.
  • +
+
+

Deadlock Example

+
{`T1 holds Lock(A), waits for Lock(B)
+T2 holds Lock(B), waits for Lock(A)
+→ Deadlock! Neither can proceed.`}
+
+
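Wait-for-graph detection, as described above, is just cycle detection. A minimal sketch (illustrative Python; the graph encoding is an assumption for this example) applied to the T1/T2 deadlock:

```python
# Deadlock detection sketch: a cycle in the wait-for graph means deadlock.
def has_cycle(wait_for):
    """wait_for: {txn: set of txns it is waiting on}."""
    visited, on_path = set(), set()

    def dfs(node):
        if node in on_path:
            return True          # back edge: we walked into our own path, cycle found
        if node in visited:
            return False
        visited.add(node)
        on_path.add(node)
        if any(dfs(n) for n in wait_for.get(node, ())):
            return True
        on_path.remove(node)
        return False

    return any(dfs(t) for t in wait_for)

# T1 waits for T2 and T2 waits for T1 (the example above): deadlock.
print(has_cycle({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(has_cycle({"T1": {"T2"}, "T2": set()}))   # False
```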
+ +
+ +
+

+ Timestamp-Based Concurrency Control +

+
    +
  • Each transaction is assigned a unique timestamp when it starts.
  • +
  • Operations are ordered by timestamp to ensure serializability.
  • +
  • Wait-Die: older transaction waits; younger is aborted (dies).
  • +
  • Wound-Wait: an older transaction requesting a lock held by a younger one aborts the younger (wounds it); a younger transaction requesting a lock held by an older one waits.
  • +
+
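The two schemes differ only in what the lock requester does, which fits in two one-line functions (an illustrative Python sketch; smaller timestamp means older transaction):

```python
# Wait-Die vs Wound-Wait: what happens when `requester` asks for a lock held by `holder`?
# Smaller timestamp = older transaction.
def wait_die(requester_ts, holder_ts):
    # Older requester waits; younger requester is aborted (dies).
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    # Older requester aborts the younger holder (wounds it); younger requester waits.
    return "wound holder" if requester_ts < holder_ts else "wait"

print(wait_die(1, 2))    # older requester: wait
print(wait_die(2, 1))    # younger requester: die
print(wound_wait(1, 2))  # older requester: wound holder
print(wound_wait(2, 1))  # younger requester: wait
```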
+
+ ); +}; \ No newline at end of file From e7711ad633cf491538207dddcd1c631b5f1d2346 Mon Sep 17 00:00:00 2001 From: MD JUBER QURAISHI Date: Sat, 16 May 2026 07:50:37 +0530 Subject: [PATCH 2/2] docs: add DBMS chapters 6, 7, 8 - indexing, query processing, recovery and security --- app/sem4/dbms/content/chapter6.tsx | 83 ++++++++++++++++++++ app/sem4/dbms/content/chapter7.tsx | 90 ++++++++++++++++++++++ app/sem4/dbms/content/chapter8.tsx | 117 +++++++++++++++++++++++++++++ 3 files changed, 290 insertions(+) create mode 100644 app/sem4/dbms/content/chapter6.tsx create mode 100644 app/sem4/dbms/content/chapter7.tsx create mode 100644 app/sem4/dbms/content/chapter8.tsx diff --git a/app/sem4/dbms/content/chapter6.tsx b/app/sem4/dbms/content/chapter6.tsx new file mode 100644 index 0000000..7bb75e0 --- /dev/null +++ b/app/sem4/dbms/content/chapter6.tsx @@ -0,0 +1,83 @@ +export const Ch6Content = () => { + return ( +
+

+ Indexing and{" "} + Hashing are techniques used to + speed up data retrieval in large databases without scanning every row. +

+ +
+ +
+

+ Why Indexing? +

+
    +
  • Without an index, every query scans the entire table — slow for large data.
  • +
  • An index is like a book's table of contents — it points directly to the data.
  • +
  • Indexes speed up SELECT queries but slightly slow down INSERT, UPDATE, DELETE.
  • +
  • Created using: CREATE INDEX idx_name ON table(column);
  • +
+
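The CREATE INDEX statement above can be tried end-to-end with Python's stdlib `sqlite3` (a sketch; the `students` table is illustrative). SQLite's `EXPLAIN QUERY PLAN` shows whether a lookup scans the table or uses the index:

```python
import sqlite3

# Before the index: a full table scan. After CREATE INDEX: an index search.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [(i, f"s{i}") for i in range(1000)])

query = "SELECT * FROM students WHERE name = 's500'"
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

conn.execute("CREATE INDEX idx_name ON students(name)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()

# The last column of each plan row is a human-readable detail string,
# typically "SCAN ..." before and "SEARCH ... USING INDEX idx_name ..." after.
print(plan_before[-1][-1])
print(plan_after[-1][-1])
```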
+ +
+ +
+

+ Types of Indexes +

+
    +
  • Primary Index: built on the primary key of an ordered file. One entry per data block.
  • +
  • Secondary Index: built on non-primary key fields. Points to exact records.
  • +
  • Clustering Index: built on a non-key ordering field; records with the same field value are stored physically together.
  • +
  • Dense Index: one index entry for every record in the file.
  • +
  • Sparse Index: one index entry per block, not per record. Requires ordered data.
  • +
+
+

Dense index is faster to search but uses more space. Sparse index saves space but requires the file to be sorted.

+
+
+ +
+ +
+

+ B-Tree and B+ Tree +

+
    +
  • Most databases use B+ Trees for indexing.
  • +
  • B-Tree: stores data in both internal nodes and leaf nodes.
  • +
  • B+ Tree: stores data only in leaf nodes; internal nodes store only keys for navigation.
  • +
  • Leaf nodes in B+ Tree are linked — great for range queries.
  • +
  • All operations (search, insert, delete) take O(log n) time.
  • +
+
+

B+ Tree Structure

+
{`Internal nodes: [10 | 20 | 30]
+                /    |    |    \\
+Leaf nodes: [5,8] [12,15] [22,25] [35,40]
+             ↔      ↔       ↔       ↔  (linked)`}
+
+
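The linked leaves are exactly why range queries are cheap: find the first relevant leaf, then walk sideways. A toy sketch over the leaf level drawn above (illustrative Python; a real B+ tree would descend internal nodes instead of bisecting a list):

```python
import bisect

# Range query over linked B+ tree leaves: locate the first candidate leaf,
# then follow the leaf links until a key exceeds the upper bound.
leaves = [[5, 8], [12, 15], [22, 25], [35, 40]]  # sorted leaves, linked left-to-right

def range_query(lo, hi):
    out = []
    # First leaf whose largest key is >= lo can contain matches.
    i = bisect.bisect_left([leaf[-1] for leaf in leaves], lo)
    while i < len(leaves):          # follow the leaf links
        for key in leaves[i]:
            if key > hi:
                return out          # past the range: stop early
            if key >= lo:
                out.append(key)
        i += 1
    return out

print(range_query(10, 30))  # [12, 15, 22, 25]
```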
+ +
+ +
+

+ Hashing +

+
    +
  • Hashing maps a key value directly to a bucket (storage location) using a hash function.
  • +
  • Best for exact match queries — not range queries.
  • +
  • Static Hashing: fixed number of buckets. Can cause overflow if data grows.
  • +
  • Dynamic Hashing (Extendible Hashing): buckets grow and split as data increases.
  • +
  • Collision: two keys map to the same bucket — handled by chaining or open addressing.
  • +
+
+

Use indexing (B+ Tree) for range queries. Use hashing for fast exact lookups.

+
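Static hashing with chaining, as described above, fits in a few lines (an illustrative Python sketch; the bucket count and key values are made up):

```python
# Static hashing sketch: a fixed number of buckets, collisions chained in a list.
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]

def put(key, value):
    buckets[key % NUM_BUCKETS].append((key, value))  # hash function: key mod buckets

def get(key):
    for k, v in buckets[key % NUM_BUCKETS]:  # walk the chain in the target bucket
        if k == key:
            return v
    return None  # exact-match only: no range support, as noted above

put(101, "Zubair")
put(105, "Asha")   # 105 % 4 == 101 % 4 == 1: collision, chained in the same bucket
print(get(105))    # Asha
```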
+
+
+ ); +}; \ No newline at end of file diff --git a/app/sem4/dbms/content/chapter7.tsx b/app/sem4/dbms/content/chapter7.tsx new file mode 100644 index 0000000..a26f921 --- /dev/null +++ b/app/sem4/dbms/content/chapter7.tsx @@ -0,0 +1,90 @@ +export const Ch7Content = () => { + return ( +
+

+ Query Processing is how the DBMS + takes a SQL query, understands it, and executes it efficiently.{" "} + Query Optimization selects the + most efficient execution plan. +

+ +
+ +
+

+ Steps in Query Processing +

+
    +
  • Parsing: the SQL query is checked for syntax errors and converted into a parse tree.
  • +
  • Translation: the parse tree is converted into relational algebra expressions.
  • +
  • Optimization: the query optimizer picks the most efficient execution plan.
  • +
  • Execution: the chosen plan is executed and results are returned.
  • +
+
+

Query Processing Pipeline

+
{`SQL Query
+   ↓ Parser
+Parse Tree
+   ↓ Translator
+Relational Algebra Expression
+   ↓ Optimizer
+Execution Plan
+   ↓ Evaluator
+Query Result`}
+
+
+ +
+ +
+

+ Query Cost Estimation +

+
    +
  • Cost is measured in terms of disk I/O, CPU time, and memory usage.
  • +
  • Disk I/O dominates — the optimizer minimizes the number of disk reads/writes.
  • +
  • The optimizer uses statistics (number of rows, distinct values, index availability) to estimate cost.
  • +
+
+ +
+ +
+

+ Join Algorithms +

+
    +
  • Nested Loop Join: for each tuple in the outer relation, scan all tuples in the inner. Simple but slow for large tables.
  • +
  • Block Nested Loop Join: loads blocks instead of tuples — fewer disk reads.
  • +
  • Merge Join: both relations sorted on the join attribute, then merged. Efficient for sorted data.
  • +
  • Hash Join: hash both relations on the join attribute and match buckets. Very efficient for large unsorted data.
  • +
+
+

Hash Join is generally the fastest for large datasets. Merge Join is best when data is already sorted.

+
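The build/probe structure of a hash join can be sketched directly (illustrative Python with made-up relations; a real DBMS works block-wise on disk):

```python
# Hash join sketch: build a hash table on the smaller relation, then probe it.
students = [(101, "CSE"), (102, "ECE")]                  # (StudentID, Dept)
enrollments = [(101, "DBMS"), (101, "OS"), (102, "CN")]  # (StudentID, Course)

def hash_join(build, probe):
    table = {}
    for key, val in build:                  # build phase: hash the smaller relation
        table.setdefault(key, []).append(val)
    return [(key, b, p)                     # probe phase: match each probe tuple
            for key, p in probe
            for b in table.get(key, [])]

print(hash_join(students, enrollments))
# [(101, 'CSE', 'DBMS'), (101, 'CSE', 'OS'), (102, 'ECE', 'CN')]
```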
+
+ +
+ +
+

+ Query Optimization Techniques +

+
    +
  • Heuristic Optimization: apply rules to rewrite the query into a more efficient form before execution.
  • +
  • Push selections down: apply WHERE filters as early as possible to reduce rows.
  • +
  • Push projections down: select only needed columns early to reduce data size.
  • +
  • Cost-Based Optimization: enumerate multiple plans, estimate their cost, and pick the cheapest.
  • +
+
+

Heuristic Example

+
{`-- Before optimization (filter happens after join):
+Students ⋈ Enrollments WHERE Students.dept = 'CSE'
+
+-- After optimization (filter before join):
+σ(dept='CSE')(Students) ⋈ Enrollments`}
+
+
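The push-selection-down rule above can be measured on toy data (an illustrative Python sketch; relation sizes are made up): both plans give the same answer, but the optimized one joins half as many tuples.

```python
# Selection pushdown: filter before the join to shrink the intermediate result.
students = [(i, "CSE" if i % 2 == 0 else "ECE") for i in range(100)]  # (id, dept)
enrollments = [(i, "DBMS") for i in range(100)]                       # (id, course)

# Unoptimized: join first (100 intermediate tuples), then filter.
joined = [(s, d, c) for s, d in students for e, c in enrollments if s == e]
unopt = [t for t in joined if t[1] == "CSE"]

# Optimized: filter first (50 tuples survive), then join.
cse = [(s, d) for s, d in students if d == "CSE"]
opt = [(s, d, c) for s, d in cse for e, c in enrollments if s == e]

print(len(joined), len(opt))          # 100 intermediate tuples vs 50 after pushdown
print(sorted(unopt) == sorted(opt))   # True: same answer, less work
```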
+
+ ); +}; \ No newline at end of file diff --git a/app/sem4/dbms/content/chapter8.tsx b/app/sem4/dbms/content/chapter8.tsx new file mode 100644 index 0000000..9f91f13 --- /dev/null +++ b/app/sem4/dbms/content/chapter8.tsx @@ -0,0 +1,117 @@ +export const Ch8Content = () => { + return ( +
+

+ Recovery restores the database to + a consistent state after a failure.{" "} + Security ensures only authorized + users can access or modify data. +

+ +
+ +
+

+ Types of Failures +

+
    +
  • Transaction Failure: logical error or deadlock causes a transaction to abort.
  • +
  • System Failure: power outage or OS crash — data in memory is lost but disk is safe.
  • +
  • Media Failure: disk crash — data on disk is lost. Requires backups.
  • +
+
+ +
+ +
+

+ Log-Based Recovery +

+
    +
  • The DBMS maintains a log file recording every change before it is applied to the database.
  • +
  • Each log record contains: transaction ID, data item, old value, new value.
  • +
  • Undo: if a transaction fails, old values from the log are restored.
  • +
  • Redo: if a committed transaction's changes weren't written to disk, they are reapplied.
  • +
  • Write-Ahead Logging (WAL): log must be written to disk before the actual data change.
  • +
+
+

WAL is the golden rule of recovery — always log before you change.

+
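Undo and redo from the log can be replayed by hand (an illustrative Python sketch; the log records and values are made up): redo committed transactions in forward order, undo uncommitted ones in reverse order.

```python
# Log-based recovery sketch. Each log record: (txn, item, old_value, new_value).
log = [
    ("T1", "A", 10, 20),
    ("T2", "B", 5, 9),
    ("T1", "A", 20, 30),
]
committed = {"T1"}          # T2 was still active at the crash

db = {"A": 10, "B": 5}      # disk state found after the crash

for txn, item, old, new in log:            # redo committed changes, forward order
    if txn in committed:
        db[item] = new
for txn, item, old, new in reversed(log):  # undo uncommitted changes, reverse order
    if txn not in committed:
        db[item] = old

print(db)  # {'A': 30, 'B': 5}: T1 redone, T2 undone
```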
+
+ +
+ +
+

+ Checkpoints +

+
    +
  • A checkpoint is a point where the DBMS writes all in-memory changes to disk and records this in the log.
  • +
  • During recovery, only transactions after the last checkpoint need to be redone or undone.
  • +
  • Checkpoints reduce recovery time significantly.
  • +
+
+

Recovery After Crash

+
{`Checkpoint at T=10
+T1 committed at T=8  → already safe, skip
+T2 committed at T=12 → redo (may not be on disk)
+T3 active at crash   → undo (was never committed)`}
+
+
+ +
+ +
+

+ Shadow Paging +

+
    +
  • An alternative to log-based recovery.
  • +
  • Maintains two page tables: current and shadow.
  • +
  • Changes go to the current page table. On commit, current replaces shadow.
  • +
  • On failure, just restore the shadow page table — no undo needed.
  • +
  • Simpler than logging but causes fragmentation and is rarely used in modern systems.
  • +
+
+ +
+ +
+

+ Database Security +

+
    +
  • Authentication: verifying who the user is — username and password.
  • +
  • Authorization: controlling what an authenticated user can do.
  • +
  • GRANT: gives a user permission. Example: GRANT SELECT ON students TO user1;
  • +
  • REVOKE: removes a permission. Example: REVOKE SELECT ON students FROM user1;
  • +
  • Views as Security: expose only specific columns or rows to certain users.
  • +
+
+ +
+ +
+

+ SQL Injection +

+
    +
  • A common attack where malicious SQL is inserted into an input field to manipulate the database.
  • +
  • Prevention: use prepared statements and parameterized queries — never concatenate raw user input into SQL.
  • +
+
+

SQL Injection Example

+
{`-- Vulnerable query:
+"SELECT * FROM users WHERE name = '" + input + "'"
+
+-- Attacker enters: ' OR '1'='1
+-- Resulting query (returns all users!):
+SELECT * FROM users WHERE name = '' OR '1'='1'
+
+-- Safe fix (prepared statement):
+SELECT * FROM users WHERE name = ?`}
+
+
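The vulnerable and safe queries above can be run side by side with Python's stdlib `sqlite3` (a sketch; the `users` table and the payload follow the example above). With a placeholder, the driver treats the payload as a plain value, never as SQL:

```python
import sqlite3

# SQL injection demo: the classic ' OR '1'='1 payload defeats string
# concatenation but not a parameterized query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "' OR '1'='1"

# Vulnerable: raw user input concatenated into the SQL string.
vulnerable = conn.execute(
    "SELECT * FROM users WHERE name = '" + payload + "'").fetchall()
print(len(vulnerable))  # 2: the injected OR makes the WHERE clause always true

# Safe: a prepared statement with a placeholder.
safe = conn.execute("SELECT * FROM users WHERE name = ?", (payload,)).fetchall()
print(len(safe))        # 0: no user is literally named ' OR '1'='1
```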
+
+ ); +}; \ No newline at end of file