diff --git a/.gitignore b/.gitignore
index bb91773..3cb2889 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,3 +6,6 @@ __pycache__/
PDS
.DS_Store
+/styles/00 archive/
+00 archive/
+.idea/
\ No newline at end of file
diff --git a/.idea/.gitignore b/.idea/.gitignore
new file mode 100644
index 0000000..13566b8
--- /dev/null
+++ b/.idea/.gitignore
@@ -0,0 +1,8 @@
+# Default ignored files
+/shelf/
+/workspace.xml
+# Editor-based HTTP Client requests
+/httpRequests/
+# Datasource local storage ignored files
+/dataSources/
+/dataSources.local.xml
diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..babd809
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,217 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## \[2.6.0\] - 2025-01-26
+
+Added professional PDF publication capabilities.
+
+### Added
+
+- **PDF Generator:** Introduced `htmlToPDF.py`. Converts the dumped HTML structure into a single, hierarchical PDF document.
+
+- **PDF Features:**
+
+ - **Smart Splitting:** Option `--split-by-root` to generate separate PDFs for each top-level folder (scalable for 4000+ pages).
+
+ - **Mixed Orientation:** Supports mixing Portrait and Landscape pages within the same PDF based on source HTML hints.
+
+ - **Bookmarks:** Generates PDF Outlines/Bookmarks matching the sidebar structure.
+
+ - **Link Rewriting:** Converts HTML links to internal PDF anchors for seamless navigation.
+
+- **PDF Configuration:** Auto-generates `styles/pdf_settings.css` for user-definable page layouts (A4/Letter, Margins).
+
+
+## \[2.5.0\] - 2025-11-22
+
+Introduction of the "Architecture Sandbox" for offline restructuring.
+
+### Added
+
+- **Architecture Sandbox:** Introduced `create_editor.py` and `patch_sidebar.py`. Users can now generate a visual Drag & Drop editor (`editor_sidebar.html`) to restructure the exported documentation offline.
+
+- **Editor Features:**
+
+ - **Zero-Dependency:** The editor is a self-contained HTML file requiring no internet access.
+
+ - **Drag & Drop:** Robust reordering of pages and folders.
+
+ - **Working Copy:** Supports a `sidebar_edit.md` workflow to keep the original structure safe.
+
+
+### Changed
+
+- **CSS Strategy:** Refined the "Two-Layer" styling approach (Standard + Custom) to be more robust.
+
+
+## \[2.4.1\] - 2025-11-21
+
+UI/UX Improvements and Bug Fixes.
+
+### Added
+
+- **Metadata Injection:** Page Title, Author, and Modification Date are now injected directly into the HTML Body (top of the page) for better readability.
+
+- **Automatic Time-stamping:** Output folders are now automatically named with `YYYY-MM-DD HHMM [Title]` to support clean versioned backups.
+
+- **Persistent Sidebar:** The sidebar width is now remembered across page loads using `localStorage`.
+
+- **Absolute Links in Markdown:** The generated `sidebar.md` uses absolute file URIs to support opening links in external editors like Logseq or WebStorm directly.
+
+
+### Fixed
+
+- **Empty Page Bug:** Fixed an issue where pages with empty bodies (folders) resulted in 0-byte HTML files. Now generates a proper HTML skeleton with title and sidebar.
+
+- **Markdown Patching:** Updated `patch_sidebar.py` to handle absolute file URIs correctly.
+
+- **UI Layout:** Optimized Sidebar/Content padding and Hamburger button alignment.
+
+
+## \[2.4.0\] - 2025-11-21
+
+Advanced Filtering and Tree Logic Update.
+
+### Added
+
+- **Label Forest Mode:** The `label` command now supports deep recursion ("Forest Export"). It finds all pages with the include-label and treats them as roots for full tree exports.
+
+- **Label Pruning:** Added `--exclude-label` to prune subtrees based on a specific label (e.g., 'archived') during recursion.
+
+
+## \[2.3.0\] - 2025-11-21
+
+Enterprise Performance & Usability Release.
+
+### Added
+
+- **Recursive Inventory:** Changed scanning logic to use `/child/page` API endpoints. This ensures the export respects the **manual sort order** of Confluence.
+
+- **Multithreading:** Added `-t/--threads` argument to parallelize page downloads (Phase 2), significantly improving performance on large spaces.
+
+- **Tree Pruning (ID):** Added `--exclude-page-id` to skip specific branches during recursion.
+
+- **JS Resizer:** The sidebar now has a robust JavaScript-based drag-handle for resizing.
+
+- **UX Improvements:**
+
+ - Fixed Hamburger position (top-left).
+
+ - Added "Heartbeat" visualization during inventory scan.
+
+ - Added VPN Reminder for Data Center profiles.
+
+
+### Changed
+
+- **Architecture:** Split process into a strict "Inventory Phase" (Serial, Recursive for sorting) and "Download Phase" (Parallel).
+
+
+## \[2.2.0\] - 2025-11-20
+
+Introduction of Static Sidebar Injection.
+
+### Added
+
+- **Static Sidebar Injection:** Automatically generates a hierarchical navigation tree and injects it into every HTML page.
+
+- **Inventory Phase:** Scans all pages/metadata _before_ downloading content to allow for accurate progress bars (`tqdm`) and global tree generation.
+
+- **Smart Linking:** Improved detection of dead/external links vs. local links based on the inventory.
+
+- **CSS Auto-Discovery:** The script automatically detects and applies `site.css` from the local `styles/` directory.
+
+- **Multi-CSS Support:** Allows layering multiple CSS files (Standard + Custom).
+
+- **`sidebar.html` Export:** Saves the generated sidebar tree as a separate file.
+
+
+### Changed
+
+- **HTML Layout:** Pages are now wrapped in a Flexbox layout container to support the sidebar.
+
+- **Logging:** Cleaned up library logging to support progress bars.
+
+
+## \[2.1.0\] - 2025-11-19
+
+Major restoration and improvement of functionality ("Visual Copy" release).
+
+### Added
+
+- **HTML Processing with BeautifulSoup:** Re-introduced intelligent HTML parsing.
+
+ - **Image Downloading:** Automatically detects embedded images/emoticons, downloads them, and rewrites HTML links to local paths (`../attachments/`).
+
+ - **Link Sanitizing:** Attempts to rewrite Confluence internal links to relative filenames.
+
+ - **Metadata Injection (Head):** Injects Title, Page ID, and Labels into the HTML `<head>`.
+
+- **Export View:** Switched API fetch from `storage` format to `export_view` (or `view`) to get rendered HTML (resolves macros like TOC).
+
+- **Attachment Downloading:** Downloads _all_ attachments of a page via API list, not just those embedded in the text.
+
+
+### Changed
+
+- **HTML First:** The primary output format is now processed HTML (`export_view`). RST export is optional via `-R`.
+
+- **Dependencies:** Added `beautifulsoup4` to requirements.
+
+- **CSS handling:** Improved relative pathing for robust offline viewing.
+
+
+## \[2.0.0\] - 2025-11-17
+
+This version introduces a major architectural refactoring to support both Confluence Cloud and Data Center.
+
+### Added
+
+- **Confluence Data Center Support:** The script now supports both Confluence Cloud (`--profile cloud`) and Data Center (`--profile dc`).
+
+- **Configuration File (`confluence_products.ini`):** All platform-specific values (API URL templates, auth methods, base paths) are now defined in this external INI file.
+
+- **Data Center Authentication:** Added support for Bearer Token (Personal Access Token) authentication.
+
+- **New `label` Command:** Added support for dumping all pages with a specific label.
+
+- **Troubleshooting Hints:** Added specific error messages for Data Center users when authentication fails (Intranet/VPN warning).
+
+- **Documentation:** Added `CONTRIBUTING.md` and `CHANGELOG.md`.
+
+
+### Changed
+
+- **\[BREAKING CHANGE\] CLI Architecture (Sub-Commands):** The script's interface has been completely modernized, replacing the `-m`/`--mode` flag with sub-commands (like `git`).
+
+ - **REMOVED:** The `-m`/`--mode` flag.
+
+ - **REMOVED:** The `-s`/`--site` argument.
+
+ - **ADDED:** Sub-commands: `single`, `tree`, `space`, `all-spaces`, `label`.
+
+ - **ADDED (Global):** `--base-url`, `--profile`, `--context-path`.
+
+- **Refactored `myModules.py`:** All API functions are now platform-agnostic. Hardcoded URLs removed.
+
+- **Internationalization:** All code comments translated to English.
+
+
+_History below this line is from the original author (jgoldin-skillz)._
+
+## \[1.0.2\] - 2022-03-03
+
+- Bugfixes
+
+
+## \[1.0.1\] - 2022-03-03
+
+- Added `confluenceDumpWithPython.py`
+
+
+## \[1.0.0\] - 2022-03-01
+
+- Initial version
\ No newline at end of file
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..4bd212e
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,61 @@
+# Contributing Guide
+
+This document explains the internal architecture of the toolbox for contributors.
+
+## Architectural Overview
+
+The toolbox consists of three loosely coupled components that share a common data structure (the Output Directory).
+
+### 1\. Data Acquisition (`confluenceDumpToHTML.py`)
+
+- **Inventory Phase (Serial):**
+
+ - Uses recursion (`recursive_scan`) to walk the Confluence tree via `/child/page` API.
+
+ - This guarantees the `sidebar.md` matches the _manual sort order_ of Confluence (unlike CQL search).
+
+ - Applies pruning (excludes) at this stage.
+
+- **Download Phase (Parallel):**
+
+ - Uses `ThreadPoolExecutor` for performance.
+
+ - **Data Mocking (Overrides):** If a `.mhtml` file is present in the override directory, the API response is mocked. We use the `email` library to parse MHTML and `BeautifulSoup` to strip "dirty" DOM elements (hidden rows, UI artifacts) using a strict "Data Diet" approach.
+
+- **Processing (`myModules.py`):**
+
+ - Uses `BeautifulSoup` to sanitize HTML.
+
+ - Injects Metadata, CSS links, and the static Sidebar into every page.
+
+
+### 2\. Structural Editing (`create_editor.py` / `patch_sidebar.py`)
+
+- **Zero-Dependency:** The editor is a self-contained HTML file generated via string concatenation (to avoid Python formatting issues with JS code).
+
+- **Logic:** The structure is parsed from Markdown into a DOM tree, manipulated via vanilla JS (Drag & Drop), and exported back to Markdown.
+
+- **Patching:** The patcher parses the modified Markdown (`sidebar_edit.md`) and re-injects the navigation tree into all static HTML files.
+
+
+### 3\. Document Composition (`htmlToDoc.py`)
+
+- **Separation:** Kept separate to avoid heavy dependencies (`weasyprint`/GTK) for users who only want HTML.
+
+- **Assembly:**
+
+ 1. Reads `sidebar.md` to determine order.
+
+ 2. Extracts `` from every page (ignoring the sidebar).
+
+ 3. Rewrites `href="page.html"` to internal anchors `href="#page-anchor"`.
+
+ 4. Wraps content in `div.chapter` with orientation classes.
+
+- **Styling & Orientation:**
+
+ - WeasyPrint does not support mixing orientations easily via global CSS.
+
+ - **Solution:** We scan the source HTML for `size: landscape`. If found, we wrap the content in a specific div (`.landscape-wrapper`) which maps to a named page `@page landscape` in the CSS.
+
+ - **CSS Priority:** `DEFAULT_PDF_BASE_CSS` -> `site.css` -> `pdf_settings.css`.
\ No newline at end of file
diff --git a/README.md b/README.md
index 207040a..4c4f60e 100644
--- a/README.md
+++ b/README.md
@@ -1,142 +1,99 @@
-# Confluence Dump With Python
-
-Dump Confluence pages using Python (requests) in HTML and RST format, including embedded pictures and attachments.
-References to downloaded files will be updated to their local relative path.
-
-## Description
-
-Nonetheless, the refactoring will require only 2 files and accept command-line args:
-* `myModules.py`: Contains all the required functions.
-* `confluenceDumpWithPython.py`: Script to use with the following command line args:
- * `-m, --mode`: The export mode, `single`, `space`, `bylabel`, `pageprops` (required).
- * Note: Only `single`, `pageprops` and `space` have been implemented so far.
- * `-S, --site`: The Atlassian Site (required).
- * `-s, --space`: The Space Key (if needed).
- * `-p, --page`: The Page ID (if needed).
- * `-l, --label`: The Page label (if needed).
- * `-x, --sphinx`: The `_images` and `_static` folders are placed at the root of the export folder, instead of together with the exported HTML files.
- * `--notags`: Does not add the tags directives to the rst files (when the `sphinx-tags` addon is not used).
-* `updatePageLinks.py`: Update online confluence links to the local files that have been downloaded so far.
- * `--folder`: Folder containing the files to update.
- * `--test`: Instead of overwriting the original .rst files, it will create updated ones with `zout_` as a prefix.
-* `getPageEditorVersion.py`: Get the editor version from single pages or all pages in a space.
- * `--site`: The Atlassian Site (required).
- * `--page`: Page ID (either/or)
- * `--space`: Space Key (either/or)
-
-For CSS Styling, it uses the `confluence.css` from Confluence that can be obtained by using the Workaround described in: https://jira.atlassian.com/browse/CONFSERVER-40907.
-The `site.css` file included with Confluence UI HTML exports is not as complete as the one above.
-
-### Folder and file structure:
-
-* The default output folder is `output/` under the same path as the script.
-* A folder with the Space name, Page Properties report page, single page name or Page Label name will be created under the output folder.
-* By default, the `_images/` and `_static/` folders will be placed in the page|space|pageprops|label folder.
- * The `--sphinx` command line option will put those folder directly under the output folder
-* The file `styles/confluence.css` will be copied into the defined `_static/`
-
-## What it does
-
-* Leverages the Confluence Cloud API
-* Puts Confluence meta data like Page ID and Page Labels, in the HTML headers and RST fields.
-* beautifulsoup is used to parse HTML to get and update content, ie. change remote links to local links.
-* Download for every page, all attachments, emoticons and embedded files.
-
-## Requirements
-
-* declare system variables:
- * `atlassianAPIToken`
- * `atlassianUserEmail`
-
-### Dependencies
-
-* python3
- * requests
- * beautifulsoup4
- * Pillow (handle images)
- * pandoc & pypandoc (convert to RST)
- * re
-
-### Installing
-
-* Clone repo.
-* Install dependencies.
-* Declare system variables for Atlassian API Token.
-
-### Executing program
-
-
-* How to download a single page based on its ID.
-
+# Confluence Dump with Python
+This toolbox exports content from a Confluence instance (Cloud or Data Center) into a static, navigable HTML archive and converts it into professional, hierarchical PDF documents.
+**Key Features:**
+- **Visual Fidelity:** Fetches rendered HTML (`export_view`) to preserve macros, layouts, and formatting.
+- **Navigation:** Injects a fully functional, static navigation sidebar into every HTML page.
+- **Offline Browsing:** Localizes images and links, and downloads **all** attachments (PDFs, Office docs, etc.) for complete offline access.
+- **Sort Order:** Recursively scans the tree to ensure the **manual sort order** from Confluence is preserved.
+- **Metadata Injection:** Automatically adds Page Title, Author, and Modification Date to the top of every page.
+- **Versioning:** Creates timestamped output folders (e.g., `2025-11-21 1400 Space IT`) for clean history management.
+- **Professional PDF:** Merges the content into a single PDF with TOC, Bookmarks, and mixed Portrait/Landscape orientation.
+## Toolbox Overview (Key Files)
+- **`confluenceDumpToHTML.py`**: The main downloader. Connects to Confluence, scrapes content, and creates the folder structure.
+- **`htmlToDoc.py`**: The publisher. Converts the downloaded HTML folder into a single PDF or a Master-HTML file for LLMs.
+- **`confluence_products.ini`**: Configuration file for API URLs (Cloud vs. Data Center).
+- **`styles/`**: Contains CSS files. `site.css` (if present) is applied automatically. `pdf_settings.css` configures the PDF layout (A4/Letter, Margins).
+## Quick Start Guide
+Follow these steps to create your first PDF export of a single page tree.
+### 1\. Setup
+Install requirements and set your credentials.
```
-confluenceDumpWithPython.py -m single -S -p [