This is a placeholder issue for the first major release [v1.0.0](https://github.com/indrajithi/tiny-web-crawler/milestones)

**Please feel free to create issues from this list**

## Scope and Features: First major version v1.0.0

### Functional Requirements

- [x] Basic crawling functionality #1
- [x] Configurable option for the maximum number of links to crawl #1
- [x] Handle both relative and absolute URLs #1
- [x] Save crawl results to a specified file #1
- [x] Configurable verbosity levels for logging #7
- [x] Concurrency and custom delay #7
- [x] Support regular expressions #16
- [x] Crawl internal/external links only #11
- [x] Return optional HTML in the response #19
- [ ] Crawl depth per website/domain #37
- [x] Logging #38
- [x] Retry mechanism for transient errors #39
- [ ] Support JavaScript-heavy dynamic websites #10
- [x] (Optional) Respect robots.txt #42
- [ ] (Optional) User-agent customization
- [ ] (Optional) Proxy support
- [ ] (Optional) Use asynchronous I/O
- [ ] (Optional) Crawl output to a database (maybe MongoDB)

### Non-Functional Requirements

- [x] Git workflow for CI/CD #4
- [ ] Documentation (API and developer) #18
- [x] Test coverage above 80% #28
- [x] Git hooks #22
- [x] Modular and extensible architecture #17
- [ ] (Optional) Memory benchmark: monitor memory usage during the crawling process
- [ ] (Optional) Security considerations (e.g., handling of malicious content)
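The retry mechanism for transient errors (#39) could follow a standard exponential-backoff pattern. This is a generic sketch under assumed defaults (the `retry` helper and its parameters are illustrative, not the crawler's actual API):

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

# Errors worth retrying; a real crawler would also treat HTTP 429/5xx
# responses as transient (illustrative choice, not the project's).
TRANSIENT: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError)

def retry(func: Callable[[], T], max_retries: int = 3,
          base_delay: float = 1.0) -> T:
    """Call `func`, retrying transient errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TRANSIENT:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # back off: 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

Keeping the backoff policy in one helper like this also makes it easy to unit-test without touching the network.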
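Respecting robots.txt (#42) can lean on the standard library. A minimal sketch using `urllib.robotparser` (the rules and URLs below are illustrative; in the crawler the policy would be fetched from the target site's `/robots.txt`):

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt policy from raw lines; RobotFileParser can also
# fetch it over HTTP via set_url() + read().
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Check each URL against the policy before crawling it.
allowed = rp.can_fetch("*", "https://example.com/docs/page.html")
blocked = rp.can_fetch("*", "https://example.com/private/data.html")
```

Here `allowed` is `True` and `blocked` is `False`, so the crawler would skip anything under `/private/`.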