Commit 6cf7dd6

Merge pull request #9 from AtlasAnalyticsLab/fix/jekyll-sitemap-2026-02-08
Fix/jekyll sitemap 2026 02 08
2 parents d06fe66 + edaf5d3 commit 6cf7dd6

16 files changed, with 49 additions and 92 deletions

DEVELOPMENT.md

Lines changed: 7 additions & 4 deletions
@@ -570,6 +570,7 @@ front matter...
 ```
 - **Purpose:** Tells search engines where to find the sitemap
 - **Impact:** SEO - affects how search engines crawl the site
+- **Note:** `/sitemap.xml` is generated automatically by the `jekyll-sitemap` plugin (configured in `_config.yml`). Do not maintain a manual `sitemap.xml` file.
 - **Priority:** 🔴 CRITICAL
 
 3. **`CNAME`** (line 19)
@@ -615,6 +616,7 @@ If you need to change the website URL:
 #### Step 1: Update Critical Configuration
 - [ ] Update `_config.yml` → `url:` field
 - [ ] Verify `robots.txt` → `Sitemap:` line (generated from `{{ site.url }}{{ site.baseurl }}`)
+- [ ] Verify `/sitemap.xml` is generated (jekyll-sitemap) and includes key pages
 - [ ] Update or remove `CNAME` file if using custom domain
 
 #### Step 2: Test Locally
@@ -768,9 +770,10 @@ headline: "Text with [link](OPENINGS_LINK)"
 **Security & Protection:**
 - ✅ Enhanced 404 page with navigation buttons and Bootstrap icons
 - ✅ Comprehensive `robots.txt` with crawler access control
-- Allows: Googlebot, Bingbot, Slurp with 10-second crawl delay
-- Blocks: MJ12bot, AhrefsBot, SemrushBot, DotBot, PetalBot, DataForSeoBot
-- Restricts: `/images/`, `/assets/`, `/css/`, `/js/` directories
+- Allows: major search engines (Googlebot, Bingbot, Slurp, DuckDuckBot, etc.)
+- Blocks: known heavy scraper / SEO bots (AhrefsBot, SemrushBot, MJ12bot, DotBot, PetalBot, DataForSeoBot)
+- Avoids `Crawl-delay` (ignored by Googlebot and may trigger Search Console warnings)
+- Restricts: internal build artifacts only (`/_site/`, `/bin/`)
 - ✅ Apache security configuration (`.htaccess`)
 - Directory browsing disabled
 - Security headers (X-Frame-Options, X-XSS-Protection, etc.)
@@ -968,7 +971,7 @@ headline: "Text with [link](OPENINGS_LINK)"
 - Test theme compatibility
 
 3. **Security Review:**
-- Review `robots.txt` blocked crawlers
+- Review `robots.txt` blocked scraper list (and confirm sitemap URL)
 - Update `.htaccess` security headers
 - Check GitHub Pages security settings
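The checklist item "Verify `/sitemap.xml` is generated (jekyll-sitemap) and includes key pages" could be automated along the following lines. This is a minimal sketch, not part of the commit: the helper names `sitemap_paths` and `missing_paths`, the `example.org` URLs, and the list of expected paths are all assumptions made for illustration.

```ruby
require "uri"

# Collect the URL path of every <loc> entry in sitemap XML text.
def sitemap_paths(xml)
  xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten.map { |loc| URI(loc).path }
end

# Return the expected paths that the sitemap does not list.
def missing_paths(xml, expected_paths)
  expected_paths - sitemap_paths(xml)
end

# Hypothetical sitemap content; a real check would read the generated file.
SAMPLE_SITEMAP = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url><loc>https://example.org/</loc></url>
    <url><loc>https://example.org/contact/</loc></url>
    <url><loc>https://example.org/funding/</loc></url>
  </urlset>
XML

# Hypothetical list of pages that must appear in the sitemap.
EXPECTED_PATHS = ["/", "/contact/", "/gallery/"]

puts "Missing from sitemap: #{missing_paths(SAMPLE_SITEMAP, EXPECTED_PATHS).inspect}"
```

Against a real build, the XML would presumably come from `File.read("_site/sitemap.xml")` after `bundle exec jekyll build`, assuming the default `_site` output directory.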

Gemfile

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,7 @@
 # Key Dependencies:
 # - jekyll 4.4.1+: Static site generator
 # - jekyll-scholar: BibTeX bibliography support
+# - jekyll-sitemap: Automatic sitemap.xml generation for search engines
 # - webrick 1.9+: Ruby web server for local development
 #
 # Installation:
@@ -29,5 +30,6 @@ gem "jekyll", "4.4.1"
 # gem "github-pages", "~> 232", group: :jekyll_plugins
 
 gem "jekyll-scholar", group: :jekyll_plugins
+gem "jekyll-sitemap", group: :jekyll_plugins
 gem "webrick", "~> 1.9"
 gem "wdm", ">= 0.1.0" if Gem.win_platform?

README.md

Lines changed: 1 addition & 1 deletion
@@ -328,7 +328,7 @@ bundle install
 
 ## Security Features
 
--**Crawler Protection:** `robots.txt` controls search engine access
+-**Crawler Protection:** `robots.txt` allows major search engines and blocks known heavy scraper bots
 -**Custom 404 Page:** User-friendly error handling with navigation
 -**DDoS Protection:** GitHub Pages + Cloudflare CDN
 -**Security Headers:** Content security and XSS protection
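The `robots.txt` policy described in this commit (search engines allowed, scraper bots blocked, only build artifacts restricted) can be spot-checked with a small parser. The sketch below is an illustration under stated assumptions: the sample file is invented to mirror that description and is not the repository's actual `robots.txt`, the helpers `parse_robots` and `allowed?` are hypothetical, and matching is simplified to exact user-agent names and plain path prefixes (no `Allow` precedence, no wildcards).

```ruby
# Parse robots.txt text into { "user-agent (lowercased)" => [disallowed prefixes] }.
def parse_robots(text)
  rules = Hash.new { |hash, key| hash[key] = [] }
  agents = []       # user agents of the group currently being read
  in_rules = false  # true once the current group has seen a rule line
  text.each_line do |line|
    match = line.sub(/#.*/, "").match(/\A\s*([A-Za-z-]+)\s*:\s*(.*?)\s*\z/)
    next unless match
    field, value = match[1].downcase, match[2]
    case field
    when "user-agent"
      agents = [] if in_rules  # a User-agent line after rules starts a new group
      in_rules = false
      agents << value.downcase
      rules[value.downcase]    # register the agent even if it gets no rules
    when "disallow"
      in_rules = true
      agents.each { |agent| rules[agent] << value } unless value.empty?
    when "allow"
      in_rules = true          # Allow rules are ignored in this simplified sketch
    end
  end
  rules
end

# True when no Disallow prefix for the agent (or the "*" fallback) matches the path.
def allowed?(rules, agent, path)
  key = agent.downcase
  group = rules.key?(key) ? rules[key] : rules.fetch("*", [])
  group.none? { |prefix| path.start_with?(prefix) }
end

# Hypothetical robots.txt mirroring the policy described in the diff.
SAMPLE_ROBOTS = <<~ROBOTS
  User-agent: *
  Disallow: /_site/
  Disallow: /bin/

  User-agent: AhrefsBot
  User-agent: SemrushBot
  Disallow: /

  Sitemap: https://example.org/sitemap.xml
ROBOTS

rules = parse_robots(SAMPLE_ROBOTS)
puts allowed?(rules, "Googlebot", "/contact/")  # true: falls back to the "*" group
puts allowed?(rules, "AhrefsBot", "/contact/")  # false: blocked site-wide
puts allowed?(rules, "Googlebot", "/bin/tool")  # false: build artifact path
```

Real crawlers apply more detailed matching rules, so a check like this is only a sanity test of the intended policy.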

_config.yml

Lines changed: 4 additions & 0 deletions
@@ -34,6 +34,10 @@ include:
 - _pages
 - robots.txt
 
+plugins:
+- jekyll-scholar
+- jekyll-sitemap
+
 sass:
 sass_dir: _sass
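Because the sitemap now depends on the `plugins:` list added here, a pre-build check that the list contains the expected entries may be useful. The following is a sketch only: `missing_plugins`, `REQUIRED_PLUGINS`, and the sample configuration text are assumptions for illustration, and the sample deliberately omits `jekyll-sitemap` to show a failing case.

```ruby
require "yaml"

# Plugins this sketch expects to find in _config.yml (taken from the diff).
REQUIRED_PLUGINS = %w[jekyll-scholar jekyll-sitemap]

# Return the required plugins that the configuration text does not list.
def missing_plugins(config_text, required)
  config = YAML.safe_load(config_text) || {}
  required - Array(config["plugins"])
end

# Hypothetical configuration fragment that forgets jekyll-sitemap.
SAMPLE_CONFIG = <<~CONFIG
  include:
    - _pages
    - robots.txt
  plugins:
    - jekyll-scholar
CONFIG

puts "Missing plugins: #{missing_plugins(SAMPLE_CONFIG, REQUIRED_PLUGINS).inspect}"
```

In a real project the text would presumably come from `File.read("_config.yml")`.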

_pages/aboutwebsite.md

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "About the website"
 layout: textlay
 excerpt: "About the website."
-sitemap: false
 permalink: /aboutwebsite.html
 ---
 <!--

_pages/allnews.md

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "News"
 layout: textlay
 excerpt: "Atlas Analytics Lab at Concordia University."
-sitemap: false
 permalink: /allnews.html
 ---
 <!--

_pages/contact.md

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "Atlas Analytics Lab - Contact"
 layout: textlay
 excerpt: "Ways to reach the Atlas Analytics Lab."
-sitemap: false
 permalink: /contact/
 ---
 <!--

_pages/funding.md

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "Atlas Analytics Lab - Funding"
 layout: textlay
 excerpt: "Atlas Analytics Lab -- Funding."
-sitemap: false
 permalink: /funding/
 ---
 <!--

_pages/gallery.md

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "Lab Life"
 layout: gallerylay
 excerpt: "Atlas Analytics Lab at Concordia University."
-sitemap: false
 permalink: /gallery/
 ---
 <!--

_pages/home.md

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@
 title: "Atlas Analytics Lab - Home"
 layout: homelay
 excerpt: "Atlas Analytics Lab at Concordia University."
-sitemap: false
 permalink: /
 ---
 <!--
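The page changes above all remove `sitemap: false` from the front matter, a flag that `jekyll-sitemap` uses to leave a page out of the generated sitemap. A scan for remaining occurrences could look like the sketch below. The helpers `front_matter` and `excluded_from_sitemap?` and both sample pages are hypothetical, written only to illustrate the before and after states.

```ruby
require "yaml"

# Return the front matter of a page as a Hash, or {} when the text has none.
def front_matter(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*$/m)
  data = match ? YAML.safe_load(match[1]) : nil
  data.is_a?(Hash) ? data : {}
end

# True when the page opts out of the sitemap with `sitemap: false`.
def excluded_from_sitemap?(text)
  front_matter(text)["sitemap"] == false
end

# Hypothetical page as it looked before the commit.
OLD_PAGE = <<~PAGE
  ---
  title: "Lab Life"
  layout: gallerylay
  sitemap: false
  permalink: /gallery/
  ---
  Page body.
PAGE

# The same hypothetical page after the flag was removed.
NEW_PAGE = OLD_PAGE.sub("sitemap: false\n", "")

puts excluded_from_sitemap?(OLD_PAGE)  # true
puts excluded_from_sitemap?(NEW_PAGE)  # false
```

Applied to the repository, the same predicate could be run over `Dir.glob("_pages/*.md")` to list any page still excluded.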
