feat: Internal cross-linking System Agent task #228

@saraichinwag

Why

Internal linking between related posts is currently done manually or by AI during content generation. 1,389 out of 1,707 published posts (81%) have zero internal links. This is massive SEO value left on the table.

New posts get links during generation (Pipeline 12 AI step uses local_search), but existing posts need backfill — and the linking needs to be semantic, not a bolted-on "Related Posts" section.

Architecture: Follows AltTextTask Pattern

This is a System Agent task, same architecture as AltTextTask:

  1. One post per job: SystemAgent::scheduleTask("internal_linking", ["post_id" => 1234])
  2. Action Scheduler handles batching — no server overload on bulk runs
  3. AI-powered semantic insertion — uses configured provider via RequestBuilder::build()
  4. Idempotent — tracks linked posts in post meta, won't re-process
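
The fan-out described in points 1 and 4 could look roughly like the sketch below. `SystemAgentStub` is a hypothetical stand-in that only records calls so the example runs outside WordPress; the real `SystemAgent::scheduleTask()` signature is taken from the call shown above, and `scheduleInternalLinking()` is an illustrative helper, not part of the plugin.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the plugin's SystemAgent.
 * It only records calls so the sketch runs outside WordPress.
 */
final class SystemAgentStub
{
    /** @var array<int, array{task: string, params: array}> */
    public static array $scheduled = [];

    public static function scheduleTask(string $task, array $params): void
    {
        self::$scheduled[] = ['task' => $task, 'params' => $params];
    }
}

/**
 * Schedule one internal_linking job per post, skipping posts already processed.
 *
 * @param int[] $postIds       Candidate post IDs (duplicates allowed).
 * @param int[] $alreadyLinked IDs assumed to already carry _dm_internal_links meta.
 * @return int Number of jobs scheduled.
 */
function scheduleInternalLinking(array $postIds, array $alreadyLinked): int
{
    // De-duplicate first, then drop anything that was linked on a previous run.
    $pending = array_diff(array_unique($postIds), $alreadyLinked);

    foreach ($pending as $postId) {
        SystemAgentStub::scheduleTask('internal_linking', ['post_id' => (int) $postId]);
    }

    return count($pending);
}
```

Keeping each job to a single post means a failed AI call only affects that post, and Action Scheduler can spread the rest over time.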

Task Flow

InternalLinkingTask::execute($jobId, $params)
  1. Load post content
  2. Find 3 related published posts (category/tag overlap, no AI needed)
  3. Filter out posts already linked in content
  4. Send to AI: post content + related post URLs/titles
  5. AI returns content with links semantically woven into existing sentences
  6. Validate: confirm links were inserted, content length roughly matches
  7. Update post_content
  8. Store metadata in post meta (_dm_internal_links)
  9. completeJob() with results
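
Step 6 is the safety net before post_content is overwritten, so here is a minimal sketch of what that check might do. The function name, the 10% tolerance, and the return shape are assumptions for illustration; the issue only says links must be present and the length should roughly match. It needs PHP 8.0+ for `str_contains()`.

```php
<?php
declare(strict_types=1);

/**
 * Check an AI rewrite before it replaces post_content.
 *
 * @param string   $original     Original post content.
 * @param string   $updated      Content returned by the AI.
 * @param string[] $expectedUrls URLs of the related posts offered to the AI.
 * @param float    $tolerance    Allowed relative change in visible text length (assumed 10%).
 * @return array{valid: bool, inserted: string[], reason: string}
 */
function validateLinkedContent(string $original, string $updated, array $expectedUrls, float $tolerance = 0.10): array
{
    // Which expected URLs appear as an href in the new content but not the old?
    $inserted = [];
    foreach ($expectedUrls as $url) {
        $needle = 'href="' . $url . '"';
        if (str_contains($updated, $needle) && !str_contains($original, $needle)) {
            $inserted[] = $url;
        }
    }
    if ($inserted === []) {
        return ['valid' => false, 'inserted' => [], 'reason' => 'no links inserted'];
    }

    // Compare visible text only, so the added <a> markup does not count as drift.
    $before = strlen(strip_tags($original));
    $after  = strlen(strip_tags($updated));
    $drift  = $before > 0 ? abs($after - $before) / $before : 1.0;
    if ($drift > $tolerance) {
        return ['valid' => false, 'inserted' => $inserted, 'reason' => 'content length drifted'];
    }

    return ['valid' => true, 'inserted' => $inserted, 'reason' => ''];
}
```

Because anchors only wrap existing phrases, the visible text should be nearly identical before and after; a large drift suggests the model rewrote or truncated the post.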

AI Prompt Strategy

The AI receives:

  • Full post content (Gutenberg blocks)
  • 3 related post objects: {url, title, excerpt}

Instructions: "Weave these links naturally into existing sentences. Find phrases where the related topic is mentioned and wrap them in anchor tags. Do NOT add a Related Posts section. Do NOT change tone, meaning, or structure. Return full updated content."

Example — before:

In the winter, sunflowers don't grow because they need warm soil.

After (linking to "Why Don't Sunflowers Grow in Winter"):

In the winter, <a href="...">sunflowers don't grow</a> because they need warm soil.
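
A sketch of how the message for the AI might be assembled from the two inputs listed above. `buildLinkingPrompt()` and the `RELATED POSTS:` / `POST CONTENT:` section labels are assumptions; the instruction text is quoted from this issue. How the string is handed to `RequestBuilder::build()` is not shown, since its signature is not given here.

```php
<?php
declare(strict_types=1);

/**
 * Assemble the user message for the linking request.
 *
 * @param string $postContent Full post content (Gutenberg blocks).
 * @param array<int, array{url: string, title: string, excerpt: string}> $relatedPosts
 */
function buildLinkingPrompt(string $postContent, array $relatedPosts): string
{
    $instructions = 'Weave these links naturally into existing sentences. '
        . 'Find phrases where the related topic is mentioned and wrap them in anchor tags. '
        . 'Do NOT add a Related Posts section. Do NOT change tone, meaning, or structure. '
        . 'Return full updated content.';

    // Keep only the three documented fields so stray data never reaches the model.
    $related = array_map(
        static fn(array $post): array => [
            'url'     => $post['url'],
            'title'   => $post['title'],
            'excerpt' => $post['excerpt'],
        ],
        array_values($relatedPosts)
    );

    $relatedJson = json_encode(
        $related,
        JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
    );

    return $instructions
        . "\n\nRELATED POSTS:\n" . $relatedJson
        . "\n\nPOST CONTENT:\n" . $postContent;
}
```

Putting the post content last keeps the instructions and link list in a fixed position regardless of post length.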

Related Post Discovery (No AI)

Score candidates by:

  • Shared category: 1 point per shared category
  • Shared tag: 2 points per shared tag
  • Title keyword overlap: bonus points
  • Pick top 3 that aren't already linked
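
The scoring rules above can be sketched in plain PHP as follows. The post array shape, the function names, the 4-letter keyword cutoff, and the bonus of 1 point per shared title keyword are all assumptions, since the issue leaves "bonus points" unspecified. In the plugin the term IDs would presumably come from WordPress term queries rather than hand-built arrays.

```php
<?php
declare(strict_types=1);

/** Lowercased title words of 4+ letters; the length cutoff is an assumption. */
function titleKeywords(string $title): array
{
    preg_match_all('/[a-z]{4,}/', strtolower($title), $matches);
    return array_unique($matches[0]);
}

/**
 * Score one candidate against the source post.
 * Each post is ['id' => int, 'title' => string, 'categories' => int[], 'tags' => int[]].
 */
function scoreCandidate(array $source, array $candidate, int $keywordBonus = 1): int
{
    $sharedCategories = count(array_intersect($source['categories'], $candidate['categories']));
    $sharedTags       = count(array_intersect($source['tags'], $candidate['tags']));
    $sharedKeywords   = count(array_intersect(
        titleKeywords($source['title']),
        titleKeywords($candidate['title'])
    ));

    // 1 point per shared category, 2 per shared tag, plus the assumed keyword bonus.
    return $sharedCategories * 1 + $sharedTags * 2 + $sharedKeywords * $keywordBonus;
}

/** Return the IDs of the top $limit candidates that are not already linked. */
function pickRelatedPosts(array $source, array $candidates, array $alreadyLinkedIds, int $limit = 3): array
{
    $scores = [];
    foreach ($candidates as $candidate) {
        $id = $candidate['id'];
        if ($id === $source['id'] || in_array($id, $alreadyLinkedIds, true)) {
            continue;
        }
        $score = scoreCandidate($source, $candidate);
        if ($score > 0) {
            $scores[$id] = $score;
        }
    }

    arsort($scores); // highest score first, keys (post IDs) preserved
    return array_slice(array_keys($scores), 0, $limit);
}
```

Candidates with a score of zero are dropped, so a post with no real overlap gets fewer than three links rather than unrelated ones.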

Files

inc/Engine/AI/System/Tasks/InternalLinkingTask.php   - The task
inc/Abilities/InternalLinkingAbilities.php            - Ability registration
inc/Cli/Commands/LinksCommand.php                     - CLI (wp datamachine links crosslink)

Registration

// In SystemAgentServiceProvider::getBuiltInTasks()
$tasks["internal_linking"] = InternalLinkingTask::class;

Interface

wp_get_ability("datamachine/internal-linking")->execute([
    "post_ids"       => [1234, 5678, 9012],  // explicit list
    // OR
    "category"       => "birds",              // all posts in category
    "links_per_post" => 3,                    // default 3
    "dry_run"        => false,
]);

CLI

wp datamachine links crosslink --post_id=1234
wp datamachine links crosslink --category=birds --dry-run
wp datamachine links crosslink --all --links-per-post=3

Post Meta Tracking

// _dm_internal_links meta per post
[
    ["target_post_id" => 5678, "url" => "...", "anchor_text" => "...", "linked_at" => "2026-02-18"],
    ["target_post_id" => 9012, "url" => "...", "anchor_text" => "...", "linked_at" => "2026-02-18"],
]
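
One way the entries above might be produced and then used for the idempotency check, sketched under assumptions: `buildLinkMeta()` and `filterAlreadyLinked()` are hypothetical helpers, and the regex expects double-quoted `href` attributes as WordPress normally emits them.

```php
<?php
declare(strict_types=1);

/**
 * Build _dm_internal_links entries from the updated content.
 *
 * @param string             $content Updated post content.
 * @param array<int, string> $targets Map of target post ID => URL offered to the AI.
 * @param string             $date    Date to record, e.g. gmdate('Y-m-d').
 */
function buildLinkMeta(string $content, array $targets, string $date): array
{
    $entries = [];
    foreach ($targets as $postId => $url) {
        // Find the first anchor pointing at this URL and capture its inner HTML.
        $pattern = '~<a\s[^>]*href="' . preg_quote($url, '~') . '"[^>]*>(.*?)</a>~is';
        if (preg_match($pattern, $content, $match) === 1) {
            $entries[] = [
                'target_post_id' => (int) $postId,
                'url'            => $url,
                'anchor_text'    => trim(strip_tags($match[1])),
                'linked_at'      => $date,
            ];
        }
    }
    return $entries;
}

/** Drop candidate IDs that already appear in stored meta, so reruns are no-ops. */
function filterAlreadyLinked(array $candidateIds, array $storedMeta): array
{
    $linked = array_column($storedMeta, 'target_post_id');
    return array_values(array_diff($candidateIds, $linked));
}
```

In WordPress the result would typically be saved with `update_post_meta($postId, '_dm_internal_links', $entries)` and read back with `get_post_meta()` before the next run.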

Scope

  • 1,389 posts need backfill
  • Up to ~4,167 inserted links (3 per post); at one AI call per post, as in the task flow, that is ~1,389 cheap AI calls (gpt-5-mini) spread over Action Scheduler
  • Callable from: ability, CLI, agent pings, pipeline steps
  • Posts already have "Related reading" sections at the bottom — this task handles inline semantic links only
