feat(task): add grok-imagine-video model support for the xAI channel #4217

Open
faqwang wants to merge 1 commit into QuantumNous:main from faqwang:main

@faqwang faqwang commented Apr 13, 2026

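⚠️ PR Notice

Important

  • Adds support for video generation tasks using xAI's official grok-imagine-video model.

📝 Description

Supports calling grok-imagine-video video generation tasks via /v1/video/generations and accepts the image.url input parameter. Reference-image and video input modes are still to be completed.
The billing rules are hardcoded. The recommended billing setting for grok-imagine-video is per-call billing at 0.1 per call, which corresponds to the official price. Official pricing: 480p and 720p are billed at 0.05 and 0.07 per second respectively, and images at 0.02 each (the image price is hardcoded).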
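
The pricing rule above can be illustrated with a small worked example. This is only a sketch of the arithmetic, not the code in this PR: `BillingUnits` and `perSecondMills` are hypothetical names, the rates are the ones quoted in the description, and treating them as USD is an assumption. Prices are held as integer thousandths so the sums stay exact.

```go
package main

import "fmt"

// Prices in thousandths of one currency unit (USD is an assumption).
const (
	basePerCallMills = 100 // 0.1 per call, the recommended base price
	imageMills       = 20  // 0.02 per input image
)

// perSecondMills returns the per-second rate for a resolution label.
// Only 480p and 720p rates are listed in the description.
func perSecondMills(resolution string) (int, bool) {
	switch resolution {
	case "480p":
		return 50, true // 0.05 per second
	case "720p":
		return 70, true // 0.07 per second
	}
	return 0, false
}

// BillingUnits (hypothetical helper) expresses a request's price
// as a multiple of the 0.1 per-call base price.
func BillingUnits(resolution string, seconds, images int) (float64, error) {
	rate, ok := perSecondMills(resolution)
	if !ok {
		return 0, fmt.Errorf("no listed rate for resolution %q", resolution)
	}
	total := seconds*rate + images*imageMills
	return float64(total) / float64(basePerCallMills), nil
}

func main() {
	// 10 seconds at 720p plus one image: 0.70 + 0.02 = 0.72, i.e. 7.2 base units.
	units, err := BillingUnits("720p", 10, 1)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%.2f units\n", units) // prints "7.20 units"
}
```

With a 0.1 base, one image at 0.02 is 0.2 units, which is the figure the per-image multiplier needs to match.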

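🚀 Type of change

  • ✨ New feature - for major features, please discuss via an Issue first

🔗 Related Issue

  • Closes # (if any)

✅ Checklist

  • Human confirmation: I have personally organized and written this description and have not directly pasted unprocessed AI output.
  • Not a duplicate: I have searched existing Issues and PRs and confirmed this is not a duplicate submission.
  • Bug fix note: If this PR is marked as a bug fix, I have filed or linked the corresponding Issue, and I will not classify design trade-offs, mismatched expectations, or misunderstandings as bugs.
  • Understanding of changes: I understand how these changes work and their potential impact.
  • Focused scope: This PR contains no code changes unrelated to the current task.
  • Local verification: Tests or manual verification have been run locally and passed, and maintainers can use this to reproduce the results.
  • Security and compliance: The code contains no sensitive credentials and follows the project's coding standards.

📸 Proof of Work

(Paste screenshots, key logs, or test reports here to show that the change works)
Task creation result:
image

Task completion result: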
image

Summary by CodeRabbit

  • New Features
    • Added support for xAI's video generation API, enabling users to generate videos through the xAI channel with configurable parameters (duration, aspect ratio, resolution, seed).
    • Integrated automatic billing calculation based on video generation parameters.
    • Implemented task polling and status tracking for video generation requests.

coderabbitai Bot commented Apr 13, 2026

Walkthrough

This PR introduces support for xAI video generation by implementing a new TaskAdaptor for the xAI channel type. The adaptor integrates with the existing task framework, handling request validation, URL construction, billing estimation, HTTP submission, response parsing, and task status polling. Supporting utilities normalize video attributes and process image inputs.

Changes

| Cohort / File(s) | Summary |
|---|---|
| xAI TaskAdaptor Core: relay/channel/task/xai/adaptor.go, relay/channel/task/xai/dto.go | Implements complete TaskAdaptor with methods for request/response lifecycle (validation, URL building, headers, body construction, billing estimation), HTTP submission, and task result parsing. Includes request/response DTOs for xAI video generation API interactions. |
| xAI Helper Utilities: relay/channel/task/xai/helpers.go, relay/channel/task/xai/image.go | Adds utility functions to resolve video attributes (duration, aspect ratio, resolution) from metadata and size strings; implements multipart image extraction and data URL conversion for image inputs. |
| xAI Testing: relay/channel/task/xai/adaptor_test.go | Introduces unit tests for EstimateBilling method, verifying duration resolution from both request and metadata sources. |
| Integration: model/task.go, relay/relay_adaptor.go | Expands InitTask condition to include ChannelTypeXai alongside existing channel types; registers xAI TaskAdaptor in GetTaskAdaptor function. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant TaskAdaptor as xAI TaskAdaptor
    participant XaiAPI as xAI API
    participant HTTPClient

    Client->>TaskAdaptor: ValidateRequestAndSetAction()
    activate TaskAdaptor
    TaskAdaptor-->>Client: validation result
    deactivate TaskAdaptor

    Client->>TaskAdaptor: BuildRequestURL()
    TaskAdaptor-->>Client: /v1/videos/generations

    Client->>TaskAdaptor: BuildRequestHeader()
    TaskAdaptor-->>Client: Bearer auth + Content-Type

    Client->>TaskAdaptor: BuildRequestBody()
    TaskAdaptor-->>Client: JSON with video params

    Client->>TaskAdaptor: DoRequest()
    activate TaskAdaptor
    TaskAdaptor->>HTTPClient: POST request
    HTTPClient->>XaiAPI: Submit video generation
    XaiAPI-->>HTTPClient: {request_id, ...}
    HTTPClient-->>TaskAdaptor: response
    deactivate TaskAdaptor
    TaskAdaptor-->>Client: response

    Client->>TaskAdaptor: DoResponse()
    activate TaskAdaptor
    TaskAdaptor->>TaskAdaptor: Parse request_id
    TaskAdaptor-->>Client: taskID + OpenAIVideo response
    deactivate TaskAdaptor

    loop Poll for result
        Client->>TaskAdaptor: FetchTask()
        activate TaskAdaptor
        TaskAdaptor->>HTTPClient: GET /v1/videos/{task_id}
        HTTPClient->>XaiAPI: Check status
        XaiAPI-->>HTTPClient: {status, video_url, ...}
        HTTPClient-->>TaskAdaptor: response
        TaskAdaptor-->>Client: response
        deactivate TaskAdaptor

        Client->>TaskAdaptor: ParseTaskResult()
        TaskAdaptor-->>Client: TaskInfo with status/URL/error
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

Possibly related PRs

Suggested reviewers

  • creamlike1024
  • seefs001
  • Calcium-Ion

Poem

🐰 A video channel hops into place,
xAI joins the race with grace,
Metadata normalized, images processed right,
Status polls spinning through the night,
From request to video, the adaptor's delight! 🎬

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 53.85% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding xAI channel support with grok-imagine-video model to the task system, which aligns with the primary objective of the pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model/task.go`:
- Around line 176-179: When handling ChannelTypeXai in the task creation branch
that currently only sets privateData.Key = relayInfo.ChannelMeta.ApiKey, also
persist the submit-time header overrides (the effective auth context) into the
task's private data so async follow-ups can reuse them; specifically, copy
relayInfo.ChannelMeta's client header / header-override field(s) (e.g.,
client_header, header_overrides) into a new or existing privateData field (e.g.,
privateData.HeaderOverrides or privateData.ClientHeaders) so
relay/channel/task/xai/adaptor.go can reconstruct requests from the full auth
context rather than just channel.Key.

In `@relay/channel/task/xai/adaptor.go`:
- Around line 79-90: The code currently counts all entries in req.Images plus a
multipart-upload image and multiplies by 0.02 units each, but BuildRequestBody
only ever forwards a single image and 0.02 units is 10x too small (0.02 USD per
image with a $0.1 base unit equals 0.2 units). Change the logic to treat images
as at-most-one-billable input: compute a boolean like hasImage :=
len(req.Images) > 0 || ExtractMultipartImageURL(c, nil) != "" and set imageCount
= 1 if hasImage else 0, then compute imageBillingUnits := float64(imageCount) *
0.2 (0.2 units per image). Update references to imageCount/imageBillingUnits and
keep ExtractMultipartImageURL and BuildRequestBody behavior unchanged.

In `@relay/channel/task/xai/helpers.go`:
- Around line 97-113: The SizeToResolution function currently maps any size
under 1920 to "720p", causing 480p inputs to be billed at a higher tier; update
SizeToResolution to parse width/height as it does now, compute maxDim from parts
(using strconv.Atoi results), then return "1080p" when maxDim >= 1920, "720p"
when maxDim >= 1280 and < 1920, and "480p" when maxDim < 1280, so that 640x480
and 854x480 map to "480p" (retain the existing "720p" fallback on parse errors).

In `@relay/channel/task/xai/image.go`:
- Around line 49-50: Guard against a nil info before mutating its Action: in the
function where info.Action is set (referencing info and
constant.TaskActionGenerate in image.go and the callsite EstimateBilling() ->
ExtractMultipartImageURL in adaptor.go), add a conditional check (if info !=
nil) around the line that assigns info.Action = constant.TaskActionGenerate so
multipart input_reference flows that pass a nil info won't panic; keep the
subsequent return buildDataURL(mimeType, fileBytes) unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0adf2e99-f473-42b0-ab5f-f16e8c914863

📥 Commits

Reviewing files that changed from the base of the PR and between 3ab65a8 and 393bbbf.

📒 Files selected for processing (7)
  • model/task.go
  • relay/channel/task/xai/adaptor.go
  • relay/channel/task/xai/adaptor_test.go
  • relay/channel/task/xai/dto.go
  • relay/channel/task/xai/helpers.go
  • relay/channel/task/xai/image.go
  • relay/relay_adaptor.go

Comment thread model/task.go
Comment on lines 176 to 179
 	if relayInfo.ChannelMeta.ChannelType == constant.ChannelTypeGemini ||
-		relayInfo.ChannelMeta.ChannelType == constant.ChannelTypeVertexAi {
+		relayInfo.ChannelMeta.ChannelType == constant.ChannelTypeVertexAi ||
+		relayInfo.ChannelMeta.ChannelType == constant.ChannelTypeXai {
 		privateData.Key = relayInfo.ChannelMeta.ApiKey

⚠️ Potential issue | 🟠 Major

Persist xAI's resolved header overrides, not just the task key.

Adding privateData.Key for ChannelTypeXai is necessary, but it still leaves async follow-up requests short of the full auth context. relay/channel/task/xai/adaptor.go rebuilds polling requests from the bearer key alone, so any submit-time {client_header:*} / header-override auth will be lost after task creation.

Based on learnings, async video task flows should reuse the effective authentication context (API key + header overrides) captured at task submit time rather than re-deriving it from channel.Key at each call site.
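
One way to picture the suggestion is sketched here, under stated assumptions. `TaskPrivateData`, its `HeaderOverrides` field, and `NewPollRequest` are hypothetical names for illustration and are not the project's actual types; the `/v1/videos/{task_id}` path is taken from the sequence diagram, and the base URL is a placeholder.

```go
package main

import (
	"fmt"
	"net/http"
)

// TaskPrivateData is a hypothetical stand-in for the task's private data.
// Key mirrors privateData.Key; HeaderOverrides is the assumed extra field
// holding the header overrides resolved at submit time.
type TaskPrivateData struct {
	Key             string
	HeaderOverrides map[string]string
}

// NewPollRequest (hypothetical) builds the status request from the auth
// context saved at submit time instead of from the channel key alone.
func NewPollRequest(baseURL, taskID string, pd TaskPrivateData) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/v1/videos/"+taskID, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+pd.Key)
	// Overrides are applied last so they take precedence over the default header.
	for name, value := range pd.HeaderOverrides {
		req.Header.Set(name, value)
	}
	return req, nil
}

func main() {
	pd := TaskPrivateData{
		Key:             "example-key",
		HeaderOverrides: map[string]string{"X-Client-Id": "abc"},
	}
	req, err := NewPollRequest("https://api.example.invalid", "task-123", pd)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String(), req.Header.Get("X-Client-Id"))
}
```

Applying the overrides after the bearer header is a design choice in this sketch; whether overrides should win over the default header depends on how the project resolves them at submit time.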

Comment on lines +79 to +90
	// 4. Count the images and compute the image billing units
	// Count the images in a regular request
	imageCount := len(req.Images)

	// An image supplied as a file upload also counts as one
	// (see the ExtractMultipartImageURL logic in your code)
	if imageURL := ExtractMultipartImageURL(c, nil); imageURL != "" {
		imageCount++
	}

	// $0.002 per image with a $0.1 base, so each image is 0.02 units
	imageBillingUnits := float64(imageCount) * 0.02

⚠️ Potential issue | 🟠 Major

Align image billing with the single-image request shape and stated price.

This path bills len(req.Images) plus a multipart upload, but BuildRequestBody() only ever forwards one image (input_reference or req.Images[0]). It also uses 0.02 base units per image, which is $0.002 when the base unit is $0.1; the PR objective says $0.02 per image. As written, multi-image requests are billed for inputs that are dropped, and each billed image is still underpriced by 10x.

💸 Proposed fix
-	// Count the images in a regular request
-	imageCount := len(req.Images)
+	imageCount := 0
+	if len(req.Images) > 0 {
+		imageCount = 1
+	}

-	// An image supplied as a file upload also counts as one
-	// (see the ExtractMultipartImageURL logic in your code)
+	// multipart input takes precedence over req.Images[0]
 	if imageURL := ExtractMultipartImageURL(c, nil); imageURL != "" {
-		imageCount++
+		imageCount = 1
 	}

-	// $0.002 per image with a $0.1 base, so each image is 0.02 units
-	imageBillingUnits := float64(imageCount) * 0.02
+	// $0.02 per image with a $0.1 base, so each image is 0.2 units
+	imageBillingUnits := float64(imageCount) * 0.2

Comment on lines +97 to +113
// SizeToResolution converts a "WxH" size string to a resolution label used by xAI.
func SizeToResolution(size string) string {
	parts := strings.SplitN(strings.ToLower(strings.TrimSpace(size)), "x", 2)
	if len(parts) != 2 {
		return "720p"
	}
	w, _ := strconv.Atoi(parts[0])
	h, _ := strconv.Atoi(parts[1])
	maxDim := w
	if h > maxDim {
		maxDim = h
	}
	if maxDim >= 1920 {
		return "1080p"
	}
	return "720p"
}

⚠️ Potential issue | 🟠 Major

Handle a real 480p tier instead of collapsing all smaller sizes into 720p.

SizeToResolution("640x480") / ("854x480") currently returns 720p, so this adaptor can never hit the lower-resolution pricing path when the client provides a 480p-sized input. That will request and bill a higher tier than intended.

💡 Proposed direction
 func SizeToResolution(size string) string {
 	parts := strings.SplitN(strings.ToLower(strings.TrimSpace(size)), "x", 2)
 	if len(parts) != 2 {
 		return "720p"
 	}
 	w, _ := strconv.Atoi(parts[0])
 	h, _ := strconv.Atoi(parts[1])
+	if w <= 0 || h <= 0 {
+		return "720p"
+	}
 	maxDim := w
 	if h > maxDim {
 		maxDim = h
 	}
 	if maxDim >= 1920 {
 		return "1080p"
 	}
-	return "720p"
+	if maxDim >= 1280 {
+		return "720p"
+	}
+	return "480p"
 }
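
For reference, the proposed direction can be written out as a standalone program. This is a sketch that follows the thresholds in the diff above (1920 and 1280 on the larger dimension); checking the `strconv.Atoi` errors explicitly is an addition here and behaves the same as the `w <= 0 || h <= 0` guard, since `Atoi` returns 0 on failure.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SizeToResolution maps a "WxH" size string to a resolution label.
// The larger dimension decides the tier: >= 1920 is 1080p,
// >= 1280 is 720p, anything smaller is 480p.
func SizeToResolution(size string) string {
	parts := strings.SplitN(strings.ToLower(strings.TrimSpace(size)), "x", 2)
	if len(parts) != 2 {
		return "720p"
	}
	w, errW := strconv.Atoi(parts[0])
	h, errH := strconv.Atoi(parts[1])
	if errW != nil || errH != nil || w <= 0 || h <= 0 {
		return "720p" // unparsable or non-positive dimensions keep the default
	}
	maxDim := w
	if h > maxDim {
		maxDim = h
	}
	switch {
	case maxDim >= 1920:
		return "1080p"
	case maxDim >= 1280:
		return "720p"
	default:
		return "480p"
	}
}

func main() {
	for _, size := range []string{"640x480", "854x480", "1280x720", "1920x1080"} {
		fmt.Printf("%s -> %s\n", size, SizeToResolution(size))
	}
}
```

With these thresholds both 640x480 and 854x480 resolve to 480p, which is the case the comment is concerned about.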

Comment on lines +49 to +50
	info.Action = constant.TaskActionGenerate
	return buildDataURL(mimeType, fileBytes)

⚠️ Potential issue | 🔴 Critical

Guard info before mutating Action.

EstimateBilling() calls ExtractMultipartImageURL(c, nil) in relay/channel/task/xai/adaptor.go, so a multipart input_reference request will panic here before submission. Make the action update conditional on info != nil.

🐛 Proposed fix
-	info.Action = constant.TaskActionGenerate
+	if info != nil {
+		info.Action = constant.TaskActionGenerate
+	}
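
The failure mode and the guard can be reproduced in isolation. In this minimal sketch, `TaskInfo`, `actionGenerate`, and `markGenerate` are hypothetical stand-ins for the relay info type, `constant.TaskActionGenerate`, and the assignment inside `ExtractMultipartImageURL`; they are not the project's real definitions.

```go
package main

import "fmt"

// TaskInfo is a hypothetical stand-in for the relay info struct.
type TaskInfo struct {
	Action string
}

// actionGenerate stands in for constant.TaskActionGenerate.
const actionGenerate = "generate"

// markGenerate sets the action only when a TaskInfo was supplied,
// so callers that pass nil (as the billing estimate does) do not panic.
func markGenerate(info *TaskInfo) {
	if info != nil {
		info.Action = actionGenerate
	}
}

func main() {
	markGenerate(nil) // nil info, as in the EstimateBilling call path: no panic

	info := &TaskInfo{}
	markGenerate(info)
	fmt.Println(info.Action) // prints "generate"
}
```

Without the `info != nil` check, the first call would dereference a nil pointer and panic at runtime.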