Fix preserving dataset merge for SFT #2218

Open
Bungmint wants to merge 1 commit into NVIDIA-NeMo:main from Bungmint:codex-fix-preserving-dataset-merge

@Bungmint Bungmint commented Apr 6, 2026

What does this PR do?

Handles concatenation of datasets with heterogeneous tool schemas directly, instead of calling the HuggingFace Datasets concatenate_datasets function.
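For readers unfamiliar with the problem, here is a minimal sketch of one way to merge map-style datasets without unifying their schemas. It is not the implementation in this PR: the names `ConcatView` and `merge_sketch` are hypothetical, plain Python lists stand in for the real datasets, and the assumption is that the motivation is rows whose `tools` entries have differently shaped parameter fields, which a schema-unifying concatenation may reject or alter.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Any, Sequence


class ConcatView:
    """Read-only view over several map-style datasets, indexed as one.

    Hypothetical helper: rows are returned untouched, so each dataset
    keeps its own tool schema.
    """

    def __init__(self, datasets: Sequence[Any]) -> None:
        if not datasets:
            raise ValueError("need at least one dataset")
        self._datasets = list(datasets)
        # Cumulative end offsets, e.g. lengths [2, 3] give [2, 5].
        self._ends = list(accumulate(len(d) for d in self._datasets))

    def __len__(self) -> int:
        return self._ends[-1]

    def __getitem__(self, index: int) -> Any:
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        # First dataset whose end offset is strictly greater than index.
        which = bisect_right(self._ends, index)
        start = self._ends[which - 1] if which else 0
        return self._datasets[which][index - start]


def merge_sketch(datasets: list[Any]) -> Any:
    """Hypothetical stand-in for a schema-preserving merge."""
    return ConcatView(datasets)


# Two toy datasets whose tool parameters have different shapes.
search_rows = [
    {"tools": [{"name": "search", "parameters": {"query": {"type": "string"}}}]},
]
math_rows = [
    {"tools": [{"name": "add", "parameters": {"x": {"type": "number"},
                                              "y": {"type": "number"}}}]},
]

merged = merge_sketch([search_rows, math_rows])
```

In this sketch `merged[0]` and `merged[1]` return the original row objects, so the per-row tool schemas are left exactly as each source dataset defined them.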

Issues

Closes PR #2116.

@Bungmint Bungmint requested review from a team as code owners April 6, 2026 06:18

copy-pr-bot bot commented Apr 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@chtruong814 chtruong814 added the needs-follow-up Issue needs follow-up label Apr 6, 2026
Contributor

@yuki-97 yuki-97 left a comment

Hi @Bungmint, thanks for helping fix this! Left one minor comment.

you'll also need to:

  1. rebase onto main, since your branch is quite far behind it (>10 commits behind will cause CI to fail)
  2. fix DCO check: https://github.com/NVIDIA-NeMo/RL/pull/2218/checks?check_run_id=70050497521

data_config[key] = default_data_config[key]


def merge_map_style_datasets(datasets: list[Any]) -> Any:

nit: merge_map_style_datasets -> merge_datasets

Suggested change
- def merge_map_style_datasets(datasets: list[Any]) -> Any:
+ def merge_datasets(datasets: list[Any]) -> Any:

@chtruong814 chtruong814 added waiting-for-customer Waiting for response from the original author and removed needs-follow-up Issue needs follow-up waiting-for-customer Waiting for response from the original author labels Apr 8, 2026
@chtruong814 chtruong814 added the waiting-on-customer Waiting on the original author to respond label Apr 18, 2026

Labels

community-request waiting-on-customer Waiting on the original author to respond

4 participants