tests: add whitespace tests for vertical tab behavior #155028

Open
Brace1000 wants to merge 40 commits into rust-lang:main from Brace1000:whitespace-tests

@Brace1000 Brace1000 commented Apr 9, 2026

This PR adds two small tests to highlight how vertical tab (\x0B)
is handled differently across Rust's whitespace definitions.

The Rust lexer treats vertical tab as whitespace (Unicode
Pattern_White_Space), while split_ascii_whitespace follows the
WhatWG Infra Standard and does not include vertical tab.

These tests make that difference visible and easier to understand.

See: rust-lang/rust-project-goals#53
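
As a minimal sketch of the standard-library half of this difference (the helper name `split_words` is made up for illustration and is not part of the PR), the following shows that `split_ascii_whitespace` leaves a string containing a vertical tab in one piece, while a plain space splits it:

```rust
/// Hypothetical helper: split a string on ASCII whitespace as the
/// standard library defines it (WhatWG Infra Standard).
fn split_words(input: &str) -> Vec<&str> {
    input.split_ascii_whitespace().collect()
}

fn main() {
    // Vertical tab (U+000B) is not ASCII whitespace for the standard
    // library, so nothing is split here.
    println!("{:?}", split_words("a\x0Bb"));

    // A regular space does split the string into two words.
    println!("{:?}", split_words("a b"));

    // The byte-level and char-level predicates disagree on vertical tab:
    // is_ascii_whitespace rejects it, is_whitespace (Unicode White_Space)
    // accepts it.
    println!("{}", '\x0B'.is_ascii_whitespace());
    println!("{}", '\x0B'.is_whitespace());
}
```

The lexer side cannot be shown as visibly here, because a literal vertical tab between tokens does not render in most viewers.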

rustbot added the S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) labels Apr 9, 2026
Collaborator

rustbot commented Apr 9, 2026

r? @dingxiangfei2009

rustbot has assigned @dingxiangfei2009.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: compiler
  • compiler expanded to 69 candidates
  • Random selection from 11 candidates

Author

@Brace1000 Brace1000 left a comment

Fixed tidy issues in the whitespace tests: spaces and trailing newline.

Contributor

@teor2345 teor2345 left a comment

Thanks, just needs a few tweaks

let x = 5;
let y = 10;
let z = x + y;

Contributor

Since vertical tab doesn't show up in GitHub's PR review rendering, please put a comment above each line containing the whitespace.

You might want to add lines with each of the 11 permitted whitespace characters:
https://doc.rust-lang.org/reference/whitespace.html

And then some lines with the other 14 disallowed whitespace characters (the ones from this list marked White_Space, that aren't in the first list):
https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt
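
One way to derive the second list rather than copying it by hand is sketched below. It assumes that `char::is_whitespace` tracks the Unicode White_Space property, and it hard-codes the 11 Pattern_White_Space code points in a helper named `is_pattern_white_space`, which is hypothetical and not a standard-library function. Note that U+200E and U+200F are in the first list but are not White_Space, so the size of the resulting set is worth checking against PropList.txt rather than assuming it.

```rust
/// Hypothetical helper: the 11 code points with the Unicode
/// Pattern_White_Space property, hard-coded for illustration.
fn is_pattern_white_space(c: char) -> bool {
    matches!(
        c,
        '\u{0009}'..='\u{000D}' // tab, line feed, vertical tab, form feed, carriage return
            | '\u{0020}' // space
            | '\u{0085}' // next line
            | '\u{200E}' // left-to-right mark
            | '\u{200F}' // right-to-left mark
            | '\u{2028}' // line separator
            | '\u{2029}' // paragraph separator
    )
}

/// Characters that `char::is_whitespace` accepts (Unicode White_Space)
/// but that are not Pattern_White_Space.
fn white_space_only() -> Vec<char> {
    ('\u{0}'..=char::MAX)
        .filter(|c| c.is_whitespace() && !is_pattern_white_space(*c))
        .collect()
}

fn main() {
    // Print each candidate as a code point so it is visible in a diff.
    for c in white_space_only() {
        println!("U+{:04X}", c as u32);
    }
}
```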

Author

Done! Added a comment above each line with an invisible whitespace character so they are visible in the diff. Also expanded the test to cover all 11 permitted Pattern_White_Space characters inline, and listed the 14 disallowed Unicode White_Space characters in comments since placing them between tokens would cause a compile error.

Contributor

You can create a separate test that fails, and put //@ check-fail at the top of the file:
https://rustc-dev-guide.rust-lang.org/tests/directives.html#controlling-outcome-expectations

This is how to test whitespace that is not allowed in Rust source code.

Author

Thanks for the feedback! I’ve added the failing UI test and updated it to match the full stderr output, including the help message. I’ll go through it again and make the remaining tweaks.

@@ -0,0 +1,22 @@
// This test checks that split_ascii_whitespace does NOT split on
Contributor

I'm not sure if this test is relevant to the compiler?

Author

Fair point. The test documents the gap between what the lexer accepts and what the stdlib gives you. Happy to remove it if you think it doesn't belong here.

Comment thread tests/ui/README.md Outdated

Tests on `where` clauses. See [Where clauses | Reference](https://doc.rust-lang.org/reference/items/generics.html#where-clauses).

## `whitespace`
Contributor

This will need an explanation of why the whitespace tests are needed. It's a good place to mention that is_ascii_whitespace and is_whitespace in the standard library don't match the Rust language's definition of whitespace.

Author

@Brace1000 Brace1000 Apr 11, 2026

Added! The README now explains that the Rust lexer uses Unicode Pattern_White_Space, which differs from both is_ascii_whitespace (WhatWG, skips vertical tab) and is_whitespace (Unicode White_Space, broader set). That context makes it clearer why these tests exist.
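
The three definitions can be compared side by side with a small sketch. Here `is_pattern_white_space` and `classify` are hypothetical helpers written for this illustration (the lexer's real implementation is not shown in this thread); `is_ascii_whitespace` and `is_whitespace` are the standard-library methods on `char`.

```rust
/// Hypothetical helper: the 11 Pattern_White_Space code points,
/// hard-coded for illustration.
fn is_pattern_white_space(c: char) -> bool {
    matches!(
        c,
        '\u{0009}'..='\u{000D}'
            | '\u{0020}'
            | '\u{0085}'
            | '\u{200E}'
            | '\u{200F}'
            | '\u{2028}'
            | '\u{2029}'
    )
}

/// Returns (Pattern_White_Space, is_ascii_whitespace, is_whitespace).
fn classify(c: char) -> (bool, bool, bool) {
    (
        is_pattern_white_space(c),
        c.is_ascii_whitespace(),
        c.is_whitespace(),
    )
}

fn main() {
    // Vertical tab, left-to-right mark, no-break space, plain space.
    for c in ['\u{000B}', '\u{200E}', '\u{00A0}', ' '] {
        let (pattern, ascii, unicode) = classify(c);
        println!(
            "U+{:04X}: pattern={} ascii={} unicode={}",
            c as u32, pattern, ascii, unicode
        );
    }
}
```

Vertical tab is accepted by two of the three definitions, the left-to-right mark only by Pattern_White_Space, and the no-break space only by `is_whitespace`, which is the mismatch the README entry describes.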

// the standard library's is_ascii_whitespace does NOT include vertical
// tab, following the WhatWG Infra Standard instead.
//
// See: https://github.com/rust-lang/rust-project-goals/issues/53
Contributor

@teor2345 teor2345 Apr 10, 2026

Where did you get this link? It's not the Outreachy tracking issue.

Author

Got it. Fixed it to point to the correct Outreachy tracking issue.

rustbot added the has-merge-commits (PR has merge commits, merge with caution.) and S-waiting-on-author (Status: This is awaiting some action (such as code changes or more information) from the author.) labels Apr 11, 2026

Add two small tests to highlight how vertical tab is handled differently.

- vertical_tab_lexer.rs checks that the lexer treats vertical tab as whitespace
- ascii_whitespace_excludes_vertical_tab.rs shows that split_ascii_whitespace does not split on it

This helps document the difference between the Rust parser (which accepts vertical tab)
and the standard library’s ASCII whitespace handling.

See: rust-lang/rust-project-goals#53
make sure tabs and spaces are well checked
Updated error message to clarify invisible characters.
Updated error message for invisible characters in code.
Add error message for invalid whitespace in code
Updated the test to check for invalid whitespace handling.
Fix formatting issue by adding space around '=' in variable declaration.
Update the error messages for invalid whitespace in the test.

Contributor

@teor2345 teor2345 left a comment

Looks like you're struggling a bit, here are some notes about what to update in the tests

Comment thread tests/ui/whitespace/invalid_whitespace.rs
|
= help: invisible characters like '\u{200b}' are not usually visible in text editors

error: aborting due to 1 previous error
Contributor

You need to update this file so it matches the test output exactly.

The easiest way to do this is ./x test ui/whitespace --bless

Author

Got it, thanks! I ran "./x test ui/whitespace --bless" and updated the file to match the expected output. Everything is passing on my end now.

Author

Brace1000 commented Apr 15, 2026

Hi @teor2345, since today is the deadline, I just wanted to check: is it okay if I keep contributing to this PR while it’s still under review? It’s almost there, so I’d be happy to address any remaining feedback if that works.

@teor2345
Contributor

You can put links to incomplete PRs in your final application, and you can continue to work on those PRs.

What happens to your PR is up to the maintainers of the tool you're modifying or testing. Sometimes PRs (or parts of PRs) don't get accepted for good reasons, and that's ok. Sometimes there are review delays because reviewers are busy, and that's also ok.

@Brace1000
Author

Hi @teor2345,
I still have an unmerged PR I’ve been working on. Could you please guide me on how I should continue improving it or how best to proceed?
Thank you for your help.

@teor2345
Contributor

The next step is a review by someone with Rust merge rights.

Please be patient; it is normal for reviewers to take 2 weeks or more to review your PR. If it has been more than 2 weeks, you can tag the assigned reviewer, or reassign to another reviewer using the instructions at the top of this PR.

@Brace1000
Author

Got it, thank you for the clarification. I’ll be patient and follow up if needed after some time.
I really appreciate your guidance throughout. It has been an exciting journey with you.

Labels

S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
