Skip to content

[http] sanitise URL options before use them#22186

Open
linev wants to merge 6 commits into
root-project:masterfrom
linev:http_sanity_arg
Open

[http] sanitise URL options before use them#22186
linev wants to merge 6 commits into
root-project:masterfrom
linev:http_sanity_arg

Conversation

@linev
Copy link
Copy Markdown
Member

@linev linev commented May 8, 2026

URL syntax allow to provide different special symbols using %12%20 symbols.
Also some situations un-escaped quotes can make a problems.
Therefore cleanup arguments from URL string adding C escape symbols.

Use it in exe.json and cmd.json - main place where arguments which are coming via URL
are formatted for ProcessLine.

Add sophisticated check in cmd.json processing. One need to check quotes in argument and
in command definition around argument. If none exists - only simple numeric values can be injected
without quotes. Avoid double quotes one both side of argument placeholder.

Provide special flag to enable processing of post data in the exe.json, off by default.

Provide testing for the main requests processing in the sniffer.
This emulates response of THttpServer with same kind of http requests.

@linev linev requested review from dpiparo and jblomer May 8, 2026 09:16
@linev linev self-assigned this May 8, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

Test Results

    22 files      22 suites   3d 10h 19m 55s ⏱️
 3 856 tests  3 856 ✅ 0 💤 0 ❌
76 152 runs  76 152 ✅ 0 💤 0 ❌

Results for commit cb5ec22.

♻️ This comment has been updated with latest results.

@dpiparo dpiparo changed the title [http] sanity URL options before use them [http] sanitise URL options before use them May 8, 2026
Comment thread net/httpsniff/src/TRootSnifferFull.cxx Outdated
Copy link
Copy Markdown
Contributor

@silverweed silverweed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I don't think we can really trust all this code to produce properly "sanitized" input. Sanitizers are notoriously hard to get right and I see many suspicious places in the Sniffer code.

So if these checks are in place as a measure against accidental errors, they may be useful, but I wouldn't consider them a measure against actual attacks.

Regardless, I put a couple of more specific comments.

Also, we definitely need tests for this.

if (!value || !*value)
return "";

TString res = value;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long can value be? Since we're trying to sanitize the input, shouldn't we also put a reasonable max length?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such limit and I cannot make here any assumption.
civetweb cuts URL length by 16K.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The backend is not guaranteed to be civetweb in principle, is it?
We should put an artificial limit here so we don't have to worry about huge allocations/reallocations/copies/iterations and we could in principle even preallocate the output string once and do a single pass over the string to validate it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind such checks are not part of this PR.
They can be done in TUrl class which used to parse all url arguments

Comment thread net/http/src/TRootSniffer.cxx
clean.Append('\\');
clean.Append(c);
} else if (c > 31)
// ignore all special symbols
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think c == 127 is a "special" symbol (DEL)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably I just will use !std::iscntrl(c)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this excludes exactly (does it exclude all ASCII non-printable characters? Does it depend on the locale? The man page makes it look like the case), so I was actually more comfortable with the explicit check.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not like to exclude all non-latin symbols here - as long as they supported on all other places.

Main intention of this PR - prevent creation of string literals which can be mis-interpreted by cling.

Comment thread net/httpsniff/inc/TRootSnifferFull.h Outdated

void CreateMemFile();

TString SanitzeArgument(Bool_t isstr, const char *value, Bool_t onlyquotes = kFALSE);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: Sanitize

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this should probably be static

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While code used only once in ProduceExe - I decide to move it directly into ProduceExe method.

Comment thread net/httpsniff/src/TRootSnifferFull.cxx Outdated

TString TRootSnifferFull::SanitzeArgument(Bool_t isstr, const char *value, Bool_t onlyquotes)
{
Int_t beg = 0, end = strlen(value);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is value guaranteed to have a reasonable max length here? Otherwise strlen is potentially dangerous.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Value created from URL and will not be longer than 16K

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above: we should put our own limit as well

Comment thread net/httpsniff/src/TRootSnifferFull.cxx Outdated
if (!isstr && !onlyquotes) {
// for numeric types make very strict check
if (std::isalnum(c) || std::strchr(".:+-", c))
sanitized.Append(c);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not necessarily produce a number, so I'm not sure what's the goal of this sanitization (what if the string is -4+3.2-245+++++67?)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Goal to avoid symbols sequence which can be interpreted as C/C++ code invocation like gSystem->DoSomething()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're checking for isalnum though: did you mean to check for isdigit? Otherwise I wouldn't rule out "interesting" code invocations going through custom arithmetic operators...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if every non-operator is a digit this will still cause N>=1 expressions to be evaluated, which I guess is safe, but it will still go through the interpreter to be executed before being passed to TMethodCall I think.

Comment thread net/httpsniff/src/TRootSnifferFull.cxx Outdated
if (std::isalnum(c) || std::strchr(".:+-", c))
sanitized.Append(c);
} else if ((c == '\"') || (c == '\\')) {
// escape quotes inside string
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite bizarre if I understand correctly: if an argument is "ab"cdef" we will parse it as the string "ab\"cdef"?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be delivered to C++ as

new TMethodCall(obj_cl, method_name, "ab\"cdef" );

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we rather require all internal quotes to be already escaped in the argument? If the incoming argument is supposed to be a valid string literal it would not be valid if it contains unescaped quotes, which may indicate some other kind of error (on the user side or otherwise)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we rather require all internal quotes to be already escaped in the argument?

On each layer escape character can disappear.
Quotes are very special - because when they goes into C++ interpreter they can break argument list and start new command. By such transformation we want guarantee that in any case interpreter will take provided string as single argument. Actually exactly such exploit is shown in [THttpServer-4] issue.
So attempt here to produce string which not closed in between.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On each layer escape character can disappear.

That sounds like a bug from the previous layers...
Regardless, it would be nice to have some tests on this and comments to make it clear why we're doing this and exactly what form of strings do we accept from the user side

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many differences how quotes escaped in URL and in C++.
Main problem - string is used again in C++ script. So one need to provide/create escape characters.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even so, please add unit tests to verify/document the expected behavior of this code before merging.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add testing of 4 different combinations how quotes can appear in command definition and in command invocation.

@linev linev force-pushed the http_sanity_arg branch from 3668ee2 to 83b7d75 Compare May 8, 2026 14:37
linev added 2 commits May 8, 2026 16:43
Remove any special symbols
Add escape characters for quote and escape itself

Try to avoid manipulation of arguments for method execution
Avoid special characters as draw arguments
@linev linev force-pushed the http_sanity_arg branch from 83b7d75 to 2b46cab Compare May 8, 2026 14:44
linev added 2 commits May 18, 2026 13:11
Always use DecodeUrlOptionValue method when processing URL arguments
or URL string. Internally method provides escape symbols for quotes and backslash.
If expecting numeric value - remove all symbols
keeping alphanumeric, '.', '+', '-' and ':'
It allows to deserialize post data as ROOT object when processing exe.json request.
While this can leads to arbitrary code loading and injection, disable this feature by default.
Can be enabled back with:
 ```
serv->SetAllowPostObject(kTRUE);
```
While here arbitrary string injected into ProcessLine,
ensure that only numeric argument is not quoted.
All other arguments kinds will be quoted and prevent execution of potentially dangerous code
@linev linev requested a review from bellenot as a code owner May 18, 2026 15:55
@linev linev requested a review from silverweed May 18, 2026 15:59
@linev linev force-pushed the http_sanity_arg branch from babddba to cb5ec22 Compare May 19, 2026 06:44
Verify execution of several supported requests
which can be handled by http server. Testing:
   - root.json
   - root.xml
   - exe.json
   - exe.json with POST data
   - item.json
   - cmd.json
   - multi.json
@linev linev force-pushed the http_sanity_arg branch from cb5ec22 to 69ba55a Compare May 19, 2026 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants