Skip to content

ROX-33198: Instrument inode tracking on file open lsm hook#391

Open
JoukoVirtanen wants to merge 19 commits intomainfrom
jv-ROX-33198-instrument-inode-tracking-on-file_open-lsm-hook
Open

ROX-33198: Instrument inode tracking on file open lsm hook#391
JoukoVirtanen wants to merge 19 commits intomainfrom
jv-ROX-33198-instrument-inode-tracking-on-file_open-lsm-hook

Conversation

@JoukoVirtanen
Copy link
Contributor

@JoukoVirtanen JoukoVirtanen commented Mar 15, 2026

Description

Host file paths need to be reported correctly. When a file is created in a tracked directory its inode should be added to a hash set in kernel space. In user space an entry needs to be added into a map with the inode as the key and file path as the value.

Checklist

  • Investigated and inspected CI test results
  • Updated documentation accordingly

Automated testing

  • Added unit tests
  • Added integration tests
  • Added regression tests

If any of these don't apply, please comment below.

Integration tests were modified.

Testing Performed

cat .cargo/config.toml
[build]
rustflags = ["-C", "force-frame-pointers=yes"]

[target.'cfg(all())']
runner = "sudo -E"

The following command was run in the main branch and this branch.

FACT_LOGLEVEL=debug cargo run --bin fact -- -p '/etc/sensitive-files/**/*' -p /etc/sensitive-files --expose-metrics

The following commands were also run

etc$ sudo mkdir sensitive-files
/etc/sensitive-files$ sudo touch hello_world.txt
/etc/sensitive-files$ sudo vi hello_world.txt

One of the tests was run with bye_world.txt instead.

In the main branch

{
  "timestamp": 1773981129521946307,
  "hostname": "jvirtane-thinkpadp1gen3.rmtuswa.csb",
  "process": {
    "comm": "vim",
    "args": [
      "/usr/bin/vim",
      "bye_world.txt"
    ],
    "exe_path": "/usr/bin/vim",
    "container_id": null,
    "uid": 0,
    "username": "root",
    "gid": 0,
    "login_uid": 4203274,
    "pid": 1252416,
    "in_root_mount_ns": true,
    "lineage": [
      {
        "uid": 4203274,
        "exe_path": "/usr/bin/sudo"
      },
      {
        "uid": 4203274,
        "exe_path": "/usr/bin/sudo"
      }
    ]
  },
  "file": {
    "Chmod": {
      "inner": {
        "filename": "/etc/sensitive-files/bye_world.txt",
        "host_file": "",
        "inode": {
          "inode": 0,
          "dev": 0
        }
      },
      "new_mode": 33188,
      "old_mode": 33188
    }
  }
}

Note that the host_file is blank.

In this branch

{
  "timestamp": 1773980922622536024,
  "hostname": "jvirtane-thinkpadp1gen3.rmtuswa.csb",
  "process": {
    "comm": "vim",
    "args": [
      "/usr/bin/vim",
      "hello_world.txt"
    ],
    "exe_path": "/usr/bin/vim",
    "container_id": null,
    "uid": 0,
    "username": "root",
    "gid": 0,
    "login_uid": 4203274,
    "pid": 1251608,
    "in_root_mount_ns": true,
    "lineage": [
      {
        "uid": 4203274,
        "exe_path": "/usr/bin/sudo"
      },
      {
        "uid": 4203274,
        "exe_path": "/usr/bin/sudo"
      }
    ]
  },
  "file": {
    "Chmod": {
      "inner": {
        "filename": "/etc/sensitive-files/hello_world.txt",
        "host_file": "/etc/sensitive-files/hello_world.txt",
        "inode": {
          "inode": 74960529,
          "dev": 37
        },
        "parent_inode": {
          "inode": 0,
          "dev": 0
        }
      },
      "new_mode": 33188,
      "old_mode": 33188
    }
  }
}

Note that host_file is now populated.

@JoukoVirtanen JoukoVirtanen requested a review from Molter73 March 15, 2026 18:14
"""
cwd = os.getcwd()
config = {
'paths': [f'{monitored_dir}/**', '/mounted/**', '/container-dir/**'],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what globbing to use here.

}

unsigned long magic = inode->i_sb->s_magic;
unsigned long magic = BPF_CORE_READ(inode, i_sb, s_magic);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inode_to_key needs to be able to use untrusted pointers, so that it can get the key for a parent inode. To get it to be able to use untrusted pointers, BPF_CORE_READ is used.

@JoukoVirtanen JoukoVirtanen requested a review from Molter73 March 20, 2026 05:22
@JoukoVirtanen JoukoVirtanen marked this pull request as ready for review March 20, 2026 05:23
@JoukoVirtanen JoukoVirtanen requested review from a team and rhacs-bot as code owners March 20, 2026 05:23
fut = '/mounted/test.txt'
fut_host = f'{ignored_dir}/test.txt'
fut_backup = f'{fut}~'
fut_backup_host = f'{ignored_dir}/test.txt~'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fut_host and fut_backup_host should probably be blank.

def test_sed(vi_container, server, ignored_dir):
# File Under Test
fut = '/mounted/test.txt'
fut_host = f'{ignored_dir}/test.txt'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Copy link
Contributor

@Molter73 Molter73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a lot of style changes, is this due to you using a newer Rust version? Can you check what version you are using? Is it 1.92+? If this is due to a version difference and yours is newer, can you open a new PR with those style changes? that way we can focus this PR on the actual inode tracking.

@Molter73
Copy link
Contributor

There seems to be a lot of style changes, is this due to you using a newer Rust version? Can you check what version you are using? Is it 1.92+? If this is due to a version difference and yours is newer, can you open a new PR with those style changes? that way we can focus this PR on the actual inode tracking.

Actually, on a second read-through, it might also be related to the bump to version 2024, if so, please move the bump to a separate PR and we can stack this one on top of that one, sorry for the inconvenience!

Copy link
Contributor

@Molter73 Molter73 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to go through the integration test changes, but I see them failing in the PR, do you need help debugging the failures?

self.update_entry(path.as_path()).with_context(|| {
format!("Failed to update entry for {}", path.display())
})?;
} else if path.is_dir() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly related to a conversation I've been having with @Stringy. IMO, in the case of finding a directory that matches a glob pattern, we should recursively scan into it, the reason being that tracking the directory itself isn't of much value if the pattern doesn't hit any files inside the directory.

For now we can leave it like this, but I think we need to take a close look at what users might expect when configuring different types of glob patterns (i.e do they want that strict set of patterns to be monitored, or are they expressing a set of paths that work as the base to be monitored?)

/// We use the parent inode provided by the eBPF code
/// to look up the parent directory's host path, then construct the full
/// path by appending the new file's name.
fn handle_creation_event(&self, event: &Event) -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping this is still WIP and the debug statements are for testing, we really don't need a debug statement on each return condition, particularly not in the hot path where monitoring sets of files mildly noisy will just flood the logs and make it hard to see what is actually going on.

Please cut down on the debug statements, if something is expected to happen, don't put a statement there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried removing all the debug statements in this method and managed to boil it down from 50+ lines to just this:

        let inode = event.get_inode();
        let parent_inode = event.get_parent_inode();
        if self.get_host_path(Some(inode)).is_some() || parent_inode.empty() {
            return Ok(());
        }

        if let Some(filename) = event.get_filename().file_name()
            && let Some(parent_host_path) = self.get_host_path(Some(parent_inode))
        {
            let host_path = parent_host_path.join(filename);
            self.update_entry_with_inode(inode, host_path)
                .with_context(|| {
                    format!(
                        "Failed to add creation event entry for {}",
                        filename.display()
                    )
                })?;
        }

        Ok(())

Given this is a lot more compact and easy to ready and the additional verbosity in the logs the implementation in the PR doesn't bring much of a benefit IMO, I would propose we try to move the implementation closer to this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made this change, except I was not able to combine the if statements because the edition is still 2021.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR bumping the edition is already in the main branch, you can rebase and get this working.

@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-33198-instrument-inode-tracking-on-file_open-lsm-hook branch from 8fc3c95 to 4f1686a Compare March 20, 2026 17:31
@JoukoVirtanen JoukoVirtanen requested a review from Molter73 March 23, 2026 03:45
@JoukoVirtanen
Copy link
Contributor Author

I still need to go through the integration test changes, but I see them failing in the PR, do you need help debugging the failures?

The CI tests are now passing. I think there might be a flake involving unlinking, but that might need to be fixed in the future PR for the file_unlink LSM hook.

@JoukoVirtanen JoukoVirtanen force-pushed the jv-ROX-33198-instrument-inode-tracking-on-file_open-lsm-hook branch from cd4df3f to 2b139ff Compare March 23, 2026 14:46
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we unify the logic for checking if an inode is monitored in inode_is_monitored? Just have that method take both the inode and parent_inode, then return one of MONITORED, NOT_MONITORED or PARENT_MONITRED values? I think that might be a bit cleaner and we don't need a dedicated _with_parent function.

}

/// Similar to update_entry except we are are directly using the inode instead of the path.
fn update_entry_with_inode(&self, inode: &inode_key_t, path: PathBuf) -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to take inode by reference here, if anything, it is causing the call from update_entry to create the inode and then copy it inside here, meaning we copy the data twice. On the other hand, the other usage in handle_creation_event gets a reference to an inode_key_t because it gets if from event.get_inode(), if anything, handle_creation_event should create the copy explicitly instead.

Comment on lines +108 to +113
f'{monitored_dir}',
f'{monitored_dir}/**/*',
'/mounted',
'/mounted/**/*',
'/container-dir',
'/container-dir/**/*',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we need these many monitored paths? Is the one with wildcards not hitting the parent directory?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants