fix(telemetry): improve vm/container detection by hsanjuan · Pull Request #10944 · ipfs/kubo

hsanjuan · 2025-08-28T10:52:28Z

Current VM detection is not very accurate, and systemd-detect-virt does exactly what's needed across a myriad of virtualization platforms.

The downside is that we are running a system command, which is uglier and might trip antivirus software or similar.

gammazero

Since the command output is not being captured, you probably want to use the --quiet flag.

plugin/plugins/telemetry/telemetry.go

Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>

lidel

Hm.. I think we need a different approach, this is too hairy:

- PATH injection risk - an attacker could place a malicious systemd-detect-virt in PATH -- not saying this is real, but all systems / audits / hardening tools will scream, making it harder for engineers to use kubo in their infra.
- No timeout - the command could hang indefinitely
- Unnecessary repeated execution - virtualization status won't change during runtime, so we should only check it once and cache the result
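For illustration only, the first two concerns could in principle be reduced while still shelling out. The following is a minimal sketch, not code from this PR: `detectVirtViaCommand` is a hypothetical helper, and the `/usr/bin/systemd-detect-virt` location and the two-second timeout are assumptions. Caching is left out here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// detectVirtViaCommand is a hypothetical helper, not code from this PR.
// binPath should be absolute: os/exec only searches PATH when the name
// contains no path separator, so an absolute path avoids the PATH lookup.
func detectVirtViaCommand(binPath string, timeout time.Duration) (bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// --quiet suppresses output; the answer is carried by the exit status.
	err := exec.CommandContext(ctx, binPath, "--quiet").Run()
	if ctx.Err() != nil {
		// The deadline expired and the process was killed.
		return false, ctx.Err()
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// Non-zero exit status: no virtualization detected.
		return false, nil
	}
	if err != nil {
		// The binary is missing or could not be started.
		return false, err
	}
	return true, nil
}

func main() {
	// The binary location and timeout are assumptions for illustration.
	virt, err := detectVirtViaCommand("/usr/bin/systemd-detect-virt", 2*time.Second)
	fmt.Println(virt, err)
}
```

Even with these mitigations the process still executes an external binary, which is the part that hardening tools tend to flag.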

Give me a sec, I have a stashed branch with go-only checks, need to refresh my memory, maybe we can avoid calling external binary.

replace systemd-detect-virt with file-based detection to avoid:
- security risks from executing external binaries
- unnecessary repeated detection (now cached with sync.Once)
- missing detection on non-systemd systems

removes false positives:
- cpu hypervisor flag (indicates capability, not guest status)
- generic dmi strings that match physical hardware
- overlay filesystem check (used by immutable distros)
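The general shape of a file-based check cached with sync.Once might look like the sketch below. It is not the code from the commit: the marker list, `hasAnyMarker`, `fileExists`, and `inContainer` are hypothetical names chosen for illustration, and the two marker files are only commonly seen examples.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// Marker files commonly created by container runtimes. This list is an
// assumption for illustration, not the PR's actual set of checks.
var containerMarkers = []string{
	"/.dockerenv",        // created by Docker
	"/run/.containerenv", // created by Podman
}

// hasAnyMarker reports whether exists returns true for any of the paths.
// Injecting exists keeps the check testable without touching the filesystem.
func hasAnyMarker(paths []string, exists func(string) bool) bool {
	for _, p := range paths {
		if exists(p) {
			return true
		}
	}
	return false
}

// fileExists reports whether path can be stat'ed.
func fileExists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

var (
	containerOnce   sync.Once
	containerResult bool
)

// inContainer runs the file checks on first use and returns the cached
// answer on every later call.
func inContainer() bool {
	containerOnce.Do(func() {
		containerResult = hasAnyMarker(containerMarkers, fileExists)
	})
	return containerResult
}

func main() {
	fmt.Println(inContainer())
}
```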

aschmahmann · 2025-08-29T13:58:57Z

Give me a sec, I have a stashed branch with go-only checks, need to refresh my memory, maybe we can avoid calling external binary.

@lidel in case it's of use I see that this library exists github.com/ShellCode33/VM-Detection. It doesn't seem maintained (from both the last commit and the issues), so you might be better off just writing / porting the basic checks from C, but wanted to flag in case it's helpful.

lidel

@aschmahmann thx, yes, my changes from d77f130 are partially based on that prior art. Unfortunately that library is a bit too naive and produces false positives. We skipped some... creative 🙃 things to avoid false positives, for example:

- Microsoft Surface laptops being interpreted as VMs ;-)
- CPUID VM flag - detects CPU capability to run VMs, not whether we're in one (causes false positives on modern hardware)
- MAC address prefixes - can be spoofed or match USB adapters from those vendors
- Low resource detection (<3 CPUs, <3GB RAM) - would incorrectly flag cheap RPi/old/embedded hardware as VMs
- Reading ALL files in /sys/class/dmi/id/ - too broad, could match vendor names in unexpected places etc

I think the current approach in this PR is the right balance, focusing on zero false positives over maximum detection coverage (bad telemetry data is worse than missing data).
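To make the "too broad" point concrete, here is a hedged sketch of a narrow DMI check that reads two named fields and requires an exact vendor/product pair. It is not the PR's implementation: `dmiRule`, `vmRules`, `looksLikeVM`, and `readDMI` are hypothetical, and the rule values are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// dmiRule matches one vendor/product pair exactly. The values below are
// illustrative assumptions, not the list used in this PR.
type dmiRule struct{ vendor, product string }

var vmRules = []dmiRule{
	{"innotek GmbH", "VirtualBox"},
	{"Microsoft Corporation", "Virtual Machine"}, // assumed Hyper-V guest values
}

// looksLikeVM requires both fields to match a rule, so a vendor name
// alone (for example on a Surface laptop) is never enough.
func looksLikeVM(vendor, product string) bool {
	for _, r := range vmRules {
		if vendor == r.vendor && product == r.product {
			return true
		}
	}
	return false
}

// readDMI reads a single named DMI field instead of scanning the whole
// /sys/class/dmi/id/ directory. A missing file yields an empty string.
func readDMI(field string) string {
	b, err := os.ReadFile("/sys/class/dmi/id/" + field)
	if err != nil {
		return ""
	}
	return strings.TrimSpace(string(b))
}

func main() {
	fmt.Println(looksLikeVM(readDMI("sys_vendor"), readDMI("product_name")))
}
```

Exact pairs trade coverage for precision: an unknown hypervisor is reported as "not a VM", which matches the stated preference for missing data over bad data.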

@hsanjuan @gammazero mind taking a look? (ok to merge from my end)

aschmahmann · 2025-08-29T14:28:30Z

plugin/plugins/telemetry/telemetry.go

-	if err == nil {
-		for line := range strings.Lines(string(content)) {
-			if strings.Contains(line, "overlay") && strings.Contains(line, "/var/lib/containers/storage/overlay") {
+	// WSL is technically a container-like environment


True, but does this help us to bundle with other virtualization tools? Maybe yes from the perspective of "networking is harder / easier to mess up", but not really if we're using as a proxy for say users who tend to control their environments programmatically.

Not strictly advocating one way or the other here, but would be helpful to know (and document in case we need to revisit down the line) what the idea is.

I also don't have a strong feeling here, but I think if something fails in WSL it would be because of its container-like nature.

I'm thinking this way: if we see networking issues in WSL, they should correlate with the "Windows+container" cohort, rather than regular Windows users who run IPFS Desktop, no?

Yeah probably. Although IIUC the comparison is not to windows users but to traditional Linux users using CLI or desktop, since the only information being tracked is that it's Linux and a container (vs just Linux)

This is fine as it is, but we may want to consider adding some string to our telemetry data to indicate just what was detected: "linux+wsl"

hsanjuan

LGTM

gammazero

This is a much safer approach than exec.Command.

See question about caching the answer within the check, as is done in this PR. Or whether telemetry should have knowledge about which checks to only do once.

plugin/plugins/telemetry/telemetry.go

gammazero · 2025-08-29T21:21:56Z

plugin/plugins/telemetry/telemetry.go

-	if err == nil {
-		for line := range strings.Lines(string(content)) {
-			if strings.Contains(line, "overlay") && strings.Contains(line, "/var/lib/containers/storage/overlay") {
+	// WSL is technically a container-like environment


This is fine as it is, but we may want to consider adding some string to our telemetry data to indicate just what was detected: "linux+wsl"
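One possible shape for such a label is sketched below. Only the "linux+wsl" string comes from the comment above; `platformLabel`, the other two labels, and the `/proc/version` substring heuristic are assumptions for illustration and are not part of this PR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// platformLabel is a hypothetical helper. "linux+wsl" is the string
// suggested in review; "linux+container" and "linux" are made up here.
// The WSL test is a heuristic: WSL kernels typically mention "microsoft"
// in /proc/version.
func platformLabel(procVersion string, inContainer bool) string {
	switch {
	case strings.Contains(strings.ToLower(procVersion), "microsoft"):
		return "linux+wsl"
	case inContainer:
		return "linux+container"
	default:
		return "linux"
	}
}

func main() {
	// On systems without /proc/version the read fails and b stays empty.
	b, _ := os.ReadFile("/proc/version")
	fmt.Println(platformLabel(string(b), false))
}
```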

gammazero · 2025-08-29T21:35:43Z

plugin/plugins/telemetry/telemetry.go


 func isRunningInContainer() bool {
-	// Check for Docker container
+	containerDetectionOnce.Do(func() {


Do we want to cache the result here (it does not hurt to do so), or do we want to cache at a higher level, remembering that isRunningInContainer and isRunningInVM have already been called?

Previously, I had assumed that container/VM checks would be cached at a higher level, but caching the result within the check here suggests that assumption was wrong. It seems like telemetry should know which questions it only needs to ask once.

I guess this is an extra precaution for now. We will likely refine metrics over time and have a clearer split between dynamic and one-time ones, but that's out of scope here (merging as is).
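For reference, the "cache at a higher level" alternative could be sketched as follows, with the collector deciding which probes are one-time and which are dynamic. Everything here is hypothetical (`collector`, `report`, `newCollector`, `snapshot`, and the peer-count metric); the only real API relied on is sync.OnceValue from the standard library (Go 1.21+).

```go
package main

import (
	"fmt"
	"sync"
)

// report is a hypothetical telemetry snapshot.
type report struct {
	InContainer bool
	Peers       int
}

// collector is a hypothetical type that owns the one-time/dynamic split,
// so the probes themselves do not need to cache anything.
type collector struct {
	inContainer func() bool // one-time: wrapped with sync.OnceValue
	peerCount   func() int  // dynamic: evaluated on every snapshot
}

func newCollector(detectContainer func() bool, peerCount func() int) *collector {
	return &collector{
		inContainer: sync.OnceValue(detectContainer),
		peerCount:   peerCount,
	}
}

// snapshot evaluates dynamic metrics and reuses cached one-time answers.
func (c *collector) snapshot() report {
	return report{InContainer: c.inContainer(), Peers: c.peerCount()}
}

func main() {
	c := newCollector(
		func() bool { fmt.Println("probing container status"); return false },
		func() int { return 42 },
	)
	c.snapshot()
	c.snapshot() // the probe message is printed only once
}
```

Caching inside the check (as in this PR) is safe regardless of how callers behave; caching in the collector keeps the probes pure and makes the split explicit.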

hsanjuan self-assigned this Aug 28, 2025

hsanjuan requested a review from a team as a code owner August 28, 2025 10:52

hsanjuan added the skip/changelog This change does NOT require a changelog entry label Aug 28, 2025

gammazero approved these changes Aug 29, 2025

hsanjuan and others added 2 commits August 29, 2025 08:45

Update plugin/plugins/telemetry/telemetry.go

ad9326a

Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>

Update plugin/plugins/telemetry/telemetry.go

12ad47e

Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>

lidel assigned lidel and unassigned hsanjuan Aug 29, 2025

lidel requested changes Aug 29, 2025

lidel mentioned this pull request Aug 29, 2025

Release 0.38 #10884

Closed

49 tasks

lidel approved these changes Aug 29, 2025

lidel changed the title ~~telemetry: use systemd-detect-virt for container/vm detection~~ fix(telemetry): improve vm/container detection Aug 29, 2025

aschmahmann reviewed Aug 29, 2025

hsanjuan commented Aug 29, 2025

gammazero approved these changes Aug 29, 2025

refactor(telemetry): use strings.Lines for efficient line iteration

77c33ce

More memory-efficient as it processes one line at a time instead of creating a slice of all lines upfront. Ref: #10944 (comment) Co-authored-by: Andrew Gillis <11790789+gammazero@users.noreply.github.com>

lidel merged commit 049256c into master Sep 8, 2025
16 checks passed

lidel deleted the telemetry/improve-vm-detection branch September 8, 2025 18:38

BrewTestBot mentioned this pull request Oct 8, 2025

ipfs 0.38.1 Homebrew/homebrew-core#247492

Merged
