Skip to content

[agent] Add udev DB reader and FillFromCrawler for device map#239

Open
fastrapier wants to merge 5 commits intofeat/udev-properties-devicemapfrom
feat/udev-crawler-fill
Open

[agent] Add udev DB reader and FillFromCrawler for device map#239
fastrapier wants to merge 5 commits intofeat/udev-properties-devicemapfrom
feat/udev-crawler-fill

Conversation

@fastrapier
Copy link
Copy Markdown
Member

Summary

PR 2 of 3 in the effort to replace lsblk exec-based block device discovery.

Adds initial device population support to DeviceMap:

  • ReadUdevDB — reads /run/udev/data/b<major>:<minor> and extracts
    E: properties. Crawler events from sysfs only carry basic kernel uevent
    fields (MAJOR, MINOR, DEVNAME, DEVTYPE); this function provides the
    udev-enriched fields (serial, model, wwn).
  • EnrichWithUdevDB — merges udev DB properties into a raw event env
    map. Event keys take priority over DB keys. On any failure returns the
    original env with an explanatory error.
  • FillFromCrawler — atomically replaces the entire DeviceMap with
    devices from crawler.Device (go-udev), enriching each with the udev
    database. Unparseable devices are skipped; all errors are returned to
    the caller.

How this will be used

PR 3 will call FillFromCrawler during scanner startup (after
go-udev/crawler.ExistingDevices) to populate the DeviceMap before
the netlink monitor starts receiving events.

What is NOT in this PR

  • No crawler invocation (only accepts []crawler.Device)
  • No sysfs I/O or BuildDevice / Snapshot
  • No scanner integration

Test plan

  • Unit tests covering ReadUdevDB, EnrichWithUdevDB (merge priority,
    error cases, immutability), FillFromCrawler (replace semantics, skip
    bad devices, nil input, udev DB enrichment)

Copy link
Copy Markdown

@szhem szhem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blockers: none. Two High-severity findings that should be addressed before merge; the rest can be follow-ups.

Summary

Severity Count
Critical 0
High 2
Medium 4
Low 5
Nit 1

Positive Observations

  • Clean mutex discipline: file I/O in FillFromCrawler happens outside the lock; map swap is atomic under mu.Lock().
  • EnrichWithUdevDB does not mutate the input env — explicitly covered by TestEnrichWithUdevDB_DoesNotMutateOriginal.
  • All() returns a shallow copy — aliasing race prevented and validated by TestAll_ReturnsCopy.
  • Sentinel ErrUnknownAction follows the Go guideline; errors.Is usage covered in tests.
  • Field-resolution priorities (SCSI_IDENT_SERIAL → ID_SERIAL, ID_WWN_WITH_EXTENSION → ID_WWN, ID_MODEL_ENC → ID_MODEL) mirror lsblk exactly — good behavioural compatibility with the existing tooling.
  • unhexmangle + normalizeWhitespace covered by honest tests, including malformed sequences (\xZZ, \x4).
  • Every branch of HandleEvent is exercised (add/change/remove/bind/unbind/move/online/offline/unknown + bad MAJOR/MINOR + missing fields) — exemplary coverage.
  • TestDeviceMap_ConcurrentAccess now collects goroutine errors instead of swallowing them (addresses prior review).

PR-level notes

  • Tests (Nit): TestFillFromCrawler_ReplacesAll checks only Len() and absence of the old key — consider asserting that the new entries carry the expected DevName/Model so regressions in the fill path surface faster.
  • Tests (Nit): TestHandleEvent_RemoveNonexistent_NoError codifies silent remove-of-unknown. A one-line comment explaining this is intentional (kernel may emit remove for devices the agent never saw) would help future readers.

Comment thread images/agent/internal/udev/devicemap.go Outdated
Comment on lines +29 to +31
// DeviceMap is a thread-safe in-memory store of block device properties
// keyed by "major:minor". It is populated by netlink events (HandleEvent)
// and provides a consistent snapshot via All().
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: Medium — Outdated godoc

After this PR the map is also populated by FillFromCrawler, but the doc still says it is only populated by netlink events. Readers of godoc will miss the bootstrap path.

Recommendation:

// DeviceMap is a thread-safe in-memory store of block device properties
// keyed by "major:minor". It is populated by FillFromCrawler at startup
// and by HandleEvent from netlink events thereafter. A consistent snapshot
// is available via All().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. Godoc now mentions both FillFromCrawler and HandleEvent.

Comment on lines +43 to +45
// HandleEvent processes a single netlink uevent. The action string comes
// directly from the kernel (add, change, remove, bind, unbind, move, online,
// offline). The env map is the raw udev environment from the event.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: LowHandleEvent godoc does not mention ErrUnknownAction

Returning the sentinel is part of the contract — tests rely on errors.Is(err, ErrUnknownAction), but callers reading the godoc will not know they can differentiate this failure mode.

Recommendation: add one line:

// Returns ErrUnknownAction (wrapped) when the action is not one of the
// listed values; callers can check via errors.Is.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. Added ErrUnknownAction to HandleEvent godoc.

Comment thread images/agent/internal/udev/devicemap.go Outdated
// properties (MAJOR, MINOR, DEVNAME, DEVTYPE) without udev-enriched fields
// (serial, model, wwn). Devices that fail to parse are skipped and their
// errors are collected in the returned slice.
func (dm *DeviceMap) FillFromCrawler(devices []crawler.Device, udevDBPath string) []error {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: MediumudevDBPath as a method parameter, not a DeviceMap field

The path to the udev DB is an environment invariant, not a per-call detail. Every caller now has to know and thread DefaultRunUdevDataPath through. If a future change wants to enrich in HandleEvent as well, the signature has to change again.

Recommendation:

type DeviceMap struct {
    mu         sync.RWMutex
    devices    map[string]Properties
    udevDBPath string
}

func NewDeviceMap(udevDBPath string) *DeviceMap {
    return &DeviceMap{
        devices:    make(map[string]Properties),
        udevDBPath: udevDBPath,
    }
}

func (dm *DeviceMap) FillFromCrawler(devices []crawler.Device) []error { ... }

Bonus: easier to mock in the PR 3 scanner tests — pass t.TempDir() into the constructor, no extra argument plumbing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. udevDBPath is now a DeviceMap struct field, passed via NewDeviceMap(udevDBPath).

Comment thread images/agent/internal/udev/devicemap.go Outdated
// properties (MAJOR, MINOR, DEVNAME, DEVTYPE) without udev-enriched fields
// (serial, model, wwn). Devices that fail to parse are skipped and their
// errors are collected in the returned slice.
func (dm *DeviceMap) FillFromCrawler(devices []crawler.Device, udevDBPath string) []error {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: MediumFillFromCrawler does not accept a context.Context

The method performs synchronous file I/O on N devices with no timeout and no shutdown path. In PR 3 this runs on agent startup; if /run/udev/data/ is on slow storage or the pod is being terminated mid-start, the agent will hang.

Recommendation: prepare the signature now so PR 3 can wire a real context:

func (dm *DeviceMap) FillFromCrawler(ctx context.Context, devices []crawler.Device) []error {
    for _, dev := range devices {
        if err := ctx.Err(); err != nil {
            return append(errs, err)
        }
        ...
    }
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. FillFromCrawler now accepts context.Context and checks ctx.Err() on each iteration.

Comment thread images/agent/internal/udev/devicemap.go Outdated
// errors are collected in the returned slice.
func (dm *DeviceMap) FillFromCrawler(devices []crawler.Device, udevDBPath string) []error {
newDevices := make(map[string]Properties, len(devices))
var errs []error
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: Nit — Preallocate errs

Upper bound is known (at most 2*len(devices) entries — enrichment + parse). A pre-sized slice avoids a couple of growslice calls.

Recommendation:

errs := make([]error, 0, len(devices))

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. errs is now preallocated: make([]error, 0, len(devices)).

Comment thread images/agent/internal/udev/devicemap.go Outdated
defer dm.mu.Unlock()
dm.devices = newDevices

return errs
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: Low — Consider returning error (via errors.Join) instead of []error

Go 1.20+ errors.Join is the idiomatic way to aggregate multiple errors. The caller gets a single error that still supports errors.Is/errors.As, logs as one line, and integrates with existing error-propagation code. Today every caller will have to loop through []error and glue the strings together.

Recommendation:

func (dm *DeviceMap) FillFromCrawler(...) error {
    ...
    return errors.Join(errs...)
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. FillFromCrawler now returns error via errors.Join(errs...).

Comment thread images/agent/internal/udev/udevdb.go Outdated
// ReadUdevDB reads /run/udev/data/b<major>:<minor> and returns the E: properties
// as a key-value map. Non-E: lines (N:, S:, I:, etc.) are ignored.
func ReadUdevDB(basePath string, major, minor int) (map[string]string, error) {
path := fmt.Sprintf("%s/b%d:%d", basePath, major, minor)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: Medium — Use filepath.Join instead of fmt.Sprintf for path construction

Current form does not handle a trailing slash in basePath cleanly (/run/udev/data//run/udev/data//b8:0). It works, but the idiomatic Go equivalent is safer and self-documents intent.

Recommendation:

path := filepath.Join(basePath, fmt.Sprintf("b%d:%d", major, minor))

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. Now uses filepath.Join(basePath, fmt.Sprintf("b%d:%d", major, minor)).

Comment thread images/agent/internal/udev/udevdb.go Outdated
}
props := make(map[string]string)
for _, line := range strings.Split(string(data), "\n") {
after, ok := strings.CutPrefix(strings.TrimSpace(line), "E:")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: LowTrimSpace of the whole line can introduce a leading space in the key

The udev DB format is strict (E:KEY=VALUE, no whitespace), so today this does not matter. But if the format ever grows a tolerant variant like E: FOO=BAR, strings.CutPrefix(strings.TrimSpace(line), "E:") yields " FOO=BAR", and the key stored in the map becomes " FOO" with a leading space.

Recommendation: keep the prefix check literal and trim key/value separately if needed:

after, ok := strings.CutPrefix(line, "E:")
if !ok {
    continue
}
if idx := strings.IndexByte(after, '='); idx > 0 {
    props[after[:idx]] = strings.TrimRight(after[idx+1:], "\r\n")
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. Removed TrimSpace, now CutPrefix(line, "E:") directly. Value trimmed with TrimRight(after[idx+1:], "\r\n"). Also switched to IndexByte.

// serialFromUdevEnv follows the lsblk get_properties_by_udev() priority chain:
// SCSI_IDENT_SERIAL -> ID_SCSI_SERIAL -> ID_SERIAL_SHORT -> ID_SERIAL.
func serialFromUdevEnv(env map[string]string) string {
for _, key := range []string{"SCSI_IDENT_SERIAL", "ID_SCSI_SERIAL", "ID_SERIAL_SHORT", "ID_SERIAL"} {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: Low — Priority list allocated on every call

[]string{"SCSI_IDENT_SERIAL", ...} is a new literal every time serialFromUdevEnv runs. For hot reconciliation loops this is avoidable GC pressure.

Recommendation: hoist to a package-level var:

var serialKeyPriority = []string{
    "SCSI_IDENT_SERIAL",
    "ID_SCSI_SERIAL",
    "ID_SERIAL_SHORT",
    "ID_SERIAL",
}

func serialFromUdevEnv(env map[string]string) string {
    for _, key := range serialKeyPriority {
        ...
    }
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in cd4f80d. Hoisted to package-level var serialKeyPriority.

return b.String()
}

func ensureDevPrefix(devName string) string {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity: LowensureDevPrefix has no defence against path traversal in DEVNAME

DEVNAME="../etc/passwd" becomes /dev/../etc/passwd. Today this value is only used as an identifier, so the risk is latent. But if a follow-up PR starts doing os.Open(props.DevName) or otherwise dereferences the path (for ioctl, size probing, etc.), this escapes /dev/ silently. The source of DEVNAME is trusted (kernel → udev), but defence in depth matters here.

Recommendation:

func ensureDevPrefix(devName string) string {
    if devName == "" || strings.HasPrefix(devName, "/dev/") {
        return devName
    }
    if strings.ContainsAny(devName, "/\x00") {
        return "" // or return an error from ParseProperties
    }
    return "/dev/" + devName
}

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving as-is. DEVNAME comes from the kernel (kobject uevent → dev->disk_name), not from user input. The value is always a flat device name like sda, nvme0n1p1, dm-0 — no path components. We don't do any file I/O on props.DevName; it's used purely as an identifier string. Adding validation for a kernel-sourced field that can never contain traversal sequences would be dead code.

@fastrapier fastrapier force-pushed the feat/udev-crawler-fill branch from 59f0aa7 to d51649e Compare April 20, 2026 12:42
…tion sentinel

Address code review feedback:
- Remove stuttering: UdevProperties → Properties, ParseUdevProperties → ParseProperties
- Add ErrUnknownAction sentinel error for programmatic error handling
- Fix concurrency test to collect and verify errors instead of discarding them

Signed-off-by: Daniil Studenikin <daniil.studenikin@flant.ru>
- Add doc-comments on all exported symbols (ErrUnknownAction,
  NewDeviceMap, Properties, ParseProperties)
- Document HandleEvent add-like/remove-like policy in docstring
- Move mutex acquisition after action validation to avoid locking
  on unknown actions
- Add "handle event:" prefix to unknown action error for consistency
- Switch MAJOR/MINOR parsing to strconv.ParseUint to reject negatives
- Soften normalizeWhitespace comment (approximating, not matching)
- Simplify unhexmangle via strconv.ParseUint, remove hexVal helper
- Add concurrent read/write and same-key contention tests

Signed-off-by: Daniil Studenikin <daniil.studenikin@flant.ru>
Add udevdb.go with ReadUdevDB and EnrichWithUdevDB for reading
/run/udev/data and merging udev-enriched properties into crawler
events (which only carry basic kernel uevent fields).

Add FillFromCrawler to DeviceMap: atomically replaces the device map
with sysfs crawler results, enriching each device from the udev
database. Unparseable devices are skipped with errors returned to
the caller.

Signed-off-by: Daniil Studenikin <daniil.studenikin@flant.ru>
- Skip E: lines with empty key (idx > 0 instead of idx >= 0)
- Add test for value containing '=' sign
- Add test for invalid MINOR in EnrichWithUdevDB

Signed-off-by: Daniil Studenikin <daniil.studenikin@flant.ru>
- Update DeviceMap/HandleEvent godoc (FillFromCrawler, ErrUnknownAction)
- Move udevDBPath from method param to DeviceMap struct field
- Add context.Context to FillFromCrawler for cancellation support
- Preallocate errs slice
- Ignore ErrNotExist from udev DB enrichment (normal at early boot)
- Reject nil devices slice instead of silently wiping the map
- Return errors.Join instead of []error
- Use filepath.Join for path construction in ReadUdevDB
- Fix TrimSpace/CutPrefix ordering and use IndexByte in ReadUdevDB
- Hoist serialKeyPriority to package-level var

Signed-off-by: Daniil Studenikin <daniil.studenikin@flant.ru>
@fastrapier fastrapier force-pushed the feat/udev-crawler-fill branch from cd4f80d to a1f6a8a Compare April 20, 2026 13:10
@fastrapier fastrapier changed the base branch from main to feat/udev-properties-devicemap April 20, 2026 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants