
Metadata DB silent corruption protection/detection #271

@joonas-fi

Description


Possible mitigations

  1. Store the DB on an integrity-protected filesystem.
  • Ext4 doesn't support checksumming data (recent advancements added checksumming for metadata, but that's not enough), though below Ext4 it would be possible to use something like dm-integrity.
  • This is harder to guarantee because we can't control the circumstances in which users will want to run Varasto.
  2. When migrating to SQLite (see Migrate to SQLite #206), use something like The Checksum VFS Shim.
  • This however complicates deployment of SQLite, since it's a plugin we must conditionally compile in or load at runtime. Does it work with SQLite's Go port etc.?
  3. Build integrity verification one layer above SQLite. Since we're targeting a move to an EventSourcing-based architecture, we can rely on its properties to project the event log into two different instances of the SQLite database: 1) operational 2) verification copy. If the projection is deterministic (it should be), I suspect the SQLite DB should be byte-identical (or at least semantically equivalent) on disk, so we can compare two DB exports (based on the same event cursor). If they differ, we know either instance is bad, and we can shut down and investigate before more errors start accumulating. Of course this assumes the event log has its own integrity verification (this just shifts the problem there), but it needs that anyway. It looks like this:
flowchart TB
    eventlog[Event log]
    operationaldb[Operational DB]
    comparisondb[Comparison DB]

    subgraph "DB integrity verification"
        compare[Comparison]
    end

    eventlog -- projected to --> operationaldb --> compare
    eventlog -- projected to --> comparisondb --> compare

Comparison of approaches:

| Approach | Works for all users | Does not complicate SQLite deployment |
| -------- | ------------------- | ------------------------------------- |
| 1        | ❌                  | ✅                                    |
| 2        | ✅                  | ❌                                    |
| 3        | ✅                  | ✅                                    |

Option 3 seems the cleanest.

Concrete incident from my own instance

Integrity verification job identified three different disks having the same blob missing:


This is highly suspicious (was this a test of mine from years ago where I purposefully removed same blob from all three replicas?).

The blob ref from database record is:

6510b426e09cef0843dd8bdedd946067bcb016a9d0990794aa0fb938f4856dd8

The source for this blob ref is bucket scan:

func (r *SimpleRepository) EachFrom(from []byte, fn func(record any) error, tx *bbolt.Tx) error {
	bucket := tx.Bucket(r.bucketName)
	if bucket == nil {
		return ErrBucketNotFound
	}

	all := bucket.Cursor()

	// scan records in key order starting from the given key,
	// decoding each msgpack payload into a freshly allocated record
	for key, value := all.Seek(from); key != nil; key, value = all.Next() {
		record := r.alloc()
		if err := msgpack.Codec.Unmarshal(value, record); err != nil {
			return err
		}

		if err := fn(record); err != nil {
			return err
		}
	}

	return nil
}

When patched with some debug code:

		if idExpected := r.idExtractor(record); !bytes.Equal(key, idExpected) {
			return fmt.Errorf("repo[%s] record[%x]: DISCREPANCY: id expected %x", r.bucketName, key, idExpected)
		}

When doing a DB export (which visits each DB record), it first failed with this:

repo[blobs] record[6510b426e09cef0843dd8bdedd946067bcb016a9d0990794aa2fb938f4856dd8]: DISCREPANCY: id expected 6510b426e09cef0843dd8bdedd946067bcb016a9d0990794aa0fb938f4856dd8

(the "expected" value comes from the actual record; the "record" value comes from the bucket's key, which should always be the same as the ID in the record)

So it looks like the ID in the record "payload" has bitrotted while the key is the original correct one. That explains why the integrity verifier could read the blob metadata (it uses the bucket scan, which yields the incorrect ID from the payload), but querying the blob metadata from the REST API didn't work (it uses OpenByPrimaryKey(), which expects the correct key and not the bitrotted ID from the record payload).

Most likely this is bitrot on the SSD and not a bit flip in RAM when the record was last modified, simply based on the relative probabilities of those two events.

Comparing the two (above the correct one, below the bitrotted one):

6510b426e09cef0843dd8bdedd946067bcb016a9d0990794aa2fb938f4856dd8
6510b426e09cef0843dd8bdedd946067bcb016a9d0990794aa0fb938f4856dd8
                                                  ^ difference here

The differing hex digit in binary:

00000010
00000000
      ^ bit flipped 1 -> 0

The fix will be to export the DB to JSON (a feature that already exists), fix the ID and then re-import.

Sidetrack

On my server the REST API blob metadata query:

  • didn't work with the bitrotted ID
  • did work with the correct ID

When I exported the database (to JSON format) and imported it into my dev setup, the same REST API query did work. This was baffling at first, but it worked because of these differences resulting from the semantic export -> import (as opposed to a raw DB copy):

|              | On server | On dev setup |
| ------------ | --------- | ------------ |
| bucket key   | correct   | incorrect    |
| id in record | incorrect | incorrect    |

Labels: enhancement (New feature or request)