
Conversation

@mirkoCrobu
Contributor

Motivation

closes #109

The properties.msgpack.lock file should protect properties.msgpack only during read/write operations; instead, the lock file seems to remain there indefinitely after the first access.

To reproduce

rm .local/share/arduino-app-cli/properties.msgpack.lock
curl http://localhost:8800/v1/properties/asd
file .local/share/arduino-app-cli/properties.msgpack.lock

Change description

Explicitly remove the lock file after checking that it exists.
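
For illustration, a minimal sketch of the kind of cleanup described above, assuming a gofrs/flock handle guarding the properties file (package and helper name are hypothetical, not the actual diff):

package properties

import (
	"os"

	"github.com/gofrs/flock"
)

// releaseAndCleanup is a hypothetical helper: it releases the advisory lock
// and then removes the lock file if it is still present, so the file does
// not linger on disk between accesses.
func releaseAndCleanup(fileLock *flock.Flock) error {
	if err := fileLock.Unlock(); err != nil {
		return err
	}
	// Best-effort cleanup: remove the lock file only if it still exists.
	if _, err := os.Stat(fileLock.Path()); err == nil {
		return os.Remove(fileLock.Path())
	}
	return nil
}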

Additional Notes

Reviewer checklist

  • PR addresses a single concern.
  • PR title and description are properly filled.
  • Changes will be merged in main.
  • Changes are covered by tests.
  • Logging is meaningful in case of troubleshooting.

mirkoCrobu self-assigned this on Dec 2, 2025
mirkoCrobu requested a review from a team on Dec 2, 2025
per1234 added the bug label on Dec 2, 2025
@lucarin91
Contributor

So it seems that the file is created by the lock procedure, but it is left to the application to remove it https://pkg.go.dev/github.com/gofrs/flock#Flock.Unlock. That seems strange, but probably ok.

@mirkoCrobu
Contributor Author

So it seems that the file is created by the lock procedure, but it is left to the application to remove it https://pkg.go.dev/github.com/gofrs/flock#Flock.Unlock. That seems strange, but probably ok.

I guess this behavior is intended by the library. If there is a lot of concurrency, the following scenario can happen:

  1. A creates the lock file and locks the resource.
  2. B finds the lock file and waits for it.
  3. A unlocks and deletes the file.
  4. C arrives, creates a new lock file, and thinks it is free to work on the resource.
  5. B sees that the lock file (created by A) doesn't exist anymore and assumes the resource is free (creating a lock file different from the one created by C). In this case, B and C are working on the same resource at the same time.

In our case, this scenario is highly improbable, so we can prioritize filesystem cleanliness.

@lucarin91
Contributor

So it seems that the file is created by the lock procedure, but it is left to the application to remove it https://pkg.go.dev/github.com/gofrs/flock#Flock.Unlock. That seems strange, but probably ok.

I guess this behavior is intended by the library. If there is a lot of concurrency, the following scenario can happen:

  1. A creates the lock file and locks the resource.
  2. B finds the lock file and waits for it.
  3. A unlocks and deletes the file.
  4. C arrives, creates a new lock file, and thinks it is free to work on the resource.
  5. B sees that the lock file (created by A) doesn't exist anymore and assumes the resource is free (creating a lock file different from the one created by C). In this case, B and C are working on the same resource at the same time.

In our case, this scenario is highly improbable, so we can prioritize filesystem cleanliness.

No, this isn't correct. B will see that another process has created the file and will either continue to wait or just fail. At least, that is how it should work. Removing the file shouldn't make any difference, because the important part is that you unlock the file; if the file isn't there, it means that no one is holding the lock.

The open question is how the interleaving of unlock, remove, create, and lock works. For example:

  1. A unlocks.
  2. B locks.
  3. A removes the file. <-- this should fail, and everything is ok

Or:

  1. A unlocks.
  2. A removes the file.
  3. B creates the file.
  4. C locks.
  5. B locks. <-- this should fail, but that is also ok

So I don't know; it seems safe to me in theory, although I would prefer to do some tests regarding data consistency. We could spin up multiple processes that make concurrent changes and see if we hit any data race.
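
For instance, a rough sketch of such a consistency test (hypothetical, not part of the repo; it uses goroutines with independent lock handles instead of separate processes, which still contend because flock(2) locks belong to the open file description):

package properties_test

import (
	"os"
	"path/filepath"
	"strconv"
	"sync"
	"testing"

	"github.com/gofrs/flock"
)

// TestConcurrentIncrements spawns several workers that each lock, read a
// counter from a data file, increment it, write it back, and unlock.
// If the locking works, no update is lost and the final value equals
// workers * iterations.
func TestConcurrentIncrements(t *testing.T) {
	dir := t.TempDir()
	dataPath := filepath.Join(dir, "counter.txt")
	lockPath := filepath.Join(dir, "counter.lock")
	if err := os.WriteFile(dataPath, []byte("0"), 0o644); err != nil {
		t.Fatal(err)
	}

	const workers, iterations = 8, 50
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				fl := flock.New(lockPath)
				if err := fl.Lock(); err != nil { // blocks until acquired
					t.Error(err)
					return
				}
				raw, _ := os.ReadFile(dataPath)
				n, _ := strconv.Atoi(string(raw))
				_ = os.WriteFile(dataPath, []byte(strconv.Itoa(n+1)), 0o644)
				_ = fl.Unlock()
			}
		}()
	}
	wg.Wait()

	raw, _ := os.ReadFile(dataPath)
	if got, _ := strconv.Atoi(string(raw)); got != workers*iterations {
		t.Fatalf("lost updates: got %d, want %d", got, workers*iterations)
	}
}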

@mirkoCrobu
Contributor Author

@lucarin91 The problem is that the processes are working on file descriptors pointing to inodes:

  1. A creates the file lock.lock (inode 111).
  2. B opens lock.lock, gets a file descriptor to inode 111, and waits for the lock.
  3. A removes the file lock.lock (inode 111), but B still holds an open file descriptor to inode 111.
  4. B now acquires the (ghost) lock on lock.lock (inode 111).
  5. C checks for the entry "lock.lock", sees it is missing, creates a new lock.lock (inode 123), and acquires it.
  6. B has a valid lock on inode 111, C has a valid lock on inode 123, and both are modifying the resource simultaneously.

I tested it with the following script:

package main

import (
	"fmt"
	"os"
	"syscall"

	"github.com/gofrs/flock"
)

func main() {
	path := "race.lock"
	fileLock := flock.New(path)

	fmt.Printf("try taking the lock on %s...\n", path)

	// Try to acquire the flock without blocking.
	locked, err := fileLock.TryLock()
	if err != nil {
		panic(err)
	}

	if locked {
		// Record the inode of the lock file we are currently holding.
		fileInfo, err := os.Stat(path)
		if err != nil {
			panic(err)
		}
		stat := fileInfo.Sys().(*syscall.Stat_t)

		fmt.Printf("Lock Acquired!!\n")
		fmt.Printf("File name: %s\n", path)
		fmt.Printf("Inode ID : %d\n", stat.Ino)
		fmt.Println("delete the lock file from another shell and press enter")

		// Keep the lock held while the file is deleted externally.
		var input string
		fmt.Scanln(&input)

		fileLock.Unlock()
		fmt.Println("Unlocked.")
	} else {
		fmt.Println("Unable to acquire the lock.")
	}
}

@lucarin91
Contributor

@lucarin91 The problem is that the processes are working on file descriptors pointing to inodes:

A creates the file lock.lock (inode 111). B opens lock.lock, gets a file descriptor to inode 111, and waits for the lock. A removes the file lock.lock (inode 111), but B still holds an open file descriptor to inode 111. B now acquires the (ghost) lock on lock.lock (inode 111). C checks for the entry "lock.lock", sees it is missing, creates a new lock.lock (inode 123), and acquires it. B has a valid lock on inode 111, C has a valid lock on inode 123, and both are modifying the resource simultaneously.

Yeah, that makes sense. You would need file permissions to prevent a locked file from being removed. But just for completeness, this doesn't mean that the flock mechanism doesn't work; it just means that the file has to stay in place for the locking scheme to work. You can even check with the lslocks command that the file still holds a working flock; it is just labeled (deleted).

@lucarin91
Contributor

At this point, I am wondering whether, rather than keeping a separate .lock file, we should use the properties.msgpack file itself for locking. I don't see much advantage in using another file for this.

@mirkoCrobu
Contributor Author

I thought the same, but in deleteProperty and UpsertProperty we use renameio.WriteFile on properties.msgpack, which writes to a temp file and replaces the original with a new file and a new inode.
So we would need a file that is never replaced to implement the lock.
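
The inode change is easy to observe with a small sketch like this (illustrative paths; assumes the v1 github.com/google/renameio import path):

package main

import (
	"fmt"
	"os"
	"syscall"

	"github.com/google/renameio"
)

// inode returns the inode number of the file at path (Unix-only).
func inode(path string) uint64 {
	info, err := os.Stat(path)
	if err != nil {
		panic(err)
	}
	return info.Sys().(*syscall.Stat_t).Ino
}

func main() {
	path := "properties.msgpack"
	if err := os.WriteFile(path, []byte("v1"), 0o644); err != nil {
		panic(err)
	}
	before := inode(path)

	// renameio writes to a temp file and atomically renames it over the
	// original, so the path now points at a brand new inode.
	if err := renameio.WriteFile(path, []byte("v2"), 0o644); err != nil {
		panic(err)
	}
	after := inode(path)

	fmt.Printf("inode before: %d, after: %d\n", before, after)
	// Any flock held on the old inode no longer guards what this path points to.
}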

I see two solutions:

  • keep the same .lock file and never delete it
  • manually delete the .lock file, considering we don't have high concurrency and that the A-B-C scenario is unlikely

@lucarin91
Contributor

I thought the same, but in deleteProperty and UpsertProperty we use renameio.WriteFile on properties.msgpack, which writes to a temp file and replaces the original with a new file and a new inode. So we would need a file that is never replaced to implement the lock.

I see two solutions:

  • keep the same .lock file and never delete it
  • manually delete the .lock file, considering we don't have high concurrency and that the A-B-C scenario is unlikely

Yeah, at this point we could remove renameio and write to the file directly; I'm not sure why we used both renameio and flock.

@mirkoCrobu
Contributor Author

I thought the same, but in deleteProperty and UpsertProperty we use renameio.WriteFile on properties.msgpack, which writes to a temp file and replaces the original with a new file and a new inode. So we would need a file that is never replaced to implement the lock.
I see two solutions:

  • keep the same .lock file and never delete it
  • manually delete the .lock file, considering we don't have high concurrency and that the A-B-C scenario is unlikely

Yeah, at this point we could remove renameio and write to the file directly; I'm not sure why we used both renameio and flock.

I guess we need both because they solve two different problems (a sketch of how the two combine follows below):

  • flock prevents data corruption from concurrent access, through mutual exclusion
  • renameio prevents data corruption from an unexpected system crash, through atomic writes
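
A rough sketch of how the two complement each other (hypothetical helper, not the repo's actual code):

package properties

import (
	"os"

	"github.com/gofrs/flock"
	"github.com/google/renameio"
)

// updateFile serialises concurrent writers with a flock on a separate lock
// file, and makes the final write atomic with renameio, so a crash cannot
// leave a half-written data file behind.
func updateFile(dataPath, lockPath string, mutate func([]byte) []byte) error {
	fl := flock.New(lockPath)
	if err := fl.Lock(); err != nil { // mutual exclusion across processes
		return err
	}
	defer fl.Unlock()

	current, err := os.ReadFile(dataPath)
	if err != nil && !os.IsNotExist(err) {
		return err
	}

	// Atomic replace: readers see either the old or the new content,
	// never a truncated mix, even if we crash mid-write.
	return renameio.WriteFile(dataPath, mutate(current), 0o644)
}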

@lucarin91
Contributor

I thought the same, but in deleteProperty and UpsertProperty we use renameio.WriteFile on properties.msgpack, which writes to a temp file and replaces the original with a new file and a new inode. So we would need a file that is never replaced to implement the lock.
I see two solutions:

  • keep the same .lock file and never delete it
  • manually delete the .lock file, considering we don't have high concurrency and that the A-B-C scenario is unlikely

Yeah, at this point we could remove renameio and write to the file directly; I'm not sure why we used both renameio and flock.

I guess we need both because they solve two different problems:

  • flock prevents data corruption from concurrent access, through mutual exclusion
  • renameio prevents data corruption from an unexpected system crash, through atomic writes

Right, so at this point we could convert this task into improving the documentation on why it is implemented this way.
