Add option to reach data download token from b64 SA keys#165
Add option to reach data download token from b64 SA keys#165nikhilwoodruff wants to merge 1 commit into
Conversation
anth-volk
left a comment
There was a problem hiding this comment.
@nikhilwoodruff Question for you around desired implementation.
| @@ -16,7 +20,18 @@ class SimplifiedGoogleStorageClient: | |||
| """ | |||
|
|
|||
| def __init__(self): | |||
There was a problem hiding this comment.
question, blocking: Have you looked at all into Google Secrets Manager?
An option I've found that we could use would be, instead of having us create individual service accounts for customers, we could generate our own API keys and then use Google Secrets Manager to store these keys and pass & validate them via SHA256 encryption. This seems a bit more akin to what you're seeking to do. It also gives us more granular control over which datasets this enables, as we can store richer metadata with the key (imagine one day we create another type of limited-access dataset that we don't want these customers to access).
I have a Claude chat here describing Google Secrets Manager, as well as other options that may be less desirable (e.g., JWT tokens). The most relevant portions are toward the end. Curious to hear your thoughts.
There was a problem hiding this comment.
We would have to build some other service and host it to authenticate right though?
There was a problem hiding this comment.
I think I need to understand our use case better here. I don't really want to manage service accounts for external users or force the user to authenticate using a specific mechanism.
Is there a reason we can't have them provide a gmail account or service account that they own/manage and then grant it permission to access the bucket?
There was a problem hiding this comment.
The reason is that we want to keep the process of running code that downloads the microdata simple. A single microdata access token as an env var is as simple as it gets
Fixes #164
After this PR, we could give users access tokens for private microdata. The process would be:
Storage legacy bucket readerandStorage legacy object readerin the relevant Google Cloud bucket in the projectPolicyEngine researchThe argument for b64 encoding is that it reduces complexity and bug-prone behaviour around string formatting etc. when pasting the contents.