Conversation
|
👋 patrickhuie19, thanks for creating this pull request! To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team. Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks! |
There was a problem hiding this comment.
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
erikburt
left a comment
There was a problem hiding this comment.
I’ve left more detailed comments inline, but at this size it’s difficult to fully vet in a single pass.
Some higher-level feedback:
- I think this probably belongs in its own repo. It’s not just a single action, it’s a collection of related actions and workflows, and because of that, seems like a suite/product in itself. Having it's own repository would make ownership and versioning a lot clearer. Can expand on this further if needed.
- I’m happy to help set up a new repo.
- If it must live in this repository, then all the code should live in the top-level
apps/medicdirectory probably, rather than.github/actions/medic.
- You're most of the way there, but each major medic feature should likely be its own independently versioned action, with shared functionality pulled into a common library.
- For example comment-parser, doesn't have an action.yml but is does have it's own
dist.libis not an actual lib, it's a shared folder. Although, probably not big enough to warrant it being a separate library yet, still helps with organization imo.
- For example comment-parser, doesn't have an action.yml but is does have it's own
- This kind of goes with the above, but right now, the code is tough to reason about and maintain. I think it would benefit from more structural cleanup. Clearer boundaries between individual functionality, tests utilities, common functions, etc.
- Documentation would be helpful as well
| - name: Checkout medic | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| repository: smartcontractkit/medic-test | ||
| path: _medic | ||
|
|
||
| - name: Setup Node.js | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: '20' | ||
|
|
||
| - name: Parse comment | ||
| id: parse | ||
| run: node _medic/.github/actions/medic/dist/comment-parser/index.js | ||
| env: | ||
| INPUT_GITHUB-TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| INPUT_COMMENT-BODY: ${{ github.event.comment.body }} | ||
| INPUT_ISSUE-NUMBER: ${{ github.event.issue.number }} | ||
| INPUT_COMMENTER-LOGIN: ${{ github.event.comment.user.login }} | ||
|
|
There was a problem hiding this comment.
Seems like you properly split up the other functions/actions into their own action.ymls, but this one should be as well.
| id: check | ||
| run: | | ||
| echo "Found /medic generate conflict command" | ||
| echo "should_generate=true" >> $GITHUB_OUTPUT |
There was a problem hiding this comment.
I recommend using ... | tee -a "${GITHUB_OUTPUT}" everywhere so you can see what values are actually being set.
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Authenticate to GCP | ||
| uses: google-github-actions/auth@v2 |
| export_environment_variables: true | ||
|
|
||
| - name: Auto retry failed workflow | ||
| uses: smartcontractkit/medic-test/.github/actions/medic/workflow-retry-auto@main |
There was a problem hiding this comment.
ideally this is versioned and tagged, not pointing to main. could add reusable workflows in a later PR.
| - name: Checkout medic | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| repository: smartcontractkit/medic-test | ||
| path: _medic |
There was a problem hiding this comment.
Using old repo all over the place. And based on below comment, this shouldn't be necesary.
| const AnthropicVertex = mod.default; | ||
| client = new AnthropicVertex({ | ||
| projectId, | ||
| region: process.env.CLOUD_ML_REGION || 'us-east5', |
There was a problem hiding this comment.
Ideally this is an input, mixing random env vars and inputs can be confusing (same with ANTHROPIC_VERTEX_PROJECT_ID) above.
|
|
||
| try { | ||
| const response = await octokit.request( | ||
| 'GET /repos/{owner}/{repo}/actions/jobs/{job_id}/logs', |
There was a problem hiding this comment.
Why not use octokit.rest.actions.downloadJobLogsForWorkflowRun. It also accepts the request object afaict.
| const execPromise = exec.exec( | ||
| 'claude', | ||
| ['--dangerously-skip-permissions', '-p', prompt, '--output-format', 'stream-json', '--verbose'], | ||
| { |
There was a problem hiding this comment.
This technically leaks credentials in 2 ways.
This exec call inherits env vars, so INPUT_GITHUB-TOKEN is leaked to Claude. Also when using actions/checkout in a workflow, credentials are also saved to the repo unless the action is called with persist-credentials: false.
There was a problem hiding this comment.
This is a blocker. The LLM should only have access to what it needs. Since Claude code can execute arbitrary commands, the risk is too high. Also note that the prompt injection can be dynamically fetched too.
Claude should not have internet access.
There was a problem hiding this comment.
I think Claude should run in a separate step with no secrets in the environment variable, and with persist-credentials: false
| @@ -0,0 +1,292 @@ | |||
| /** | |||
There was a problem hiding this comment.
I think files like this should probably live in __tests__ or in a e2e folder of some sort.
| import { PROMPT_TEMPLATE } from './lib/prompts.js'; | ||
|
|
||
| process.stdout.write(PROMPT_TEMPLATE); |
timweri
left a comment
There was a problem hiding this comment.
After looking at this closer, we need more hardenings on the agent:
Remove access to secrets
We shouldn't let the agent get access to the Github token or any secrets. Use deterministic code to prepare agent inputs and handle agent outputs. The agent step should have no access to secrets
Sandbox and lock down permissions
We can harden the agent more:
- Run in sandbox mode
- Deny read to sensitive folders
- Limit network access. Maybe with
allowedDomains. - Run in
dontAskpermission mode instead of god mode and allow only needed tools (Read,Edit)
I think the agent should not be able to call any bash command. It should only read, edit and provide text output. Everything else should be scripted.
| steps: | ||
| - name: React with eyes | ||
| run: | | ||
| gh api repos/${{ github.repository }}/issues/comments/${{ github.event.comment.id }}/reactions \ |
There was a problem hiding this comment.
The CodeQL flag is correct here. Even when some values are not easily manipulated for injection, we should put these workflow variables as environment variables as good practice.
Example:
env:
REPO: ${{ github.repository }}
COMMENT_ID: ${{ github.event.comment.id }}
run: gh api repos/"$REPO"/issues/comments/"$COMMENT_ID"/reactions ...
| - name: Checkout medic | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| repository: smartcontractkit/medic-test |
| @@ -0,0 +1,334 @@ | |||
| name: Medic - Merge Conflict Resolution | |||
There was a problem hiding this comment.
I think we should keep this PR light with just CI retry.
| @@ -0,0 +1,58 @@ | |||
| { | |||
| "name": "medic-action", | |||
There was a problem hiding this comment.
We should prefix this to be @chainlink/medic-action
| const execPromise = exec.exec( | ||
| 'claude', | ||
| ['--dangerously-skip-permissions', '-p', prompt, '--output-format', 'stream-json', '--verbose'], | ||
| { |
There was a problem hiding this comment.
This is a blocker. The LLM should only have access to what it needs. Since Claude code can execute arbitrary commands, the risk is too high. Also note that the prompt injection can be dynamically fetched too.
Claude should not have internet access.
| const execPromise = exec.exec( | ||
| 'claude', | ||
| ['--dangerously-skip-permissions', '-p', prompt, '--output-format', 'stream-json', '--verbose'], | ||
| { |
There was a problem hiding this comment.
I think Claude should run in a separate step with no secrets in the environment variable, and with persist-credentials: false
| id: direct-pr | ||
| if: inputs.pr-number != '' | ||
| run: | | ||
| PR_DATA=$(gh pr view ${{ inputs.pr-number }} --json number,headRefName,baseRefName,author --jq '{number: .number, headRefName: .headRefName, baseRefName: .baseRefName, author: .author.login, currentAttempts: 0}') |
There was a problem hiding this comment.
Same for ${{ inputs.pr-number }}. This should put it as an env and then use "$ENV".
| attempt, | ||
| tokens: claudeResult.tokens, | ||
| author: pr.user?.login || 'unknown', | ||
| details: errorMessage, |
There was a problem hiding this comment.
Error outputs from Claude should be treated caution. For example, it might include phishing markdown injection. I think if Claude fails, the error should be restricted. Overall, Claude open output in comments should not be trusted.
Bringing https://github.com/smartcontractkit/medic-test/pulls to this repo. Next step is to use the reusable workflows in core.