Conversation

@pwntester pwntester commented Nov 2, 2024

Description

  • Fixes the recently adjusted if condition so that it actually works as intended.
  • Better documents concerns for maintainers to be aware of.
  • Reference the pull_requests ENV at runtime instead of embedding content into the script via GHA context expression. This is a better practice which prevents exploits from untrusted inputs (notably for context objects which might introduce new fields in future).

Member

@polarathene polarathene left a comment

Thank you for the contribution, but I am not sure it makes sense to resolve these unless there is a clear benefit over the current implementation?

  • if can exclude ${{ }} yes, but for maintainers I rather they don't have to think about that and would prefer to keep it simple for them to recognize and not question when it's valid to omit the expression syntax. These workflows are rarely modified beyond dependabot.
  • Replacing the env context usage with explicit shell variables could be done, but unless there is a valid improvement, I would again prefer to keep it as-is since it'll be easier to grok and less error prone. It can be quite easy for these little mistakes to slip through.

  pull_requests: ${{ tojson(github.event.workflow_run.pull_requests) }}
  run: |
- PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${{ env.head_sha }}")][0].number' <<< '${{ env.pull_requests }}')
+ PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${head_sha}")][0].number' <<< '${pull_requests}')
Member

I'm not sure what the value of this would be, and I don't think your suggestion here would work as-is: because ' single quotes wrap this expression, the shell would not interpolate the variables. Thus your suggestion would break the intended functionality.

The expression syntax to use env context works as that is pre-processed before the run executes.


Both of these variables are referencing the step's own env vars set directly above, which use context that shouldn't be possible to tamper with? Could you please explain the value of this suggestion?

I think there is less risk for error for maintainers with the env context expressions used, since it is distinctively clear for us to grok vs the subtle error with quotes in shell scripts affecting interpolation which can introduce bugs as this PR has done unintentionally?

Author

The SHA is totally safe since it's only an alphanumeric value. The github.event.workflow_run.pull_requests contains untrusted data from the triggering Pull Request though (eg: title and body). An attacker can add a body like foo'`id`'bar which will pollute the value of the pull_requests env var. If you use the ${{}} expression interpolation, the workflow expressions will get interpolated into the final bash script that will then be passed to bash for execution. So if you use ${{ env.pull_requests }} the attacker will be able to use the PR body to close the bash single quote and add new commands (eg: foo';`id`;bar).

You are right that '${pull_requests}' needs to be replaced with "${pull_requests}" since otherwise it won't get expanded. Will fix that in the PR. More info here
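
To make the templating vs. runtime distinction concrete, here is a minimal sketch that can be run locally. It assumes the ${{ }} substitution can be approximated by pasting a string into the script text before bash parses it. The payload value and the harmless echo INJECTED command are hypothetical stand-ins, not anything taken from a real PR.

```shell
#!/usr/bin/env bash
# Hypothetical untrusted value; a harmless `echo` stands in for an attacker's command.
payload="foo'; echo INJECTED; echo 'bar"

# 1) Templating: the value is pasted into the script text before bash parses it,
#    which approximates what a ${{ ... }} expression does.
templated_script="printf '%s\n' '${payload}'"
templated_output=$(bash -c "${templated_script}")

# 2) Runtime variable: the script text is fixed and the value arrives via the environment.
runtime_output=$(pull_requests="${payload}" bash -c 'printf "%s\n" "${pull_requests}"')

echo "templated: ${templated_output}"
echo "runtime:   ${runtime_output}"
```

In the templated variant the pasted quote ends the string early, so echo INJECTED runs as its own command. In the runtime variant the payload is only ever treated as data and is printed verbatim.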

Member

So if you use ${{ env.pull_requests }} the attacker will be able to use the PR body to close the bash single quote and add new commands (eg: foo';`id`;bar).

While I get what you're saying here, how is it different from ${pull_requests}? Both would be a string of JSON key/value pairs?

Ah ok, templating to generate script with content vs runtime variable.

The github.event.workflow_run.pull_requests contains untrusted data from the triggering Pull Request though (eg: title and body).

I don't think it does, pull_request context has specific metadata like that but not for this workflow_run event:

echo "PRs: ${{ tojson(github.event.workflow_run.pull_requests) }}"
PRs: [
  {
    base: {
      ref: main,
      repo: {
        id: 506839796,
        name: actions-example,
        url: https://api.github.com/repos/polarathene/actions-example
      },
      sha: e2e0c6555bec8ca661b59057443cfa5a54b8da75
    },
    head: {
      ref: test-branch,
      repo: {
        id: 506839796,
        name: actions-example,
        url: https://api.github.com/repos/polarathene/actions-example
      },
      sha: 113f63be4028584a1528e63ff33f821d990c28d2
    },
    id: 2157169857,
    number: 7,
    url: https://api.github.com/repos/polarathene/actions-example/pulls/7
  }
]

No PR title in this data?

Author

Good to know, the docs just mention that it's an array of pull request objects so I assumed they contained the title and body. I would stay on the safe side though, perhaps untrusted data is added to these objects in the future.

  {
  echo "PR_NUMBER=${PR_NUMBER}"
- echo 'PR_HEADSHA=${{ env.head_sha }}'
+ echo 'PR_HEADSHA=${head_sha}'
Member

Likewise, the content is single quote wrapped, so this would not interpolate. Instead, the value of PR_HEADSHA as an ENV will now be the string ${head_sha} rather than the actual fixed string of the head SHA.
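
The quoting difference can be checked in isolation. A minimal sketch, reusing the head SHA from the sample payload earlier in this thread; the single_quoted and double_quoted variable names are only for illustration:

```shell
#!/usr/bin/env bash
# SHA copied from the sample payload earlier in this thread.
head_sha='113f63be4028584a1528e63ff33f821d990c28d2'

# Single quotes: the shell leaves ${head_sha} untouched.
single_quoted='PR_HEADSHA=${head_sha}'
# Double quotes: the shell substitutes the variable's value.
double_quoted="PR_HEADSHA=${head_sha}"

echo "${single_quoted}"
echo "${double_quoted}"
```

Only the double-quoted form ends up carrying the actual SHA. The ${{ env.head_sha }} expression behaves differently because it is substituted before bash ever sees the quotes.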

  pull_requests: ${{ tojson(github.event.workflow_run.pull_requests) }}
  run: |
- PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${{ env.head_sha }}")][0].number' <<< '${{ env.pull_requests }}')
+ PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${head_sha}")][0].number' <<< "${pull_requests}")
Member

This will still be invalid:

jq -r '[.[] | select(.head.sha == "${head_sha}")][0].number'

jq is splatting (.[]) all array items from pull_requests variable, and selecting only the one that has .head.sha value that matches the expected checksum, but due to single quote wrapping this will not work. Flipping the quotes alone will not work either, you'd need to escape the inner " AFAIK.

The env input appears fairly trustworthy though? I don't think the branch name can be used as an attack vector with quotes?

Suggested change
- PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${head_sha}")][0].number' <<< "${pull_requests}")
+ PR_NUMBER=$(jq -r "[.[] | select(.head.sha == \"${head_sha}\")][0].number" <<< "${pull_requests}")

vs

Suggested change
- PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${head_sha}")][0].number' <<< "${pull_requests}")
+ PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "${{ env.head_sha }}")][0].number' <<< '${{ env.pull_requests }}')

The latter seems simpler to grok, so long as the input is not exploitable that seems less prone to human error with maintenance?
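
A third option, not proposed in this thread, would be to hand the shell variable to jq through its --arg flag, which avoids both the escaped inner quotes and the ${{ }} templating. A minimal sketch, assuming jq is installed and using sample data trimmed from the payload shown earlier:

```shell
#!/usr/bin/env bash
# Sample data modelled on the payload shown earlier in this thread (trimmed to the relevant fields).
head_sha='113f63be4028584a1528e63ff33f821d990c28d2'
pull_requests='[{"head":{"sha":"113f63be4028584a1528e63ff33f821d990c28d2"},"number":7}]'

# --arg binds the shell value to the jq variable $sha, so the filter itself can stay single-quoted.
PR_NUMBER=$(jq -r --arg sha "${head_sha}" \
  '[.[] | select(.head.sha == $sha)][0].number' <<< "${pull_requests}")

echo "PR_NUMBER=${PR_NUMBER}"
```

Because the filter text never contains the SHA, the choice between single and double shell quotes no longer affects the result.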

Author

I don't think the branch name can be used as an attack vector with quotes?

The branch name can, but the SHA is safe to use, so in this case it's OK to use the env workflow expression.

Member

@polarathene polarathene left a comment

I'll apply this. LGTM, thank you for contributing the fix and improvements! ❤️

@polarathene polarathene changed the title Update docs-preview-deploy.yml ci: Revise docs-preview-deploy.yml Nov 4, 2024
@polarathene polarathene merged commit 0ff9c01 into docker-mailserver:master Nov 4, 2024
@polarathene polarathene added this to the v15.0.0 milestone Nov 4, 2024
Comment on lines +23 to +25
github.event.workflow_run.conclusion == 'success'
&& github.event.workflow_run.event == 'pull_request'
&& contains(github.event.workflow_run.pull_requests.*.head.sha, github.event.workflow_run.head_sha)
Member

This condition appears to be triggering a skip for some reason.


docs-preview-prepare ran successfully on the PR pre-merge commit:

$ git -c protocol.version=2 fetch --no-tags --prune --no-recurse-submodules --depth=1 origin +a721146229a6adfb6f18cd28f14b8396c54ec6f7:refs/remotes/pull/4183/merge

a721146229a6adfb6f18cd28f14b8396c54ec6f7 -> pull/4183/merge

When cloning this repo and checking out that PR pre-merge commit, we can run git log:

$ git log

commit a721146229a6adfb6f18cd28f14b8396c54ec6f7 (grafted, HEAD, pull/4183/merge)
Author: RoelSG <email-address-here>
Date:   Sun Nov 10 00:56:27 2024 +0000

    Merge a5682e7c805397493aec6e5f333b236fb13c7cb3 into 0ff9c0132a8914d6756739a7a3b085e47870b93d

docs-preview-deploy then triggers for commit 0ff9c01, and the if condition above evaluates to false, skipping the workflow 🤔


The docs-preview-* workflows were triggered by a merge commit from master into the PR (that I did via GitHub's Web UI button to update the branch).

Commit a5682e7 reflects that, while a721146 would be the pre-merge commit Github created for the PR sync event, and commit 0ff9c01 is the current latest commit on master branch (the squash merge commit from this very merged PR updating the workflow).

Looking at an older workflow run that did pass, this commit was the trigger, even though it shouldn't have met the docs-preview-prepare path requirement to trigger. It was the commit merging the PR into the master branch, rather than a pre-merge commit or, as in this case, a commit merging master into the PR branch (which has been a valid trigger in the past too, unrelated to the commit content).


The previous working if condition was:

if: ${{ github.event.workflow_run.event == 'pull_request' && github.event.workflow_run.conclusion == 'success' }}

Presumably the contains addition is not compatible here. If that's the case then the context may not be valid below either 🤔

I suppose it depends on if a721146 (github generated pre-merge commit) or the actual latest commit on the PR branch (a5682e7?) is treated as the head_sha. Perhaps in this case it's a compatibility issue with the master to PR merge commit only, I'll verify that shortly 😕

Member

UPDATE:

  • The condition did not skip on my own test repo when I created a PR and sync'd to changes of the primary branch main.
  • Opened a PR with a small change to trigger the workflow, that worked fine.

Perhaps it has something to do with the failing PR having been opened prior to this CI workflow update, or it's something related to a third-party contributor/branch 🤷‍♂️


UPDATE 2:

Opened another PR, this time from my own fork of DMS. This triggered the same problem with the if condition skipping the workflow, and that was just the PR commits itself, no new pushes/updates.

Member

I removed the third condition:

&& contains(github.event.workflow_run.pull_requests.*.head.sha, github.event.workflow_run.head_sha)

The workflow runs but shows that the context we want is missing completely:

  PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "45f63b686fb61d85fec2ddab7ac4504f8b191555")][0].number' <<< "${pull_requests}")
  {
    echo "PR_NUMBER=${PR_NUMBER}"
    echo 'PR_HEADSHA=45f63b686fb61d85fec2ddab7ac4504f8b191555'
  } >> "${GITHUB_ENV}"
  shell: /usr/bin/bash -e {0}
  env:
    head_sha: 45f63b686fb61d85fec2ddab7ac4504f8b191555
    pull_requests: []

Whereas for the working PR (branch on the same repo) we have that extra context available:

  PR_NUMBER=$(jq -r '[.[] | select(.head.sha == "8bef84e53657709eda5d35729399817035362efd")][0].number' <<< "${pull_requests}")
  {
    echo "PR_NUMBER=${PR_NUMBER}"
    echo 'PR_HEADSHA=8bef84e53657709eda5d35729399817035362efd'
  } >> "${GITHUB_ENV}"
  shell: /usr/bin/bash -e {0}
  env:
    head_sha: 8bef84e53657709eda5d35729399817035362efd
    pull_requests: [
    {
      "base": {
        "ref": "master",
        "repo": {
          "id": 33037215,
          "name": "docker-mailserver",
          "url": "https://api.github.com/repos/docker-mailserver/docker-mailserver"
        },
        "sha": "0ff9c0132a8914d6756739a7a3b085e47870b93d"
      },
      "head": {
        "ref": "docs/smtp-bind-fix-snippet-titles",
        "repo": {
          "id": 33037215,
          "name": "docker-mailserver",
          "url": "https://api.github.com/repos/docker-mailserver/docker-mailserver"
        },
        "sha": "8bef84e53657709eda5d35729399817035362efd"
      },
      "id": 2171261491,
      "number": 4258,
      "url": "https://api.github.com/repos/docker-mailserver/docker-mailserver/pulls/4258"
    }
  ]

So perhaps some metadata does need to be passed through, but then anyone could adjust the PR/issue provided as input 🤷‍♂️
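
For reference, the empty-array case can be reproduced locally. With pull_requests set to [], the filter above makes jq -r print the literal string null, so PR_NUMBER would not be empty. The sketch below assumes jq is installed; the lookup_pr_number helper and the // empty fallback are illustrative additions, not part of the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PR number whose head SHA matches, or nothing at all.
lookup_pr_number() {
  local pull_requests="$1" head_sha="$2"
  # `// empty` turns jq's null result into no output, so callers can test for an empty string.
  jq -r --arg sha "${head_sha}" \
    '[.[] | select(.head.sha == $sha)][0].number // empty' <<< "${pull_requests}"
}

# Values taken from the failing run shown above.
PR_NUMBER=$(lookup_pr_number '[]' '45f63b686fb61d85fec2ddab7ac4504f8b191555')
if [ -z "${PR_NUMBER}" ]; then
  echo "No PR metadata for this run; skipping" >&2
else
  echo "PR_NUMBER=${PR_NUMBER}"
fi
```

A check like this only detects the missing metadata; it does not recover the PR number for fork PRs.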

Author

Sorry for the late response, I totally missed the notification.
The third condition can be safely removed.
I don't know why github.event.workflow_run.pull_requests is empty for that case, but I don't see how passing it via an env var rather than interpolating it directly could change the value it receives from the runner.
I found this answer that says that pull_requests is not filled for PRs coming from a fork and they had to iterate through all open PRs to find the right one: https://stackoverflow.com/a/79017997

Member

Here we go, from the GHA workflow event trigger docs

The pull_request webhook event payload is empty for merged pull requests and pull requests that come from forked repositories.


There has also been an ongoing discussion about this particular issue for years:

  1. GH blog advises with passing PR number via artifact as ENV from untrusted to trusted workflows (which led to the security concern in the first place 😅 )
  2. GH CLI or REST API via JS script (rate limit risk)
  3. pull_request_target + re-usable workflows with restricted permissions for untrusted builds
  4. gh run view provides the PR number in an awkward manner, but is only compatible with forks apparently 🤷‍♂️

Perhaps this is something GH can properly address, as community discussions like that with the confusion and various solutions probably isn't benefiting anyone 😅


The blog post referenced by solution 1 above, does touch on pull_request_target caveats:

You may ask yourself: if the pull_request_target workflow only checks out and builds the PR, i.e. runs untrusted code but doesn’t reference any secrets, is it still vulnerable?

Yes it is, because a workflow triggered on pull_request_target still has the read/write repository token in memory that is potentially available to any running program.
If the workflow uses actions/checkout and does not pass the optional parameter persist-credentials as false, it makes it even worse. The default for the parameter is true.

It means that in any subsequent steps any running code can simply read the stored repository token from the disk.
If you don’t need a repository write access or secrets, just stick to the pull_request trigger.

The persist-credentials setting appears to have that related logic here.

The blog post only refers to steps for a job, while solution 3 from above shows a workflow with two jobs: the build job of the 1st workflow provides inputs into the 2nd via workflow call trigger to perform a git checkout of the PR without secrets or permissions, which should avoid the persist-credentials concern for their preview job that runs next?


EDIT: I've been looking into solution 3 (pull_request_target) and it seems like it might be the right way to go, I'll get a PR up for this and ping you for a glance over? :)

Member

The third condition can be safely removed.

Yeah, it was more of a safe-guard and in this case it did correctly skip the workflow because the expected metadata was missing. I dropped it to verify that it was in fact lacking the metadata needed, so a different solution is required.

I dont know why github.event.workflow_run.pull_requests is empty for that case, but I dont see how passing it via an env var rather than interpolating it directly could change the value it receives from the runner.

The previous solution that did work was using the untrusted pull_request workflow (prepare) to get the pull request number and related info, then send that over to the trusted workflow_run workflow (deploy) which appended the lines to the filepath of $GITHUB_ENV.

While that did work, there was a security concern with LD_PRELOAD that you mentioned, and I had various concerns about how to approach that properly, so I tried the alternative approach that we have in place currently.

I'd rather not go back to the previous method since the PR contributor can manipulate what those values would be.
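
To illustrate the LD_PRELOAD concern with that earlier method, here is a rough sketch of what appending an untrusted file to $GITHUB_ENV can let through, and how an allow-list on the line format would narrow it. The filter_pr_env helper, the sample lines, and the /tmp/evil.so path are all hypothetical. Note that a format check only blocks unexpected variables; it does not stop a contributor from supplying a different but well-formed PR number, which is the remaining problem described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only lines that look like the two expected variables.
filter_pr_env() {
  grep -E '^(PR_NUMBER=[0-9]+|PR_HEADSHA=[0-9a-f]{40})$' || true
}

# Stand-in for a file supplied by the untrusted workflow.
untrusted=$'PR_NUMBER=7\nPR_HEADSHA=113f63be4028584a1528e63ff33f821d990c28d2\nLD_PRELOAD=/tmp/evil.so'

# Only the filtered lines would be appended to "${GITHUB_ENV}" in a real step.
filtered=$(filter_pr_env <<< "${untrusted}")
printf '%s\n' "${filtered}"
```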

I found this answer that says that pull_requests is not filled for PRs coming from a fork and they had to iterate through all open PRs to find the right one: https://stackoverflow.com/a/79017997

Yes, as my prior message posted at roughly the same time as your comment notes, that appears to be the case. I'll push ahead with the pull_request_target solution 👍
