|
| 1 | +<!-- |
| 2 | + Licensed to the Apache Software Foundation (ASF) under one |
| 3 | + or more contributor license agreements. See the NOTICE file |
| 4 | + distributed with this work for additional information |
| 5 | + regarding copyright ownership. The ASF licenses this file |
| 6 | + to you under the Apache License, Version 2.0 (the |
| 7 | + "License"); you may not use this file except in compliance |
| 8 | + with the License. You may obtain a copy of the License at |
| 9 | +
|
| 10 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 11 | +
|
| 12 | + Unless required by applicable law or agreed to in writing, |
| 13 | + software distributed under the License is distributed on an |
| 14 | + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 15 | + KIND, either express or implied. See the License for the |
| 16 | + specific language governing permissions and limitations |
| 17 | + under the License. |
| 18 | +--> |
| 19 | + |
| 20 | +# Apache Beam |
| 21 | + |
| 22 | +## CI Environment |
| 23 | + |
| 24 | +Continuous Integration is an important component of making Apache Beam robust and stable. |
| 25 | + |
| 26 | +Our main execution environment for CI is Jenkins, which is available at |
| 27 | +[https://ci-beam.apache.org/](https://ci-beam.apache.org/). See |
| 28 | +[.test-infra/jenkins/README](.test-infra/jenkins/README.md) |
| 29 | +for the trigger phrase, status, and link of every Jenkins job. See the Apache Beam Developer Guide for |
| 30 | +[Jenkins Tips](https://cwiki.apache.org/confluence/display/BEAM/Jenkins+Tips). |
| 31 | + |
| 32 | +An additional execution environment for CI is [GitHub Actions](https://github.com/features/actions). GitHub Actions |
| 33 | +(GA) is well integrated with GitHub code and workflows, and it evolved quickly in 2019/2020 into |
| 34 | +a fully-fledged CI environment that is easy to use and develop for, so we decided to use it for building the Python source |
| 35 | +distribution and wheels. |
| 36 | + |
| 37 | +## GitHub Actions |
| 38 | + |
| 39 | +### GitHub Actions run types |
| 40 | + |
| 41 | +The following GA CI job runs are currently executed for Apache Beam, and each run has a different |
| 42 | +purpose and context. |
| 43 | + |
| 44 | +#### Pull Request Run |
| 45 | + |
| 46 | +These runs result from PRs opened by contributors from their forks. Most builds for Apache Beam fall |
| 47 | +into this category. They are executed in the context of the fork, not the main |
| 48 | +Beam Code Repository, which means that they have only "read" permission to the GitHub resources |
| 49 | +(container registry, code repository). This is necessary because the code in those PRs (including the CI job |
| 50 | +definition) might be modified by people who are not committers for the Apache Beam Code Repository. |
| 51 | + |
| 52 | +The main purpose of these jobs is to check whether the PR builds cleanly, whether the tests run properly, and whether |
| 53 | +the PR is ready for review and merge. |
| 54 | + |
| 55 | +#### Direct Push/Merge Run |
| 56 | + |
| 57 | +These runs result from direct pushes by committers or from a committer merging a Pull Request. |
| 58 | +They execute in the context of the Apache Beam Code Repository and also have |
| 59 | +write permission for GitHub resources (container registry, code repository). |
| 60 | +The main purpose of the run is to check whether the code still holds all the assertions after the merge, for example |
| 61 | +whether it still builds and all tests are green. |
| 62 | + |
| 63 | +This is needed because some of the conflicting changes from multiple PRs might cause build and test failures |
| 64 | +after merge even if they do not fail in isolation. |
| 65 | + |
| 66 | +#### Scheduled Run |
| 67 | + |
| 68 | +These runs result from a scheduled (nightly) job that runs only for the `master` branch. The |
| 69 | +main purpose of the job is to check whether external dependency changes have affected the Apache |
| 70 | +Beam code (for example, newly released transitive dependencies that fail the build). Another reason for the nightly |
| 71 | +build is that it tags the most recent master with `nightly-master`. |
| 72 | + |
| 73 | +All runs consist of the same jobs, but the jobs behave slightly differently or are skipped in different |
| 74 | +run categories. Here is a summary of the run categories with regard to the jobs they run. |
| 75 | +These jobs often use a matrix run strategy, which runs several variations of each job |
| 76 | +(for example, with a different platform type or Python version). |
| 77 | + |
| 78 | +| Job | Description | Pull Request Run | Direct Push/Merge Run | Scheduled Run | Requires GCP Credentials | |
| 79 | +|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|-----------------------|---------------|--------------------------| |
| 80 | +| Build python source distribution | Builds python source distribution and uploads it to artifacts. Artifacts from release branch are used in release process ([`build_release_candidate.sh`](release/src/main/scripts/build_release_candidate.sh)) | Yes | Yes | Yes | - | |
| 81 | +| Prepare GCS | Clears target path on GCS if already exists. | - | Yes | Yes | Yes | |
| 82 | +| Upload python source distribution to GCS bucket | Uploads python source distribution to GCS bucket for path unique for specific workflow run. | - | Yes | Yes | Yes | |
| 83 | +| Build python wheels on linux/macos/windows | Builds python wheels on linux/macos/windows platform with usage of `cibuildwheel` and uploads it to artifacts. Artifacts from release branch are used in release process ( [ `build_release_candidate.sh` ](release/src/main/scripts/build_release_candidate.sh) ) | Yes | Yes | Yes | - | |
| 84 | +| Upload python wheels to GCS bucket | Uploads python wheels to GCS bucket for path unique for specific workflow run. Additionally uploads workflow run data. | - | Yes | Yes | Yes | |
| 85 | +| List files on Google Cloud Storage Bucket | Lists files on GCS for verification purpose. | - | Yes | Yes | Yes | |
| 86 | +| Tag repo nightly | Tag repo with `nightly-master` tag if build python source distribution and python wheels finished successfully. | - | - | Yes | - | |
| 87 | + |
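To make the summary table easier to query, the sketch below restates it as plain Python data. It is illustrative only and is not the actual workflow definition: the names `JOBS` and `jobs_for` are hypothetical, and mapping the three run categories to the GitHub Actions event names `pull_request`, `push`, and `schedule` is an assumption.

```python
# Illustrative restatement of the summary table above; not the real workflow file.
# Assumption: the run categories correspond to these GitHub Actions event names.
RUN_TYPES = ("pull_request", "push", "schedule")

# job name -> (run types in which the job runs, whether it requires GCP credentials)
JOBS = {
    "Build python source distribution": ({"pull_request", "push", "schedule"}, False),
    "Prepare GCS": ({"push", "schedule"}, True),
    "Upload python source distribution to GCS bucket": ({"push", "schedule"}, True),
    "Build python wheels on linux/macos/windows": ({"pull_request", "push", "schedule"}, False),
    "Upload python wheels to GCS bucket": ({"push", "schedule"}, True),
    "List files on Google Cloud Storage Bucket": ({"push", "schedule"}, True),
    "Tag repo nightly": ({"schedule"}, False),
}


def jobs_for(run_type):
    """Return the job names that run for the given run type, in table order."""
    if run_type not in RUN_TYPES:
        raise ValueError(f"unknown run type: {run_type!r}")
    return [name for name, (runs, _) in JOBS.items() if run_type in runs]


def needs_gcp_credentials(job_name):
    """Return True if the table marks the job as requiring GCP credentials."""
    return JOBS[job_name][1]
```

For example, `jobs_for("pull_request")` returns only the two build jobs, which matches the table: every job that requires GCP credentials is skipped for Pull Request runs.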
| 88 | +### Google Cloud Platform Credentials |
| 89 | + |
| 90 | +Some of the jobs require variables stored as [GitHub Secrets](https://docs.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets) |
| 91 | +to perform operations on Google Cloud Platform. Currently these jobs are limited to the Apache repository only. |
| 92 | +These variables are: |
| 93 | + * `GCP_SA_EMAIL` - Service account email address. This is usually of the format `<name>@<project-id>.iam.gserviceaccount.com`. |
| 94 | + * `GCP_SA_KEY` - Service account key. This key should be created and encoded as a Base64 string (e.g. `cat my-key.json | base64` on macOS). |
| 95 | + |
| 96 | +The service account must have the following permissions: |
| 97 | + * Storage Object Admin (roles/storage.objectAdmin) |
| 98 | + |
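The Base64 encoding step mentioned for `GCP_SA_KEY` can also be done without a shell. The following is a minimal sketch using only the Python standard library; the function name `encode_service_account_key` and the file name `my-key.json` in the usage comment are placeholders, and how the secret is later decoded by the workflow is not covered here.

```python
import base64


def encode_service_account_key(key_bytes: bytes) -> str:
    """Return the Base64 text form of a service account key file's raw bytes."""
    # b64encode emits a single line with no line breaks, which is convenient
    # for pasting into a GitHub Secret value.
    return base64.b64encode(key_bytes).decode("ascii")


# Hypothetical usage (the file name is a placeholder):
# with open("my-key.json", "rb") as key_file:
#     print(encode_service_account_key(key_file.read()))
```

Decoding the result with `base64.b64decode` returns the original key bytes, so the value can be checked locally before it is stored as a secret.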
| 99 | +### GitHub Action Tips |
| 100 | + |
| 101 | +* If you change a workflow, your changes may not be reflected in the check run triggered by the Pull Request. |
| 102 | +In this case, please attach a link to the modified workflow run executed on your fork. |
| 103 | +* Possible timeouts with macOS runner - existing issue: [(X) This check failed - sometimes happens on macOS runner #841](https://github.com/actions/virtual-environments/issues/841) |
| 104 | +* [GitHub Actions Documentation](https://docs.github.com/en/actions) |