Merged
8 changes: 8 additions & 0 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
"name": "nydus-dev-container",
"image": "mcr.microsoft.com/vscode/devcontainers/base:1-jammy",
"features": {
"ghcr.io/devcontainers/features/aws-cli:1": {},
"ghcr.io/dhoeric/features/google-cloud-cli:1": {}
}
}
28 changes: 28 additions & 0 deletions .github/workflows/gosec.yml
@@ -0,0 +1,28 @@
name: gosec

# Run the workflow each time code is pushed to the repository.
on:
push:

jobs:
tests:
runs-on: ubuntu-latest
permissions:
security-events: write
actions: read
contents: read
env:
GO111MODULE: on
steps:
- name: Checkout Source
uses: actions/checkout@v4
- name: Run Gosec Security Scanner
uses: securego/gosec@master
with:
# -no-fail keeps the step green; findings are surfaced through the GitHub Security features instead.
args: "-no-fail -fmt sarif -out results.sarif ./..."
- uses: reviewdog/action-setup@v1
- name: Run reviewdog
run: |
reviewdog -f=sarif -name=gosec < results.sarif
25 changes: 25 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,25 @@
name: lint

on:
push:
schedule:
- cron: "0 0 * * *"
workflow_dispatch:

jobs:
golangci:
name: lint
permissions:
security-events: write
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
- name: Set up Go
uses: actions/setup-go@0c52d547c9bc32b1aa3301fd7a9cb496313a4491 # v5.0.0
with:
go-version-file: "go.mod"
- run: sudo apt-get update && sudo apt-get install -y libpcap0.8 libpcap0.8-dev
- name: golangci-lint
uses: golangci/golangci-lint-action@3cfe3a4abbb849e10058ce4af15d205b6da42804 # v4.0.0
with:
args: --timeout=3m
101 changes: 101 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,101 @@
name: publish

on:
push:

env:
TAG_NAME: nydus:${{ github.sha }}
BUILD_VERSION: ${{ github.sha }}
GITHUB_IMAGE_REPO: ghcr.io/${{ github.repository_owner }}/nydus
GITHUB_IMAGE_NAME: ghcr.io/${{ github.repository_owner }}/nydus:${{ github.sha }}

jobs:
build:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write

steps:
- name: checkout
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

- name: Go Build Cache for Docker
uses: actions/cache@v3
with:
path: go-build-cache
key: ${{ runner.os }}-go-build-cache-${{ hashFiles('go.sum') }}

- name: inject go-build-cache into docker
# v1 was composed of two actions: "inject" and "extract".
# v2 is unified to a single action.
uses: reproducible-containers/[email protected]
with:
cache-source: go-build-cache

- name: Set up Docker buildx
uses: docker/setup-buildx-action@f95db51fddba0c2d1ec667646a06c2ce06100226 # v3.0.0
- name: Login to GitHub Container Registry
uses: docker/login-action@343f7c4344506bcbf9b4de18042ae17996df046d # v3.0.0
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Get the tag or commit id
id: version
run: |
if [[ $GITHUB_REF == refs/tags/* ]]; then
# If a tag is present, strip the 'refs/tags/' prefix
TAG_OR_COMMIT=$(echo $GITHUB_REF | sed 's/refs\/tags\///')
echo "This is a tag: $TAG_OR_COMMIT"
else
# If no tag is present, use the commit SHA
TAG_OR_COMMIT=$(echo $GITHUB_SHA)
echo "This is a commit SHA: $TAG_OR_COMMIT"
fi
# Set the variable for use in other steps
echo "TAG_OR_COMMIT=$TAG_OR_COMMIT" >> $GITHUB_OUTPUT
shell: bash

- name: Build and push
uses: docker/build-push-action@4a13e500e55cf31b7a5d59a38ab2040ab0f42f56 # v5.1.0
with:
context: .
push: true
tags: ${{ env.GITHUB_IMAGE_NAME }}
build-args: |
BUILD_VERSION=${{ steps.version.outputs.TAG_OR_COMMIT }}
cache-from: type=gha
cache-to: type=gha,mode=max
platforms: linux/amd64

release-ghcr:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
needs: build
if: startsWith(github.ref, 'refs/tags/')
steps:
- name: extract tag
id: tag
run: |
TAG=$(echo ${{ github.ref }} | sed -e "s#refs/tags/##g")
echo "tag=$TAG" >> $GITHUB_OUTPUT
- name: Login to GitHub Container Registry
uses: docker/login-action@343f7c4344506bcbf9b4de18042ae17996df046d # v3.0.0
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Pull Docker image
run: docker pull ${{ env.GITHUB_IMAGE_NAME }}
- name: Rename Docker image (tag name)
run: docker tag ${{ env.GITHUB_IMAGE_NAME }} "${{ env.GITHUB_IMAGE_REPO }}:${{ steps.tag.outputs.tag }}"
- name: Rename Docker image (latest)
run: docker tag ${{ env.GITHUB_IMAGE_NAME }} "${{ env.GITHUB_IMAGE_REPO }}:latest"
- name: Push Docker image (tag name)
run: docker push "${{ env.GITHUB_IMAGE_REPO }}:${{ steps.tag.outputs.tag }}"
- name: Push Docker image (latest)
run: docker push "${{ env.GITHUB_IMAGE_REPO }}:latest"
21 changes: 21 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,21 @@
name: test

on:
push:
schedule:
- cron: "0 0 * * *"
workflow_dispatch:

jobs:
testing:
runs-on: ubuntu-latest

steps:
- name: Checkout upstream repo
uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0
with:
ref: ${{ github.head_ref }}
- uses: actions/setup-go@6edd4406fa81c3da01a34fa6f6343087c207a568 # v3.5.0
with:
go-version-file: "go.mod"
- run: go test ./...
37 changes: 37 additions & 0 deletions .github/workflows/trivy.yml
@@ -0,0 +1,37 @@
name: trivy

on:
push:
schedule:
- cron: "0 0 * * *"
workflow_dispatch:

jobs:
scan:
runs-on: ubuntu-latest
permissions:
security-events: write
actions: read
contents: read

steps:
- name: Checkout upstream repo
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
with:
ref: ${{ github.head_ref }}
- id: scan
name: Run Trivy vulnerability scanner in repo mode
uses: aquasecurity/trivy-action@f3d98514b056d8c71a3552e8328c225bc7f6f353 # master
with:
scan-type: "fs"
ignore-unfixed: true
format: "template"
template: "@/contrib/sarif.tpl"
output: "trivy-results.sarif"
exit-code: 1

- name: Upload Trivy scan results to GitHub Security tab
if: failure() && steps.scan.outcome == 'failure'
uses: github/codeql-action/upload-sarif@e8893c57a1f3a2b659b6b55564fdfdbbd2982911 # v3.24.0
with:
sarif_file: "trivy-results.sarif"
2 changes: 2 additions & 0 deletions .gitignore
@@ -23,3 +23,5 @@ go.work.sum

# env file
.env

tmp
18 changes: 18 additions & 0 deletions Dockerfile
@@ -0,0 +1,18 @@
FROM golang:1.23 AS build-go
ENV CGO_ENABLED=0
ARG BUILD_VERSION

WORKDIR /app
RUN go env -w GOMODCACHE=/root/.cache/go-build

COPY go.mod go.sum ./
RUN --mount=type=cache,target=/root/.cache/go-build go mod download

COPY . /app
RUN --mount=type=cache,target=/root/.cache/go-build go build -o nydus -ldflags "-X github.com/m-mizutani/nydus/pkg/domain/types.AppVersion=${BUILD_VERSION}" .

FROM gcr.io/distroless/base:nonroot
USER nonroot
COPY --from=build-go /app/nydus /nydus

ENTRYPOINT ["/nydus"]
131 changes: 129 additions & 2 deletions README.md
@@ -1,2 +1,129 @@
# locust
Event-driven object data transfer tool
# Nydus

Cross-Cloud Platform Tool for Event-Driven Object Data Transfer.

![overview](https://github.com/user-attachments/assets/514b04ce-7ca7-4f68-830f-b94ca54f1d87)

`nydus` copies object data between cloud storage services in an event-driven manner. It receives notifications from cloud storage services and transfers objects between them: when an object is created, updated, or otherwise acted on in a source storage service, `nydus` automatically transfers it to the destination storage service.

The name "nydus" comes from the [Nydus Network](https://starcraft.fandom.com/wiki/Nydus_network) in StarCraft, which is a network of tunnels that allows units to travel between locations.

## Use Cases

- **Backup data from one cloud storage service to another**. For example, copying backup data of a critical business database into another cloud storage service for disaster recovery.
- **Centralized data management**. For example, copying data from multiple cloud storage services into a single cloud storage service for centralized data management. Some services can dump data into a specific cloud storage service, and `nydus` can transfer the data to the centralized cloud storage service.

## How it works

`nydus` is an HTTP server that listens for events from cloud storage services. When an event is received, `nydus` transfers the object data from the source storage service to the destination storage service.

Overview of the data transfer process:

1. `nydus` listens for events from the source storage service as an HTTP server.
   - Amazon S3 can send events via SNS (Simple Notification Service).
   - Google Cloud Storage can send events via Pub/Sub.
   - Azure Blob Storage can send events via Event Grid.
2. When an event is received, `nydus` parses the event data and evaluates it with a [Rego](https://www.openpolicyagent.org/docs/latest/policy-language/) policy.
3. If the result contains a route that describes the destination storage service, `nydus` transfers the object data to that destination.

## Getting Started

### Prerequisites

- For Google Cloud: You need a service account with access to Google Cloud Storage.
- For Azure: You need an app registration with access to Azure Blob Storage.
- For AWS: You need an IAM user or role with access to Amazon S3.

### Write a Rego policy

Write a Rego policy that describes the routing rules for the object data transfer. The policy should return the destination storage service and the destination bucket name.

Here is an example of a Rego policy that routes object data from Azure Blob Storage to Google Cloud Storage:

```rego
package route

gcs[dst] {
dst := {
"bucket": "my-backup-bucket",
"name": sprintf("from-azure/%s/%s/%s", [
input.abs.object.storage_account,
input.abs.object.container,
input.abs.object.blob_name,
]),
}
}
```

See [How to write a Rego policy](#how-to-write-a-rego-policy) for more details.

### Creating your container image

Create a container image that contains the Rego policy and the `nydus` binary. A prebuilt Docker image containing the `nydus` binary is available from the GitHub Container Registry; you can use it as a base image and copy your Rego policy into it.

```Dockerfile
FROM ghcr.io/secmon-as-code/nydus:latest

# It assumes that the Rego policy is in the "policy" directory.
COPY policy /policy

ENV NYDUS_POLICY_DIR=/policy
ENV NYDUS_ADDR=:8080

ENTRYPOINT ["/nydus", "serve"]
```

The `nydus` binary is located at `/nydus` in the container image. The Rego policy should be copied to the `/policy` directory in the container image.

Environment variables for the `nydus` binary:

- `NYDUS_POLICY_DIR` (required): The directory that contains the Rego policy files.
- `NYDUS_ADDR` (optional): The address that `nydus` listens on. The default is `127.0.0.1:8080`. To listen on all interfaces, set an exposed binding address such as `:8080`.
- `NYDUS_LOG_LEVEL` (optional): The log level of `nydus`. The default is `info`.
- `NYDUS_LOG_FORMAT` (optional): The log format of `nydus`, either `console` or `json`. The default is `json`.
- `NYDUS_ENABLE_GCS` (optional): Enable the Google Cloud Storage client, which is required for both downloading and uploading objects. The default is `false`. The following environment variable applies when `NYDUS_ENABLE_GCS` is `true`:
  - `NYDUS_GCS_CREDENTIAL_FILE` (optional): The path to a Google Cloud service account credential file. It is usually not needed when the application runs on Google Cloud Platform.
- `NYDUS_ENABLE_AZURE` (optional): Enable the Azure Blob Storage client, which is required for both downloading and uploading objects. The default is `false`. The following environment variables are required when `NYDUS_ENABLE_AZURE` is `true`:
  - `NYDUS_AZURE_TENANT_ID` (required): The Azure Tenant ID.
  - `NYDUS_AZURE_CLIENT_ID` (required): The Azure Client ID for the App.
  - `NYDUS_AZURE_CLIENT_SECRET` (required): The Azure Client Secret for the App.
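As a sketch of how these variables fit together, an image built from the example Dockerfile above might be run locally like this. The image name and all Azure credential values are placeholders; `NYDUS_POLICY_DIR` and `NYDUS_ADDR` are already set in that Dockerfile, so they are omitted here:

```shell
# Run a locally built nydus image with both storage clients enabled.
# my-nydus-image and the Azure credential values are placeholders.
docker run --rm -p 8080:8080 \
  -e NYDUS_LOG_LEVEL=debug \
  -e NYDUS_LOG_FORMAT=console \
  -e NYDUS_ENABLE_GCS=true \
  -e NYDUS_ENABLE_AZURE=true \
  -e NYDUS_AZURE_TENANT_ID=<tenant-id> \
  -e NYDUS_AZURE_CLIENT_ID=<client-id> \
  -e NYDUS_AZURE_CLIENT_SECRET=<client-secret> \
  my-nydus-image:latest
```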

### Deploy your container image

Deploy the container image to your container platform, such as Kubernetes, Docker, or any other container platform. We recommend using [Cloud Run](https://cloud.google.com/run?hl=en) on Google Cloud Platform, as it is a serverless container platform that can scale automatically.

## How to write a Rego policy

Please refer to the [Open Policy Agent documentation](https://www.openpolicyagent.org/docs/latest/policy-language/) for more details about the Rego policy language.

### Rego package name

The Rego policy should return the destination storage service information, such as the destination bucket name and the object path in the destination bucket. The policy must be written in the `route` package; that is, the policy file should start with `package route`.

### Input data

The input data for the Rego policy is the event data from the source storage service. The event data is parsed by `nydus` and passed to the Rego policy as the `input` variable.

The `input` variable has the following structure:

- `abs`: The abstracted event data that is common to all cloud storage services.
- `object`: The object data.
- `storage_account`: The storage account name.
- `container`: The container name.
- `blob_name`: The blob name.
- `size`: The object size.
- `content_type`: The object content type.
- `etag`: The object ETag.
- `event`: This field contains the original Azure Event Grid notification data. See the [Azure Event Grid schema](https://docs.microsoft.com/en-us/azure/event-grid/event-schema-blob-storage?tabs=event-grid) for more details.
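For illustration, an `input` document for an Azure Blob Storage event might look like the following. All values are hypothetical, and whether `event` sits beside `abs` or inside it is an assumption of this sketch; only the field names follow the list above:

```json
{
  "abs": {
    "object": {
      "storage_account": "mystorageaccount",
      "container": "backups",
      "blob_name": "db/2024-06-01.dump",
      "size": 1048576,
      "content_type": "application/octet-stream",
      "etag": "0x8DBB5C2F1A2B3C4"
    }
  },
  "event": { "note": "original Event Grid notification data goes here" }
}
```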

### Output data

The Rego policy should return the destination storage service information as a set. The set should contain the destination bucket name and the object path in the destination bucket.

- `gcs`: The destination storage service is Google Cloud Storage. The variable must be a [set](https://www.openpolicyagent.org/docs/latest/policy-language/#sets) whose elements contain the following fields:
- `bucket`: The destination bucket name.
- `name`: The object path in the destination bucket.
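Combining the input and output references above, a policy can also route conditionally. The sketch below copies only Azure blobs larger than 1 MiB; the bucket name and size threshold are illustrative values, not part of `nydus`:

```rego
package route

# Route only blobs larger than 1 MiB; smaller objects produce no
# route and are therefore not transferred.
gcs[dst] {
	input.abs.object.size > 1048576
	dst := {
		"bucket": "my-archive-bucket",
		"name": sprintf("from-azure/%s/%s", [
			input.abs.object.container,
			input.abs.object.blob_name,
		]),
	}
}
```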

## License

Apache License 2.0