eniac/quilt

Quilt is a serverless optimizer that automatically merges workflows that consist of many functions (possibly in different languages) into one process, thereby avoiding high invocation latency, communication overhead, and long chains of cold starts.


Pre-requisites

To reproduce our results you will need: (1) a computing cluster with enough machines/CPUs, where every node can run sudo without being prompted for a password; (2) a local machine that has SSH access to every node in the computing cluster; and (3) a DockerHub account.
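Before proceeding, a quick sanity check for requirements (1) and (2) from the local machine may help (a minimal sketch; the node addresses below are placeholders for your own cluster):

# replace with your cluster's SSH strings
for node in ubuntu@node1 ubuntu@node2 ubuntu@node3; do
  # 'sudo -n' fails instead of prompting, so this verifies passwordless sudo
  ssh "$node" 'sudo -n true' && echo "$node: OK" || echo "$node: FAILED"
done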

Computing cluster

The cluster that we used to evaluate Quilt in the paper consists of 6 machines.

  • 3 machines (128-core Intel Xeon Platinum 8253 with 2 TB RAM) run all the Fission serverless functions;
  • 1 machine (20-core Intel Xeon E5-2680 with 500 GB RAM) runs the API Gateway, Fission serverless runtime, cAdvisor, and the trace collector;
  • 1 machine (8-core Intel Xeon Gold 6334 with 64 GB RAM) runs Tempo, InfluxDB, KeyDB, and memcached;
  • 1 machine (8-core AMD EPYC 72F3 with 64 GB RAM) runs the workload generator and also serves as the provisioning node.

All machines run Ubuntu 24.04.2 LTS (Linux 6.8.0) and are connected to a 1 Gbps network with ≈200 𝜇s RTTs.

What if I don't have the same machines?

In most cases, it'll be challenging to find exactly the same machines to reproduce our setup. Fortunately, this is not necessary!

The main requirement is that you need to have:

  • 1 machine to run the provisioning node and workload generator. This does not need to be a big machine.
  • 1 machine to run the serverless and profiling runtime (API Gateway, Fission serverless runtime, cAdvisor, and the trace collector). This does not need to be a big machine.
  • 1 machine to run the storage (Tempo, InfluxDB, KeyDB, and memcached). This does not need to be a big machine.
  • X machines to run the serverless functions. Here, the requirement is that you have enough total cores (~350 vCPUs) across the X machines.

For example, you could get X = 9 machines with 40 cores each, or X = 18 machines with 20 cores each. You can usually get these from Cloudlab.

Now that you have a "similar" cluster, you can proceed to the configuration step.

Configuration

To keep things simple, our strategy is to (1) write all configuration on a local machine, which then installs all dependencies, SSH keys, and so on across all nodes, and then (2) deploy Kubernetes and the serverless runtime from the provisioning node.

Let us start with the local machine.

Actions to be performed in your local machine

Please ensure that your local machine has SSH access to every machine in the remote cluster before proceeding.

git clone https://github.com/eniac/quilt.git
cd quilt
  • Edit setup/prerequisite/machines.json (an illustrative example appears below):

    • Update the names of the cluster nodes.
    • Update the ssh_str and ip fields with your cluster nodes' SSH strings and IP addresses.
  • Edit setup/serverless_runtime/machine.json:

    • Update the information of cluster nodes to be consistent with the above.
    • Note: The name field must match the output of the hostname command on the corresponding machine.
  • Edit setup/prerequisite/build.sh

    • Modify $USER
    • Modify the line that contains ALL_ENGINE_NODES="machine1 machine2 machine3 machine4 machine5 machine6" to the node names specified in machines.json
  • Generate a new SSH private and public key (quilt_key / quilt_key.pub):

ssh-keygen -t ed25519 -f ~/.ssh/quilt_key

These SSH keys will be used by k3sup to deploy Kubernetes from the provisioning node to the other nodes. build.sh will copy these keys to all machines and also add the public key to ~/.ssh/authorized_keys on all machines.
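For reference, a single node entry in machines.json might look like the following (an illustrative sketch only: the name, ssh_str, and ip fields come from the instructions above, but the surrounding structure is defined by the file shipped in the repo, so mirror that file rather than this sketch):

{
  "name": "machine1",
  "ssh_str": "ubuntu@10.0.0.1",
  "ip": "10.0.0.1"
}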

At this point, all configuration is done. We can now copy the code and install all dependencies on every machine in the cluster.

cd quilt/setup/prerequisite
./build.sh
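Once build.sh finishes, you can verify that the generated key works against each node (a sketch; substitute your own node address). Key-based login must succeed without a password prompt, since k3sup relies on it during deployment:

ssh -i ~/.ssh/quilt_key ubuntu@node1 hostname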

Actions to be performed in the provisioning node

Recall that you designated one of your machines as the provisioning node. This node will not run any Kubernetes pods, but will be used to

  • deploy Kubernetes and the serverless runtime to the other nodes
  • run the workload generator

SSH into the provisioning node, and then run:

ROOT_DIR=$(pwd)
git clone https://github.com/eniac/quilt.git 
echo "export KUBECONFIG=$ROOT_DIR/quilt/setup/serverless_runtime/kubeconfig" >> ~/.bashrc
source ~/.bashrc
  • Navigate to quilt/setup/serverless_runtime

  • Install the Kubernetes cluster and serverless runtime:

./install.sh build
  • If you want to shut down Kubernetes and the serverless runtime after you finish all experiments, run:
./install.sh kill
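After ./install.sh build completes, a quick sanity check from the provisioning node can confirm the deployment (standard kubectl commands; the fission-function namespace is the one used later in these instructions):

# all nodes should report Ready
kubectl get nodes
# the serverless runtime's function namespace should exist once Fission is up
kubectl get pods -n fission-function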

The following steps should also be performed in the provisioning node.

Build the Compiler Docker image

First, you need a DockerHub account.

Then, run the following:

echo "export DOCKER_USER={your dockerhub username}" >> ~/.bashrc
source ~/.bashrc
cd quilt/dockerfiles/LLVM/llvm-19
sudo docker login
./build.sh llvm

Build the container environment and Rust environment for the Fission function images:

cd quilt/dockerfiles/Fission/container-based/fission-env
./build.sh
cd quilt/dockerfiles/Env/rust_env
./build.sh
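To confirm the environment images were built, you can list them (a sketch, assuming the build scripts tag images under your $DOCKER_USER account):

sudo docker images | grep "$DOCKER_USER"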

Instructions for building the serverless function images

Build Function Images

  • The structure of function images in the benchmark directory

    • DeathStarBench directory: original function code
    • DeathStarBench_fakedb directory: function code with fake DB accesses.
    • DeathStarBench_ContainerMerged_fakedb directory: function code with fake DB accesses for container-based merging.
    • Each directory contains 5 (or 6) applications.
      • The social-network and media-microservice applications each have both asynchronous and synchronous function invocation versions.
  • An example of building all function images in the original social network app

cd quilt/benchmark/DeathStarBench/social_network/functions
./build.sh build_fission_c

Build Merged Workflow Images

  • An example of merging a workflow
cd quilt/benchmark/DeathStarBench/social_network/functions/merge

# we use {app name}-{root function name}-merged as the container name
# e.g., sn-compose-post-merged
# asynchronous workflow automatically has `-async` after function name 
# e.g., mm-compose-review-async-merged
# to merge functions within a workflow:
# ./build.sh merge_fission <root function name> <workflow file>
./build.sh merge_fission compose-post funcTrees/funcTree.compose_post
  • In the last command:
    • the workflow file funcTree.compose_post specifies how functions are connected within a workflow.
      • The other workflow files can be found in the funcTrees directory.
    • The root function name serves as the entry point of the workflow and is used to name the merged workflow image.
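Per the naming convention above, merging the social-network compose-post workflow should produce an image named sn-compose-post-merged; a quick way to confirm it exists (a sketch, assuming the image is tagged with that name):

sudo docker images | grep compose-post-merged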

Run our experiments

Run the following in the provisioning node.

Build wrk2

cd quilt/test/wrk2_fission && make

Figure 6 experiment

  • note: the measurement scripts retain an older figure numbering (e.g., figure7.sh here corresponds to Figure 6)
  • the value of each bar can be tested separately
  • the following script measures the 50th percentile latency of the baseline compose-post async workflow
  • the result is recorded in output_compose-post-async_1.log
# First, build the function images for the Social Network benchmark
# (within the DeathStarBench directory)
cd quilt/benchmark/DeathStarBench/social_network_async/functions
./build.sh build_fission_c

# Then, run the script to measure 50th percentile latency
cd quilt/test/wrk2_fission/social_network
./figure7.sh perf compose-post async
  • the following script measures the 50th percentile latency of the merged compose-review sync workflow
  • the result is recorded in output_compose-review-merged-sync_1.log
# build the function images for the merged workflow in Media
# Microservice benchmark (within the DeathStarBench directory)
cd quilt/benchmark/DeathStarBench/media_microservice/merge
./build.sh merge_fission compose-review funcTrees/funcTree.compose_review

# run the script to measure 50th percentile latency
cd quilt/test/wrk2_fission/media_microservice
./figure7.sh perf compose-review-merged sync
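As noted above, each run records its result in a log file named after the workflow, so from the corresponding quilt/test/wrk2_fission subdirectory the two measurements can be inspected with, e.g.:

cat output_compose-post-async_1.log
cat output_compose-review-merged-sync_1.log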

Figure 7(a)(b) experiment

  • Each curve can be tested separately
  • For Figure 7 and Figure 8(a), if a curve does not exhibit the expected trend, you may try smaller connection numbers in the corresponding script. For example, consider lowering the connection counts in figure8ab.sh if the throughput vs. latency curve for social_network does not follow the expected shape.

To test the baseline curve

# must rebuild the function images in the Social Network
# benchmark (within the DeathStarBench_fakedb directory)
cd quilt/benchmark/DeathStarBench_fakedb/social_network/functions
# for 7(b), the above command should be
# `cd quilt/benchmark/DeathStarBench_fakedb/social_network_async/functions`
./build.sh build_fission_c

# run the script to measure 50th percentile latency
cd quilt/test/wrk2_fission/social_network
./figure8ab.sh perf compose-post sync
# for 7(b), the above command should be
# `./figure8ab.sh perf compose-post async`

# collect the throughput and latency data
./getlattput.py

To test the quilt curve

# First, rebuild the function image for the merged workflow in the Social Network
# benchmark (within the DeathStarBench_fakedb directory)
cd quilt/benchmark/DeathStarBench_fakedb/social_network/merge
# for 7(b), the above command should be
# `cd quilt/benchmark/DeathStarBench_fakedb/social_network_async/merge`
./build.sh merge_fission compose-post funcTrees/funcTree.compose_post

# Then, run the script to measure 50th percentile latency
cd quilt/test/wrk2_fission/social_network
./figure8ab.sh perf compose-post-merged sync

# collect the throughput and latency data
./getlattput.py

To test the container-based merge curve

  • The following code tests the sync container-based merge curve.
  • To get the async container-based merge curve (see the sketch after the commands below):
    • in the following commands, replace python3 gen_func.py social_network compose-post with python3 gen_func.py social_network_async compose-post
    • run the commands again
# Build the internal API gateway
cd quilt/benchmark/DeathStarBench_ContainerMerged_fakedb/apiGateway_go/go_server && ./build.sh

# Generate function binaries
cd quilt/benchmark/DeathStarBench_ContainerMerged_fakedb/apiGateway_go
python3 gen_func.py social_network compose-post 

# Build the merged function container
./build.sh build compose-post

# Then, run the script to measure 50th percentile latency
cd quilt/test/wrk2_fission/social_network
./figure8ab_cm.sh perf container-merge-compose-post

# collect the throughput and latency data
./getlattput.py
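For the async container-based merge curve, per the substitution above, the generation step becomes (the other steps are unchanged):

cd quilt/benchmark/DeathStarBench_ContainerMerged_fakedb/apiGateway_go
python3 gen_func.py social_network_async compose-post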

Figure 7(c) experiment

To test the baseline curve

# Build the baseline function images
cd quilt/benchmark/DeathStarBench_fakedb/hotel_reservation_async/functions
./build.sh build_fission_c

# run the script to measure throughput and latency
cd quilt/test/wrk2_fission/hotel_reservation
./figure8c.sh perf nearby-cinema-top

# collect the throughput and latency data
./getlattput.py

To test the merge-all curve

# Build the merged workflow image
cd quilt/benchmark/DeathStarBench_fakedb/hotel_reservation_async/merge
./build.sh merge_fission nearby-cinema-top funcTrees/funcTree.nearby-cinema-top

# run the script to measure throughput and latency
./figure8c.sh perf nearby-cinema-top-merged merged

# collect the throughput and latency data
./getlattput.py

To test the merge-into-2 curve

# Build the merged workflow images
cd quilt/benchmark/DeathStarBench_fakedb/hotel_reservation_async/merge
./build.sh merge_fission nearby-cinema-top-2 funcTrees/funcTree.nearby-cinema-top-2
./build.sh merge_fission nearby-cinema-parallel-2 funcTrees/funcTree.nearby-cinema-parallel-2

# run the script to measure throughput and latency
./figure8c.sh perf nearby-cinema-top-2 merged

# collect the throughput and latency data
./getlattput.py

Figure 8(a) experiment

  • run the following commands to get one noop curve with the nginx ingress controller enabled.
# enable ingress
kubectl -n fission-function create secret generic tracing --from-literal=ingress-enable="true"

# make sure noop function is built
cd quilt/benchmark/DeathStarBench/social_network/functions/noop
./build.sh fission_c && ./build.sh push

# run the script to measure 50th percentile latency
cd quilt/test/wrk2_fission/social_network
./figure9a.sh perf noop sync
  • to get the other noop curve, turn off the ingress controller by setting ingress-enable="false" in the first kubectl command (one way is sketched below) and rerun the measurement
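A sketch of flipping the flag (Kubernetes will not create a secret that already exists, so delete it first):

# disable ingress, then rerun the measurement
kubectl -n fission-function delete secret tracing
kubectl -n fission-function create secret generic tracing --from-literal=ingress-enable="false"
./figure9a.sh perf noop sync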

Figure 8(c) experiment

# First, navigate to the application directory corresponding to the workflow
# under the DeathStarBench_fakedb directory
cd quilt/benchmark/DeathStarBench_fakedb/social_network/merge
./build.sh merge_fission compose-post funcTrees/funcTree.compose_post
  • Docker will automatically report the timings during the build.
    • compilation time is shown at the end of the RUN ./merge_tree.py compile funcTree line
    • merging time is shown at the end of the RUN ./merge_tree.py merge funcTree line
    • linking time is shown at the end of the RUN ./merge_tree.py link funcTree line
  • For more results, repeat this with the other workflows (one example is sketched below; the workflow files are in each application's funcTrees directory)
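For example, assuming media_microservice follows the same layout under the DeathStarBench_fakedb directory, the compose-review workflow used earlier can be timed the same way:

cd quilt/benchmark/DeathStarBench_fakedb/media_microservice/merge
./build.sh merge_fission compose-review funcTrees/funcTree.compose_review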

Figure 8(b) and 9 experiment

  • See the merge_solver subdirectory for instructions on how to reproduce the merge solver's results.

Figure 10 experiment

  • Build LLVM-17 compiler image and fission-env for C/C++
cd quilt/dockerfiles/LLVM/llvm-17
./build.sh llvm
cd quilt/dockerfiles/Env/c-env
./build.sh
  • Generate Function Images and Deploy Functions
    • make sure you have Fission successfully set up
    • the number 6 after ./build.sh merge is the merge threshold; you can set it to other values.
# build the caller image and deploy the function
cd quilt/merge_func/merge-c-fanout/example/caller \
  && ./build.sh build && ./build.sh deploy
# build the callee image and deploy the function
cd quilt/merge_func/merge-c-fanout/example/callee \
  && ./build.sh build && ./build.sh deploy
# build the merged images (both conditional and no-conditional versions are built)
# and deploy the functions
cd quilt/merge_func/merge-c-fanout/example/merge_script \
  && ./build.sh merge 6 && ./build.sh deploy
  • Run wrk2 tests
    • the number at the end of each command (e.g., 5) is the fanout the client requests from the caller
cd quilt/test/wrk2_fission/c_fanout
# measure baseline performance
./test_fanout.sh baseline 5
./test_fanout.sh baseline 13
# measure Quilt performance
./test_fanout.sh fanout 5
./test_fanout.sh fanout 13
# measure Quilt (no conditional) performance - we set the threshold to 9999
./test_fanout.sh fanout-no-cond 5
./test_fanout.sh fanout-no-cond 13

Example of merging functions in different languages

Quilt can merge functions written in different languages. We give examples here in case you are interested; we did not report any cross-language experiments in our evaluation, so there is nothing to reproduce.

Build llvm-17 docker image and Fission multi-language environment

> cd quilt/dockerfiles/LLVM/llvm-17
> ./build.sh llvm
> cd quilt/dockerfiles/Fission/container-based/fission-multi-lang-env
> ./build.sh

Build caller, callee and merged function image

  • replace caller_language in the following script with one of c, rust, and swift
  • replace callee_language in the following script with one of c, rust, and swift
  • caller and callee should be in different languages.
# build and deploy caller
> cd quilt/merge_func/merge-{caller_language}-and-{callee_language}/example/caller
> ./build.sh build && ./build.sh deploy
# build and deploy callee
> cd quilt/merge_func/merge-{caller_language}-and-{callee_language}/example/callee
> ./build.sh build && ./build.sh deploy
# build merged function image
> cd quilt/merge_func/merge-{caller_language}-and-{callee_language}/example/merge_script
> ./build.sh merge && ./build.sh deploy

Invoke functions

# invoke original workflow
> cd quilt/merge_func/merge-{caller_language}-and-{callee_language}/example/caller
> ./build.sh invoke
# invoke merged workflow
> cd quilt/merge_func/merge-{caller_language}-and-{callee_language}/example/merge_script
> ./build.sh invoke
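For instance, with a C caller and a Rust callee, the template instantiates as follows (assuming the directory naming follows the merge-{caller_language}-and-{callee_language} pattern above):

> cd quilt/merge_func/merge-c-and-rust/example/merge_script
> ./build.sh merge && ./build.sh deploy
> ./build.sh invoke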
