Run Uniflow Pipeline on Compute Cluster
Run a Uniflow pipeline that schedules Ray jobs on the compute Kubernetes cluster michelangelo-compute-0.
Prerequisites
- Repository: Local checkout with $REPOROOT pointing to the repo root
- Tooling: poetry, docker, k3d
- Storage: Access to s3://default (or update the --storage-url)
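The prerequisites can be checked up front with a short script. This is a minimal sketch, not part of the repository: the helper name missing_tools is hypothetical, and the only layout it assumes is that $REPOROOT contains the python directory used in the first step of the procedure.

```shell
# Print each of the given commands that is not on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools poetry docker k3d)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found"
fi

# REPOROOT should point at the repo root, which holds the python workspace.
if [ -n "${REPOROOT:-}" ] && [ -d "$REPOROOT/python" ]; then
  echo "REPOROOT looks valid: $REPOROOT"
else
  echo "REPOROOT is unset or has no python/ directory"
fi
```

Access to the storage URL is not checked here, because the right check depends on how s3://default is served in your sandbox.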
Procedure
- Change to the Python workspace:
cd $REPOROOT/python
- Create the Michelangelo sandbox and compute cluster:
poetry run ma sandbox create --create-compute-cluster
Note: This provisions a local k3d cluster named michelangelo-compute-0.
- Build the example image:
docker build -t examples:latest -f ./examples/Dockerfile .
- Import the image into the k3d cluster:
k3d image import examples:latest -c michelangelo-compute-0
- Launch the pipeline (remote run) against the compute cluster:
PYTHONPATH=. poetry run python ./examples/bert_cola/bert_cola.py remote-run \
--image docker.io/library/examples:latest \
--storage-url s3://default \
--yes
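The image is built as examples:latest but passed to --image as docker.io/library/examples:latest. That is the fully qualified form Docker uses for a short name with no registry or namespace. The sketch below reproduces that expansion and then looks for the image inside the cluster node. It rests on assumptions that the source does not state: that k3d names the first server node container k3d-<cluster>-server-0, and that the node image ships crictl. The helper normalize_image is hypothetical and covers only the common cases.

```shell
CLUSTER=michelangelo-compute-0

# Expand a short image name the way Docker does for Docker Hub images:
# examples:latest -> docker.io/library/examples:latest
normalize_image() {
  ref=$1
  case "$ref" in
    */*) ;;                     # already has a namespace or registry
    *) ref="library/$ref" ;;    # bare name: official-image namespace
  esac
  first=${ref%%/*}
  case "$first" in
    *.*|*:*|localhost) ;;       # first component looks like a registry host
    *) ref="docker.io/$ref" ;;
  esac
  case "${ref##*/}" in
    *:*) ;;                     # tag (or digest) already present
    *) ref="$ref:latest" ;;
  esac
  printf '%s\n' "$ref"
}

full_ref=$(normalize_image examples:latest)
echo "Pass to --image: $full_ref"

# Assumption: first server node container is k3d-<cluster>-server-0.
node="k3d-$CLUSTER-server-0"
if command -v docker >/dev/null 2>&1; then
  # crictl lists repository and tag in separate columns, so match on the
  # repository part only.
  if docker exec "$node" crictl images 2>/dev/null | grep -F "${full_ref%:*}"; then
    echo "Image found in $CLUSTER"
  else
    echo "Image not found in $CLUSTER; re-run the k3d image import step"
  fi
else
  echo "docker not found; skipping the import check"
fi
```

If the image is reported missing, the pipeline's pods will likely fail to pull it, since a locally built image exists only where it has been imported.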
Outcome:
- Sandbox and compute K8s cluster are created
- The examples:latest image is available in michelangelo-compute-0
- The bert_cola pipeline runs Ray jobs on the compute cluster
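One way to confirm the last outcome is to list pods on the compute cluster while the pipeline runs. This is a sketch under assumptions: kubectl is not among the listed prerequisites, the context name follows k3d's k3d-<cluster> convention, and the sandbox is assumed to have written that context to your kubeconfig. The namespace used for Ray jobs is not given in the source, so all namespaces are listed.

```shell
CLUSTER=michelangelo-compute-0

# k3d registers kubeconfig contexts as k3d-<cluster>.
kube_context() {
  printf 'k3d-%s\n' "$1"
}

if command -v kubectl >/dev/null 2>&1; then
  kubectl --context "$(kube_context "$CLUSTER")" get pods --all-namespaces \
    || echo "Could not list pods; check that the context exists"
else
  echo "kubectl not found; skipping the pod check"
fi
```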