## Slicer Cloud Provider for Cluster Autoscaler

The cluster autoscaler for [Slicer](https://slicervm.com/) scales nodes using lightweight Slicer microVMs.

The architecture is as follows:

* Slicer runs a K3s control plane on one virtualisation host.
* Slicer runs all agents on one or more additional virtualisation hosts running the Slicer REST API. Each host group starts with zero microVMs, relying on the cluster autoscaler to add new ones when Pods cannot be scheduled on the existing set of nodes.

Check the documentation on [SlicerVM.com](https://docs.slicervm.com/examples/autoscaling-k3s/) for instructions on how to set up the cluster-autoscaler with Slicer.

## Configuration

The cluster-autoscaler for Slicer requires a configuration file, passed via the `--cloud-config` parameter. It is an INI file with the following fields:

| Key | Value | Mandatory | Default |
|-----|-------|-----------|---------|
| `global/k3s-url` | The URL of the K3s control plane API server | yes | none |
| `global/k3s-token` | The K3s join token for adding new agent nodes | yes | none |
| `global/default-min-size` | Default minimum size of a node group (must be > 0) | no | 1 |
| `global/default-max-size` | Default maximum size of a node group | no | 8 |
| `nodegroup "slicer_host_group_name"/slicer-url` | The URL of the Slicer API server for this node group | yes | none |
| `nodegroup "slicer_host_group_name"/slicer-token` | The authentication token for the Slicer API server | yes | none |
| `nodegroup "slicer_host_group_name"/min-size` | Minimum size for a specific node group | no | `global/default-min-size` |
| `nodegroup "slicer_host_group_name"/max-size` | Maximum size for a specific node group | no | `global/default-max-size` |

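As a sketch, a minimal `cloud-config.ini` with a single node group might look like the following. The URLs, token values, and the `workers` group name are placeholders, not values from the Slicer docs:

```ini
[global]
k3s-url = https://192.168.1.10:6443
k3s-token = K10abc123::server:xyz
default-min-size = 1
default-max-size = 8

; One section per Slicer host group
[nodegroup "workers"]
slicer-url = http://192.168.1.20:8080
slicer-token = example-slicer-token
min-size = 1
max-size = 4
```
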
## Development

Follow the instructions in the [slicer docs](https://docs.slicervm.com/examples/autoscaling-k3s/) to set up a K3s cluster and host groups for nodes.

Make sure you are inside the `cluster-autoscaler` path of the [autoscaler repository](https://github.com/kubernetes/autoscaler).

### Run out of cluster

Start the cluster-autoscaler:

```bash
#!/bin/bash
go run . \
  --cloud-provider=slicer \
  --kubeconfig $HOME/k3s-cp-kubeconfig \
  --scale-down-enabled=true \
  --scale-down-delay-after-add=30s \
  --scale-down-unneeded-time=30s \
  --expendable-pods-priority-cutoff=-10 \
  --cloud-config="$HOME/cloud-config.ini" \
  --v=4
```

### Run in cluster

Build and publish an image:

```sh
REGISTRY=ttl.sh/openfaasltd BUILD_TAGS=slicer TAG=dev make dev-release
```

Create the cloud-config secret:

```sh
kubectl create secret generic cluster-autoscaler-cloud-config \
  --from-file=cloud-config=cloud-config.ini \
  -n kube-system
```

Create a `values.yaml` for the cluster-autoscaler chart:

```yaml
image:
  repository: ttl.sh/openfaasltd/cluster-autoscaler-slicer-amd64
  tag: dev

cloudProvider: slicer

fullnameOverride: cluster-autoscaler-slicer

autoDiscovery:
  clusterName: k3s-slicer

# Mount the cluster-autoscaler-cloud-config secret
extraVolumeSecrets:
  cluster-autoscaler-cloud-config:
    name: cluster-autoscaler-cloud-config
    mountPath: /etc/slicer/
    items:
      - key: cloud-config
        path: cloud-config

# All your required parameters
extraArgs:
  cloud-config: /etc/slicer/cloud-config
  # Standard logging
  logtostderr: true
  stderrthreshold: info
  v: 4

  scale-down-enabled: true
  scale-down-delay-after-add: "30s"
  scale-down-unneeded-time: "30s"
  expendable-pods-priority-cutoff: -10
```

Deploy with Helm:

```sh
helm install cluster-autoscaler-slicer charts/cluster-autoscaler \
  --namespace=kube-system \
  --values=values.yaml
```

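To check that the autoscaler started and loaded its cloud config, tail its logs. The Deployment name below assumes the `fullnameOverride` set in the values above:

```sh
kubectl logs -n kube-system deploy/cluster-autoscaler-slicer -f
```
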
To test the autoscaler, do one of the following:

* Scale a deployment higher than can fit on the current set of control-plane nodes, then wait for the autoscaler to scale up the cluster. A sample Deployment is sketched after this list.
* Or, create a taint / affinity / anti-affinity rule that will prevent a pod from being scheduled to the existing set of nodes, then wait for the autoscaler to scale up the cluster.

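As a sketch of the first option, the Deployment below requests enough CPU that its replicas cannot all fit on a small control-plane node, forcing the autoscaler to add agent nodes. The name, image, replica count, and resource requests are illustrative and may need adjusting to your host sizes:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scale-test
spec:
  # 5 replicas x 500m CPU should exceed a small node's free capacity
  replicas: 5
  selector:
    matchLabels:
      app: scale-test
  template:
    metadata:
      labels:
        app: scale-test
    spec:
      containers:
        - name: sleep
          image: alpine:3.20
          command: ["sleep", "infinity"]
          resources:
            requests:
              cpu: "500m"
              memory: "128Mi"
```

Apply it with `kubectl apply -f scale-test.yaml`, then watch for new agent nodes to join with `kubectl get nodes -w`.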