Clusters#

Open In Colab

A cluster is the most basic form of compute in Runhouse, largely representing a group of instances or VMs connected with Ray. They largely fall in two categories:

  1. Static Clusters: Any machine you have SSH access to, set up with IP addresses and SSH credentials.

  2. On-Demand Clusters: Any cloud instance spun up automatically for you with your cloud credentials.

Runhouse provides various APIs for interacting with remote clusters, such as terminating an on-demand cloud cluster or running remote CLI or Python commands from your local dev environment.

Letโ€™s start with a simple example using AWS. First, install runhouse with AWS dependencies:

! pip install "runhouse[aws]"

Make sure your AWS credentials are set up:

! aws configure
! sky check

On-Demand Clusters#

We can start by using the rh.cluster factory function to create our cluster. By specifying an instance_type, Runhouse sets up an On-Demand Cluster in AWS EC2 for us.

Each cluster must be provided with a unique name identifier during construction. This name parameter is used for saving down or loading previous saved clusters, and also used for various CLI commands for the cluster.

Our instance_type here is defined as CPU:2, which is the accelerator type and count that we need (another example would be A10G:2). We could alternatively specify a specific specific instance type, such as p3.2xlarge or g4dn.xlarge (these are instance types on AWS).

import runhouse as rh

aws_cluster = rh.cluster(name="test-cluster", instance_type="CPU:2")

Next, we set up a basic function to throw up on our cluster. For more information about Functions & Modules that you can put up on a cluster, see Functions & Modules.

def run_home(name: str):
    return f"Run home {name}!"

remote_function = rh.function(run_home).to(aws_cluster)

After running .to, your function is set up on the cluster to be called from anywhere. When you call remote_function, it executes remotely on your AWS instance.

remote_function("in cluster!")
INFO | 2024-03-06 15:18:58.439252 | Calling run_home.call
INFO | 2024-03-06 15:18:59.490122 | Time to call run_home.call: 1.05 seconds
'Run home in cluster!!'

On-Demand Clusters with TLS exposed#

In the previous example, the cluster that was brought up in EC2 is only accessible to the original user that has SSH credentials to the machine. However, you can set up a cluster with ports exposed to open Internet, and access objects and functions via curl.

tls_cluster = rh.cluster(name="tls-cluster",
                         instance_type="CPU:2",
                         open_ports=[443], # expose HTTPS port to public
                         server_connection_type="tls", # specify how runhouse communicates with this cluster
                         den_auth=False, # no authentication required to hit this cluster (NOT recommended)
)
WARNING | 2024-03-06 15:19:05.297411 | /Users/rohinbhasin/work/runhouse/runhouse/resources/hardware/on_demand_cluster.py:317: UserWarning: Server is insecure and must be inside a VPC or have den_auth enabled to secure it.
  warnings.warn(
remote_tls_function = rh.function(run_home).to(tls_cluster)
remote_tls_function("Marvin")
INFO | 2024-03-06 15:26:05.482586 | Calling run_home.call
INFO | 2024-03-06 15:26:06.550625 | Time to call run_home.call: 1.07 seconds
'Run home Marvin!'
tls_cluster.address
'54.172.178.196'
! curl "https://54.172.178.196/run_home/call?name=Marvin" -k
{"data":""Run home Marvin!"","error":null,"traceback":null,"output_type":"result_serialized","serialization":"json"}

This cluster is exposed to the open Internet, so anyone can hit it. If you do want to share functions and apps publically, itโ€™s recommended you set den_auth=True when setting up your cluster, which requires a user to run runhouse login in order to hit the cluster. Weโ€™ll enable it now:

tls_cluster.enable_den_auth()
! curl "https://54.172.178.196/run_home/call?name=Marvin" -k
{"data":null,"error":raise PermissionError(\nPermissionError: No Runhouse token provided. Try running $ runhouse login or visiting https://run.house/login to retrieve a token. If calling via HTTP, please provide a valid token in the Authorization header.\n"","output_type":"exception","serialization":null}

If we send our Runhouse Den token as a header, then the request is valid:

! curl "https://54.172.178.196/run_home/call?name=Marvin" -k -H "Authorization: Bearer <YOUR TOKEN HERE>"
{"data":""Run home Marvin!"","error":null,"traceback":null,"output_type":"result_serialized","serialization":"json"}

Static Clusters#

If you have existing machines within a VPC that you want to connect to, you can simply provide the IP addresses and path to SSH credentials to the machine.

cluster = rh.cluster(  # using private key
              name="cpu-cluster-existing",
              ips=['<ip of the cluster>'],
              ssh_creds={'ssh_user': '<user>', 'ssh_private_key':'<path_to_key>'},
          )

Useful Cluster Functions#

tls_cluster.run(['pip install numpy && pip freeze | grep numpy'])
Warning: Permanently added '54.172.178.196' (ED25519) to the list of known hosts.
Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (1.26.4)
numpy==1.26.4
[(0,
  'Requirement already satisfied: numpy in /opt/conda/lib/python3.10/site-packages (1.26.4)nnumpy==1.26.4n',
  "Warning: Permanently added '54.172.178.196' (ED25519) to the list of known hosts.rn")]
tls_cluster.run_python(['import numpy', 'print(numpy.__version__)'])
1.26.4
[(0, '1.26.4n', '')]