File

A File is a Runhouse primitive that represents a file. It can be used to interact with the data at a given path on a given system.

File Factory Method

runhouse.file(data=None, name: str | None = None, path: str | None = None, system: str | None = None, data_config: Dict | None = None, dryrun: bool = False)

Returns a File object, which can be used to interact with the resource at the given path.

Parameters:
  • data – File data. This should be a serializable object.

  • name (Optional[str]) – Name to give the file object, to be reused later on.

  • path (Optional[str]) – Path to the file.

  • system (Optional[str or Cluster]) – File system or cluster name. If providing a file system, this must be one of: [file, github, sftp, ssh, s3, gs, azure]. We are working to add additional file system support.

  • data_config (Optional[Dict]) – The data config to pass to the underlying fsspec handler.

  • dryrun (bool) – Whether to create the File if it doesn’t exist, or load a File object as a dryrun. (Default: False)

Returns:

The resulting file.

Return type:

File

Example

>>> import json
>>> from pathlib import Path
>>>
>>> import runhouse as rh
>>>
>>> data = json.dumps(list(range(50)))
>>>
>>> # Remote file with name and no path (saved to bucket called runhouse/blobs/my-file)
>>> rh.file(name="@/my-file", data=data, system='s3').write()
>>>
>>> # Remote file with name and path
>>> rh.file(name='@/my-file', path='/runhouse-tests/my_file.pickle', system='s3').save()
>>>
>>> # Local file with path and no name, saved to the local filesystem
>>> rh.file(data=data, path=str(Path.cwd() / "my_file.pickle")).write()
>>>
>>> # Local file with name and no path (saved to ~/.cache/blobs/my-file)
>>> rh.file(name="~/my-file", data=data).write().save()
>>>
>>> # Loading a file by the name it was saved under
>>> my_local_file = rh.file(name="~/my-file")
>>> my_s3_file = rh.file(name="@/my-file")
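
When system is a file system rather than a cluster, the parameter description above limits it to a fixed set of names. As a minimal sketch of checking that argument before calling the factory, the hypothetical helper check_system below (not part of Runhouse) compares a string against that set. The list is copied from the parameter description and may lag behind the library, and the helper should not be applied to cluster names.

```python
# Hypothetical helper, not part of Runhouse: validates a file-system name
# against the values listed in the `system` parameter description.
SUPPORTED_FILE_SYSTEMS = ("file", "github", "sftp", "ssh", "s3", "gs", "azure")


def check_system(system: str) -> str:
    """Return `system` unchanged if it is a listed file system, else raise."""
    if system not in SUPPORTED_FILE_SYSTEMS:
        supported = ", ".join(SUPPORTED_FILE_SYSTEMS)
        raise ValueError(
            f"Unsupported file system {system!r}; expected one of: {supported}"
        )
    return system


checked = check_system("s3")
```

The checked value could then be passed as the system argument of rh.file.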

File Class

class runhouse.File(path: str | None = None, name: str | None = None, system: str | None = 'file', env: Env | None = None, data_config: Dict | None = None, dryrun: bool = False, **kwargs)
__init__(path: str | None = None, name: str | None = None, system: str | None = 'file', env: Env | None = None, data_config: Dict | None = None, dryrun: bool = False, **kwargs)

Runhouse File object

Note

To build a File, please use the factory method file().

exists_in_system()

Check whether the file exists in the file system.

Example

>>> file = rh.file(data, path="saved/path")
>>> file.exists_in_system()

open(mode: str = 'rb')

Get a file-like object (an OpenFile container) for the file data. The caller must close the file, or call this method inside a with statement.

Example

>>> with my_file.open(mode="wb") as f:
...     f.write(data)
>>>
>>> obj = my_file.open()
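
The two usage patterns described for open() can be illustrated without Runhouse. The sketch below uses Python's built-in open on a temporary local file as a stand-in for File.open(). The path and payload are made up for illustration, and Runhouse returns an OpenFile container rather than a plain file handle, so treat this only as an analogue of the closing rules.

```python
import os
import tempfile

payload = b"hello"  # illustrative data

# Temporary local path standing in for the File's path.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "my_file.bin")

# Pattern 1: a with statement closes the handle automatically.
with open(path, mode="wb") as f:
    f.write(payload)

# Pattern 2: without a with statement, the caller must close the handle.
handle = open(path, mode="rb")
try:
    contents = handle.read()
finally:
    handle.close()
```

The with form is usually the safer choice, since the handle is closed even if the body raises.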

resolved_state(deserialize: bool = True, mode: str = 'rb')

Return the data for the user to deserialize. Primarily used to define the behavior of the fetch method.

Example

>>> data = file.fetch()
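
To make the deserialize flag concrete, here is a minimal sketch of the read side using only the standard library. It assumes the stored bytes were produced with pickle, which the .pickle file names in the examples suggest but this page does not state. The function read_state is a hypothetical stand-in, not Runhouse's implementation.

```python
import pickle
import tempfile
from pathlib import Path


def read_state(path: Path, deserialize: bool = True, mode: str = "rb"):
    """Hypothetical stand-in: return the object, or raw bytes if deserialize=False."""
    with open(path, mode=mode) as f:
        raw = f.read()
    # Assumption: the bytes on disk are a pickle payload.
    return pickle.loads(raw) if deserialize else raw


# Illustrative file containing a pickled list.
path = Path(tempfile.mkdtemp()) / "my_file.pickle"
path.write_bytes(pickle.dumps(list(range(5))))

obj = read_state(path)                     # deserialized Python object
raw = read_state(path, deserialize=False)  # raw bytes, left for the caller to decode
```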

rm()

Delete the file and the folder it lives in from the file system.

Example

>>> file = rh.file(data, path="saved/path")
>>> file.rm()
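
Because rm() is described as deleting the containing folder as well as the file, anything else stored alongside the file would presumably be removed too. The sketch below reproduces that behavior on a temporary local directory with shutil.rmtree, as an illustration of the description taken literally. It is not Runhouse's implementation, and the file names are made up.

```python
import shutil
import tempfile
from pathlib import Path

# Illustrative layout: one target file and one unrelated sibling in the same folder.
root = Path(tempfile.mkdtemp())
folder = root / "saved"
folder.mkdir()

target = folder / "my_file.pickle"
sibling = folder / "other_file.txt"
target.write_bytes(b"data")
sibling.write_text("unrelated")

# Removing the file's parent folder removes the file and everything next to it.
shutil.rmtree(target.parent)

target_exists = target.exists()
sibling_exists = sibling.exists()
folder_exists = folder.exists()

# Clean up the temporary root used for this illustration.
shutil.rmtree(root)
```

If rm() behaves as described, keeping each File in its own folder avoids deleting unrelated data.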

to(system, env: str | Env | None = None, path: str | None = None, data_config: dict | None = None)

Return a copy of the file on the destination system and path.

Example

>>> local_file = rh.file(data)
>>> s3_file = local_file.to("s3")
>>> cluster_file = local_file.to(my_cluster)

write(data, serialize: bool = True, mode: str = 'wb')

Save the underlying file to its specified fsspec URL.

Example

>>> rh.file(system="s3", path="path/to/save").write(data)
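
As a rough illustration of what the serialize flag might control, the sketch below writes either a pickled object or caller-supplied bytes to a local path. Using pickle as the serializer is an assumption, and write_state is a hypothetical helper built on the standard library, not the fsspec-backed method documented here.

```python
import pickle
import tempfile
from pathlib import Path


def write_state(path: Path, data, serialize: bool = True, mode: str = "wb") -> Path:
    """Hypothetical stand-in: pickle `data` first unless serialize=False."""
    # Assumption: serialization means pickling. With serialize=False the
    # caller is expected to pass bytes that can be written as-is.
    payload = pickle.dumps(data) if serialize else data
    with open(path, mode=mode) as f:
        f.write(payload)
    return path


tmp = Path(tempfile.mkdtemp())
pickled_path = write_state(tmp / "obj.pickle", {"a": 1})
raw_path = write_state(tmp / "raw.bin", b"already-bytes", serialize=False)
```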