Virtual File System (pyobs.vfs)
A virtual file system (VFS) is a convenient way to access different resources through a common interface. It also makes it easy to change where files are actually stored without changing any code. In pyobs, a path into the VFS looks like a normal path on a Unix-like system:
/path/to/file.ext
To be more precise, a path is built like this:
/<root>/<path>/<filename>
where <root> indicates which VFS to use, <path> is the path within that VFS, and <filename> is the actual
filename.
The available roots in the pyobs VFS are usually defined in the configuration file like this:
vfs:
  root1:
    class: VfsClass1
  root2:
    class: VfsClass2
With this configuration, all filenames that start with /root1/ (e.g. /root1/path/filename.ext) are handled
by VfsClass1, while all filenames starting with /root2/ use VfsClass2. Both defined classes can have
additional parameters, which should also be given in the configuration.
With this example, one can easily see the advantages of using a VFS:
- File access to all roots looks similar: we always open a filename like /root/path/file.ext.
- Changing the handling class for one of the roots changes the way we access a file. For instance, if a file was always stored locally but should now be stored on a machine reached via SSH, this can be accomplished simply by changing the configuration.
Another advantage, not mentioned so far, is that the same root can use different handling classes on different machines. See below for details.
Imagine a file /storage/file.ext that maps to a local file on machine A, i.e. storage uses a handling class
that simply changes the filename to a local filename. On machine B we can now define the same root storage, but use
a different handling class that, e.g., accesses the file via SSH. Still, the filename would be the same on both
machines. So A could store a file as /storage/file.ext, send the filename to B, which then can access the file
via the same filename.
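The cross-machine scenario above can be illustrated with a small, self-contained sketch. It does not use pyobs itself: the dictionaries `MACHINE_A` and `MACHINE_B`, the function `resolve`, and the use of plain strings as handler names are assumptions made purely for illustration.

```python
# Toy model of root-based dispatch; not the pyobs implementation.
# Each "machine" maps a root name to the name of a handling class.
MACHINE_A = {"storage": "LocalFile"}
MACHINE_B = {"storage": "SSHFile"}


def resolve(roots: dict[str, str], path: str) -> tuple[str, str]:
    """Return (handler name, path within the root) for a VFS path."""
    root, _, rest = path.lstrip("/").partition("/")
    if not rest:
        raise ValueError(f"Path {path!r} has no root")
    if root not in roots:
        raise ValueError(f"Unknown root {root!r}")
    return roots[root], rest


# The same logical filename is valid on both machines,
# but each machine picks its own way of reaching the file.
handler_a, inner_a = resolve(MACHINE_A, "/storage/file.ext")
handler_b, inner_b = resolve(MACHINE_B, "/storage/file.ext")
```

Only the configuration differs between the two machines; the path that is passed around stays the same.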
Currently supported are these types of file access:
- LocalFile: Local file on the machine the module is running on.
- HttpFile: File on an HTTP server.
- MemoryFile: File in memory.
- SSHFile: File on a different machine that is accessible via SSH.
- TarFile: Wrapper for a dynamically created TAR file. Can only be read from.
- TempFile: Temporary file that will be deleted after being closed.
- ArchiveFile: Wrapper for a file in the pyobs-archive image archive.
The base class for all of these classes is VFSFile.
In a distributed pyobs system, modules typically run on different machines — a camera controller on one computer, an image pipeline on another, a scheduler on a third. The Virtual File System (VFS) lets all of them read and write files using the same logical paths, regardless of where the data physically lives.
Each VFS path starts with a root — a named prefix that maps to a specific storage backend. For example,
a path like /cache/image_001.fits uses the root cache, which might point to a local directory on one
machine and an HTTP file cache on another. The calling code never needs to know the difference.
Configuration
The VFS is configured in YAML under a vfs key:
vfs:
  class: pyobs.vfs.VirtualFileSystem
  roots:
    cache:
      class: pyobs.vfs.LocalFile
      root: /data/pyobs/cache
This maps /cache/... to local files under /data/pyobs/cache. On a different machine that needs to
read the same files over HTTP:
vfs:
  class: pyobs.vfs.VirtualFileSystem
  roots:
    cache:
      class: pyobs.vfs.HttpFile
      download: https://camera-host.example.com/filecache/
Both machines use identical paths in their code — /cache/image_001.fits — and the VFS handles the
transport transparently.
Two roots are always available by default even without explicit configuration:
- /pyobs/ → /opt/pyobs/storage/ (local)
- /robotic/ → /opt/pyobs/robotic/ (local)
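How a local root turns a VFS path into a filesystem path can be sketched as follows. This is an illustration only: the `DEFAULT_ROOTS` table restates the two defaults listed above, while `to_local_path` is a hypothetical helper and not a pyobs function.

```python
import posixpath

# Default roots as listed above; both point to local directories.
DEFAULT_ROOTS = {
    "pyobs": "/opt/pyobs/storage/",
    "robotic": "/opt/pyobs/robotic/",
}


def to_local_path(roots: dict[str, str], vfs_path: str) -> str:
    """Map a VFS path onto the directory configured for its root."""
    root, _, rest = vfs_path.lstrip("/").partition("/")
    return posixpath.join(roots[root], rest)


local = to_local_path(DEFAULT_ROOTS, "/pyobs/images/image_001.fits")
```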
Reading and writing files
The primary interface is open_file(), which returns a VFSFile
that supports async context manager usage:
async with self.vfs.open_file("/cache/image_001.fits", "rb") as f:
    data = await f.read()
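The `async with` pattern relies on the returned object implementing the asynchronous context manager protocol. The following minimal sketch shows that protocol with a hypothetical in-memory stand-in; `FakeFile` is an illustrative name and not one of the pyobs classes.

```python
import asyncio
import io


class FakeFile:
    """Hypothetical stand-in for a VFS file, backed by memory."""

    def __init__(self, content: bytes) -> None:
        self._buffer = io.BytesIO(content)

    async def __aenter__(self) -> "FakeFile":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Closing on exit mirrors what a real file wrapper would do.
        self._buffer.close()

    async def read(self, n: int = -1) -> bytes:
        return self._buffer.read(n)

    @property
    def closed(self) -> bool:
        return self._buffer.closed


async def main() -> tuple[bytes, bool]:
    f = FakeFile(b"SIMPLE  =                    T")
    async with f:
        data = await f.read()
    # After the block, the file has been closed automatically.
    return data, f.closed


data, closed = asyncio.run(main())
```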
For common file types, VirtualFileSystem provides convenience methods that wrap
open_file automatically:
- read_image() / write_image(): Image objects
- read_fits() / write_fits(): raw FITS HDU lists
- read_csv() / write_csv(): pandas DataFrames
- read_yaml() / write_yaml(): dicts
Inside any Object or Module, the VFS is available via
self.vfs:
image = await self.vfs.read_image("/cache/image_001.fits")
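A convenience method of this kind is essentially "open, read, parse". The sketch below shows the idea with a hypothetical `TinyVFS` class that keeps files in a dictionary and parses CSV with the standard `csv` module instead of pandas. None of these names come from pyobs.

```python
import asyncio
import csv
import io


class TinyVFS:
    """Hypothetical VFS that keeps file contents in a dictionary."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    async def write_bytes(self, filename: str, data: bytes) -> None:
        self._files[filename] = data

    async def read_bytes(self, filename: str) -> bytes:
        return self._files[filename]

    async def read_csv_rows(self, filename: str) -> list[dict[str, str]]:
        # Convenience wrapper: fetch the raw bytes, then parse them.
        text = (await self.read_bytes(filename)).decode("utf-8")
        return list(csv.DictReader(io.StringIO(text)))


async def main() -> list[dict[str, str]]:
    vfs = TinyVFS()
    await vfs.write_bytes("/cache/log.csv", b"target,exptime\nM31,30\n")
    return await vfs.read_csv_rows("/cache/log.csv")


rows = asyncio.run(main())
```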
File access classes
Each root in the VFS configuration maps to one of these classes:
| Class | Use case |
|---|---|
| LocalFile | Files on the local filesystem. The most common choice for the machine that writes data. |
| HttpFile | Files served over HTTP, typically via HttpFileCache. |
| SSHFile | Files on a remote machine accessed over SSH/SCP. Suitable when HTTP caching is not available. |
| SFTPFile | Files on a remote machine accessed over SFTP. |
| SMBFile | Files on a Windows share (SMB/CIFS). |
| MemoryFile | In-memory file storage. Useful for testing or short-lived intermediate data. |
| TempFile | Temporary files on the local filesystem, cleaned up automatically. |
| ArchiveFile | Files stored in a pyobs archive service. |
API reference
- class VirtualFileSystem(roots: dict[str, Any] | None = None, **kwargs: Any)
Base for a virtual file system.
Create a new VFS.
- Parameters:
roots – Dictionary containing roots, see vfs for examples.
- async exists(path: str) -> bool
Checks whether a given path or file exists.
- Parameters:
path – Path to check.
- Returns:
Whether it exists or not
- async find(path: str, pattern: str) -> list[str]
Find a file in the given path.
- Parameters:
path – Path to search in.
pattern – Pattern to search for.
- Returns:
List of found files.
- async listdir(path: str) -> list[str]
Returns content of given path.
- Parameters:
path – Path to list.
- Returns:
List of files in path.
- async local_path(path: str) -> str
Returns a local filename, but only if path leads to a LocalFile.
- Parameters:
path – Path to get local path for.
- Returns:
Local path.
- Raises:
ValueError – If path does not lead to a LocalFile.
- open_file(filename: str, mode: str) -> VFSFile
Open a file. The handling class is chosen depending on the root in the filename.
- Parameters:
filename (str) – Name of file to open.
mode (str) – Opening mode.
- Returns:
(IOBase) File like object for given file.
- async read_csv(filename: str, *args: Any, **kwargs: Any) -> DataFrame
Convenience function for reading a CSV file into a DataFrame.
- Parameters:
filename – Name of file to read.
- Returns:
DataFrame with content of file.
- async read_fits(filename: str) -> HDUList
Convenience function that wraps around open_file() to read a FITS file and put it into an astropy FITS structure.
- Parameters:
filename – Name of file to download.
- Returns:
An HDUList containing the FITS file.
- async read_image(filename: str) -> Image
Convenience function that wraps around open_file() to read an Image.
- Parameters:
filename – Name of file to download.
- Returns:
An image object
- async read_yaml(filename: str) -> Any
Convenience function for reading a YAML file into a dict.
- Parameters:
filename – Name of file to read.
- Returns:
Content of file.
- async remove(path: str) -> bool
Removes file with given path.
- Parameters:
path – Path to delete.
- Returns:
Success of deletion.
- static split_root(path: str) -> tuple[str, str]
Splits the root from the rest of the path.
- Parameters:
path (str) – Path to split.
- Returns:
(tuple) Tuple (root, filename).
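A minimal sketch of such a split, assuming the root is simply the first path component. The function below is written from the description above and is not copied from pyobs. In particular, raising `ValueError` for a path without a root is an assumption.

```python
def split_root(path: str) -> tuple[str, str]:
    """Split '/root/some/file.ext' into ('root', 'some/file.ext')."""
    stripped = path.lstrip("/")
    root, sep, filename = stripped.partition("/")
    if not sep or not root:
        # Assumed behaviour: a path without a root is rejected.
        raise ValueError(f"No root found in path {path!r}")
    return root, filename


root, filename = split_root("/cache/2024/image_001.fits")
```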
- async write_bytes(filename: str, data: bytes, *args: Any, **kwargs: Any) -> None
Convenience function for writing bytes to a file.
- Parameters:
filename – Name of file to write.
data – Bytes to write.
- async write_csv(filename: str, df: DataFrame, *args: Any, **kwargs: Any) -> None
Convenience function for writing a CSV file from a DataFrame.
- Parameters:
filename – Name of file to write.
df – DataFrame to write.
- async write_fits(filename: str, hdulist: HDUList, *args: Any, **kwargs: Any) -> None
Convenience function for writing an HDU list to a FITS file.
- Parameters:
filename – Name of file to write.
hdulist – hdu list to write.
- class VFSFile
Base class for all VFS file classes.
- async classmethod exists(path: str, *args: Any, **kwargs: Any) -> bool
Checks whether a given path or file exists.
- Parameters:
path – Path to check.
- Returns:
Whether it exists or not
- async classmethod find(path: str, pattern: str, **kwargs: Any) -> list[str]
Find files by pattern matching.
- Parameters:
path – Path to search in.
pattern – Pattern to search for.
- Returns:
List of found files.
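The exact pattern syntax is not specified here. Assuming shell-style wildcards, pattern matching over a set of file names could look like the sketch below. The function `find_in_listing` and the sample listing are hypothetical.

```python
import fnmatch


def find_in_listing(names: list[str], pattern: str) -> list[str]:
    """Return the names that match a shell-style pattern, sorted."""
    return sorted(fnmatch.filter(names, pattern))


listing = ["image_001.fits", "image_002.fits", "notes.txt"]
matches = find_in_listing(listing, "image_*.fits")
```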
- class LocalFile(name: str, mode: str = 'r', root: str | None = None, mkdir: bool = True, **kwargs: Any)
Bases: VFSFile
Wraps a local file with the virtual file system.
Open a local file.
- Parameters:
name – Name of file.
mode – Open mode.
root – Root to prefix name with for absolute path in filesystem.
mkdir – Whether or not to create non-existing paths automatically.
- async classmethod exists(path: str, root: str = '', *args: Any, **kwargs: Any) -> bool
Checks whether a given path or file exists.
- Parameters:
path – Path to check.
root – VFS root.
- Returns:
Whether it exists or not
- async static find(path: str, pattern: str, **kwargs: Any) -> list[str]
Find files by pattern matching.
- Parameters:
path – Path to search in.
pattern – Pattern to search for.
- Returns:
List of found files.
- async static listdir(path: str, **kwargs: Any) -> list[str]
Returns content of given path.
- Parameters:
path – Path to list.
kwargs – Parameters for specific file implementation (same as __init__).
- Returns:
List of files in path.
- class HttpFile(name: str, mode: str = 'r', download: str | None = None, upload: str | None = None, username: str | None = None, password: str | None = None, verify_tls: bool = False, timeout: int = 30, **kwargs: Any)
Bases: BufferedFile
Wraps a file on an HTTP server that can be accessed via GET/POST. Especially useful in combination with HttpFileCache.
Creates a new HTTP file.
- Parameters:
name – Name of file.
mode – Open mode (r/w).
download – Base URL for downloading files. If None, no read access possible.
upload – Base URL for uploading files. If None, no write access possible.
username – Username for accessing the HTTP server.
password – Password for accessing the HTTP server.
verify_tls – Whether to verify TLS certificates.
timeout – Timeout in seconds for uploading/downloading files.
- async read(n: int = -1) -> str | bytes
Read number of bytes from stream.
- Parameters:
n – Number of bytes to read. Read until end, if -1.
- Returns:
Read bytes.
- property url: str
Returns URL of file.
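One plausible way to derive such a URL is to join the `download` base URL with the file name. This is an assumption about the behaviour, sketched with the standard library only; `build_url` is a hypothetical helper.

```python
from urllib.parse import urljoin


def build_url(base: str, name: str) -> str:
    """Join a base URL and a file name, tolerating a missing trailing slash."""
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, name.lstrip("/"))


url = build_url("https://camera-host.example.com/filecache", "image_001.fits")
```

Normalising the trailing slash first matters because `urljoin` would otherwise replace the last path segment of the base URL instead of appending to it.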
- class SSHFile(name: str, mode: str = 'r', hostname: str | None = None, port: int = 22, username: str | None = None, password: str | None = None, keyfile: str | None = None, root: str | None = None, mkdir: bool = True, **kwargs: Any)
Bases: BufferedFile
VFS wrapper for a file that can be accessed over an SFTP connection.
Open/create a file over an SSH connection.
- Parameters:
name – Name of file.
mode – Open mode.
bufsize – Buffer size for the SFTP connection.
hostname – Name of host to connect to.
port – Port on host to connect to.
username – Username to log in on host.
password – Password for username.
keyfile – Path to SSH key on local machine.
root – Root directory on host.
mkdir – Whether or not to automatically create directories.
- async static listdir(path: str, **kwargs: Any) -> list[str]
Returns content of given path.
- Parameters:
path – Path to list.
kwargs – Parameters for specific file implementation (same as __init__).
- Returns:
List of files in path.
- class SFTPFile(name: str, mode: str = 'r', hostname: str | None = None, port: int = 22, username: str | None = None, password: str | None = None, keyfile: str | None = None, root: str | None = None, mkdir: bool = True, **kwargs: Any)
Bases: VFSFile
VFS wrapper for a file that can be accessed over an SFTP connection.
Open/create a file over an SSH connection.
- Parameters:
name – Name of file.
mode – Open mode.
bufsize – Buffer size for the SFTP connection.
hostname – Name of host to connect to.
port – Port on host to connect to.
username – Username to log in on host.
password – Password for username.
keyfile – Path to SSH key on local machine.
root – Root directory on host.
mkdir – Whether or not to automatically create directories.
- class SMBFile(name: str, mode: str = 'r', hostname: str | None = None, share: str | None = None, username: str | None = None, password: str | None = None, root: str | None = None, mkdir: bool = True, **kwargs: Any)
Bases: VFSFile
VFS wrapper for a file that can be accessed over an SMB connection.
Requires the smbprotocol package to work.
Open/create a file over an SMB connection.
- Parameters:
name – Name of file.
mode – Open mode.
hostname – Name of host to connect to.
share – Share to access on server.
username – Username to log in on host.
password – Password for username.
keyfile – Path to SSH key on local machine.
root – Root directory on host.
mkdir – Whether or not to automatically create directories.
- class MemoryFile(name: str, mode: str = 'r', **kwargs: Any)
Bases: BufferedFile
A file stored in memory.
Open/create a file in memory.
- Parameters:
name – Name of file.
mode – Open mode.
- property closed: bool
Whether stream is closed.
- class TempFile(name: str | None = None, mode: str = 'r', prefix: str | None = None, suffix: str | None = None, root: str = '/tmp/pyobs/', mkdir: bool = True, **kwargs: Any)
Bases: VFSFile
A temporary file.
Open/create a temp file.
- Parameters:
name – Name of file.
mode – Open mode.
prefix – Prefix for automatic filename creation in write mode.
suffix – Suffix for automatic filename creation in write mode.
root – Temp directory.
mkdir – Whether to automatically create directories.
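The `prefix` and `suffix` parameters resemble those of Python's `tempfile` module. Whether TempFile uses that module internally is not stated here. The sketch below only illustrates automatic filename creation and cleanup on close using the standard library.

```python
import os
import tempfile

# Create a temporary directory to act as the temp root for this sketch.
with tempfile.TemporaryDirectory() as root:
    # The name is generated automatically from prefix and suffix.
    tmp = tempfile.NamedTemporaryFile(
        mode="wb", prefix="pyobs_", suffix=".fits", dir=root
    )
    name = os.path.basename(tmp.name)
    tmp.write(b"data")
    existed_while_open = os.path.exists(tmp.name)
    tmp.close()
    # With the default delete=True, closing removes the file.
    exists_after_close = os.path.exists(tmp.name)
```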
- class ArchiveFile(name: str, url: str, mode: str = 'w', token: str | None = None)
Bases: HttpFile
Wraps a file in an archive. To be used in combination with pyobs-archive.
Creates a new archive file.
- Parameters:
name – Name of file.
mode – Open mode (r/w).
url – Archive URL.
token – Authorization token.