Full Reference
ibridges.data_operations module
Data and metadata transfers.
Transfer data between the local file system and iRODS, including upload, download and sync. Also includes operations for creating a local metadata archive and using this archive to set the metadata.
- ibridges.data_operations.apply_meta_archive(session, meta_fp, ipath, dry_run=False)
Apply a metadata archive to set the metadata of collections and data objects.
The archive is a UTF-8 encoded JSON file with the metadata of all subcollections and data objects. The archive can be created with the function create_meta_archive().
- Parameters:
session – Session with the iRODS server.
meta_fp (Union[str, Path]) – Metadata archive file to use to set the metadata.
ipath (Union[str, IrodsPath]) – Root collection to set the metadata for. The collections and data objects relative to this root collection should be the same as the ones in the metadata archive.
dry_run (bool) – If True, only create an operations object, but do not execute the operation, default False.
- Returns:
The Operations object that allows the user to execute the operations using ops.execute(session).
- Raises:
CollectionDoesNotExistError: – If the ipath does not exist.
NotACollectionError: – If the ipath is not a collection.
Examples
>>> apply_meta_archive(session, "meta_archive.json", "/some/home/collection")
- ibridges.data_operations.create_collection(session, coll_path)
Create a collection and all parent collections that do not exist yet.
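Alias for ibridges.path.IrodsPath.create_collection().
- Parameters:
session – Session for which the collection is created.
coll_path (Union[IrodsPath, str]) – Irods path to the collection to be created.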
- Raises:
PermissionError: – If creating a collection is not allowed by the server.
- Return type:
iRODSCollection
Examples
>>> create_collection(session, IrodsPath(session, "~/new_collection"))
- ibridges.data_operations.create_meta_archive(session, source, meta_fp, dry_run=False)
Create a local archive file for the metadata.
The archive is a UTF-8 encoded JSON file with the metadata of all subcollections and data objects. To re-use this archive use the function apply_meta_archive().
- Parameters:
session (Session) – Session with the iRODS server.
source (Union[str, IrodsPath]) – Source iRODS path to create the archive for. This should be a collection.
meta_fp (Union[str, Path]) – Metadata archive file.
dry_run (bool) – Whether to do a dry run. If so, the archive itself won’t be created, by default False.
- Returns:
The Operations object that allows the user to execute the operations using ops.execute(session).
- Raises:
CollectionDoesNotExistError: – If the source collection does not exist.
NotACollectionError: – If the source is not a collection but a data object.
Examples
>>> create_meta_archive(session, "/some/home/collection", "meta_archive.json")
- ibridges.data_operations.download(session, irods_path, local_path, overwrite=False, ignore_err=False, resc_name='', copy_empty_folders=True, options=None, dry_run=False, metadata=None, progress_bar=True)
Download a collection or data object to the local filesystem.
- Parameters:
session (Session) – Session to download the collection from.
irods_path (Union[str, IrodsPath]) – Absolute irods source path pointing to a collection or data object.
local_path (Union[str, Path]) – Absolute path to the destination directory.
overwrite (bool) – If the local file or directory already exists, overwrite it.
ignore_err (bool) – If an error occurs during download, and ignore_err is set to True, any errors encountered will be transformed into warnings and iBridges will continue to download the remaining files. By default all errors will stop the process of downloading.
resc_name (str) – Name of the resource from which data is downloaded, by default the server will decide.
copy_empty_folders (bool) – Create respective local directory for empty collections.
options (Optional[dict]) – Python-irodsclient options found in irods.keywords. The following keywords will be ignored since they are set by iBridges: FORCE_FLAG_KW, RESC_NAME_KW, NUM_THREADS_KW, REG_CHKSUM_KW, VERIFY_CHKSUM_KW.
dry_run (bool) – Whether to do a dry run before downloading the files/folders.
metadata (Union[None, str, Path]) – If not None, the path to store the metadata to in JSON format. It is recommended to use the .json suffix.
progress_bar (bool) – Whether to display a progress bar.
- Return type:
Operations
- Returns:
Operations object that can be used to execute the download in case of a dry-run.
- Raises:
PermissionError: – If the iRODS server (for whatever reason) forbids downloading the file or (part of the) collection.
DoesNotExistError: – If the irods_path is not pointing to either a collection or a data object.
FileExistsError: – If the irods_path points to a data object and the local file already exists.
NotADirectoryError: – If the irods_path is a collection, while the destination is a file.
Examples
>>> # Below will create a directory "some_local_dir/some_collection"
>>> download(session, "~/some_collection", "some_local_dir")
>>> # Below will create a file "some_local_dir/some_obj.txt"
>>> download(session, IrodsPath(session, "some_obj.txt"), "some_local_dir")
>>> # Below will create a file "new_file.txt" in two steps.
>>> ops = download(session, "~/some_obj.txt", "new_file.txt", dry_run=True)
>>> ops.execute(session)
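The ignore_err behavior described in the parameters can be pictured with a small, server-free sketch. The names transfer_all and fake_transfer are hypothetical and not part of iBridges; the sketch only illustrates the "errors become warnings" idea.

```python
import warnings


def transfer_all(items, transfer_one, ignore_err=False):
    """Transfer every item; with ignore_err=True, failures become warnings."""
    done = []
    for item in items:
        try:
            transfer_one(item)
        except Exception as exc:
            if not ignore_err:
                raise  # default: the first error stops the whole transfer
            warnings.warn(f"Skipping {item}: {exc}")
            continue
        done.append(item)
    return done


def fake_transfer(name):
    # Hypothetical stand-in for a single download that fails for one file.
    if name == "broken.txt":
        raise PermissionError("access denied")
```

With ignore_err=True the remaining files are still processed, while the default re-raises the first error.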
- ibridges.data_operations.sync(session, source, target, max_level=None, dry_run=False, ignore_err=False, copy_empty_folders=False, resc_name='', options=None, metadata=None, progress_bar=True)
Synchronize data between local and remote copies.
The command can be in one of two modes: synchronization of data from the client’s local file system to iRODS, or from iRODS to the local file system. The mode is determined by the type of the values for source and target: objects with type ibridges.path.IrodsPath will be interpreted as remote paths, while types str and Path will be interpreted as local paths.
Files/data objects that have the same checksum will not be synchronized.
- Parameters:
session (Session) – An authorized iBridges session.
source (Union[str, Path, IrodsPath]) – Existing local folder or iRODS collection. An exception will be raised if it doesn’t exist.
target (Union[str, Path, IrodsPath]) – Existing local folder or iRODS collection. An exception will be raised if it doesn’t exist.
max_level (Optional[int]) – Controls the depth up to which the file tree will be synchronized. A max level of 1 synchronizes only the source’s root, max level 2 also includes the first set of subfolders/subcollections and their contents, etc. Set to None, there is no limit (full recursive synchronization).
dry_run (bool) – List all source files and folders that need to be synchronized without actually performing synchronization.
ignore_err (bool) – If an error occurs during the transfer, and ignore_err is set to True, any errors encountered will be transformed into warnings and iBridges will continue to transfer the remaining files.
copy_empty_folders (bool) – Controls whether folders/collections that contain no files or subfolders/subcollections will be synchronized.
resc_name (str) – Name of the resource from which data is downloaded, by default the server will decide.
options (Optional[dict]) – Python-irodsclient options found in irods.keywords. The following keywords will be ignored since they are set by iBridges: FORCE_FLAG_KW, RESC_NAME_KW, NUM_THREADS_KW, REG_CHKSUM_KW, VERIFY_CHKSUM_KW.
metadata (Union[None, str, Path]) – If not None, the location to get the metadata from or store it to.
progress_bar (bool) – Whether to display a progress bar.
- Raises:
CollectionDoesNotExistError: – If the source collection does not exist
NotACollectionError: – If the source is a data object.
NotADirectoryError: – If the local source is not a directory.
- Return type:
Operations
- Returns:
An operations object to execute the sync if dry-run is True.
Examples
>>> # Below, all files/dirs in "some_local_dir" will be transferred into "some_remote_col"
>>> sync(session, "some_local_dir", IrodsPath(session, "~/some_remote_col"))
>>> # Below, all data objects/collections in "col" will be transferred into "some_local_dir"
>>> sync(session, IrodsPath(session, "~/col"), "some_local_dir")
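As described above, the direction of a sync follows from the argument types. The sketch below mimics that rule without a server. FakeIrodsPath and sync_direction are hypothetical names, and raising a TypeError when both sides are local or both are remote is an assumption, not documented behavior.

```python
from pathlib import Path


class FakeIrodsPath:
    """Hypothetical stand-in for ibridges.path.IrodsPath (no server needed)."""

    def __init__(self, *parts):
        self.parts = parts


def sync_direction(source, target):
    """Return 'upload' or 'download' based on the argument types."""
    src_remote = isinstance(source, FakeIrodsPath)
    tgt_remote = isinstance(target, FakeIrodsPath)
    if not src_remote and tgt_remote:
        return "upload"  # local (str or Path) -> iRODS
    if src_remote and not tgt_remote:
        return "download"  # iRODS -> local (str or Path)
    # Assumption: mixing two local or two remote paths is rejected.
    raise TypeError("exactly one of source and target must be remote")
```

Passing a plain string for a remote collection would therefore be read as a local path, which is why the examples above wrap remote locations in IrodsPath.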
- ibridges.data_operations.upload(session, local_path, irods_path, overwrite=False, ignore_err=False, resc_name='', copy_empty_folders=True, options=None, dry_run=False, metadata=None, progress_bar=True)
Upload a local directory or file to iRODS.
- Parameters:
session (Session) – Session to upload the data to.
local_path (Union[str, Path]) – Absolute path to the directory or file to upload.
irods_path (Union[str, IrodsPath]) – Absolute irods destination path.
overwrite (bool) – If the data object or collection already exists on iRODS, overwrite it.
ignore_err (bool) – If an error occurs during upload, and ignore_err is set to True, any errors encountered will be transformed into warnings and iBridges will continue to upload the remaining files. By default all errors will stop the process of uploading.
resc_name (str) – Name of the resource to which data is uploaded, by default the server will decide.
copy_empty_folders (bool) – Create respective iRODS collection for empty folders. Default: True.
options (Optional[dict]) – Python-irodsclient options found in irods.keywords. The following keywords will be ignored since they are set by iBridges: FORCE_FLAG_KW, RESC_NAME_KW, NUM_THREADS_KW, REG_CHKSUM_KW, VERIFY_CHKSUM_KW.
dry_run (bool) – Whether to do a dry run before uploading the files/folders.
metadata (Union[None, str, Path]) – If not None, it should point to a file that contains the metadata for the upload.
progress_bar (bool) – Whether to display a progress bar.
- Return type:
Operations
- Returns:
Operations object that can be used to execute the upload in case of a dry-run.
- Raises:
FileNotFoundError: – If the local_path is not a valid filename or directory.
DataObjectExistsError: – If the data object to be uploaded already exists and overwrite=True is not set.
PermissionError: – If the iRODS server does not allow the collection or data object to be created.
Examples
>>> ipath = IrodsPath(session, "~/some_col")
>>> # Below will create a collection "~/some_col/dir".
>>> upload(session, Path("dir"), ipath)
>>> # Same, but now data objects that exist will be overwritten.
>>> upload(session, Path("dir"), ipath, overwrite=True)
>>> # Perform the upload in two steps with a dry-run
>>> ops = upload(session, Path("some_file.txt"), ipath, dry_run=True)  # Does not upload
>>> ops.print_summary()  # Check if this is what you want here.
>>> ops.execute(session)  # Performs the upload
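The two-step dry-run pattern separates planning from executing. The following sketch shows that separation with a hypothetical, heavily simplified stand-in (SketchOperations and sketch_upload are invented for illustration; the real Operations object does considerably more and its execute method takes a session).

```python
class SketchOperations:
    """Hypothetical, simplified stand-in for the Operations object."""

    def __init__(self):
        self.planned = []  # (local, remote) pairs still to be transferred
        self.executed = []  # pairs that have been "transferred"

    def print_summary(self):
        for local, remote in self.planned:
            print(f"{local} -> {remote}")

    def execute(self):
        # A real implementation would move data here; the sketch only records it.
        self.executed.extend(self.planned)
        self.planned.clear()


def sketch_upload(files, target, dry_run=False):
    """Plan the transfers and only run them when dry_run is False."""
    ops = SketchOperations()
    ops.planned = [(name, f"{target}/{name}") for name in files]
    if not dry_run:
        ops.execute()
    return ops
```

With dry_run=True the caller can inspect the planned operations and decide whether to call execute afterwards.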
ibridges.interactive module
Interactive authentication with iRODS server.
- ibridges.interactive.interactive_auth(password=None, irods_env_path=None, **kwargs)
Interactive authentication with iRODS server.
The main difference with using the ibridges.Session object directly is that it will ask for your password if the cached password does not exist or is outdated. This can be more secure, since you won’t have to store the password in a file or notebook. Caches the password in ~/.irods/.irodsA upon success.
- Parameters:
password (Optional[str]) – Password to make the connection with. If not supplied, you will be asked interactively.
irods_env_path (Union[None, str, Path]) – Path to the irods environment.
kwargs – Extra parameters for the interactive auth. Mainly used for the cwd parameter.
- Raises:
FileNotFoundError: – If the irods_env_path does not exist.
ValueError: – If the connection to the iRODS server cannot be established.
- Return type:
Session
- Returns:
A connected session to the server.
ibridges.meta module
Operations to directly manipulate metadata on the iRODS server.
- class ibridges.meta.MetaData(item, blacklist='^org_[\\s\\S]+')
Bases:
object
iRODS metadata operations.
This allows for adding and deleting of metadata entries for data objects and collections.
- Parameters:
item (Union[iRODSDataObject, iRODSCollection]) – The data object or collection to attach the metadata object to.
blacklist (Optional[str]) – A regular expression for metadata names/keys that should be ignored. By default all metadata starting with org_ is ignored.
Examples
>>> meta = MetaData(coll)
>>> "Author" in meta
True
>>> for entry in meta:
...     print(entry.key, entry.value, entry.units)
Author Ben
Mass 10 kg
>>> len(meta)
2
>>> meta.add("Author", "Emma")
>>> meta.set("Author", "Alice")
>>> meta.delete("Author")
>>> print(meta)
{Mass, 10, kg}
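The effect of the blacklist can be approximated locally with the re module. This is a sketch only: visible_keys is a hypothetical helper, and matching keys with re.match is an assumption about how the pattern is applied.

```python
import re

# The default pattern, written as a raw string: "org_" followed by anything.
BLACKLIST = r"^org_[\s\S]+"


def visible_keys(keys, blacklist=BLACKLIST):
    """Return the keys that are not hidden by the blacklist pattern."""
    if blacklist is None:
        return list(keys)
    pattern = re.compile(blacklist)
    # Assumption: a key is hidden when the pattern matches at its start.
    return [key for key in keys if pattern.match(key) is None]
```

Passing blacklist=None in the sketch disables the filter, so keys starting with org_ stay visible.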
- add(key, value, units='')
Add metadata to an item.
This will never overwrite an existing entry. If the triplet already exists it will throw an error instead. Note that entries are only considered the same if all of the key, value and units are the same. Alternatively you can use the set() method to remove all entries with the same key, before adding the new entry.
- Parameters:
key (str) – Key of the new entry to add to the item.
value (str) – Value of the new entry to add to the item.
units (Optional[str]) – The units of the new entry.
- Raises:
ValueError: – If the metadata already exists.
PermissionError: – If the metadata cannot be updated because the user does not have sufficient permissions.
Examples
>>> meta.add("Author", "Ben")
>>> meta.add("Mass", "10", "kg")
- clear()
Delete all metadata entries belonging to the item.
Only entries that are on the blacklist are not deleted.
Examples
>>> meta.add("Ben", "10", "kg")
>>> print(meta)
- {name: Ben, value: 10, units: kg}
>>> meta.clear()
>>> print(len(meta))  # empty
0
- Raises:
PermissionError: – If the user has insufficient permissions to delete the metadata.
- delete(key, value=Ellipsis, units=Ellipsis)
Delete a metadata entry of an item.
- Parameters:
key (str) – Key of the entry to delete from the item.
value (Optional[str]) – Value of the entry to delete from the item. If the Ellipsis value (...) is used, then all entries with this key will be deleted, regardless of their value.
units (Optional[str]) – The units of the entry to delete. If the Ellipsis value (...) is used, then all entries with any units will be deleted (but still constrained to the supplied key and value).
- Raises:
KeyError: – If the to be deleted key cannot be found.
PermissionError: – If the user has insufficient permissions to delete the metadata.
Examples
>>> # Delete the metadata entry with mass 10 kg
>>> meta.delete("mass", "10", "kg")
>>> # Delete all metadata with key mass and value 10
>>> meta.delete("mass", "10")
>>> # Delete all metadata with the key mass
>>> meta.delete("mass")
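The role of the Ellipsis default as a wildcard can be illustrated on plain tuples. The functions matches and delete below are a hypothetical in-memory model, not the library code.

```python
def matches(entry, key=..., value=..., units=...):
    """True if the (key, value, units) entry fits the pattern; ... matches anything."""
    return all(
        want is ... or want == have
        for want, have in zip((key, value, units), entry)
    )


def delete(entries, key, value=..., units=...):
    """Return the entries that survive the deletion; KeyError if nothing matched."""
    kept = [entry for entry in entries if not matches(entry, key, value, units)]
    if len(kept) == len(entries):
        raise KeyError(key)
    return kept
```

Leaving value and units at their defaults removes every entry with the given key, while supplying them narrows the deletion step by step.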
- find_all(key=Ellipsis, value=Ellipsis, units=Ellipsis)
Find all metadata entries belonging to the data object/collection.
Wildcards can be used by leaving the key/value/units at default.
- from_dict(meta_dict)
Fill the metadata based on a dictionary.
The dictionary that is expected can be generated from the to_dict() method.
- Parameters:
meta_dict (dict) – Dictionary that contains all the key, value, units triples. This should use the same format as the output of the to_dict method.
Examples
>>> meta.add("Ben", "10", "kg")
>>> meta_dict = meta.to_dict()
>>> meta.clear()
>>> len(meta)
0
>>> meta.from_dict(meta_dict)
>>> print(meta)
- {name: Ben, value: 10, units: kg}
- refresh()
Refresh the metadata of the item.
This is only necessary if the metadata has been modified by another session.
- set(key, value, units='')
Set the metadata entry.
If the metadata entry already exists, then all metadata entries with the same key will be deleted before adding the new entry. An alternative is using the add() method to only add to the metadata entries and not delete them.
- Parameters:
key (str) – Key of the new entry to add to the item.
value (str) – Value of the new entry to add to the item.
units (Optional[str]) – The units of the new entry.
- Raises:
PermissionError: – If the user does not have sufficient permissions to set the metadata.
Examples
>>> meta.set("Author", "Ben")
>>> meta.set("mass", "10", "kg")
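The difference between add() and set() can be shown on an in-memory list of triples. SketchMeta is a hypothetical model written for this illustration; it talks to no server and ignores permissions and the blacklist.

```python
class SketchMeta:
    """Hypothetical in-memory model of (key, value, units) metadata triples."""

    def __init__(self):
        self.entries = []

    def add(self, key, value, units=""):
        # Never overwrites: an identical triple is an error.
        if (key, value, units) in self.entries:
            raise ValueError("entry already exists")
        self.entries.append((key, value, units))

    def set(self, key, value, units=""):
        # Drop every entry with this key, then add the new one.
        self.entries = [entry for entry in self.entries if entry[0] != key]
        self.add(key, value, units)
```

Two add calls with the same key and different values leave two entries behind, whereas a later set collapses them into one.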
- to_dict(keys=None)
Convert iRODS metadata (AVUs) and system information to a python dictionary.
This dictionary can later be used to restore the metadata to an iRODS object with the from_dict() method.
Examples
>>> meta.to_dict()
{
    "name": item.name,
    "irods_id": item.id,  # iCAT database ID
    "checksum": item.checksum,  # only if the item is a data object
    "metadata": [(m.name, m.value, m.units)]
}
- Parameters:
keys (Optional[list]) – List of attribute names which should be exported to “metadata”. By default all will be exported.
- Return type:
dict
- Returns:
Dictionary containing the metadata.
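A dictionary of the shape shown in the example can be stored as JSON and read back. The sketch below uses made-up values and only the standard json module; it assumes nothing about how iBridges itself serializes the data.

```python
import json

# Made-up example values in the shape shown for to_dict().
meta_dict = {
    "name": "some_dataobj.txt",
    "irods_id": 24490075,
    "checksum": "sha2:XGiECYZOtUfP9lnCGyZaBBkBGLaJJw1p6eoc0GxLeKU=",
    "metadata": [("Author", "Ben", ""), ("Mass", "10", "kg")],
}

text = json.dumps(meta_dict, indent=2)
restored = json.loads(text)

# JSON has no tuple type, so the triples come back as lists.
triples = [tuple(entry) for entry in restored["metadata"]]
```

Code that compares restored metadata with live metadata may need this list-to-tuple conversion.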
- class ibridges.meta.MetaDataItem(ibridges_meta, prc_meta)
Bases:
object
Interface for metadata entries.
This is a substitute of the python-irodsclient iRODSMeta object. It implements setting the key/value/units, allows for sorting and can remove itself.
This class is generally created by the MetaData class, not directly created by the user.
- Parameters:
ibridges_meta – A MetaData object that the MetaDataItem is part of.
prc_meta – A PRC iRODSMeta object that points to the entry.
- property key: str
Return the key of the metadata item.
- matches(key, value, units)
See whether the metadata item matches the key,value,units pattern.
- remove()
Remove the metadata item.
- property units: str
Return the units of the metadata item.
- update(new_key, new_value, new_units='')
Update the metadata item changing the key/value/units.
- Parameters:
new_key (str) – New key to set the metadata item to.
new_value (str) – New value to set the metadata item to.
new_units (Optional[str]) – New units to set the metadata item to, optional.
- Raises:
ValueError: – If the operation could not be completed because of a permission error, or if the new item to be created already exists.
- property value: str | None
Return the value of the metadata item.
ibridges.path module
A class to handle iRODS paths.
- class ibridges.path.CachedIrodsPath(session, size, is_dataobj, checksum, *args)
Bases:
IrodsPath
Cached version of the IrodsPath.
This version should generally not be used by users, but is used for performance reasons. It will cache the size, the checksum and whether it is a data object. This cache can be invalidated when other ibridges operations are used.
- property checksum: str
See IrodsPath.
- collection_exists()
See IrodsPath.
- Return type:
bool
- dataobject_exists()
See IrodsPath.
- Return type:
bool
- property size: int
See IrodsPath.
- class ibridges.path.IrodsPath(session, *args)
Bases:
object
A class analogous to the pathlib.Path for accessing iRODS data.
The IrodsPath can be used in much the same way as a Path from the pathlib library. Not all methods and attributes are implemented, and some methods/attributes behave subtly differently from the pathlib implementation. These differences mostly have to do with the expansion of the home directory. With the IrodsPath, the ‘~’ is used to denote the irods_home directory set in the Session object. So, for example, the name of an irods path is always the name of the collection/subcollection, which is different from the pathlib behavior in some cases.
- absolute()
Return the absolute path.
This method does the expansion of the ‘~’ and ‘.’ symbols.
- Return type:
IrodsPath
- Returns:
The absolute IrodsPath, without any ‘~’ or ‘.’.
Examples
>>> IrodsPath(session, "~").absolute()
IrodsPath(/, zone, home, user)
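The expansion performed by absolute() can be approximated with PurePosixPath. The helper expand is hypothetical, and treating '.' and other relative paths as relative to a working collection that defaults to the home collection is an assumption made for this sketch.

```python
from pathlib import PurePosixPath


def expand(path, home, cwd=None):
    """Expand a leading '~' to home; resolve other relative paths against cwd."""
    cwd = home if cwd is None else cwd  # assumption: cwd defaults to home
    parts = PurePosixPath(path).parts
    if parts and parts[0] == "~":
        return PurePosixPath(home, *parts[1:])
    if parts and parts[0] == "/":
        return PurePosixPath(*parts)  # already absolute
    return PurePosixPath(cwd, *parts)  # '.' has no parts and maps to cwd
```

This also shows why the name of "~" differs from pathlib: after expansion the last part is the home collection, not the literal "~".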
- property checksum: str
Checksum of the data object.
If not calculated yet, it will be computed on the server.
- Return type:
The checksum of the data object.
- Raises:
DoesNotExistError: – When the path does not exist.
NotADataObjectError: – When the path points to a collection.
Examples
>>> IrodsPath(session, "~/some_dataobj.txt").checksum
'sha2:XGiECYZOtUfP9lnCGyZaBBkBGLaJJw1p6eoc0GxLeKU='
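The example value has the form "sha2:" followed by base64 text. Assuming this is the base64-encoded SHA-256 digest of the content (an assumption about the server's checksum scheme, which can be configured differently), a comparable string for local bytes could be computed as follows. The function name sha2_checksum is hypothetical.

```python
import base64
import hashlib


def sha2_checksum(data: bytes) -> str:
    """Return 'sha2:' plus the base64-encoded SHA-256 digest of data."""
    digest = hashlib.sha256(data).digest()
    return "sha2:" + base64.b64encode(digest).decode("ascii")
```

Comparing such a locally computed string with the checksum property is one way to picture how unchanged files can be skipped.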
- property collection: iRODSCollection
Instantiate an iRODS collection.
- Raises:
NotADirectoryError: – If the path points to a dataobject and not a collection.
CollectionDoesNotExistError: – If the path does not point to a dataobject or a collection.
- Returns:
Instance of the collection with path.
- Return type:
iRODSCollection
Examples
>>> IrodsPath(session, "~/some_collection").collection
<iRODSCollection 21260050 b'some_collection'>
- collection_exists()
Check if the path points to an iRODS collection.
- Return type:
bool
Examples
>>> IrodsPath(session, "~/does_not_exist").collection_exists()
False
>>> IrodsPath(session, "~/some_dataobj").collection_exists()
False
>>> IrodsPath(session, "~/some_collection").collection_exists()
True
- static create_collection(session, coll_path)
Create a collection and all parent collections that do not exist yet.
- Parameters:
session – Session for which the collection is created.
coll_path (
Union
[IrodsPath
,str
]) – Irods path to the collection to be created.
- Raises:
PermissionError: – If the collection cannot be created due to insufficient permissions.
- Returns:
The newly created collection.
- Return type:
collection
Examples
>>> IrodsPath.create_collection(session, "/zone/home/user/some_collection")
>>> IrodsPath.create_collection(session, IrodsPath(session, "~/some_collection"))
- property dataobject: iRODSDataObject
Instantiate an iRODS data object.
- Raises:
NotADataObjectError: – If the path is pointing to a collection and not a data object.
- Returns:
Instance of the data object with path.
- Return type:
iRODSDataObject
Examples
>>> IrodsPath(session, "~/some_dataobj.txt").dataobject
<iRODSDataObject 24490075 some_dataobj.txt>
- dataobject_exists()
Check if the path points to an iRODS data object.
- Return type:
bool
Examples
>>> IrodsPath(session, "~/does_not_exist").dataobject_exists()
False
>>> IrodsPath(session, "~/some_collection").dataobject_exists()
False
>>> IrodsPath(session, "~/some_dataobj").dataobject_exists()
True
- exists()
Check if the path already exists on the iRODS server.
- Return type:
bool
Examples
>>> IrodsPath(session, "~/does_not_exist").exists()
False
>>> IrodsPath(session, "~/some_collection").exists()
True
>>> IrodsPath(session, "~/some_dataobj").exists()
True
- joinpath(*args)
Concatenate another path to this one.
- Return type:
IrodsPath
- Returns:
The concatenated path.
Examples
>>> IrodsPath(session, "~").joinpath("x", "y")
IrodsPath(~, x, y)
- property meta: MetaData
Metadata linked to the dataobject or collection.
- Return type:
The Metadata object pertaining to the dataobject or collection.
- Raises:
DoesNotExistError: – When the path does not point to a data object or collection.
- property name: str
Return the name of the data object or collection.
- Return type:
The name of the object/collection, similar to pathlib.
Examples
>>> IrodsPath(session, "/zone/home/user").name
'user'
- open(mode='r', **kwargs)
Open a data object for reading or writing.
- Parameters:
mode – Whether to read or write, by default “r” meaning read. To write to a data object, use “w”; appending to a data object can be done with “a”. Note that opening data objects is always done in binary mode, so to write a string you need to encode it, while reading a string from a data object requires you to decode it. You are advised to use a consistent (utf-8) encoding for all your data objects.
kwargs – Extra keyword arguments for the python-irodsclient to parse.
- Returns:
A file handle to be used for reading or writing.
- Raises:
NotADataObjectError: – When the IrodsPath points to a collection.
DataObjectDoesNotExistError: – When the data object does not exist and the read mode is given.
Examples
>>> ipath = IrodsPath(session, "some_obj.txt")
>>> with ipath.open("w") as handle:
...     handle.write("This is a test string".encode("utf-8"))
>>> with ipath.open("r") as handle:
...     print(handle.read().decode("utf-8"))
This is a test string
>>> with ipath.open("a") as handle:
...     handle.write("A string appended at the end.".encode("utf-8"))
- property parent: IrodsPath
Return the parent directory of the current directory.
- Return type:
The parent just above the current directory
Examples
>>> IrodsPath(session, "/zone/home/user").parent
IrodsPath("/", "zone", "home")
>>> IrodsPath(session, "~").parent
IrodsPath("/", "zone", "home")
- relative_to(other)
Calculate this path relative to another path.
Can only calculate the relative path compared to another irods path.
- Return type:
PurePosixPath
Examples
>>> IrodsPath(session, "~/col/dataobj.txt").relative_to(IrodsPath(session, "~"))
PurePosixPath(col, dataobj.txt)
>>> IrodsPath(session, "~/col/dataobj.txt").relative_to(IrodsPath(session, "~/col"))
PurePosixPath(dataobj.txt)
- remove()
Remove the data behind an iRODS path.
- Raises:
PermissionError: – If the user has insufficient permission to remove the data.
Examples
>>> IrodsPath(session, "/zone/home/user/some_collection").remove()
- rename(new_name)
Change the name or the path of a data object or collection.
New collections on the path will be created.
- Parameters:
new_name (str or IrodsPath) – new name or a new full path
- Raises:
ValueError: – If the new path already exists, or the path is in a different zone.
PermissionError: – If the new collection cannot be created.
DoesNotExistError: – If the path does not exist.
- Return type:
Examples
>>> IrodsPath(session, "~/some_collection").rename("~/new_collection")
- property size: int
Collect the sizes of a data object or a collection.
- Returns:
Total size [bytes] of the iRODS object or all iRODS objects in the collection.
- Return type:
int
- Raises:
FileNotFoundError: – If the path is neither a collection nor a data object.
Examples
>>> IrodsPath(session, "~/some_collection").size
12345
>>> IrodsPath(session, "~/some_dataobj.txt").size
623
- walk(depth=None, include_base_collection=True)
Walk on a collection.
This iterates over all collections and data objects under the path. If the path is pointing to a data object, it will simply yield this data object.
- Parameters:
depth (int) – The maximum depth relative to the starting collection over which is walked. For example, if depth equals 1, then it will iterate only over the subcollections and data objects directly under the starting collection.
include_base_collection (bool) – Whether to yield the collection to be walked over or not. By default this is True, conforming to the os.walk behavior.
- Return type:
Iterable[IrodsPath]
- Returns:
Generator that generates all data objects and subcollections in the collection.
Examples
>>> for ipath in IrodsPath(session, "~").walk():
...     print(ipath)
IrodsPath(~, x)
IrodsPath(~, x, y)
IrodsPath(~, x, y, z.txt)
>>> for ipath in IrodsPath(session, "~").walk(depth=1):
...     print(ipath)
IrodsPath(~, x)
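The meaning of the depth parameter can be illustrated on a flat list of paths. The function below is a hypothetical local model (it never yields the base collection itself and needs no server); it only shows how the level of a path relative to the starting collection limits the walk.

```python
from pathlib import PurePosixPath


def walk(all_paths, base, depth=None):
    """Yield the paths below base, at most `depth` levels down (None: no limit)."""
    base = PurePosixPath(base)
    for raw in sorted(all_paths):
        path = PurePosixPath(raw)
        if base not in path.parents:
            continue  # not inside the collection being walked
        level = len(path.parts) - len(base.parts)
        if depth is None or level <= depth:
            yield path
```

With depth=1 only the direct children of the starting collection are produced, matching the second example above.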
ibridges.permissions module
Set and modify permissions.
- class ibridges.permissions.Permissions(session, item)
Bases:
object
iRODS permissions operations.
This class allows the user to retrieve the permissions as well as set them (if the iRODS server allows this).
- Parameters:
session – Session with the connection to the iRODS server.
item – Data object or collection to create or adjust the permissions for.
- property available_permissions: dict
Get available permissions.
- set(perm, user=None, zone=None, recursive=False, admin=False)
Set permissions (ACL) for an iRODS collection or data object.
- Return type:
None
ibridges.resources module
Resource operations.
- class ibridges.resources.Resources(session)
Bases:
object
iRODS Resource operations.
On many systems, the selection and management of resources is done completely server side. In this case, the user will not need to worry about using the Resources class.
- Parameters:
session (Session) – Instance of the Session class
- get_free_space(resc_name)
Determine free space in a resource hierarchy.
- Parameters:
resc_name (str) – Name of monolithic resource or the top of a resource tree.
- Return type:
int
- Returns:
Number of bytes free in the resource hierarchy. On some iRODS servers, the server does not report the available storage space, but instead will return: -1 if the resource does not exist (typo or otherwise), or 0 if no free space has been set in the whole resource tree starting at node resc_name.
- get_resource(resc_name)
Instantiate an iRODS resource.
- Parameters:
resc_name (str) – Name of the iRODS resource.
- Returns:
Instance of the resource with resc_name.
- Return type:
iRODSResource
- Raises:
irods.exception.ResourceDoesNotExist: – If the resource does not exist.
- get_resource_children(resc)
Get all the children for the resource resc.
- Parameters:
resc (iRODSResource) – iRODS resource instance.
- Return type:
list
- Returns:
Instances of child resources.
- resources(update=False)
iRODS resources and their metadata.
- Parameters:
update (bool) – Fetch information from iRODS server and overwrite _resources.
- Return type:
dict
- Returns:
Name, parent, status, context, and free_space of all resources.
NOTE: free_space of a resource is the free_space annotated, if so annotated, otherwise it is the sum of the free_space of all its children.
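The rule in the note (use the annotated value if there is one, otherwise sum over the children) is a small recursion. The sketch below uses plain dictionaries with hypothetical resource names; it is not the library implementation.

```python
def free_space(name, annotated, children):
    """Annotated free space if present, else the sum over all children."""
    if annotated.get(name) is not None:
        return annotated[name]
    return sum(
        free_space(child, annotated, children)
        for child in children.get(name, [])
    )
```

A resource without annotation and without children ends up with 0 in this model.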
- property root_resources: list[tuple]
Filter resources for all root resources.
Data can only be written to root resources. Return their names, their status and their free space.
- Return type:
List containing [(resource_name, status, free_space, context)]
ibridges.rules module
Rule operations.
- ibridges.rules.execute_rule(session, rule_file, params, output='ruleExecOut', instance_name='irods_rule_engine_plugin-irods_rule_language-instance', **kwargs)
Execute an iRODS rule.
iRODS rules are a very powerful way to interact with an iRODS server. This is a more advanced use case that most users will not need.
- Parameters:
session (Session) – The iRODS session with a connection to the iRODS server.
rule_file (Optional[str]) – Name of the iRODS rule file, or a file-like object representing it.
params (Optional[dict]) – Rule input variable(s).
output (str) – Rule output variable(s).
instance_name (str) – Changes between irods rule language and python rules.
kwargs – Optional irods rule parameters. For more information: https://github.com/irods/python-irodsclient
- Return type:
tuple
- Returns:
Tuple containing (stdout, stderr) for the execution of the rule.
Examples
>>> # Notice extra quotes for string literals
>>> params = {
...     '*obj': '"/zone/home/user"',
...     '*name': '"attr_name"',
...     '*value': '"attr_value"'
... }
>>> execute_rule(session, rule_file, params)
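The extra quotes in the example are easy to forget, so a small helper can add them. quote_rule_params is a hypothetical convenience function, not part of iBridges; it handles string values only and does not escape double quotes embedded in a value.

```python
def quote_rule_params(**values):
    """Build a params dict: '*' prefix on names, extra double quotes on strings."""
    params = {}
    for name, value in values.items():
        if not isinstance(value, str):
            raise TypeError("this sketch only handles string values")
        params[f"*{name}"] = f'"{value}"'  # string literals need their own quotes
    return params
```

The result has the same shape as the params dictionary written out by hand in the example above.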
ibridges.search module
Search for data and metadata on the iRODS server.
- class ibridges.search.MetaSearch(key=Ellipsis, value=Ellipsis, units=Ellipsis)
Bases:
MetaSearch
Named tuple to search for objects and collections.
The key, value and units default to the ellipsis (...), which indicates that the search accepts anything for this slot. This is in principle the same as using the iRODS wildcard ‘%’ symbol, except that using ellipses for all of key, value and units during creation will raise a ValueError. Note that the None value has a different meaning, where it will actually test for the entry being None/empty.
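The behavior described here, a named tuple whose slots default to the ellipsis and which refuses to be created fully unconstrained, can be re-created in a few lines. SketchSearch is a hypothetical illustration and not the library class.

```python
from collections import namedtuple

_Base = namedtuple("_Base", ["key", "value", "units"])


class SketchSearch(_Base):
    """Hypothetical re-creation of the behavior described for MetaSearch."""

    __slots__ = ()

    def __new__(cls, key=..., value=..., units=...):
        if key is ... and value is ... and units is ...:
            raise ValueError("constrain at least one of key, value, units")
        return super().__new__(cls, key, value, units)
```

Because None is a regular value in this model, SketchSearch(key="k", units=None) is distinct from leaving units at the ellipsis.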
- ibridges.search.search_data(session, path=None, path_pattern=None, checksum=None, metadata=None, item_type=None, case_sensitive=False)
Search for collections, data objects and metadata.
By default all accessible collections and data objects are returned. It is also possible to find items with specific metadata, using wild cards. The wildcard used in the iRODS universe is %, not *.
- Parameters:
session (Session) – Session to search with.
path (Union[str, IrodsPath, None]) – IrodsPath to the collection to search in; the collection itself will not be considered. By default the home collection is searched.
path_pattern (Optional[str]) – Search pattern in the path to look for. Allows for the ‘%’ wildcard. For example, use ‘%.txt’ to look for all txt data objects.
checksum (Optional[str]) – Checksum of the dataobject, wildcard ‘%’ can be used. If left out, no checksum will be matched.
metadata (Union[None, MetaSearch, list[MetaSearch], list[tuple]]) – Metadata triples that constrain the key, value and units of the results. For example, to get only items having a metadata entry with value “x”, use MetaSearch(value="x"). See MetaSearch for more detail. You can also provide a list of constraints. Then each of these constraints will have to be satisfied for the data item to show up in the results.
item_type (Optional[str]) – Type of the item to search for, by default None indicating both data objects and collections are returned. Set to “data_object” for data objects and “collection” for collections.
case_sensitive (bool) – Case sensitive search for paths and metadata. Default: False
- Raises:
ValueError: – If no search criterion is supplied.
- Return type:
list[CachedIrodsPath]
- Returns:
List of CachedIrodsPaths. The CachedIrodsPaths for data objects contain the size and the checksum found in the search.
Examples
>>> # Find data objects and collections
>>> search_data(session, "/path/to/sub/col", path_pattern="somefile.txt")
>>> # Find data ending with .txt in your home and on a collection path with the substring "sub"
>>> search_data(session, path_pattern="%.txt")
>>> search_data(session, path_pattern="%sub/%.txt")
>>> # Find all data objects with a specific checksum in the home collection
>>> search_data(session, checksum="sha2:wW+wG+JxwHmE1uXEvRJQxA2nEpVJLRY2bu1KqW1mqEQ=")
[IrodsPath(/, somefile.txt), IrodsPath(/, someother.txt)]
>>> # Checksums can have wildcards as well, but beware of collisions:
>>> search_data(session, checksum="sha2:wW+wG%")
[IrodsPath(/, somefile.txt), IrodsPath(/, someother.txt)]
>>> # Find data objects and collections with some metadata key
>>> search_data(session, metadata=MetaSearch(key="some_key"))
>>> # Search for data labeled with several metadata constraints
>>> search_data(session, metadata=[MetaSearch("some_key"), MetaSearch(value="other_value")])
>>> # Find data from metadata values using the wildcard
>>> # Will find all data and collections with e.g. "my_value" and "some_value"
>>> search_data(session, metadata=MetaSearch(value="%_value"))
>>> # Find data using metadata units
>>> search_data(session, metadata=MetaSearch(units="kg"))
>>> # Different conditions can be combined, only items for which all is True will be returned
>>> search_data(session, path_pattern="%.txt", metadata=MetaSearch(key="some_key", units="kg"))
>>> # Find data without units
>>> search_data(session, metadata=MetaSearch(units=None))
>>> # Find only data objects
>>> search_data(session, path_pattern="x%", item_type="data_object")
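To get a feeling for how the ‘%’ wildcard in path_pattern, checksum and metadata differs from the shell-style ‘*’, the matching can be imitated locally. The sketch below is an approximation for illustration only: `irods_like_to_regex` and `matches` are hypothetical helpers, only ‘%’ is modelled, and the real matching is done by the iRODS server.

```python
import re


def irods_like_to_regex(pattern: str, case_sensitive: bool = False) -> re.Pattern:
    """Translate a pattern with '%' wildcards into a compiled regular expression.

    Every '%' matches any sequence of characters (including none); all other
    characters, such as '*', are matched literally.
    """
    literal_parts = (re.escape(part) for part in pattern.split("%"))
    flags = re.DOTALL if case_sensitive else re.DOTALL | re.IGNORECASE
    return re.compile(".*".join(literal_parts), flags)


def matches(pattern: str, text: str, case_sensitive: bool = False) -> bool:
    """Check whether the whole text matches the '%' pattern."""
    return irods_like_to_regex(pattern, case_sensitive).fullmatch(text) is not None


print(matches("%.txt", "/zone/home/user/notes.txt"))
print(matches("*.txt", "/zone/home/user/notes.txt"))
```

The `case_sensitive` flag of the sketch defaults to False, in line with the default of `search_data`.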
ibridges.session module
For creating sessions non-interactively.
- exception ibridges.session.LoginError
Bases:
AttributeError
Error indicating a failure to log into the iRODS server due to the configuration.
- exception ibridges.session.PasswordError
Bases:
ValueError
Error indicating failure to log into the iRODS server due to wrong or outdated password.
- class ibridges.session.Session(irods_env, password=None, irods_home=None, cwd=None)
Bases:
object
Session to connect and perform operations on the iRODS server.
When the session is initialized, you are successfully connected to the iRODS server. Most likely you will need to supply a password to the initialization routine. This can be problematic from a security standpoint (the password might be recorded for others to see). In this case, you should use the ibridges.interactive.interactive_auth() function, which will ask for your password and not store it.
The Session object is a context manager, so using it with the with statement is generally preferred, see examples below. Otherwise, the user is responsible for closing the connection using the close() method.
- Parameters:
irods_env (Union[dict, str, Path]) – iRODS environment (irods_environment.json) file, or a dictionary containing its contents.
password (Optional[str]) – Pass the password as a string. By default None, in which case it will try to use the cached password. If this fails, the initialization will fail and throw an exception.
irods_home (Optional[str]) – Override the home directory of iRODS. Otherwise attempt to retrieve the value from the iRODS environment dictionary. If it is not there either, then use /{zone}/home/{username}.
- Raises:
FileNotFoundError: – If the irods_env parameter is interpreted as a file name and not found.
TypeError: – If the irods_env parameter is not a dict, str or Path.
LoginError: – If the connection to the iRODS server fails to establish.
Examples
>>> session = Session(Path.home() / ".irods" / "irods_environment.json",
>>>                   password="your_password", irods_home="/zone/home/user")
>>> session = Session(env_dictionary)  # env_dictionary with connection info
>>> with Session("irods_environment.json") as session:
>>>     # Do operations with the session here.
>>>     # The session will be automatically closed on finish/error.
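The fallback order for the home collection described for the irods_home parameter can be sketched as follows. This is an illustration under assumptions, not the code of Session: `resolve_irods_home` is a hypothetical function, and it assumes the environment dictionary uses the usual irods_environment.json keys `irods_home`, `irods_zone_name` and `irods_user_name`.

```python
from typing import Optional


def resolve_irods_home(irods_env: dict, irods_home: Optional[str] = None) -> str:
    """Pick the home collection (hypothetical helper, not part of ibridges).

    Order: explicit override, then the environment dictionary,
    then the conventional /{zone}/home/{username}.
    """
    if irods_home is not None:
        return irods_home
    if "irods_home" in irods_env:
        return irods_env["irods_home"]
    return f"/{irods_env['irods_zone_name']}/home/{irods_env['irods_user_name']}"


env = {"irods_zone_name": "zone", "irods_user_name": "user"}
print(resolve_irods_home(env))
print(resolve_irods_home(env, irods_home="/zone/home/shared"))
```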
- authenticate_using_auth_file()
Authenticate with an authentication file.
Internal use only.
- Return type:
iRODSSession
- authenticate_using_password()
Authenticate with the iRODS server using a password.
Internal use only.
- Return type:
iRODSSession
- close()
Disconnect the iRODS session.
This closes the connection, and makes the session available for reconnection with the
connect()
method.
- connect()
Establish an iRODS session.
Users generally don’t need to call this connect function manually, except if they called close() explicitly and want to reconnect. If you call the connect method multiple times without disconnecting, this might result in stale connections to the iRODS server.
- Return type:
iRODSSession
- Returns:
A python-irodsclient session. This is also stored in the ibridges.Session object itself, so users do not need to store this session themselves.
- property cwd: str
Current working directory for irods.
This is the current working directory, relative to which other IrodsPaths are resolved. By default this is the same as your working directory. In IrodsPaths, a path relative to the current working directory can be denoted by ‘.’.
- Returns:
The current working directory in the current session.
- Return type:
str
Examples
>>> session.cwd
/zone/home/user
- property default_resc: str
Default resource name from iRODS environment.
- Returns:
Name of the default resource.
- Return type:
str
- get_user_info()
Query for user type and groups.
- Return type:
tuple[list, list]
- Returns:
Tuple containing (iRODS user type names, iRODS group names)
- has_valid_irods_session()
Check if the iRODS session is valid.
- Returns:
True if the session is valid, False otherwise.
- Return type:
bool
- property home: str
Home directory for irods.
In the iRODS community this is known as ‘irods_home’, in file system terms it would be your home directory.
- Returns:
The home directory in the current session.
- Return type:
str
Examples
>>> session.home
/zone/home/user
- classmethod network_check(hostname, port)
Check connectivity to an iRODS server.
This method attempts to reach the iRODS server, without supplying any user credentials.
- Parameters:
hostname (str) – FQDN/IP of an iRODS server.
port (int) – Port on which to connect to the server.
- Return type:
bool
- Returns:
True if a connection to hostname is possible, False otherwise.
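A credential-free reachability check of this kind can be approximated with a plain TCP connection attempt. The sketch below only uses the Python standard library; `can_reach` is a hypothetical function and makes no claim about how network_check is implemented.

```python
import socket


def can_reach(hostname: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port can be opened."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and host names that do not resolve.
        return False


# Example call with a placeholder host name:
# can_reach("irods.example.org", 1247)
```

A successful TCP connection only shows that something is listening on that port; it does not prove that the service behind it is an iRODS server.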
- property server_version: tuple
Retrieve version of the iRODS server.
- Returns:
Server version as (major, minor, patch).
- Return type:
tuple
- write_pam_password()
Store the password in the iRODS authentication file in obfuscated form.
Internal use only.
ibridges.tickets module
Ticket operations.
- class ibridges.tickets.TicketData(name, type, path, expiration_date)
Bases:
tuple
- expiration_date
Alias for field number 3
- name
Alias for field number 0
- path
Alias for field number 2
- type
Alias for field number 1
- class ibridges.tickets.Tickets(session)
Bases:
object
iRODS Ticket operations.
Tickets allow users to give temporary access to other users. These tickets are stored on the iRODS server, and can be deleted whenever the access is not needed anymore.
- Parameters:
session (
Session
) – Session connecting to the iRODS server.
- property all_ticket_strings: list[str]
Get the names of all tickets.
- clear()
Delete all tickets.
This revokes all access to data objects and collections that was granted through these tickets.
- create_ticket(irods_path, ticket_type='read', expiry_date=None)
Create an iRODS ticket.
This allows read or write access to the object referenced by irods_path.
- Parameters:
irods_path (Union[str, IrodsPath]) – Collection or data object path to create a ticket for.
ticket_type (str, optional) – read or write, default read
expiry_date (Union[str, datetime, date, None], optional) – Expiration date as a datetime, date or string in the form strftime('%Y-%m-%d.%H:%M:%S').
- Raises:
TypeError: – If the expiry_date has the wrong type.
ValueError: – If the expiration date cannot be set for whatever reason.
- Returns:
Tuple (str, bool) with the name of the ticket and whether the expiration date was successfully set.
- Return type:
tuple
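The accepted forms of expiry_date can be normalised to the documented string format with the standard library. The following is a minimal sketch, not the ibridges implementation: `format_expiry` is a hypothetical helper, and interpreting a plain date as midnight at the start of that day is an assumption made here.

```python
from datetime import date, datetime
from typing import Optional, Union

EXPIRY_FORMAT = "%Y-%m-%d.%H:%M:%S"


def format_expiry(expiry_date: Union[str, datetime, date, None]) -> Optional[str]:
    """Normalise an expiry date to the '%Y-%m-%d.%H:%M:%S' string form."""
    if expiry_date is None:
        return None
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(expiry_date, datetime):
        return expiry_date.strftime(EXPIRY_FORMAT)
    if isinstance(expiry_date, date):
        # Assumption: a plain date means 00:00:00 on that day.
        return datetime.combine(expiry_date, datetime.min.time()).strftime(EXPIRY_FORMAT)
    if isinstance(expiry_date, str):
        # Raises ValueError if the string does not follow the expected format.
        datetime.strptime(expiry_date, EXPIRY_FORMAT)
        return expiry_date
    raise TypeError(f"Unsupported type for expiry date: {type(expiry_date).__name__}")


print(format_expiry(datetime(2030, 1, 2, 3, 4, 5)))
```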
- delete_ticket(ticket, check=False)
Delete iRODS ticket.
This revokes the access that was granted with the ticket.
- Parameters:
ticket (Union[str, Ticket]) – Ticket or ticket string identifier to be deleted.
check (bool) – Whether to check that the ticket actually exists.
- Raises:
KeyError: – If check == True and the ticket does not exist.
- fetch_tickets()
Retrieve all tickets and their metadata belonging to the user.
- Parameters:
update (bool) – Refresh information from server.
- Return type:
list[TicketData]
- Returns:
A list of all available tickets: [(ticket string, ticket type, irods obj/coll path, expiry date in epoch time)]
- get_ticket(ticket_str)
Obtain a ticket using its string identifier.
- Parameters:
ticket_str (str) – Unique string identifier with which the ticket can be retrieved.
- Raises:
KeyError: – If the ticket cannot be found.
- Return type:
Ticket
- Returns:
Ticket with the correct identifier.
ibridges.executor module
Operations to be performed for upload/download/sync.
- class ibridges.executor.Operations(resc_name=None, options=None)
Bases:
object
Storage for all data and metadata operations.
This class should generally not be used directly by the user to create and execute the data and metadata operations. Instead, the user should use the upload/download/sync operations from the ibridges.data_operations module. The return value of these functions is an Operations instance. This can be useful in the case of a dry-run, since the user can print the operations to be performed with print_summary() and execute them if necessary with execute().
Examples
>>> ops = upload(session, "some_directory", ipath, dry_run=True)
>>> ops.print_summary()  # Check which basic operations will be performed.
>>> ops.execute(session)  # Execute the upload operation.
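The collect-first, execute-later idea behind a dry-run can be shown with a small local analogue that copies a directory tree. This is only a sketch of the pattern: `PlannedCopies` and `plan_copy_tree` are hypothetical names, they work purely on the local file system, and they do not reflect the internals of the Operations class.

```python
import shutil
import tempfile
from pathlib import Path


class PlannedCopies:
    """Hypothetical container that stores operations until execute() is called."""

    def __init__(self):
        self.create_dir = []  # directories to create
        self.copy = []  # (source, destination) pairs

    def add_create_dir(self, new_dir: Path) -> None:
        self.create_dir.append(Path(new_dir))

    def add_copy(self, source: Path, destination: Path) -> None:
        self.copy.append((Path(source), Path(destination)))

    def summary(self) -> str:
        lines = [f"create directory: {new_dir}" for new_dir in self.create_dir]
        lines += [f"copy: {source} -> {dest}" for source, dest in self.copy]
        return "\n".join(lines)

    def execute(self) -> None:
        # Directories first, so every copy has an existing parent.
        for new_dir in self.create_dir:
            new_dir.mkdir(parents=True, exist_ok=True)
        for source, dest in self.copy:
            shutil.copy2(source, dest)


def plan_copy_tree(source_dir: Path, target_dir: Path) -> PlannedCopies:
    """Plan the copy of a directory tree without touching the target."""
    ops = PlannedCopies()
    ops.add_create_dir(target_dir)
    for path in sorted(Path(source_dir).rglob("*")):
        target = Path(target_dir) / path.relative_to(source_dir)
        if path.is_dir():
            ops.add_create_dir(target)
        else:
            ops.add_copy(path, target)
    return ops


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    source.mkdir()
    (source / "data.txt").write_text("example")
    ops = plan_copy_tree(source, Path(tmp) / "target")
    print(ops.summary())  # Inspect the plan first.
    ops.execute()  # Then perform it.
```

Separating planning from execution lets the user inspect the plan before anything is changed, which is the same benefit that dry_run=True offers for upload, download and sync.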
- add_create_coll(new_col)
Add operation to create a new collection.
- Parameters:
new_col (
IrodsPath
) – IrodsPath that points to the new collection to be created.
- add_create_dir(new_dir)
Add operation to create a new directory.
- Parameters:
new_dir (
Path
) – Directory to be created.
- add_download(ipath, lpath)
Add operation to download a data object.
- Parameters:
ipath (IrodsPath) – IrodsPath for the data object to download.
lpath (Path) – Local path for the data to be stored in.
- add_meta_download(root_ipath, ipath, meta_fp)
Add operation for downloading metadata archives.
This basic operation adds one IrodsPath, pointing to either a collection or a data object, for metadata archiving.
- add_meta_upload(root_ipath, meta_fp)
Add operation to use a metadata archive.
This basic operation adds one metadata archive to be applied to a collection and its subcollections and data objects. It assumes that the data tree structure is the same for the metadata archive as for the destination iRODS path. If this is not the case, you will get errors during the execution of the operation.
- Parameters:
root_ipath (IrodsPath) – Root iRODS path to which all paths are relative.
meta_fp (Union[str, Path]) – File that contains the metadata.
- add_upload(lpath, ipath)
Add operation to upload a data object.
- Parameters:
lpath (Path) – Local path for the file to be uploaded.
ipath (IrodsPath) – Destination IrodsPath for the data object to be created.
- execute(session, ignore_err=False, progress_bar=True)
Execute all added operations.
This also creates a progress bar to see the status updates.
- Parameters:
session (Session) – Session to perform the operations with.
ignore_err (bool, optional) – Whether to ignore errors when encountered, by default False. Note that not all errors will be ignored.
progress_bar (bool) – Whether to turn on the progress bar. The progress bar will be disabled if the total download + upload size is 0 regardless.
- execute_create_coll(session)
Execute all create collection operations.
- Parameters:
session (
Session
) – Session to create the collections with.
- execute_create_dir()
Execute all create directory operations.
- Raises:
PermissionError – If the path to the directory already exists and is not a directory.
- execute_download(session, pbar, ignore_err=False)
Execute all download operations.
- Parameters:
session (Session) – Session to perform the downloads with.
down_sizes – Sizes of the data objects to be downloaded.
pbar (Optional[tqdm]) – The progress bar to be updated.
ignore_err (bool, optional) – Whether to ignore errors when encountered, by default False.
- execute_meta_download()
Execute all metadata download operations.
- execute_meta_upload()
Execute all metadata upload operations.
- Parameters:
session – Session to use for the metadata upload operations.
- execute_upload(session, pbar, ignore_err=False)
Execute all upload operations.
- Parameters:
session (Session) – Session to perform the uploads with.
up_sizes – Sizes of the files to be uploaded.
pbar (Optional[tqdm]) – Progress bar to be updated while uploading.
ignore_err (bool) – Whether to ignore errors when encountered, by default False.
- print_summary()
Print a summary of all the operations added to the object.
ibridges.util module
Utilities to work with dataobjects and collections.
- ibridges.util.calc_checksum(filepath, checksum_type='sha2')
Calculate the checksum for an iRODS dataobject or local file.
- Parameters:
filepath (Union[Path, str, IrodsPath]) – Can be either a local path, or an iRODS path. If filepath is a string, it will be assumed to be a local path.
checksum_type – Checksum type to calculate, only sha2 and md5 are currently supported. Ignored for IrodsPaths, since that is configured by the server.
- Returns:
The base64 encoding of the sha256 sum of the object, prefixed by ‘sha2:’.
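For a local file, a checksum in the described form (‘sha2:’ followed by the base64-encoded SHA-256 digest) can be computed with the standard library alone. The sketch below is an illustration, not the code of calc_checksum; `sha2_checksum` and its `chunk_size` parameter are hypothetical.

```python
import base64
import hashlib
import tempfile
from pathlib import Path


def sha2_checksum(filepath, chunk_size: int = 1024 * 1024) -> str:
    """Return 'sha2:' plus the base64-encoded SHA-256 digest of a local file."""
    digest = hashlib.sha256()
    with open(filepath, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha2:" + base64.b64encode(digest.digest()).decode("ascii")


with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.txt"
    demo_file.write_bytes(b"hello")
    print(sha2_checksum(demo_file))
```

A string in this form can be compared with the checksum values shown in the search_data examples, which use the same ‘sha2:’ prefix.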
- ibridges.util.checksums_equal(remote_path, local_path)
Check whether remote and local paths have the same checksum.
- Parameters:
remote_path (IrodsPath) – Remote path to calculate the checksum for.
local_path (Union[Path, str]) – Local path to compute the checksum for.
- Returns:
Whether the two have equal checksums. The type of checksum done depends on what is configured on the remote server.
- ibridges.util.find_environment_provider(env_providers, server_name)
Find the provider that provides the right template.
- Parameters:
env_providers (list) – A list of all installed environment providers.
server_name (str) – Name of the server for which the template is to be found.
- Return type:
object
- Returns:
The provider that contains the template.
- Raises:
ValueError – If the server_name identifier can’t be found in the providers.
- ibridges.util.get_collection(session, path)
Instantiate an iRODS collection.
This function is deprecated, use ibridges.path.IrodsPath.collection() instead.
- Return type:
iRODSCollection
- ibridges.util.get_dataobject(session, path)
Instantiate an iRODS data object.
This function is deprecated, use ibridges.path.IrodsPath.dataobject() instead.
- Return type:
iRODSDataObject
- ibridges.util.get_environment_providers()
Get a list of all environment template providers.
- Return type:
list
- Returns:
The list that contains the providers.
- ibridges.util.get_size(session, item)
Collect the size of a data object or a collection.
This function is deprecated, use ibridges.path.IrodsPath.size() instead.
- Return type:
int
- ibridges.util.is_collection(item)
Determine if item is an iRODS collection.
This function is deprecated, use ibridges.path.IrodsPath.collection_exists() instead.
- Return type:
bool
- ibridges.util.is_dataobject(item)
Determine if item is an iRODS data object.
This function is deprecated, use ibridges.path.IrodsPath.dataobject_exists() instead.
- Return type:
bool
- ibridges.util.obj_replicas(obj)
Retrieve information about replicas (copies of the file on different resources).
It does so for a data object in the iRODS system.
- Parameters:
obj (irods.data_object.iRODSDataObject) – The data object
- Return type:
list[tuple[int, str, str, int, str]]
- Returns:
List of tuples, where each tuple describes one replica: replica index/number, name of the resource on which the replica is stored, replica checksum, replica size, replica status.
- ibridges.util.print_environment_providers(env_providers)
Print the environment providers to the screen.
- Parameters:
env_providers (
Sequence
) – A list of all installed environment providers.