ibridges.search.search_data
- ibridges.search.search_data(session, path=None, path_pattern=None, checksum=None, metadata=None, item_type=None, case_sensitive=False)
Search for collections, data objects and metadata.
By default all accessible collections and data objects are returned. It is also possible to find items with specific metadata, using wildcards. The wildcard used in the iRODS universe is %, not *.
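To make the wildcard rule concrete, the following is a minimal local sketch of how a '%' pattern behaves. Both helpers, glob_to_irods and like_match, are hypothetical names used only for illustration and are not part of iBridges. The matching is emulated with a regular expression on the client side, which only approximates what the iRODS server does.

```python
import re


def glob_to_irods(pattern: str) -> str:
    """Replace the shell-style '*' wildcard with the iRODS '%' wildcard.

    Hypothetical helper, not part of iBridges.
    """
    return pattern.replace("*", "%")


def like_match(pattern: str, text: str, case_sensitive: bool = False) -> bool:
    """Local emulation of '%' matching (assumption: '%' matches any run
    of characters, including an empty one; everything else is literal)."""
    # Escape each literal piece and join the pieces with ".*".
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    flags = re.DOTALL if case_sensitive else re.DOTALL | re.IGNORECASE
    return re.fullmatch(regex, text, flags) is not None


# A '*' is treated as a literal character, so it has to be converted first.
print(like_match("*.txt", "somefile.txt"))
print(like_match(glob_to_irods("*.txt"), "somefile.txt"))
```

Converting the pattern once, before it is passed as path_pattern, checksum or a metadata value, avoids searches that silently return nothing because '*' was taken literally.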
- Parameters:
session (Session) – Session to search with.
path (Union[str, IrodsPath, None]) – IrodsPath to the collection to search in; the collection itself is not considered. By default the home collection is searched.
path_pattern (Optional[str]) – Search pattern to look for in the path. Allows the ‘%’ wildcard. For example, use ‘%.txt’ to find all txt data objects.
checksum (Optional[str]) – Checksum of the data object; the ‘%’ wildcard can be used. If left out, the checksum is not used as a search criterion.
metadata (Union[None, MetaSearch, list[MetaSearch], list[tuple]]) – Metadata triples that constrain the key, value and units of the results. For example, to get only items having a metadata entry with value "x", use MetaSearch(value="x"). See MetaSearch for more detail. You can also provide a list of constraints; every one of them must then be satisfied for the item to appear in the results.
item_type (Optional[str]) – Type of the item to search for. The default, None, returns both data objects and collections. Set to "data_object" for data objects and "collection" for collections.
case_sensitive (bool) – Case-sensitive search for paths and metadata. Default: False.
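The way a list of metadata constraints and the case_sensitive flag interact can be sketched locally as follows. This is an illustration under stated assumptions, not the iBridges implementation: metadata entries are modelled as plain (key, value, units) tuples, constraints as dictionaries naming only the fields they restrict, wildcards are left out for brevity, and each constraint is assumed to be satisfiable by a different metadata entry of the same item. The names entry_matches and item_matches are hypothetical.

```python
Triple = tuple[str, str, str]  # (key, value, units)


def entry_matches(entry: Triple, constraint: dict, case_sensitive: bool = False) -> bool:
    """True if one metadata entry satisfies every field named in the constraint."""
    for field, actual in zip(("key", "value", "units"), entry):
        if field not in constraint:
            continue  # this field is left unconstrained
        wanted = constraint[field]
        if not case_sensitive:
            wanted, actual = wanted.casefold(), actual.casefold()
        if wanted != actual:
            return False
    return True


def item_matches(entries: list[Triple], constraints: list[dict],
                 case_sensitive: bool = False) -> bool:
    """True if every constraint is satisfied by at least one entry of the item."""
    return all(
        any(entry_matches(entry, constraint, case_sensitive) for entry in entries)
        for constraint in constraints
    )


# Hypothetical metadata attached to a single item.
entries = [("some_key", "x", "kg"), ("other", "other_value", "")]
print(item_matches(entries, [{"key": "some_key"}, {"value": "other_value"}]))
print(item_matches(entries, [{"key": "SOME_KEY"}], case_sensitive=True))
```

The first call succeeds because both constraints hold for the item; the second fails only because case-sensitive comparison is requested.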
- Raises:
ValueError – If no search criterion is supplied.
- Return type:
list[CachedIrodsPath]
- Returns:
List of CachedIrodsPaths. The CachedIrodsPaths for data objects contain the size and the checksum found in the search.
Examples
>>> # Find data objects and collections
>>> search_data(session, "/path/to/sub/col", path_pattern="somefile.txt")

>>> # Find data ending with .txt in your home and on a collection path with the substring "sub"
>>> search_data(session, path_pattern="%.txt")
>>> search_data(session, path_pattern="%sub/%.txt")

>>> # Find all data objects with a specific checksum in the home collection
>>> search_data(session, checksum="sha2:wW+wG+JxwHmE1uXEvRJQxA2nEpVJLRY2bu1KqW1mqEQ=")
[IrodsPath(/, somefile.txt), IrodsPath(/, someother.txt)]

>>> # Checksums can have wildcards as well, but beware of collisions:
>>> search_data(session, checksum="sha2:wW+wG%")
[IrodsPath(/, somefile.txt), IrodsPath(/, someother.txt)]

>>> # Find data objects and collections with some metadata key
>>> search_data(session, metadata=MetaSearch(key="some_key"))

>>> # Search for data labeled with several metadata constraints
>>> search_data(session, metadata=[MetaSearch("some_key"), MetaSearch(value="other_value")])

>>> # Find data from metadata values using the wildcard
>>> # Will find all data and collections with e.g. "my_value" and "some_value"
>>> search_data(session, metadata=MetaSearch(value="%_value"))

>>> # Find data using metadata units
>>> search_data(session, metadata=MetaSearch(units="kg"))

>>> # Different conditions can be combined; only items that satisfy all of them are returned
>>> search_data(session, path_pattern="%.txt", metadata=MetaSearch(key="some_key", units="kg"))

>>> # Find data without units
>>> search_data(session, metadata=MetaSearch(units=None))

>>> # Find only data objects
>>> search_data(session, path_pattern="x%", item_type="data_object")