BBClient Module
bbclient
Classes
BBClient
BBClient(cache_folder=DEFAULT_CACHE_FOLDER, bedbase_api=DEFAULT_BEDBASE_API)
Bases: BedCacheManager
Client for downloading files from BEDbase and caching them locally.
Args:
cache_folder (Union[str, os.PathLike]): path to the local folder used as a cache for files from BEDbase;
if not given, the value of the environment variable BBCLIENT_CACHE is used
bedbase_api (str): URL of the BEDbase API
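The fallback described for cache_folder can be sketched as below. This is an illustration only, not the library's code: the helper name `resolve_cache_folder` and the `fallback` directory are hypothetical, and only the `BBCLIENT_CACHE` variable name comes from the description above.

```python
import os
from pathlib import Path
from typing import Optional, Union


def resolve_cache_folder(
    cache_folder: Optional[Union[str, os.PathLike]] = None,
    fallback: str = "~/.bbcache",  # hypothetical fallback, not taken from the library
) -> Path:
    """Pick the cache folder: explicit argument first, then BBCLIENT_CACHE."""
    if cache_folder is None:
        cache_folder = os.environ.get("BBCLIENT_CACHE", fallback)
    return Path(cache_folder).expanduser()


# Possible usage, assuming the class is importable as shown:
# from geniml.bbclient import BBClient
# client = BBClient(cache_folder=resolve_cache_folder())
```

Resolving the folder explicitly before constructing the client makes it easy to log or inspect which cache location is actually in use.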
Functions
load_bedset
load_bedset(bedset_id)
Load a BEDset from cache, or download and add it to the cache with its BED files.
Args: bedset_id (str): unique identifier of a BED set
Returns: BedSet: the BedSet object
load_bed
load_bed(bed_id)
Load a BED file from cache, or download and cache it if it doesn't exist.
Args: bed_id (str): unique identifier of a BED file
Returns: RegionSet: the RegionSet object
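A small sketch of how load_bed and seek might be combined: load the BED file through the cache, then ask where it is stored locally. The helper name `load_and_locate` is hypothetical, and `client` is assumed to be any object exposing the two methods documented here.

```python
def load_and_locate(client, bed_id: str):
    """Load a BED file through the cache, then report its local path."""
    # Served from cache, or downloaded and cached on first use.
    region_set = client.load_bed(bed_id)
    # seek is documented to raise FileNotFoundError if the identifier is not cached.
    local_path = client.seek(bed_id)
    return region_set, local_path


# Possible usage, assuming the class is importable as shown:
# from geniml.bbclient import BBClient
# region_set, path = load_and_locate(BBClient(), "<bed_id>")
```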
add_bedset_to_cache
add_bedset_to_cache(bedset)
Add a BED set to the cache.
Args: bedset (BedSet): the BedSet object to add
Returns: str: the identifier of the BedSet object
add_bed_to_cache
add_bed_to_cache(bedfile, force=False)
Add a BED file to the cache.
Args:
bedfile (Union[RegionSet, str]): a RegionSet object or a path or url to the BED file
force (bool): whether to overwrite the existing file in cache
Returns: RegionSet: the RegionSet object of the cached BED file
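Since bedfile may be a RegionSet object, a path, or a URL, several sources can be cached in one pass. The sketch below assumes a `client` exposing add_bed_to_cache as documented; the helper name `cache_bed_sources` is hypothetical.

```python
from typing import Iterable, List


def cache_bed_sources(client, sources: Iterable, force: bool = False) -> List:
    """Add several BED files (RegionSet objects, paths, or URLs) to the cache."""
    results = []
    for source in sources:
        # force=True asks the client to overwrite a file that is already cached.
        results.append(client.add_bed_to_cache(source, force=force))
    return results
```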
add_bed_tokens_to_cache
add_bed_tokens_to_cache(bed_id, universe_id)
Add a tokenized BED file to the cache.
Args:
bed_id (str): the identifier of the BED file
universe_id (str): the identifier of the universe
Returns: str: the identifier of the tokenized BED file
load_bed_tokens
load_bed_tokens(bed_id, universe_id)
Load a tokenized BED file from cache, or download and cache it if it doesn't exist.
Args:
bed_id (str): the identifier of the BED file
universe_id (str): the identifier of the universe
Returns: Array: the zarr array of tokens
remove_tokens
remove_tokens(bed_id, universe_id)
Remove all tokenized BED files from cache
cache_tokens
cache_tokens(bed_id, universe_id, tokens)
Cache tokenized BED file.
Args:
bed_id (str): the identifier of the BED file
universe_id (str): the identifier of the universe
tokens (Union[list, Array]): the list of tokens
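Tokens are addressed by the pair (bed_id, universe_id). A round trip through cache_tokens and load_bed_tokens could look like the sketch below; the helper name `store_and_reload_tokens` is hypothetical, and `client` is assumed to expose both methods as documented.

```python
def store_and_reload_tokens(client, bed_id: str, universe_id: str, tokens):
    """Cache tokens for a (BED file, universe) pair, then read them back."""
    client.cache_tokens(bed_id, universe_id, tokens)
    # load_bed_tokens is documented to return a zarr array of tokens.
    return client.load_bed_tokens(bed_id, universe_id)
```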
add_bed_to_s3
add_bed_to_s3(identifier, bucket=DEFAULT_BUCKET_NAME, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, s3_path=DEFAULT_BUCKET_FOLDER)
Add a cached BED file to S3.
Args:
identifier (str): the unique identifier of the BED file
bucket (str): the name of the bucket
endpoint_url (str): the URL of the S3 endpoint [Default: set up by the environment vars]
aws_access_key_id (str): the access key of the AWS account [Default: set up by the environment vars]
aws_secret_access_key (str): the secret access key of the AWS account [Default: set up by the environment vars]
s3_path (str): the path on S3
Returns: str: full path on S3
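Because most S3 parameters have defaults (some taken from the environment), one way to call add_bed_to_s3 is to pass only the options that were set explicitly. The helpers `s3_overrides` and `upload_cached_bed` below are hypothetical; only the keyword names come from the signature above.

```python
from typing import Any, Dict, Optional


def s3_overrides(
    bucket: Optional[str] = None,
    endpoint_url: Optional[str] = None,
    aws_access_key_id: Optional[str] = None,
    aws_secret_access_key: Optional[str] = None,
    s3_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Keep only explicitly set options so the documented defaults apply to the rest."""
    options = {
        "bucket": bucket,
        "endpoint_url": endpoint_url,
        "aws_access_key_id": aws_access_key_id,
        "aws_secret_access_key": aws_secret_access_key,
        "s3_path": s3_path,
    }
    return {key: value for key, value in options.items() if value is not None}


def upload_cached_bed(client, identifier: str, **overrides) -> str:
    """Upload a cached BED file, overriding only the given S3 options."""
    return client.add_bed_to_s3(identifier, **s3_overrides(**overrides))
```

Leaving unset options out of the call, rather than passing None for bucket or s3_path, keeps the method's own defaults in effect.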
get_bed_from_s3
get_bed_from_s3(identifier, bucket=DEFAULT_BUCKET_NAME, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, s3_path=DEFAULT_BUCKET_FOLDER)
Get a cached BED file from S3 and cache it locally.
Args:
identifier (str): the unique identifier of the BED file
bucket (str): the name of the bucket
endpoint_url (str): the URL of the S3 endpoint [Default: set up by the environment vars]
aws_access_key_id (str): the access key of the AWS account [Default: set up by the environment vars]
aws_secret_access_key (str): the secret access key of the AWS account [Default: set up by the environment vars]
s3_path (str): the path on S3
Returns: str: the identifier of the BED file
Raises: FileNotFoundError: if the identifier does not exist in cache
seek
seek(identifier)
Get local path to BED file or BED set with specific identifier.
Args: identifier (str): the unique identifier
Returns: str: the local path of the file
Raises: FileNotFoundError: if the identifier does not exist in cache
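Since seek raises FileNotFoundError for identifiers that are not cached, callers that only want to test for presence may prefer a wrapper that returns None instead. A minimal sketch, with the hypothetical helper name `seek_or_none`:

```python
from typing import Optional


def seek_or_none(client, identifier: str) -> Optional[str]:
    """Return the local path for an identifier, or None if it is not cached."""
    try:
        return client.seek(identifier)
    except FileNotFoundError:
        return None
```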
remove_bedset_from_cache
remove_bedset_from_cache(bedset_id, remove_bed_files=False)
Remove a BED set from cache.
Args:
bedset_id (str): the identifier of the BED set
remove_bed_files (bool): whether to also remove the BED files in the BED set
Raises: FileNotFoundError: if the BED set does not exist in cache
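For cleanup scripts that should not fail when a BED set is already gone, the documented FileNotFoundError can be turned into a boolean result. The helper name `remove_bedset_if_cached` is hypothetical.

```python
def remove_bedset_if_cached(
    client, bedset_id: str, remove_bed_files: bool = False
) -> bool:
    """Remove a BED set from cache; return False if it was not cached."""
    try:
        client.remove_bedset_from_cache(bedset_id, remove_bed_files=remove_bed_files)
    except FileNotFoundError:
        return False
    return True
```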
list_beds
list_beds()
List all BED files in cache.
Returns: Dict[str, str]: a dictionary keyed by the identifiers of the cached BED files
list_bedsets
list_bedsets()
List all BED sets in cache.
Returns: Dict[str, str]: a dictionary keyed by the identifiers of the cached BED sets
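The two listing calls can be combined into a quick overview of what the cache holds. The sketch below relies only on the returned objects supporting `len()`, which holds for the documented Dict return type; the helper name `cache_summary` is hypothetical.

```python
from typing import Dict


def cache_summary(client) -> Dict[str, int]:
    """Count cached BED files and BED sets using the two listing calls."""
    return {
        "beds": len(client.list_beds()),
        "bedsets": len(client.list_bedsets()),
    }
```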
remove_bedfile_from_cache
remove_bedfile_from_cache(bedfile_id)
Remove a BED file from cache.
Args: bedfile_id (str): the identifier of the BED file
Raises: FileNotFoundError: if the BED file does not exist in cache
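Combining list_beds with remove_bedfile_from_cache gives a simple way to prune the cache down to a chosen set of files. This sketch assumes that the keys of the dictionary returned by list_beds are BED file identifiers, which the reference does not state explicitly; the helper name `prune_beds` is hypothetical.

```python
from typing import Iterable, List


def prune_beds(client, keep: Iterable[str]) -> List[str]:
    """Remove every cached BED file whose identifier is not in `keep`."""
    keep_ids = set(keep)
    removed = []
    # Assumption: the keys of list_beds() are BED file identifiers.
    # Copy the keys first so removals cannot disturb the iteration.
    for bed_id in list(client.list_beds()):
        if bed_id not in keep_ids:
            client.remove_bedfile_from_cache(bed_id)
            removed.append(bed_id)
    return removed
```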