BBClient Module

bbclient

Classes

BBClient

BBClient(cache_folder=DEFAULT_CACHE_FOLDER, bedbase_api=DEFAULT_BEDBASE_API)

Bases: BedCacheManager

BBClient downloads files from bedbase and caches them locally.

Args:
    cache_folder (Union[str, os.PathLike]): path to the local folder used as a cache for files from bedbase; if not given, it is taken from the environment variable BBCLIENT_CACHE
    bedbase_api (str): URL to bedbase
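The cache folder fallback described above can be mirrored in user code, for example to log which folder a client will end up using. The following is a minimal sketch, not the library's implementation: `resolve_cache_folder` is a hypothetical helper, and the `fallback` value used when `BBCLIENT_CACHE` is unset is an assumption made only for illustration.

```python
import os
from pathlib import Path


def resolve_cache_folder(cache_folder=None, fallback="bbclient_cache"):
    """Return the folder a client would be given (hypothetical helper).

    Order: explicit argument, then the BBCLIENT_CACHE environment variable,
    then `fallback` (an assumed value, not taken from the library).
    """
    if cache_folder is not None:
        return Path(cache_folder)
    return Path(os.environ.get("BBCLIENT_CACHE", fallback))
```

The result could then be passed on as `BBClient(cache_folder=resolve_cache_folder())`.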

Functions
load_bedset
load_bedset(bedset_id)

Load a BEDset from cache, or download and add it to the cache with its BED files.

Args: bedset_id (str): unique identifier of a BED set

Returns: BedSet: the BedSet object

load_bed
load_bed(bed_id)

Loads a BED file from cache, or downloads and caches it if it doesn't exist.

Args: bed_id (str): unique identifier of a BED file

Returns: RegionSet: the RegionSet object
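As a usage sketch, the helper below loads a BED file through a client and then asks for its local path. It relies only on `load_bed` and `seek` as documented in this section; `client` is assumed to be a `BBClient` instance (or any object with the same methods), and `load_and_locate` is a hypothetical name.

```python
def load_and_locate(client, bed_id):
    """Load a BED file via the client and return (region_set, local_path).

    `client` is assumed to expose `load_bed` and `seek` as documented here.
    """
    # load_bed reads from the cache, or downloads and caches on a miss.
    region_set = client.load_bed(bed_id)
    # Once load_bed has returned, the file is expected to be in the cache,
    # so seek should resolve the identifier to a local path.
    local_path = client.seek(bed_id)
    return region_set, local_path
```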

add_bedset_to_cache
add_bedset_to_cache(bedset)

Add a BED set to the cache.

Args: bedset (BedSet): the BED set to be added, a BedSet class

Returns: str: the identifier of the BedSet object

add_bed_to_cache
add_bed_to_cache(bedfile, force=False)

Add a BED file to the cache.

Args:
    bedfile (Union[RegionSet, str]): a RegionSet object, or a path or URL to the BED file
    force (bool): whether to overwrite the existing file in cache

Returns: RegionSet: the RegionSet identifier

add_bed_tokens_to_cache
add_bed_tokens_to_cache(bed_id, universe_id)

Add a tokenized BED file to the cache.

Args:
    bed_id (str): the identifier of the BED file
    universe_id (str): the identifier of the universe

Returns: str: the identifier of the tokenized BED file

load_bed_tokens
load_bed_tokens(bed_id, universe_id)

Load a tokenized BED file from cache, or download and cache it if it doesn't exist.

Args:
    bed_id (str): the identifier of the BED file
    universe_id (str): the identifier of the universe

Returns: Array: the zarr array of tokens

remove_tokens
remove_tokens(bed_id, universe_id)

Remove all tokenized BED files from cache.

cache_tokens
cache_tokens(bed_id, universe_id, tokens)

Cache tokenized BED file.

Args:
    bed_id (str): the identifier of the BED file
    universe_id (str): the identifier of the universe
    tokens (Union[list, Array]): the list of tokens
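The token methods above can be combined into a simple round trip. This is a sketch under stated assumptions: `client` is taken to be a `BBClient` instance (or an object with the same `cache_tokens` and `load_bed_tokens` methods), and `store_and_reload_tokens` is a hypothetical helper name.

```python
def store_and_reload_tokens(client, bed_id, universe_id, tokens):
    """Cache tokens for (bed_id, universe_id), then read them back.

    Uses only `cache_tokens` and `load_bed_tokens` as documented in this
    section; whatever `load_bed_tokens` returns is passed through unchanged.
    """
    client.cache_tokens(bed_id, universe_id, tokens)
    return client.load_bed_tokens(bed_id, universe_id)
```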

add_bed_to_s3
add_bed_to_s3(identifier, bucket=DEFAULT_BUCKET_NAME, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, s3_path=DEFAULT_BUCKET_FOLDER)

Add a cached BED file to S3.

Args:
    identifier (str): the unique identifier of the BED file
    bucket (str): the name of the bucket
    endpoint_url (str): the URL of the S3 endpoint [Default: set up by the environment vars]
    aws_access_key_id (str): the access key of the AWS account [Default: set up by the environment vars]
    aws_secret_access_key (str): the secret access key of the AWS account [Default: set up by the environment vars]
    s3_path (str): the path on S3

Returns: str: full path on S3

get_bed_from_s3
get_bed_from_s3(identifier, bucket=DEFAULT_BUCKET_NAME, endpoint_url=None, aws_access_key_id=None, aws_secret_access_key=None, s3_path=DEFAULT_BUCKET_FOLDER)

Get a cached BED file from S3 and cache it locally.

Args:
    identifier (str): the unique identifier of the BED file
    bucket (str): the name of the bucket
    endpoint_url (str): the URL of the S3 endpoint [Default: set up by the environment vars]
    aws_access_key_id (str): the access key of the AWS account [Default: set up by the environment vars]
    aws_secret_access_key (str): the secret access key of the AWS account [Default: set up by the environment vars]
    s3_path (str): the path on S3

Returns: str: the identifier of the BED file

Raises: FileNotFoundError: if the identifier does not exist in cache
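Because both S3 methods take the same optional keyword arguments, it can help to build those options once and reuse them. The sketch below is illustrative only: `s3_options` and `upload_then_fetch` are hypothetical helpers, and both clients are assumed to be `BBClient` instances (or objects with the same methods). Options left as `None` are dropped so that the documented defaults stay in effect.

```python
def s3_options(bucket=None, endpoint_url=None, aws_access_key_id=None,
               aws_secret_access_key=None, s3_path=None):
    """Collect only the explicitly provided S3 options (hypothetical helper)."""
    candidates = {
        "bucket": bucket,
        "endpoint_url": endpoint_url,
        "aws_access_key_id": aws_access_key_id,
        "aws_secret_access_key": aws_secret_access_key,
        "s3_path": s3_path,
    }
    # Omit unset options so the callee falls back to its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}


def upload_then_fetch(source_client, target_client, identifier, **options):
    """Upload a cached BED file with one client, then fetch it with another."""
    s3_full_path = source_client.add_bed_to_s3(identifier, **options)
    fetched_id = target_client.get_bed_from_s3(identifier, **options)
    return s3_full_path, fetched_id
```

A call might look like `upload_then_fetch(client_a, client_b, bed_id, **s3_options(bucket="my-bucket"))`, where the bucket name is a placeholder.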

seek
seek(identifier)

Get local path to BED file or BED set with specific identifier.

Args: identifier (str): the unique identifier

Returns: str: the local path of the file

Raises: FileNotFoundError: if the identifier does not exist in cache
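Since `seek` raises `FileNotFoundError` for unknown identifiers, it can double as a cache-membership check. A minimal sketch, assuming `client` is a `BBClient` instance (or an object with the same `seek` method); `is_cached` is a hypothetical helper name.

```python
def is_cached(client, identifier):
    """Return True if `seek` can resolve the identifier, False otherwise."""
    try:
        client.seek(identifier)
    except FileNotFoundError:
        # Documented behavior: raised when the identifier is not in the cache.
        return False
    return True
```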

remove_bedset_from_cache
remove_bedset_from_cache(bedset_id, remove_bed_files=False)

Remove a BED set from cache.

Args:
    bedset_id (str): the identifier of the BED set
    remove_bed_files (bool): whether to also remove the BED files in the BED set

Raises: FileNotFoundError: if the BED set does not exist in cache

list_beds
list_beds()

List all BED files in cache.

Returns: Dict[str, str]: the identifiers of the BED files in cache

list_bedsets
list_bedsets()

List all BED sets in cache.

Returns: Dict[str, str]: the identifiers of the BED sets in cache
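The two listing methods can be combined into a quick overview of what is cached. This sketch assumes that the keys of the returned dictionaries are the identifiers (the reference above does not spell out what the values are), that `client` is a `BBClient` instance or a compatible object, and that `cache_summary` is a hypothetical helper name.

```python
def cache_summary(client):
    """Summarize cached BED files and BED sets by identifier.

    Assumption: the dictionaries returned by `list_beds` and `list_bedsets`
    are keyed by identifier.
    """
    beds = client.list_beds()
    bedsets = client.list_bedsets()
    return {
        "n_beds": len(beds),
        "n_bedsets": len(bedsets),
        "bed_ids": sorted(beds),        # iterating a dict yields its keys
        "bedset_ids": sorted(bedsets),
    }
```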

remove_bedfile_from_cache
remove_bedfile_from_cache(bedfile_id)

Remove a BED file from cache.

Args: bedfile_id (str): the identifier of the BED file

Raises: FileNotFoundError: if the BED file does not exist in cache
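Listing and removal can be combined to empty the cache of BED files. The following is a sketch rather than a recommended procedure: `clear_bed_files` is a hypothetical helper, `client` is assumed to be a `BBClient` instance or a compatible object, and the keys of the dictionary returned by `list_beds` are assumed to be the identifiers.

```python
def clear_bed_files(client):
    """Remove every cached BED file and return the identifiers removed."""
    removed = []
    # Copy the keys first so removal cannot disturb the iteration.
    for bed_id in list(client.list_beds()):
        try:
            client.remove_bedfile_from_cache(bed_id)
        except FileNotFoundError:
            # The file vanished between listing and removal; skip it.
            continue
        removed.append(bed_id)
    return removed
```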