
Models and Region Set objects in Gtars

Gtars provides several objects (structs/models) for representing genomic regions and related data.

🟢 Region

Region is the Python representation of a genomic region (e.g. chr1:100-200) plus additional information.

Example

Python:

from gtars.models import Region

# Create a Region
genomic_region = Region(chr="chr1",
                        start=100,
                        end=200,
                        rest="peak1")
print(genomic_region)

Rust:

use gtars::models::Region;

// Create a Region
let genomic_region = Region {
    chr: "chr1".to_string(),
    start: 100,
    end: 200,
    rest: Some("peak1".to_string()),
};

// Compute the region's digest, used here as its identifier
let identifier = genomic_region.digest();

println!("{:?}", identifier);
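
To make the chr / start / end / rest fields more concrete, here is a minimal pure-Python sketch of how one BED line could map onto them. It does not use gtars: SimpleRegion and parse_bed_line are hypothetical names made up for illustration, and storing all extra columns as one tab-joined string in rest is an assumption, not a statement about how gtars stores them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimpleRegion:
    """Hypothetical stand-in for a genomic region; this is NOT the gtars class."""
    chr: str
    start: int
    end: int
    rest: Optional[str] = None

    @property
    def width(self) -> int:
        # BED coordinates are 0-based and half-open, so the width is end - start.
        return self.end - self.start


def parse_bed_line(line: str) -> SimpleRegion:
    """Split one tab-delimited BED line into chr, start, end, and the remainder."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Assumption: columns after the first three are kept together as one string.
    rest = "\t".join(fields[3:]) or None
    return SimpleRegion(chrom, start, end, rest)


region = parse_bed_line("chr1\t100\t200\tpeak1")
print(region)
print(region.width)
```

The width property relies only on the BED convention that start is inclusive and end is exclusive.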

🟢 RegionSet

RegionSet is the Python representation of a genomic region set, commonly stored as a BED file.

Quick example

Open a BED file from a URL and get its identifier.

Python:

from gtars.models import RegionSet

# Create a RegionSet from a URL or a local BED file.
rs = RegionSet("https://data2.bedbase.org/files/d/a/dafd661aa70590999e0ff9e1980217db.bed.gz")

# Get the identifier for the RegionSet
print(rs.identifier)

print(rs)

Rust:

use gtars::models::RegionSet;

// Create a RegionSet from a URL or a local BED file.
let rs = RegionSet::try_from("https://data2.bedbase.org/files/d/a/dafd661aa70590999e0ff9e1980217db.bed.gz").unwrap();

// Get the identifier for the RegionSet
let id = rs.identifier();

println!("{:?}", rs);

TypeScript:

❗ Note: This is a test example and may require additional setup to run.

import init, { RegionSet } from '@databio/gtars';

// Initialize the WebAssembly module before using RegionSet.
// Awaiting is harmless if init turns out to be synchronous.
await init();

export type BedEntry1 = [string, number, number, string];

// Define entries (regions) as [chr, start, end, rest]
export const entries1: BedEntry1[] = [
  ['chr1', 100, 200, 'peak1'],
  ['chr2', 150, 250, 'peak2'],
  ['chr3', 300, 400, 'peak3'],
];

// Create a RegionSet from the entries
const rs = new RegionSet(entries1);

console.log(rs);

❗ Note: RegionSet can be created from a local file path, a URL, or a list (vector) of Region objects.

Main commands in Python

  • Load a BED file from local path or URL
    rs = RegionSet("path/to/bedfile.bed")
    
  • Get number of regions
    len(rs)
    
  • Calculate mean region width
    rs.mean_region_width()
    
  • Get last base pair location for each chromosome
    rs.get_max_end_per_chr()
    
  • Get number of base pairs in the region set
    rs.get_nucleotide_length()
    
  • Save the RegionSet as a BED file
    rs.to_bed("path/to/save/bedfile.bed")
    rs.to_bed_gz("path/to/save/bedfile.bed.gz")  # gzipped
    
  • Save the RegionSet as a bigBed file
    rs.to_bigbed("path/to/save/bedfile.bb", chrom_sizes="path/to/chrom.sizes")
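
To show what the summary commands above are measuring, the following pure-Python sketch computes comparable statistics on a toy list of regions. It does not call gtars: every function name here is hypothetical, and the definition of nucleotide length as a plain sum of region widths (overlapping bases counted more than once) is an assumption rather than a description of the library's behavior.

```python
from collections import defaultdict

# Hypothetical toy regions as (chr, start, end) tuples, BED-style half-open.
regions = [
    ("chr1", 100, 200),
    ("chr1", 150, 400),
    ("chr2", 150, 250),
]


def nucleotide_length(regions):
    # Assumption: sum of widths, without merging overlapping regions.
    return sum(end - start for _, start, end in regions)


def mean_region_width(regions):
    # Average width across all regions; 0.0 for an empty set.
    if not regions:
        return 0.0
    return nucleotide_length(regions) / len(regions)


def max_end_per_chr(regions):
    # Largest end coordinate seen on each chromosome.
    max_end = defaultdict(int)
    for chrom, _, end in regions:
        max_end[chrom] = max(max_end[chrom], end)
    return dict(max_end)


print(len(regions))                 # number of regions
print(mean_region_width(regions))   # mean region width
print(max_end_per_chr(regions))     # last base pair location per chromosome
print(nucleotide_length(regions))   # total base pairs under the stated assumption
```

If overlapping regions should contribute each base only once, the regions would need to be merged per chromosome before summing.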