Skip to content

How to Reindex the BEDbase Database

BEDbase uses two main vector collections:

  • bed_collection: Stores all genomic regions from the BED files for the hg38 reference genome.
  • text_collection: Stores text embeddings for all BED files based on their metadata.

Information about whether a file is present in the vector database is stored together with its metadata in the relational database. This setup allows us to easily track which files are indexed in the vector database.

To reindex the database, BEDboss provides two commands:

  • 🟢 reindex: Reindex all the available hg38 files in qdrant database. │
  • 🟢 reindex-text: Reindex the semantic (text) search.

Example

Exapme of reindexing the entire database:

bedboss reindex --bedbase-config <path_to_bedbase_config> --purge

❗ Purge option will set all files in the database as not indexed, and reindex all files from the database. ❗ If purge will be set to false, only files that are not indexed will be indexed.

Full bedboss reindex help
bedboss reindex --help

 Usage: bedboss reindex [OPTIONS]                                                                                                                                                                                                   

 Reindex the bedbase database and insert all files to the qdrant database.                                                                                                                                                          

╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --bedbase-config                  TEXT     Path to the bedbase config file [default: None] [required]                                                                                                                         │
│    --purge             --no-purge             Purge existing index before reindexing [default: no-purge]                                                                                                                         │
│    --batch                           INTEGER  Number of items to upload in one batch [default: 1000]                                                                                                                             │
│    --help                                     Show this message and exit.                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
 bedboss reindex-text --help

 Usage: bedboss reindex-text [OPTIONS]                                                                                                                                                                                              

 Reindex semantic (text) search.                                                                                                                                                                                                    

╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --bedbase-config                  TEXT     Path to the bedbase config file [default: None] [required]                                                                                                                         │
│    --purge             --no-purge             Purge existing index before reindexing [default: no-purge]                                                                                                                         │
│    --batch                           INTEGER  Number of items to upload in one batch [default: 1000]                                                                                                                             │
│    --help                                     Show this message and exit.                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯