Interface

A BGEN file is represented by a bgen_file handler, which is returned by bgen_file_open() and required by many functions. For example, the user can query the number of samples contained in a BGEN file by passing the handler to bgen_file_nsamples(). The user has to release the resources by calling bgen_file_close() once the handler is no longer needed.

The function bgen_file_contain_samples() can be used to detect whether the BGEN file contains sample identifications. If it does, the function bgen_file_read_samples() will return a struct of sample identifications. The allocated resources must be released by a subsequent call to bgen_samples_destroy().

The function bgen_metafile_read_partition() reads the variant metadata (names, chromosomes, number of alleles, etc.) of the requested partition. It returns that information as a bgen_partition. After use, its resources have to be released by calling bgen_partition_destroy().

To fetch genotype information, the user first has to get a variant genotype handler (bgen_genotype) by calling bgen_file_open_genotype(). The number of possible genotypes of a given variant, for example, can then be found by a call to bgen_genotype_ncombs(). The probabilities of each possible genotype can be read by a call to bgen_genotype_read(). After use, the variant genotype handler has to be closed by a bgen_genotype_close() call.

Strings are represented by the bgen_string type, which contains an array of characters and its length.
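The idea of a string that carries its own length can be sketched as follows. This is a minimal illustration only: the type lp_string and the functions lp_string_from_cstr and lp_string_equal are hypothetical names invented for this sketch and are not the library's definition of bgen_string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-carrying string; NOT the library's
 * actual definition of struct bgen_string. */
struct lp_string
{
    char const *data; /* characters; not necessarily null-terminated */
    size_t length;    /* number of characters in data */
};

/* Wrap a null-terminated C string without copying it. */
struct lp_string lp_string_from_cstr(char const *str)
{
    assert(str != NULL);
    struct lp_string s = {str, strlen(str)};
    return s;
}

/* Two strings are equal when their lengths and all their bytes match. */
bool lp_string_equal(struct lp_string a, struct lp_string b)
{
    assert(a.data != NULL && b.data != NULL);
    return a.length == b.length && memcmp(a.data, b.data, a.length) == 0;
}
```

Because the length is stored explicitly, a prefix of a longer buffer can be treated as a string of its own without copying or inserting a terminator.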

File

struct bgen_file *bgen_file_open(char const *filepath)

Open bgen file and return a handler.

Remember to call bgen_file_close to close the file and release resources after the interaction has finished.

Parameters

filepath – File path to the bgen file.

Returns

Bgen file handler. Return NULL on failure.

void bgen_file_close(struct bgen_file const *bgen_file)

Close bgen file handler.

Parameters

bgen_file – Bgen file handler.

uint32_t bgen_file_nsamples(struct bgen_file const *bgen_file)

Get the number of samples.

Parameters

bgen_file – Bgen file handler.

Returns

Number of samples.

uint32_t bgen_file_nvariants(struct bgen_file const *bgen_file)

Get the number of variants.

Parameters

bgen_file – Bgen file handler.

Returns

Number of variants.

bool bgen_file_contain_samples(struct bgen_file const *bgen_file)

Check if the file contains sample identifications.

Parameters

bgen_file – Bgen file handler.

Returns

true if bgen file contains the sample ids; false otherwise.

struct bgen_samples *bgen_file_read_samples(struct bgen_file *bgen_file)

Return all sample identifications.

Parameters

bgen_file – Bgen file handler.

Returns

Sample identifications. Return NULL on failure.

struct bgen_genotype *bgen_file_open_genotype(struct bgen_file *bgen_file, uint64_t genotype_offset)

Open a variant for genotype queries.

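Parameters
  • bgen_file – Bgen file handler.

  • genotype_offset – Genotype offset of the variant (see genotype_offset in bgen_variant).

Returns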

Variant genotype handler. Return NULL on failure.

struct bgen_file

Bgen file handler.

Genotype

void bgen_genotype_close(struct bgen_genotype const *genotype)

Close a variant genotype handler.

Parameters

genotype – Variant genotype handler.

int bgen_genotype_read(struct bgen_genotype *genotype, double *probabilities)

Read the probabilities of each possible genotype (64-bit).

The length of the probabilities array is equal to the product of the values obtained by calling the functions bgen_file_nsamples and bgen_genotype_ncombs.

See also

Please refer to the section Probability data storage of the bgen format specification for more information.

Parameters
  • genotype – Variant genotype handler.

  • probabilities – Array of probabilities.

Returns

0 if it succeeds; 1 otherwise.
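The buffer size described above can be computed and allocated as in the following sketch. The helper names probabilities_length and alloc_probabilities are hypothetical and not part of the library; the two arguments stand for the values returned by bgen_file_nsamples and bgen_genotype_ncombs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Elements needed: one per (sample, genotype combination) pair.
 * Returns 0 if the product would not fit in size_t. */
size_t probabilities_length(uint32_t nsamples, unsigned ncombs)
{
    if (ncombs != 0 && nsamples > SIZE_MAX / ncombs)
        return 0;
    return (size_t)nsamples * ncombs;
}

/* Allocate a zero-initialised buffer of doubles for the 64-bit readers.
 * Returns NULL on failure; the caller releases it with free(). */
double *alloc_probabilities(uint32_t nsamples, unsigned ncombs)
{
    assert(nsamples > 0 && ncombs > 0); /* an empty buffer is not useful here */
    size_t length = probabilities_length(nsamples, ncombs);
    if (length == 0)
        return NULL;
    return calloc(length, sizeof(double));
}
```

A buffer obtained this way would be passed as the probabilities argument and released with free() once the values have been consumed. For the 32-bit reader the same length applies, with float in place of double.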

int bgen_genotype_read64(struct bgen_genotype *genotype, double *probabilities)

Read the probabilities of each possible genotype (64-bit).

The length of the probabilities array is equal to the product of the values obtained by calling the functions bgen_file_nsamples and bgen_genotype_ncombs.

See also

Please refer to the section Probability data storage of the bgen format specification for more information.

Parameters
  • genotype – Variant genotype handler.

  • probabilities – Array of probabilities.

Returns

0 if it succeeds; 1 otherwise.

int bgen_genotype_read32(struct bgen_genotype *genotype, float *probabilities)

Read the probabilities of each possible genotype (32-bit).

The length of the probabilities array is equal to the product of the values obtained by calling the functions bgen_file_nsamples and bgen_genotype_ncombs.

See also

Please refer to the section Probability data storage of the bgen format specification for more information.

Parameters
  • genotype – Variant genotype handler.

  • probabilities – Array of probabilities.

Returns

0 if it succeeds; 1 otherwise.

uint16_t bgen_genotype_nalleles(struct bgen_genotype const *genotype)

Get the number of alleles.

Parameters

genotype – Variant genotype handler.

Returns

Number of alleles.

bool bgen_genotype_missing(struct bgen_genotype const *genotype, uint32_t index)

Return 1 if variant is missing for the sample; 0 otherwise.

Parameters
  • genotype – Variant genotype handler.

  • index – Sample index.

Returns

1 for missing genotype; 0 otherwise.

uint8_t bgen_genotype_ploidy(struct bgen_genotype const *genotype, uint32_t index)

Get the ploidy.

Parameters
  • genotype – Variant genotype handler.

  • index – Sample index.

Returns

Ploidy.

uint8_t bgen_genotype_min_ploidy(struct bgen_genotype const *genotype)

Get the minimum ploidy of the variant.

Parameters

genotype – Variant genotype handler.

Returns

Ploidy minimum.

uint8_t bgen_genotype_max_ploidy(struct bgen_genotype const *genotype)

Get the maximum ploidy of the variant.

Parameters

genotype – Variant genotype handler.

Returns

Ploidy maximum.

unsigned bgen_genotype_ncombs(struct bgen_genotype const *genotype)

Get the number of genotype combinations.

Precisely, if the bgen file is of Layout 1, the number of combinations is always equal to 3. In the case of Layout 2, there are two cases. For a phased genotype, the number of combinations is equal to the product of the values returned by bgen_genotype_nalleles and bgen_genotype_max_ploidy. For an unphased genotype, let n and m be the values returned by bgen_genotype_nalleles and bgen_genotype_max_ploidy, respectively. This function then returns the number of ways n-1 items can be selected from n+m-1 when the order of selection does not matter, i.e., the binomial coefficient C(n+m-1, n-1).

Parameters

genotype – Variant genotype handler.

Returns

Number of combinations.
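The Layout 2 rule described for bgen_genotype_ncombs can be checked by hand with a small sketch. The functions binomial and expected_ncombs below are hypothetical helpers written for this illustration; they are not part of the library and only mirror the documented formula, assuming at least one allele.

```c
#include <assert.h>
#include <stdint.h>

/* Binomial coefficient C(n, k). After step i the running value equals
 * C(n - k + i, i), so every division is exact. */
uint64_t binomial(unsigned n, unsigned k)
{
    assert(k <= n);
    if (k > n - k)
        k = n - k; /* symmetry: C(n, k) == C(n, n - k) */
    uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}

/* Number of combinations for Layout 2, following the documented rule:
 * phased   -> nalleles * max_ploidy
 * unphased -> C(nalleles + max_ploidy - 1, nalleles - 1) */
uint64_t expected_ncombs(unsigned nalleles, unsigned max_ploidy, int phased)
{
    assert(nalleles >= 1); /* assumption of this sketch */
    if (phased)
        return (uint64_t)nalleles * max_ploidy;
    return binomial(nalleles + max_ploidy - 1, nalleles - 1);
}
```

For a biallelic, diploid variant (n = 2, m = 2) this gives C(3, 1) = 3 combinations when unphased and 2 × 2 = 4 when phased.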

bool bgen_genotype_phased(struct bgen_genotype const *genotype)

Return 1 for phased or 0 for unphased genotype.

Parameters

genotype – Variant genotype handler.

Returns

1 for phased genotype; 0 otherwise.

struct bgen_genotype

Variant genotype handler.

Metafile

struct bgen_metafile *bgen_metafile_create(struct bgen_file *bgen_file, char const *filepath, uint32_t npartitions, int verbose)

Create a bgen metafile.

A bgen metafile contains variant metadata (id, rsid, chrom, alleles) and variant addresses. Those variants are grouped in partitions.

Parameters
  • bgen_file – Bgen file handler.

  • filepath – File path to the metafile.

  • npartitions – Number of partitions. It has to be a number between 1 and the number of samples.

  • verbose – 1 for showing progress; 0 otherwise.

Returns

Metafile handler. NULL on failure.

struct bgen_metafile *bgen_metafile_open(char const *filepath)

Open a bgen metafile.

Remember to call bgen_metafile_close to close the file and release resources after the interaction has finished.

Parameters

filepath – File path to the metafile.

Returns

Metafile handler. NULL on failure.

uint32_t bgen_metafile_npartitions(struct bgen_metafile const *metafile)

Get the number of partitions.

Parameters

metafile – Metafile handler.

Returns

Number of partitions.

uint32_t bgen_metafile_nvariants(struct bgen_metafile const *metafile)

Get the number of variants.

Parameters

metafile – Metafile handler.

Returns

Number of variants.

struct bgen_partition const *bgen_metafile_read_partition(struct bgen_metafile const *metafile, uint32_t partition)

Read a partition of variants.

Remember to call bgen_partition_destroy to release resources after the interaction has finished.

Parameters
  • metafile – Metafile handler.

  • partition – Partition index.

Returns

Partition of variants. Return NULL on failure.

int bgen_metafile_close(struct bgen_metafile const *metafile)

Close a metafile handler.

Parameters

metafile – Metafile handler.

Returns

0 on success; 1 otherwise.

struct bgen_metafile

Metafile handler.

Partition

void bgen_partition_destroy(struct bgen_partition const *partition)

Destroy a partition by releasing its resources.

Parameters

partition – Partition of variants metadata.

uint32_t bgen_partition_nvariants(struct bgen_partition const *partition)

Get the number of variants.

Parameters

partition – Partition of variants metadata.

Returns

Number of variants.

struct bgen_partition

Partition of variants metadata.

Samples

void bgen_samples_destroy(struct bgen_samples const *samples)

Destroy samples data by releasing its resources.

Parameters

samples – Samples data.

struct bgen_string const *bgen_samples_get(struct bgen_samples const *samples, uint32_t index)

Get a specific sample.

Parameters
  • samples – Samples data.

  • index – Sample index.

Returns

Sample information. Return NULL on failure.

struct bgen_samples

Bgen samples.

String

static struct bgen_string const *bgen_string_create(char const *data, size_t length)

Create a bgen string.

Parameters
  • data – Usual C string. It does not need to be null-terminated.

  • length – String length.

Returns

Bgen string.

static inline void bgen_string_destroy(struct bgen_string const *bgen_string)

Destroy a bgen string.

Parameters

bgen_string – Bgen string.

static inline char const *bgen_string_data(struct bgen_string const *bgen_string)

Get a pointer to the C string.

Parameters

bgen_string – Bgen string.

Returns

Pointer to the internal C string.

static inline size_t bgen_string_length(struct bgen_string const *bgen_string)

Get string length.

Parameters

bgen_string – Bgen string.

Returns

Length.

static inline bool bgen_string_equal(struct bgen_string a, struct bgen_string b)

Check whether two bgen strings are equal.

Parameters
  • a – First bgen string.

  • b – Second bgen string.

Returns

true if they are equal; false otherwise.

static struct bgen_string BGEN_STRING(char const *str)

Initialize a bgen string from a null-terminated string.

Parameters

str – Null-terminated string.

Returns

Bgen string.

struct bgen_string

String.

Variant

struct bgen_variant

Variant metadata.

Public Members

uint64_t genotype_offset

Genotype offset (bgen file).

struct bgen_string const *id

Variant identification.

struct bgen_string const *rsid

RSID.

struct bgen_string const *chrom

Chromosome name.

uint32_t position

Base-pair position.

uint16_t nalleles

Number of alleles.

struct bgen_string const **allele_ids

Allele ids.