Interface¶
A BGEN file is associated with a bgen_file variable, returned by bgen_file_open(), which many functions require. For example, the user can query the number of samples contained in a BGEN file by passing a bgen_file variable to the bgen_file_nsamples() function. After use, the user has to release its resources by calling bgen_file_close().
The function bgen_file_contain_samples() can be used to detect whether the BGEN file contains sample identifications. If it does, the function bgen_file_read_samples() will return a struct of sample identifications. The allocated resources must be released by a subsequent call to bgen_samples_destroy().
The function bgen_metafile_read_partition() reads the variant metadata of the given partition (names, chromosomes, number of alleles, etc.). It returns the information read as a bgen_partition. After use, its resources have to be released by calling bgen_partition_destroy().
To fetch genotype information, the user first has to get a variant genotype handler (bgen_genotype) by calling bgen_file_open_genotype(). The number of possible genotypes of a given variant, for example, can then be found by a call to bgen_genotype_ncombs(). The probabilities of each possible genotype can be found by a call to bgen_genotype_read(). After use, the variant genotype handler has to be closed by a bgen_genotype_close() call.
Strings are represented by the bgen_string type, which contains an array of characters and its length.
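To illustrate what a string made of a character array plus a length implies in practice, here is a minimal sketch. The struct lstring type and both helpers are hypothetical stand-ins written for this example; they are not the library's definition of bgen_string, whose internals are not shown on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a length-carrying string.
 * This is NOT the library's struct bgen_string definition. */
struct lstring
{
    char const *data; /* characters; not necessarily null-terminated */
    size_t length;    /* number of characters stored in data */
};

/* Wrap a null-terminated C string without copying it. */
struct lstring lstring_from_cstr(char const *str)
{
    assert(str != NULL);
    struct lstring s = {str, strlen(str)};
    return s;
}

/* Two strings are equal when their lengths match and their bytes agree. */
bool lstring_equal(struct lstring a, struct lstring b)
{
    return a.length == b.length && memcmp(a.data, b.data, a.length) == 0;
}
```

Because the data need not be null-terminated, printing such a string should pass the length explicitly, for instance with the printf format "%.*s" fed by (int)bgen_string_length(s) and bgen_string_data(s).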
File¶
-
struct bgen_file *bgen_file_open(char const *filepath)¶
Open bgen file and return a handler.
Remember to call bgen_file_close to close the file and release resources after the interaction has finished.
- Parameters
filepath – File path to the bgen file.
- Returns
Bgen file handler. Return
NULL
on failure.
-
void bgen_file_close(struct bgen_file const *bgen_file)¶
Close bgen file handler.
- Parameters
bgen_file – Bgen file handler.
-
uint32_t bgen_file_nsamples(struct bgen_file const *bgen_file)¶
Get the number of samples.
- Parameters
bgen_file – Bgen file handler.
- Returns
Number of samples.
-
uint32_t bgen_file_nvariants(struct bgen_file const *bgen_file)¶
Get the number of variants.
- Parameters
bgen_file – Bgen file handler.
- Returns
Number of variants.
-
bool bgen_file_contain_samples(struct bgen_file const *bgen_file)¶
Check if the file contains sample identifications.
- Parameters
bgen_file – Bgen file handler.
- Returns
true if the bgen file contains the sample ids; false otherwise.
-
struct bgen_samples *bgen_file_read_samples(struct bgen_file *bgen_file)¶
Return all sample identifications.
- Parameters
bgen_file – Bgen file handler.
- Returns
Sample identifications. Return
NULL
on failure.
-
struct bgen_genotype *bgen_file_open_genotype(struct bgen_file *bgen_file, uint64_t genotype_offset)¶
Open a variant for genotype queries.
- Parameters
bgen_file – Bgen file handler.
genotype_offset – Genotype offset obtained from bgen_variant::genotype_offset.
- Returns
Variant genotype handler. Return
NULL
on failure.
-
struct bgen_file¶
Bgen file handler.
Genotype¶
-
void bgen_genotype_close(struct bgen_genotype const *genotype)¶
Close a variant genotype handler.
- Parameters
genotype – Variant genotype handler.
-
int bgen_genotype_read(struct bgen_genotype *genotype, double *probabilities)¶
Read the probabilities of each possible genotype (64-bits).
The length of this array is equal to the product of the values obtained by calling the functions bgen_file_nsamples and bgen_genotype_ncombs.
See also
Please refer to the corresponding section Probability data storage of the bgen format specification for more information.
- Parameters
genotype – Variant genotype handler.
probabilities – Array of probabilities.
- Returns
0 if it succeeds; 1 otherwise.
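The array length rule above (number of samples times number of combinations) can be sketched as a small allocation helper. The function names probabilities_length and probabilities_alloc64 are hypothetical and not part of the library; in real use the two arguments would come from bgen_file_nsamples and bgen_genotype_ncombs. The sketch assumes a positive number of combinations.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Number of entries the probabilities array must hold: nsamples * ncombs.
 * Returns 0 if the product would not fit in size_t. */
size_t probabilities_length(uint32_t nsamples, unsigned ncombs)
{
    assert(ncombs > 0); /* assumption of this sketch */
    if (nsamples > SIZE_MAX / ncombs)
        return 0;
    return (size_t)nsamples * ncombs;
}

/* Allocate a buffer of doubles large enough for a 64-bit read.
 * Returns NULL if the length is zero, too large, or allocation fails.
 * The caller releases the buffer with free(). */
double *probabilities_alloc64(uint32_t nsamples, unsigned ncombs)
{
    size_t length = probabilities_length(nsamples, ncombs);
    if (length == 0 || length > SIZE_MAX / sizeof(double))
        return NULL;
    return malloc(length * sizeof(double));
}
```

A buffer obtained this way would then be passed as the probabilities argument of bgen_genotype_read; for bgen_genotype_read32 the same length applies with float elements instead of double.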
-
int bgen_genotype_read64(struct bgen_genotype *genotype, double *probabilities)¶
Read the probabilities of each possible genotype (64-bits).
The length of this array is equal to the product of the values obtained by calling the functions bgen_file_nsamples and bgen_genotype_ncombs.
See also
Please refer to the corresponding section Probability data storage of the bgen format specification for more information.
- Parameters
genotype – Variant genotype handler.
probabilities – Array of probabilities.
- Returns
0 if it succeeds; 1 otherwise.
-
int bgen_genotype_read32(struct bgen_genotype *genotype, float *probabilities)¶
Read the probabilities of each possible genotype (32-bits).
The length of this array is equal to the product of the values obtained by calling the functions bgen_file_nsamples and bgen_genotype_ncombs.
See also
Please refer to the corresponding section Probability data storage of the bgen format specification for more information.
- Parameters
genotype – Variant genotype handler.
probabilities – Array of probabilities.
- Returns
0 if it succeeds; 1 otherwise.
-
uint16_t bgen_genotype_nalleles(struct bgen_genotype const *genotype)¶
Get the number of alleles.
- Parameters
genotype – Variant genotype handler.
- Returns
Number of alleles.
-
bool bgen_genotype_missing(struct bgen_genotype const *genotype, uint32_t index)¶
Return 1 if the variant is missing for the sample; 0 otherwise.
- Parameters
genotype – Variant genotype handler.
index – Sample index.
- Returns
1 for missing genotype; 0 otherwise.
-
uint8_t bgen_genotype_ploidy(struct bgen_genotype const *genotype, uint32_t index)¶
Get the ploidy.
- Parameters
genotype – Variant genotype handler.
index – Sample index.
- Returns
Ploidy.
-
uint8_t bgen_genotype_min_ploidy(struct bgen_genotype const *genotype)¶
Get the minimum ploidy of the variant.
- Parameters
genotype – Variant genotype handler.
- Returns
Ploidy minimum.
-
uint8_t bgen_genotype_max_ploidy(struct bgen_genotype const *genotype)¶
Get the maximum ploidy of the variant.
- Parameters
genotype – Variant genotype handler.
- Returns
Ploidy maximum.
-
unsigned bgen_genotype_ncombs(struct bgen_genotype const *genotype)¶
Get the number of genotype combinations.
Precisely, if the bgen file is of Layout 1, the number of combinations is always equal to 3. In the case of Layout 2, there are two options. For a phased genotype, the number of combinations is equal to the product of bgen_genotype_nalleles and bgen_genotype_max_ploidy. For an unphased genotype, let n and m be the values returned by calling bgen_genotype_nalleles and bgen_genotype_max_ploidy. This function returns the number of ways n-1 items can be selected from n+m-1 such that the order of selection does not matter, that is, the binomial coefficient C(n+m-1, n-1).
- Parameters
genotype – Variant genotype handler.
- Returns
Number of combinations.
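The Layout 2 rule described for bgen_genotype_ncombs can be checked by hand with a short sketch. The helpers binomial and layout2_ncombs are hypothetical names introduced for this example; they only reproduce the arithmetic stated above and are not how the library necessarily computes the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Binomial coefficient C(n, k). Each intermediate value is itself a
 * binomial coefficient, so the integer division is always exact. */
uint64_t binomial(unsigned n, unsigned k)
{
    assert(k <= n);
    if (k > n - k)
        k = n - k; /* C(n, k) == C(n, n - k); iterate over the smaller k */
    uint64_t result = 1;
    for (unsigned i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}

/* Number of genotype combinations for Layout 2, following the text above:
 * phased   -> nalleles * max_ploidy
 * unphased -> C(nalleles + max_ploidy - 1, nalleles - 1) */
uint64_t layout2_ncombs(unsigned nalleles, unsigned max_ploidy, bool phased)
{
    assert(nalleles >= 1);
    if (phased)
        return (uint64_t)nalleles * max_ploidy;
    return binomial(nalleles + max_ploidy - 1, nalleles - 1);
}
```

For an unphased, diploid, biallelic variant (n = 2, m = 2) this gives C(3, 1) = 3, the same count that Layout 1 always uses; with three alleles it gives C(4, 2) = 6.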
-
bool bgen_genotype_phased(struct bgen_genotype const *genotype)¶
Return 1 for phased or 0 for unphased genotype.
- Parameters
genotype – Variant genotype handler.
- Returns
1 for phased genotype; 0 otherwise.
-
struct bgen_genotype¶
Variant genotype handler.
Metafile¶
-
struct bgen_metafile *bgen_metafile_create(struct bgen_file *bgen_file, char const *filepath, uint32_t npartitions, int verbose)¶
Create a bgen metafile.
A bgen metafile contains variant metadata (id, rsid, chrom, alleles) and variant addresses. Those variants are grouped in partitions.
- Parameters
bgen_file – Bgen file handler.
filepath – File path to the metafile.
npartitions – Number of partitions. It has to be a number between 1 and the number of samples.
verbose – 1 for showing progress; 0 otherwise.
- Returns
Metafile handler. Return NULL on failure.
-
struct bgen_metafile *bgen_metafile_open(char const *filepath)¶
Open a bgen metafile.
Remember to call bgen_metafile_close to close the file and release resources after the interaction has finished.
- Parameters
filepath – File path to the metafile.
- Returns
Metafile handler. Return NULL on failure.
-
uint32_t bgen_metafile_npartitions(struct bgen_metafile const *metafile)¶
Get the number of partitions.
- Parameters
metafile – Metafile handler.
- Returns
Number of partitions.
-
uint32_t bgen_metafile_nvariants(struct bgen_metafile const *metafile)¶
Get the number of variants.
- Parameters
metafile – Metafile handler.
- Returns
Number of variants.
-
struct bgen_partition const *bgen_metafile_read_partition(struct bgen_metafile const *metafile, uint32_t partition)¶
Read a partition of variants.
Remember to call bgen_partition_destroy to release resources after the interaction has finished.
- Parameters
metafile – Metafile handler.
partition – Partition index.
- Returns
Partition of variants. Return
NULL
on failure.
-
int bgen_metafile_close(struct bgen_metafile const *metafile)¶
Close a metafile handler.
- Parameters
metafile – Metafile handler.
- Returns
0 on success; 1 otherwise.
-
struct bgen_metafile¶
Metafile handler.
Partition¶
-
void bgen_partition_destroy(struct bgen_partition const *partition)¶
Destroy a partition by releasing its resources.
- Parameters
partition – Partition of variants metadata.
-
uint32_t bgen_partition_nvariants(struct bgen_partition const *partition)¶
Get the number of variants.
- Parameters
partition – Partition of variants metadata.
- Returns
Number of variants.
-
struct bgen_partition¶
Partition of variants metadata.
Samples¶
-
void bgen_samples_destroy(struct bgen_samples const *samples)¶
Destroy samples data by releasing its resources.
- Parameters
samples – Samples data.
-
struct bgen_string const *bgen_samples_get(struct bgen_samples const *samples, uint32_t index)¶
Get a specific sample.
- Parameters
samples – Samples data.
index – Sample index.
- Returns
Sample information. Return
NULL
on failure.
-
struct bgen_samples¶
Bgen samples.
String¶
-
static struct bgen_string const *bgen_string_create(char const *data, size_t length)¶
Create a bgen string.
- Parameters
data – Usual C string. It does not need to be null-terminated.
length – String length.
- Returns
Bgen string.
-
static inline void bgen_string_destroy(struct bgen_string const *bgen_string)¶
Destroy a bgen string.
- Parameters
bgen_string – Bgen string.
-
static inline char const *bgen_string_data(struct bgen_string const *bgen_string)¶
Get a pointer to the C string.
- Parameters
bgen_string – Bgen string.
- Returns
Pointer to the internal C string.
-
static inline size_t bgen_string_length(struct bgen_string const *bgen_string)¶
Get string length.
- Parameters
bgen_string – Bgen string.
- Returns
Length.
-
static inline bool bgen_string_equal(struct bgen_string a, struct bgen_string b)¶
Check whether two bgen strings are equal.
- Parameters
a – First bgen string.
b – Second bgen string.
- Returns
true if they are equal; false otherwise.
-
static struct bgen_string BGEN_STRING(char const *str)¶
Initialize a bgen string from a null-terminated string.
- Parameters
str – Null-terminated string.
- Returns
Bgen string.
-
struct bgen_string¶
String.
Variant¶
-
struct bgen_variant¶
Variant metadata.
Public Members
-
uint64_t genotype_offset¶
Genotype offset (bgen file).
-
struct bgen_string const *id¶
Variant identification.
-
struct bgen_string const *rsid¶
RSID.
-
struct bgen_string const *chrom¶
Chromosome name.
-
uint32_t position¶
Base-pair position.
-
uint16_t nalleles¶
Number of alleles.
-
struct bgen_string const **allele_ids¶
Allele ids.
-