Bernoulli Bayesian Sets

class bayessets.BernoulliBayesianSet(dataset, meanfactor=2, alpha=None, beta=None, alphaepsilon=0, betaepsilon=0)

Bayesian Sets assuming that each feature follows an independent Bernoulli distribution. Because the Beta distribution is the conjugate prior of the Bernoulli, the scores can be computed efficiently with a matrix multiplication.

The model needs hyperparameters alpha and beta, but these can be estimated using the mean of each feature. To use this estimation, the meanfactor argument should be set, while alpha and beta should be None.
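
The mean-based estimate can be sketched as follows. This assumes the rule from the original Bayesian Sets paper, alpha = meanfactor * mean and beta = meanfactor * (1 - mean); the library's exact computation may differ, and estimate_hyperparameters is a hypothetical helper, not part of bayessets.

```python
def estimate_hyperparameters(dataset, meanfactor=2):
    """Return (alpha, beta), one value per feature (column).

    Hypothetical helper: assumes alpha = meanfactor * mean and
    beta = meanfactor * (1 - mean), as in the original paper.
    """
    n_items = len(dataset)
    n_features = len(dataset[0])
    # Mean of each binary feature over all items
    means = [sum(row[j] for row in dataset) / n_items for j in range(n_features)]
    alpha = [meanfactor * m for m in means]
    beta = [meanfactor * (1 - m) for m in means]
    return alpha, beta


# Toy binary dataset: 4 items (rows), 3 features (columns)
dataset = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
]
alpha, beta = estimate_hyperparameters(dataset)
```

With this toy data the feature means are 0.75, 0.75 and 0.25, so a larger meanfactor yields a stronger prior around those means.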

The alphaepsilon and betaepsilon arguments are intended for features whose mean is (close to) 0 or 1, because such features cause log-of-zero operations. One option in this case is to inspect these features; if floating-point precision is still a problem after that, the epsilon arguments can be used to correct the issue.
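
The sketch below illustrates why such features are a problem. It assumes that the epsilons are simply added to the mean-based estimates, which is a guess about the library's behaviour; safe_hyperparameters is a hypothetical helper.

```python
import math


def safe_hyperparameters(means, meanfactor=2, alphaepsilon=0.0, betaepsilon=0.0):
    """Hypothetical helper: mean-based estimates shifted by small epsilons."""
    alpha = [meanfactor * m + alphaepsilon for m in means]
    beta = [meanfactor * (1 - m) + betaepsilon for m in means]
    return alpha, beta


# First feature is never set (mean 0), last one is always set (mean 1)
means = [0.0, 0.5, 1.0]

# Without epsilons, alpha[0] and beta[2] are exactly 0, so log() fails
alpha, beta = safe_hyperparameters(means)
try:
    math.log(alpha[0])
    log_failed = False
except ValueError:
    log_failed = True

# With small epsilons, every hyperparameter is strictly positive
alpha_eps, beta_eps = safe_hyperparameters(means, alphaepsilon=1e-6, betaepsilon=1e-6)
logs_ok = all(math.isfinite(math.log(v)) for v in alpha_eps + beta_eps)
```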

compute_query_parameters(query_indices)

Computes the query parameters: the rank constant and the rank query (called ‘c’ and ‘q’, respectively, in the original paper).
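
As a rough illustration, the two parameters can be computed from the formulas in the original Bayesian Sets paper. The function below is a standalone sketch in plain Python under that assumption, not the library's implementation; query_parameters and the toy values are hypothetical.

```python
import math


def query_parameters(dataset, query_indices, alpha, beta):
    """Return (c, q) for a seed set, following the paper's formulas."""
    n_query = len(query_indices)
    n_features = len(alpha)
    # Number of seed items that have each feature set
    counts = [sum(dataset[i][j] for i in query_indices) for j in range(n_features)]
    # Posterior Beta parameters after observing the seed set
    alpha_t = [a + s for a, s in zip(alpha, counts)]
    beta_t = [b + n_query - s for b, s in zip(beta, counts)]
    # Rank constant: the part of the log score shared by every item
    c = sum(
        math.log(a + b) - math.log(a + b + n_query) + math.log(bt) - math.log(b)
        for a, b, bt in zip(alpha, beta, beta_t)
    )
    # Rank query: one weight per feature
    q = [
        math.log(at) - math.log(a) - math.log(bt) + math.log(b)
        for a, b, at, bt in zip(alpha, beta, alpha_t, beta_t)
    ]
    return c, q


dataset = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
]
alpha = [1.5, 1.5, 0.5]  # assumed values: 2 * feature mean
beta = [0.5, 0.5, 1.5]   # assumed values: 2 * (1 - feature mean)
c, q = query_parameters(dataset, [0, 1], alpha, beta)
```

Under these formulas the log score of an item x is c plus the dot product of q and x, which is why scoring the whole dataset reduces to one matrix-vector product.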

query(query_indices)

Computes the expansion of the seed set given by the argument query_indices.

Parameters: query_indices – list of the indices of items in the seed set
Returns: ndarray – the score of each item
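
A minimal sketch of how such scores could be produced and turned into a ranking, assuming the score of an item is the log score c + q·x from the original paper. The values of c and q below are illustrative only, and score_items is a hypothetical helper.

```python
def score_items(dataset, c, q):
    """Score every item as c + dot(q, x); higher means closer to the seed set."""
    return [c + sum(q_j * x_j for q_j, x_j in zip(q, row)) for row in dataset]


dataset = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
]
c = -0.5              # illustrative rank constant
q = [1.0, -0.5, 0.5]  # illustrative rank query, one weight per feature

scores = score_items(dataset, c, q)
# Item indices sorted from best to worst match; ties keep dataset order
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

The items at the top of the ranking are the proposed expansion of the seed set.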

query_many(queries)

Computes the expansion of each seed set given in the argument queries. The dataset matrix is multiplied only once, so this method may be preferable for larger or denser matrices.

Parameters: queries – list of the query indices
Returns: matrix – the score of each item for each query
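
The idea of scoring several queries in one product can be sketched as below. The weight vectors of all queries are stacked as the columns of one matrix, so the dataset is traversed once. Nested lists stand in for the matrix types here, and score_many with its inputs is a hypothetical illustration rather than the library's code.

```python
def score_many(dataset, constants, weights):
    """Return scores[i][k]: the score of item i for query k.

    constants[k] and weights[k] are the (assumed) rank constant and
    rank query of query k.
    """
    n_queries = len(constants)
    # Stack the per-query weight vectors as columns: Q[j][k]
    Q = list(zip(*weights))
    # One pass over the dataset computes X @ Q, then adds each constant
    return [
        [
            constants[k] + sum(x_j * Q[j][k] for j, x_j in enumerate(row))
            for k in range(n_queries)
        ]
        for row in dataset
    ]


dataset = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
]
constants = [-0.5, 0.0]                         # illustrative values
weights = [[1.0, -0.5, 0.5], [0.0, 1.0, 0.0]]   # illustrative values

scores = score_many(dataset, constants, weights)
```

Each row of the result corresponds to an item and each column to a query, matching the "score of each item for each query" layout described above.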