tacco.utils.cdist
- cdist(A, B=None, metric='euclidean', parallel=True)
Calculate a dense pairwise distance matrix of sparse and dense inputs. For some metrics (‘euclidean’, ‘cosine’), this is considerably faster than scipy.spatial.distance.cdist(). For most other metrics, this falls back to scipy.spatial.distance.cdist().
Special distances are:
‘bc’: 1 - Bhattacharyya coefficient, a cosine similarity equivalent for the Bhattacharyya coefficient, which is the overlap of two probability distributions. The input vectors are first normalized to sum to 1.
‘bc2’: 1 - (Bhattacharyya coefficient)^2, a cosine similarity equivalent for the squared Bhattacharyya coefficient. The input vectors are first normalized to sum to 1.
‘hellinger’: the Hellinger(-Bhattacharyya) distance, defined as sqrt(1 - Bhattacharyya coefficient).
‘h2’: squared Hellinger distance; synonymous with ‘bc’.
- Parameters:
- Returns:
An ndarray containing the distances.
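
The special distances described above can be illustrated with a small NumPy-only reference sketch. It is not the tacco implementation; it only restates the definitions given here, assuming the usual form of the Bhattacharyya coefficient, BC(p, q) = sum_i sqrt(p_i * q_i), for dense inputs with non-negative entries. The helper names bhattacharyya_coefficient and reference_distances are hypothetical and exist only for this example.

```python
import numpy as np

def bhattacharyya_coefficient(A, B):
    """Pairwise Bhattacharyya coefficients between the rows of A and the rows of B.

    Hypothetical helper for illustration; not part of tacco.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Normalize every row to sum to 1 so it can be read as a probability distribution.
    P = A / A.sum(axis=1, keepdims=True)
    Q = B / B.sum(axis=1, keepdims=True)
    # BC(p, q) = sum_i sqrt(p_i * q_i): a dot product of element-wise square roots.
    return np.sqrt(P) @ np.sqrt(Q).T

def reference_distances(A, B):
    """Dense matrices for the special distances, keyed by metric name."""
    # Clip to [0, 1] to guard against floating-point overshoot before taking sqrt.
    bc = np.clip(bhattacharyya_coefficient(A, B), 0.0, 1.0)
    return {
        "bc": 1.0 - bc,                   # 1 - Bhattacharyya coefficient
        "bc2": 1.0 - bc**2,               # 1 - (Bhattacharyya coefficient)^2
        "hellinger": np.sqrt(1.0 - bc),   # sqrt(1 - Bhattacharyya coefficient)
        "h2": 1.0 - bc,                   # squared Hellinger distance, same as 'bc'
    }

# Two small sets of unnormalized, non-negative row vectors.
A = np.array([[1.0, 1.0, 0.0], [0.0, 2.0, 2.0]])
B = np.array([[2.0, 2.0, 0.0], [0.0, 0.0, 5.0]])

D = reference_distances(A, B)
print(D["bc"])
print(D["hellinger"])
```

Each returned matrix has one row per row of A and one column per row of B. Going by the description above, cdist(A, B, metric='bc') should produce comparable values for the same inputs, although that has not been checked here.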