clustringr - Cluster Strings by Edit-Distance
Returns an edit-distance based clusterization of an input vector of strings. Each cluster will contain a set of strings w/ small mutual edit-distance (e.g., levenshtein, optimum-sequence-alignment, damerau-lev), as computed by stringdist::stringdist(). The set of all mutual edit-distances is then used by g graph algorithms (from package igraph) to single out subsets of high connectivity.
Last updated 6 years ago
clusteringgraphsstrings
3.88 score 15 stars 4 scripts 174 downloads