Sally is a small tool for mapping a set of strings to a set of vectors. This mapping, referred to as embedding, makes it possible to apply machine learning and data mining techniques to string data. Sally implements a standard technique for mapping strings to a vector space, often referred to as the vector space model or bag-of-words model. Each string is characterized by a set of features, and each feature is associated with one dimension of the vector space. Sally counts the occurrences of the specified features in each string and generates a sparse vector of count values. The tool then normalizes the vectors and outputs them in a given format.
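
The count, normalize, and output steps described above can be illustrated with a minimal Python sketch. This is not Sally's implementation or interface: the choice of character n-grams as features, the L2 normalization, the `feature:value` text output, and the function names `ngrams`, `embed`, and `format_vector` are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(text, n=2):
    """Return all contiguous character n-grams of text (assumed feature type)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def embed(text, n=2):
    """Map a string to a sparse vector of n-gram counts, scaled to unit L2 norm.

    The vector is stored as a dict, so only features that actually occur
    in the string take up space.
    """
    counts = Counter(ngrams(text, n))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        # The string is shorter than n, so it contains no features.
        return {}
    return {feature: c / norm for feature, c in counts.items()}


def format_vector(vec):
    """Render a sparse vector as 'feature:value' pairs (hypothetical format)."""
    return " ".join(f"{f}:{v:.3f}" for f, v in sorted(vec.items()))


if __name__ == "__main__":
    for s in ["abab", "hello world"]:
        print(s, "->", format_vector(embed(s)))
```

Each distinct n-gram plays the role of one dimension of the vector space, and the dictionary keeps the representation sparse because absent features are never stored.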