DrDubWiki | t-SNE for All

There is this wonderful algorithm for dimensionality reduction for visualization purposes called t-distributed stochastic neighbor embedding, t-SNE for short, introduced by van der Maaten and Hinton (2008):

Van der Maaten, Laurens, and Geoffrey Hinton. "Visualizing data using t-SNE." Journal of Machine Learning Research 9.11 (2008).

This algorithm is wildly popular in data science. In NLP, its claim to fame is visualizing embeddings obtained, for example, from Word2Vec.

Now, it has even been said that if people could see in high dimensions, machine learning would not be necessary. So suppose we apply t-SNE to datasets of interest to people, for example friends, locations, or jobs, and color-code the dots in the 2D projection with class labels of importance to them (e.g., did you like that job?). The users could then figure out a boundary by eye and make their own decisions *without the need of machine learning*.

The idea here is to make a simple tool that allows people to get t-SNE projections. For the most part, that hinges on putting together distance metrics between their instances of interest that will work for t-SNE.
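To make the distance-metric part concrete, here is a minimal sketch, not a finished design. It assumes each instance is a small record mixing numeric and categorical fields, and it uses a Gower-style distance, which is one possible choice among many and not something this page prescribes. The job records, the field names, and the helpers `gower_distance_matrix` and `project_2d` are hypothetical. The projection step assumes scikit-learn and NumPy are installed and relies on `TSNE` accepting `metric="precomputed"`, which in turn requires `init="random"`.

```python
from typing import Dict, List, Sequence, Union

Value = Union[float, str]
Record = Dict[str, Value]


def gower_distance_matrix(records: Sequence[Record]) -> List[List[float]]:
    """Pairwise Gower-style distances in [0, 1] for mixed records."""
    fields = list(records[0].keys())
    # Range of each numeric field, used to scale differences into [0, 1].
    ranges = {}
    for f in fields:
        if isinstance(records[0][f], (int, float)):
            values = [r[f] for r in records]
            ranges[f] = (max(values) - min(values)) or 1.0
    n = len(records)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for f in fields:
                if f in ranges:
                    # Numeric field: range-scaled absolute difference.
                    total += abs(records[i][f] - records[j][f]) / ranges[f]
                else:
                    # Categorical field: 0 if equal, 1 otherwise.
                    total += 0.0 if records[i][f] == records[j][f] else 1.0
            dist[i][j] = dist[j][i] = total / len(fields)
    return dist


def project_2d(dist, perplexity=2.0, seed=0):
    """Run t-SNE on a precomputed distance matrix (needs scikit-learn)."""
    # Imported lazily so the distance code runs without scikit-learn.
    import numpy as np
    from sklearn.manifold import TSNE

    tsne = TSNE(n_components=2, metric="precomputed", init="random",
                perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(np.asarray(dist))


# Hypothetical toy data: four jobs described by three fields.
jobs = [
    {"salary": 50_000, "commute_min": 20, "field": "teaching"},
    {"salary": 52_000, "commute_min": 25, "field": "teaching"},
    {"salary": 90_000, "commute_min": 60, "field": "software"},
    {"salary": 95_000, "commute_min": 55, "field": "software"},
]
D = gower_distance_matrix(jobs)
```

Calling `project_2d(D)` would then return one 2D point per job, ready to be plotted and colored by a label such as "liked it" or "did not like it". The perplexity has to stay below the number of instances, so small personal datasets need a small value like the one assumed here.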
