
Aren't you just describing a bag-of-words model?

https://en.wikipedia.org/wiki/Bag-of-words_model



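Yes! And the follow-up is that cosine similarity (for BoW) is a super simple similarity metric: the dot product counts up the words the two vectors have in common (weighted by how often each occurs), and dividing by the two vectors' lengths normalizes for document size.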
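
For anyone who wants to see it concretely, here is a minimal sketch in Python. It assumes lowercase, whitespace-split tokens, and the function names `bag_of_words` and `cosine_similarity` are just made up for illustration, not from any library.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Map a text to word counts (assumes lowercase + whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    # Only words present in both bags contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty texts
    return dot / (norm_a * norm_b)


doc1 = bag_of_words("the cat sat on the mat")
doc2 = bag_of_words("the cat lay on the rug")
print(cosine_similarity(doc1, doc2))
```

With plain 0/1 presence vectors the dot product is literally the number of shared words; with counts, repeated words like "the" weigh more. Texts with no words in common score 0.0, and identical texts score 1.0 (up to floating-point rounding).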



