
For NLP, what API/lib are you guys using?


The language and entity recognition use custom code that isn't part of any standard lib. For instance, in a query like "whats a good happy hour place in the mission?", nothing is capitalized, and none of the standard libs out there can deal with it.

For building blocks, it's a mix: word2vec for feature vectors, plus some parts of spaCy (though still experimental).


I'm unsure of what you mean by stating that none of the standard libs include what you need. Could you expand on this point?

Using your example, why couldn't you use a standard part-of-speech tagger, even on a poorly punctuated sentence?

Also, could you expand on the NLP techniques you are actually using? How are you embedding the words/phrases/sentences into the vector space? Are you doing compositional embedding (unsure of the proper phrasing), where you embed a word, the phrase the word is in, the sentence the word is in, etc., all in the same vector space?


@crazypro - The question for me was: "what is the most important phrase and what does it mean?" In the example, "happy hour" and "mission" are the two things I need. Using some sort of knowledge base is hard because I can't predict every query the user may throw at me, and there will always be issues with disambiguation. For locations, I could have a gazetteer, which would make life easier.
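To make the gazetteer idea concrete, here is a minimal sketch of a case-insensitive, longest-match lookup. The `GAZETTEER` entries and the `find_locations` name are hypothetical, made up for illustration, and this is not the actual system; a real gazetteer would be much larger and would still need disambiguation.

```python
import re

# Hypothetical gazetteer: lowercase place names mapped to a label.
GAZETTEER = {
    "mission": "neighborhood",
    "the mission": "neighborhood",
    "noe valley": "neighborhood",
    "san francisco": "city",
}

# Longest entry, in tokens, so we know how wide a window to try.
MAX_NGRAM = max(len(name.split()) for name in GAZETTEER)


def find_locations(query):
    """Return (phrase, label) pairs found by longest-match lookup."""
    # Lowercase first, so capitalization in the query never matters.
    tokens = re.findall(r"[a-z0-9']+", query.lower())
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "the mission" beats "mission".
        for n in range(min(MAX_NGRAM, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in GAZETTEER:
                matches.append((phrase, GAZETTEER[phrase]))
                i += n
                break
        else:
            # No entry starts at this token; move on.
            i += 1
    return matches


print(find_locations("whats a good happy hour place in the mission?"))
```

This only covers locations. Something like "happy hour" has no fixed list to look up, which is why the rest needs a different approach.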

What I meant was that existing libs, in my limited experience, can't do entity recognition (even those that use classifier-based chunkers) unless you've got a good dataset of similar conversations, which I don't have yet. So my option was to build something that could take an arbitrary sentence and pick out the entities with only limited training data. Does that make sense?

I'm utilizing word/phrase vectors - haven't moved to sentence vectors yet. And they are in the same space. I'm not entirely sure what compositional embedding entails, so presumably I'm not doing that. I would however love to chat offline and get more insight from you on it.
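For anyone following along, one common way to get phrase vectors into the same space as word vectors is to average the word vectors. This is a sketch of that general technique under stated assumptions, not necessarily what is running here: the `WORD_VECTORS` values are toy numbers invented for illustration (real word2vec vectors are learned from a corpus and have hundreds of dimensions), and `phrase_vector` and `cosine` are hypothetical helper names.

```python
import math

# Toy 3-dimensional vectors with made-up values, for illustration only.
WORD_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "hour": [0.7, 0.3, 0.0],
    "bar": [0.8, 0.2, 0.1],
    "mission": [0.0, 0.2, 0.9],
}


def phrase_vector(phrase):
    """Average the vectors of the known words in a phrase."""
    vectors = [WORD_VECTORS[w] for w in phrase.lower().split() if w in WORD_VECTORS]
    if not vectors:
        raise ValueError(f"no known words in {phrase!r}")
    dims = len(vectors[0])
    # The average has the same dimensionality as a single word vector,
    # so words and phrases can be compared directly.
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


happy_hour = phrase_vector("happy hour")
print(cosine(happy_hour, WORD_VECTORS["bar"]))
print(cosine(happy_hour, WORD_VECTORS["mission"]))
```

Averaging ignores word order, so it is a crude form of composition, but it keeps words and phrases directly comparable with a single similarity function.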



