
We can! At Kyutai, we released a real-time, on-device speech translation demo last week. For now, it only works for French-to-English translation, on an iPhone 16 Pro: https://x.com/neilzegh/status/1887498102455869775

We released the inference code and weights; you can check our GitHub here: https://github.com/kyutai-labs/hibiki


Good work. The delay seems to be around 5 seconds. This is a step in the right direction. I'm wondering how much closer to real time we can push it.


Damn, this is pretty amazing. Feels like we’re not far off from the babel fish.



Except it does? After Equation 2: "v_w and v'_w are the input and output vector representations of w."
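For reference, Equation 2 in that paper is the skip-gram softmax (writing it from memory, so double-check against the PDF):

  p(w_O \mid w_I) = \frac{\exp({v'_{w_O}}^\top v_{w_I})}{\sum_{w=1}^{W} \exp({v'_w}^\top v_{w_I})}

so both v_w and v'_w are defined right where that sentence says they are.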


We decided to keep the casing, as it is useful for some applications such as named entity recognition.

Regarding the punctuation, as pointed out in another comment, these tokens might also be useful for some applications (and they are easy to filter out if you don't need them).
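If you do want to strip them, it's a one-liner once the vectors are loaded. A rough sketch (the `vectors` dict here is a toy placeholder, not the real file):

  # Sketch: drop punctuation-only tokens from a loaded {word: vector} dict.
  import string

  vectors = {"the": [0.1, 0.2], ",": [0.0, 0.3], "}": [0.2, 0.1]}  # toy placeholder
  punct = set(string.punctuation)
  filtered = {w: v for w, v in vectors.items() if any(c not in punct for c in w)}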


In the Tagalog file, } is near the top but { is over 8,000 lines down. Is there a reason they have such different frequencies? ( and ) are right next to each other.

And yes I realize this is a really odd question :)


This is probably due to our preprocessing of Wikipedia, which did not get rid of all the '}' from the markup.


Oh true. I tried to clean up Wiki markup for ML years ago and it was a huge pain. Next time I think I'll parse the HTML version and pull out the text from the tags explicitly.


This is a much better way to do it. It's easier, cleaner, and gets the text generated by templates, of which there is a surprising amount (you get weird artifacts otherwise).
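If anyone goes that route, here's a rough sketch of what I mean (using requests + BeautifulSoup; the selectors are my guess at the current Wikipedia page layout, and the article URL is just an example):

  # Sketch: pull visible article text out of a rendered Wikipedia page.
  import requests
  from bs4 import BeautifulSoup

  html = requests.get("https://en.wikipedia.org/wiki/Word_embedding").text  # example article
  soup = BeautifulSoup(html, "html.parser")

  # Drop markup we never want as training text (scripts, tables, reference markers, ...).
  for tag in soup(["script", "style", "table", "sup"]):
      tag.decompose()

  # Paragraph text only; template output is already expanded in the rendered HTML.
  text = "\n".join(p.get_text(" ", strip=True) for p in soup.select("#mw-content-text p"))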


Your comment has twice as many ) as it does (

My first guess would be emojis ;)


These models were trained in an unsupervised way, and thus cannot be used with the "predict" mode of fastText.

The .bin models can be used to generate word vectors for out-of-vocabulary words:

  > echo 'list of words' | ./fasttext print-vectors model.bin
or

  > ./fasttext print-vectors model.bin < queries.txt
where queries.txt is a list of words you want a vector representation for.
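If you'd rather do this from Python, the official bindings expose the same thing (sketch; assuming the `fasttext` pip package, which may differ from the version you have installed):

  # Sketch using the fasttext Python bindings (pip install fasttext).
  # get_word_vector also works for out-of-vocabulary words, via subword n-grams.
  import fasttext

  model = fasttext.load_model("model.bin")
  vec = model.get_word_vector("unseenword")  # one vector per word
  print(vec.shape)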


Hi, because we trained these vectors on Wikipedia, we released models for the 90 largest Wikipedias first (in terms of training data size). More models are on the way, including Irish.


I suspected it was something like this. Unfortunately the Vicipéid is not of very high quality. I just hope Facebook doesn't forget which side its bread is buttered on.


Regarding the size of the word vector files: the text files are sorted by frequency, so it is easy to load only the top k words.

We might also release smaller models in the future, for machines without a lot of memory.
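In case it helps, a minimal way to take advantage of that ordering (assuming the usual .vec text format: a "count dim" header line, then one word and its values per line):

  # Sketch: load only the top-k most frequent words from a .vec text file.
  import numpy as np

  def load_top_k(path, k=100000):
      vectors = {}
      with open(path, encoding="utf-8") as f:
          next(f)  # skip the "count dim" header line
          for i, line in enumerate(f):
              if i >= k:
                  break
              parts = line.rstrip().split(" ")
              vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
      return vectors

  top = load_top_k("wiki.en.vec", k=50000)  # file name is just an example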


FWIW I have 32 GB on my workstation and my personal laptop is maxed out at 16 GB. Keeping within these thresholds may be useful to others.


These models were trained on Wikipedia.

It should be "Western Frisian" instead of "Western" (https://en.wikipedia.org/wiki/West_Frisian_language). Thanks for the catch!


Thanks.


Models are trained independently for each language. So unfortunately, you cannot directly compare words from different languages using these vectors.

If you have a bilingual dictionary, you might try to learn a linear mapping from one language to the other (e.g. see https://arxiv.org/abs/1309.4168 for this approach).
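Very roughly, with a list of translation pairs it reduces to a least-squares problem. A sketch (the random matrices are placeholders for the dictionary-aligned source and target vectors, not their exact training setup):

  # Sketch of the linear mapping idea from the paper above.
  import numpy as np

  X = np.random.randn(5000, 300)  # placeholder: source-language vectors for dictionary words
  Y = np.random.randn(5000, 300)  # placeholder: target-language vectors for their translations

  # Solve min_W ||XW - Y||^2 in closed form.
  W, *_ = np.linalg.lstsq(X, Y, rcond=None)

  # Map a new source-language vector into the target space, then do nearest-neighbour search there.
  mapped = X[0] @ W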


The graph algorithm described in the blog post is more closely related to label propagation (which is more than 10 years old) than to "retrofitting". And the Google paper linked in the blog post cites the relevant literature correctly.


I probably sounded more accusatory than I should have, and I apologize for that wording.

But I do think this is much more like retrofitting than like label propagation. It's the vectors that are being propagated, as I understand it, not labels.
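To make the distinction concrete, this is roughly the kind of update I mean by propagating vectors (a generic sketch, not the algorithm from either paper):

  # Generic sketch: propagate *vectors* over a graph (retrofitting-style averaging),
  # as opposed to propagating discrete class labels.
  import numpy as np

  def propagate(vectors, neighbours, alpha=1.0, iters=10):
      # vectors: {node: original embedding}, neighbours: {node: list of neighbour nodes}
      q = {w: v.copy() for w, v in vectors.items()}
      for _ in range(iters):
          for w, v_hat in vectors.items():
              nbrs = neighbours.get(w, [])
              if not nbrs:
                  continue
              # Each node moves toward the average of its neighbours while staying
              # anchored (with weight alpha) to its original vector.
              q[w] = (alpha * v_hat + sum(q[n] for n in nbrs)) / (alpha + len(nbrs))
      return q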

