
Looks similar to the new FUTO keyboard: https://voiceinput.futo.org/


I've been using this for a while (the voice input, not their keyboard) and it's so refreshing to be able to just speak and have the output come out as fully formed, well punctuated sentences with proper capitalization.


I agree. No more "speaking punctuation". Just talk as normal and it comes out fully formed.


I actually don't mind speaking punctuation; in fact, it kind of helps. What I really hate is the middle ground where we are right now, where it tries to place punctuation and sucks badly at it.


In my experience, FUTO is actually pretty good at just knowing the right punctuation to use.


Anything like that available for iOS?


iOS already has on-device dictation built into the standard keyboard.

Years ago, dictation audio got sent to the cloud, but as long as you have an iPhone from the past few years, it's processed on-device.


You're right that it exists, but it's complete crap outside a quiet environment. Try to use it while walking around outside or in any semi-noisy area and it fails horribly (iPhone 13, so YMMV if you have a newer one).

You cannot use an iPhone as a dictation device without reviewing the transcribed text, which IMO defeats the purpose of dictation.

Meanwhile, I've gotten excellent results on the iPhone from a Whisper->LLM pipeline.


I've never found real-time dictation software that doesn't need to be reviewed.

I'm definitely waiting for Apple to upgrade their dictation software to the next generation -- I have my own annoyances with it -- but I haven't found anything else that works way better, in real time, on a phone, that runs in the background (like as part of the keyboard).

You talk about Whisper but that doesn't even work in real time, much less when you have to run it through an LLM.


What's the real-time requirement for? We may have different use cases, but it's not needed if I don't need to review the results. Speak -> Send, without reviewing the text, is the desired workflow. I.e. so you can compose messages without looking at your phone.

So yes, I'm not sure of alternate real-time solutions, but the non-real-time solution of Whisper is much better for my real-world use case.


Aiko, mentioned elsewhere, includes a local copy of the OpenAI Whisper model: https://apps.apple.com/app/aiko/id1672085276


Aiko is a free app for iOS and macOS that also uses Whisper for local speech-to-text.


There is also Sayboard (open-source, multiple languages): https://github.com/ElishaAz/Sayboard


This looks great! I've been wanting to drop the Swipe keyboard ever since I saw sneaky ads on it (like me typing "Google Maps" and getting "Bing Maps" as a "suggestion").


But open source, which is a pretty big difference


FUTO and Transcribro are open source.


No, FUTO made a new "Source First License"[1] that is not Open Source by the OSI definition.

[1] https://github.com/futo-org/android-keyboard/blob/master/LIC...


I can get behind people doing their own custom "licenses" that amount to throwing their work into the public domain, but if someone builds their own limited licenses around a thing, I won't touch their product. This FUTO license is garbage. Use a real license and either be open source or not; inventing new personal licenses doesn't do anyone any good.


Oh, that's lame.


FUTO is not open source.

https://gitlab.futo.org/alex/voiceinput/-/blob/master/LICENS...

> FUTO Source First License 1.0

> You may use or modify the software only for non-commercial purposes



