
Text selection used to be frustrating on mobile for me too until Google fixed it with OCR. I get to just hold a button briefly and then can immediately select an area of the screen to scan text from, with a consistent UX. Like a screenshot but for text.


They are using OCR for selecting plain text?


It's possible to use Gemini's "ask me about this screen" to OCR the selected area of the screenshot. I guess that might be more efficient in some contexts than trying to use the native text select.


On iPhone too, taking a screenshot is the single reliable way to select text.


It becomes possible. Getting the handles to move correctly is still often a frustrating experience.


At least it's not AI... yet.


Multi-modal LLMs like Gemini are better than traditional OCR in most ways.


It is a poor person, sitting in a 3rd world country, transcribing the text in your clipboard. See Alexa for details. /s

I'm only half joking.


There’s an API (Actually People Implemented) for that.


This is such an indictment of modern technology. No offense is meant to you for doing what works for you, but it is buck wild that this is the "fix" they've come up with. As somebody learning about this for the first time it sounds equivalent to a world where screenshotting became really hard so people started taking photos of their screen so they could screenshot the photo. How could such a fundamental aspect of using a computer become so ridiculous? It's like satire.


Unfortunately, some apps don't support text selection and on some websites the text selection is unpredictable.

I'd actually compare screen OCR to screenshots. Instead of every app and every website implementing their own screenshot functionality, the system provides one for you.

Same goes for text selection. Instead of every context having to agree on tagging the text and directions, your phone has a quick way of letting you scan the screen for text.

To be fair, I still use the "hold the text to select it" approach when I want to continue with the "select all" action and have some confidence that it is going to do what I want.


> some apps don't support text selection and on some websites the text selection is unpredictable.

That correctly identifies the problem. Now why is that, and how can we fix it?

It seems fixable; native GUI apps have COM bindings that can fairly reliably produce the text present in certain controls in the vast majority of cases. Web apps (and "desktop" apps that are actually web apps) have accessibility attributes and at least nominally the notion of separating document data from presentation. Now why do so few applications support text extraction via those channels? If the answer is "it's hard/easier not to", how can we make the right way easier than the wrong way?
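To make the web half of that concrete, here is a minimal sketch of pulling text out of markup by leaning on accessibility attributes instead of pixels. It uses only Python's standard `html.parser`; the names `AccessibleTextExtractor` and `extract_text` are made up for illustration, and the rules (skip `aria-hidden="true"` subtrees, prefer `aria-label` and image `alt`) are a deliberate simplification of what a real accessibility tree does.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AccessibleTextExtractor(HTMLParser):
    """Hypothetical helper: collects text a screen reader might plausibly expose."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.hidden_depth = 0  # > 0 while inside a skipped subtree

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.hidden_depth:
            # Already inside a skipped subtree: only track nesting.
            if tag not in VOID_TAGS:
                self.hidden_depth += 1
            return
        if attrs.get("aria-hidden") == "true" or tag in ("script", "style"):
            if tag not in VOID_TAGS:
                self.hidden_depth = 1
            return
        # Assumption: an explicit label (or image alt text) is worth emitting.
        label = attrs.get("aria-label") or (attrs.get("alt") if tag == "img" else None)
        if label:
            self.chunks.append(" ".join(label.split()))

    def handle_endtag(self, tag):
        if self.hidden_depth and tag not in VOID_TAGS:
            self.hidden_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self.hidden_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of text chunks found in `html`, in document order."""
    parser = AccessibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Calling `extract_text(page_source)` returns the chunks in document order. It only works as well as the markup is annotated, which is sort of the point: the channel exists, but apps have to populate it.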


Does it automatically scroll down while selecting if the text is larger than the screen?


Fair point, it does not on my device.


That’s how I do it on the iPhone as well. I take a screenshot first.

You can count on it, it is reliable, it always works.


Unless you need to select more text than fits on the screen.



