I'm working with Germans, Japanese and Poles who use a lot of special characters, including Kanji, umlauts, extra quote characters etc. I need the non-ASCII parts and had no problem with them in Python 2 on Linux (now, C++ libraries that reinvented their own string classes: many problems; C libraries: no problems).
The point is that when bytes is str, everything works just fine in Python 2 on Linux with a UTF-8 locale (which is what all modern Linux distributions use). No need to have a distinction between bytes and str.
That's how the rest of the OS works, too. Even a lot of Gtk, Glib and so on (the GNOME desktop environment, for example) assume that you are in a UTF-8 locale, e.g. for file names.
> A more abstract "string" type is all they need, without worrying about how it works under the hood (and if they do, they need to understand how encode/decode works properly anyway).
Ehh, we had students write drivers for measurement apparatuses and they all used Python 2 str (without being prompted to do so). No encode or decode anywhere. Of the students who tried Python 3 for that, almost no one stayed with it (they went back to Python 2). There was just no upside for this use case.
I agree that, long term, having a distinction between str and bytes makes sense. But then you ARE juggling things that the OS doesn't need--it's basically busywork on Linux.
I'm not trying to minimize your experience--but I don't think it would happen if you tried Python 2 on Linux today. Not sure it was worth breaking compat for that.
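To make the encode/decode point concrete, here is roughly what a driver helper ends up looking like in Python 3. This is only a sketch: `FakeInstrument` and `query` are made-up names standing in for a real serial port and a real driver function, and the reply string is invented.

```python
class FakeInstrument(object):
    """Hypothetical stand-in for a serial port: accepts and returns bytes only."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        # A real port object in Python 3 would reject text here as well.
        if not isinstance(data, bytes):
            raise TypeError("expected bytes, got %s" % type(data).__name__)
        self.sent.append(data)

    def readline(self):
        # Invented reply, for illustration only.
        return b"ACME,MODEL-1,0,1.0\n"


def query(port, command):
    """Send a text command and return the reply as text."""
    port.write(command.encode("ascii") + b"\n")       # text -> bytes on the way out
    return port.readline().decode("ascii").strip()    # bytes -> text on the way in


reply = query(FakeInstrument(), "*IDN?")
print(reply)
```

In Python 2 the students simply wrote `port.write(command + "\n")` and used the return value of `readline()` directly, because the str they typed already was the bytes the port wanted.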
There was no conversion. `bytes` and `str` were the same type.
http://docs.python.org/whatsnew/2.6.html#pep-3112-byte-liter... says:
> Python 2.6 adds bytes as a synonym for the str type, and it also supports the b'' notation.
I just checked in Python 2.7:
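Something like the following reproduces the check. It is a sketch rather than the original session, and `same_type` and `literal_is_str` are just names picked here. It runs on both Python 2.6+ and Python 3 and reports what it finds instead of assuming the answer.

```python
import sys

# On Python 2.6+ `bytes` is an alias for `str`; on Python 3 they are distinct types.
same_type = bytes is str
literal_is_str = isinstance(b"abc", str)

print("Python %d.%d" % sys.version_info[:2])
print("bytes is str: %s" % same_type)
print("isinstance(b'abc', str): %s" % literal_is_str)
```

On 2.7 both checks print True; on Python 3 both print False.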