Hacker News

The same arguments can be applied to HTML, CSS, JSON, RSS and so on. I fail to see the crucial difference between those and HTTP. Or would you say the web as a whole should be binary?

> Fifth, redundancies introduced from the beginning of time need to go away

I wholeheartedly agree with this, but it doesn't automatically warrant binary encoding.

Human readability is a huge bonus in any protocol or format. Not because people normally read those protocols, but because people read ASCII and therefore have good tools for working with it.



...? Seriously? You don't see the difference?

HTTP is a layer 7 communication protocol. HTML/CSS are markup languages for designing an interface. JSON is a data interchange format. RSS is a content syndication format.

They are all wildly, vastly different. The only thing they have in common is they're all ASCII. If anything, you're making my argument for me: a communications protocol is not a format for displaying documents, it is a language for communicating machine instructions to network applications. Historically they have always been binary because it works better that way.

Your argument that "people can read ASCII, so ASCII is good" leaves out a couple points. Like, human beings do not read an HTTP statement, go into a file folder, bring out a document and present it to their computer. It's the other way around.

Really this just reflects a strange phobia people seem to have. Like your brain is tricking you into thinking you'll lose something by not looking directly "at the wire".

When you look at HTTP headers, 90% of the time you're actually looking at a pre-parsed, normalized set of fields. If you look at a raw packet dump, the whole message may not show up in one packet; you may have to reassemble it, which means parsing. If you have multiple requests in one connection, you have to find the end of the last request, which means seeking through the stream; seeing requests broken down individually means a tool already parsed them. Firebug and wireshark and other tools all take care of the automated, machine-operated work for you.

And what's left? What do you have to do with HTTP, really? Apache rules? They'd stay human-readable. Application testing? We use proxies that handle it, and APIs for client/server programming. Firewalling? Handled by tools and appliances.

Stop giving me the blanket "ASCII is great for everything" excuse and tell me one thing, one single thing, that only humans are able to do with HTTP using their eyes without a tool. But you don't have to, because that's impossible: HTTP is not for humans.
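
As a rough illustration of the "finding the end of the last request" step mentioned above, here is a minimal sketch that splits one connection's byte stream into individual requests. It assumes every request either has no body or declares a Content-Length; chunked encoding is not handled, and `split_requests` is a hypothetical helper name, not part of any real tool.

```python
def split_requests(stream: bytes) -> list[bytes]:
    """Split a byte stream of back-to-back HTTP/1.1 requests into messages.

    Assumption: bodies are delimited by Content-Length only.
    """
    requests = []
    pos = 0
    while pos < len(stream):
        # The header block ends at the first blank line (CRLF CRLF).
        head_end = stream.find(b"\r\n\r\n", pos)
        if head_end == -1:
            break  # incomplete message: more packets are needed
        body_start = head_end + 4
        body_len = 0
        # Skip the request line, then look for a Content-Length field.
        for line in stream[pos:head_end].split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                body_len = int(value.strip())
        end = body_start + body_len
        if end > len(stream):
            break  # body not fully received yet
        requests.append(stream[pos:end])
        pos = end
    return requests
```

Even for a text protocol, separating messages this way is already parsing; a trailing partial request is simply left for the next read.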


> Apache rules? They'd stay human-readable.

I look forward to servers having different text representation of the same binary headers in their config files.

> tell me one thing, one single thing, that only humans are able to do with HTTP using their eyes

You're missing the point.

No one writes HTML manually anymore either. People generate it using tools (string processing tools in a language or templates) and read it using browsers. Heck, even Notepad++ is a tool, but a generic one.

If you want, you can generate all your HTML using DOM. But almost no one does that, because DOM tools are clumsy, while text-based tools are easy to use.


"No one writes HTML manually anymore either."

I was up until four last night doing just this. It's commonly done for templating purposes all the time, or quick hacks and placeholders.

Have you lost your mind?


You're actually still arguing for my point instead of against it.

If no one writes HTML manually anymore, then we have no need for it to look like English when the computer interprets it! We can compile the HTML down to bytecode and have it be interpreted much quicker by the computer, which won't have to do the job of lexing, compiling, assembling, etc. Here, two steps would be eliminated immediately, resulting in increased speed and more efficient storage and transmission: http://www.html5rocks.com/en/tutorials/internals/howbrowsers...

For that matter, if it's generated by tools, and we use programs designed to interpret and decipher and color-code it, all of that can happen without it being in English!

On top of that, you missed when I said HTTP is a communications protocol. Ever seen the movie The Matrix? Know how the sentinels would sometimes look at each other and make scuttling noises, then shuffle off somewhere? They weren't speaking English ASCII. They were speaking a binary communications protocol. Know how I know? BECAUSE MACHINES AREN'T HUMANS! It would be absolutely moronic for them to speak English to each other. It would be like dogs saying the English word "bark" instead of just barking. Completely unnecessary and crazy. But that's what an ASCII communications protocol for machines is.

On top of that, there is no benefit, not one at all, to humans being able to read it when tools already exist to interpret it and display it in an even more human-readable form than its natural state. We squish and compress and strip HTML and JS already just to make it more efficient, and then undo the whole process just to read it. It's insane.


So you really think we'd be here today if instead of HTML, CSS, JavaScript, JSON, XML we had a web based on bytecode formats?

The web is made by people, not computers. Open a ubiquitous text editor and you can start working on something right away. If you have to download a dozen different compilers and IDEs to do that, it's definitely not the same.


"The web" is actually just a collection of hyperlinks, applications that parse markup and document storage and retrieval services. You don't see code. You see pictures of cats. And you never, ever need a text editor to use it.

Face it. Your love affair with ASCII is just that: an emotion.

(As to your original question: humans haven't needed to program in binary or assembly for decades. That's what's so great about computers: they do the hard work for us, so we don't need to type everything manually into a text editor. Is that such a hard pill to swallow?)


You're completely ignoring the fact that the web began as (and still is, in part) a collaborative tool and publishing platform. Text-based formats played an immense part in that: GeoCities, the rise of personal publishing, blogs. These would not have happened without them.

Yes, binary is more efficient, but then tell me why is JSON the most popular data interchange format on the web today?


Because XML, the preeminent human-editable data interchange format, sucked balls. JSON has only superseded YAML because it can be stripped of whitespace and it has the word "JavaScript" in it.


Binary formats sucked so much that they had to invent XML, and it was a much better way to start the interaction era, where services talk to each other without having to read a 30-page spec just to understand how to write the right payload for the interchange format used. Let alone the byte order...


> For that matter, if it's generated by tools, and we use programs designed to interpret and decipher and color-code it, all of that can happen without it being in English!

Yes, let's base HTML 6 on Word .doc.

Also, the machines in The Matrix were hostile to humans. We'd like machines in the real world to be... not so.


Are you arguing that we should have embraced Java applets and ActiveX controls because they are binary formats and hence more efficient? HTTP is NOT a communication protocol, it is an APPLICATION protocol. HTTP sits on top of a transport layer; just like SMTP, IRC, FTP, IMAP, etc., it is a protocol that describes applications. It is not TCP or UDP and SHOULD NOT BE!


> and tell me one thing, one single thing, that only humans are able to do with HTTP using their eyes without a tool.

debugging.


How are you going to see the header? Useless to debug if you can't see it.


Haven't you ever done an HTTP request through nc or even just telnet to see what responses came back? This is the best way to troubleshoot strange reverse proxies or rewrite rules.
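
For readers who have not tried this, the following is a minimal sketch of what such a session amounts to: the request is just a few ASCII lines, and the response comes back unparsed. The function names `build_request` and `raw_exchange` are made up for illustration, and the `Connection: close` header is an assumption chosen so the server ends the stream after one response.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Return the exact bytes one would type into nc or telnet."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def raw_exchange(host: str, port: int = 80, path: str = "/") -> str:
    """Send the request and return the raw, unparsed response text."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break  # server closed the connection
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")
```

Printing the result of `raw_exchange` for a host behind a reverse proxy shows the status line and every header exactly as sent, which is the point of the nc/telnet approach.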


Boy, wouldn't it be crazy if applications included debugging modes that told you exactly what they were doing?


Do all applications include something like that? If not, what makes you think that they will in your hypothetical "binary is king" future?


Have you ever heard of the TCP/IP protocol suite? I hear there are some things you can use to debug it. Might even support HTTP in the future.


Yes, but that has nothing whatsoever to do with the question I asked.


Here's the problem: you keep on assuming that good tools that help with debugging will magically appear. But good tools take a lot of time and work to perfect. In reality, you usually wind up with just barely good enough tools.

With a text based protocol, you can inspect it visually with no special tools, and munge it with general purpose tools that you already know how to use (shell script, sed, awk, perl, python, ruby, what have you) with no special support libraries or anything of the sort. Support libraries can help you with the more complex aspects of the protocols, but for basic debugging purposes, you can do it all with general purpose tools.

With a binary protocol, you need those libraries to even have a chance of being able to work with it. Now you can't use a general purpose shell pipeline to munge it; no more nc | grep or what have you. You have to have a wireshark dissector; and good luck figuring out how to grep through the results of what a wireshark dissector generates.

The main point is that the overhead of the ASCII encoding isn't the main problem with HTTP. Reading ASCII encoded CRLF delimited headers is a solved problem (and heck, you could probably switch that to just LF delimiters, since I'm sure that most processors already handle that case just fine).

The problems are things like having to repeat headers over and over again for each request in a session, enormous cookies that need to get sent with every request, and the like. But you can solve those without throwing away the easily debuggable ASCII-encoded headers; and compression really does solve most of the problem with the inefficiency of ASCII encoding (and you're going to want to use it anyhow, since the HTML, CSS, and JavaScript that you're delivering is all a fairly inefficient ASCII representation too).
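
To make the compression claim concrete, here is a small sketch that compresses the same ASCII header block several times on one shared zlib stream. The header values are invented for illustration, and sharing one compression context across requests is an assumption of this sketch, not something HTTP/1.1 itself does.

```python
import zlib

# Hypothetical request headers, including a largish cookie.
HEADERS = b"".join([
    b"GET /page HTTP/1.1\r\n",
    b"Host: example.com\r\n",
    b"User-Agent: ExampleBrowser/1.0\r\n",
    b"Accept: text/html\r\n",
    b"Cookie: session=" + b"abcdef0123456789" * 8 + b"\r\n",
    b"\r\n",
])


def compressed_sizes(header_block: bytes, count: int) -> list[int]:
    """Compress the same header block `count` times on one zlib stream.

    Returns the number of compressed bytes emitted per request. Because
    the stream is shared, later requests can refer back to earlier ones.
    """
    comp = zlib.compressobj()
    sizes = []
    for _ in range(count):
        # Z_SYNC_FLUSH forces out everything needed to decode this request.
        out = comp.compress(header_block) + comp.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(out))
    return sizes
```

Comparing `compressed_sizes(HEADERS, 3)` against `len(HEADERS)` shows how much of the repeated-header cost goes away while the uncompressed form stays plain text.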


Glad you came by, you can surely help me. I'm in dire need of a debugging tool that allows me to start CORBA requests and works with all large CORBA vendors.

There isn't one, don't bother looking. CORBA is the poster child for the problems with binary protocols: fragmentation, buggy implementations, incompatible extensions.

I'd rather not see HTTP follow the CORBA path.


> The same arguments can be applied to HTML, CSS, JSON, RSS and so on.

It can be, but that doesn't really make sense. The vast majority of web development is done without manually editing HTTP headers. It could change to a binary format tomorrow and - so long as our tools preserved their interfaces - we wouldn't even notice. The same cannot be said for any of the other technologies you listed.


First, I'm not sure what majority you refer to. Neither do I know what you mean by "manually" editing. I used this text-based function on more than one occasion:

http://php.net/manual/en/function.header.php

This is a good example where adding an object-oriented representation to every header out there would require a lot of work. Not sure if it would justify the gains.

> It could change to a binary format tomorrow and - so long as our tools preserved their interfaces - we wouldn't even notice.

Until you try to use grep or something of that sort for some non-trivial analysis operation. Everything 'speaks' ASCII. Custom tools for binary format would take years to evolve to be as powerful as generic text tools.


> Custom tools for binary format would take years to evolve to be as powerful as generic text tools.

All you need is one parsing tool that produces a textual representation of the binary protocol, and you can once again use grep and friends.
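
A minimal sketch of such a tool, under loud assumptions: the binary layout here (1-byte name length, 2-byte big-endian value length, then the raw bytes) is entirely made up for illustration and is not any real or proposed HTTP encoding.

```python
import struct


def encode_headers(headers: list[tuple[str, str]]) -> bytes:
    """Pack headers into a made-up length-prefixed binary layout."""
    out = bytearray()
    for name, value in headers:
        n, v = name.encode("ascii"), value.encode("ascii")
        # "!BH": network byte order, 1-byte name length, 2-byte value length.
        out += struct.pack("!BH", len(n), len(v)) + n + v
    return bytes(out)


def dump_headers(blob: bytes) -> str:
    """Decode the binary layout back into grep-friendly 'Name: value' lines."""
    lines = []
    pos = 0
    while pos < len(blob):
        name_len, value_len = struct.unpack_from("!BH", blob, pos)
        pos += 3
        name = blob[pos:pos + name_len].decode("ascii")
        pos += name_len
        value = blob[pos:pos + value_len].decode("ascii")
        pos += value_len
        lines.append(f"{name}: {value}")
    return "\n".join(lines)
```

The output of `dump_headers` can be piped through grep like any other text; the open question in this thread is whether such a decoder would be available everywhere the plain-text tools already are.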


All you need is one parsing tool ported to the 1000 platforms in existence now. We already have ASCII tools on all of those platforms, and we are pretty much guaranteed that once a new platform is created it will have basic ASCII tools. However, it is not at all guaranteed that it will have a decoder tool for every binary protocol out there. That's why ASCII protocols are easier to handle than binary ones. And for 99.999% of protocol users, the savings from converting to binary would not even be measurable. Sure, for the likes of Google and Amazon the economies of scale would be substantial. But 99.999% of web users aren't running humongous-scale projects; they run relatively low-tech projects for which simplicity is much more important than squeezing out every last bit of performance.


So long as nobody that needs to speak ASCII deploys a server that doesn't speak HTTP/1.1, I think switching to a more compact binary protocol for HTTP/2.0 is a good thing. Embedded devices will be able to handle more sessions with less CPU power, for example.


I'm not convinced that, for the average embedded device, parsing HTTP headers represents a significant amount of the energy spent. Is there any data suggesting that for the average device (I don't mean Google's specialized routers or any other hardware specifically designed to parse HTTP) this change would produce a measurable improvement? In other words, how much longer would the battery on my iPad last? I don't think I'd gain even a single second, but I'd be very interested to see data that suggests otherwise.


I'm talking about things like Philips hue and hardware with a 70MHz CPU, or Arduino even.


> This is a good example where adding an object-oriented representation to every header out there would require a lot of work.

Most of the tools I've used represent the header as a hash/dictionary. I fail to see how that approach "requires a lot of work".

> Until you try to use grep or something of that sort for some non-trivial analysis operation.

You're arguing from the assumption that a binary protocol would be implemented by idiots. Custom binary tools can always emit a textual representation, at which point you can grep through it to your heart's content. This is the exact same problem that we've been solving with compilers for generations. It isn't nearly as insurmountable as you seem to believe.
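
For reference, the hash/dictionary representation mentioned above can be sketched in a few lines. This is an illustrative helper (`parse_headers` is a made-up name), assuming the input is a CRLF-delimited header block without the request line, and assuming repeated fields are joined with a comma.

```python
def parse_headers(raw: str) -> dict[str, str]:
    """Parse a raw header block into a dict keyed by lowercased field name."""
    headers: dict[str, str] = {}
    for line in raw.split("\r\n"):
        if not line:
            break  # blank line marks the end of the header block
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip malformed lines that lack a colon
        key = name.strip().lower()
        value = value.strip()
        # Repeated fields are joined with a comma (an assumption here).
        headers[key] = f"{headers[key]}, {value}" if key in headers else value
    return headers
```

Code written against a dictionary like this would not need to change if the wire encoding did, which is the argument being made in this comment.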


Most people aren't writing HTTP requests by hand in a text editor.


No, but a lot of people are writing them in telnet prompts and through netcat pipes.


Curl and wget are nearly as ubiquitous as netcat.


The older, HTTP/1.X compatible versions maybe. What happens after the web upgrades to 2.0? How long until compatible tools make it to default installs? Even now, OS X doesn't ship with wget.


But they watch, capture and post them to mailing lists and stack overflow.


People are actually writing HTML/CSS/JS, so no. They have a whole other set of issues like being XML based (for HTML), but they do the job and are not likely to experience fundamental change in the next 5 years in broad adoption.


> They have a whole other set of issues like being XML based (for HTML)

HTML isn't XML based; there is an XML-based relative of HTML (XHTML) which was originally (before HTML5) viewed as a potential successor to HTML, and with HTML5 there is an available XML-based serialization of HTML's semantics, but HTML is its own thing (prior to HTML5, HTML was SGML-based; XML was inspired by HTML rather than serving as the basis for it).



