
A great all-around introduction to programming.

It poses an interesting question which I think it doesn't really answer well: "What Is the Relationship Between Code and Data?"

I can easily understand what a "program" is. It is a set of instructions we give to the computer for it to do something we want it to do.

But what is "data"?



> But what is "data"?

It's a sequence of bits. You take reality (or whatever you want) and represent it as bits. That sequence of bits is called "data".


In the computer science world, yes. But the word also exists outside of that context (and did so long before computers).

Data is the plural of datum. A datum is a piece of information. Of course, not every piece of information is automatically a datum, as that noun implies a certain structured nature or a normative perspective on that piece of information.

Certain information only became commonplace because it became a datum first. Before family names were used as identifiers, many people had no family name; they were identified by their first name and by the place they grew up in.


OP's question seems to be in the context of computers, not the semantics of the word "data" in the broader anthropological and socio-cultural context.


I would say that data is "encoded information". But then again, is there any other kind?


Data is information about things and code is information about how to transform data. Every program transforms input data to output data. Code is interpreted by the computer. Data and code are encoded to bits for the digital computer.


That sounds enlightening to me.

Code is interpreted by the computer; data is interpreted by code!

But when we say "code is interpreted by the computer" we are really saying "code is interpreted by code", right? Meaning the interpreter or compiler.

So how come this snake doesn't eat its own tail?

The answer, the MAGIC of computers, is that at the lowest level somehow hardware is able to interpret code.
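
The chain "code is interpreted by code" can be sketched with a toy interpreter. The instruction names (PUSH, ADD, MUL) and the function `run` are hypothetical, invented only for illustration. Here `run` is code that interprets a program, Python in turn interprets `run`, and the chain ends where the CPU executes machine instructions directly.

```python
# Toy stack machine: hypothetical instruction set, for illustration only.
def run(program):
    """Interpret a list of (opcode, argument) pairs and return the top of the stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack[-1]

# To Python this program is plain data (a list of tuples);
# to run() it is code.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)  # computes (2 + 3) * 4
```

The same list is data from Python's point of view and code from the point of view of `run`, which is roughly the layering described above.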


My answer tried to be as generic as possible and leave out the "real world" and implementation details in general. There are fundamental limitations to what you can express in four sentences...

Compilation is "just" an intermediate step between code and running it in an interpreter. The interpreter can be pure hardware, a mix of hardware and software, or you can do it on paper yourself (given time). These are implementation "details". Also, to me code is not synonymous with software, since I'm not referring to programmability, only to the transformation aspect. Programmability is an implementation detail. You may perform a transformation with hardware only, as mentioned above. The hardware itself could be described with code, e.g. in Verilog. Manufacturing a silicon chip based on Verilog code is also an implementation detail.

In fact, your CPU/PC is hardware and it transforms input data to output data. In this case, besides the other input data, like keyboard input and files on disk, you may consider the binary code as part of the input data. This is where the confusion starts. It is an implementation detail and should not be confused with the general notion of utility we want to have. Code is "only" the means to an end. We want to do stuff with the machine. We want to transform input data to output data, since that is essentially what we are after. For the goal of getting stuff done, data and code are separate things.

Saying "data is interpreted by code" is partially correct. First of all, data carries meaning, as in meaning for people (the data user). Data is encoded in some format (which is also data, but implementation data) and you could say that this encoding is "interpreted". However, the encoding is "just" an implementation detail which depends on the machine you are trying to use for the transformation. When the computer paints pixels on your monitor, it transfers data bits through the display driver, over the HDMI protocol (for example), to the monitor itself. There are plenty of "interpreters" on that path. On the screen the pixels could represent the letter "A" and that could carry meaning for you (depending on the other stuff on the screen). This is why I stated that data is information about things. Encoded data is an implementation detail and varies between implementations, but the "true" data is essential and implementation independent. It carries meaning and utility value. Essential data is not interpreted by code. The user (a person) interprets the essential data.
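
The distinction between encoded data and essential data can be sketched with the letter "A" from the example above. The choice of encodings here is an assumption for illustration; each is a standard Python codec, and `cp037` is an EBCDIC code page.

```python
# The same "essential" datum, the letter A, under several encodings.
letter = "A"
encodings = ["utf-8", "utf-16-le", "utf-32-le", "cp037"]  # cp037 is an EBCDIC code page

# Each encoding yields a different byte sequence for the same letter.
encoded = {name: letter.encode(name) for name in encodings}
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex()}")

# Every encoding decodes back to the same letter.
decoded = {raw.decode(name) for name, raw in encoded.items()}
```

The byte sequences differ from one encoding to the next, yet all of them decode to the one letter a person would read on the screen.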

Code is data for the compiler. The compiler transforms source code (input data) to object code (output data). Code is also data for the interpreter. An algorithm contains all the essential information about the data transformations to be performed, and code, in the form of programming-language text, is an implementation of that algorithm in a specific programming language. The algorithm is programming-language independent. It's fair to say that by code I mean the algorithm. But an algorithm must also be presented in some form or another, so it's code. :)
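
A minimal sketch of "code is data for the compiler", using Python's built-in `compile` as a stand-in for a compiler. Note the assumptions: it produces CPython bytecode rather than native object code, the `co_code` attribute is a CPython detail, and the expression and the name `<example>` are made up for illustration.

```python
import types

source = "x * 2 + 1"                              # source code: a plain string (input data)
code_obj = compile(source, "<example>", "eval")   # compiled form (output data)

# Both the input and the output of compilation are ordinary values.
assert isinstance(source, str)
assert isinstance(code_obj, types.CodeType)
assert isinstance(code_obj.co_code, bytes)        # CPython bytecode is a bytes object

# Only when handed to an evaluator does the data act as code.
value = eval(code_obj, {"x": 20})
```

Until the last line, nothing is executed: the string and the code object are passed around like any other data.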


I would define data as something that might convey information, "might" in the sense that there are one or more interpretations of it that are meaningful to something. This sentence has plenty of open-ended terms, however.

Code is data that can be executed, provided that there is something that offers an interpretation of it. I think we must claim that this interpretation is meaningful to something.

Executability is a semantic in the context of interpretation, which ties it all back into what data is again. Something something Apply/Eval.

So, more open-ended terms.


Data are instructions for the computer interpreted in the context of a program. Or no program in the case of instructions which can be interpreted directly by the computer.

That is, data are programs, programs are data. Just need to establish the environment in which to execute/process them.


Right, except what got me thinking is that "programs" can be conceptually seen as a sequence of "instructions" to the computer, whereas it is more difficult to think of data as "instructions". Data is more like a model of the world than a set of "instructions".


It is, but it also influences how the software and hardware actually execute, making it a kind of program. Essentially every program that accepts any kind of data is really operating as a virtual machine, a software-specified computer. It interprets the data and performs actions conditioned on that input.

MP3 files are instructions to an audio-producing "computer", MP4 files to an audio/video "computer", whether that is a software description of the machine or a hardware implementation. Zip files are data, but when run through a decompression program they determine what gets decompressed (the program determines how), making them instructions for the decompression computer. The text of a text file opened in an editor provides rendering instructions. TCP packets are data, but they influence the TCP state machine, which is an explicit rendering of the idea that the software (sometimes hardware in this case) is describing a computer and the data (packets) represent instructions that affect the state of that computer.

However, this interpretation is usually not explicit so the manner of these virtual machines is ad hoc and ill-specified (if at all).
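
The idea of packets acting as instructions for a state machine can be sketched as follows. This is a hypothetical, heavily simplified connection state machine, loosely inspired by a TCP handshake; the state names, the packet names and the transition table are assumptions for illustration and do not reflect the real TCP specification.

```python
# Hypothetical, heavily simplified connection state machine (not real TCP).
TRANSITIONS = {
    ("CLOSED", "SYN"): "SYN_RECEIVED",
    ("SYN_RECEIVED", "ACK"): "ESTABLISHED",
    ("ESTABLISHED", "FIN"): "CLOSED",
}

def feed(state, packets):
    """Advance the state once per packet; unknown packets leave the state unchanged."""
    for packet in packets:
        state = TRANSITIONS.get((state, packet), state)
    return state

# The packet sequence is data, yet it decides which path the machine takes.
final_state = feed("CLOSED", ["SYN", "ACK"])
```

Here the transition table is made explicit, which is the part that most programs leave ad hoc.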


And there you have an unanswerable question! :-) At an atomic level in computing, as others have said, it's a stream of ones and zeroes. After that, everything else is implementation, i.e. how to interpret, use, store, etc. In fact, it's one of the bugbears of computing that there are so many "data format standards" (used loosely, not definitionally).

Or the age-old joke: "The good thing about (data) standards is there are so many of them!"


If code is a relationship between the next state and the current state, then this state is data.


Well said. Code stays as it is, typically. Data changes.

Of course, code can modify its own "code". But that makes it much more difficult to reason about its behavior. It is good to have some "invariants". If we can assume that code does not change except when the human author changes it, then it becomes much easier to reason about the changes the program may cause to the data.
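
The view of code as a fixed relation between the current state and the next state can be sketched like this. The function `step`, the event names and the amounts are hypothetical, chosen only for illustration.

```python
def step(state, event):
    """Fixed transition rule (the code): maps the current state to the next state."""
    if event == "deposit":
        return state + 10
    if event == "withdraw":
        return state - 10
    return state

state = 0                      # the data: changes on every step
history = [state]
for event in ["deposit", "deposit", "withdraw"]:
    state = step(state, event)
    history.append(state)
```

Because `step` never changes, every entry in `history` can be explained by looking at the previous entry and the event alone.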


The Harvard architecture makes the separation clearer:

https://en.wikipedia.org/wiki/Harvard_architecture


And a Turing machine? Can it modify its own tape?


> But what is "data"?

A set of instruction activators we give the computer.

Oh, and that easily understood set of instructions is also data.


There is no difference between code and data. Data is code. Data is code that retains its identity.


Any information (including code) that can be stored by the abstract machine under consideration.



