Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does “Data is just dumb code, and code is just smart data” mean? [closed]

Tags:

I just came across an idea in The Structure And Interpretation of Computer Programs:

Data is just dumb code, and code is just smart data

I fail to understand what it means. Can some one help me to understand it better?

like image 532
Dhanapal Avatar asked May 16 '09 06:05

Dhanapal


2 Answers

This is one of the fundamental lessons of SICP and one of the most powerful ideas of computer science. It works like this:

What we think of as "code" doesn't actually have the power to do anything by itself. Code defines a program only within a context of interpretation -- outside of that context, it is just a stream of characters. (Really a stream of bits, which is really a stream of electrical impulses. But let's keep it simple.) The meaning of code is defined by the system within which you run it -- and this system just treats your code as data that tells it what you wanted to do. C source code is interpreted by a C compiler as data describing an object file you want it to create. An object file is treated by the loader as data describing some machine instructions you want to queue up for execution. Machine instructions are interpreted by the CPU as data defining the sequence of state transitions it should undergo.

Interpreted languages often contain mechanisms for treating data as code, which means you can pass code into a function in some form and then execute it -- or even generate code at run time:

#!/usr/bin/perl
# Note that the above line explicitly defines the interpretive context for the
# rest of this file.  Without the context of a Perl interpreter, this script
# doesn't do anything.
sub foo {
    my ($expression) = @_;
    # $expression is just a string that happens to be valid Perl

    print "$expression = " . eval("$expression") . "\n";
}

foo("1 + 1 + 2 + 3 + 5 + 8");              # sum of first six Fibonacci numbers
foo(join(' + ', map { $_ * $_ } (1..10))); # sum of first ten squares

Some languages like scheme have a concept of "first-class functions", which means that you can treat a function as data and pass it around without evaluating it until you really want to.

The upshot is that the division between "code" and "data" is pretty much arbitrary, a function of perspective only. The lower the level of abstraction, the "smarter" the code has to be: it has to contain more information about how it should be executed. On the other hand, the more information the interpreter supplies, the more dumb the code can be, until it starts to look like data with no smarts at all.

One of the most powerful ways to write code is as a simple description of what you need: Data which will be turned into code describing how to get you what you need by the interpretive context. We call this "declarative programming".

For a concrete example, consider HTML. HTML does not describe a Turing-complete programming language. It is merely structured data. Its structure contains some smarts that let it control the behavior of its interpretive context -- but not a lot of smarts. On the other hand, it contains more smarts than the paragraphs of text that appear on an average web page: Those are pretty dumb data.

like image 87
darch Avatar answered Sep 26 '22 08:09

darch


In the context of security: Due to buffer overflows, what you thought of as data and thus harmless (such as an image) can become executed as code and p0wn your machine.

In the context of software development: Many developers are very afraid of "hardcoding" things and very keen on extracting parameters that might have to change into configuration files. This is often based on the idea that config files are just "data" and thus can be changed easily (perhapy by customers) without raising the issues (compilation, deployment, testing) that changing anything in the code would.

What these developers don't realize is that since this "data" influences the behaviour of the program, it really is code; it could break the program and the only reason not to require complete testing after such a change is that, if done correctly, the configurable values have a very specific, well-documented effect and any invalid value or a broken file structure will be caught by the program.

However, what all too often happens is that the config file structure becomes a programming language in its own right, complete with control flow and everything - one that's badly documented, has a quirky syntax and parser and which only the most experienced developers in the team can touch without breaking the application completely.

like image 28
Michael Borgwardt Avatar answered Sep 24 '22 08:09

Michael Borgwardt