The Rule of Least Power
- The World Wide Web is unique in its ability to promote information reuse on a global scale. Information published on the Web can be flexibly combined with other information, read by a broad range of software tools, and browsed by human users of the Web. For such reuse to succeed, the broadest possible range of tools must be capable of understanding the data on the Web, and the relationships among that data. Thus, when publishing information or programs on the Web, the choice of language is important.
Principle: Powerful languages inhibit information reuse
- There is an important tradeoff between the computational power of a language and the ability to determine what a program in that language is doing.
- Computer languages range from the plainly descriptive (such as HTML), through logical languages with limited propositional logic (regular expressions), to the nearly Turing-complete (some versions of SQL), then to those that are in fact Turing-complete though one is led not to use them that way (more powerful versions of SQL), those that are functional and Turing-complete (Haskell), and finally those that are unashamedly imperative and Turing-complete (Java, JavaScript/ECMAScript or C).
- The Turing-complete languages are shown by computer science to be equivalent in their ability to compute any result of which a computer is capable, and are in that sense the most powerful class of languages for computers. The tradeoff for such power is that you typically cannot determine what a program in a Turing-complete language will do without actually running it. Indeed, you often cannot tell in advance whether such a program will even reach the point of producing useful output.
- Of course, you can easily tell what a simple program such as
print "2+2"
will do, but given an arbitrary program you'd likely have to run it, and possibly for a very long time.
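- As a rough illustration of that point, the Python sketch below is only a few lines long, yet whether its loop terminates for every positive integer is a well-known open question (the Collatz conjecture). The function name `collatz_steps` and the `max_steps` budget are illustrative choices, not part of this finding; the budget is there only so that the sketch itself always returns.

```python
def collatz_steps(n, max_steps=10_000):
    """Count Collatz steps from n down to 1.

    Returns None if the (illustrative) step budget runs out first.
    """
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None  # gave up: running longer is the only way to learn more
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(6))   # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1, i.e. 8 steps
print(collatz_steps(27))  # a small input that takes far longer than its size suggests
```

- Reading the source tells you what each step does, but not how many steps a given input needs; in practice you find out by running it.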
- Conversely, if you capture information in a simple declarative form, anyone can write a program to analyze it in many ways.
- Thus, there is a tradeoff in choosing between languages that can solve a broad range of problems and languages in which programs and data are easily analyzed.
- Expressing constraints, relationships and processing instructions in less powerful languages increases the flexibility with which information can be reused: the less powerful the language, the more you can do with the data stored in that language.
- Less powerful languages are usually easier to secure. A bug-free regular expression processor, for example, is by definition free of many security exposures that are inherent in the more general runtime one might use for a language like C++. Because programs in simpler languages are easier to analyze, it's also easier to identify the security problems that they do have.
- There are many dimensions to language power and complexity that should be considered when publishing information.
- For example, a language with a straightforward syntax may be easier to analyze than an otherwise equivalent one with more complex structure.
- A language that wraps simple computations in unnecessary mechanics, such as object creation or thread management, may similarly inhibit information extraction.
- This finding observes that a variety of characteristics that make languages powerful can complicate or prevent analysis of programs or information conveyed in those languages, and it suggests that such risks be weighed seriously when publishing information on the Web.
- Indeed, on the Web, the least powerful language that's suitable should usually be chosen. This is The Rule of Least Power:
Good Practice: Use the least powerful language suitable for expressing information, constraints or programs on the World Wide Web.
- In aiming for simplicity, one must of course go far enough but no further.
- The language you choose must be powerful enough to successfully solve your problem, and indeed, complexity and lack of clarity can easily result from clumsy efforts to patch around use of a language that is too limited.
- Furthermore, the suggestion to use less powerful languages must in practice be weighed against other factors. Perhaps the more powerful language is a standard and the less powerful language is not, or perhaps the use of simple idioms in a powerful language makes it practical to use the powerful language without unduly obscuring the information conveyed.
- Overall, the Web benefits when less powerful languages can be successfully applied.
Web Technologies and the Rule of Least Power
- Many Web technologies are designed to exploit the Rule of Least Power.
- HTML is intentionally designed not to be a full programming language, so that many different things can be done with an HTML document: software can present the document in various styles, extract tables of contents, index it, and so on.
- Similarly, CSS is a declarative styling language that is easily analyzed.
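- The kind of reuse described for HTML above can be sketched in a few lines. The example below, which assumes nothing beyond Python's standard `html.parser` module, extracts a table of contents from a document without executing anything in it. The class `HeadingCollector`, the function `table_of_contents`, and the sample markup are hypothetical names and data chosen for illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1..h6 elements."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._buffer = []    # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = "".join(self._buffer).strip()
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html):
    """Return the headings of an HTML string in document order."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


# Hypothetical sample document.
SAMPLE = """
<h1>The Rule of Least Power</h1>
<p>Introductory text.</p>
<h2>Principle: Powerful languages inhibit information reuse</h2>
<h2>Web Technologies and the <em>Rule of Least Power</em></h2>
"""

for level, text in table_of_contents(SAMPLE):
    print("  " * (level - 1) + text)
```

- The same markup could equally be indexed or restyled by other tools, because its meaning is available by inspection rather than by execution.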
- The Semantic Web is an attempt, largely, to map large quantities of existing data onto a common language so that the data can be analyzed in ways never dreamed of by its creators.
- If, for example, some weather data is published as a Web resource using RDF, a user can retrieve it as a table, perhaps average it, plot it, or deduce things from it in combination with other information. At the other end of the scale is the weather information conveyed by an ingeniously written Java applet. While the applet might provide a very cool user interface or other sophisticated features, the results of the program will not usually be predictable in advance. A search engine finding the resource will have no idea of what the weather data is or even, in the absence of other information, that it is a weather-related resource. The only way to find out what a Java applet means is generally to set it running, and see what it does.
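- To make the weather example concrete, here is a minimal sketch in which plain Python tuples stand in for RDF triples; it is not a real RDF parser. The observation identifiers, the predicate names `station` and `temperatureC`, and the readings are all invented for illustration. Because the data is purely declarative, the same triples can be viewed as a table or averaged by code the publisher never anticipated.

```python
from statistics import mean

# Hypothetical observations as (subject, predicate, object) triples.
TRIPLES = [
    ("obs1", "station", "Boston"),
    ("obs1", "temperatureC", 21.0),
    ("obs2", "station", "Boston"),
    ("obs2", "temperatureC", 23.0),
    ("obs3", "station", "Geneva"),
    ("obs3", "temperatureC", 18.0),
]


def as_table(triples):
    """Group triples by subject, giving one row (a dict) per observation."""
    rows = {}
    for subject, predicate, obj in triples:
        rows.setdefault(subject, {})[predicate] = obj
    return rows


def average_temperature(triples, station):
    """Average the temperature readings for one station, or None if there are none."""
    temps = [
        row["temperatureC"]
        for row in as_table(triples).values()
        if row.get("station") == station and "temperatureC" in row
    ]
    return mean(temps) if temps else None


print(as_table(TRIPLES))
print(average_temperature(TRIPLES, "Boston"))
```

- Nothing comparable can be done with an opaque program that only draws the weather on screen: a consumer would have to run it and interpret its output.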
- Thus, HTML, CSS and the Semantic Web are examples of Web technologies designed with "least power" in mind. Web resources that use these technologies are more likely to be reused in flexible ways than those expressed in more powerful languages.
- When publishing on the Web, you should usually choose the least powerful or most easily analyzed language variant that's suitable for the purpose.