As we once thought


The Internet, the World-Wide-Web and hypertext were all forecast by Vannevar Bush, in a July 1945 article for The Atlantic, entitled  As We May Think.  Perhaps this is not completely surprising since Bush had a strong influence on WW II and post-war military-industrial technology policy, as Director of the US Government Office of Scientific Research and Development.  Because of his influence, his forecasts may to some extent have been self-fulfilling.
However, his article also predicted automated machine reasoning using both logic programming (the computational use of formal logic) and computational argumentation (the formal representation and manipulation of arguments).  These are both now important domains of AI and computer science which developed first in Europe and which are still much stronger there than in the USA.   An excerpt:

The scientist, however, is not the only person who manipulates data and examines the world about him by the use of logical processes, although he sometimes preserves this appearance by adopting into the fold anyone who becomes logical, much in the manner in which a British labor leader is elevated to knighthood. Whenever logical processes of thought are employed—that is, whenever thought for a time runs along an accepted groove—there is an opportunity for the machine. Formal logic used to be a keen instrument in the hands of the teacher in his trying of students’ souls. It is readily possible to construct a machine which will manipulate premises in accordance with formal logic, simply by the clever use of relay circuits. Put a set of premises into such a device and turn the crank, and it will readily pass out conclusion after conclusion, all in accordance with logical law, and with no more slips than would be expected of a keyboard adding machine.
Logic can become enormously difficult, and it would undoubtedly be well to produce more assurance in its use. The machines for higher analysis have usually been equation solvers. Ideas are beginning to appear for equation transformers, which will rearrange the relationship expressed by an equation in accordance with strict and rather advanced logic. Progress is inhibited by the exceedingly crude way in which mathematicians express their relationships. They employ a symbolism which grew like Topsy and has little consistency; a strange fact in that most logical field.
A new symbolism, probably positional, must apparently precede the reduction of mathematical transformations to machine processes. Then, on beyond the strict logic of the mathematician, lies the application of logic in everyday affairs. We may some day click off arguments on a machine with the same assurance that we now enter sales on a cash register. But the machine of logic will not look like a cash register, even of the streamlined model.”

Edinburgh sociologist Donald MacKenzie wrote a nice history and sociology of logic programming and the use of logic in computer science, Mechanizing Proof: Computing, Risk, and Trust.  The only flaw of this fascinating book is an apparent misunderstanding throughout that theorem-proving by machines refers only to the proving (or not) of theorems in mathematics.   Rather, theorem-proving in AI refers to proving claims in any domain of knowledge represented by a formal, logical language.   Medical expert systems, for example, may use theorem-proving techniques to infer the presence of a particular disease in a patient; the claims being proved (or not) are theorems of the formal language representing the domain, not necessarily mathematical theorems.
References:
Donald MacKenzie [2001]:  Mechanizing Proof: Computing, Risk, and Trust.  Cambridge, MA, USA:  MIT Press.
Vannevar Bush [1945]:  As We May Think.  The Atlantic, July 1945.

Those pesky addition symbols


On 28 March 1979, there was a partial core meltdown in a reactor at the Three Mile Island Nuclear Power Generating Station near Harrisburg, PA, USA.  The accident was soon headline news, at least throughout the western world.   An obscure computer programmer, apparently hearing this news, had a crisis of conscience and, in April 1979, phoned the US Nuclear Regulatory Commission anonymously, telling them to examine particular program code used in the design of certain types of nuclear reactors.   The code in question was a subroutine intended to calculate the total stresses on pipes carrying coolant water, but instead of adding the different stresses, the routine subtracted them.  So the resulting coolant pipes were specified extra-thin instead of extra-thick.
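
A hypothetical sketch may make the failure mode concrete.  The actual 1979 subroutine has never, to my knowledge, been published, so the code below is purely illustrative: it shows how a single flipped operator in an algebraic summation of pipe stresses understates the combined load, and hence the required pipe thickness.  The function names and the stress figures are all invented.

    def total_stress_correct(stresses):
        """Algebraic summation: the combined stress is the sum of the components."""
        return sum(stresses)

    def total_stress_buggy(stresses):
        """The reported defect: later components subtracted instead of added."""
        total = stresses[0]
        for s in stresses[1:]:
            total -= s          # should be: total += s
        return total

    seismic_components = [12.0, 7.5, 3.2]    # invented stress figures
    print(total_stress_correct(seismic_components))   # 22.7 -> a thicker pipe is specified
    print(total_stress_buggy(seismic_components))     #  1.3 -> a dangerously thin pipe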

This story would not surprise anyone with any software development experience.   Few other people understand, I think, just how dependent modern society is on the correct placing of mundane arithmetic operators, or the appropriate invocation of variable references, in obscure lines of old program code.  No programmer of my acquaintance, for example, took the threat of the Millennium Bug less than seriously, although many people who are not programmers still think it was a threat without substance, or even a scam.

Below I have re-typed the article in which I first read about this story, as I can find no reference to it elsewhere on the web.

Faulty software may close more nuclear plants
(from Australasian Computerworld, 18 May 1979, pages 1 and 15).

Washington, DC. – The Nuclear Regulatory Commission (NRC) in the US may soon order more shutdowns of nuclear plants if it finds the design of their piping relies on invalid computer algorithms.

The commission is completing a study [page-break] to determine whether earthquakes could rupture the computer-designed piping of active US nuclear plants as well as those under construction.  In March, the commission ordered five plants in the eastern US to cease operation after an error was discovered in their design software.

The study was initiated following an anonymous phone call last month from an individual who reportedly told the NRC that many other plants were designed or are being designed with similarly flawed routines.  As a result, the commission ordered all 70 licensed plants and the 92 granted construction permits to declare whether they rely on any of three algebraic summation methods.

The water to cool reactor cores in the five suspended plants ran through pipes with tolerances far below NRC standards because an algebraic summation routine subtracted, rather than added, stress figures.
Nuclear energy experts consider reliable reactor piping a critical safety factor.  A reactor core overheats if not enough water circulates around its radioactive rods to carry heat away.  Pipe ruptures or pump failures would thus induce core overheating that, if unchecked by reserve cooling systems, might force the reactor to discharge dangerous radiation.
NRC inspectors may order more shutdowns if they find other plants in violation of piping tolerance requirements, a spokesman said.  Their decisions will be based on responses to a general bulletin to holders of licences and construction permits.

Crowd-sourcing for scientific research

Computers are much better than most humans at some tasks (eg, remembering large amounts of information, tedious and routine processing of large amounts of data), but worse than many humans at others (eg, generating new ideas, spatial pattern matching, strategic thinking). Progress may come from combining both types of machine (humans, computers) in ways which make use of their specific skills.  The journal Nature yesterday carried a report of a good example of this:  video-game players are able to assist computer programs tasked with predicting protein structures.  The abstract:

People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully ‘crowd-sourced’ through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.”
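
The division of labour the abstract describes — an automated method scoring candidate structures cheaply and exactly, while humans choose which large rearrangements to attempt — can be caricatured in a few lines.  The sketch below is a toy only, not Foldit or Rosetta code: the "energy" function, the two move generators and all the numbers are invented for illustration.

    import random

    def energy(conformation):
        """Stand-in for a protein energy function: lower is better."""
        return sum(x * x for x in conformation)

    def human_move(conformation):
        """Placeholder for the player's insight: pick the worst-looking
        coordinate and move it a long way, something local search rarely does."""
        i = max(range(len(conformation)), key=lambda k: abs(conformation[k]))
        proposal = list(conformation)
        proposal[i] = 0.0                     # a bold, large-scale rearrangement
        return proposal

    def machine_move(conformation):
        """Automated refinement: a small random perturbation, kept only if it
        lowers the energy."""
        proposal = [x + random.uniform(-0.1, 0.1) for x in conformation]
        return proposal if energy(proposal) < energy(conformation) else conformation

    state = [random.uniform(-5.0, 5.0) for _ in range(10)]
    for step in range(200):
        state = human_move(state) if step % 50 == 0 else machine_move(state)
    print(round(energy(state), 3))            # far below the starting energy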

References:
Seth Cooper et al. [2010]: Predicting protein structures with a multiplayer online game.  Nature, 466: 756–760.  Published 2010-08-05.
Eric Hand [2010]:  Citizen science: people power.  Nature, 466: 685–687.  Published 2010-08-04.
The Foldit game is here.

By their voice ye shall know them

Effective strategies are often counter-intuitive.  If you are speaking to a large group, some of whom are speaking to each other, your natural tendency will be to try to speak over them, to speak more loudly.  But doing this just encourages the talkers in the audience to increase their levels of speech, and so an arms race results.   Better for you to speak more softly, which means that audience talkers can hear themselves more clearly over you, and so typically, and unthinkingly, drop the levels of their own speech.
A recent issue of ACM Transactions on Computer Systems (ACM TOCS) carries a paper with a wonderful example of this principle.  Faced with an application-level denial-of-service attack, the authors propose that the server ask all its clients to increase the volume of messages they send to it.  Most likely, the attackers among the clients are already transmitting at their full local capacity, and so are unable to comply, which means that messages from attackers will form a decreasing proportion of all messages received by the server.   The paper abstract is:

This article presents the design, implementation, analysis, and experimental evaluation of speak-up, a defense against application-level distributed denial-of-service (DDoS), in which attackers cripple a server by sending legitimate-looking requests that consume computational resources (e.g., CPU cycles, disk). With speak-up, a victimized server encourages all clients, resources permitting, to automatically send higher volumes of traffic. We suppose that attackers are already using most of their upload bandwidth so cannot react to the encouragement. Good clients, however, have spare upload bandwidth so can react to the encouragement with drastically higher volumes of traffic. The intended outcome of this traffic inflation is that the good clients crowd out the bad ones, thereby capturing a much larger fraction of the server’s resources than before. We experiment under various conditions and find that speak-up causes the server to spend resources on a group of clients in rough proportion to their aggregate upload bandwidths, which is the intended result.
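
A back-of-envelope calculation shows why this traffic inflation helps.  The sketch below is not the authors' code; it simply assumes, as the paper does, that the server serves clients roughly in proportion to the traffic it receives from them, and that attackers are already sending at full upload capacity while good clients have spare capacity.  All the bandwidth figures are invented.

    good_bandwidth = [2.0, 3.0, 5.0]     # spare upload capacity of good clients (Mbit/s)
    bad_bandwidth = [10.0, 10.0]         # attackers, already sending flat-out

    def attacker_share(good_traffic, bad_traffic):
        """Fraction of the server's capacity captured by attackers, assuming
        service in proportion to traffic received."""
        return sum(bad_traffic) / (sum(good_traffic) + sum(bad_traffic))

    # Without speak-up: good clients send only what they need, say 0.1 Mbit/s each.
    print(round(attacker_share([0.1] * 3, bad_bandwidth), 2))        # 0.99: attackers dominate

    # With speak-up: good clients send at full spare capacity; attackers cannot respond.
    print(round(attacker_share(good_bandwidth, bad_bandwidth), 2))   # 0.67: shares now track bandwidth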

Reference:
Michael Walfish, Mythili Vutukuru, Hari Balakrishnan, David Karger and Scott Shenker [2010]:  DDoS defense by offense.  ACM Transactions on Computer Systems, 28 (1), article 3.

Informatics

A recent issue of the Communications of the ACM has an interesting article about degrees in Informatics, by Dennis Groth and Jeffrey MacKie-Mason.  They present a nice definition of the subject in this paragraph:

The vision for informatics follows from the natural evolution of computing. The success of computing is in the resolution of problems, found in areas that are predominately outside of computing. Advances in computing—and computing education—require greater understanding of the problems where they are found: in business, science, and the arts and humanities. Students must still learn computing, but they must learn it in contextualized ways. This, then, provides a definition for informatics: informatics is a discipline that solves problems through the application of computing or computation, in the context of the domain of the problem. Broadening computer science through attention to informatics not only offers insights that will drive advances in computing, but also more options and areas of inquiry for students, which will draw increasing numbers of them to study computation.

Sadly, these views are not uncontroversial, as the online experiences which motivated my parody here illustrate.  The interesting experience of Georgia Tech, where the College of Computing is divided into three parts (Computer Science; Interactive Computing; and Computational Science and Engineering), is described here.
Reference:
Dennis P. Groth and Jeffrey K. MacKie-Mason [2010]: Why an Informatics degree?  Isn’t computer science enough? Communications of the ACM, 53 (2):  26-28.  Available here.

Research funding myopia

The British Government, through its higher education funding council, is currently considering the use of socio-economic impact factors when deciding the relative rankings of university departments in the Research Assessment Exercise (RAE), its evaluation of research quality held about every five years.   These impact factors are intended to measure the social or economic impact of research activities within the period of the RAE (ie, within 5 years).  Since the RAE is used to allocate funds for research infrastructure to British universities, these impact factors, if implemented, will indirectly decide which research groups and which research will be funded.    Some academic reactions to these proposals are here and here.
From the perspective of the national economy and technological progress, these proposals are extremely misguided, and should be opposed by us all.    They demonstrate a profound ignorance of where important ideas come from, of when and where and how they are applied, and of where they end up.  In particular, they demonstrate great ignorance of the multi-disciplinary nature of most socio-economically-impactful research.
One example will demonstrate this vividly.  As more human activities move online, more tasks can be automated or semi-automated.    To enable this, autonomous computers and other machines need to be able to communicate with one another using shared languages and protocols, and thus much research effort in Computer Science and Artificial Intelligence over the last three decades has focused on designing languages and protocols for computer-to-computer communications.  These protocols are used in various computer systems already and are likely to be used in future-generation mobile communications and e-commerce systems.
Despite its deep technological nature, research in this area draws fundamentally on past research and ideas from the Humanities (a small illustrative sketch follows the list below), including:

  • Speech Act Theory in the Philosophy of Language (ideas due originally to Adolf Reinach 1913, John Austin 1955, John Searle 1969 and Jurgen Habermas 1981, among others)
  • Formal Logic (George Boole 1854, Clarence Lewis 1910, Ludwig Wittgenstein 1922, Alfred Tarski 1933, Saul Kripke 1959, Jaakko Hintikka 1962, etc), and
  • Argumentation Theory (Aristotle c. 350 BC, Stephen Toulmin 1958, Charles Hamblin 1970, etc).
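
As a small illustration of how the first of these, Speech Act Theory, surfaces in such protocols, here is a toy message structure loosely modelled on speech-act-based agent communication languages such as FIPA-ACL.  The performative vocabulary (inform, request, and so on) comes from that tradition; the particular field layout and this Python class are invented here for illustration.

    from dataclasses import dataclass

    PERFORMATIVES = {"inform", "request", "propose", "accept-proposal", "refuse"}

    @dataclass
    class AgentMessage:
        performative: str    # the speech act being performed (Austin/Searle)
        sender: str
        receiver: str
        content: str         # a proposition or an action description
        ontology: str        # the shared vocabulary the content is drawn from

        def __post_init__(self):
            if self.performative not in PERFORMATIVES:
                raise ValueError("unknown performative: " + self.performative)

    # One agent requests an action; the counterparty informs it of the result.
    ask = AgentMessage("request", "buyer-1", "seller-7", "reserve(item-42)", "e-commerce")
    ack = AgentMessage("inform", "seller-7", "buyer-1", "reserved(item-42)", "e-commerce")
    print(ask.performative, "->", ack.performative)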

Assessment of the impacts of research over five years is laughable when Aristotle’s work on rhetoric has taken 2300 years to find technological application.   Even Boole’s algebra took 84 years from its creation to its application in the design of electronic circuits (by Claude Shannon in 1938).  None of the humanities scholars responsible were doing their research to promote technologies for computer interaction or to support e-commerce, and most would not have even understood what these terms mean.  Of the people I have listed, only John Searle (who contributed to the theory of AI), and Charles Hamblin (who created one of the first computer languages, GEORGE, and who made major contributions to the architecture of early computers, including invention of the memory stack), had any direct connection to computing.   Only Hamblin was afforded an obituary by a computer journal (Allen 1985).
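
To make concrete what the application of Boole's algebra to circuit design amounts to, here is a purely illustrative sketch: a half-adder, the circuit that adds two one-bit numbers, is nothing more than two Boolean-algebra expressions (exclusive-or for the sum bit, conjunction for the carry bit).

    def half_adder(a: bool, b: bool):
        """Add two one-bit numbers: sum = a XOR b, carry = a AND b."""
        return (a != b), (a and b)

    for a in (False, True):
        for b in (False, True):
            s, c = half_adder(a, b)
            print(int(a), "+", int(b), "->", "carry", int(c), "sum", int(s))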
None of the applications of these ideas to computer science were predicted, or even predictable.  If we do not fund pure research across all academic disciplines without regard to its potential socio-economic impacts, we risk destroying the very source of the ideas upon which our modern society and our technological progress depend.
Reference:
M. W. Allen [1985]: “Charles Hamblin (1922-1985)”. The Australian Computer Journal, 17(4): 194-195.

Social surveys in the developing world

Robert Chambers, sociologist of development, writing about social science surveys in the developing world:

As data collection is completed, processing begins. Coding, punching and some simple programming present formidable problems. Consistency checks are too much to contemplate. Funds begin to run out because the costs of this stage have been underestimated. Reports are due before data are ready. There has been an overkill in data collection; there is enough information for a dozen Ph.D. theses but no one to use it. Much of the material remains unprocessed, or if processed, unanalysed, or if analysed, not written-up, or if written-up, not read, or if read, not remembered, or if remembered, not used or acted upon. Only a minuscule proportion, if any, of the findings affect policy and they are usually a few simple totals. These totals have often been identified early on through physical counting of questionnaires or coding sheets and communicated verbally, independently of the main data processing.”

Reference:
Robert Chambers [1983]: Rural Development: Putting the Last First. London, UK: Longman. p. 53.

The websearch-industrial complex

I think it is now well-known that the creation of the Internet was sponsored by the US Government, through its military research funding agency, ARPA (later DARPA).   It is perhaps less well-known that Google arose from a $4.5 million research project also sponsored by the US Government, through the National Science Foundation.   Let no one say that the USA has an economic system involving “free” enterprise.

In the primordial ooze of Internet content several hundred million seconds ago (1993), fewer than 100 Web sites inhabited the planet. Early clans of information seekers hunted for data among the far larger populations of text-only Gopher sites and FTP file-sharing servers. This was the world in the years before Google.

Straitjackets of Standards

This week I was invited to participate as an expert in a Delphi study of The Future Internet, being undertaken by an EC-funded research project.   One of the aims of the project is to identify multiple plausible future scenarios for the socio-economic role(s) of the Internet and related technologies, after which the project aims to reach a consensus on a small number of these scenarios.  Although the documents I saw were unclear as to exactly which population this consensus was to be reached among, I presume it was intended to be a consensus of the participants in the Delphi study.
I have a profound philosophical disagreement with this objective, and indeed with most of the EC’s many efforts in standardization.   Tim Berners-Lee invented the Hyper-Text Transfer Protocol (HTTP), for example, in order to enable physicists to publish their research documents to one another in a manner which gave authors control over document appearance.    Like most new technologies, HTTP was not invented for the many other uses to which it has since been put; indeed, many of these other applications have required hacks or fudges to HTTP in order to work.  For example, because HTTP itself keeps no track of state from one request to the next, fudges such as cookies are needed.  If we had all been in consensual agreement with The Greatest Living Briton about the purposes of HTTP, we would have no e-commerce, no blogging, no social networking, no easy remote access to databases, no large-scale distributed collaborations, no easy action-at-a-distance, in short no transformation of our society and life these last two decades, just the broadcast publishing of text documents.
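
A minimal sketch, with invented names, of the fudge in question: HTTP itself carries no memory of earlier requests, so the server plants an identifier in a Set-Cookie header and the client echoes it back on later requests, simulating state on top of a stateless protocol.

    import uuid

    SESSIONS = {}    # server-side memory, keyed by the cookie value

    def handle_request(headers):
        """Handle one stateless request; return (extra response headers, body)."""
        cookie = headers.get("Cookie", "")
        session_id = cookie[len("sid="):] if cookie.startswith("sid=") else None

        extra = {}
        if session_id not in SESSIONS:           # first visit: the server knows nothing yet
            session_id = uuid.uuid4().hex
            SESSIONS[session_id] = {"visits": 0}
            extra["Set-Cookie"] = "sid=" + session_id

        SESSIONS[session_id]["visits"] += 1
        return extra, "You have made %d request(s)." % SESSIONS[session_id]["visits"]

    # The second request carries the cookie back, so the server "remembers" the first.
    extra, body = handle_request({})
    print(body)                                               # 1 request
    print(handle_request({"Cookie": extra["Set-Cookie"]})[1]) # 2 requests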
Let us put aside this childish, warm-and-fuzzy, touchy-feely seeking after consensus.  Our society benefits most from a diversity of opinions and strong disagreements, a hundred flowers blooming, a cacophony of voices, in the words of Oliver Wendell Holmes.  This is particularly true of opinions regarding the uses and applications of innovations.   Yet the EC persists, in an obstinate chasing after illusory certainty, in trying to force us all into straitjackets of standards and uniform practice.    These efforts are misguided and wrong-headed, and deserve to fail.

Myopic utilitarianism

What are the odds, eh?  On the same day that the Guardian publishes an obituary of the theoretical computer scientist Peter Landin (1930-2009), pioneer of the use of Alonzo Church’s lambda calculus as a formal semantics for computer programs, they also report that the Government is planning only to fund research which has relevance to the real world.  This is GREAT NEWS for philosophers and pure mathematicians!
What might have seemed, for example,  mere pointless musings on the correct way to undertake reasoning – by Aristotle, by Islamic and Roman Catholic medieval theologians, by numerous English, Irish and American abstract mathematicians in the 19th century, by an entire generation of Polish logicians before World War II, and by those real-world men-of-action Gottlob Frege, Bertrand Russell, Ludwig Wittgenstein and Alonzo Church – turned out to be EXTREMELY USEFUL for the design and engineering of electronic computers.   Despite Russell’s Zen-influenced personal motto – “Just do!  Don’t think!” (later adopted by IBM) – his work turned out to be useful after all.   I can see the British research funding agencies right now, using their sophisticated and proven prognostication procedures to calculate the society-wide economic and social benefits we should expect to see from our current research efforts over the next 2300 years  – ie, the length of time that Aristotle’s research on logic took to be implemented in technology.   Thank goodness our politicians have shown no myopic utilitarianism this last couple of centuries, eh what?!
All while this man apparently received no direct state or commercial research funding for his efforts as a computer pioneer, playing with “pointless” abstractions like the lambda calculus.
And Normblog also comments.
POSTSCRIPT (2014-02-16):   And along comes The Cloud and ruins everything!   Because the lower layers of the Cloud – the physical infrastructure, operating system, even low-level application software – are fungible, and dynamically so, the Cloud is effectively “dark” to its users beneath some level.   Specifying and designing applications that will run over it, or systems that will access it, thus requires specification and design to be undertaken at high levels of abstraction.   If all you can say about your new system is that in 10 years’ time it will grab some data from the NYSE, and nothing (yet) about the format of that data, then you need to speak in abstract generalities, not in specifics.   It turns out the lambda calculus is just right for this task, and so London’s big banks have been recruiting logicians and formal methods people to spec & design their next-generation systems.  You can blame those action men, Church and Russell.
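
For readers who have never met it, here is a minimal sketch of the sort of “pointless” abstraction at issue: in the lambda calculus, numbers and arithmetic can be built from nothing but function application (the Church numerals).  This is illustrative only, written in Python rather than in any bank’s specification language.

    # A Church numeral n takes a function f and returns the function applying f n times.
    zero = lambda f: lambda x: x
    succ = lambda n: lambda f: lambda x: f(n(f)(x))
    plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

    def to_int(n):
        """Convert a Church numeral back to an ordinary integer."""
        return n(lambda k: k + 1)(0)

    two = succ(succ(zero))
    three = succ(two)
    print(to_int(plus(two)(three)))    # 5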