Anyone trying to integrate Generative AI using Large Language Models into some commercial or professional business process should understand the dangers of so-called hallucinations. Because LLMs are generating the next text, given a corpus of training text and prompt data, all activity by these models is the same – prediction of the most highly likely completion text. Some outputs may appear to be hallucinatory only to those of us tethered to a real world. Anyone untethered – and this includes the LLMs themselves – will not be able to distinguish so-called real from so-called hallucinations, just as a sleeper cannot tell if his or her dreams are plausible or not.
Doing and thinking
Someone once said that computer programming is 10% syntax and 90% thinking. Here is Andrej Karpathy with a similar thought, writing on Twitter:
You could see it as there being two modes in creation. Borrowing GAN terminology:
1) generation and
2) discrimination.
e.g. painting – you make a brush stroke (1) and then you look for a while to see if you improved the painting (2). these two stages are interspersed in pretty much all creative work.
The rise of Blockchain
Blockchains are the new black!
In late 2014, the first edition of DevCon (labelled DevCon0, in computing fashion), the Ethereum developers conference held in Berlin, had a dozen or so participants. A year later, several hundred people attended DevCon1 in the City of London. The participants were a mix of pony-tailed hacker libertarians and besuited, besotted bankers, and I happened to speak at the event. Since New Year, I have participated in a round-table discussion on technology and entrepreneurship with a Deputy Prime Minister of Singapore, previewed a smart contracts project undertaken by a prominent former member of Anonymous, briefed a senior business journalist on the coming revolution in financial and regulatory technology, and presented to 120 people at a legal breakfast on distributed ledgers and blockchains. That audience was, as it happens, the quietest and most attentive I have ever encountered.
For only the second time in my adult life, we are experiencing a great dramatic sea-change in technology infrastructure, and this period feels exactly like the early days of the Web.  In 1994-1997, every corporation and their sister was intent on getting online, but most  did not know how, and skills were scarce.  Great fortunes were to be made in clicks and mortar:  IBM took out full-page ads in the WSJ offering to put your company online in only 3 months and for just $1 million!  Today, sisters are urging investment in blockchains, and as much as $1 billion of venture funding went to blockchain and Bitcoin startups in 2015 alone.
The Web revolution helped make manifest the Information Society, by putting information online and making it easily accessible.  But, as I have argued before, most real work uses information but is not about information per se.  Rather, real work is about doing stuff, getting things done, and getting them done with, by, and through other people or organizations. Exchanges, promises, and commitments are as important as facts in such a world.  The blockchain revolution will manifest the Joint-Action Society, by putting transactions and commitments online in a way that is effectively unrevokable and unrepudiable.  The leaders of this revolution are likely to arise from banking and finance and insurance, since those sectors are where the applications are most compelling and the business needs most pressing. So expect this revolution to be led not from Silicon Valley, but from within the citadels of global banking: New York and London, Paris and Frankfurt, and perhaps Tokyo and Singapore.
London Life: Ethereum DevCon1
Gibson Hall, London, venue for DevCon1, 9-13 November 2015. There was some irony in holding a conference to discuss technology developments in blockchains and distributed ledgers in a grand, neo-classical heritage-listed building erected in 1865. At least it was fitting that a technology currently taking the financial world by storm should be debated in what was designed to be a banking hall (for Westminster Bank). The audience was split fairly evenly between dreadlocked libertarians & cryptocurrency enthusiasts and bankers & lawyers in smart suits: cyberpunk meets Gordon Gekko.
The science of delegation
Most people, if they think about the topic at all, probably imagine computer science involves the programming of computers.  But what are computers?  In most cases, these are just machines of one form or another.  And what is programming?  Well, it is the issuing of instructions (“commands” in the programming jargon) for the machine to do something or other, or to achieve some state or other.   Thus, I view Computer Science as nothing more or less than the science of delegation.
When delegating a task to another person, we are likely to be more effective (as the delegator or commander) the more we know about the skills and capabilities and current commitments and attitudes of that person (the delegatee or commandee). So too with delegating to machines. Accordingly, a large part of theoretical computer science is concerned with exploring the properties of machines, or rather, the deductive properties of mathematical models of machines. Other parts of the discipline concern the properties of languages for commanding machines, including their meaning (their semantics) – this is programming language theory. Because the vast majority of lines of program code nowadays are written by teams of programmers, not individuals, much of computer science – part of the branch known as software engineering – is concerned with how best to organize and manage and evaluate the work of teams of people. Because most machines are controlled by humans and act in concert for or with or to humans, another, related branch of this science of delegation deals with the study of human-machine interactions. In both these branches, computer science reveals itself to have a side which connects directly with the human and social sciences, something not true of the other sciences often grouped with Computer Science: pure mathematics, physics, or chemistry.
And from its modern beginnings 70 years ago, computer science has been concerned with trying to automate whatever can be automated – in other words, with delegating the task of delegating.  This is the branch known as Artificial Intelligence.   We have intelligent machines which can command other machines, and manage and control them in the same way that humans could.   But not all bilateral relationships between machines are those of commander-and-subordinate.  More often, in distributed networks machines are peers of one another, intelligent and autonomous (to varying degrees).  Thus, commanding is useless – persuasion is what is needed for one intelligent machine to ensure that another machine does what the first desires.  And so, as one would expect in a science of delegation, computational argumentation arises as an important area of study.
 
Strategic Programming
Over the last 40-odd years, a branch of Artificial Intelligence called AI Planning has developed. One way to view Planning is as automated computer programming:
- Write a program that takes as input an initial state, a final state (“a goal”), and a collection of possible atomic actions, and produces as output another computer program comprising a combination of the actions (“a plan”) guaranteed to take us from the initial state to the final state.
A prototypical example is robot motion:  Given an initial position (e.g., here), a means of locomotion (e.g., the robot can walk), and a desired end-position (e.g., over there), AI Planning seeks to empower the robot to develop a plan to walk from here to over there.   If some or all the actions are non-deterministic, or if there are other possibly intervening effects in the world, then the “guaranteed” modality may be replaced by a “likely” modality. 
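To make the "automated programming" view concrete, here is a minimal sketch of such a planner, assuming deterministic actions and a small, finite state space. It uses breadth-first search, which is only one of many possible search strategies; the corridor world, the names `plan`, `step_forward`, `step_back`, and `ACTIONS` are all hypothetical and chosen purely for illustration.

```python
from collections import deque


def plan(initial, goal, actions):
    """Breadth-first search for a shortest sequence of action names.

    `actions` maps an action name to a function state -> next state,
    returning None when the action is not applicable in that state.
    Returns a list of action names, or None if the goal is unreachable.
    """
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if state == goal:
            return steps
        for name, apply_action in actions.items():
            nxt = apply_action(state)
            if nxt is not None and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, steps + [name]))
    return None


# Hypothetical world: a robot in a corridor of cells 0..5.
# "Here" is cell 0 and "over there" is cell 3.
def step_forward(pos):
    return pos + 1 if pos < 5 else None


def step_back(pos):
    return pos - 1 if pos > 0 else None


ACTIONS = {"forward": step_forward, "back": step_back}

robot_plan = plan(0, 3, ACTIONS)
print(robot_plan)  # ['forward', 'forward', 'forward']
```

The output is itself a small program, a sequence of commands for the robot. The "guaranteed" modality holds here only because every action is assumed to be deterministic; with non-deterministic actions the planner would need to reason about likely outcomes instead.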
Another way to view Planning is in contrast to Scheduling:
- Scheduling is the orderly arrangement of a collection of tasks guaranteed to achieve some goal from some initial state, when we know in advance the initial state, the goal state, and the tasks.
- Planning is the identification and orderly arrangement of tasks guaranteed to achieve some goal from some initial state, when we know in advance the initial state and the goal state, but we don’t yet know the tasks; we only know in advance the atomic actions from which tasks may be constructed.
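The scheduling half of this contrast can be sketched in a few lines, under the simplifying assumption that the only constraints are precedence constraints between tasks (no durations, deadlines, or shared resources). The task names below are hypothetical, and the sketch relies on the standard-library `graphlib` module, available from Python 3.9.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks, all known in advance. Each task maps to the
# set of tasks that must be completed before it can start.
TASKS = {
    "gather_requirements": set(),
    "write_design": {"gather_requirements"},
    "build": {"write_design"},
    "test": {"build"},
}


def schedule(tasks):
    """Return one ordering of the known tasks that respects precedence."""
    return list(TopologicalSorter(tasks).static_order())


order = schedule(TASKS)
print(order)
```

Nothing here has to be discovered: the scheduler only arranges tasks it was handed. A planner, by contrast, would first have to construct the tasks themselves out of atomic actions before any such arrangement could begin.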
Relating these ideas to my business experience, I realized that a large swathe of complex planning activities in large companies involves something at a higher level of abstraction. Henry Mintzberg called these activities “Strategic Programming”:
- Strategic Programming is the identification and prioritization of a finite collection of programs or plans, given an initial state and a set of desirable end-states or objectives (possibly conflicting). A program comprises an ordered collection of tasks, and these tasks and their ordering we may or may not know in advance.
Examples abound in complex business domains.   You wake up one morning to find yourself the owner of a national mobile telecommunications licence, and with funds to launch a network.  You have to buy the necessary equipment and deploy and connect it, in order to provide your new mobile network.   Your first decision is where to provide coverage:  you could aim to provide nationwide coverage, and not open your service to the public until the network has been installed and connected nationwide.  This is the strategy Orange adopted when launching PCS services in mainland Britain in 1994.   One downside of waiting till you’ve covered the nation before selling any service to customers is that revenues are delayed. 
Another downside is that a competitor may launch service before you, and that happened to Orange:  Mercury One2One (as it then was) offered service to the public in 1993, when they had only covered the area around London.   The upside of that strategy for One2One was early revenues.  The downside was that customers could not use their phones outside the island of coverage, essentially inside the M25 ring-road.   For some customer segments, wide-area or nationwide coverage may not be very important, so an early launch may be appropriate if those customer segments are being targeted.  But an early launch won’t help customers who need wider-area coverage, and – unless marketing communications are handled carefully – the early launch may position the network operator in the minds of such customers as permanently providing inadequate service.   The expectations of both current target customers and customers who are not currently targets need to be explicitly managed to avoid such mis-perceptions.
In this example, the different coverage rollout strategies ended up at the same place eventually, with both networks providing nationwide coverage. But the two operators took different paths to that same end-state. How to identify, compare, prioritize, and select between these different paths is the very stuff of marketing and business strategy, i.e., of strategic programming. It is why business decision-making is often very complex and often intellectually very demanding. Let no one say (as academics are wont to do) that decision-making in business is a doddle. Everything is always more complicated than it looks from outside, and identifying and choosing between alternative programs is among the most complex of decision-making activities.
Progress in computing
Computer science typically proceeds by first doing something, and then thinking carefully about it: Engineering usually precedes theory. Some examples:
- The first programmable device in modern times was the Jacquard Loom, a textile loom that could weave different patterns depending on the instruction cards fed into it. This machine dates from the first decade of the 19th century, but we did not have a formal, mathematical theory of programming until the 1960s.
- Charles Babbage designed various machines to undertake automated calculations in the first half of the 19th century, but we did not have a mathematical theory of computation until Alan Turing’s film-projector model a century later.
- We’ve had a fully-functioning, scalable, global network enabling multiple, asynchronous, parallel, sequential and interleaved interactions since Arpanet four decades ago, but we still lack a fully-developed mathematical theory of interaction. In particular, Turing’s film projectors seem inadequate to model interactive computational processes, such as those where new inputs arrive or partial outputs are delivered before processing is complete, or those processes which are infinitely divisible and decentralizable, or nearly so.
- The first mathematical theory of communications (due to Claude Shannon) dates only from the 1940s, and that theory explicitly ignores the meaning of messages. In the half-century since, computer scientists have used speech act theory from the philosophy of language to develop semantic theories of interactive communication. Arguably, however, we still lack a good formal, semantically-rich account of dialogs and utterances about actions. Yet, smoke signals were used for communications in ancient China, in ancient Greece, and in medieval-era southern Africa.
An important consequence of this feature of the discipline is that theory and practice are strongly coupled and symbiotic. We need practice to test and validate our theories, of course. But our theories are not (in general) theories of something found in Nature, but theories of practice and of the objects and processes created by practice. Favouring theory over practice risks creating a sterile, infeasible discipline out of touch with reality – a glass bead game such as string theory or pre-Crash mathematical economics. Favouring practice over theory risks losing the benefits of intelligent thought and modeling about practice, and thus inhibiting our understanding about limits to practice. Neither can exist very long or effectively without the other.
Combining actions
How might two actions be combined? Well, depending on the actions, we may be able to do one action and then the other, or we may be able to do the other and then the one, or maybe not. We call such a combination a sequence or concatenation of the two actions. In some cases, we may be able to do the two actions in parallel, both at the same time. We may have to start them simultaneously, or we may be able to start one before the other. Or, we may have to ensure they finish together, or that they jointly meet some other intermediate synchronization targets.
In some cases, we may be able to interleave them, doing part of one action, then part of the second, then part of the first again, what management consultants in telecommunications call multiplexing. For many human physical activities – such as learning to play the piano or learning to play golf – interleaving is how parallel activities are first learnt and complex motor skills acquired: first play a few bars of music on the piano with only the left hand, then the same bars with only the right, and keep practicing the hands on their own, and only after the two hands are each practiced individually do we try playing the piano with the two hands together.
Computer science, which I view as the science of delegation, knows a great deal about how actions may be combined, how they may be distributed across multiple actors, and what the meanings and consequences of these different combinations are likely to be. It is useful to have an exhaustive list of the possibilities. Let us suppose we have two actions, represented by A and B respectively. Then we may be able to do the following compound actions:
- Sequence: The execution of A followed by the execution of B, denoted A ; B
- Iterate: A executed n times, denoted A ^ n (This is sequential execution of a single action.)
- Parallelize: Both A and B are executed together, denoted A & B
- Interleave: Action A is partly executed, followed by part-execution of B, followed by continued part-execution of A, etc, denoted A || B
- Choose: Either A is executed or B is executed but not both, denoted A v B
- Combinations of the above: For example, with interleaving, only one action is ever being executed at one time. But it may be that the part-executions of A and B can overlap, so we have a combination of Parallel and Interleaved compositions of A and B.
Depending on the nature of the domain and the nature of the actions, not all of these compound actions may necessarily be possible. For instance, if action B has some pre-conditions before it can be executed, then the prior execution of A has to successfully achieve these pre-conditions in order for the sequence A ; B to be feasible.
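Several of these compound actions, along with the pre-condition check just described, can be sketched in code. This is only an illustrative model, not a standard formalism: it assumes an action is a tuple of part-executions ("steps") together with sets of facts it requires and establishes, and the class `Action` and all function names are hypothetical. True parallel execution (A & B) is deliberately left out, since modelling it faithfully would need threads or processes.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Action:
    name: str
    steps: tuple                    # the part-executions making up the action
    pre: frozenset = frozenset()    # facts that must hold before execution
    post: frozenset = frozenset()   # facts that hold after execution


def sequence(a, b):
    """A ; B : all of A's steps, then all of B's."""
    return a.steps + b.steps


def iterate(a, n):
    """A ^ n : A executed n times, one after another."""
    return a.steps * n


def interleave(a, b):
    """A || B : alternate part-executions of A and B, one step at a time."""
    merged = []
    for x, y in zip_longest(a.steps, b.steps):
        if x is not None:
            merged.append(x)
        if y is not None:
            merged.append(y)
    return tuple(merged)


def choose(a, b, pick_first):
    """A v B : exactly one of the two actions is executed."""
    return a.steps if pick_first else b.steps


def sequence_feasible(a, b, facts=frozenset()):
    """A ; B is feasible if A can start and leaves B's pre-conditions met."""
    return a.pre <= facts and b.pre <= (facts | a.post)


# Hypothetical actions: B needs hot water, which only A provides.
A = Action("boil", ("fill kettle", "heat water"), post=frozenset({"hot water"}))
B = Action("brew", ("add tea", "pour water"), pre=frozenset({"hot water"}))

print(sequence(A, B))
print(interleave(A, B))
print(sequence_feasible(A, B), sequence_feasible(B, A))  # True False
```

Note that this simple interleaving happily places "add tea" before "heat water"; whether that is acceptable depends on the domain, which is exactly why not every compound action is possible for every pair of actions.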
This stuff may seem very nitty-gritty, but anyone who’s ever asked a teenager to do some task they don’t wish to do, will know all the variations in which a required task can be done after, or alongside, or intermittently with, or be replaced instead by, some other task the teen would prefer to do. Machines, it turns out, are much like recalcitrant and literal-minded teenagers when it comes to commanding them to do stuff.
PhD Vivas
A while back, I posted some advice from my own experiences on doing a PhD. Since then, several people have asked me for advice about the viva voce (or oral) examination, which most PhD programs require at the end of the degree. Here are some notes I wrote for a candidate recently.
It is helpful to think about the goals of the examiners.   In my opinion, they are trying to achieve the following goals:
1. First, they simply want to understand what your dissertation says.   This means they will usually ask you to clarify or explain things which are not clear to them.
2.  Then, they want to understand the context of the work.  This refers to the previous academic literature on the subject or on related subjects, so they will generally ask about that literature.  They may consider some topic to be related to your work which you did not cover; in that case, you would normally be asked to add some text on that topic.
3.  They want to assess if the work makes a contribution to the related literature.    So they will ask what is new or original in your dissertation, and why it is different from the past work of others.  They will also want to be able to separate what is original from what came before (which is sometimes hard to do in some dissertations, due to the writing style of the candidate or the structure of the document).   To the extent that Computer Science is an engineering discipline, and thus involves design, originality is usually not a problem:  few other people will be working in the same area as you, and none of them would have made precisely the same sequences of design choices in the same order for the same reasons as you did.
4.  They will usually want to assess if the new parts in the dissertation are significant or important.  They will ask you about the strengths and weaknesses of your research, relative to the past work of others.   They will usually ask about potential future work, the new questions that arise from your work, or the research that your work or your techniques make possible.  Research or research techniques which open up new research vistas or new application domains are usually looked upon favourably.
5.  Goals #3 and #4 will help the examiners decide if the written dissertation is worthy of a PhD award, since most university regulations require PhD dissertations to present an original and significant contribution to knowledge.
6.  The examiners will also want to assess if YOU yourself wrote the document.  They will therefore ask you about the document, what your definitions are, where things are, why you have done certain things and not others, why you have made certain design choices and not others,  etc.     Some examiners will even give the impression that they have not read your dissertation, precisely to find out if you have!
7.  Every dissertation makes some claims (your “theses”).  The examiners will generally approach these claims with great scepticism, questioning and challenging you, contesting your responses and arguments, and generally trying to argue you down.   They want to see if you can argue in favour of your claims, to see if you are able to justify and support your claims, and how you handle criticism.   After all, if you can’t support your claims, no one else will, since you are the one proposing them.
The viva is not a test of memory, so you can take a copy of your thesis with you and refer to it as you wish.  Likewise, you can take any notes you want.    The viva is also not a test of speed-thinking, so you can take your time to answer questions or to respond to comments.    You can ask the examiners to explain any question or any comment which you don’t understand.   It is OK to argue with the examiners (in some sense, it is expected), but not to get personal in argument or to lose your temper.
The viva is one of the few occasions in a research career when you can have an extended discussion about your research with people interested in the topic who have actually read your work.   Look forward to it, and enjoy it!
