Hi, my name is Seung Park, and I’m the new pathology informatics fellow at UPMC. I’m one of JPI’s junior editors for this year, which means I’ll also be your blogger. I’m excited to be here, and I hope our journey through pathology informatics together this year will be a fruitful one.
As a computer scientist by vocation and a historian by avocation, I want to introduce you, over the course of the year, to landmark papers and events in the history of human-computer interaction (HCI) that to me serve as powerful indicators not only of where we have been, but also of where we might be going. Most people will be reading this blog post on their computers; this means a user interface (UI) paradigm built around monitors, keyboards, and mice. This paradigm is immediately familiar to us. Others will be reading this post on their tablets or smartphones; this means a UI paradigm that revolves around a capacitive touchscreen with multitouch capabilities. This paradigm is likewise immediately familiar. When you compare these two UI paradigms, most implementation details differ – most people who own both a computer and a smartphone will tell you that the monitor-keyboard-mouse control schema of a computer is very different from the multitouch control schema of a smartphone – yet at their core these paradigms are fundamentally similar beasts. Over the next few months, I hope to show you how these UI paradigms came to be – and what the evolution of UI design means for pathology informatics systems of the present and the future.
We begin our survey of landmarks in UI design with a man named Vannevar Bush. Born in 1890, Bush saw firsthand how lack of cooperation between civilian scientists and the military had crippled American research efforts during World War I. He rose to a position of national prominence in the interbellum period, becoming vice-president and dean of engineering at MIT in 1932. In 1940, when Nazi Germany invaded France, Bush sought – and rapidly obtained – a personal meeting with President Roosevelt. All he took into the meeting was a single sheet of paper describing an organization that was later to be known as the National Defense Research Committee (NDRC). Roosevelt approved this proposal without any changes in ten minutes.
It is not a leap of the imagination to say that the NDRC – and its successor organization, the Office of Scientific Research and Development (OSRD) – coordinated the total scientific output of the United States during World War II. Among its achievements were nuclear weapons, sonar, the proximity fuse, and the mass production of penicillin. It is estimated that at one point in time, two-thirds of the nation’s physicists were working for Bush. During this time, Bush had an unshakeable belief that a strong national defense was critical for the development of science – and vice versa. This attitude was to be reversed in his later years, when he grew increasingly aware that the arms race that he had helped to create had mushroomed into what we now know as the Cold War.
At the end of World War II, though, Bush’s optimism for a bright future with technological and scientific innovation at its core was at its strongest. It was at this point in time that he penned an essay titled “As We May Think” – arguably the first paper in the history of human-computer interaction. Though written in 1945, it introduces concepts that are modern and relevant even to the reader of the current day. The astute reader can pick out ideas that would later go on to influence the creation of hypertext, personal computers, the Internet, the World Wide Web, speech recognition software, and collaboration-based knowledge bases such as Wikipedia. A copy of this paper is available here; I encourage you to read along with me.
“This has not been a scientist's war; it has been a war in which all have had a part. The scientists, burying their old professional competition in the demand of a common cause, have shared greatly and learned much. It has been exhilarating to work in effective partnership. Now, for many, this appears to be approaching an end. What are the scientists to do next?”
The opening section of the article raises a few interesting points. Perhaps most interesting is the notion that collaborative work toward a strong common cause drives rapid advances in science. Compare this to the way academic research is often conducted today: though there is a good deal of collaboration in academia, there is also a great deal of secrecy that stems from the fear of being “scooped”. The competitive race to publish in large quantities has produced an enormous literature, but much of it is of little use – and it becomes increasingly difficult to sort the wheat from the chaff.
“Science has provided the swiftest communication between individuals; it has provided a record of ideas and has enabled man to manipulate and to make extracts from that record so that knowledge evolves and endures throughout the life of a race rather than that of an individual.”
This is, indeed, the cornerstone of all scientific endeavor: a public, trusted, reproducible record of ideas and implementations upon which innovations and new discoveries can be made. The fascinating thing about this statement is that as the volume of scientific research has grown, it has become increasingly difficult for the scientist to determine which extracts from this record are of value, and which are not.
“There is a growing mountain of research. But there is increased evidence that we are being bogged down today as specialization extends. The investigator is staggered by the findings and conclusions of thousands of other workers—conclusions which he cannot find time to grasp, much less to remember, as they appear. Yet specialization becomes increasingly necessary for progress, and the effort to bridge between disciplines is correspondingly superficial.”
It is precisely this environment that makes true peer review so difficult – and allows less-than-ethical researchers to prosper for long stretches of time before they are detected. The need for some way to organize, filter, and analyze the vast amount of data generated every day grows ever more pressing.
“The difficulty seems to be, not so much that we publish unduly in view of the extent and variety of present day interests, but rather that publication has been extended far beyond our present ability to make real use of the record. The summation of human experience is being expanded at a prodigious rate, and the means we use for threading through the consequent maze to the momentarily important item is the same as was used in the days of square-rigged ships.”
Here Bush articulates the major point of his paper: since drinking from a fire hose does no one any good, there exists a need to augment – and perhaps fundamentally change – the way that we deal with knowledge.
Much of this section is dedicated to advances in photography, and how such advances might be used. He speaks of compression schemes as well, in the form of microfilm. Bush notes how much data could be compressed in his day – and again notes that the volume of data that exists is so high that “even the modern great library is not generally consulted; it is nibbled at by a few”. His last paragraph is striking in its clairvoyance:
“Compression is important, however, when it comes to costs. The material for the microfilm Britannica would cost a nickel, and it could be mailed anywhere for a cent. What would it cost to print a million copies? To print a sheet of newspaper, in a large edition, costs a small fraction of a cent. The entire material of the Britannica in reduced microfilm form would go on a sheet eight and one-half by eleven inches. Once it is available, with the photographic reproduction methods of the future, duplicates in large quantities could probably be turned out for a cent apiece beyond the cost of materials.”
We now live in an era where it is trivially simple to reproduce a given amount of data; it costs virtually nothing to then transmit that data to millions of people worldwide at the same time (unfortunately for us, this is why e-mail spam is so prevalent). There once was a time when the cost of media was high enough for us to pay substantial sums for a phonograph record, or for a compact disc. Nowadays it is increasingly recognized that it is the original act of creation itself that is important; distribution and media costs have largely become a thing of the past.
“At a recent World Fair a machine called a Voder was shown. A girl stroked its keys and it emitted recognizable speech. No human vocal chords entered into the procedure at any point; the keys simply combined some electrically produced vibrations and passed these on to a loud-speaker. In the Bell Laboratories there is the converse of this machine, called a Vocoder. The loudspeaker is replaced by a microphone, which picks up sound. Speak to it, and the corresponding keys move. This may be one element of the postulated system.
The other element is found in the stenotype, that somewhat disconcerting device encountered usually at public meetings. A girl strokes its keys languidly and looks about the room and sometimes at the speaker with a disquieting gaze. From it emerges a typed strip which records in a phonetically simplified language a record of what the speaker is supposed to have said. Later this strip is retyped into ordinary language, for in its nascent form it is intelligible only to the initiated. Combine these two elements, let the Vocoder run the stenotype, and the result is a machine which types when talked to.”
It is interesting that current voice recognition technologies do not do much more than this, even after decades of research in the field.
“Rapid electrical counting appeared soon after the physicists found it desirable to count cosmic rays. For their own purposes the physicists promptly constructed thermionic-tube equipment capable of counting electrical impulses at the rate of 100,000 a second. The advanced arithmetical machines of the future will be electrical in nature, and they will perform at 100 times present speeds, or more.”
This neatly anticipates how fast progress has been in the realm of microprocessors. Today’s smartphones pack processing power that would have been unheard of in the mainframe world even 20 years ago. We have, among other things, Moore’s Law to thank for this.
“The repetitive processes of thought are not confined however, to matters of arithmetic and statistics. In fact, every time one combines and records facts in accordance with established logical processes, the creative aspect of thinking is concerned only with the selection of the data and the process to be employed and the manipulation thereafter is repetitive in nature and hence a fit matter to be relegated to the machine.”
Consider this: there was a time when the clinician was expected to run lab tests on all of his or her patients personally. Later, that work was done by individual hospital labs – but still by human beings. Compare this to the largely automated clinical lab of the modern day, where the basic acquisition and manipulation of raw data is indeed left to machines.
“A mathematician is not a man who can readily manipulate figures; often he cannot. He is not even a man who can readily perform the transformations of equations by the use of calculus. He is primarily an individual who is skilled in the use of symbolic logic on a high plane, and especially he is a man of intuitive judgment in the choice of the manipulative processes he employs.”
This is an interesting statement. It holds true today just as well as it did back in Bush’s day. However, there is an added layer of interpretation to be had here: as pathology informaticists – indeed, as clinical scientists – we are all users of “symbolic logic on a high plane”. The data sets we manipulate are large, and our manipulations are increasingly automated processes. There are attempts – IBM’s Deep Blue and Watson come to mind here – to give machines the ability to not only index and manipulate the data they store, but to also interpret said data. However, this area is still nascent, and progress remains slow. Current computational systems that interpret and act upon data are only successful when the data and the rules that govern that data are tightly constrained: this is why it is possible to build a program that plays chess well, but why current efforts to build a program that plays Go well have failed.
“So much for the manipulation of ideas and their insertion into the record. Thus far we seem to be worse off than before—for we can enormously extend the record; yet even in its present bulk we can hardly consult it. This is a much larger matter than merely the extraction of data for the purposes of scientific research; it involves the entire process by which man profits by his inheritance of acquired knowledge. The prime action of use is selection, and here we are halting indeed. There may be millions of fine thoughts, and the account of the experience on which they are based, all encased within stone walls of acceptable architectural form; but if the scholar can get at only one a week by diligent search, his syntheses are not likely to keep up with the current scene.”
This is truer today than it was when Bush wrote his paper; it is easy for a person to get on the Internet and write up a website filled with one sort of knowledge or another. Another analogy is that of the hard drive: how many of you have written files to a hard drive and then completely lost them later, not because the data disappeared but because you forgot where you stored that data? The solution, Bush thinks – and if present computer operating systems are any indicator, so too do companies like Apple, Microsoft, and Palm – is to make search technology smart, ubiquitous, and above all instant.
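The “smart, ubiquitous, instant” search that Bush anticipated rests on a simple idea: rather than scanning every document for a term (Bush’s slow “simple selection”), build a map from each term to the documents that contain it. A minimal sketch of such an inverted index – the document names and contents here are purely illustrative:

```python
from collections import defaultdict

def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in text.lower().split():
            index[word].add(name)
    return index

docs = {
    "notes.txt": "memex trails link associated items",
    "draft.txt": "search must be instant and ubiquitous",
}
index = build_index(docs)
print(sorted(index["memex"]))   # documents containing "memex"
```

Once the index is built, finding every file that mentions a term is a single lookup – no matter how many files you have forgotten about.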
In pathology, we have our own version of this dilemma: suppose that all the world’s slides were magically digitized tomorrow. What would happen then? How would we store all that information, and how would we make it useful to not only pathologists of today, but researchers who might want to analyze the image data? How do we catalog it, how do we make it searchable, and above all how do we make it easily useful to the pathologist who may not have much in the way of a computer science background?
“This process, however, is simple selection: it proceeds by examining in turn every one of a large set of items, and by picking out those which have certain specified characteristics. There is another form of selection best illustrated by the automatic telephone exchange. You dial a number and the machine selects and connects just one of a million possible stations. It does not run over them all. It pays attention only to a class given by a first digit, then only to a subclass of this given by the second digit, and so on; and thus proceeds rapidly and almost unerringly to the selected station. It requires a few seconds to make the selection, although the process could be speeded up if increased speed were economically warranted.”
Here Bush argues for the encoding of knowledge in symbols such as (in this example) numerical codes. This is a forerunner of the idea of file formats and classification systems like MIME types. There is a problem with this, though. Bush is describing indexing in a limited-character system; we know this approach works well in situations where the inputs are limited and the outputs are trustworthy. More interesting is the problem of indexing a system where the inputs are unlimited and many of the outputs are untrustworthy: though the best efforts of companies like Google are indeed spectacular, searchers often have to read through several pages of hits before they find what they’re looking for.
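Bush’s telephone exchange – narrow by the first digit, then the second, and so on – is essentially what computer scientists now call a trie (prefix tree): each digit selects one branch, so selection takes time proportional to the number of digits, not the number of stations. A toy illustration with made-up station names (the function names are my own, not Bush’s):

```python
def build_exchange(stations):
    """Build a nested-dict trie: each digit of a number selects one branch."""
    root = {}
    for number, station in stations.items():
        node = root
        for digit in number:
            node = node.setdefault(digit, {})
        node["station"] = station
    return root

def connect(root, number):
    """Follow one branch per digit -- never scanning over all stations."""
    node = root
    for digit in number:
        node = node[digit]
    return node["station"]

exchange = build_exchange({"314": "Alice", "315": "Bob", "920": "Carol"})
print(connect(exchange, "315"))   # prints "Bob"
```

Note how `connect` touches only three dictionary nodes regardless of whether the exchange holds three stations or a million – exactly the property Bush admired.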
Sections 6 and 7
These are the two most important parts of the paper. In these sections, Bush describes a machine that he calls a “memex” (a contraction of “memory extender”) that sounds clairvoyantly close to our modern concept of a personal computer. He also introduces a concept that would later become known as hypertext. Consider the following:
“The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.”
“The owner of the memex, let us say, is interested in the origin and properties of the bow and arrow. Specifically he is studying why the short Turkish bow was apparently superior to the English long bow in the skirmishes of the Crusades. He has dozens of possibly pertinent books and articles in his memex. First he runs through an encyclopedia, finds an interesting but sketchy article, leaves it projected. Next, in a history, he finds another pertinent item, and ties the two together. Thus he goes, building a trail of many items. Occasionally he inserts a comment of his own, either linking it into the main trail or joining it by a side trail to a particular item. When it becomes evident that the elastic properties of available materials had a great deal to do with the bow, he branches off on a side trail which takes him through textbooks on elasticity and tables of physical constants. He inserts a page of longhand analysis of his own. Thus he builds a trail of his interest through the maze of materials available to him.
And his trails do not fade. Several years later, his talk with a friend turns to the queer ways in which a people resist innovations, even of vital interest. He has an example, in the fact that the outraged Europeans still failed to adopt the Turkish bow. In fact he has a trail on it. A touch brings up the code book. Tapping a few keys projects the head of the trail. A lever runs through it at will, stopping at interesting items, going off on side excursions. It is an interesting trail, pertinent to the discussion. So he sets a reproducer in action, photographs the whole trail out, and passes it to his friend for insertion in his own memex, there to be linked into the more general trail.”
This sounds a lot like the way the World Wide Web currently operates, doesn’t it?
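Bush’s “trail” maps naturally onto a linked data structure: a named sequence of items, any of which may carry the owner’s own comment and branch off into side trails – in essence, hypertext links. A minimal sketch (the class and method names here are my own invention for illustration, not Bush’s):

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    title: str
    comment: str = ""                                  # the owner's own annotation
    side_trails: list = field(default_factory=list)    # branches to other trails

@dataclass
class Trail:
    name: str
    items: list = field(default_factory=list)

    def tie(self, item):
        """Append an item to the main trail -- Bush's 'ties the two together'."""
        self.items.append(item)
        return item

bows = Trail("Turkish vs English bows")
bows.tie(Item("Encyclopedia: bow and arrow"))
entry = bows.tie(Item("History: Crusades skirmishes"))
elasticity = Trail("Elastic properties of materials")
entry.side_trails.append(elasticity)                   # branch off on a side trail
print([item.title for item in bows.items])
```

Replace `Trail` with a web page and `side_trails` with anchor tags, and you have the link structure of the World Wide Web.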
This is the final section of the paper, and the most prophetic of all.
“Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client's interest. The physician, puzzled by a patient's reactions, strikes the trail established in studying an earlier similar case, and runs rapidly through analogous case histories, with side references to the classics for the pertinent anatomy and histology. The chemist, struggling with the synthesis of an organic compound, has all the chemical literature before him in his laboratory, with trails following the analogies of compounds, and side trails to their physical and chemical behavior.”
This is Wikipedia, in a nutshell. However, it is doubtful that Bush ever considered that the editors of such encyclopedias themselves might not be trustworthy. Vandalism, prank edits, and surreptitious positive edits of articles on organizations by staff of those organizations are rampant. Here it seems that in a sense Bush was too optimistic, in that he assumed that those who generated knowledge would be benevolent – or at least not malicious – in their efforts.
“In the outside world, all forms of intelligence whether of sound or sight, have been reduced to the form of varying currents in an electric circuit in order that they may be transmitted. Inside the human frame exactly the same sort of process occurs. Must we always transform to mechanical movements in order to proceed from one electrical phenomenon to another? It is a suggestive thought, but it hardly warrants prediction without losing touch with reality and immediateness.”
Here Bush speaks of the fact that we are, in a sense, translating electrical signals in the brain into electrical signals in a computer system – and vice versa. Direct neural interfaces and other thought-based human-computer interaction paradigms are still largely the stuff of science fiction, but the implications for the future are startling.
This was a landmark article in several respects. As you can see, it predicted a good number of advances that – in one way or another – have come to fruition. It also – largely in sections 6 and 7 – lays down the concepts for a basic user interface for the amassing and cataloguing of data. Ted Nelson, who coined the terms “hypertext” and “hyperlink”, was directly influenced by this article. The subject of my next article on the history of UI design, Douglas Engelbart, was likewise directly inspired by “As We May Think” to begin work on a system called NLS – which included the mouse, the word processor, hyperlinking, and videoconferencing. A group of researchers who worked on NLS broke away and became the original staff of Xerox PARC. From Xerox PARC came the Xerox Alto – widely recognized as the first computer to codify the rules that govern the monitor-keyboard-mouse windowing graphical user interface. The Xerox Alto directly inspired the creation of the Apple Macintosh and Microsoft Windows. Evolved forms of Mac OS and Windows run on billions of computers today, many of them running mission-critical software for various organizations.
Therefore, in a very real sense, all of us who work with computers owe a very large debt of gratitude to Vannevar Bush and “As We May Think”.