cps reading room (creative problem solving): Paul Baran (RAND)

Oral-History:Paul Baran
https://ethw.org/Oral-History:Paul_Baran

Cold War Threat, ~ 1959
Baran:

In late 1959 when I joined the RAND Corporation, the Air Force was synonymous with National Defense. The other services were secondary. The major problem facing the Country and the World was that the Cold War between the two superpowers had escalated to the point by 1959 when both sides were starting to build highly vulnerable missile systems prone to accidents. Whichever side fired their thermonuclear weapons first would essentially destroy the retaliatory capacity of the other. This was a highly unstable and dangerous era. A single accidentally fired weapon could set off an unstoppable nuclear war. A preferred alternative would be to have the ability to withstand a first strike and the capability of returning the damage in kind. This reduces the overwhelming advantage of a first strike, and allows much tighter control over nuclear weapons. This is sometimes called Second Strike Capability. If both sides had a retaliatory capability that could withstand a first-strike attack, a more stable situation would result. This situation is sometimes called Mutually Assured Destruction, also known by its appropriate acronym, MAD. Those were crazy times.

Communications: the Achilles Heel, 1960+
Baran:

The weakest spot in assuring a second strike capability was in the lack of reliable communications. At the time we didn’t know how to build a communication system that could survive even collateral damage by enemy weapons. RAND determined through computer simulations that the AT&T Long Lines telephone system, which carried essentially all the Nation’s military communications, would be cut apart by relatively minor physical damage. While essentially all of the links and the nodes of the telephone system would survive, a few critical points of this very highly centralized analog telephone system would be destroyed by collateral damage alone by missiles directed at air bases, and the system would collapse like a house of cards. This rendered critical long distance communications unlikely. Well what about high frequency radio, i.e. the HF or short wave band? The problem here is that a single high altitude nuclear burst destroys sky wave propagation for hours. While propagation would continue via the ground wave, the sky wave badly needed for long distance radio would not function, reducing usable radio ranges to a few tens of miles.

The fear was that our communications were so vulnerable that each missile base commander would face the dilemma of either doing nothing in the event of a physical attack, or taking action that would mean an all out irrevocable war. A communications system that could withstand attack was needed that would allow reduction of tension at the height of the Cold War.

Broadcast Station Distributed Teletypewriter Network, 1960
Baran:

At that time the expressed concern was for a system able to support Minimum Essential Communications -- a euphemism for the President authorizing a weapons launch.

In 1960 I proposed using broadcast stations as the links of a network. Broadcast stations during the daytime depend solely on the ground wave, not subject to the loss of the sky wave. This is the reason that AM broadcast stations have such a short range during the day. I was able to demonstrate using FCC station data that there were enough AM broadcast stations in the right location and of the right power levels to allow signals to be relayed across the country. I proposed a very simple protocol. Just flood the network with the same message.
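As a rough illustration of the "flood the network" protocol described above, the following Python sketch relays a message from station to station until every reachable station has it. The station names, the `links` map, and the `flood` function are assumptions made for this example and are not taken from the FCC data or the original briefing.

```python
from collections import deque

def flood(links, origin):
    """Return the set of stations that receive a message flooded from origin.

    links maps each station to the stations within its ground-wave range
    (hypothetical data for illustration only).
    """
    reached = {origin}
    queue = deque([origin])
    while queue:
        station = queue.popleft()
        for neighbor in links.get(station, ()):
            if neighbor not in reached:   # a station relays a given message only once
                reached.add(neighbor)
                queue.append(neighbor)
    return reached

# Hypothetical stations with one redundant cross link (B and C both reach D).
links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

# The same network after station B has been destroyed.
links_without_b = {
    station: [n for n in neighbors if n != "B"]
    for station, neighbors in links.items()
    if station != "B"
}
```

In this toy layout, flooding from "A" still reaches "E" after "B" is removed, because the message also travels through "C". No routing decisions are needed, which is what makes the protocol so simple.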

When I took this briefing around to the Pentagon, and other parts of the defense establishment I received the objection that it didn’t fix the problem of the military. “OK, a very narrow band capacity may take care of the President issuing the orders at the start of a war, but how do you support all the other important communications requirements that you need to operate the military during such a critical time.”

High Data Rate Distributed Communications, 1961 - 64
Baran:

The response was unambiguous. What I proposed wouldn’t fully hack it. So it was “back to the drawing board” time. I started to examine what military communications needs were regarded as essential by reading reports on the subject, and asking people at various military command centers. The more that I examined the issues, the longer the list. So I said to myself, “As I can’t figure out what essential communications is needed, let’s take a different tack. I’ll give those guys so much damn bandwidth that they wouldn’t know what in Hell to do with it all.” In other words, I viewed the challenge to be the design of a secure network able to send signals over a network being cut up, and yet having the signals delivered with perfect reliability. And, with more capacity than anything built to date. When one starts a project aim for the moon. Reality will cut you back later. But if you don’t aim high at the outset you can never advance very far.

Why Digital? Why Message Blocks?
Baran:

I knew that the signals would have to find their way through surviving paths, which would mean a lot of switching through multiple tandem links. But, at that time long distance telephone communications systems transmitted only analog signals. This placed a fundamental restriction on the number of tandem connected links that could be used before the voice signal quality became unusable. A telephone voice signal could pass through no more than about five independent tandem links before it would become inaudible. This ruled out analog transmission in favor of digital transmission. Digital signals have a wonderful property. As long as the noise is less than the signal’s amplitude it is possible to reconstruct the digital signal without error.
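The contrast between analog and digital tandem links can be made concrete with a deliberately simplified Python sketch. It assumes a fixed noise offset added on every link and a decision threshold halfway between the two digital levels. Both are assumptions for illustration, not measurements from the telephone plant.

```python
def analog_chain(signal, noise_per_link, links):
    """Analog relay: each link adds its noise and nothing removes it."""
    for _ in range(links):
        signal += noise_per_link
    return signal

def digital_chain(bit, noise_per_link, links):
    """Digital relay: each node re-decides the bit, discarding the noise."""
    level = 1.0 if bit else 0.0
    for _ in range(links):
        received = level + noise_per_link          # noise picked up on this link
        level = 1.0 if received > 0.5 else 0.0     # node regenerates a clean level
    return level
```

With a noise offset of 0.2 per link, the analog value drifts further from the original on every hop, while the digital value leaves the tenth node exactly as it entered the first. In this model the regeneration only works while the per-link noise stays below the decision margin.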

The future survivable system had to be all-digital. At each connected node, it would be verified that the next node correctly received the digital signal. And, if not, the signal would be retransmitted. As one day the network would also have to carry voice as well as teletypewriter and computer data, all traffic would be in the same form – bits. All analog signals would first be digitized. To keep the delay times short the digital stream would be packaged into small message blocks each with a standardized format. Work on time division multiplexing of digital telephone signals was in an early state in Bell Labs. Their experimental equipment used a data rate of about 1.5 Megabits/sec. I then started with the premise that it would be feasible to use digital transmission, at least for short distances at 1.5 Megabits/sec. since the signals could be reconstructed at each node. A big problem blocking long distance digital transmission was transmission jitter buildup. Every mile a repeater amplifier chopped the tops off the wave and reconstituted a clean digital signal. But noise caused a cumulative shifting of the zero crossing points. This limited the span distance. I thought that a node terminating each link in a non-synchronous manner should effectively clean up the accumulated jitter. This would provide a de facto way of achieving long distances by such jitter error clean up. And I felt that if that didn’t work, then our fall back technology would then be the use of extremely cheap microwaves that could be feasible in this noise margin tolerable application.
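The verify-and-retransmit step at each node can be sketched as follows. This is a minimal Python illustration, not the design in the RAND Memoranda: `make_flaky_link`, `send_with_retransmit`, and the `max_tries` limit are hypothetical names and values chosen for the example.

```python
def make_flaky_link(failures):
    """Hypothetical link that garbles the first `failures` transmissions."""
    remaining = [failures]

    def link(block):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False      # next node did not confirm correct receipt
        return True           # next node confirmed correct receipt

    return link

def send_with_retransmit(block, link, max_tries=5):
    """Resend the same block until the next node confirms it.

    Returns the number of transmissions that were needed.
    """
    for attempt in range(1, max_tries + 1):
        if link(block):
            return attempt
    raise RuntimeError("no confirmation after max_tries; link may be down")
```

A link that garbles the first two transmissions delivers the block on the third try. A link that never confirms raises an error, which in a real network would be the cue to try another path.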

On Parallelism
Baran:

By this time it was beginning to become clear that the new system’s overall reliability would be significantly greater than the reliability of any one component. Hence I could think in terms of building the entire system out of cheap parts – something previously inconceivable in the all-analog world.

Hochfelder:

Because it is in parallel?

Baran:

Yes. In parallelism there is strength. Many parts must fail before no path could be found through the network. It took a redundancy level of only about three times the theoretical minimum to build a very tough network. If you didn’t have to worry about enemy attacks, then a redundancy level of about 1.5 would suffice for a very reliable network out of very inexpensive and unreliable parts. And, it would later be shown that it would be possible to reduce the cost of communication by almost two decimal orders of magnitude. The saving in part came from being able to design the long distance transmission systems as links of a meshed network with alternative paths without allowing huge fade margins where all the links are connected in tandem. With analog transmission every link of the network must be “gold plated” to achieve reliability.
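The arithmetic behind "in parallelism there is strength" can be sketched in Python under two simplifying assumptions that are not in the interview: links fail independently, and the alternative paths share no links. The functions below are illustrative and do not compute the redundancy level Baran refers to.

```python
def path_works(p_link, hops):
    """Probability that a single path of `hops` tandem links is intact."""
    return p_link ** hops

def any_path_works(p_link, hops, paths):
    """Probability that at least one of `paths` disjoint paths is intact."""
    p_path_fails = 1 - path_works(p_link, hops)
    return 1 - p_path_fails ** paths
```

With links that each work 90 percent of the time, a single five-hop path is intact about 59 percent of the time, while at least one of three such paths is intact about 93 percent of the time. The system ends up more reliable than any one of its paths.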

Hot-Potato Routing
Baran:

A key element of the concept was that it would be necessary to keep a “carbon copy” of each message block using computer technology, until the next station successfully received the message. The next challenge was to find a way for the packets to seek their own way through the network. This meant that some implicit path information must be contained as housekeeping data within the message block itself. The housekeeping includes data about the source and destination of the packet together with an implied time measurement such as the number of times the message block had been retransmitted. This small amount of information allowed creation of an algorithm that did a very effective job of routing dynamically changing traffic to always find the best instantaneous path through the network.
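The housekeeping data described above can be pictured as a small record carried with every message block. The Python sketch below is only an illustration: the field names and the `relay` helper are assumptions, and the actual block format is defined in the RAND Memoranda, not here.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MessageBlock:
    source: str           # station where the block entered the network
    destination: str      # station the block is addressed to
    handover_count: int   # times relayed so far; stands in for elapsed time
    payload: bytes

def relay(block: MessageBlock) -> MessageBlock:
    """Return the copy a node sends on, with the handover count increased by one."""
    return replace(block, handover_count=block.handover_count + 1)

# A block that has crossed two links carries a handover count of 2.
original = MessageBlock("Philadelphia", "Chicago", 0, b"hello")
after_two_hops = relay(relay(original))
```

Because `relay` returns a new record, the sending node still holds its unchanged copy, which loosely mirrors keeping a "carbon copy" until the next station confirms receipt.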

Basic Concepts Underlying Packet Switching, 1960
Baran:

I had earlier discovered that very robust networks could be built with only modest increases in redundancy over that required for the minimum connectivity. And, then it dawned on me that the process of resending defective or missing packets would then allow the creation of an essentially error-free network. Since it didn’t make any difference whether a failure was due to enemy attacks or unreliable components, it would be possible to build systems where the system reliability is far greater than the reliability of any of its parts. And, even with inexpensive components a super reliable network would result.

Another interesting characteristic was that the network learning property would allow users to move around the network, with that person’s address following them. This would allow separating the physical address from the logical address throughout the network, a fundamental characteristic of the Internet.

Another thing that I learned was that in building self-learning systems it is equally important to forget, as it is to learn. For example, when you destroy parts of a network, the network must quickly adapt to routing traffic entirely differently. I found that using two different time constants, one for learning and the other for forgetting, provided the balanced properties desired. And, I found it helpful to view the network as an organism, as it had many of the characteristics of an organism as it responds to overloads, and sub-system failures.
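One way to picture two separate time constants is a stored estimate that moves quickly toward fresh observations and drifts slowly toward "unknown" when no traffic arrives. The Python sketch below is an assumption-laden illustration of that general idea. The update rules, the rates, and the `unknown` value are all hypothetical and are not the scheme from the RAND simulations.

```python
def learn(estimate, observed, learn_rate=0.5):
    """Move the stored estimate part of the way toward a new observation."""
    return estimate + learn_rate * (observed - estimate)

def forget(estimate, unknown=64.0, forget_rate=0.1):
    """With no fresh traffic, let the estimate drift toward 'unknown'."""
    return estimate + forget_rate * (unknown - estimate)
```

Starting from an estimate of 10 handovers, one observation of 4 pulls the estimate to 7. If that route then goes silent, each `forget` step pushes the estimate back up, so a destroyed path gradually loses its preferred status.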

Dynamic Routing, 1961
Baran:

I first thought that it might be possible to build a system capable of smart routing through the network after reading about Shannon’s mouse through a maze mechanism. But instead of remembering only a single path, I wanted a scheme that not only remembered, but also knew when to forget, if the network was chopped up. It is interesting to note that the early simulation showed that after the hypothetical network was 50% instantly destroyed, that the surviving pieces of the network reconstituted themselves within a half a second of real world time and again worked efficiently in handling the packet flow.

Hochfelder:

How would the packets know how to do that?

Baran:

Through the use of a very simple routing algorithm. Imagine that you are a hypothetical postman and mail comes in from different directions, North, South, East and West. You, the postman, would look at the cancellation dates on the mail from each direction. If, for example, our postman was in Chicago, mail from Philadelphia would tend to arrive from the East with the latest cancellation date. If the mail from Philadelphia had arrived from the North, South, or West it would arrive with an earlier cancellation date because it would have had to take a longer route (statistically). Thus, the preferred direction to send traffic to Philadelphia would be out over the channel connected from the East as it had the latest cancellation date. Just by looking at the time stamps on traffic flowing through the post office you get all the information you need to route traffic efficiently.

Each hypothetical post office would be built the same way. And, each would have a local table that recorded the statistics of traffic flowing through the post office. With packets, it was easier to increment a count in a field of the packet than to time stamp. So, that is what I did. It’s simple and self-learning. And when this “handover number” got too big, then we knew that the end point was unreachable and dropped that packet so that it didn’t clutter the network.
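The postman analogy and the handover number can be sketched as a small table kept at each node. This Python sketch is a simplification under stated assumptions: it remembers only the lowest handover count seen per source instead of full traffic statistics, and the class name, method names, and `max_handover` limit are hypothetical.

```python
class PostOffice:
    """A node that learns routes by watching the traffic passing through it."""

    def __init__(self, max_handover=15):
        self.max_handover = max_handover
        self.best = {}   # source station -> (lowest handover count, link it arrived on)

    def observe(self, source, handover_count, arrived_on):
        """Record the link on which traffic from `source` arrives with the fewest handovers."""
        known = self.best.get(source)
        if known is None or handover_count < known[0]:
            self.best[source] = (handover_count, arrived_on)

    def next_link(self, destination):
        """Send traffic for `destination` out the link its own traffic arrives on fastest."""
        entry = self.best.get(destination)
        return entry[1] if entry else None

    def should_drop(self, handover_count):
        """Treat a block as undeliverable once its handover number is too big."""
        return handover_count > self.max_handover

# Chicago sees Philadelphia traffic arrive from the East after 3 handovers
# and from the North after 7, so it prefers the East link for replies.
chicago = PostOffice()
chicago.observe("Philadelphia", 7, "North")
chicago.observe("Philadelphia", 3, "East")
```

Here `chicago.next_link("Philadelphia")` returns "East", matching the postman's choice in the analogy. No node needs a map of the whole network, only what it has seen go by.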

Hochfelder:

Always searching for the shortest path.

Baran:

Yes, that is the scheme. We needed a learning constant and a forgetting constant as no single measurement could be completely trusted. The forgetting constant also allows the network to respond to rapidly varying loads from different places. If the instantaneous load exceeded the capacity of the links, then the traffic is automatically spread through more of the network. I called this doctrine, “Hot Potato Routing.” These days this approach is called “Deflection Routing.” By the way, the routing doctrine used in the Internet differs from the original Hot Potato approach, and is the result of a large number of improvements over the years.
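The core of the hot-potato idea, passing a block onward immediately over the best link that happens to be free, can be sketched in a few lines of Python. The function name and the representation of links as a ranked list are assumptions for illustration.

```python
def pick_link(ranked_links, busy):
    """Return the most preferred outgoing link that is free right now.

    ranked_links is ordered from best to worst for the block's destination.
    busy is the set of links currently in use.
    """
    for link in ranked_links:
        if link not in busy:
            return link
    return None   # every link is busy; the caller must hold or drop the block
```

If the preferred "East" link is busy, a block ranked East, North, South leaves on "North" instead of waiting. This is how excess load spreads itself over more of the network.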

Basic Properties of Packet Switching, 1960 - 62
Baran:

The term “packet switching” was first used by Donald Davies of the National Physical Laboratory in England who independently came up with the same general concept in November 1965.

Essentially all the basic concepts of today’s packet switching can be found described either in the 1962 paper or in the August 1964 RAND Memoranda in which such key concepts as the virtual circuit are described in detail.

The concept of the “virtual circuit” is that the links and nodes of the system are all free, except during those instances when actually sending packets. This allows a huge saving over circuit switching, because 99 percent of the time nothing is being sent so the same facilities can be shared with other potential users.
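The saving from sharing idle facilities can be illustrated with a short calculation. The Python sketch below assumes users send independently of one another, and the figures of 100 users and 5 shared channels are made up for the example. Only the 99 percent idle figure comes from the text.

```python
from math import comb

def prob_more_than(k, users, p_active):
    """Probability that more than k of `users` independent users send at once."""
    at_most_k = sum(
        comb(users, i) * p_active**i * (1 - p_active) ** (users - i)
        for i in range(k + 1)
    )
    return 1 - at_most_k

# 100 users, each sending 1 percent of the time, sharing 5 channels.
overflow = prob_more_than(5, 100, 0.01)
```

Under these assumptions the chance that more than 5 of the 100 users want to send at the same instant is well under one percent, so a handful of shared channels can stand in for 100 dedicated circuits. Real traffic is burstier than this model, so the figure is indicative only.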

Then there is the concept of “flow control”, which is the mechanism to automatically prevent any node from overloading. All the basic concepts were worked out in engineering detail in a series of RAND Memoranda (between 10 and 14 volumes, depending on how they are counted). What resulted was a realization that the system would be extremely robust, with the end to end error rate essentially zero, even if built with inexpensive components. And, it would be very efficient in traffic handling in comparison to the circuit-switching alternative.

Economic Payoff Potential Versus Perceived Risks
Baran:

This combination of economy and capability suggested that if built and maintained at a cost of $60,000,000 (1964 Dollars) that it could handle the long distance telecommunications within the Department of Defense that was costing the taxpayer about $2 billion a year.

At the time, the saving in cost claimed was so great that it made the story intuitively unbelievable. It violated the common sense instincts of the listener who would say in effect that: “If it were ever possible to achieve such efficiencies the phone company (AT&T) would have done it already.”

Another understandable objection was “This couldn’t possibly work. It is too complicated.” This perception was based on the common view, correct at the time, that computers were big, taking up large glass walled rooms, and were notoriously unreliable. When I said that each switching node could be a shoe sized box with the required computer’s capabilities, many didn’t believe it. (I had planned doing everything in miniaturized hardware in lieu of using off the shelf minicomputers.) So I had the burden of proof, to define the small box down to the circuit level to show that it could indeed be done.

Another issue was the separation of the transmission network from the analog to digital conversion points. This is described in detail in Vol. 8 of the ODC series. This RAND Memorandum describes in detail how users are connected to the switching network. The separate unit that is described connects up to 1024 users and converts their analog signals into digital signals. This included voice, teletypewriters, computer modems, etc. One side of the box would connect to the existing analog telephones, while the other side which was digital would connect to the switching network, preferably at multiple points to eliminate a single point of failure.

This constant increase in desire for engineering details caused so much paper to be written at the time cluttering up the literature. On a positive note it left us with a very detailed description of packet switching proposed at that time. This record has been helpful in straightening out some of the later misrepresentations of who did what and when as found in the popular press’s view of history.

Opposition and Detailed Definition Memoranda, 1961+
Baran:

The enthusiasm that this early plan encountered was mixed. I obtained excellent support from RAND (after an early cool and cautious initial start). Others, particularly those from AT&T (the telephone monopoly at the time) objected violently. Many of the objections were at the detail level, so the burden of proof was then on me to provide proposed implementation descriptions at an ever finer level of detail. Time after time I would return with increasingly detailed briefing charts and reports. But, each time I would hear a mantra, “It won’t work because of ____”. “It won’t work because of (some new objection).” I gave the briefings in many places: to various government agencies, to research laboratories, to commercial companies, but primarily to the military establishment. I gave briefings at least 35 times. It was hard for a visitor with an interest in communications to visit RAND without being subject to a presentation. My chief purpose in giving these presentations so broadly was that I was looking for reasons that it might not work. I wanted to be absolutely sure that I hadn’t overlooked anything that could affect workability. After each encounter where I could not answer the questions quantitatively, I would go back and study each of the issues raised and fill in the missing details. This was an iterative process constituting a wire brush treatment of a wild set of concepts.

In fairness, much of the early criticism was valid. Of course the burden of proof belongs to the proponent. Among the many positive outcomes of the exercise was that, 1) I was building a better understanding of the details of such new systems, 2) I was building a growing degree of confidence in the notions, and 3) I had accumulated a growing pile of paper including simulation data to support the idea that the system would be self learning and stable.

Publication, 1964
Baran:

Most of the work was done in the period 1960-62. As you can imagine, old era analog transmission engineers were unable to understand what was being contemplated in detail. And, not understanding, they were negative and intuitively believed that it could not possibly work. However, I did build up a set of great supporters as I went along. My most loyal supporters at RAND included Keith Uncapher, my boss at the time, and Paul Armer and Willis Ware, co-heads of the Computer Science Department. RAND provided a remarkable degree of freedom to do this controversial work, and supported me in external disagreements. By 1963 I felt that I had carried this work about as far as appropriate to RAND (which some jokingly say stands for “Research And No Development.”) And, as I felt that I had completed the bulk of my work, I began wrapping up the technical development phase in 1964, when I published the set of memoranda, which were primarily written on airplanes in the 1960 to 1962 era. There were some revisions in 1963, and the RAND Memoranda came out in 1964. I continued to work on some of the non-technical issues and gave tutorials in many places including summer courses at the University of Michigan in 1965 and 1966.

In May 1964 I published a paper in the IEEE Communications Transactions which summarizes the work and provides a pointer to each of a dozen volumes of RAND Memoranda for the serious reader who wanted to read the backup material. Essentially all this work was unclassified in the belief that we would all be better off if the fate of the world relied on more robust communications networks. Only two of the first twelve Memoranda were classified. One dealt with cryptography and the other with weak spots that were discovered and the patches to counter the weak spots. A thirteenth classified volume was written in 1965 by Rose Hirshfield on real world geographical layout of the network. And there was a 14th describing a secure telephone that could be used with the system and had possible applications outside of the network and so wasn’t included in the number series. This was co-authored with Dr. Rein Turn.

Baran:

Getting a new idea out to a larger audience is always challenging. Perhaps more so if it is a departure from the classical way of doing things. The IEEE Spectrum which is sent to all IEEE members picked up the article in a “Scanning the Transactions”. I looked to this short summary as being a pointer to the IEEE Transaction article, for those that didn’t normally read the Communications Society Transactions. This article in turn pointed to the RAND Memoranda, readily available either from RAND or its depositories around the world. In those days RAND Publications were mailed free to anyone who requested a copy.

But no matter how hard one tries, it seems that it is impossible to get the word out to everyone. This is not a novel problem. And, it contributes to duplicative research, made more common by the reluctance by some to take the time to review the literature before proceeding with their own research. Some even regard reviewing the literature as a waste of time. I was surprised many years later to find a few key people in closely related research say that they were totally unaware of this work until many years later. I recall describing the system in detailed discussions, only to find out at a later date that the listener had completely forgotten what was said, and who didn’t receive his Epiphany until much later and ostensibly through a different channel.

Conceptual Gap Between Analog and Digital Thinking
Baran:

The fundamental hurdle in acceptance was whether the listener had digital experience or knew only analog transmission techniques. The older telephone engineers had problems with the concept of packet switching. On one of my several trips to AT&T Headquarters at 195 Broadway in New York City I tried to explain packet switching to a senior telephone company executive. In mid sentence he interrupted me, “Wait a minute, son. Are you trying to tell me that you open the switch before the signal is transmitted all the way across the country?” I said, “Yes sir, that’s right.” The old analog engineer looked stunned. He looked at his colleagues in the room while his eyeballs rolled up sending a signal of his utter disbelief. He paused for a while, and then said, “Son, here’s how a telephone works….” And then he went on with a patronizing explanation of how a carbon button telephone worked. It was a conceptual impasse.

On the other hand, the computer people over at Bell Labs in New Jersey did understand the concept. That was insufficient. When I told the AT&T Headquarters folks that their own research people at Bell Labs had no trouble understanding and didn’t have the same objections as the Headquarters people, their response was, “Well, Bell Labs is made up of impractical research people who don’t understand real world communication.”

Willis Ware of RAND tried to build a bridge early in the process. He knew Dr. Edward David, Executive Director of Bell Labs, and he asked for help. Ed set up a meeting at his house with the chief engineer of AT&T and myself to try to overcome the conceptual hurdle. At this meeting I would describe something in language familiar to those that knew digital technology. Ed David would translate what I was saying into language more familiar in the analog telephone world (he practically used Western Electric part numbers) to our AT&T friend, who responded in a like manner. Ed David would translate it back into computer nerd language.

I would encounter this cultural impasse time after time between those who were familiar only with the then state of the art of analog communications – highly centralized and with highly limited intelligence circuit switching – and myself talking about all-digital transmission, smart switches and self-learning networks. But, all through the process of erosion, more and more people came to understand what was being said. The base of support strengthened in RAND, the Air Force, academia, government and some industrial companies -- and parts of Bell Labs. But I could never penetrate the objections of AT&T Headquarters, which at that time had a complete monopoly on telecommunications. It would have been the perfect organization to build the network. Our initial objective was to have the Air Force contract the system out to AT&T to build the network but unfortunately AT&T was dead set against the idea.

Hochfelder:

Were there financial objections as well?

AT&T Headquarters Lack of Receptivity
Baran:

Possibly, but not frontally. They didn’t want to do it for a number of reasons and dug their heels in looking for publicly acceptable reasons. For example, AT&T asserted that there were not enough paths through the country to provide for the number of routes that I had proposed for the National packet based network but refused to show us their route maps. (I didn’t tell them that someone at RAND had already acquired a back door copy of the AT&T maps containing the physical routes across the US since AT&T refused to voluntarily provide these maps that were needed to model collateral damage to the telephone plant by attacks at the US Strategic Forces.) I told AT&T that I thought that they were in error and asked them to please check their maps more carefully. After a month’s delay in which they never directly answered the question, one of their people responded by grumbling, “It isn’t going to work, and even if it did, damned if we are going to put anybody in competition to ourselves.”

I suspect the major reason for the difficulty in accommodating packet switching at the digital transmission level was that it would violate a basic ground rule of the Bell System -- everything added to the telephone system had to work with all previous equipment presently installed. Everything had to fit into the existing plan. Nothing totally different could be allowed except as a self contained unit that fit into the overall system. The concept of long distance all-digital communications links connecting small computers serving as switches represents a totally different technology and paradigm, and was too hard for them to swallow. I can understand and respect that reason, but can also appreciate the later necessity for divestiture. Competition better serves the public interest in the longer term than a monopoly, no matter how competent and benevolent that monopoly might be. There is always the danger that the monopoly can be in error and there is no way to correct this.

On Bell Labs Response
Baran:

While the folks at AT&T Headquarters violently opposed the technology, there were digitally competent people in Bell Labs who appreciated what it was all about. One of the mysteries that I have never figured out is why, after packet switching was shown to be feasible in practice and many papers were published by others, it took so many years before papers in packet switching would ever emerge from Bell Labs.

The first paper on the subject that I recall being published in the Bell System Technical Journal was by Dr. John Pierce. This paper described a packet network made up of overlapping Ballantine rings. It was a brilliant idea and his architecture is used in today’s ATM systems.

Hochfelder:

What is a Ballantine ring?

Baran:

Have you ever seen the Ballantine Beer’s logo? It is made up of three overlapping rings. Since a signal can be sent in both directions on a loop, no single loop cut need stop communications from flowing from the other direction. Because the signal can go both ways any single cut can be tolerated without loss allowing time for repair. It is a powerful idea.
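The single-cut tolerance of a loop can be checked with a small Python sketch. It models one ring only, not the overlapping rings of the Pierce paper, and the node numbering and function name are assumptions made for the example.

```python
def ring_reachable(n, cut_links, src, dst):
    """Check whether dst can be reached from src on an n-node ring.

    Link i joins node i and node (i + 1) % n. cut_links is the set of
    link numbers that have been severed.
    """
    def walk(step):
        node = src
        while node != dst:
            # Going forward uses link `node`; going backward uses link `node - 1`.
            link = node if step == 1 else (node - 1) % n
            if link in cut_links:
                return False
            node = (node + step) % n
        return True

    return walk(1) or walk(-1)   # try one direction, then the other
```

On a six-node ring with link 2 cut, node 0 still reaches node 4 by going the other way around. Cutting links 2 and 5 together isolates the two nodes, which is why a single cut is tolerable but repair time matters.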

The RAND Formal Recommendation to the Air Force, 1965
Baran:

In 1965 the RAND Corporation issued a formal Recommendation to the Air Force (which they do very rarely) for the Air Force to proceed to build the proposed network. The Air Force then asked the MITRE Corporation, a not-for-profit organization that worked for the government, to set up a study and review committee. The Committee after independent investigation concluded that the design was valid and that a viable system could be built and that the Air Force should immediately proceed with implementation.

As the project was about to launch, the Department of Defense said that as this system was to be a National communications system, it would in accordance with the Defense Reorganization Act of 1949 (finally being implemented in 1965) fall into the charter of the Defense Communications Agency.

The choice of DCA would have been fine years later when DCA was more appropriately staffed. But at that time the DCA was a shell organization staffed by people who lacked strength in digital understanding. I had learned through the many briefings I had given to various audiences that there was an impenetrable barrier to understanding packet switching by those who lacked digital experience. At RAND I was essentially free to work on anything that I felt to be of most importance to National Security. This allowed me for example to serve on various ad hoc DDR&E (Department of Defense Research & Engineering) committees. I sometimes consulted with Frank Eldridge in the Comptroller’s Office of the Department of Defense, helping him to review items in the command and control budgets submitted by the services. Frank Eldridge was an old RAND colleague initially responsible for the project on the protection of command and control. He was among the strongest supporters for the work that I was doing on Distributed Communications. He had gone over to the Pentagon working with McNamara’s “whiz kids.” Frank Eldridge had undergone many of the same battles with AT&T and understood the issues of the RAND thence Air Force proposal.

Approval for the money for the Defense Communication Agency (DCA) to undertake the RAND distributed communications system development was under Frank Eldridge’s responsibility. Both Frank and I agreed that DCA lacked the people at that time who could successfully undertake this project and would likely screw up this program. An expensive failure would make it difficult for a more competent agency to later undertake this project. I recommended that this program not be funded at this time and the program be quietly shelved, waiting for a more auspicious opportunity to resurrect it.

The Cold War at this time had cooled from loud threats of thermonuclear warheads to the lower level of surrogate small wars. And, we were bogged down in Viet Nam.

source:
https://ethw.org/Oral-History:Paul_Baran
____________________________________


Thursday, October 10, 2024
