Van Jacobson Denies Averting Internet Meltdown in 1980s
All Van Jacobson wanted to do was upload a few documents to the internet. Unfortunately, it was 1985.
The internet wasn’t yet called the internet. It was called the ARPAnet, and it had only recently been upgraded to the TCP/IP protocol that still underpins the internet today. Jacobson was teaching a computer science course at the University of California at Berkeley, and all he wanted to do was upload some class materials to Berkeley’s computers so his students could read them. But the internet wasn’t really working. The network throughput was about a bit per second. In other words, it was slow as molasses.
“I was getting a bit per second between two network gateways that were literally in the same room,” Jacobson remembers.
For the next six months, Jacobson, together with Mike Karels, who oversaw Berkeley’s BSD UNIX operating system, worked to solve this internet traffic jam, and the result was an update to TCP that is widely hailed as averting an internet meltdown in the late 80s and early 90s. The soft-spoken Jacobson doesn’t see it that way, but his pioneering work with the internet’s underlying protocols recently earned him a spot in the inaugural class of the Internet Society’s (ISOC) Internet Hall of Fame, alongside such names as Vint Cerf, Steve Crocker, and Tim Berners-Lee.
In 1985, Berkeley ran one of the IMPs, or interface message processors, that served as the main nodes on the ARPAnet, a network funded by the U.S. Department of Defense that connected various research institutions and government organizations across the country. The network was designed so that any node could send data at any time, but for some reason, Berkeley’s IMP was only sending data every twelve seconds.
As it turns out, the IMP was waiting for other nodes to complete their transmissions before sending its data. The ARPAnet was meant to be a mesh network, where all nodes can operate on their own, but it was behaving like a token ring network, where each node can only send when it receives a master token.
“Our IMP would just keep accumulating data and accumulating data for about twelve seconds and then it would dump it,” says Jacobson. “It was like the old token ring networks when you couldn’t say anything until you got the token. But the ARPAnet wasn’t built to do that. There was no global protocol like that.”
The trouble was that if one node was talking to another, a third node couldn’t break into the conversation. It had to wait and send its conversation on the tail end of the other. That, Jacobson says, made the entire network organize itself like a token ring network even though it wasn’t a token ring network. Or, in more prosaic terms, the traffic backed up like cars at an intersection.
“If you have to wait for all the cross traffic at an intersection, then a long line of cars builds up behind you,” Jacobson says. “Then, if they have to wait for everything to go through, a bigger line builds up on the other side, and you end up with everyone getting stuck in long lines, not just at that intersection, but at intersections down the street.”
The ARPAnet had been up and running since the late 1960s, but it had only just moved to TCP/IP, and according to Jacobson, this was one of the first times that researchers realized that the network’s reliance on large-scale self-organization had unintended consequences.
Solving the problem was doubly difficult because in those days, there wasn’t an easy way to analyze the network. “You got log files when things failed,” Jacobson says, “but that doesn’t really tell you what’s actually happening on the wire.”
To find out what was actually happening, Jacobson tracked down some other Berkeley researchers who had built a network debugger interface on a machine from Sun Microsystems, but he didn’t have the money for his own Sun machine. So he secured a consulting job that would earn him the money he needed, and with his Sun 350 plugged into the network, he developed a diagnostic utility that would let him actually print out information about the packets going across the network.
What he found was that the nodes not only had trouble establishing connections but also responded incredibly poorly to this problem. If a node sent a dozen packets in trying to establish a connection and one didn’t get through, it would resend all twelve. “It was a combination of really bad startup behavior and poor recovery code. We were just wasting all the bandwidth.”
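The cost of that recovery behavior can be put in rough numbers. The sketch below is only an illustration of the arithmetic, not the code that ran on the ARPAnet: the function names are hypothetical, and it assumes that every retransmitted packet gets through on the second try.

```python
def sent_when_resending_all(burst_size, lost):
    """Packets put on the wire if any loss triggers a resend of the whole burst."""
    # Assumption: the second copy of the burst arrives intact.
    return burst_size + (burst_size if lost else 0)


def sent_when_resending_lost(burst_size, lost):
    """Packets put on the wire if only the missing packets are sent again."""
    # Assumption: each retransmitted packet arrives on its second try.
    return burst_size + len(lost)


burst = 12   # the dozen packets in the article's example
lost = {7}   # hypothetical: one packet of the dozen is dropped

print(sent_when_resending_all(burst, lost))   # 24
print(sent_when_resending_lost(burst, lost))  # 13
```

Under these assumptions, one dropped packet costs eleven needless transmissions, which is the wasted bandwidth Jacobson describes.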
The solution was essentially to slow down the startup process, i.e., not send so many packets so quickly. “The problem was that we had no clock at startup. We had to build a clock,” Jacobson says. “You couldn’t just send one packet and wait. But we had to figure out what you could do. Could you send two and wait? We needed a slow start that let the connection get going.”
Over the next six months, Jacobson and Karels built what would officially be called Slow Start, a change to TCP/IP that added the sort of clock Jacobson talks about. It was little more than three lines of code. Soon, this change was added to BSD UNIX, the Berkeley-developed operating system that had become the de facto standard for the ARPAnet, and after a few tweaks, the network behaved as it should.
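The idea behind Slow Start, as it is commonly described, is that a sender begins with a window of one packet and adds one more for every acknowledgment that comes back, so the returning acknowledgments serve as the clock. The sketch below is a simplified simulation of that rule and not the BSD code: the function name and the parameters `ssthresh` and `initial_cwnd` are chosen for illustration, every packet is assumed to be acknowledged, and the window simply stops growing at the threshold, where a real TCP would switch to slower, linear growth.

```python
def slow_start_windows(ssthresh, round_trips, initial_cwnd=1):
    """Return the send window (in packets) at the start of each round trip."""
    cwnd = initial_cwnd
    windows = []
    for _ in range(round_trips):
        windows.append(cwnd)
        acks = cwnd  # assumption: every packet in the window is acknowledged
        for _ in range(acks):
            if cwnd < ssthresh:
                cwnd += 1  # the core rule: one more packet per acknowledgment
    return windows


print(slow_start_windows(ssthresh=16, round_trips=6))  # [1, 2, 4, 8, 16, 16]
```

Despite the name, the window doubles every round trip. The start is slow only in comparison with sending a full burst the moment the connection opens.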
Twenty-five years later, the digerati credit Jacobson with averting an ARPAnet meltdown that would have stunted the growth of the modern internet, or even destroyed it altogether. But Jacobson doesn’t see it that way.
“TCP/IP was really well crafted,” he says. “Most of what Mike and I did was dealing with performance problems. But they weren’t keeping the net from running, and they weren’t going to cause total disaster. We were just trying to make something that was working work better.”
In the 90s, as the internet took off, Jacobson left Berkeley for networking giant Cisco, and in August 2006, he joined PARC, the Xerox outfit that grew out of the company’s old Palo Alto Research Center. There, he’s still working to improve the internet. But this time, he wants to build an entirely new networking model.
“Tim Berners-Lee sparked a revolution at the user level with the web, but we also need a revolution at the communications level. At the user level, the web has changed the way content moves, but at the lower level, it’s still the 1890s. We still think about building a wire between two points and pouring bits into it,” Jacobson says.
“We want to build a low-level communications model that’s much closer to what Tim has shown us on the web.”