A number of different schemes have been proposed or are in use. Each has different technological requirements and capabilities, which lead to different implications for finance, marketing, and corporate power. Each increase in interactivity adds to the system's complexity and cost.
The bottom level of interactivity merely injects a small level of choice into the traditional "broadcast" scheme: the cable to your house carries a constant stream of small packages of information, such as games, stock reports, weather, or news. You choose one of these packages (maybe you would like to explore a virtual-reality "world" that you have seen on the menu), and the next time it comes by, the cable box snags it out of the "bit-stream." The box also logs that fact, and every month the cable system queries the set-top box, adds up the cost of everything you have used, and sends you a bill.
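The snag-and-log behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `SetTopBox`, the function `monthly_bill`, the package identifiers, and the prices are all hypothetical, and a real cable box would work on a continuous bit-stream rather than a Python list.

```python
class SetTopBox:
    """Hypothetical set-top box that filters a repeating broadcast stream."""

    def __init__(self):
        self.wanted = set()    # packages the viewer has chosen from the menu
        self.usage_log = []    # record of what was actually grabbed

    def choose(self, package_id):
        self.wanted.add(package_id)

    def watch_stream(self, packages):
        """Scan (package_id, payload) pairs and grab each wanted package once."""
        grabbed = []
        for package_id, payload in packages:
            if package_id in self.wanted:
                grabbed.append(payload)
                self.usage_log.append(package_id)   # log the use for billing
                self.wanted.discard(package_id)     # ignore later repeats
        return grabbed


def monthly_bill(box, price_cents):
    """Cable system's monthly query: total the logged usage, then reset the log."""
    total = sum(price_cents[package_id] for package_id in box.usage_log)
    box.usage_log.clear()
    return total
```

Note that nothing travels upstream until the monthly query: the choice is made entirely inside the box, which is why this level of interactivity needs no switch and almost no back channel.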
Doing movies this way is called "near video on demand" (NVOD): the movies come in a continuous stream, starting, say, every five minutes. You can watch one whenever you want to, with only a few minutes wait. You get charged only for what you watch.
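The arithmetic behind NVOD is simple enough to sketch. The function names below are hypothetical, and the sketch assumes that every copy of a movie runs start to finish on its own stream and that start times fall on exact multiples of the stagger interval.

```python
import math


def streams_needed(movie_minutes, stagger_minutes):
    """Number of simultaneous copies of one movie needed to keep the stagger going."""
    return math.ceil(movie_minutes / stagger_minutes)


def wait_minutes(arrival_minute, stagger_minutes):
    """Minutes a viewer waits for the next start after tuning in at arrival_minute."""
    return (-arrival_minute) % stagger_minutes
```

Under these assumptions, a two-hour movie starting every five minutes ties up `streams_needed(120, 5)`, or 24, streams for a single title, while a viewer who tunes in at minute 7 waits `wait_minutes(7, 5)`, or 3, minutes.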
That obviously only works with a limited number of movies, not with a huge library. True "video on demand" (VOD) calls for a different architecture: you click on a choice on your television remote control, and the set-top box sends a query into the system through a "back channel." The system routes the request to a video server, where the video is stored in a compressed format.
Depending on the size of the program (from a two-hour movie to a 30-second description of a restaurant) and its popularity, it might be stored on any of a number of media: CD-ROM, DAT (digital audio tape), laserdisc, videotapes in a jukebox-like machine, a large array of computer hard disks, or computer chips (DRAM, or dynamic random access memory). The server finds the video you have chosen and sends it back through the cable system to the set-top box, which decompresses it and plays it, usually within a second of the moment you clicked on the remote.
This requires a lot more memory and processing capability in the set-top box, and greater capacity in the channel from the consumer back to the system. But the big difference is that now the system must be "switched": it has to be able to send the video only to the particular box that requested it. Most systems use the new high-capacity "asynchronous transfer mode" (ATM) switches. Once you put in a switch, you have crossed the divide between a "tree and branch" broadcast system (like a traditional cable system) and a "network" (like a traditional telephone system).
But suppose you want to do more than choose something and watch it? Suppose you want to let the customer "play along with the experts," calling plays in a football game, trying to beat the contestants on a game show, or voting on a political debate? Suppose you want the computer to interact with you in your "virtual world"? Now the central computer and video server have to react much more quickly to a much wider range of demands from the customer. The architecture of the system remains much the same, but the equipment must have much greater speed and capacity.
Finally, what if the customers want to interact with each other? What if you would like the other people in your virtual world to actually be other people, exploring the world from their own homes? Then the amount of information shuttling back and forth through the system rapidly multiplies. Every player has to get the information about every other player's movements. At this level of complexity the system becomes, as TCI chairman John Malone said in Wired, "more complex than anything that has ever been designed, probably by an order of magnitude. . . . The whole system for the space program is small compared to this."
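To see how quickly that traffic multiplies, consider a rough count. The sketch below assumes the simplest possible design, in which each player's movement is delivered as a separate message to every other player on each update; the function name is hypothetical, and a real system might batch or filter updates to do better.

```python
def updates_per_tick(players):
    """Messages delivered per update when every player hears about every other player."""
    return players * (players - 1)
```

Under that assumption, two players generate 2 messages per update, ten players generate 90, and a hundred players generate 9,900: the load grows roughly with the square of the number of players, not in step with it.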
The big question is: where do you put the intelligence, the computing power, the memory? Do you put it at the edges of your system (on the customers' TVs)? Or do you put it in the middle (at the cable company)? Suppose your virtual world requires a mountain. We can send the image of a mountain from a central computer and display it on your television. But images are large, so if we do that we need very big "pipes," fibers, cables, and switches of high capacity and speed. On the other hand, we could give your set-top box the ability to draw mountains. Then all we have to send is the instruction: "Put a mountain here." The "pipes" can be a lot smaller. But your set-top box has to have a lot more memory (for storing mountain-making instructions) and computing power (for turning those instructions into real mountains), so it has to be a lot more expensive.
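A back-of-the-envelope comparison makes the mountain trade-off concrete. Everything specific here is an assumption made for illustration: the 640 by 480 frame, the three bytes per pixel, the absence of compression, and the made-up text command with its screen coordinates.

```python
def raw_image_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed picture sent whole from the center."""
    return width * height * bytes_per_pixel


# Hypothetical instruction for a box that already knows how to draw mountains.
INSTRUCTION = b"PUT MOUNTAIN 320 240"

image_size = raw_image_bytes(640, 480, 3)    # picture drawn at the center
instruction_size = len(INSTRUCTION)          # command drawn at the edge
ratio = image_size // instruction_size       # how much smaller the "pipe" can be
```

With these numbers the finished picture takes 921,600 bytes and the instruction takes 20. Compression would narrow the gap considerably, but the direction of the trade stays the same: the smaller message only works if the box on the receiving end has the memory and computing power to turn it into a mountain.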
Traditional televisions are not very smart. They do not process or store the signal. All the intelligence is in the center of the system. Most of the intelligence in computer networks, on the other hand, is on the edges, in the computers. The central "nodes" are just routers and servers, reading addresses on packets of information and sending them on.
The implications of a choice of architecture are not merely technical, but financial, political, and even cultural, and we can see these implications in working models that already exist. Put the intelligence and power at the center, and you end up with traditional broadcast networks: powerful and wealthy corporations or national organizations such as CBS or the BBC setting the agenda for much of the global discussion. Put it at the periphery, and you get something that looks much more like the Internet: yeasty, unpredictable, even a little wild, with minimal rules and no clear center.