For too long I've wanted to connect small, inexpensive computers with a network protocol. In ignorance of UART, I thought about the problem in the abstract and in particular cases. The MVP [Minimum Viable Product] would be something like MIDI for process control. And, actually, why is MIDI not used for process control? One or more numbered devices would receive commands, measure variables and switch relays. It could be run in an open loop or, preferably, in a closed loop. At this point, it should be obvious to anyone that I should have investigated MIDI in more detail. It should also be obvious that MIDI lacks something.
A good start to this process was to devise a protocol with three-byte messages: device, command, payload. The intent was to send these triples around a ring. For every triple sent, it is also possible to receive a triple. Meanwhile, specific devices fill a (possibly dummy) payload with an acknowledgement or a measurement. Further, it is possible to decrement the device number at each hop. This eliminates all of the tedious device numbering and associated device contention. Also, for a micro-controller, fewer pins for jumper switches means more pins for I/O.
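For concreteness, a minimal sketch of that scheme in C; the command value and the placeholder measurement are arbitrary, and real devices would be shuffling these triples over a wire rather than calling a function:

```c
/* The three byte triple and the per-hop rule.  Everything here is a
 * hypothetical illustration of the scheme, not a finished protocol. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint8_t device;   /* hops remaining; zero means "this node" */
    uint8_t command;  /* e.g. read a sensor or switch a relay */
    uint8_t payload;  /* argument on the way out, reading on the way back */
} triple_t;

/* Applied by each node as a triple passes around the ring. */
static triple_t hop(triple_t t)
{
    if (t.device == 0)
        t.payload = 0x55;   /* placeholder acknowledgement or measurement */
    else
        t.device--;         /* automatic numbering: no address jumpers */
    return t;
}

int main(void)
{
    triple_t t = { 2, 0x01, 0x00 };  /* third node downstream, command 1 */
    t = hop(hop(hop(t)));            /* three hops around the ring */
    printf("device=%u payload=0x%02x\n", t.device, t.payload);
    return 0;
}
```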
The astute will notice this scheme is unworkable because there is no mechanism to prevent device, command or payload from being jumbled. The automatic numbering is merely an extra mechanism for packet mangling. It is possible to patch this scheme with reserved values and escape mechanisms but it becomes very inefficient and nothing like the original intention. How do most people solve this problem? Well, they don't. A ridiculous number of protocols require low error rates or no transmission errors at all. If you've even *considered* bit errors then you're ahead of some people. If you're lucky, a protocol might have a 16 bit checksum. It might even be a good one. However, for a high volume of data or widespread deployment this is demonstrably inadequate.
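To put a rough number on that, assume a busy link (or a fleet of devices) seeing a billion frames a day with one frame in ten thousand corrupted; the traffic figures are invented, the arithmetic is not:

```c
/* Back-of-envelope: how often does a 16 bit checksum wave a corrupted
 * frame through?  At best, 1 in 65536 corrupted frames goes undetected. */
#include <stdio.h>

int main(void)
{
    double frames_per_day      = 1e9;           /* assumed traffic volume */
    double corrupted_fraction  = 1e-4;          /* assumed corruption rate */
    double undetected_fraction = 1.0 / 65536.0; /* best case for 16 bits */

    double slipped = frames_per_day * corrupted_fraction * undetected_fraction;
    printf("~%.1f corrupted frames accepted per day\n", slipped);
    return 0;
}
```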
I was stuck but not vanquished. Along the way I learned about parity, CRC polynomials, Hamming code, Reed Solomon, Viterbi, LDPC, Turbo codes, Raptor codes, Fountain codes, FEC and various aspects of oversampling. However, this is all very abstract in the absence of wire format encodings, such as UART start/stop bits, Manchester encoding and bit stuffing. Techniques for bit stuffing may seem overly pedantic to a programmer but they are essential when handling the maximum size of a magnetic domain, the minimum length of a pit on a pressed Compact Disc or the accumulated charge over long distance cabling. Indeed, these are all related problems which may require mutually incompatible solutions.
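As a concrete example of the simplest of these, here is a minimal Manchester encoder; the 0 -> 01, 1 -> 10 mapping is one of the two common conventions and the LSB-first ordering is arbitrary:

```c
/* Minimal Manchester encoder: each data bit becomes two line symbols, so
 * every bit carries a mid-bit transition for the receiver to lock onto. */
#include <stdint.h>
#include <stdio.h>

/* Encode one byte, LSB first, into 16 line bits. */
static uint16_t manchester_byte(uint8_t data)
{
    uint16_t line = 0;
    for (int i = 0; i < 8; i++) {
        uint16_t pair = ((data >> i) & 1) ? 0x2 /* 10 */ : 0x1 /* 01 */;
        line |= pair << (2 * i);
    }
    return line;
}

int main(void)
{
    printf("0x%04x\n", manchester_byte(0xA5));
    return 0;
}
```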
Bit stuffed encodings typically use combinatorics and this may leave free variables or unused encodings. As someone trying to convey binary data, I initially ignored this quirk but I eventually realized that it solves problems further up the network stack (ISO seven layer or otherwise). Specifically, bit stuffing may provide a frame marker which prevents accidental mangling or malicious packet-in-packet attacks.
The reason for ignoring this matter was encapsulation. I preferred each layer to be fully encapsulated because it decreases coupling and increases portability and the generality of the solution. In practice, it is common for one loose end to be used further up the stack. Obviously, one of the upper layers now depends on a loose end of some form. This may lead to ridiculous hacks which can most generously be described as loose end encapsulation. The most ridiculous part is that we have an inversion of concerns and that we are encapsulating the upper layer from the vantage of the lower layer. One of the worst examples is AAL5 which is used to encapsulate Internet Protocol or raw Ethernet over ATM. If you've ever used copper or fiber broadband then you've very probably used AAL5. However, the one dangling bit used to determine the end of a sequence of fixed length cells is an offense to my sensibilities. It was sufficiently bad for me to devise something with better encapsulation and better channel capacity. (AAL5 requires 8-10 bytes of a 48 byte payload. Oh, plus one bit.)
This was enough to prod me to devise a network stack from the wire format upwards. Starting from the crudest 4/5 bit stuffing (one bit of every nybble is Manchester encoded), I was able to devise a 256 bit cell format which is ideally suited to 8 bit computing due to the use of, er, 8 bit counters. It has a 16 bit frame marker and three sets of 80 bit payloads. Each of these contains 64 bits of raw data. That's 24 bytes per cell and it is therefore a miniature version of ATM. With a minor alteration, I switched to 8/9 bit stuffing and gained enough space to add a Hamming code. This allows a one bit fix per 80 bit payload. In optimistic cases, this allows multiple bit errors to be fixed.
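For concreteness, the cell layout as constants; the 72/8 split within each slot follows from the 8/9 stuffing and the figures above, although the exact placement of the Hamming check bits is left open:

```c
/* The 256 bit cell, expressed as constants. */
enum {
    CELL_BITS             = 256,
    FRAME_MARKER_BITS     = 16,
    PAYLOAD_SLOTS         = 3,
    SLOT_BITS             = 80,

    RAW_BITS_PER_SLOT     = 64,                                 /* useful data */
    STUFFED_BITS_PER_SLOT = RAW_BITS_PER_SLOT * 9 / 8,          /* 72 after 8/9 */
    CHECK_BITS_PER_SLOT   = SLOT_BITS - STUFFED_BITS_PER_SLOT,  /* 8 for Hamming */

    RAW_BYTES_PER_CELL    = PAYLOAD_SLOTS * RAW_BITS_PER_SLOT / 8, /* 24 */
    CELL_BYTES            = CELL_BITS / 8                          /* 32 */
};

_Static_assert(FRAME_MARKER_BITS + PAYLOAD_SLOTS * SLOT_BITS == CELL_BITS,
               "marker plus three slots must fill the cell exactly");
```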
This was deeply satisfying because I was now devising an almost universal PAN/LAN/WAN format which can also be adapted to tape or disk storage with minor alteration. However, there is only a finite amount of research which can be completed before embarking upon a project. And much of my research determined that I had spent an atypical amount of time on research. However, I might have saved time by concentrating on 1980s home computer formats. In particular, after implementation, I found the state diagram of the 6854 network adapter used in AppleTalk and Acorn's EcoNet. It is not often that I find a published diagram which I could have drawn myself.
Unfortunately, my work has determined why serial formats developed in such a piecemeal manner. A software-only, bit-banged magnetic/PAN/LAN/WAN format requires about 6.5KB ROM. It also requires a horrendous amount of processing power. If a device is doing nothing else, for example when loading from tape, this is feasible. However, the same system is infeasible for a task such as obtaining mouse position. This would explain at least some of the proliferation of serial formats. However, some of it is willful incompatibility. For example, CAN bus was developed for similar reasons to USB: to contain a proliferation of incompatible connectors and protocols. And it might have been acceptable for an industrial standard to develop one year ahead of a consumer standard. However, civil aircraft and military aircraft already provided two incompatible standards which were suitable in cars and lorries.
There's a babel of incompatible formats out there. It is absolute madness. There are the wired network formats, the wireless network formats, the infrared formats, the tape formats, the floppy disk formats, the hard disk formats. Anyone with any sense has retreated to clocked serial protocols. However, even here there is incompatibility. Geeks argue about the merits of big-endian and little-endian (or the lesser known middle-endian). Geeks also argue about Ethernet and ATM sending the bits in a byte in opposite order. And for the clocked protocols, geeks argue about clocking on the rising edge, the falling edge - or both. If there were two ways to breathe, a geek would find a third method.
That's how we get ADB, SIO, IWM, PS/2 mouse, MIDI, AppleTalk, EcoNet, X10, DMX512, DALI, iLink, I2C, I2S, RC-5, IrDA, Ethernet, ATM and thousands of more obscure protocols. Unfortunately, when much of this was rolled into USB, the wire format was so awful that it required two incompatible replacements while failing to incorporate developments from FireWire and Thunderbolt. Radio formats are no better. Bluetooth and Zigbee may sound exotic but these 802.15.1 and 802.15.4 variants are merely 802.11 with incompatible headers, incompatible packet sizes and incompatible authentication running on incompatible radio frequencies. The net result is an increased attack surface with no appreciable gain.
So, my answer to this problem is
add to the dung pile and define more protocol. Any fool can define a wire format with CRC in 8KB ROM. I am one such fool. Now try doing the same in 2KB or less. By the time 8KB became affordable, protocols had already fragmented. This would account for many of the compromises in historical systems. For example, a superior tape format was planned for Galaksija but it required too much space and an inferior protocol was substituted. And why was firmware so expensive? Assuming four transistors per bit of ROM, 8KB requires 2^18 transistors. It is cheaper to implement dedicated hardware with a large number of useless options. More recently, I've found that:
- A large number of protocols use a random doubling of 75Hz, 225Hz or 25MHz.
- Protocols start at a random speed and get faster. Anything slower is always outside of the scope of the protocol.
- Protocols start with a random size address-space and often require address extension.
This leads to inanity such as iLink with an 8 bit address-space, I2C with a 7 bit address-space, DALI with a 6 bit address-space, RC-5 with a 5 bit address-space and HDMI slow bus with a 4 bit address-space. Do we have any bids for a 3 bit address-space or smaller?
I have plans to avoid the numerous limitations of the past. Unfortunately, the result looks like a random mix of IPv4, 802.15.4, USB, SNA and SONET with a PS/2 connector. I apologize in advance. I initially planned to use a 7 pin Mini-DIN connector and allow mobile devices to share energy. However, due to the proliferation of PS/2 devices, 6 pin connectors are noticeably cheaper than other variants. Unfortunately, this requires some features to be dropped. Regardless, if the connectors are physically compatible with PS/2 it is worthwhile to investigate electrical compatibility. Specifically, it is possible to negotiate the protocol up from a PS/2 keyboard connection while hubs are expected to encapsulate legacy keyboards and mice.
While it is nice to think about an alternate history where CAN bus never existed and likewise for the "Universal" Serial Bus with its nine connectors, three wire formats and numerous optional extensions, there is the small matter of implementation.
I initially considered using a long chain of shift registers. Actually, a very long line of shift registers because I am trying to implement a 256 bit cell format. Assuming that I use 8 bit latching shift registers, that would be 32 chips to send and 32 chips to receive. Oh no, actually, due to the Nyquist limit, we have to 2x oversample or maybe 3x to compensate for fun things like ringing and ground bounce. I also thought that the magic frame marker could set input latches and trigger an interrupt. Obviously, bit errors interfere in this process but this is the bootstrap version. So, that's a minimum of 128 chips per channel and we probably want a minimum of four channels per computer. Ye gads!!! That's at least 512 chips. How does MyCPU solve this? MyCPU is a computer loosely based on 6502 and made from discrete components, oh, except the network interface which uses a micro-controller.
This is looking like an intractable problem. I am quite certain that it is possible to dash off a short circuit description and deploy it on FPGA. However, I hoped for a discrete hardware implementation which didn't look like a DRAM shortage. And that's the answer. The design has a large amount of needless redundancy. After consideration of video display, and the minimal implementation of a Sinclair ZX Spectrum in particular, I realized that parallel to serial conversion only requires one 8 bit latching shift register. 3x oversampling can technically be performed with one chip but it may be preferable to collect each phase of sampling separately. The remainder of the exercise is FIFO and maybe the occasional interrupt. Yup, that's it. We're not conforming to any other standards beyond UART at one speed.
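To make the receive side concrete, here is a rough sketch of the interrupt handler; the latch addresses, buffer sizes and names are placeholders rather than a finished design:

```c
/* Receive path sketch: hardware shifts line samples into an 8 bit latching
 * shift register per phase and interrupts the processor once per byte.
 * The memory-mapped addresses are hypothetical. */
#include <stdint.h>
#include <stdbool.h>

#define PHASES          3
#define CELL_BYTES      32                 /* 256 samples per phase */
#define SHIFT_LATCH(p)  (*(volatile uint8_t *)(0xF000 + (p)))

static uint8_t ring[PHASES][2 * CELL_BYTES];  /* previous + current cell */
static uint8_t head[PHASES];
static volatile bool cell_ready;

/* Called once per latched byte; reads one byte from each phase. */
void rx_isr(void)
{
    for (int p = 0; p < PHASES; p++) {
        ring[p][head[p]] = SHIFT_LATCH(p);
        head[p] = (uint8_t)((head[p] + 1) % (2 * CELL_BYTES));
    }
    /* Once a cell's worth of samples has arrived, hand the buffers to the
     * software framing described below. */
    if (head[0] % CELL_BYTES == 0)
        cell_ready = true;
}
```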
Does this work? Can we just count to 768 and interrupt a processor? Yes. Framing can be implemented in software. The device driver for the mag/PAN/LAN/WAN interface holds the previous 32 bytes and the current 32 bytes before looking at the most likely locations for a frame marker. This may drift around by one phase as sender and receiver clocks drift. Indeed, this technique is tolerant to bit errors in the frame marker while the Hamming codes tolerate bit errors in the cell payloads. To reduce processor load, it may be desirable to provide a specialist barrel shifter which decodes the 8/9 wire format. This unit would be shared by all channels but is not essential.
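A rough sketch of the framing search, with a placeholder marker value and an assumed drift window; presumably the real marker pattern would fall out of the unused 8/9 encodings mentioned earlier:

```c
/* Software framing: look for the 16 bit frame marker near where it landed
 * last time, tolerating a couple of flipped bits. */
#include <stdint.h>

#define FRAME_MARKER  0xA6B5u  /* placeholder pattern */
#define MARKER_SLACK  2        /* accept up to 2 bit errors in the marker */
#define DRIFT_WINDOW  2        /* bits of clock drift tolerated per cell */

static int popcount16(uint16_t v)
{
    int n = 0;
    while (v) { n += v & 1; v >>= 1; }
    return n;
}

/* Read 16 bits starting at an arbitrary bit offset in a byte buffer. */
static uint16_t peek16(const uint8_t *buf, int bit)
{
    uint32_t w = (uint32_t)buf[bit / 8] << 16
               | (uint32_t)buf[bit / 8 + 1] << 8
               | (uint32_t)buf[bit / 8 + 2];
    return (uint16_t)(w >> (8 - bit % 8));
}

/* buf holds the previous and current 32 byte cells (64 bytes).  expected is
 * the bit offset where the marker landed last time.  Returns the new offset
 * or -1 if the marker has been lost. */
int find_marker(const uint8_t *buf, int expected)
{
    for (int d = -DRIFT_WINDOW; d <= DRIFT_WINDOW; d++) {
        int bit = expected + d;
        if (bit < 0 || bit > 64 * 8 - 24)
            continue;
        if (popcount16(peek16(buf, bit) ^ FRAME_MARKER) <= MARKER_SLACK)
            return bit;
    }
    return -1;
}
```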
Anyhow, the minimal implementation requires less than 20 chips and some of this is shared across multiple bi-directional channels. It is possible to start each channel in PS/2 compatibility mode then switch to 31kHz mode, 1MHz mode or faster. By running each port at different frequencies, it is possible to massively oversample UART with, for example, 256 samples for 9600 baud 8N1. This would be processed in the same manner as a 256 bit cell. Specifically, processing would occur with reference to the previous 256 samples across three phases.
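For illustration, if the 256 samples span a single UART bit (a 9600 x 256 = 2.4576MHz sample clock), each bit can be recovered with a crude majority vote:

```c
/* Recover one oversampled UART bit by majority vote over the middle of the
 * bit cell; the windowing proportions are arbitrary. */
#include <stdint.h>

#define SAMPLES_PER_BIT 256

/* samples[] holds SAMPLES_PER_BIT line samples (0 or 1) spanning one bit. */
int vote_bit(const uint8_t *samples)
{
    int ones = 0;
    /* Ignore the outer quarters of the bit cell, where edges and ringing live. */
    for (int i = SAMPLES_PER_BIT / 4; i < 3 * SAMPLES_PER_BIT / 4; i++)
        ones += samples[i] & 1;
    return ones >= SAMPLES_PER_BIT / 4 ? 1 : 0;
}
```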
Some of the implementation details surprise people. For example, the Hamming code is applied to the encoded data. This allows a network switch to optionally re-generate a payload without decoding it. Indeed, on the least powerful devices with the least memory, such as a 6502 with 2KB RAM, it may be preferable to buffer and route encoded 32 byte cells rather than decoded 24 byte payloads. Another surprise is the choice not to interleave the 80 bit fields. This would provide extra resilience against periodic noise, such as mains switching. However, I regard the transmission rate as incidental and specific sources of interference are most problematic at specific frequencies. It is preferable to provide an implementation in which cell size may be varied or Hamming codes may be applied in hardware at wire speed. These considerations are simplified by not interleaving. The bit stuffing is not DC balanced but it guarantees a signal transition every 9-10 bits. Although 64/66 encoding and similar provide more channel capacity, 8/9 encoding provides easier options for framing and phase locking.
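By analogy with the 4/5 version, one way to realize the 8/9 stuffing is to Manchester encode one bit of every byte; a minimal sketch, purely to show where the 9-10 bit transition guarantee comes from:

```c
/* One possible 8/9 stuffing: the lowest bit of each byte is Manchester
 * encoded (one data bit becomes the pair 01 or 10), the other seven bits
 * pass through untouched.  Every 9 bit group therefore contains at least
 * one transition. */
#include <stdint.h>
#include <stdio.h>

/* Encode one byte into 9 line bits (returned in the low 9 bits). */
static uint16_t stuff_8_9(uint8_t data)
{
    uint16_t pair = (data & 1) ? 0x2 /* 10 */ : 0x1 /* 01 */;
    return (uint16_t)(((data >> 1) << 2) | pair);  /* 7 raw bits + 1 encoded pair */
}

int main(void)
{
    /* Worst cases for run length: all zeros and all ones still have an edge. */
    printf("0x%03x 0x%03x\n", stuff_8_9(0x00), stuff_8_9(0xFF));
    return 0;
}
```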
Finally, I discovered why MIDI is unsuitable for process control. It doesn't enforce checksums and is therefore unsuitable for critical applications.