I threw something together to see how to make a data processor with variable length instruction words.
That's a pretty common thing in real-world computers after all: an instruction may e.g. consist of three bytes, one demanding a "load", the second to determine where the loaded information should go (if you have a lot of registers) and a third to determine where to take the information from (from a memory adress, typically, and those can in fact take up more than one byte). The computer then needs to read and process those three bytes (without mixing them up) and make sure that the program counter advances accordingly - next operation starts three bytes past the current, you don't (usually) want to use the source adress as operation code of the next action.
Dwarven computers so far (all two or three of them) have used instructions of fixed length, because keeping signals apart in DF is a massive pain. But variable instruction word lengths would be one way to build more complicated machines; the other would be using much longer instruction words, which would quickly lead to escalating investments in raw machinery. So, to keep it simple, i slapped together a crude tape processor that has no computing power whatsoever: it uses a two-bit "instruction", allowing for four activities:
00: invert contents of the accumulator
01: right-rotate contents of the accumulator
10: load accumulator from adress
11: store accumulator at adress
00 and 01 need no extra argument, 10 and 11 take a second double bit as argument, and for that end, the machine must be able to handle a single vs. a double-length "instruction grunt".
For now, i'll just put up two pictures of the finished demo installation and try to explain the issues when i feel like it:
The two activated plates on the southern track show the last operation was double-length. Above, the activities show the operation was 11 01 - store at adress 10 (adresses are decoded through simple gears, so they actually output the XOR of the two "grunts": store operations take the inverse of the actual adress as second argument, loads use +/-2).
The actual operation circuit. Decoded signals are sent here and either one or both carts get moving. The cart to the left is heavier and only calculates the inverse of the accumulator's contents, everything else - read/write and right-rotate are done by the lighter cart on the right side - currently somewhere in the southeast of the circuit (not visible because apparently i hit print in the wrong half-second).
Four "memory" and one "accumulator" locations of six bits each (ordinary water-logic cells), sixteen-grunt looping tape; processing speed somewhere between 150 and 200 steps per full operation. Took something like a thousand mechanisms.
EDIT:O.k., so what's the point?
Primarily, variable-length instruction words are something that greatly increases the capability of real-world computers relative to their notional instruction word length: an eight-bit chunk of program data gives 256 options, which would allow for a large number of operations, but would be next to nothing if you have operations that can occur with a large number of different parameters like memory adresses to read from/write to. Expanding the size of the basic instruction word to fit the bill is not necessarily the best option; a common approach in microprocessors was/is to construct some of the instructions from several bytes, often a variable number.
Now, a real-world instruction pipeline would be difficult to implement in DF: reading and moving data is generally complicated and slow in the game. But i was curious how viable the basic principle would be. That's all this Single-Purpose Experimental Machine (S.P.E.M.) is supposed to achieve and demonstrate.
The Tapes:The
PROGRAM TAPE holds one minecart per position. Carts fall into one of four weight groups: up to 100, 100-400, 400-500, 500 and above. Minecarts are set in motion by signals from the
COUNTING TAPE. The single cart on that tape is pushed forward one position everytime power is received.
On each advancement, the counting cart sends activation signals to
two positions on the program tape. This sets the two carts in motion and pushes them over two weight-sensitive pressure plates: one responds to weights 400 and above, the other to 100-500. This decodes the cart weights into signal outputs:
</= 100: 00
100-400: 01
400-500: 11
>/= 500: 10
Operation identification:Of course, if we fed all received signals into a single unified decoder, we'd just get some (boolean) function derived from _both_ carts' values. This would mean information loss - when two states of the first bit are XORed (as usual with single signal-driven gears), the output is either 0 or 1, while we have four possible input configurations. Only the first position's value interests us for the operation itself, so we need to keep the first and second carts' signals separate. We also must keep in mind that with variable-length instructions, values can switch roles: if on one read you find a single-grunt instruction, the operation's "parameter", derived from the value of the second grunt, will be discarded. The counting cart will advance one position and on the next read, the erstwhile second cart is now the first and determines the operation.
Thus, i built a "position-sensitive"
OPERATION DECODER:
1A
PR☼☼RO
R☼☼R
2B
Decodes one bit; there are two of these since we're dealing with a two-bit processor. An eight-bit processor would need eight. If instructions could be three "instruction atoms" long, each bit-decoder would need to be three instead of two gear pairs wide.
P- power supply
R- NS-rollers, only used for power transmission
O- output
1- gear assembly engaged by the "odd" position signal of the counting tape
2- gear assembly engaged by the "even" position signal of the counting tape
A- gear assembly engaged when weight of the cart at the "odd" position matches
B- gear assembly engaged when weight of the cart at the "even" position matches
The trick is that the cart on the counting tape alternates between engaging the 1 and 2 gear on successive advancements. At the same time, the weighing plates on the program tape are only wired to one of the A and B assemblies, they also alternate between A and B with each position. The output will be on when either 1 and A or 2 and B are on.
The value of the "operand" is decoded by inputting signals from both positions to the same gear: since we already successfully determined the precise value of the first grunt, the second _can_ now be reverse-decoded from an XOR of grunts one and two. The same would apply for instructions with even more parts - if we already know the value of grunts one and two, a parity-gate indiscriminately checking grunts one to three is enough to determine the value of grunt three.
Stepping controlNow, the second issue is that with occasional longer instructions, the counting cart needs to advance an extra step sometimes. This could cause all sorts of issues: generally, the counting cart will disengage power supply to the counting tape to limit its advancement to a single step. After a double-length instruction, this blockage must be bypassed to allow the extra advancement (unless you want to wait the full 100 steps for normal time-out. I don't.). But wait, there's more! If you bypass the limit, the next self-limiting signal of the counting cart could just re-engage the "limiter" gear assembly. Solution, quite simply: use two limiter assemblies in line and alternate between them again. The result is this
TAPE CONTROL CLUTCH:
PI-R12R-PO
RBLR
PI- Power in
R- Rollers, as above
PO- Power out
1- self-limiter from "odd" positions
2- self-limiter from "even" positions
B- bypass gear assembly, engaged when the bypass condition is fulfilled
L- bypass limiter, disengaged a few steps after triggering the bypass (two pressure plates on the same minecart circuit)
The only remaining control measure is another power-interrupt triggerred from the program tape, so that the program signals are limited and the advancing counting cart can't trigger another program read.
Component cost:Each pressure plate on the counting tape is connected to: two gears in the decoder, two gears powering rollers on the program tape, an interruptor gear, the operation trigger (minecart delay circuit). Twelve linkage mechanisms for each position.
Each position on the program tape contains three pressure plates. Two are weighing plates, each linked to two decoders - one position-sensitive operation decoding, one insensitive adress decoding (just XORs both grunts where the second is not reused by other instructions, thus no impediment on adressing). One plate built on every other position connects power to the decoder (not actually required), another also only built on every other position (thus three per-position) disconnects power to the program tape. Ten mechanisms (averaged) per position for links.
This gives us 22 mechanisms for linkages only. There are seven more mandatory for pressure plates, rollers and powering gears; add another for power supply and we get thirty mechanisms per program position, to handle two-bit "words". Trying to tell the purposes apart, i'd say that comes out at fourteen mechanisms per position to regulate the tapes, eight mechanisms per position per bit for decoding program data.
ConclusionThe S.P.E.M. works reliably and pretty quickly, just under 160 steps per full operation, including double-length instructions. The main obstacle to getting closer to the intrinsic limit of signal processing (the 100 step signal time-out) is decoding: it takes ~30 steps to properly read and identify an instruction. Shaving off another ten steps from the current 160 should be feasible without fiddly step-by-step finetuning, but i expect a hard limit of around 130 for this overall design. The 30 bits of fluid memory, as usual, are suffering from overseer stupidity - once again, i built the memory bank directly below the river's main level, allowing vermin fish into the construction, which of course went and gummed up the works.
Variable-length instruction words are possible in DF and can be executed at fairly high speed: the given case performs at 159 steps per instruction, while the theoretical signal-processing limit is 100 theoretically, 110 practically. Operations of actual computational value would be more complicated, but could likely run at similar speeds. The one-off and per-bit-per-position investment costs for the controller are substantial; limiting activity times of machinery is crucial and fairly complicated. The actual usefulness hasn't really been touched on because the S.P.E.M. is only a feature demonstrator, not an actual dwarven computer.