When writing software that uses any kind of API or hardware functionality, sooner or later there will be questions about what a particular API call or hardware operation is supposed to do. In principle, such questions should be answered by referring to the specification (and user documentation). I am a firm believer in writing readable and clear specs and in writing software that follows the spec, as that ensures future compatibility. But reality is not that simple. Things are generally better today than they used to be, though. Reading up a bit on the history of my first computer (the ZX Spectrum), I found some rather interesting cases of spec vs implementation, and “discovered functionality”.
The Spec(trum)
Back in the 1980s, most home computers were in practice designed first and specified later, and the implementation that shipped was the specification as far as the users were concerned. This meant that “whatever works” was the order of the day, and users were actively looking to discover how the machines worked.
All ZX Spectrum computers sold in the early days contained essentially the same hardware – same processor, at the same speed, same peripheral hardware (of which there was precious little). It was the same case for Commodore 64s and other computers of the day. The need to get every last cycle of performance out of the computers drove programmers towards programming to the very edge of the implementation, in particular figuring out timing-dependent tricks that made the machines do things they were not supposed to be able to do.
I recall myself using tricks like relying on known values from certain locations in the Spectrum ROM and writing precisely-timed code to change display properties in sync with the redraw, in order to increase the apparent color resolution. Why not? How could the ROM ever change? And I had no idea that the timing might change. If it worked it worked.
The Spectrum Implementation Changes
However, once it was time to upgrade the Spectrum to keep up with the competition, things did indeed change. I never personally encountered any of this as the updated Spectrums never made much of an impact in Sweden, and I moved on to my first 68000-based Macintosh in 1989. But I found some great information about various ways in which this did change during the later life of the product over at https://spectrumforeveryone.com/technical/spectrum-compatibility-issues/.
There are some real gems in that collection that provide food for thought about the hazards of coding to the implementation and not the specification, as well as how to deal with specification changes.
Memory Contention Specification vs Implementation
My favorite is the issue of memory contention. The Spectrum design and the industry overall had matured enough that there were actually specifications for programmers to follow. Too bad they were wrong:
When the original 128K Derby technical specification was published, it described the 8 pages of RAM that could be paged in as either contended or uncontended (see the above linked memory contention article for more detail). As code runs slower in contended RAM, programmers would not place timing critical code (like loading routines) in these contended banks, which according to the spec were banks 4,5,6 and 7.
This would have been fine, except for an error in the PAL chip which resulted in the contended banks being 1,3,5 and 7 instead. Programmers either wrote their code against the spec and encountered unexpected results, or observed actual behaviour – which caused problems down the line when the +2A and +3 range were architected to implement contention as per the original spec!
https://spectrumforeveryone.com/technical/spectrum-compatibility-issues/
Ouch. Whether you code to the implementation or to the specification, there is a case where your code is broken.
How should this kind of issue be handled?
The best answer is probably to have the code detect the hardware that is being used and adjust accordingly (assuming you have come to understand that there is a problem to be handled). Read some version register, use something like the x86 CPUID instruction, or measure performance during startup. Alternatively, you could just bite the bullet and write code for the worst case, essentially trading off peak performance against robustness. In the end, hardware is hard and is what it is, and the software has to work around it. That’s the way the story always goes.
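As a minimal sketch of those two options, the Python below takes the bank numbers from the quoted article and picks RAM banks for timing-critical code. The model labels, the `fast_banks` name, and the idea that some startup probe has already produced a model string are assumptions made for illustration. The sketch also ignores every other constraint on how banks are actually used.

```python
from typing import FrozenSet, Optional

ALL_BANKS: FrozenSet[int] = frozenset(range(8))

# Contended RAM banks per machine family, as given in the quoted article.
# The dictionary keys are illustrative labels, not official identifiers.
CONTENDED_BANKS = {
    "128K": frozenset({1, 3, 5, 7}),    # actual behaviour caused by the PAL error
    "+2A/+3": frozenset({4, 5, 6, 7}),  # follows the original written spec
}


def fast_banks(model: Optional[str]) -> FrozenSet[int]:
    """Return the banks considered safe for timing-critical code.

    `model` is assumed to come from a startup probe that is not shown here.
    If the model is None or unrecognised, fall back to the worst case and
    avoid every bank that is contended on any known machine.
    """
    if model in CONTENDED_BANKS:
        return ALL_BANKS - CONTENDED_BANKS[model]
    worst_case = frozenset().union(*CONTENDED_BANKS.values())
    return ALL_BANKS - worst_case


if __name__ == "__main__":
    for name in ("128K", "+2A/+3", None):
        print(name, sorted(fast_banks(name)))
```

With detection, each machine gets four usable fast banks. Without it, the worst-case answer shrinks to banks 0 and 2, which is the robustness-for-performance trade in miniature.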
Precise Timing
Other cases involve changing the precise clock-cycle timing of the screen redraw.
The net result of this is that the 128K machine runs slightly faster than the 48K machine (70908 clock cycles per 20ms/50Hz frame as opposed to 69888). Screen timings are altered slightly due to this (228 x 311 T states per frame on the 128, 224 x 312 on the 48).
Why does this matter? Well, any game that does fancy border effects and relies on their being exactly 224 T-states per line for timing purposes is going to look wrong on 128 machines. The most obvious case of this is Dark Star, which uses careful timing to draw a space invader graphic in the border area – this is badly skewed on 128 machines.
https://spectrumforeveryone.com/technical/spectrum-compatibility-issues/
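To see why a difference of 4 T-states per line matters, here is a rough calculation that uses only the figures quoted above. The function names are mine, and the model deliberately ignores contention and everything else that affects real timing.

```python
# (T-states per scanline, scanlines per frame), from the quoted article.
FRAME_GEOMETRY = {
    "48K": (224, 312),
    "128K": (228, 311),
}


def tstates_per_frame(model: str) -> int:
    """Total T-states in one frame for the given machine."""
    per_line, lines = FRAME_GEOMETRY[model]
    return per_line * lines


def drift_after(lines: int, assumed: str = "48K", actual: str = "128K") -> int:
    """T-states by which line-counting code is out of step after `lines` scanlines.

    `assumed` is the machine the code was tuned for, `actual` the one it runs on.
    """
    per_line_assumed = FRAME_GEOMETRY[assumed][0]
    per_line_actual = FRAME_GEOMETRY[actual][0]
    return lines * (per_line_actual - per_line_assumed)


if __name__ == "__main__":
    print(tstates_per_frame("48K"), tstates_per_frame("128K"))
    print(drift_after(1), drift_after(57))
```

On these numbers, code tuned for 224 T-states per line is 4 T-states out of step after a single scanline on a 128K machine, and a full 228 T-state scanline out after 57 lines. That is consistent with a carefully timed border graphic ending up skewed.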
Fixed timing is something that I used a lot in my own code. One of the first things I learnt on the Spectrum was that the smart move was to put all your screen update code in an interrupt handler that synchronized with the screen redraw. There was nothing else to use interrupts for! The next step was to count cycles very precisely so that the screen update never overlapped with the redraw, giving a nice stable image. Precise timing also allowed tricks like changing the color attributes of a line of text during the time that the border was being drawn, allowing me to increase the vertical color resolution from 8 pixels to 1 pixel.
Just what IS the Spec?
The pixel timing example in particular gives rise to the question about just what the specification is. Is precise-timing-dependent code technically correct? Given where hardware was at the time, I am tempted to claim that the timing was part of the de-facto specification, even if it was not written down explicitly in any official manuals. The behavior was the same on all machines that shipped (until it was not), and utilizing such hardware behaviors was accepted and encouraged (even if Sir Clive famously did not like talking about games for his computers).
In general, timing is likely to change more, and in more unpredictable ways, than functional results. Making a new and improved hardware implementation behave exactly like the old one is really hard, as exemplified by the pixel timing.
It would clearly have been better if the computers of the day came with clear specs as to what was and was not expected to work now and in the future. However, I guess that programmers would have ignored that anyway to get the best performance out of the machines.
Often, the timing aspects are by-products of the implementation rather than explicitly designed and specified. The timing is discovered by the designers as well, and it is whatever it is once the system is complete enough to work. Ars Technica had a really interesting piece on the accuracy needed for SNES emulation, way back in 2011, my analysis of which has been lost.
However, things are different today. Or should be. At least for mainstream get-the-job-done computing. We expect software to live much longer than any specific hardware implementation. We expect software to work on a wide range of machines, just with different performance. A game might run slower, but it should not crash or give a blank screen just because your gaming rig is different in some way.
Even low-level device drivers typically need to work with multiple generations, multiple different performance levels, and even multiple implementations of the same hardware spec, running at different speeds and with very different boards and systems around them.
Thus, software should be written to the specification. If the specification does not mention something, the robust option is to avoid it. Programmers should not try to figure out where the boundaries of the hardware are, as that is likely to create brittle code.
Implementing the Specification
The above is a bit of a legalistic argument, considering the specification as a contract to be followed by both sides. Unfortunately, you can expect the hardware side to break the contract just because things happen (as exemplified by the contended banks).
When the hardware does not work as intended, we get work-arounds. In a way, the presence of work-arounds means that software works when it strictly speaking should not be able to. Work-arounds are a fine engineering tradition, and in reality they are always needed, even though in theory they can be avoided.
Given that the work-arounds have been implemented in software, what should next-gen hardware do? Implement the specification correctly and break existing code, or implement something bug-compatible and perpetuate the issue? In the past we have seen a lot of bug-compatibility in practice, sometimes going as far as encoding the implementation-accidents of one generation as the specification for the next generation.
But I would also say we are seeing more cases of sticking to the specification over time, mostly for performance reasons. A good specification should not over-constrain the implementation. It does mean that we see more software that fails due to changes, but we can also say that such software was broken in the first place and should be fixed. Over time, we have seen more and more aspects of a system move from implicit to explicit specification, including being explicit on things that are likely to change and that should not be relied on.
Final Words
I still find the kind of code we wrote back in the 1980s fascinating. The machines were simpler, timing could be understood, and you could take a lot of pride in clever solutions that overcame the limitations of the hardware. There is a cleverness and joy in it.
But it is not a particularly professional or mature way to write code. Today, the model has to be to follow the functional specification and adapt to the timing. Follow the designer guidelines, and push back if the guidelines are not good enough.
If work-arounds are needed, put them in a separate part of the code, and try to find ways to detect that they are needed. Dynamic hardware capability detection is pretty much mandatory for portable software today, but usually you want to put that into system libraries or libraries from the hardware designers, not in the application code directly. For example, the Linux kernel and libraries provide many examples of how hardware is probed in order to select the right implementation to use or apply necessary work-arounds for broken hardware.
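One way to keep work-arounds separate is a small table that pairs each one with a test on the probed hardware facts, so the rest of the code never checks hardware versions itself. The sketch below is only an illustration of that idea. Every name, field, and threshold in it is hypothetical, and it is not modelled on any specific kernel or library interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical result of probing the hardware, e.g. {"revision": 1}.
HardwareInfo = Dict[str, int]
Settings = Dict[str, Any]


@dataclass(frozen=True)
class Workaround:
    name: str
    needed: Callable[[HardwareInfo], bool]  # decides from probed facts only
    apply: Callable[[Settings], None]       # adjusts the settings in place


def _use_conservative_timing(settings: Settings) -> None:
    settings["timing"] = "conservative"


def _avoid_odd_banks(settings: Settings) -> None:
    settings["avoid_banks"] = [1, 3, 5, 7]


# All work-arounds live here, away from the application logic.
# A missing "revision" is treated as revision 0, i.e. the cautious choice.
WORKAROUNDS: List[Workaround] = [
    Workaround("slow_redraw",
               lambda hw: hw.get("revision", 0) < 2,
               _use_conservative_timing),
    Workaround("pal_contention_bug",
               lambda hw: hw.get("has_pal_bug", 0) == 1,
               _avoid_odd_banks),
]


def configure(hw: HardwareInfo) -> Settings:
    """Start from the by-the-spec defaults and apply only the needed work-arounds."""
    settings: Settings = {"timing": "fast", "applied": []}
    for workaround in WORKAROUNDS:
        if workaround.needed(hw):
            workaround.apply(settings)
            settings["applied"].append(workaround.name)
    return settings


if __name__ == "__main__":
    print(configure({"revision": 3}))
    print(configure({"revision": 1, "has_pal_bug": 1}))
```

The application only ever calls `configure` and reads the resulting settings. Adding or retiring a work-around then means editing one table entry rather than hunting for version checks scattered through the code.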