Exactly. Types like "int", "long", "short", and "long long" can vary in length across architectures (though in practice only a few of them do), but the compilers Toady uses also support fixed-width types like "int32_t", "int64_t", and "uint16_t" on both 32-bit and 64-bit architectures.
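For anyone who hasn't used them, they come from the <cstdint> header and behave exactly like the ordinary integer types, just with a guaranteed width. A quick sketch (standard-library stuff, nothing taken from DF's actual code):

    #include <cstdint>

    // Build-time checks that the fixed-width types really are fixed;
    // these hold on both 32-bit and 64-bit builds.
    static_assert(sizeof(std::int32_t)  == 4, "int32_t is always 4 bytes");
    static_assert(sizeof(std::int64_t)  == 8, "int64_t is always 8 bytes");
    static_assert(sizeof(std::uint16_t) == 2, "uint16_t is always 2 bytes");

    // By contrast, plain "long" is 4 bytes on 64-bit Windows but 8 bytes
    // on 64-bit Linux, which is exactly the kind of variation at issue.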
That's good news; I thought it would be a bunch of illegible compiler directives or something.
These can be expanded without breaking save compatibility. The only thing that would break compatibility is if DF writes integers of varying sizes to disk without making sure their size is consistent across architectures. I don't know how many things this applies to, but it should only apply to serialization. Are you thinking of any specific examples of calculations that overflow? (If any of these values are saved to disk, Toady would probably be able to expand them and patch existing saves fairly easily, given that he's done much more complicated save patching in the past.)
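To make the serialization point concrete, here's a hypothetical example (made-up function and field name, not DF's actual save code) of writing a fixed-width field:

    #include <cstdint>
    #include <cstdio>

    // Because the field is an int32_t, the record is exactly 4 bytes on
    // disk whether the game was built as a 32-bit or a 64-bit binary.
    // Writing a plain "long" instead could be 4 bytes in one build and
    // 8 in another, and old saves would stop lining up. (Byte order is a
    // separate question, but x86 and x86-64 agree on that.)
    void write_count(std::FILE* f, std::int32_t count)
    {
        std::fwrite(&count, sizeof(count), 1, f);
    }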
I'm not worried about Toady making an import tool to read in an old save and up-convert it, I'm worried about innumerable calculations inside the simulation that are now subtly different.
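Just to illustrate the kind of thing I mean (a contrived example with made-up names, not anything from DF's code):

    #include <cstdint>

    // The same formula gives different answers depending on the width of
    // the intermediate value.
    std::int32_t scaled_wealth32(std::int32_t wealth)
    {
        // wealth * 1000 overflows (strictly, undefined behaviour) once
        // wealth exceeds about 2.1 million.
        return wealth * 1000 / 1000;
    }

    std::int64_t scaled_wealth64(std::int64_t wealth)
    {
        // The same expression computed in 64 bits stays exact far longer,
        // so a value that used to wrap around now comes out different.
        return wealth * 1000 / 1000;
    }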
It's worth noting that this isn't necessary on other platforms, where 32-bit programs can address 4 GB of memory (I'm not clear on why Windows limits them to 2 GB by default, since pointers really should be unsigned).
Did a little bit of research on this because I was curious.
Pointers ought to be unsigned, and they probably are unsigned under the hood. 32-bit x86 CPUs have actually had 36-bit physical address buses since the mid-90s, allowing a theoretical limit of 64GB of RAM, but an individual process is still limited to a 4GB virtual address space because its pointers are still only 32 bits wide. Within that space, the I/O registers of PCI devices (which only understand 32-bit addresses) are mapped in at the top and can use up quite a bit of the available addresses. At some point, someone decided to avoid all possible conflicts and restrict user-mode programs to addressing only the bottom 2GB of the 4GB space.
This wouldn't be so bad if it weren't a blow-by-blow replay of exactly the same issue that limited early PCs to 640KB of RAM. You'd think Microsoft and Intel would have learned from that episode, but you'd be wrong.
Coming back to the 4GB space, clever folks have come up with ways of detecting exactly which parts of the upper half of that space are used up by I/O mapping, and of letting the user-mode program use whatever is left over. It sounds suspiciously like the Upper Memory Block tricks from the 640KB days, because it's basically the same thing all over again. This is Large Address Aware mode, and it gives your program roughly 2.2GB-3.5GB of space to work with, though it's usually not contiguous.
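(For reference, with the MSVC toolchain that flag lives in the executable header and can be set either at link time or stamped onto an already-built binary; "game.exe" below is just a placeholder name:)

    link /LARGEADDRESSAWARE ...          (set the flag when linking)
    editbin /LARGEADDRESSAWARE game.exe  (or apply it to an existing executable)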
In principle, the OS ought to be able to use those 36 bits of physical address to shadow real RAM on top of the I/O-mapped addresses and give the user-mode program a full 4GB of addressable memory (shadow RAM was one of the tricks used in the 640KB era). If one were willing to take a performance hit, it'd even be possible to expose the full 64GB, minus the GB or two eaten up by I/O registers, by banking it through the 32-bit address space. Microsoft says that device driver compatibility issues prevent using either of these solutions.
The issues supposedly don't exist on 64-bit architectures. Maybe I/O is addressed in a completely different mode than RAM now, but I wouldn't be surprised to learn in a few years that all of the I/O is mapped into the upper end of the 16EB address space.