Exactly. Types like "int", "long", "short", and "long long" can vary in length across architectures (though in practice only a few of them do), but the compilers Toady uses also support fixed-width types like "int32_t", "int64_t", and "uint16_t" on both 32-bit and 64-bit architectures.
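For anyone who hasn't used them, they come from the <cstdint> header and behave exactly like the ordinary integer types, just with a guaranteed width. A quick sketch (standard-library stuff, nothing taken from DF's actual code):

    #include <cstdint>

    // Build-time checks that the fixed-width types really are fixed;
    // these hold on both 32-bit and 64-bit builds.
    static_assert(sizeof(std::int32_t)  == 4, "int32_t is always 4 bytes");
    static_assert(sizeof(std::int64_t)  == 8, "int64_t is always 8 bytes");
    static_assert(sizeof(std::uint16_t) == 2, "uint16_t is always 2 bytes");

    // By contrast, plain "long" is 4 bytes on 64-bit Windows but 8 bytes
    // on 64-bit Linux, which is exactly the kind of variation at issue.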
That's good news; I thought it would be a bunch of illegible compiler directives or something.
These can be expanded without breaking save compatibility. The only thing that would break compatibility is if DF writes integers of varying sizes to disk without making sure their size is consistent across architectures. I don't know how many things this applies to, but it should only apply to serialization. Are you thinking of any specific examples of calculations that overflow? (If any of these values are saved to disk, Toady would probably be able to expand them and patch existing saves fairly easily, given that he's done much more complicated save patching in the past.)
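To make the serialization point concrete, here's a hypothetical example (made-up function and field name, not DF's actual save code) of writing a fixed-width field:

    #include <cstdint>
    #include <cstdio>

    // Because the field is an int32_t, the record is exactly 4 bytes on
    // disk whether the game was built as a 32-bit or a 64-bit binary.
    // Writing a plain "long" instead could be 4 bytes in one build and
    // 8 in another, and old saves would stop lining up. (Byte order is a
    // separate question, but x86 and x86-64 agree on that.)
    void write_count(std::FILE* f, std::int32_t count)
    {
        std::fwrite(&count, sizeof(count), 1, f);
    }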
I'm not worried about Toady making an import tool to read in an old save and up-convert it, I'm worried about innumerable calculations inside the simulation that are now subtly different.
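Just to illustrate the kind of thing I mean (a contrived example with made-up names, not anything from DF's code):

    #include <cstdint>

    // The same formula gives different answers depending on the width of
    // the intermediate value.
    std::int32_t scaled_wealth32(std::int32_t wealth)
    {
        // wealth * 1000 overflows (strictly, undefined behaviour) once
        // wealth exceeds about 2.1 million.
        return wealth * 1000 / 1000;
    }

    std::int64_t scaled_wealth64(std::int64_t wealth)
    {
        // The same expression computed in 64 bits stays exact far longer,
        // so a value that used to wrap around now comes out different.
        return wealth * 1000 / 1000;
    }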
It's worth noting that this isn't necessary on other platforms, where 32-bit programs can address 4 GB of memory (I'm not clear on why Windows limits them to 2 GB by default, since pointers really should be unsigned).
Did a little bit of research on this because I was curious.
Pointers ought to be unsigned, and they probably are unsigned under the hood. 32-bit x86 CPUs have actually had 36-bit physical address buses since the mid-90s, allowing a theoretical limit of 64GB of RAM, but an individual process is still limited to a 4GB virtual address space because its pointers are still only 32 bits wide. Within that space, the I/O registers of PCI devices (which only understand 32-bit addresses) are mapped in at the top and can use up quite a bit of the available addresses. At some point, someone decided to avoid all possible conflicts and restrict user-mode programs to addressing only the bottom 2GB of the 4GB space.
This wouldn't be so bad if it weren't a blow-by-blow replay of exactly the same issue that limited early PCs to 640KB of RAM. You'd think Microsoft and Intel would have learned from that episode, but you'd be wrong.
Coming back to the 4GB space, clever folks have come up with ways of detecting exactly which parts of the upper half of that space are used up by I/O mapping, and of letting the user-mode program use whatever is left over. It sounds suspiciously like the Upper Memory Block tricks from the 640KB days, because it's basically the same thing all over again. This is Large Address Aware mode, and it gives your program roughly 2.2GB-3.5GB of space to work with, though it's usually not contiguous.
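(For reference, with the MSVC toolchain that flag lives in the executable header and can be set either at link time or stamped onto an already-built binary; "game.exe" below is just a placeholder name:)

    link /LARGEADDRESSAWARE ...          (set the flag when linking)
    editbin /LARGEADDRESSAWARE game.exe  (or apply it to an existing executable)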
In principle, the OS ought to be able to use those 36 bits of physical address to shadow real RAM on top of the I/O-mapped addresses and give the user-mode program a full 4GB of addressable memory (shadow RAM was one of the tricks used in the 640KB era). If one were willing to take a performance hit, it'd even be possible to expose the full 64GB, minus the GB or two eaten up by I/O registers, by banking it through the 32-bit address space. Microsoft says that device driver compatibility issues prevent using either of these solutions.
The issues supposedly don't exist on 64-bit architectures. Maybe I/O is addressed in a completely different mode than RAM now, but I wouldn't be surprised to learn in a few years that all of the I/O is mapped into the upper end of the 16EB address space.