Bay 12 Games Forum

Please login or register.

Login with username, password and session length
Advanced search  
Pages: [1] 2

Author Topic: Suggestion to split DF's pathfinding into asynchronous 2nd thread (2 cores use).  (Read 4622 times)

Sarmatian123

  • Bay Watcher
    • View Profile

Hi!

This proposition, should be an option in init.txt for CPUs with more then 2 cores.
The idea is for both executing threads, to use the same available in memory data for map.
The idea is DF should not ever freeze lock entire computer, even if one or two of its processes become busy over 100% speed of CPU for noticeable periods of time.

This is not exactly an example for parallell programming, but more like using 2 processes instead of 1 process, which is possible even on 30+ year old UNIX OS through forking. However this opens possibility to have two 1 thread programs running on 2 cores, instead of one 1 thread program running on 1 core of CPU.

Currently, when pathing algorithm strikes (specially on hugest maps with multiple crossing rivers, not to mention hugest caves under),
the game is freezing repetitiously for uncomfortable 30-60 seconds, even on 4ghz CPU. Those freezes are not producing any informative feedback in game's logs.
I checked it out repeatedly. Only repetitious errors are "Connected Component Overflow", "path fail:..." and "...:site walker could not find walkable area" and some single errors like "loop path fail:" or "removed erroneous unit occupancy flag". So, I believe those freezes occur due path-finding algorithm. No sense even bug-reporting it, as those apply only on largest of the maps with crossing rivers and such.
When no connecting path is found, units usually just pause their movement. Even in air with Keas path blocked by doors, as I noticed.
This auto-pausing of units can be exploited, if pathing algorithm is forked into separate process, for asynchronous path-feeding to units. Units can wait a wee, before moving on.
Even, if the path feeding, due changes in map results that units path will become blocked (or conditions for combat arise),
then they can just pause and wait for path to be recalculated.

Removing cpu intensive path-finding algorithm into its own process presents additional programming opportunities:

1. GPU
Path finding algorithm is using a lot of data, but deploys repeatedly same computing instruction set.
The only issue to feed it instead of CPU to GPU would be recursive nature of the algorithm. Older GPU can not allow it.
However latest GPU architectures allows for execution of recursive algorithms.
This demands understanding for GPU programming and modern GPU architectures.
It would be usable, as an optional feature for those having particular GPU cards with particular GPU architectures.
However GPU with its many million of processing units has potential to outperform CPU with its 2/4/8 processing units by scale of 1mln:1 speed.
Imagine heavy computing task performed by CPU in 1.5 minute shortened to barely 3 seconds.

2. ML
Path finding algorithm's process could be, for its performance sake, enhanced with its own Machine Learning algorithm.
This could lead to optimizations in path finding.
It requires understanding of ML algorithm.
A little like people walking through forest repeatedly, are finding same parts of path, so it becomes visible to naked eye. Then they do not need to find it any more. Even, if there is dug out a shortcut (like dwarves digging into a cave) into path, but which is not so visible. Therefore it (like insides of fortress for example by cave beings) is rather ignored by majority of people, who are walking through forest from one end to another. :)
Logged

GoblinCookie

  • Bay Watcher
    • View Profile

A lot of this comes down to the work that would be involved.  Though once we have the Steam release Toady One could probably hire someone to do the work. 
Logged

Strik3r

  • Bay Watcher
  • Persistently work-in-progress.
    • View Profile

+1

Every time someone posts a suggestion regarding multi-threading, i grow more and more frustrated. Not with people posting the suggestion but with Toady and DF. Toady has made it abundantly clear that he has no intention of doing anything to improve the game's performance, let alone implement multi-threading/multi-process, while piling on more and more crap onto a single core, which are not getting much faster year by year. The future is parallel processing and DF is rapidly falling behind.

Even if this suggestion will never be implemented, its a very good suggestion and there are other things that could be offloaded onto other cores, such as temperature and projectile calculation.
Logged
NOTICE: If you can't update your profile/signature, stop using a Imgur URL for your profile picture.
Upload it to somewhere else.

Sarmatian123

  • Bay Watcher
    • View Profile

All due respect Strik3r, but my suggestion is not exactly about parallell programming.

Parallell programming would not improve this part of DF, as my suggestion could. DFHack works that way already, without any multi-threading or parallell programming involved. All it needs is access to DF's map in memory. It wouldn't change anything like DFHack, but would feed DF client on request with "ready to use" paths (for map exit maybe?) or take a moment to compute a new one. It is more about providing DF client program with path-finding server program, which will run hopefully on different core of CPU. So this is why you need 2 threads. One client thread. One server thread. Something UNIX programmers did with fork command like in 1990:ties.

However, because of the nature of path-finding algorithm and its data, in future the path-finding server could be using GPU instead of CPU. Furthermore even ML algorithm could be created to improve the path-finding algorithm's basic line awful execution time.

On Linux for decades now, all sorts of programs are changing from single thread to client-server architecture. Even how sonds are played changed from ALSA driver to PulseAudio driver.

I do not think DF would get extensively improved speed with parallell programming in multi-threads. More like 10%-20% faster? In last round of bug clearing made DF faster by 20% alone.

I am talking here about permanently and entirely removing freezes from path-finding calculations. This is gain of 200% at least, plus no annoying DF freezes no more. Server may lag, but then you will see just animals and dwarfs idling a wee in DF, before continuing on their way. It is quite doable on skills Toady (and hid community -> DF Hack) already has. Though I would imagine this change would take some major rewriting of the code, because of changed architecture. Maybe not. I have no idea what architecture Toady deploys. There are other architectures for this to be used too, instead of basic server-client, like for example layered, client-server tiered, client-server service oriented.

<optional:>
1. Toady could, for the payed stream version, hire a professional GPU programmer to grant the server part of DF faster performance on latest GPUs. Enabled in options.
2. Toady then could offer to some student on some uni a master or bachelor thesis to develop ML algorithm to his DF server.
Logged

Strik3r

  • Bay Watcher
  • Persistently work-in-progress.
    • View Profile

oh dont worry, i 100% got what you meant, even if you didn't mention the server-client setup at all before and i'm 100% in agreement with you, i'm just saying:

There are also other performance hogs that could be put onto other threads in the same way. (whatever the implementation may be)

Toady has an intense aversion to any mention of the word "thread" or "threading" :P


Also, i like the idea of having like a ML system in place(IIRC there was a project a while ago that simulated IRL ant behaviour with ML, that could be useful). It would mean that dorfs would frequenly take non-optimal paths(but who cares? its not like humans always take the most optimal route, now do we?), but there would be less pathfinding, because at least part of the path is already "treaded-in".
Logged
NOTICE: If you can't update your profile/signature, stop using a Imgur URL for your profile picture.
Upload it to somewhere else.

voliol

  • Bay Watcher
    • View Profile
    • Website

+1 At first thought, units idling might seem like a problem, but when you consider that also applies to thinking creatures IRL trying to find the optimal path, the issue is nigh eliminated. 

PatrikLundell

  • Bay Watcher
    • View Profile

As far as I understand, Toady isn't averse to multi threading as much as to rewriting more or less the whole code from scratch to allow for multi threading. The problem is that as soon as you have two parallel activities that access the same set of data, and at least one thread modifies this data you have to protect the data items with some kind of lock, as well as to ensure that deletion/addition of items does not result in major mishaps (Accessing an object that no longer exists? Traversing a linked list that's modified by one another thread? ...). Add processor cashing to this, so you'd have to ensure one thread isn't using a cashed obsolete copy of the data, and you have a nice little total rewrite of DF on your hands.

Sure, there may be nice isolated issues where you can farm off some activities to another thread to work with data that's rendered safe, but you still have to change to code to farm off the activity early on and pick up the results later, which adds additional administration to the code.

Add to this that DF is largely memory bound, i.e. the speed of the program is largely constrained by the rate at which data can be fed to the CPU from memory, and the immense gains from doing things in parallel can melt down to a puddle (or even a loss, if the added administration costs too much).

Bugs where the same path is attempted to be calculated over and over again (e.g. pets attempting to path through pet locked doors) are fixed by fixing the bugs, not by furiously failing to find the path in a different thread, and problems with large map pathing are solved by allowing the path finding to spend more resources before giving up (which I guess is what's happening). The move to 64 bits allows for maps that are much larger than they could be before, so the earlier cut off limits may need to be adjusted.
Logged

Manveru Taurënér

  • Bay Watcher
    • View Profile

Since there seems to be some serious hyperbole going on here, I thought I'd post what I think is his most recent talk on the issue from the 2 AMA's he did after the steam version was announced. Hopefully helps clear the picture up for someone:

Right now, I'm still thinking of just doing the incremental optimizations I've been doing -- people have seen some boosts as I recollect as I've done those. It's not great progress but it's the kind we can make if there are so many dwarves and 50K items etc. Multithreading etc. is way too hard to do now (still have a microthreading thing to look at.)

I'm not a multithreading expert, which is part of the problem, but the issue as I see it with the maps we have is the, for instance, the buffers that are used. Even if just two dwarves are pathfinding at the same time, it can't involve even thinking about the same tiles without allocating additional buffers, which is its own nightmare. Same goes for flows/etc., to some extent, if they cross boundaries -- some sort of chessboard time staggering might help there, and indeed that's how we've handled flows on one cpu so far to make it faster. So there is some room to move, but I think it's difficult.

My plan continues to be to squeeze out time where I can; I got 15% out of the test forts last time I did it, and that carried over to the players to varying degrees. I'm not sure there's a magic bullet here, but I'm also not an expert. Bringing in outside help has a few barriers we can't yet cross, though we did try for pathfinding and it ultimate just didn't lead to a speed increase (subregion stuff is too slow to precalc as map changes.)

Logged

Shonai_Dweller

  • Bay Watcher
    • View Profile

Toady has an intense aversion to any mention of the word "thread" or "threading" :P
Except that when he last talked about it, without any apparent aversion, he mentioned having started experimenting with it and that he's used it on his side projects. And, as mentioned above, 3 years to rewrite the game for a 10% increase in speed doesn't yet seem worth it as right now more can be achieved with the partial optimisations he's been doing.

But yeah, make stuff up, that fine.
Logged

PlumpHelmetMan

  • Bay Watcher
  • Try me with sauce...
    • View Profile

To be fair, it's not hard for someone without full knowledge of Toady's development process to reach Strik3r's conclusions.
Logged
It's actually pretty terrifying to think about having all of your fat melt off into grease because you started sweating too much.

lethosor

  • Bay Watcher
    • View Profile

Currently, when pathing algorithm strikes (specially on hugest maps with multiple crossing rivers, not to mention hugest caves under),
the game is freezing repetitiously for uncomfortable 30-60 seconds, even on 4ghz CPU. Those freezes are not producing any informative feedback in game's logs.
I checked it out repeatedly. Only repetitious errors are "Connected Component Overflow", "path fail:..." and "...:site walker could not find walkable area" and some single errors like "loop path fail:" or "removed erroneous unit occupancy flag". So, I believe those freezes occur due path-finding algorithm. No sense even bug-reporting it, as those apply only on largest of the maps with crossing rivers and such.
If there's no informative log message, how do you know pathfinding is the cause of the freezes you're experiencing? Pathfinding happens when units need to find a path somewhere, which tends to happen on-demand, not in large groups of units at once (unless you're doing things like designating hundreds of jobs at once, but I assume you would have mentioned that). There have been several other things identified as causes of freezes in the past (vegetation growth in caverns, for instance), so it would be more productive for Toady to figure out exactly what's locking up instead of starting by refactoring pathfinding. (Of course, if pathfinding is the issue in your game, then it's definitely worth addressing.)

As for the log messages, DF simply doesn't produce many log messages about other things in-game, so the log messages you're seeing don't necessarily correspond to the freezes. They could also be from old games that you had loaded, since the log never gets cleared. To figure it out, I would clear (or delete, or rename) the log file, load the game, and open the log file in some sort of editor/viewer that automatically refreshes. If you see new pathfinding messages right before or after a freeze, that's a likely culprit; otherwise, it's probably something else.

Logged
DFHack - Dwarf Manipulator (Lua) - DF Wiki talk

There was a typo in the siegers' campfire code. When the fires went out, so did the game.

Sarmatian123

  • Bay Watcher
    • View Profile

Yes, Toady is right about changes in architecture. The changes sometimes are even prohibitive.
For changes in architecture, one needs seriously think about new architecture. Then consider how much good it will bring.
The major issue with architecture and its changes is better maintainability or better performance.

Path-finding algorithm is time expensive and DF is using already the best one. It can be optimized even on full maps with horizon values.
However Toady is not an expert on GPU programming either.
So, imho it is good he started doing path-finding into separate thread and checking out architectures for multi-threading.
Because path-finding algorithm maybe is slowpoke on one core of CPU, but on million cores of GPU difference in performance is shocking.
This second thread could also give home to ML algorithm, which recently GPU manufacturers started automatic adaptation, so deploying it on GPU is also possible.
As the rest of the DF doesn't fit requirements of code and data, which could be run on GPU, then it is no brainer to kick those two into separate thread, even if currently gain from it, is not too great.

There is a deep reason for refactoring path finding for the future, even if there is currently no big gain. Sooner pathing will get refactored, the less work it will consume. Particularly that great additions will be coming in next couple of years of releases.

About access to map memory. There is no issue, if one process is just listener. There is need to program a server-client interface between two threads, but it is nothing overly too much. Having path-finding thread constantly recalculating pathing on tiniest changes of map could be an issue with one CPU programming tbh. Server would just fed an old path and then recalculate it, when it was sent message from client, that this path got somehow blocked.

My guess is, that maybe currently there is no option for moving items to detect, if they are blocked or not, on their path. Climb down/up pathing, for blocked in their pathing creatures, could be fun for example. :) What is not fun however are those birds sitting in air and sleeping there. Oh my my, these trees really got to get some more usage.

@lethosor,
I started biggest map 4x4 with many rivers crossing. It was like 2 months ago on the latest vanilla DF. I was thinking to build lots of bridges and create a neat fortress to rest for my future adventurer. So, playing in fortress mode wasn't actually my goal. Place is like in the middle of the generated small world. My guess freezes are from some land animals, who can't swim, as during some seasons the freezes got shorter and fewer. Then they came back. Maybe something in caves causes it? There are still no bridges. In first year, I just got through aquifer (deep sand everywhere on the map) and started digging granit out to build my fortress bunker on surface as usual. Maybe, I will get some time on Christmas to play some more. :)
« Last Edit: October 18, 2019, 06:12:21 pm by Sarmatian123 »
Logged

Strik3r

  • Bay Watcher
  • Persistently work-in-progress.
    • View Profile

Toady has an intense aversion to any mention of the word "thread" or "threading" :P
Except that when he last talked about it, without any apparent aversion, he mentioned having started experimenting with it and that he's used it on his side projects. And, as mentioned above, 3 years to rewrite the game for a 10% increase in speed doesn't yet seem worth it as right now more can be achieved with the partial optimisations he's been doing.

But yeah, make stuff up, that fine.

no need to get tilted, it was just a(n apparently terrible) joke.

To be fair, it's not hard for someone without full knowledge of Toady's development process to reach Strik3r's conclusions.

Pessimism is the critical component here. And a bit of snarkiness.
But yeah, i wish i had full knowledge of that process, but unfortunately, i dont.

If there's no informative log message, how do you know pathfinding is the cause of the freezes you're experiencing? Pathfinding happens when units need to find a path somewhere, which tends to happen on-demand, not in large groups of units at once (unless you're doing things like designating hundreds of jobs at once, but I assume you would have mentioned that). There have been several other things identified as causes of freezes in the past (vegetation growth in caverns, for instance), so it would be more productive for Toady to figure out exactly what's locking up instead of starting by refactoring pathfinding. (Of course, if pathfinding is the issue in your game, then it's definitely worth addressing.)
If other games like DF are any indication then pathfinding is probably the biggest CPU hog.
Although, there is a chance it might also be temperature calculations or plant growth, especially if they're not really well optimized.
But IMO if something's causing a hang or a freeze, its probably pathfinding as its likely the one to see a sudden spike in activity with a mass of creatures trying to path at once or getting stuck and failing to path... repeatedly.
Logged
NOTICE: If you can't update your profile/signature, stop using a Imgur URL for your profile picture.
Upload it to somewhere else.

Shonai_Dweller

  • Bay Watcher
    • View Profile

Last time I got crazy stuttering and lag it turned out to be people being born at vast rates in the outside world (was fixed). So, what looks like pathfinding bug, might be something completely different.
Logged

Strik3r

  • Bay Watcher
  • Persistently work-in-progress.
    • View Profile

Last time I got crazy stuttering and lag it turned out to be people being born at vast rates in the outside world (was fixed). So, what looks like pathfinding bug, might be something completely different.

You may be right and it was something i already considered that there's a chance of it being something that nobody could expect. That's how it usually goes with bugs.
Still, of the things we know to cause lag, pathfinding is at the top of the list
Logged
NOTICE: If you can't update your profile/signature, stop using a Imgur URL for your profile picture.
Upload it to somewhere else.

Pages: [1] 2