Bay 12 Games Forum


Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - IronGremlin

1
But we're talking like 100 ns for a round trip here. It's not "one hundred cycles" as Putnam threw out for a random example, it's certainly not 30 cycles as in your example, it's MULTIPLE HUNDREDS of cycles.

Op time and data transfer windows are tiny fractions of the total time the CPU spends waiting. Nobody talks about this part because there isn't dick anyone can do about it, because we haven't managed to make electron wave propagation any faster over those kinds of distances yet, but when you hear things like "accessing data in RAM is an order of magnitude slower than accessing data in the L2 cache" this is what they are talking about. Adjusting the RAM clock is a fractional percent difference here.
Halving the clock still increases the total time from 100ns to 110ns, enough to make a measurable difference if this is the bottleneck imo.
Whereas it had no impact on FPS but CPU clock speed had a nearly 1:1 reduction in FPS.

If I could underclock my ram any more to get yet more evidence I would, but I think I'll try seeing how linked lists perform at 3200 vs 1600 instead since they are known to be ass slow because of the exact memory latency issue we're discussing.

Another thing to consider is that the unit list and other data structures in the game are afaik stored in vectors which are very apt to efficient memory prefetching when you're iterating it in order, so the latency issue isn't going to be pronounced.


1.6 vs. 3.2 billion cycles per second translates into each "gap" in communication taking 5/8ths of a nanosecond vs. 5/16ths of a nanosecond. I'm not really sure where you're getting '10 nanoseconds' from - I don't follow.
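
For reference, the cycle-time figures above can be checked with a quick sketch. Python is used purely for illustration, and the function name is made up for this example.

```python
def cycle_time_ns(clock_hz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


# The two clock rates compared in the post above.
for clock_hz in (1.6e9, 3.2e9):
    print(f"{clock_hz / 1e9:.1f} GHz -> {cycle_time_ns(clock_hz):.4f} ns per cycle")
```

That works out to 0.625 ns (5/8) per cycle at 1.6 GHz and 0.3125 ns (5/16) at 3.2 GHz.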

Regardless - Assuming you achieved a 10% difference in your primary bottleneck, that would make a measurable difference in a controlled benchmark - I don't know if that's going to reliably translate to a consistently measurable difference in a Dwarf Fortress FPS counter. My money would be on "no," because in my experience conducting proper benchmarks is hard and observation bias is a bitch.

The linked list vs. the flat vector is a great example and I would think that should illustrate the relative performance change in RAM I/O bandwidth vs. latency very clearly.
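
A minimal sketch of the two access patterns being compared, assuming a shuffled index chain is an acceptable stand-in for linked-list nodes scattered around memory. All names here are hypothetical. In Python the interpreter overhead would swamp any real memory-latency effect, so this only shows the shape of the dependency; an actual benchmark would want C or C++.

```python
import random


def build_chain(n: int, seed: int = 0) -> list[int]:
    """Return next_index so that following it visits every slot exactly once,
    in shuffled order (a stand-in for linked-list nodes scattered in memory)."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    next_index = [0] * n
    for k in range(n):
        next_index[order[k]] = order[(k + 1) % n]
    return next_index


def sum_sequential(values: list[int]) -> int:
    """Flat-vector style: every address is known up front, so it can be prefetched."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


def sum_by_chasing(values: list[int], next_index: list[int], start: int = 0) -> int:
    """Linked-list style: the next address is only known after the current load."""
    total = 0
    i = start
    for _ in range(len(values)):
        total += values[i]
        i = next_index[i]  # this load must finish before the next one can begin
    return total


values = list(range(1000))
chain = build_chain(len(values))
print(sum_sequential(values), sum_by_chasing(values, chain))
```

Both functions do the same amount of arithmetic and touch the same data. The only difference is whether each access has to wait for the previous one, which is the part that RAM latency punishes.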


I would caution against attempting to reduce the problem space to 'oh but DF uses flat vectors so it can't fall victim to pathological cases involving a lot of blind pointer hopping' - this presumes an awful lot about the architecture of software that you don't have the source code for, and more to the point most of the shit I've heard Putnam say about this subject makes me extremely suspicious of that line of reasoning.

2
A clock speed of 3200 mhz means you get 3 billion operations in a second
Sort of, actually DDR (double data rate) frequency quoted is twice the actual clock speed because it transfers data twice per clock cycle, so it's 1.6 billion clock cycles per second. But the actual operations like set row to read, read a column etc. take several clock cycles - let's say 15 because that's what mine takes. So it's about 10ns just in the latency for the ram chip to respond, which works out to around 33 clock cycles of my CPU unless I'm mistaken, so I'm thinking halving the memory clock would increase the inherent latency on the chip to a full 66 clock cycles.

But anyway, neither the RAM chip's latency nor the delay waiting for the signal to carry through the little wires from RAM to CPU would be increased by underclocking the CPU so I don't see how a highly memory bound workload would show such a direct correlation with CPU clock speed.
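
The arithmetic in the quoted post can be sketched as follows. The 15-cycle figure and the 1.6 GHz real clock come from the post; the 3.7 GHz CPU clock is an assumption borrowed from the benchmark numbers quoted elsewhere in this thread, and the function names are hypothetical.

```python
def ram_latency_ns(latency_cycles: int, ram_clock_hz: float) -> float:
    """Time the RAM chip takes to respond, given its latency in RAM clock cycles."""
    return latency_cycles * 1e9 / ram_clock_hz


def as_cpu_cycles(latency_ns: float, cpu_clock_hz: float) -> float:
    """The same wait expressed in CPU clock cycles."""
    return latency_ns * cpu_clock_hz / 1e9


CPU_CLOCK_HZ = 3.7e9  # assumption: the stock CPU clock mentioned in the thread

for ram_clock_hz in (1.6e9, 0.8e9):  # DDR "3200" and "1600" real clocks
    ns = ram_latency_ns(15, ram_clock_hz)
    cycles = as_cpu_cycles(ns, CPU_CLOCK_HZ)
    print(f"{ram_clock_hz / 1e9:.1f} GHz RAM clock: {ns:.3f} ns, about {cycles:.0f} CPU cycles")
```

Under those assumptions the chip's response time is about 9.4 ns (roughly 35 CPU cycles) at the full clock and about 18.8 ns (roughly 69 CPU cycles) at half clock, close to the 33 and 66 estimated in the post.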

Right, sorry, 3 billion distinct opportunities to request/return an operation per second, which is not the same thing as ops per second - my bad.


But we're talking like 100 ns for a round trip here. It's not "one hundred cycles" as Putnam threw out for a random example, it's certainly not 30 cycles as in your example, it's MULTIPLE HUNDREDS of cycles.


Op time and data transfer windows are tiny fractions of the total time the CPU spends waiting. Nobody talks about this part because there isn't dick anyone can do about it, because we haven't managed to make electron wave propagation any faster over those kinds of distances yet, but when you hear things like "accessing data in RAM is an order of magnitude slower than accessing data in the L2 cache" this is what they are talking about. Adjusting the RAM clock is a fractional percent difference here.


That kind of small optimization -really- adds up when you're performing tens of thousands of operations in series - I am not saying RAM timings do not matter for performance - but the moment that one operation depends on the previous one completing, it becomes a very meager difference. Not all memory I/O problems are created equal, hence me talking about 'latency' vs. 'bandwidth' in network terms.


Here is how this relates to the CPU / RAM timing test:

Decrease CPU clockspeed - Program moves slower. But you have introduced other opportunities for bottleneck. A program that was not CPU bound before can now become CPU bound.

We have proven nothing with this test, save that there is a point at which CPU speed can be a limiting factor for DF performance, but this shouldn't really be a surprise to anyone.


Turn the CPU clockspeed back up, and adjust RAM timings - Program stays the same speed. But as we've established, if the RAM I/O problem you are having is one of data latency, and not one of bandwidth, adjusting the timings will have near 0 effect.

We have also proven nothing with this test, other than that our problem probably wasn't one of RAM bandwidth.


Hence, we have not proven that the program was CPU bound or memory bound - those observations are consistent with a scenario in which your problems with performance are a result of conditional execution against data in RAM. Given that Putnam has also talked a bit about how conditionals in RAM are a big problem for DF performance, I'm inclined to interpret that as the explanation until evidence to the contrary is presented.


3
A clock speed of 3200 mhz means you get 3 billion operations in a second, but no force of nature can make electric potential travel from your CPU to RAM and back again in a third of a nanosecond, as that would violate general relativity.


When you talk about RAM latency, you're usually talking about the dead interval between send/receive, which is determined by clock speed as you say.
I'm a network guy, so when I say "latency" I mean signal latency, not time RAM spends waiting to communicate. Sorry for the mismatch in terminology.

Signal latency is hard locked here to the amount of time it takes charge to propagate from RAM to CPU - not sure on exact specifics but it'd take a photon about a nanosecond to round-trip so it's going to take an electron a fuck of a lot longer (as elementary particles measure such things, at least).


Upshot is that if you have a bunch of operations to execute in RAM in series, clock speed will make a massive difference, but if you have conditional shit that needs a round trip to CPU to decide what to do next there's an upper limit to how much clock speed can help.

4
DF Dwarf Mode Discussion / Re: mist generators: are they worth it?
« on: January 22, 2023, 10:24:55 pm »
I don't think you can reasonably claim to have a "normal" fortress if happy thoughts and bad thoughts are breaking even with each other.


There are very few bad thoughts that come from things outside of player control - like, getting into arguments I guess is one, and sure sometimes dwarves die but that shouldn't be a source of continuous bad thoughts.


Everything you do for your dwarves gives them an opportunity for a positive thought or at least prevents a negative one, so the overwhelming majority of your dwarves should have a positive thought balance at any given time, with only a few malcontents having majority neutral/negative thoughts.



That having been said - a mist generator isn't better than, say, making sure your dwarves have clothes. Both are optional, both have rather extreme effects on the mood of a fort over time, but neither one is going to make up for your dwarves living in squalor while surrounded by their friends' corpses / ghosts.

5
Is throwing more cores at a highly CPU bound problem really being obtuse or just a logical conclusion? Optimizations are always good but at a certain point your choices really are 'do less' or 'more compute'

You're assuming it's CPU bound rather than memory bound.
It's more of an educated guess seeing as the utilization of one core is always 100% when playing the game.

In the interests of being thorough though I went and underclocked my CPU to 2ghz from 3.7ghz and my FPS went from 38 to 20
And then returning the CPU to normal underclocked my ram from 3200mhz to 1600mhz and found my FPS went back to 38

underclocking your RAM that much isn't going to matter much when it takes well over 100 clocks to even get to the RAM in the first place
Doubling the amount of time you have to wait for a response from your RAM chip isn't going to make a difference to a task that's bottlenecked by RAM?

If you dispatch a porter who travels one mile per hour one mile away, it'll take 2 hours for a round trip.

If you dispatch 6 porters over the course of 1 hour, you'll get back 6 times as much shit, but your shit still won't arrive until 2 hours post dispatch no matter how frequently you are sending the porters.

Case in point: sometimes you don't get to decide WHERE to ask for information in RAM until after you get a response back FROM RAM. Ergo, if your bottleneck is latency, your bottleneck is latency, and clockspeed won't do shit for you (within sensible ranges).
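
The porter analogy can be put into numbers with a small sketch. The function names are hypothetical, and the even ten-minute dispatch spacing is an assumption about what "6 porters over the course of 1 hour" means.

```python
def round_trip_hours(distance_miles: float, speed_mph: float) -> float:
    """Time for one porter to go out and come back."""
    return 2 * distance_miles / speed_mph


def pipelined_finish_hours(n_trips: int, round_trip: float, dispatch_interval: float) -> float:
    """Independent trips: a new porter leaves every dispatch_interval hours."""
    return (n_trips - 1) * dispatch_interval + round_trip


def dependent_finish_hours(n_trips: int, round_trip: float) -> float:
    """Dependent trips: the next destination is unknown until the last porter is back."""
    return n_trips * round_trip


trip = round_trip_hours(distance_miles=1, speed_mph=1)
print("first delivery after", trip, "hours")
print("6 independent trips done after", pipelined_finish_hours(6, trip, 1 / 6), "hours")
print("6 dependent trips done after", dependent_finish_hours(6, trip), "hours")
```

With these numbers the first delivery takes 2 hours either way. Six independent porters are all back in under 3 hours, while six trips that each depend on the previous answer take 12 hours, and dispatching faster does nothing to shorten that.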

6
DF Suggestions / Re: Possibility of AI integration in DF?
« on: January 20, 2023, 01:42:28 am »
No.

Machine learning would be antithetical to everything that makes DF such an amazing example of procedural generation in game design.

DF is all about hand built proc-gen - proc-gen derived from observed fundamentals.

AI is all about the machine deriving patterns from an input set for you and coming to its own conclusions - asking for machine learning in Dwarf Fortress is like asking why someone wouldn't just take a helicopter up to the peak of Mt. Everest, it sort of misses the point of the whole endeavor.

7
DF Dwarf Mode Discussion / Re: Planter priorities
« on: January 18, 2023, 07:19:26 pm »
There isn't and it's not an oversight.

I am just absolutely blowing a gasket over here trying to figure out what you were doing that somehow turned this into a problem - like you really do not need that many farm plots, and they don't take very long to construct. Like, hell, you can only have 200 seeds for a given crop to plant at any given time anyway. Are you trying to just plant a dozen 10x20 plots right out the gate or something? Even with that it seems like you should be able to finish those plots before it screws you...

8
DF Dwarf Mode Discussion / Re: Planter priorities
« on: January 18, 2023, 03:15:36 pm »
What actual problem are you having?

If a dwarf could theoretically choose between two tasks you have no way of telling them which task to pick (with the exception of priority designations and workshop jobs).

It's sort of part of the soul of the game that you have limits on how much control you can exert over your dwarves, so generally the better plan is to achieve your goal in some way that doesn't involve explicit instructions.

9
DF Dwarf Mode Discussion / Re: Should Size matter more?
« on: January 18, 2023, 01:03:22 pm »
I think the reason it matters more is because dwarves train so damned fast now.


Not that I'm complaining -exactly-, dwarves being able to train to at least 'professional' in combat skills in a reasonable amount of time is really important to make dwarf military seem worth it, but it does feel a little ridiculous to have ~20 legendary dwarves with like 2-3 years of training.


As far as combat skill vs. size - yeah I absolutely think a legendary warrior should make mincemeat out of an elephant. That makes sense, it's a high fantasy game. Would you bet on Guts, or a large tiger?


But also a 1200 year old dragon should probably ALSO be legendary in everything, and probably it should be a bit more difficult to have an army of legendary swordsdwarves.

10
Nah, it's probably just my brain lying to me about the past, that happens.

Regardless, it seems like the kind of thing that would be cool if it existed - but you'd need to slow down the game a lot for it to be enjoyable.

11
At some point you have to bound some aspects of what the simulation handles per tick, but that's not quite the same thing as saying that you have to bound what is possible.


There is a lot of trickery that you can get away with in terms of deferring processing or decreasing precision, as well as just "vanilla" optimizations to reduce the amount of work that gets done.
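
One generic example of deferring processing is to spread an expensive update across several ticks instead of running it for everything on every tick. The sketch below is a common game-programming pattern and not a claim about how DF works; all names are hypothetical.

```python
def due_this_tick(entity_ids: list[int], tick: int, stride: int) -> list[int]:
    """Entities whose expensive update runs on this tick.

    Each entity is handled once every `stride` ticks, so the per-tick cost
    is roughly 1/stride of updating everything.
    """
    return [e for e in entity_ids if e % stride == tick % stride]


entity_ids = list(range(10))
for tick in range(4):
    print(tick, due_this_tick(entity_ids, tick, stride=4))
```

The trade-off is precision: each entity reacts up to `stride - 1` ticks late, which is often invisible to the player for things that change slowly.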


So as a professional I strongly disagree with the idea that the ONLY way to prevent lag is to limit the player - there's a big toolbox to draw from there. Also, the idea that FPS death is an inevitability with the game as it currently stands is totally ridiculous and abjectly false.


However, I completely agree that armchair software development is at best a futile effort - it's pretty arrogant to assume that you know THE solution to a problem without ever seeing the project source. I don't think there's anything wrong with theorizing possible causes of slowdown per se, but things like "oh well you really just need to use multithreading" are obtuse and condescending on a good day.

12
The new labor system is phenomenal.


There are a few clunky pieces still, but a fortress will "just work" without heavy handed intervention, and the process of tweaking things is just so much smoother with this paradigm. Just a few more little tweaks and dwarf therapist will be useless in comparison.


The soundtrack is great and the contextual music system is just absolutely fantastic.


The creature details screen and all the various subscreens are massive improvements, much better presentation of information.


Being able to see at a glance whether or not my military dwarves have actually equipped at least some of their uniforms yet is a welcome change - being able to recognize a specific dwarf is also just such a great little addition.


As much as I am eager to see the return of a persistent log, the new alerts system is actually really really great.


The organization of menus is much, much better.

Zones is a better system than rooms were. Just light years easier to use and more sensible/approachable.


Scroll wheel to navigate z-levels is awesome. Being able to see multiple z-levels at once is likewise awesome.

The settings menu instead of digging through config files rocks.

The new save system is great.

Honestly everything that was added or changed was absolutely fantastic. I miss some things that haven't come back in yet (full keyboard controls and persistent logs, mostly) but all in all the game is just way more fun to play than it ever has been before.

13
Perhaps I'm misremembering, but I thought for sure the "you absolutely must be paused to see the logs at all" was new. Like I don't remember the 'workflow' of open log, read, advance time, open log...

Maybe keyboard shortcuts just blended that all together in my head? A lot of that stuff does become kind of autopilot once you get used to it.

The organization thing is only really a "special" beef now due to not having persistent logs - it's easy to accidentally dismiss shit. It was always some detective work to piece that together, and I think I sort of understand why asking for a consolidated log isn't exactly a small ask given the whole multiple perspective thing.



14
DF General Discussion / Re: Future of the Fortress
« on: January 13, 2023, 08:45:11 pm »
A question to Putnam. What was your first impression on DF source code as a developer? Was it good? Was it OK? Was it bad?


If we have questions about code and we want someone to actually answer them in a forum where the sole author of said code is reading their response, it might be advisable to ask questions that can be answered more objectively.

Tarn's an amazingly chill human being, so maybe he'd be alright with being a little irreverent about code quality, but even so, this is his life's work - it's not like code you'd work on at your job that slowly grows into a monster that nobody really owns anymore, it's one guy's baby. Best case they laugh a little and give you an answer like: "Well there are some gnarly bits but that's to be expected" - which doesn't tell you much anyway.

If we're more specific and objective, we're likely to get a better response, and we don't run the risk of giving the impression that we're digging around for mud to sling - not saying that's what you're doing, just saying it's a little funky.

Here are some examples of specific, more objective questions about the code:

- What areas seem like they are most/least optimized?

- Are the techniques Tarn is using familiar patterns, or is this mostly novel?

- Is there anything interesting that you can tell us about how the project is organized?

- How approachable was the project? Do you feel like you're getting a handle on it or do you think you might be lost for a while?



15
The thing with the magma sea though is that the slowdown is always immediate - it's always like a sudden pop as soon as I breach it. I have frequently had the experience of being pinned at 100 FPS for an entire fort history and then dropping down below 50 as soon as that channel gets dug.

Caverns may or may not be the real culprit in any of my slowdowns - I sometimes seem to notice a bit of a hit around that time in the game, but it is never so drastic and it's not immediate, so it could be something else happening. The game is doing a hell of a lot every tick and I am deeply skeptical of any anecdotal observations about framerate, even my own - but the relationship with large water features and the magma sea is just too immediate for me to ignore.
