Topic: Dwarf Therapist v42.1.5 | DF 50.12 (Read 399465 times)

thistleknot · « **Reply #930 on:** August 23, 2020, 05:52:26 pm »

In case anyone is curious how I arrived at the median normalization I proposed (I've successfully moved from staging stuff in Excel to R):

This is the code in R (screenshot with a picture of how it works). This method replaces ECDF and the prior "s transform methods" in one go. Why? Because it combines distance with median calculations, which makes the flat distribution curve of the ECDF redundant.

It's the best of both worlds: it arrives at an almost exact .5 mean, with a min of ~0% and a max of ~100%.

One thing I don't know is whether my stratified sdev and its use in pnorm are the best approach, but what was most important for me was a .5 mean and near 0 and 100%. I see in this case min is almost 10%, but if I had to err I'd prefer to err on the tail ends, as this is meant to be used in a weighted sum model (akin to Euclidean distance). I could stretch the refactoring process out or decrease the standard deviation, but averaging the zscore sums as a refactoring base seems to be good enough for my purposes.

https://imgur.com/gallery/d9iYI8d

https://github.com/thistleknot/medianNormalization/blob/master/medianNormalization.R
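For readers who don't use R, here is a minimal Python sketch of the general idea described in this post: center on the median, measure spread separately on each side, and pass the z-scores through a normal CDF (what `pnorm` does in R). It is not a port of the linked script. The reading of "stratified" as "one RMS spread per side of the median", the function names, and the sample data are all assumptions made for illustration.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal CDF, the equivalent of R's pnorm(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def stratified_normalize(values):
    """Map values to 0..1 with the median at 0.5 and a separate spread per side."""
    center = statistics.median(values)
    below = [center - v for v in values if v < center]
    above = [v - center for v in values if v > center]
    # Assumption: "stratified" spread = RMS distance from the median on each side.
    sd_low = math.sqrt(sum(d * d for d in below) / len(below)) if below else 1.0
    sd_high = math.sqrt(sum(d * d for d in above) / len(above)) if above else 1.0
    ratings = []
    for v in values:
        sd = sd_low if v < center else sd_high
        ratings.append(normal_cdf((v - center) / sd))
    return ratings


sample = [2, 3, 4, 5, 6, 9, 15, 30]  # made-up, right-skewed data
ratings = stratified_normalize(sample)
print([round(r, 3) for r in ratings])
print("mean rating:", round(statistics.mean(ratings), 3))
```

Because each side is scaled by its own spread, a long right tail does not squash the values below the median, which is roughly the property the post is after.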

Clément · « **Reply #931 on:** August 24, 2020, 04:42:59 am »

Quote from: thistleknot on August 23, 2020, 10:19:07 am

Stratified MAD (+ factors) should be capable of replacing everything.

I had the impression it worked on almost normal distribution. I've always seen you testing it on attributes data. What about skills? personality traits? needs? roles?

Quote from: thistleknot on August 23, 2020, 10:19:07 am

As to per role, I compare across roles. I see what your point is if you're doing it job by job, but that was something I tried to overcome (I was doing that in spreadsheets way back). The point of the labor optimizer was to start from a single list of %'s, begin at the highest, and work down (across roles).

The way this greedy algorithm works makes it even more important that ratings can be compared across different roles. Although it is very subjective to compare roles. Which is better, a farmer or a butcher?
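To make the "single list, start from the highest and work down" idea concrete, here is a small Python sketch of a greedy pass over (dwarf, role) ratings. It is not the labor optimizer's actual code. The function name, the one-role-per-dwarf rule, the slot counts, and the ratings are assumptions chosen for illustration.

```python
def greedy_assign(ratings, slots):
    """Assign roles by walking one list of ratings from highest to lowest.

    ratings: dict mapping (dwarf, role) -> rating in 0..1
    slots:   dict mapping role -> how many dwarves are wanted for it
    Assumption: each dwarf receives at most one role.
    """
    remaining = dict(slots)
    assigned = {}
    ordered = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    for (dwarf, role), _rating in ordered:
        if dwarf in assigned or remaining.get(role, 0) == 0:
            continue  # dwarf already busy, or role already full
        assigned[dwarf] = role
        remaining[role] -= 1
    return assigned


ratings = {
    ("Urist", "Farmer"): 0.90,
    ("Urist", "Butcher"): 0.80,
    ("Kadol", "Farmer"): 0.85,
    ("Kadol", "Butcher"): 0.30,
}
assignment = greedy_assign(ratings, {"Farmer": 1, "Butcher": 1})
print(assignment)  # {'Urist': 'Farmer', 'Kadol': 'Butcher'}
```

In this made-up case the greedy pass gives Urist the farmer slot and leaves Kadol with a 0.30 butcher rating, even though swapping them would score higher overall. The pass trusts that 0.90 in one role really is better than 0.85 in another, which is why cross-role comparability matters so much here.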

Quote from: thistleknot on August 23, 2020, 11:15:41 am

I did a really cheap hack to work around my aspect concerns.

We still don't know if there is a bug here.

Quote from: thistleknot on August 23, 2020, 01:56:35 pm

I forget how skill rust was calculated into role ratings if at all. Do you know? Is it counted at all? Is there a penalty applied (like potential) for skills that are experiencing rust?

From what I can read, I don't think so. Skill rate is used but not skill rust.

Quote from: thistleknot

[...lots of personal messages with numbers and formulae...]

Sorry, I did not read all of it. I cannot comment on that any way. I can translate your math to C++, but I don't exactly know what you are doing. I fear you are overthinking it. Don't forget the goal is to help the player find the best dwarf, not to generate perfect numbers that only a statistician can appreciate.

thistleknot · « **Reply #932 on:** August 24, 2020, 01:37:05 pm »

I definitely am overthinking it

The "bug" wasn't so much the nurse role. I still think it was related to skills reporting 0, but for some reason the logs stopped showing zero after that change, which I know wasn't checking for 0

The change in the code is to exclude aspects with no elements from the weighted sum algo
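A rough Python sketch of what "exclude aspects with no elements from the weighted sum" can look like. This is not the Dwarf Therapist source. The function name, the data layout, and the fallback value for a role with no usable aspects are assumptions.

```python
def role_rating(aspects):
    """Weighted average over aspects, skipping aspects that have no elements.

    aspects: list of (weight, ratings) pairs, where ratings is a list of
             0..1 values (e.g. attributes, skills, traits of one role).
    """
    total = 0.0
    weight_sum = 0.0
    for weight, ratings in aspects:
        if not ratings:
            continue  # empty aspect: contributes neither value nor weight
        total += weight * (sum(ratings) / len(ratings))
        weight_sum += weight
    # Assumption: a role with no usable aspects rates 0.0.
    return total / weight_sum if weight_sum else 0.0


aspects = [
    (1.0, [0.8, 0.6]),  # e.g. attributes
    (2.0, []),          # e.g. skills, none defined for this role
    (1.0, [0.5]),       # e.g. traits
]
rating = role_rating(aspects)
print(round(rating, 3))
```

The point of skipping is that the empty aspect's weight also stays out of the denominator. Counting it as a 0 average with weight 2.0 would drag this example down to about 0.3 instead of about 0.6.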

Thanks for being patient w my neurotic stats rants

As to the method it can replace

The minmax around mean/median and merge w ecdf

I'd be wary of using it on skills

thistleknot · « **Reply #933 on:** August 24, 2020, 08:41:45 pm »

I was thinking about it (as I always do) and it was kind of bothering me that I made the method non linear. For our purposes it didn't really matter, but for machine learning (I was sharing this method on linkedin) it's not a "good" method as it's not linear. That's what standard deviations normally do. They work with a mean which has the statistical property of averaging to 0 (i.e. average z score is 0 when using the mean). But that's not the case with median.
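A quick Python check of the property mentioned above, on made-up skewed data: z-scores centered on the mean average to 0, while the same values centered on the median do not. The helper name and the sample are assumptions. The same standard deviation is used in both cases so that only the center changes.

```python
import statistics


def z_scores(values, center):
    """Z-scores around an arbitrary center, using the population sdev."""
    sd = statistics.pstdev(values)
    return [(v - center) / sd for v in values]


skewed = [1, 2, 3, 4, 20]  # mean is 6, median is 3

avg_z_mean = statistics.mean(z_scores(skewed, statistics.mean(skewed)))
avg_z_median = statistics.mean(z_scores(skewed, statistics.median(skewed)))

print("average z around the mean:  ", round(avg_z_mean, 6))
print("average z around the median:", round(avg_z_median, 6))
```

The second average comes out positive because the long right tail sits further from the median than the left side does.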

So I was thinking: how could I make that happen? I visualized what I would need to do to the curves. The green method is pretty much the method you coded, but with rescaled ends (based on the means of the <0 and >0 zscores instead of their sums); it is non linear, which is fine for game purposes. The blue and orange methods are linear versions of it that also take the scaling I mentioned into consideration. Actually, blue doesn't do any scaling, but orange does.

What blue says is: look, it's more important to map the 25% and 75% points (the avg z scores of <0 and >0), derive a new centerpoint from them for a new set of z scores, and be happy that the median is still close enough to the middle, even though it won't be exactly 50%. The benefit is a 50% mean. It's kind of a two pass method. The first pass uses a stratified mean to find these 25/75% points, and then derives a new sdev.

Orange scales both ends (similar to green, using the means of the <0 and >0 zscores) before deriving a 25% and 75% point. This comes at the cost of not having exactly a 50% mean, but the median will be 50%. It's close though for large distributions. Orange derives a new sdev, but uses the median as the centerpoint. You can think of orange as the green method, but linear.

https://imgur.com/a/XAuefhw

https://github.com/thistleknot/medianNormalization/commit/6babec227e79fcd9a9ae16ccf86ce2b93434b232

thistleknot · « **Reply #934 on:** August 25, 2020, 05:31:25 am »

Round 3
Magenta

Linear, mean/median = 50%
No rescaling is done (an issue with the orange method). Uses median as center point, 25/75% are fluffed but used as a determination of a new sdev which allows for a linear method around median and achieving a 50% mean (in this extremely skewed 3 point sample, 50% mean is almost achieved)

https://imgur.com/gallery/30uNs74

https://github.com/thistleknot/medianNormalization/commit/2df745e6fe5c27e04dd293f491d09abcbb8baab0

Before I was leaning towards blue or orange, but now I'm team magenta

Blue because the overall average was .5 w median near .5

Magenta average ~.5 and median is .5

Idk. Green still seems to be good enough for our purposes. It's basically two linear minmaxes. But you can see them all side by side. I'd be happy with any method tbh

thistleknot · « **Reply #935 on:** August 25, 2020, 07:59:44 pm »

After hammering things out a bit more, I've come to realize that the problem isn't linear. That's why I can't arrive at .5 mean as well as median unless both sides of the median have a symmetrical average absolute z score which doesn't happen unless I transform values (and hence a non linear curve). The best solution would be blue for a linear solution (if desiring a .5 mean) else magenta (if desiring median at .5).

So I was thinking, fine I'll go with the green method which is a non linear transform, but that too has a problem by focusing on the average of either side which has an unforeseen consequence of reducing the tails away from their extremes...

So I thought, fine I'll go back to sum of z scores (i.e. sum(abs(z<0)) & sum(z>0)) and rescale based on that average***

I did try a median based standard deviation instead of the stratified one, and I ran into the initial issue I'd had and forgotten: too many values ended up with extreme z scores (>3), which had me wanting to fall back to the stratified method...

***which is the formula you coded for me as this has better ends.

Which is full circle back to the initial proposal

thistleknot · « **Reply #936 on:** August 26, 2020, 08:22:13 pm »

old idea I had for melee dwarf and ranged dwarf

http://www.bay12forums.com/smf/index.php?action=post;msg=4054602;topic=122968.30

Clément · « **Reply #937 on:** August 27, 2020, 03:59:12 am »

I guess the proper link is: http://www.bay12forums.com/smf/index.php?topic=122968.msg4054602#msg4054602

I don't understand, do you want new roles with all the skills? Or do you want to modify the current military roles to include more skills?

I don't mind a complete overhaul for the military roles. The current roles are incoherent: Ambusher and Pikeman include armor/shield skills, but that's not the case with other weapon skills. Hunter is the role that includes only Ambusher (it is the role linked to the hunting labor).

thistleknot · « **Reply #938 on:** August 27, 2020, 09:22:28 pm »

Geezus I can't even link right

I believe what I was getting at was the ability to create custom subgroupings and assign weights to the group as opposed to weights to just aspects then individual items within the aspect. This may be achievable outside of changing code though (by creating the groups in something like excel and creating weights and then copying just the subweights over)

Another idea I had was OR logic for skills (swordsman or macedwarf, and report highest? Maybe a max). Idk if the script editor can do this, it might.
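A tiny Python sketch of the two ideas above, OR logic as a max over alternative skills and weights attached to whole groups. This says nothing about what the Dwarf Therapist script editor supports. The function names, group weights, and ratings are made up for illustration.

```python
def best_of(skill_ratings, alternatives):
    """OR logic: report the highest rating among interchangeable skills."""
    return max(skill_ratings.get(name, 0.0) for name in alternatives)


def grouped_rating(groups):
    """Weighted average of group scores; groups is a list of (weight, score)."""
    weight_sum = sum(weight for weight, _score in groups)
    return sum(weight * score for weight, score in groups) / weight_sum


dwarf = {"Swordsman": 0.35, "Macedwarf": 0.80, "Dodger": 0.60}

weapon = best_of(dwarf, ["Swordsman", "Macedwarf"])  # 0.8, the better weapon
melee = grouped_rating([
    (2.0, weapon),           # hypothetical weight for the weapon group
    (1.0, dwarf["Dodger"]),  # hypothetical weight for the defense group
])
print(weapon, round(melee, 3))
```

With a max, a dwarf who is good with one weapon is not penalized for being untrained in the others, which an average over all weapon skills would do.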

I'll have to look into it this weekend.

BTW I was thinking of creating an R version of how I thought roles could be written (vs spreadsheets) (at least traits, skills, attributes & maybe preferences) and remove irrelevant methods (no more minmax and no more rankecdf) in hopes to better assist in visualizing streamlined proposed code reductions.

But idk if I want to go down a 🐇 hole on skewed skills right now. I know rankecdf was used for skills somehow to score values equal to 0 to <~50% or something like that.

I really like kernel density estimates (kde) and they might be a good solution for skills. Tbh I haven't really looked at skewed solutions in R, but they certainly exist. Who knows, maybe this median method would work well, idk, but I haven't tried a comprehensive reduction/demo over all methods yet since this new median method

thistleknot · « **Reply #939 on:** August 30, 2020, 11:13:06 am »

I didn't want to leave anyone hanging on some R version of things. My damn elbow got infected and laid me out AND I lost my zfs partition due to a fire scare when I disconnected everything in a hurry (NAS), but I exported some CSVs and imported them into R to try some things...

So here is my analysis.

Stratified MAD might work better for some scenarios (like dice rolls), but it still is a bad choice for skills.

Using a basic median in place of mean to derive a standard deviation is more than enough for attributes. When I did that I got near .5 mean average and median was .5.
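As a rough illustration of "a basic median in place of mean to derive a standard deviation", here is a Python sketch. It is not the R analysis itself. The function names, the choice of a population-style (divide by n) deviation, and the attribute values are assumptions.

```python
import math
import statistics


def median_sd(values):
    """Root-mean-square deviation measured from the median instead of the mean."""
    center = statistics.median(values)
    return math.sqrt(sum((v - center) ** 2 for v in values) / len(values))


def median_normalize(values):
    """Z-score around the median, then the normal CDF (R's pnorm)."""
    center = statistics.median(values)
    spread = median_sd(values)
    if spread == 0:
        return [0.5 for _ in values]  # all values identical
    return [0.5 * (1.0 + math.erf((v - center) / (spread * math.sqrt(2.0))))
            for v in values]


attributes = [450, 700, 900, 1000, 1100, 1250, 1500, 2100, 3000]  # made-up values
attr_ratings = median_normalize(attributes)
print([round(r, 3) for r in attr_ratings])
print("mean rating:", round(statistics.mean(attr_ratings), 3))
```

The median value always lands at exactly 0.5 here, since its z-score is 0. How close the mean rating gets to 0.5 depends on how skewed the data is.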

The only reason the initial method does this weird thing with merging a minmax with a mean and median is because it helped achieve a .5 mean, but the mean isn't important at all for what we hope to achieve (A 50/50 split) and using a median based approach is more than adequate.

Having said all that. I don't really have a solution for skills. It seems pretty easy what I need to do.

Count only values greater than 0,
derive %'s for that, divide by 2,
and add .5

then convert the 0's to .5
then find average-.5
then adjust the entire set by this amount to bring the skills within a .5 mean.
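The steps above can be sketched in Python as follows. The "derive %'s" step is left open in the post, so this sketch plugs in a plain empirical CDF over the nonzero values purely as a placeholder. The function name and the sample skill levels are also assumptions.

```python
def normalize_skills(levels):
    """Rate skill levels so that the ratings average to 0.5.

    1. Percentile among nonzero values only (placeholder: empirical CDF).
    2. Halve it and add 0.5, so trained skills land in (0.5, 1.0].
    3. Untrained skills (0) start at 0.5.
    4. Shift everything by (average - 0.5) so the mean becomes 0.5.
    """
    if not levels:
        return []
    nonzero = [v for v in levels if v > 0]

    def pct(v):
        # Share of nonzero values that are <= v.
        return sum(1 for x in nonzero if x <= v) / len(nonzero)

    scores = [pct(v) / 2 + 0.5 if v > 0 else 0.5 for v in levels]
    shift = sum(scores) / len(scores) - 0.5
    return [s - shift for s in scores]


levels = [0, 0, 0, 0, 0, 0, 1, 3, 3, 8]  # made-up, mostly untrained
skill_ratings = normalize_skills(levels)
print([round(r, 4) for r in skill_ratings])
```

After the shift, the untrained dwarves end up somewhat below 50% and every trained dwarf stays above them, with the set as a whole averaging 0.5.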

The question is what normalization method to use for skills. I believe right now it's using that min/max around mean/median merged with Empirical Cumulative Distribution Function.

That's where I'm kind of stuck right now. I looked at Kernel Density Estimates but am not done. I still feel like crap atm.

I'm looking at this atm
https://rdrr.io/cran/sn/man/dsn.html

nxsnexus · « **Reply #940 on:** September 03, 2020, 03:51:52 am »

Hi!

The first post states that:

Quote

Compatible with Dwarf Fortress from 0.42.06 to 0.47.04 (some versions may be missing memory layout depending on the operating system).

Currently, I'm running the latest Lazy Newb Pack version (with DF 0.47.04) on Windows 10 64-bit and Dwarf Therapist doesn't work. The error message flashes at the bottom right before I close DT. The error message is related to an invalid/missing memory layout. I tried downloading the very latest version of DT, same issue. I just cannot connect it to my current game.

As it's written in the first post, this issue seems to be known but no workaround is provided. I tried searching this topic and found several suggestions (like going to the data folder and deleting everything in the memory_layouts folder) but without success. I also checked the "data/memory_layouts/windows" folder where the DT executable is, and I can find the "v0.47.04_graphics_win64.ini" file there.

What can I do to make it work?

Clément · « **Reply #941 on:** September 03, 2020, 04:20:03 am »

The error message when closing DT reminds me of this issue: https://github.com/Dwarf-Therapist/Dwarf-Therapist/issues/227

Try disabling the automatic updates.

nxsnexus · « **Reply #942 on:** September 03, 2020, 07:42:54 am »

I managed to connect and read my current game by disabling the auto-update. Thank you very much!

Uthimienure · « **Reply #943 on:** October 20, 2020, 01:37:56 pm »

I'm curious why the names of artifact and named weapons are different in Therapist than they are in DF's military equipment screen.

Clément · « **Reply #944 on:** October 21, 2020, 05:02:12 am »

Thanks for the bug report, it will be fixed in the next version: https://github.com/Dwarf-Therapist/Dwarf-Therapist/pull/231
