Bay 12 Games Forum

Dwarf Fortress => DF Modding => Utilities and 3rd Party Applications => Topic started by: PeridexisErrant on August 14, 2014, 09:50:01 am

Title: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 14, 2014, 09:50:01 am
There's been a lot of talk lately about ways to make combining small mods easy for new players.  To me that means a GUI or GUIs, and given the proliferation of such things around here a top priority is a sensible standard way to package mods. 

Here's one of many mod manager discussions (http://www.bay12forums.com/smf/index.php?topic=140771.msg5548935#msg5548935).  TLDR is that everyone likes the idea of building up something like a starter pack for mods, where new people can get their toes wet in a GUI that can stack mods.  The plot is basically to get a launcher, probably the PyLNP, and then take over the world... er, start including optional and easy mods in everything.

Regarding format, I think we need to nail this down soon.  The longer we leave it, the more likely it becomes that the community ends up with multiple patch formats... So my proposal is: 

- Each distinct mod gets its own folder, named whatever the mod is called.
 - In that folder, include the raw folder structure with any files changed from the vanilla raws; plus at the top level other stuff (eg readme, random files and folders) that can be ignored without error. 
 - Possibly also a manifest file if this takes off and needs extending to more arcane cases. 
 - One of the mod folders is the full vanilla raws; that folder is named identically to the version it came as, eg "df_40_08" (omit OS as it's irrelevant to raws).  Folder names starting "df_*" are reserved, case-insensitive. 
 - A *real* patch is derived when needed by running a diff between the latest vanilla folder (autodetected) and the mod.
 - A folder "raw" is created (another reserved name; the old one is deleted if present) and the vanilla raws are copied in.
 - Diffs are then applied to this folder in the specified order.
 - If there's a conflict, highlight the problematic mod in the list and delete the raw folder (no invalid mods created here!).  This ensures compatibility and acts as a live preview for feedback.


 - Mods with modules, dependencies, dfhack, etc: all tricky.  I propose ignoring that for now; complex behaviour and handling comes after the basics are working and there's some interest from both modders and users.  Suggestions welcome anyway.
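
A rough sketch of the build steps listed above, in Python since that is what the PyLNP uses. To stay short it detects conflicts per file (two mods changing the same file is a conflict) instead of applying a real line-level patch, so it is stricter than the proposal. The function names, the "df_<major>_<minor>" name pattern, and the assumption that every folder (vanilla included) holds a "raw" subfolder are illustrative guesses, not part of any existing tool.

```python
import filecmp
import re
import shutil
from pathlib import Path
from typing import List, Optional

# Reserved vanilla folder names such as "df_40_08" (case-insensitive).
VANILLA_RE = re.compile(r"^df_(\d+)_(\d+)$", re.IGNORECASE)


def latest_vanilla(mods_dir: Path) -> Path:
    """Return the df_* folder with the highest version number."""
    found = []
    for p in mods_dir.iterdir():
        m = VANILLA_RE.match(p.name)
        if p.is_dir() and m:
            found.append(((int(m.group(1)), int(m.group(2))), p))
    if not found:
        raise FileNotFoundError("no df_* vanilla folder found")
    return max(found, key=lambda item: item[0])[1]


def changed_files(vanilla: Path, mod: Path) -> List[Path]:
    """Relative paths under mod/raw that are new or differ from vanilla."""
    root = mod / "raw"
    out = []
    for f in sorted(root.rglob("*")):
        if f.is_file():
            rel = f.relative_to(root)
            base = vanilla / "raw" / rel
            if not base.exists() or not filecmp.cmp(f, base, shallow=False):
                out.append(rel)
    return out


def build_raws(mods_dir: Path, order: List[str]) -> Optional[str]:
    """Rebuild mods_dir/raw from vanilla plus the mods in `order`.

    Returns None on success, or the name of the first conflicting mod,
    in which case the half-built raw folder is deleted again.
    """
    vanilla = latest_vanilla(mods_dir)
    target = mods_dir / "raw"
    shutil.rmtree(target, ignore_errors=True)  # delete the old build if present
    shutil.copytree(vanilla / "raw", target)
    touched = set()
    for name in order:
        mod = mods_dir / name
        files = changed_files(vanilla, mod)
        if touched.intersection(files):  # file-level conflict (simplification)
            shutil.rmtree(target)
            return name
        for rel in files:
            dest = target / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(mod / "raw" / rel, dest)
        touched.update(files)
    return None
```

A launcher could call build_raws every time the user ticks or reorders a mod, and highlight whichever mod name comes back.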

Advantages:
 - such a basic format makes repacking easy; if someone distributes a preinstalled mod all you have to do is remove the other files (easy for a launcher to do)
 - inversely, it's easy to install such a packaged mod without tools just by drag-and-dropping it over a vanilla install
 - a Mod Starter Pack becomes possible, and easily user-extensible (the latter is the coolest part)

I'd love to hear feedback on this idea. 

For more info, see https://github.com/PeridexisErrant/Py-Mod-Loader
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 14, 2014, 10:31:53 am
I really like this idea, and it brilliantly handles the idea of removing a mod: you just rebuild from vanilla skipping the mod-pack you no longer want.

The sequence you described would be a great starting point, but collision handling needs to happen sooner rather than later because virtually ALL custom buildings and reactions need to go into entity_default.txt.  I think the responsibility for treading carefully in vanilla files should be placed on the modder, with some handholding in the launcher's documentation.

For example, it is more reasonable to tell the modder to translate his/her positioning hints into regular expressions than it is to ask the user (who may not know the first thing about raw tokens) if a partial match is valid.  I do not recommend expecting the launcher to pile on special handling for umpteen different situations (inserting custom buildings, changing gaits, adjust stone tile, etc etc etc etc); pick one or two general formats.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 14, 2014, 06:48:12 pm
I really like this idea, and it brilliantly handles the idea of removing a mod: you just rebuild from vanilla skipping the mod-pack you no longer want.
Exactly.  As well as the difficulty of removing a patch, it also avoids breaking things by removing a dependency for something else.  This way, every combination the user tries is explicitly checked and rejected if it won't work. 


The sequence you described would be a great starting point, but collision handling needs to happen sooner rather than later because virtually ALL custom buildings and reactions need to go into entity_default.txt.  I think the responsibility for treading carefully in vanilla files should be placed on the modder, with some handholding in the launcher's documentation.

For example, it is more reasonable to tell the modder to translate his/her positioning hints into regular expressions than it is to ask the user (who may not know the first thing about raw tokens) if a partial match is valid.  I do not recommend expecting the launcher to pile on special handling for umpteen different situations (inserting custom buildings, changing gaits, adjust stone tile, etc etc etc etc); pick one or two general formats.
Yeah, more intelligent handling of collisions is going to be required before too long; I absolutely agree that special cases should be kept to a minimum - or preferably avoided entirely.  To avoid incompatibility there may eventually be a canonical way of doing it, but for now keep everything as simple as possible so it actually happens!

That's why I don't think 'soon' is now - if the first version usually only allows one mod to substantially alter entity_default.txt, that's OK for the first version. 

-----------------------------------------------------

And an idea on how to handle mods that have been split up into modules:

Note that this is not a final idea, and for now each module can simply be treated as a standalone mod with an indicative name.  This is much easier to handle, and probably more elegant overall.  Despite the annoyance for modders, it might just be best to require all parts of split mods to be standalone and treat them as such.  Conflicts or dependencies should be caught by the launcher anyway.

 - Single folder for the whole mod
 - subfolders for each part of it, each of which would be valid (if useless/conflicting) as a standalone mod
 - some standard way of declaring what should be loaded by default, and in what order
 - user can treat the collection as a single mod, or choose to see the more detailed settings

I think that this will require a manifest, which should not be modifiable (except obviously with a text editor).  The user can adjust the settings, but this is only held until the launcher is closed.  I'm leaning towards xml format; the second part of the manifest being the load order for submodules.  Given that there is no top-level "raw" folder, a missing manifest will mean no modules are loaded and the collection is treated as an empty mod.  The manifest can also include metadata such as name of mod, author, link to source, and description (so non-collection mods may also want one, though it's optional). 
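
To make the manifest idea a bit more concrete, here is a minimal sketch of reading one with Python's standard xml.etree.ElementTree. Nothing about the layout is decided yet, so the element names (mod, name, author, source, description, modules, module) and the default="yes" attribute are pure assumptions for illustration. A missing manifest gives an empty load order, which matches the "treated as an empty mod" rule.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Manifest:
    name: str = ""
    author: str = ""
    source: str = ""
    description: str = ""
    # Submodules loaded by default, in load order.
    load_order: List[str] = field(default_factory=list)


def parse_manifest(text: Optional[str]) -> Manifest:
    """Parse manifest XML; None or empty text means an empty collection."""
    if not text:
        return Manifest()
    root = ET.fromstring(text)
    manifest = Manifest(
        name=root.findtext("name", default="").strip(),
        author=root.findtext("author", default="").strip(),
        source=root.findtext("source", default="").strip(),
        description=root.findtext("description", default="").strip(),
    )
    # Hypothetical layout: <modules><module default="yes">folder</module>...
    for module in root.findall("./modules/module"):
        if module.get("default", "yes") == "yes" and module.text:
            manifest.load_order.append(module.text.strip())
    return manifest


EXAMPLE = """
<mod>
  <name>Example Mod</name>
  <author>Somebody</author>
  <modules>
    <module default="yes">core</module>
    <module default="no">graphics</module>
    <module default="yes">extras</module>
  </modules>
</mod>
"""
```

The launcher would keep the user's per-session choices in memory and never write them back, so the file on disk stays as the modder shipped it.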

Hopefully, mods which have graphics options can put the main mod in a top-level "raw" folder - ASCII raws, graphics turned off - which will always be loaded, and then have modules which add in the graphics.  This would actually handle a general case of options with shared dependencies.  Thinking this through raises some issues about diffs; for consistency these mod-specific graphics should be diffed from vanilla but I can see that getting impractical or conflicted.

All of this more complex handling comes later though; for now a simple launcher that can do diffs and simply rejects conflicts will be a big step up.  First the simple, then the practical, then the possible, and last of all take over the world!

-----------------------------------------------------

Currently my biggest wish for mod-manager is some sort of state tracking. I.e. I don't like how it detects if mod is already merged in.

Second in wish list is this. Namely some smarter way of adding-removing everything and being raw aware. Imho there is no silver bullet: even multi-million programming industry did not crack this problem and there are so called "merge conflicts" that need to be sorted out by hand.

The rebuild-without-thing-to-remove approach handles this nicely, but only if the original set of mods and merge order is saved.  The simplest way to do this would be to create a text file in the built raw folder; "included_mods.txt".  First line includes a timestamp and ID for the launcher that created it, so people can detect if it was someone else's launcher.  Second line is the base DF raws, eg "df_40_08".  Subsequent lines give mods in order of application:  "first_mod_applied", "why do people put spaces in folder names", "thismodlast". 

I think that this is premature for this idea though, when we don't even have the basic concept implemented. 
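
For what it's worth, the state file described above is only a few lines of Python. This is a sketch under assumptions: the post does not say how the timestamp and launcher ID share the first line, so a single space between an ISO timestamp and the ID is my guess, and the function names are made up.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional, Tuple

STATE_FILE = "included_mods.txt"


def write_included_mods(raw_dir: Path, launcher_id: str, base: str,
                        mods: List[str],
                        now: Optional[datetime] = None) -> None:
    """Record the launcher, base raws and mod order inside the built raw folder."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # Line 1: timestamp and launcher ID (the separator is an assumption).
    # Line 2: base DF raws.  Lines 3+: mods in order of application.
    lines = [f"{stamp} {launcher_id}", base, *mods]
    (Path(raw_dir) / STATE_FILE).write_text("\n".join(lines) + "\n",
                                            encoding="utf-8")


def read_included_mods(raw_dir: Path) -> Tuple[str, str, List[str]]:
    """Return (launcher_id, base_raws, mods_in_order)."""
    text = (Path(raw_dir) / STATE_FILE).read_text(encoding="utf-8")
    lines = text.splitlines()
    _stamp, _, launcher_id = lines[0].partition(" ")
    return launcher_id, lines[1], lines[2:]
```

One mod per line means folder names with spaces survive untouched. Removing a mod is then just reading the list, dropping one entry, and rebuilding from vanilla with what is left.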

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 14, 2014, 09:25:13 pm
Although this is a crosspost.

I was thinking of a class definition for tagTokens.

And keeping track of tagTokens within entities would be a good start.

Spoiler: tagToken.h
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 14, 2014, 10:10:00 pm
Yeah, more intelligent handling of collisions is going to be required before too long; I absolutely agree that special cases should be kept to a minimum - or preferably avoided entirely.  To avoid incompatibility there may eventually be a canonical way of doing it, but for now keep everything as simple as possible so it actually happens!

That's why I don't think 'soon' is now - if the first version usually only allows one mod to substantially alter entity_default.txt, that's OK for the first version. 

Agreed, let's get something out the door.  The world will be there for conquering tomorrow.

And an idea on how to handle mods that have been split up into modules:

Note that this is not a final idea, and for now each module can simply be treated as a standalone mod with an indicative name.  This is much easier to handle, and probably more elegant overall.  Despite the annoyance for modders, it might just be best to require all parts of split mods to be standalone and treat them as such.  Conflicts or dependencies should be caught by the launcher anyway.

 - Single folder for the whole mod
 - subfolders for each part of it, each of which would be valid (if useless/conflicting) as a standalone mod
 - some standard way of declaring what should be loaded by default, and in what order
 - user can treat the collection as a single mod, or choose to see the more detailed settings

I think that this will require a manifest, which should not be modifiable (except obviously with a text editor).  The user can adjust the settings, but this is only held until the launcher is closed.  I'm leaning towards xml format; the second part of the manifest being the load order for submodules.  Given that there is no top-level "raw" folder, a missing manifest will mean no modules are loaded and the collection is treated as an empty mod.  The manifest can also include metadata such as name of mod, author, link to source, and description (so non-collection mods may also want one, though it's optional). 

This works, but there needs to be at least one of two failsafes, preferably both: (1) a way to declare dependencies (even if it's implemented as just a pretty-please note to the user) so that the diff can be made from the parent module, and/or (2) a way to gracefully handle attempts to diff things that just aren't there (non-fatal notification to the user, post v1.0 some kind of error handling as well).

Hopefully, mods which have graphics options can put the main mod in a top-level "raw" folder - ASCII raws, graphics turned off - which will always be loaded, and then have modules which add in the graphics.  This would actually handle a general case of options with shared dependencies.  Thinking this through raises some issues about diffs; for consistency these mod-specific graphics should be diffed from vanilla but I can see that getting impractical or conflicted.

See above.  I think it's perfectly fine for the game to vomit into the errorlog if you load the TWBT widgets for workshops you aren't using resulting in a bunch of unknown tags.  Explicit dependencies would help prevent that kind of thing.

All of this more complex handling comes later though; for now a simple launcher that can do diffs and simply rejects conflicts will be a big step up.  First the simple, then the practical, then the possible, and last of all take over the world!

Maybe a placeholder of just reading the manifest file and printing a notice that it was noticed.  Hook into that later for actual logic.

Currently my biggest wish for mod-manager is some sort of state tracking. I.e. I don't like how it detects if mod is already merged in.

Second in wish list is this. Namely some smarter way of adding-removing everything and being raw aware. Imho there is no silver bullet: even multi-million programming industry did not crack this problem and there are so called "merge conflicts" that need to be sorted out by hand.

The rebuild-without-thing-to-remove approach handles this nicely, but only if the original set of mods and merge order is saved.  The simplest way to do this would be to create a text file in the built raw folder; "included_mods.txt".  First line includes a timestamp and ID for the launcher that created it, so people can detect if it was someone else's launcher.  Second line is the base DF raws, eg "df_40_08".  Subsequent lines give mods in order of application:  "first_mod_applied", "why do people put spaces in folder names", "thismodlast". 

I think that this is premature for this idea though, when we don't even have the basic concept implemented.

Again, I think it's worth a placeholder here... generate the state/order file when building the raws as a debug tool, even if the tool itself has no idea how to parse it yet.

Does this sound like a reasonable starting point?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 14, 2014, 10:51:38 pm
note: I use tag/token interchangeably.

it seems like your whole idea is centered around the same concept I gleaned from using github.  Start with vanilla, then add to it and derive patches (really no different than software merging I guess, which is still something I just recently learned).  Basically, make whatever mod you want.

I think Putnam wanted to make the system for "packaging" the mods with patch files in the same manner.

That's what got me started with this mod tool merge idea.  Having to MANUALLY resolve conflicts is what I was thinking google-diff-match-patch might do 'auto-magically'.  I initially wanted to create a tool that patched raw files using that library, but... alas I decided to manually parse the files and keep track of tokens instead.  I would still like to implement the patch algorithm, but I think keeping track of an object's entire set of tags in their ordinal positions before and after merge should be enough to keep some sort of state tracking system for the object.

Anyway, when I realized why diff patches conflicted, I figured that if there was a special way to work with the raws, maybe we could isolate those merge conflicts; that's where my concept of tag tracking came in.  The idea is to keep track of an entire object's tag state from both sets of files being compared (say base vanilla against mylittleponymod).  One can then see entire new entities created, or entities modded, or deleted, etc.  It would be like a tag definition of the object before and after merge, like an additive/subtractive tag diff.
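
A small Python sketch of what such an additive/subtractive tag diff might look like. It is not the actual tagToken code (which isn't shown here); parse_objects and tag_diff are hypothetical names, the caller has to say which token starts an object (e.g. CREATURE), and it compares tags as plain sets of strings, so it ignores ordinal positions and repeated tags.

```python
import re
from typing import Dict, List, Tuple

# Anything between one pair of square brackets counts as a tag.
TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def parse_objects(text: str, kind: str) -> Dict[str, List[str]]:
    """Map object ID -> ordered list of tags, for objects opened by [kind:ID]."""
    objects: Dict[str, List[str]] = {}
    current = None
    for tag in TAG_RE.findall(text):
        head, _, rest = tag.partition(":")
        if head == kind:
            current = objects.setdefault(rest, [])
        elif current is not None:
            current.append(tag)  # tags before the first object are skipped
    return objects


def tag_diff(base: Dict[str, List[str]],
             modded: Dict[str, List[str]]) -> Dict[str, Tuple[str, List[str], List[str]]]:
    """Per object: (status, added_tags, removed_tags); unchanged objects are omitted."""
    report = {}
    for name in sorted(set(base) | set(modded)):
        old, new = base.get(name), modded.get(name)
        if old is None:
            report[name] = ("new", list(new), [])
        elif new is None:
            report[name] = ("deleted", [], list(old))
        else:
            added = [t for t in new if t not in old]
            removed = [t for t in old if t not in new]
            if added or removed:
                report[name] = ("modified", added, removed)
    return report
```

Two mods would then only collide when their reports touch the same object, which is a much narrower test than "both changed the same file".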
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 14, 2014, 11:30:44 pm
@Dirst:  Yep, sounds like we're on the same page. 

How's this for a set of goals?
v0.1 - basic logic in place, folders findable, etc.  Written in Python for PyLNP compatibility
       - can derive a diff correctly and create a set of raws with one mod-patch applied
v0.2 - installs multiple non-conflicting mods correctly
       - handles errors and conflicts gracefully
       - write log file with mod merge order etc
       - place holder handling for a manifest file (note existence)
v0.9 - GUI time!  Integrated as a tab in the PyLNP - get it functional in new context, refactor to fit, etc.
    .1 - implement manifest and use information from it in display (if present)
       - 'simplify mod folders' option ala LNP, deleting extra files (not readme etc, but eg rest of DF install)
v1.0 - start finding or soliciting or formatting some mods

In roughly that order.  1.x probably includes storing patches instead of changed files and maybe declaring the version of the raws to diff against on a per-mod basis in the manifest.  Smart handling for mods with dependencies and a way to intelligently resolve merge conflicts without user input would take us to 2.0, but we can discuss that kind of thing in more detail once we get to 0.9!

@Thistleknot: Yeah, it's a pretty similar concept to your patch ideas and not by coincidence :)

Putnam inspired the format by pointing out that a standard format is, duh, the vanilla raws.  This has the important advantage over distributing just diffs that it's a standard format and would allow you to just dump a preinstalled mod, DF executable and all, into the folder and have it work. 

Honestly I'm hoping that by the time this gets up to building in a smarter way of handling merges all that needs to happen is we port your logic over.  Obviously that would be much easier if you took inputs in the same format as I'm proposing  ;D
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Sean Mirrsen on August 15, 2014, 12:54:58 am
ModBase used to use something similar, it provided its own markup for tag adding and removal, and kept a listing of mods applied sequentially to each other, each with their own folders.

The big problem with it, as-is, is that it doesn't handle logical ordered clusters of tags yet. The introduction of the caste system way back when broke it, and I sort of left it there when I couldn't make it work. It needs a complete logic rewrite to handle being able to read and write tags to and from specific castes, specific attacks, tissue defs, etc.

It also needs a complete rewrite because it's all in VB.NET, but that's another thing. :P
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 01:05:47 am
The thing to remember is that this is primarily a mod *manager* - making it easy for new players to use mods is the top priority, and merging mods is a pleasant side effect if and only if it can be done without compromising that goal.  Which means that a basic system comes first, then extending that to a smart parser.

All that said, it would probably be useful to see ModBase if the code is open!
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Sean Mirrsen on August 15, 2014, 01:09:35 am
Yeah, I think I always kept it open. I really wasn't that good a coder then though (I used VISUAL BASIC for crying out loud), so the result isn't terribly great, but it has/had a lot of neat functions, not least of which is keeping track of your init file and reapplying your settings on every new DF release or install. Also switching tilesets (but not graphics sets). Little broken swiss army app, that.

Looking at the code now, it's also not terribly well commented, and yeah, will need a great big rewrite to be able to tell apart creature castes and keep them ordered. The base mod-manager functions should only need minor tuning though, as I doubt the process itself will be very different.

To clarify a bit: everything the OP says the standard should do, ModBase already does/did. It was intended, by me, to become a modding standard back then, since diffs are much easier to make, and its conditional content creation functions are incredibly useful sometimes. The only problem(s) with it is that it's currently broken in regards to creatures, and is written in relatively slow VB.NET by a relatively low-skilled programmer (me).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 15, 2014, 01:13:44 am
Well, I'm a newb at Qt, but if I had access and collaborators maybe we could port it over.  So far the app I'm working with has two panes side by side for file loading.  I figured I was gonna make it look more and more like mod manager, but ModBase sounds like a good candidate as well.  I figured at some point I was gonna port some code over from another tool.

I don't know if anyone cares to learn Qt with me, cough cough "Sean".  Especially if one knows VB already.  Practically the same.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 15, 2014, 05:23:31 am
I fully agree with the basic process, but I'm not sure about keeping the mods as full raw folders - at least not long-term. It is definitely the simplest approach to distribution, given the current state of the mod community, but it does bring about a few issues.

Distributing as a complete raw folder means that it's tied down to that single version of DF. If you have a mod for, let's say, 0.40.03, then the files need to be updated for 0.40.04 - even though the actual changes made by the mod may be the same. (if all the raws changed by the mod are identical between the two DF versions, you can still reuse it, of course, but assume that things change)

Suppose that we have a LNP-style folder structure for this: DF is stored in X\DF and the mods are stored in a folder X\LNP\Mods, and that our current installation has a specific mod M bundled (so X\LNP\Mods\M). If a new bundle is released for a new DF version, but without mod M (because it wasn't updated), then you have a problem if users still try to apply it - because the raws get reverted to the previous version. We cannot, IMO, expect users to always extract new bundles to a freshly created folder; many people will end up using the same base folder.

Distributing as a patch avoids this issue; if the mod changes can still be applied, then there's a good chance they still have the desired effect; if not, then the mod is probably broken anyway.

Nothing prevents us from starting with raw distribution, of course, and it's probably the fastest way for us to get started, but it's worth keeping in mind, and considering whether or not we really save much in the long run.

This does lead me to the next issue, which is perhaps the most important one - what does a patch actually look like? We basically have two options; a standard diff similar to the ones you get from existing diff utilities, or a custom format tailored to the raw file structure. A standard diff would require minimal effort, but it is potentially brittle when you add more mods, and it also forces mods to be very independent.

To see why this is a possible problem, assume we have a mod to remove aquifers, and a mod that adds material layers. If any of the added materials are listed as aquifers, then a standard patch format will have no way of removing that when the aquifers mod is added - and the aquifer tags that do get removed may well depend on the order those extra materials are added in, because the patch is unlikely to provide enough context.

The alternative is to create our own custom format - and that means deciding what we want from that format. The way I see it, we would need a proper model of the raw file format, and essentially treat patches as a script, which might look something like this:

Code:
[FILE:raw/objects/creature_standard.txt] (opens the specified filename)
[UNDER_TAG:CREATURE:DWARF] (finds [CREATURE:DWARF])
[CHANGE:NAME:3:dwarf:dwarves:dwarven:dorf:dorfs:dorfen] (changes the subtag [NAME:dwarf:dwarves:dwarven] to [NAME:dorf:dorfs:dorfen] - the 3 is to specify the first three values are for searching)

[FILE:raw/objects/inorganic_stone_layer.txt]
[REMOVE_ALL:AQUIFER] (removes all AQUIFER tags)

This is of course just an example off the top of my head; an actual format would likely be different.
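
To show that a script like the one above would not be hard to interpret, here is a minimal Python sketch that handles just those four commands on in-memory file contents. The command semantics are my reading of the example, not a settled format: UNDER_TAG limits CHANGE to the text between the named tag and the next tag of the same kind, and REMOVE_ALL only strips argument-less tags across the whole current file. apply_script and its helper are hypothetical names.

```python
import re
from typing import Dict, Optional, Tuple

# A command is the first [...] on a line; trailing "(comments)" are ignored.
CMD_RE = re.compile(r"^\[([^\[\]]+)\]")


def _scope(text: str, under: Optional[str]) -> Tuple[int, int]:
    """Character range belonging to the object opened by [under]."""
    if under is None:
        return 0, len(text)
    pos = text.find(f"[{under}]")
    if pos < 0:
        raise ValueError(f"tag [{under}] not found")
    start = pos + len(under) + 2
    kind = under.split(":")[0]
    end = text.find(f"[{kind}:", start)  # next object of the same kind
    return start, (len(text) if end < 0 else end)


def apply_script(script: str, files: Dict[str, str]) -> Dict[str, str]:
    """Return a patched copy of `files` (filename -> raw text)."""
    files = dict(files)
    current: Optional[str] = None
    under: Optional[str] = None
    for line in script.splitlines():
        m = CMD_RE.match(line.strip())
        if not m:
            continue
        cmd, _, arg = m.group(1).partition(":")
        if cmd == "FILE":
            if arg not in files:
                raise KeyError(arg)
            current, under = arg, None
        elif current is None:
            raise ValueError(f"{cmd} used before any FILE command")
        elif cmd == "UNDER_TAG":
            _scope(files[current], arg)  # fail early if the tag is missing
            under = arg
        elif cmd == "CHANGE":
            parts = arg.split(":")
            name, n = parts[0], int(parts[1])
            old = "[" + ":".join([name] + parts[2:2 + n]) + "]"
            new = "[" + ":".join([name] + parts[2 + n:]) + "]"
            text = files[current]
            start, end = _scope(text, under)
            pos = text.find(old, start, end)
            if pos < 0:
                raise ValueError(f"{old} not found")
            files[current] = text[:pos] + new + text[pos + len(old):]
        elif cmd == "REMOVE_ALL":
            files[current] = files[current].replace(f"[{arg}]", "")
        else:
            raise ValueError(f"unknown command {cmd}")
    return files
```

Raising on a missing tag gives the launcher a clear "this mod does not apply" signal, where a plain diff might quietly match something else.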

A custom format is of course more complex, and requires more effort to get started, but I suspect it will be better in the long run - we can easily start with a standard patch and switch over later, but once we do want a custom format for the added flexibility, it is crucial that we do our best to get the format "right" (preferably making it extensible so we can handle use cases that haven't been considered earlier). We could still create a patch automatically; only mods that want to affect other mods would require the advanced stuff (and would need to do so manually).

Perhaps the quick approach is good enough for now, but we should at least be sure that the limitations are acceptable (until a better solution is made).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Sean Mirrsen on August 15, 2014, 05:44:23 am
<snip>

ModBase uses a custom markup format that allows you to only distribute changes to raws, and you're only tied down to original raws in the sense that you might not get results you expect if the raws change too much between versions (i.e. trees in the recent version, creatures when castes and tissues were introduced, etc.)

The format allowed conditional creation of procedural entries, like creatures or weapons.
I.e. this:
Quote
[CREATURE:#_MAN]
[COND:BODY@1&QUADRUPED][COND:MEGABEAST!][COND:SEMIMEGABEAST!][COND:POWER!][COND:NAME]
[NAME:#man:@1#men:@1#man]
[CAN_SPEAK][CAN_LEARN][CANOPENDOORS]
[SIZE:#+1]
[BODY:QUADRUPED!!HUMANOID:>>]
[MUNDANE:!]
[PET:!]
[EQUIPS]
[COMMON_DOMESTIC:!]
[NATURAL]
[PREFSTRING:mystery]
[CHILDNAME:NAME@1#man child:NAME@1#man children]
[ATTACK:MAIN:BYTYPE:GRASP:punch:punches:1:2:BLUDGEON][ATTACKFLAG_WITH]
[ATTACK:MAIN:BYTYPE:STANCE:kick:kicks:1:3:BLUDGEON][ATTACKFLAG_WITH]
This mod combs through all creature entries loaded at the point where it's applied, and looks for several conditions (it must have QUADRUPED in its body tag, it must have a NAME, it must not be a MEGABEAST, SEMIMEGABEAST, or POWER). For each creature that matches these conditions, it creates a new creature entry called <CREATURE>_MAN, switches its QUADRUPED body for a HUMANOID, strips it of MUNDANE, PET, and COMMON_DOMESTIC tags, increases its size by 1, gives it ability to learn, speak, open doors and equip equipment, and gives it new names and attacks to match. The mod was, appropriately, named after Dr.Moreau. :)

Well, it used to do that until all the tissue and caste changes, anyway. -_-

The markup generally works, and mods that only add stuff can be plugged into it as-is, since regular tags and entries are just added in.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 06:26:48 am
I fully agree with the basic process, but I'm not sure about keeping the mods as full raw folders - at least not long-term. It is definitely the simplest approach to distribution, given the current state of the mod community, but it does bring about a few issues.

Distributing as a complete raw folder means that it's tied down to that single version of DF. If you have a mod for, let's say, 0.40.03, then the files need to be updated for 0.40.04 - even though the actual changes made by the mod may be the same. (if all the raws changed by the mod are identical between the two DF versions, you can still reuse it, of course, but assume that things change)
<snip>
Perhaps the quick approach is good enough for now, but we should at least be sure that the limitations are acceptable (until a better solution is made).

Key point:  good enough for now, and basic process.  Until we have a working basic example this is mostly academic. Once we have a manifest file, it's easy enough to note which version the mod is based against.  If the pack then also includes old vanilla mods, either entirely or as a backwards patch, you can then diff against that and it should work unless big changes are made.  In any case major-raw-changing updates are very rare, and thinking too much about them just because we're currently in that period would be a mistake. 

Again though, the problem is not advanced merging tools or special syntax. 

The problem is that no new players use them. 

Instead, I'm proposing a simple format which *already almost exists*.  The only difference is that many files in a pre-installed mod can be automatically deleted by the utility; content already exists and is compatible.  The key point for new players is an easily loaded collection, not the advanced merging. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: expwnent on August 15, 2014, 06:56:41 am
In the most recent version of DFHack, the raw/scripts folder will be checked for Lua scripts before the hack/scripts folder. This means that each savegame can have its own collection of scripts that come with it and you can transfer saves more easily between people that don't have the same mods. It also means that for this project you can just throw all the scripts into the same folder as long as there's no overlap and it'll work fine.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 15, 2014, 06:59:33 am
The key point for new players is an easily loaded collection, not the advanced merging.

I agree, but that's exactly why the standard patch format might not be good enough.

Yes, distributing mods as full raw folders and using standard diff utilities is easy, and we could get something running without too much effort - but it comes with a cost: the patches don't know anything about the raw structure, and that makes it that much more difficult to make sure everything works as intended when you deal with multiple patches.

A standard patch basically looks for a specific part of the original file. This is generally annotated with a line number to provide a reference point, but that line number can be flexible to accommodate other changes.

What this means is that if mod A adds 10 creatures to the middle of a file, then the patch for mod B, which changes something about other creatures, might end up modifying one of those newly added creatures - even though it wasn't supposed to. We have no way of knowing when this happens - we only know whether or not the patch was applied to something.

Now, this may indeed be acceptable for the first implementation - I don't use mods, so I don't know if this is a common case or not. If it's good enough, then it's good enough, and I have no objections beyond that - I just think it's important to at least be aware of the possible problems.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 07:00:42 am
I've been thinking I need such a tool for a while, and recently considering writing my own. Everything in the OP makes sense to me. The biggest problem is dealing with conflicts. The OP says you should abort on conflict, but that requires being able to detect the conflict first. Also, I think it's really limiting if two mods cannot modify the same vanilla file. When adding content, this is relatively easy, but with mods that remove content, you can run into problems, and it can be hard to detect them.

One thing that could help is if there are different settings for applying a mod. So one setting would apply a diff to each file that it modifies that already exists. Another would instead overwrite any conflicting files; this is useful for situations like sorting gems, because the order gems are loaded affects their appearance in the jewelers workshop. And you might need a third setting for some other special cases.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Sean Mirrsen on August 15, 2014, 07:14:40 am
So I'm guessing that all of this means that I'm going to have to resurrect ModBase after all. Everyone seems to be dancing around in circles over the functionality that ModBase already mostly provided back when it worked. All it really couldn't do on its own was extract the mods by itself. *sigh* I do not really have the time for it nowadays.

Heh, I even had a name for that project. Clean Slate. Might start a repo up on GitHub and see if I can make any progress...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 07:30:42 am
@Dirst:  Yep, sounds like we're on the same page. 

How's this for a set of goals?
v0.1 - basic logic in place, folders findable, etc.  Written in Python for PyLNP compatibility
       - can derive a diff correctly and create a set of raws with one mod-patch applied
v0.2 - installs multiple non-conflicting mods correctly
       - handles errors and conflicts gracefully
       - write log file with mod merge order etc
       - place holder handling for a manifest file (note existence)
v0.9 - GUI time!  Integrated as a tab in the PyLNP - get it functional in new context, refactor to fit, etc.
    .1 - implement manifest and use information from it in display (if present)
       - 'simplify mod folders' option ala LNP, deleting extra files (not readme etc, but eg rest of DF install)
v1.0 - start finding or soliciting or formatting some mods
I think the steps would be:
v0.1 - able to apply and remove any one mod.
v0.2 - able to merge any number of mods as long as no two mods modify the same file
       - detect duplicate raws.
v0.3 - allow scripts: a mod that is actually a shell script that is applied naively
v0.4 - allow mods to modify the same file, start to manage conflicts
... - more as needed

The GUI can be worked on in parallel.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 07:40:14 am
I like the principle of NO SPECIAL CASES.  No special syntax, no different behaviour, just diffs derived from the full set of changed files in the raws.  If that makes some advanced stuff impossible, simple is better.  (I've been reading this list (https://en.wikipedia.org/wiki/List_of_software_development_philosophies))

As to detecting merge conflicts, there are two ways to do it - both of which work to combine files.  The fast and simple way is the standard merge conflict test; if your diff returns an error show the problem in the live preview. 

The slow way, which will return no false negatives, is to add and subtract combinations of diffs to check that changes don't overwrite.  For example, we take vanilla DF and apply mods by diffs A, B, and C in that order.  We can then check for problems by confirming that VABC-A == VBC, VABC-B == VAC, skip the case VABC-C == VAB as trivially true, and also check VABC-AB == VC for completeness.  Given that this is N-1 factorial checks for N mods, it could be too slow for a live preview but if we only call it when the fast merge check returns OK that should be acceptable.  If one passes and the other fails, I'd probably alert the user and allow them to decide. 

Modbase, rubble, Thistleknot's project are all awesome but that's not what new players need.  They don't need amazing merge tools.  They just need an easy way to try some mods, and the LNP provides the model I want for graphics and utilities already.  If you can only use one mod at a time that's not a fatal flaw for these users! 


@King Mir:  good points;
    0.2 - Files that are identical to vanilla should be detected and removed, though this might come later (not 0.9 though, you're right)
         - Any test versions that overwrite files instead of using diffs would be pre-0.1
    0.3 - No.  No special formats in the mod.  Not until 2.0 minimum, and even then it has to be backwards compatible
    0.4 - As above, I foresee using diffs early.  Could change if it's harder than I expect, but unlikely. 
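A minimal sketch of the "identical to vanilla" check from the 0.2 item, assuming the vanilla folder and the mod folder mirror each other's layout. The function name is hypothetical; it only reports candidates and deletes nothing.

```python
import filecmp
import os


def files_identical_to_vanilla(vanilla_dir, mod_dir):
    """Return relative paths of mod files that are byte-identical to vanilla."""
    identical = []
    for root, _dirs, names in os.walk(mod_dir):
        for name in names:
            mod_path = os.path.join(root, name)
            rel = os.path.relpath(mod_path, mod_dir)
            vanilla_path = os.path.join(vanilla_dir, rel)
            # shallow=False compares file contents, not just size and mtime.
            if os.path.isfile(vanilla_path) and filecmp.cmp(
                    mod_path, vanilla_path, shallow=False):
                identical.append(rel)
    return sorted(identical)
```

Files that exist only in the mod (readme, extra folders) are never reported, so they would be left alone by any cleanup built on top of this.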
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Sean Mirrsen on August 15, 2014, 07:46:31 am
If they want to try out one mod, then distributing mod diffs will work, as well as distributing whole raw folders. You don't need a program to install one mod, unless it's a mod with lots of stuff that you might not want all of.

ModBase started as a commandline script that installed or removed my Martial Arts and Minerals mods with various optional elements for people that didn't want certain things I added. It just evolved into a mod manager from there, and soon became the only real way to install the mods I made because I developed them using the program's functions.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 08:40:31 am
I like the principle of NO SPECIAL CASES.  No special syntax, no different behaviour, just diffs derived from the full set of changed files in the raws.  If that makes some advanced stuff impossible, simple is better.  (I've been reading this list (https://en.wikipedia.org/wiki/List_of_software_development_philosophies))

As to detecting merge conflicts, there are two ways to do it - both of which work to combine files.  The fast and simple way is the standard merge conflict test; if your diff returns an error show the problem in the live preview. 

The slow way, which will return no false negatives, is to add and subtract combinations of diffs to check that changes don't overwrite.  For example, we take vanilla DF and apply mods by diffs A, B, and C in that order.  We can then check for problems by confirming that VABC-A == VBC, VABC-B == VAC, skip the case VABC-C == VAB as trivially true, and also check VABC-AB == VC for completeness.  Given that this is N-1 factorial checks for N mods, it could be too slow for a live preview but if we only call it when the fast merge check returns OK that should be acceptable.  If one passes and the other fails, I'd probably alert the user and allow them to decide. 

Modbase, rubble, Thistleknot's project are all awesome but that's not what new players need.  They don't need amazing merge tools.  They just need an easy way to try some mods, and the LNP provides the model I want for graphics and utilities already.  If you can only use one mod at a time that's not a fatal flaw for these users! 
Are you saying to use diff to see if two mods modify the same file, or if diff3 returns an overlap? Seeing if two mods modify the same file doesn't need diff: you can assume they do if both include the same file. Testing if diff3 returns no overlap would need to be done for every combination of added mods, so you still have N factorial checks. But you're just dealing with a small number of text files, so maybe performance won't be an issue.
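The file-level test can be sketched without any diffing, assuming each mod is described by the set of relative file paths it ships. The function name and the file lists in the usage example are made up for illustration.

```python
from collections import defaultdict


def files_touched_by_several_mods(mods):
    """Map each relative path to the mods shipping it, keeping only shared ones.

    `mods` maps a mod name to the set of relative file paths in that mod.
    """
    owners = defaultdict(list)
    for mod_name in sorted(mods):
        for rel_path in mods[mod_name]:
            owners[rel_path].append(mod_name)
    return {path: names for path, names in owners.items() if len(names) > 1}


# Hypothetical mod contents, for illustration only.
example_mods = {
    "accelerated": {"objects/creature_standard.txt", "objects/plant_standard.txt"},
    "modest": {"objects/creature_standard.txt"},
    "gems": {"objects/inorganic_stone_gem.txt"},
}
shared = files_touched_by_several_mods(example_mods)
```

Only the files this returns would need the more expensive overlap check.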

You also need to support the options LNP already supports. That means being able to apply a remove aquifer patch and a remove exotic animal patch on top of a graphics pack. Then you want to have a mod add a creature or stone type to that. That's a baseline for which conflicts this tool would need to be able to manage. I concede that you want to keep things simple, but can you?

I do think it's important that the only thing a Mod needs to be included is the files that would override the standard raws, and that you don't invent a DSL for writing a mod. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 15, 2014, 08:53:44 am
One way I was going to deal with mod conflicts was...

To resolve the merge conflicts manually, then derive a patch between vanilla and my now manually merged mod.

The only issue is that this has to be done ahead of time, and it means a modder has to do the merge manually. It is possible, but it's no different from merging by hand; you just have a patch file afterwards. Case in point: Accelerated mod and Modest mod both alter creatures, changing things like arcvision and clutch size at the same spot in the creature files. So a modder could make that patch file afterwards, but this isn't an ideal solution.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 09:37:17 am
@King Mir
I'm talking about overlapping diffs, not simply diffs in the same file; the latter is as you note unproblematic. Never underestimate the impact of a factorial performance hit - the problem is that it doesn't take many more than the test case to cause a big problem.  It's (n-1)! too, because the last diff on can always be reversed.

Other things like the aquifer tags should be fine, as they're not done by merging raws but rather by editing them based on tags. Which *will* remain a separate function.

@Thistleknot - modders producing manually combined mods (perhaps with the merger tool) is not a particularly elegant solution, true. However it's no worse than the current situation and I think substantially better than making the format and workings of this system any more complicated than it has to be.

Graphics are complicated enough that I'm just ignoring the issue for now. I'll think of something clever for that later. Many ideas but nothing worth sharing yet.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 15, 2014, 10:06:56 am
For example, we take vanilla DF and apply mods by diffs A, B, and C in that order.  We can then check for problems by confirming that VABC-A == VBC, VABC-B == VAC, skip the case VABC-C == VAB as trivially true, and also check VABC-AB == VC for completeness.  Given that this is N-1 factorial checks for N mods

How did you arrive at (n-1)!? By my calculations, you create the expected final set of raws (N patches), N sets of raws where each set skips 1 patch (N-1 patches per set), and N sets of raws where you "unpatch" one patch from the final set (N "unpatches"). That's 2N+1 sets of raws, N^2 patches applied, and N patches unapplied.
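The arithmetic in this count can be checked with a few lines. This is only a tally of the scheme as described in this post (one final set, N skip-one sets, N unpatch-one sets); the function name is hypothetical and the factorial figure is included purely for comparison.

```python
import math


def check_costs(n):
    """Tally the work for n mods under the skip-one / unpatch-one scheme."""
    return {
        # 1 final set + n skip-one sets + n unpatched sets
        "raw_sets": 2 * n + 1,
        # n patches for the final set + n sets of (n - 1) patches each
        "patches_applied": n + n * (n - 1),
        # one reverse patch per unpatched set
        "patches_unapplied": n,
        # the (n-1)! figure from the earlier post, for comparison
        "factorial_estimate": math.factorial(n - 1),
    }


for mods in (3, 5, 10):
    print(mods, check_costs(mods))
```

Note that `n + n * (n - 1)` simplifies to `n ** 2`, matching the figure given above.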

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 15, 2014, 10:11:28 am
I like the principle of NO SPECIAL CASES.  No special syntax, no different behaviour, just diffs derived from the full set of changed files in the raws.  If that makes some advanced stuff impossible, simple is better.  (I've been reading this list (https://en.wikipedia.org/wiki/List_of_software_development_philosophies))

I think we can go one or two steps past vanilla diffs without getting bogged down.

First, location hints need to be tags rather than lines.  This handles all file collisions unless they try to modify the same tag.  It might help if the vanilla raws were pre-flattened to have one tag per line.

Second, allow for a position hint to be a regular expression.  In other words, the scripting version of a mod becomes a slightly hardened sed engine.  This is extraordinarily powerful (for example, one can write a regular expression that matches a syndrome and any possible subtags under it) and puts the onus entirely on the modder rather than the player. 

A location hint of \[CREATURE:ELEPHANT\].*?\[PETVALUE:[0-9]+?\] will find the petvalue of an elephant, no matter what it has been changed to by another mod.  A sed-like line of s/(\[CREATURE:ELEPHANT\].*?\[PETVALUE:)[0-9]+?\]/\1666\]/ changes the petvalue of an elephant to 666 (in tribute to the elephants in 23a).  The hardening is only necessary if you want to do something other than silently skip any failed matches.
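For anyone who wants to try the expression, here is a minimal sketch using Python's `re` module. The pattern is the one quoted above; the function name is hypothetical, and `re.DOTALL` is an assumption needed so that `.*?` can cross line breaks between the creature tag and its petvalue.

```python
import re

# Same expression as in the post; DOTALL lets ".*?" span multiple lines.
ELEPHANT_PETVALUE = re.compile(
    r"(\[CREATURE:ELEPHANT\].*?\[PETVALUE:)[0-9]+?\]", re.DOTALL)


def set_elephant_petvalue(raws, value):
    """Return (new_text, number_of_replacements) for the first elephant match."""
    return ELEPHANT_PETVALUE.subn(
        lambda match: f"{match.group(1)}{value}]", raws, count=1)


sample = (
    "[CREATURE:ELEPHANT]\n"
    "[NAME:elephant:elephants:elephantine]\n"
    "[PETVALUE:500]\n"
    "[CREATURE:DOG]\n"
    "[PETVALUE:30]\n"
)
patched, hits = set_elephant_petvalue(sample, 666)
```

Checking `hits` is one way to do the "hardening": a count of zero means the hint failed and the tool can report it. One caveat: if the elephant entry had no PETVALUE at all, the lazy `.*?` would run on into the next creature and change that one instead.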

With these two features in place, it should be possible to handle graphics packs like any other mods.  The one thing it does not do gracefully is re-order things within a file, such as sorting the gemstones.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Sean Mirrsen on August 15, 2014, 10:29:43 am
I did all that... with simple strings... I could reference certain values of certain tags of certain objects, like creating a sword with a damage value dependent on that of an existing sword.

Quote from: Magic Weapons submod of Martial Arts+
[ITEM_WEAPON:ITEM_WEAPON_SSWORD_CRUEL]
[NAME:shortsword:shortswords]
[ADJECTIVE:Cruel]
[DAMAGE@:%ITEM_WEAPON_SWORD_SHORT+10:GORE]
[WEIGHT:35]
[SKILL:SWORD]
[CRIT_BOOST:1]
[TWO_HANDED:4]
[MINIMUM_SIZE:4]
[MATERIAL_SIZE:6]
[STICK_CHANCE:30]

It's context-sensitive in this case (pulling the damage value from the same spot of the same tag of a different item of the same type), but was intended to work just as well for different-type objects and different tags.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 15, 2014, 11:29:17 am
I like the principle of NO SPECIAL CASES.  No special syntax, no different behaviour, just diffs derived from the full set of changed files in the raws.  If that makes some advanced stuff impossible, simple is better.  (I've been reading this list (https://en.wikipedia.org/wiki/List_of_software_development_philosophies))

I think we can go one or two steps past vanilla diffs without getting bogged down.

The minute you go beyond vanilla diffs, you're creating a custom patch format. This means you need to re-implement patching yourself; existing libraries cannot be reused.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 01:26:50 pm
@King Mir
I'm talking about overlapping diffs, not simply diffs in the same file; the latter is as you note unproblematic. Never underestimate the impact of a factorial performance hit - the problem is that it doesn't take many more than the test case to cause a big problem.  It's (n-1)! too, because the last diff on can always be reversed.

Other things like the aquifer tags should be fine, as they're not done by merging raws but rather by editing them based on tags. Which *will* remain a separate function.
So effectively you're allowing modding scripts as long as they are run after your tool, with aquifers being an example of such a script.

If you assume that if A and B don't conflict with C, then A+B don't conflict with C, then you reduce your checking to N^2. N is also quite small. So you might be ok.
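A rough sketch of that pairwise check for a single file, assuming each mod is given as the full list of lines it would produce. It uses `difflib.SequenceMatcher` from the standard library to find which vanilla line ranges each mod touches; the function names are hypothetical, and treating adjacent edits as a conflict is a deliberately conservative choice.

```python
import difflib
from itertools import combinations


def changed_ranges(vanilla, modded):
    """Vanilla index ranges (start, end) that the mod inserts at, deletes or replaces."""
    matcher = difflib.SequenceMatcher(a=vanilla, b=modded, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
            if tag != "equal"]


def overlaps(first, second):
    """True if two ranges overlap or touch (insertions are zero-width ranges)."""
    (a1, a2), (b1, b2) = first, second
    return a1 <= b2 and b1 <= a2


def conflicting_pairs(vanilla, mods):
    """Check every pair of mods once: N * (N - 1) / 2 comparisons."""
    ranges = {name: changed_ranges(vanilla, lines) for name, lines in mods.items()}
    return [(a, b) for a, b in combinations(sorted(mods), 2)
            if any(overlaps(x, y) for x in ranges[a] for y in ranges[b])]


vanilla = ["[A]", "[B]", "[C]", "[D]", "[E]"]
mods = {
    "one": ["[A2]", "[B]", "[C]", "[D]", "[E]"],
    "two": ["[A]", "[B]", "[C]", "[D]", "[E2]"],
    "three": ["[A]", "[B]", "[C]", "[D]", "[E3]"],
}
print(conflicting_pairs(vanilla, mods))
```

This relies on the assumption stated above, that mods which do not conflict pairwise do not conflict as a group.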

You have a very strict requirement for non-conflicting mods. The approach I had in mind would instead have mods have a specified order of loading, so that a later mod overrides a prior mod.

And BTW- I think the suggestion to have one tag per line is a good one. You can preprocess the mod and vanilla raws, so that each tag is on one line and strip comments before feeding it to your differ.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 15, 2014, 01:41:20 pm
I like the principle of NO SPECIAL CASES.  No special syntax, no different behaviour, just diffs derived from the full set of changed files in the raws.  If that makes some advanced stuff impossible, simple is better.  (I've been reading this list (https://en.wikipedia.org/wiki/List_of_software_development_philosophies))

I think we can go one or two steps past vanilla diffs without getting bogged down.

The minute you go beyond vanilla diffs, you're creating a custom patch format. This means you need to re-implement patching yourself; existing libraries cannot be reused.
I think some kind of custom format is unavoidable; it just needs to be open.

We don't need a mod managing tool if all we have are mini-mods that can be unzipped on top of a vanilla install.  We also don't get much bang for the buck if we don't store things as diffs of some type (a graphics pack can make tiny changes to lots of large files).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 01:47:18 pm
We don't need a mod managing tool if all we have are mini-mods that can be unzipped on top of a vanilla install.
I do. I want to be able to easily add and remove mini-mods without having to manually reapply each mod to vanilla on each remove or to work out how to remove a mini-mod while leaving others intact.
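In sketch form, removing a mini-mod can be treated as rebuilding from vanilla with the remaining mods. This assumes the simplest case of whole-file overwrites, with raws held in memory as a mapping from relative path to text; `build_raws` and the sample data are hypothetical.

```python
def build_raws(vanilla, mods, enabled):
    """Rebuild the raws from vanilla plus the enabled mods, in load order.

    vanilla: {relative_path: text}
    mods:    {mod_name: {relative_path: text}}
    enabled: list of mod names; a later mod overrides an earlier one.
    """
    raws = dict(vanilla)  # copy, so vanilla itself is never modified
    for name in enabled:
        raws.update(mods[name])
    return raws


vanilla = {"objects/a.txt": "[A]", "objects/b.txt": "[B]"}
mods = {
    "mod_a": {"objects/a.txt": "[A-modded]"},
    "mod_b": {"objects/b.txt": "[B-modded]"},
}
both = build_raws(vanilla, mods, ["mod_a", "mod_b"])
without_a = build_raws(vanilla, mods, ["mod_b"])  # "removing" mod_a
```

The user only edits the list of enabled mods; nothing has to be un-applied by hand.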
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 15, 2014, 01:50:08 pm
The minute you go beyond vanilla diffs, you're creating a custom patch format. This means you need to re-implement patching yourself; existing libraries cannot be reused.
I think some kind of custom format is unavoidable; it just needs to be open.

We don't need a mod managing tool if all we have are mini-mods that can be unzipped on top of a vanilla install.  We also don't get much bang for the buck if we don't store things as diffs of some type (a graphics pack can make tiny changes to lots of large files).

It's unavoidable in the long-term, definitely, but for the short-term, Peridexis is proposing to use a completely standard diff (with the pitfalls that brings along), to provide a more easily attainable starting point.

If you're making a custom format, then you'd prefer to make sure it is as complete (or extensible) as we would ever need, and that's when it takes more time - and you would likely end up with something like (but not necessarily identical to) ModBase.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 15, 2014, 02:18:07 pm
King Mir, I was being a little too flippant... forgetting about the removal part.  That said, if you assume away file collisions then it could be a simple script.  (For the ultimate in simplicity, back in VMS you could accomplish this kind of rollback with a single command.)

Pidgeot, something like ModBase would be a bit too raw-aware.  I like that idea of being able to do math in the script, and I noted above that positioning should be done at the tag level, but otherwise it should be processing more-or-less arbitrary text.  Otherwise we dive down a rabbit-hole of trying to parse increasingly complex tag structures with a never-ending parade of special cases.  A couple *nix standards, slightly buffed, gets you just about everything that modders do except re-ordering things.

But I do agree, the simple diffs that can't handle collisions is a good starting point.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 15, 2014, 03:36:18 pm
I did all that... with simple strings... I could reference certain values of certain tags of certain objects, like creating a sword with a damage value dependent on that of an existing sword.

Quote from: Magic Weapons submod of Martial Arts+
[ITEM_WEAPON:ITEM_WEAPON_SSWORD_CRUEL]
[NAME:shortsword:shortswords]
[ADJECTIVE:Cruel]
[DAMAGE@:%ITEM_WEAPON_SWORD_SHORT+10:GORE]
[WEIGHT:35]
[SKILL:SWORD]
[CRIT_BOOST:1]
[TWO_HANDED:4]
[MINIMUM_SIZE:4]
[MATERIAL_SIZE:6]
[STICK_CHANCE:30]

It's context-sensitive in this case (pulling the damage value from the same spot of the same tag of a different item of the same type), but was intended to work just as well for different-type objects and different tags.

I think your tool needs to be updated to include castes then. However, I think a simple collection of "patches" that players could load on vanilla would be the next step. I do know your tool required vanilla as a base, but the steps seem a bit cumbersome. I would think a simple batch file should be all that's needed to apply some patches to vanilla.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: milo christiansen on August 15, 2014, 05:12:16 pm
One word: Rubble. (http://www.bay12forums.com/smf/index.php?topic=140853.0)

The only real lack is a cross platform GUI, but that is just the front end, the actual tool works on all OS's.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 06:32:10 pm
The minute you go beyond vanilla diffs, you're creating a custom patch format. This means you need to re-implement patching yourself; existing libraries cannot be reused.
I think some kind of custom format is unavoidable; it just needs to be open.

We don't need a mod managing tool if all we have are mini-mods that can be unzipped on top of a vanilla install.  We also don't get much bang for the buck if we don't store things as diffs of some type (a graphics pack can make tiny changes to lots of large files).
I do. I want to be able to easily add and remove mini-mods without having to manually reapply each mod to vanilla on each remove or to work out how to remove a mini-mod while leaving others intact.
It's unavoidable in the long-term, definitely, but for the short-term, Peridexis is proposing to use a completely standard diff (with the pitfalls that brings along), to provide a more easily attainable starting point.

If you're making a custom format, then you'd prefer to make sure it is as complete (or extensible) as we would ever need, and that's when it takes more time - and you would likely end up with something like (but not necessarily identical to) ModBase.

One word: Rubble. (http://www.bay12forums.com/smf/index.php?topic=140853.0) 

Custom formats are a lot of work, have been done, and have been done well.  ModBase is great, Rubble is simply brilliant, but they're solving a different problem.  The goal here is *not* to merge overlapping mods well, it's to make using mods easy for new players. 

Maybe a custom format is unavoidable in the long term, I suspect not, but until something has wide adoption any new format will lead to nothing but fragmentation and stagnation as it becomes too much trouble to use these tools.  In any event, if a special format was used at all I would need to hear some pretty good reasons to do anything but include Rubble and give users the option of the simple and advanced mod loaders.  In response to all the ideas about flattening raw structures, maths in scripts, etc:  different goals.  The point of using standard diffs here is ease of use, an eye to future size reduction in the format, and *not* advanced merging.  Making two unrelated changes to a file is enough!

The advantage of a diff over simple file overwrites is simply that it handles bloated small mods more elegantly, and it's fairly easy to implement with standard libraries.  If not, file overwrites will do.  I'm hoping we'll also see a slightly different mod philosophy where instead of including all the favourite tweaks, we get just the changes that are a core part of the mod, equivalent to graphics packs leaving the population cap at vanilla levels. 

Other scripts, fancy tools, etc should be just as compatible with the output of this process as any other mod; so scripts which (eg) edit aquifer tags will work just fine. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: milo christiansen on August 15, 2014, 06:53:35 pm
I still think that Rubble is just a good GUI away from solving the problem, and for those who are stuck on diffs Rubble can (as of 4.4) apply standard format patches.

It is easy to look at Rubble as a modding tool, but I designed it as a general purpose mod installer first and a modding tool second. The only drawback is that mods need to be prepared to work with it, you cannot just drop any old mod in and expect it to work.

If someone comes up with a general purpose mini-mod format that does not depend on a specific scripting system or other host I would be happy to add Rubble support for it, just post in the Rubble thread when finalized.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 07:15:12 pm
In response to all the ideas about flattening raw structures, maths in scripts, etc:  different goals.  The point of using standard diffs here is ease of use, and eye to future size reduction in the format, and *not* advanced merging.  Making two unrelated changes to a file is enough!
I see your point with maths in scripts, but flattening out raws is a simple way to allow more mods to be merged with very little cost. You'd still be using a standard diff and standard input, but comments and the lack of newlines should not prevent mod merging.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 07:59:52 pm
In response to all the ideas about flattening raw structures, maths in scripts, etc:  different goals.  The point of using standard diffs here is ease of use, and eye to future size reduction in the format, and *not* advanced merging.  Making two unrelated changes to a file is enough!
I see your point with maths in scripts, but flattening out raws is a simple way to allow more mods to be merged with very little cost. You'd still be using a standard diff, and standard input, but comments, the lack of newlines should not prevent mod merging.

For clarity, I'm opposed to requiring this in the input format but think it's a good idea in the logic. 

It should be easy enough to flatten all the raws in memory or create temporary copies processed in whatever way we like, which are then handled however.  "merge(flatten(mod1), flatten(mod2))" rather than "merge(flat_mod1, flat_mod2)".  Same effect, easy enough to code, more consistent, easier and more flexible for modders. 
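A minimal sketch of flattening inside the tool, assuming a tag is anything between `[` and `]` and that everything outside brackets is a comment. `flatten` is a hypothetical helper; `merge` is left out because its behaviour is not specified here.

```python
import re

# A tag is "[", any run of non-bracket characters, then "]".
TAG = re.compile(r"\[[^\[\]]*\]")


def flatten(raw_text):
    """Return the tags of a raw file, one per list entry, in file order.

    Comments, indentation and blank lines are dropped, so two files that
    differ only in layout flatten to the same list.
    """
    return TAG.findall(raw_text)


tidy = "creature_x\n\n[OBJECT:CREATURE]\n\n[CREATURE:DOG]\n\t[PETVALUE:30]\n"
messy = "creature_x\n[OBJECT:CREATURE] a comment [CREATURE:DOG][PETVALUE:30] more\n"
```

The modder's files on disk stay exactly as written; only the in-memory copy is flattened. Note that this also drops the file-name header on the first line, so the flattened form is suited to comparison rather than to being written back as a raw file.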
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 15, 2014, 08:13:04 pm
I'm thinking instead... of xml files.

The raws should be loaded into xml's, comments and all.  The goal is to preserve original file structure (comments and their contextual positions) and token information.

Custom patch files would then be derived from the differences between the xml tokens.

Then you have a standardized format that is based on something that works with xml encapsulation vs some other arbitrary patch algorithm.  Only catch is, someone would have to make xml structure to read the raw files (I'm reading up on TinyXML, but not necessarily volunteering). 

Then another program to measure differences between the two xml's of each compared file.

Of course then another overall utility to implement those changes in a way similar to rubble, mod manager, or mod base.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 15, 2014, 08:24:28 pm
So long as there's a rigidly defined way to translate raws into xml format, I guess that could work well.  I'm still not touching that kind of thing until after we have a mod loader 1.0 out and working. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 15, 2014, 09:25:31 pm
look at this from some rawexplorer link I was viewing

http://dftokens.gumpstudio.com:8080/ViewToken.aspx?Token=PHILOSOPHER

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 15, 2014, 10:45:21 pm
In response to all the ideas about flattening raw structures, maths in scripts, etc:  different goals.  The point of using standard diffs here is ease of use, and eye to future size reduction in the format, and *not* advanced merging.  Making two unrelated changes to a file is enough!
I see your point with maths in scripts, but flattening out raws is a simple way to allow more mods to be merged with very little cost. You'd still be using a standard diff, and standard input, but comments, the lack of newlines should not prevent mod merging.

For clarity, I'm opposed to requiring this in the input format but think it's a good idea in the logic. 

It should be easy enough to flatten all the raws in memory or create temporary copies processed in whatever way we like, which are then handled however.  "merge(flatten(mod1), flatten(mod2))" rather than "merge(flat_mod1, flat_mod2)".  Same effect, easy enough to code, more consistent, easier and more flexible for modders.
Yeah that's what I meant. Flatten the mods in the tool. 

I'm mostly sold on this proposal now. My biggest concern is if it'll be good enough to handle graphic packs, but worst case you could use that as the baseline instead of vanilla. Maybe I'll have other concerns after I sleep on it more.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 15, 2014, 11:15:30 pm
I did some searching for a powershell script that could be modified to accomplish that for a set of raws.

http://stackoverflow.com/questions/15283931/powershell-find-replace-string-with-carriage-return-and-line-feed

Okay, here's the powershell script that will process your entire subdirectory and flatten all the files as you described.

Code: [Select]
$files = @(get-childitem -include *.txt -recurse -path $path -filter $filter)
Write-Host "files loaded";
foreach ($file in $files) {
        $outfile = "$file" + ".out"

        Get-Content $file | Foreach-object {
            $_ -replace '\[',"[`r`n" `
-replace '\]',"`r`n]"
        } | Set-Content $outfile
    }

You know what's funny, is how would one reverse that?  Say you didn't want to leave the raws looking like that after you merged them.  How would one reverse this formula?  I can't merely swap the  '\['   with   "[`r`n"

Update

By doing this [conversion] to new lines, the context used when deriving diffs might have to be extended to 4 or 5 lines, or maybe even more.

Carrying further on this idea.

http://www.thinkplexx.com/learn/howto/scm/svn/how-to-create-and-use-local-svn-subversion-repository-on-windows-or-linux-simple-and-fast-step-by-step

One could use a local svn repository system to apply these mods.  Possibly even create little git batch scripts with a front end menu to load configurations of mods using a cherry picking system.

Update
I tried it out on GitHub to see what a 34.11 to 40_08 accelerated modest mod conversion would look like. It has fewer conflicts, but breaking up the tokens has chaotic effects, especially with conflicting tokens.  Pairs of tokens sometimes want to be merged together.

http://imgur.com/yLFixQO

It would also tremendously help by first parsing all tokens from non tokens, and just working with tokens so as to avoid any merging errors with comments and flattened tokens.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 16, 2014, 08:54:24 am
I fixed that script a little more (it was putting the ['s and ]'s in the wrong place)...

Code: [Select]
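# Assumes $path is the folder to process; it was not set in the original post.
$path = ".\raw"
$files = @(Get-ChildItem -Path $path -Include *.txt -Recurse)
Write-Host "files loaded"
foreach ($file in $files) {
    $outfile = "$file" + ".out"

    # Start a new line before every [ and end the line after every ].
    Get-Content $file | ForEach-Object {
        $_ -replace '\[', "`r`n[" `
           -replace '\]', "]`r`n"
    } | Set-Content $outfile
}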

sample output

Code: [Select]
c_variation_default


[OBJECT:CREATURE_VARIATION]



[CREATURE_VARIATION:ANIMAL_PERSON]


[CV_REMOVE_TAG:NAME]


[CV_REMOVE_TAG:GENERAL_CHILD_NAME]


[CV_REMOVE_TAG:GENERAL_BABY_NAME]


[CV_REMOVE_TAG:CASTE_NAME]


[CV_REMOVE_TAG:CHILDNAME]


[CV_REMOVE_TAG:BABYNAME]


[CV_REMOVE_TAG:SMALL_REMAINS]


[CV_REMOVE_TAG:DESCRIPTION]


[CV_REMOVE_TAG:CREATURE_TILE]


[CV_REMOVE_TAG:COLOR]


[CV_REMOVE_TAG:MAXAGE]


[CV_REMOVE_TAG:SOUND]

I'm thinking I can modify this to remove all blank lines


I may (don't hold me to it) be able to indent by object type using a regular expression.

Then use a batch file to derive diffs between two folders to generate a unified diff patch.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 16, 2014, 09:46:34 am
Here's a sed (*nix utility) command that will do it:
Code: [Select]
sed -e "s/^[^[]*//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g"
It removes comments and splits tokens to one per line. To just split up tokens you can do this:
Code: [Select]
sed -e "s/][^[]*\[/]\n\[/g"
You can also use those regular expressions with any regular expression library.

EDIT: But instead of doing this all in shell scripts, it would be better and more portable to use Python.
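
As a rough Python equivalent of the three sed expressions above, something like the following should work with the standard `re` module. It is a sketch, not a drop-in replacement: `flatten_line` and `flatten` are hypothetical names, and unlike the sed command this version also drops lines that end up empty.

```python
import re


def flatten_line(line):
    """Apply the three substitutions to a single line, in the same order as sed."""
    line = re.sub(r"^[^\[]*", "", line)          # drop text before the first "["
    line = re.sub(r"\][^\[]*$", "]", line)       # drop text after the last "]"
    return re.sub(r"\][^\[]*\[", "]\n[", line)   # put each token on its own line


def flatten(text):
    """Flatten a whole raw file; lines left empty are skipped (an addition to the sed version)."""
    flattened = (flatten_line(line) for line in text.splitlines())
    return "\n".join(line for line in flattened if line)


if __name__ == "__main__":
    print(flatten("creature_standard\n\t[CREATURE:DWARF] a comment [NAME:dwarf:dwarves:dwarven] more\n"))
```

A line with no tokens at all is reduced to nothing by the first substitution, which is how comment-only lines disappear.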
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 16, 2014, 10:09:24 am
I was thinking about a more complex, extended version of the diff format.
For example:
Code: [Select]
raw>object>entity_default.txt
?[ENTITY:MOUNTAIN]
+[NOPAIN]
-[PERMITTED_REACTION:SMELT_IRON] # just sample.
@[ENTITY:DWARVES]

Where

Code: [Select]
     ? find line
     + insert line
     - remove line
     @ replace line
     # comment

These operations can easily be combined.
For example, if you want to replace the tag ITEM_WEAPON:Sword with ITEM_WEAPON:Big Sword,
you just need to write it like this:
Code: [Select]
raw>object>item_weapon.txt
?[ITEM_WEAPON:Sword]
@[ITEM_WEAPON:Big Sword]

It should be much easier than regexp.
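
To make the idea concrete, here is one possible reading of those markers as a small Python sketch. The semantics are my assumptions, since the format above doesn't pin them down: `?` sets a cursor on the first matching line, `+` inserts right after the cursor, `-` removes the next matching line at or after the cursor, and `@` replaces the cursor line. The file-path header line is left out, and `apply_patch` is a hypothetical name.

```python
def apply_patch(lines, script):
    """Apply ?/+/-/@ operations to a list of raw lines and return the new list."""
    out = list(lines)
    cursor = None  # index of the line matched by the latest "?"
    for entry in script:
        entry = entry.split("#", 1)[0].strip()  # drop trailing "# comment"
        if not entry:
            continue
        op, arg = entry[0], entry[1:]
        if op == "?":
            cursor = out.index(arg)  # raises ValueError if the line is missing
        elif cursor is None:
            raise ValueError("operation before any '?' line: " + entry)
        elif op == "+":
            out.insert(cursor + 1, arg)
        elif op == "-":
            del out[out.index(arg, cursor)]
        elif op == "@":
            out[cursor] = arg
        else:
            raise ValueError("unknown operation: " + entry)
    return out


if __name__ == "__main__":
    raws = ["[ENTITY:MOUNTAIN]", "[PERMITTED_REACTION:SMELT_IRON]", "[CREATURE:DWARF]"]
    script = [
        "?[ENTITY:MOUNTAIN]",
        "+[NOPAIN]",
        "-[PERMITTED_REACTION:SMELT_IRON] # just sample.",
        "@[ENTITY:DWARVES]",
    ]
    print("\n".join(apply_patch(raws, script)))
```

A real implementation would need to decide what happens when the `?` line is missing or appears more than once; here a missing line simply raises an error.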
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 16, 2014, 11:13:07 am
While I agree with you about logging the token ID in the patch file, PeridexisErrant was concerned about the drama involved in modifying the patch format.

I believe flattening the raw file allows contextual block matching to take over.  So, in theory, creatures should line up and token differences should be noticed more.  It might not really be the ideal solution and may require additional parsing (such as token comparisons between the same object in two mods).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 16, 2014, 11:17:42 am
Here's a sed (*nix utility) command that will do it:
Code: [Select]
sed -e "s/^[^[]*//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g"
It removes comments and splits tokens to one per line. To just split up tokens you can do this:
Code: [Select]
sed -e "s/][^[]*\[/]\n\[/g"
You can also use those regular expressions with any regular expression library.

EDIT: But instead of doing this all in shell scripts, it would be better and more portable to use Python.

thx, linux cl works in powershell apparently
nm
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 16, 2014, 01:01:02 pm
Okay, final update.

I worked really hard to ensure that splitting tokens didn't create empty newlines.

Basically, tabs are removed first, then a carriage return "`r" and new line "`n" are inserted between each ][ pair.

works in powershell

Code: [Select]
$files = @(get-childitem -include *.txt -recurse -path $path -filter $filter)
Write-Host "files loaded";
foreach ($file in $files) {
    $out1Pass = "$file" + ".1pass"
    $outFile = "$file" + "2"

    Get-Content $file | Foreach-object {
        $_ -replace "`t","" `
           -replace '\]\[',"]`r`n["
    } | Set-Content $outFile
}




output is .txt2

Update
I did some tests trying a 3-way merge.  The merging process is a lot smoother with the flattened raws, but I can still see a need to check whether the tokens in conflicts are being duplicated.  That would resolve most conflicts, I think.

Update 2
I took a walk and thought about why I was having merge conflicts.  I was wondering if, using this parse method, it was due to not putting every token on its own line when it was next to a commented-out line...

However, the merge conflicts I had were not due to tokens being next to a comment (though I think I might have to update the parse method to consider whether a token is adjacent to a comment).

I think it might be because I applied two mods (a patch of 34_11 to accelerated, and a patch of accelerated to modest) as one patch over a 3rd mod (40_08), which I think may have caused some merge conflicts with duplicate grazer tags.

In other words, I'm thinking that if raws are parsed using this method, and the patches are applied one at a time to a 3rd mod, one might avoid conflicts.

For example:

diff of 34_11 vs 34_11 accelerated: as a patch
  applied to 40_08
diff of 34_11 accelerated vs 34_11 accelerated modest: as a patch
  applied to 40_08 accelerated

I certainly believe parsing the raw files in this manner before merging reduces the number of contextual conflicts.

Here's what a diff between 34_11 and 34_11 accelerated modest looks like with this parse.

http://pastebin.com/curfNcMH

Update 3:
I think if one back-updates some in-house object tracking variables (using these patch files), such as file_name and the first [object: name] token, using a scripting language like Python, one should be able to fill in the blanks when it comes to patching location (even without col, row information) ;)

Then one could parse the diff files and derive the object ID, and be able to ensure the diff file is being applied to the correct object/token ID (such as creature, item, entity, or whatever) by keeping track of duplicate tokens!

That would resolve the grazer issue.  If castes come into play...  I guess a simple check would do: if the current caste has a token duplicated, figure out which token is the master token to be used.
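
The duplicate-token check could look something like this in Python. This is only a sketch under some assumptions: the raws are already flattened to one token per line, objects start at a `[CREATURE:ID]` style line, and any token name repeated within one object is worth flagging (some tokens are legitimately repeated, so a real tool would need a whitelist). `duplicate_tokens` is a hypothetical name.

```python
from collections import Counter


def duplicate_tokens(token_lines, object_type="CREATURE"):
    """Map each object ID to the token names that appear more than once in it."""
    counts = {}
    current = None  # ID of the object currently being read
    for line in token_lines:
        fields = line.strip("[]").split(":")
        if fields[0] == object_type and len(fields) > 1:
            current = fields[1]
            counts[current] = Counter()
        elif current is not None:
            counts[current][fields[0]] += 1
    return {
        obj: sorted(name for name, n in seen.items() if n > 1)
        for obj, seen in counts.items()
        if any(n > 1 for n in seen.values())
    }


if __name__ == "__main__":
    merged = ["[OBJECT:CREATURE]", "[CREATURE:COW]", "[GRAZER:100]", "[GRAZER:200]",
              "[CREATURE:DOG]", "[NAME:dog:dogs:dog]"]
    print(duplicate_tokens(merged))
```

Castes are not handled here; tracking the current caste as a second level of nesting would be the obvious extension.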
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 16, 2014, 07:08:21 pm
The general impression I'm getting is that we can get quite a long way with standard libraries, and don't need to write complex token-based logic at this stage. 

Flattening clearly makes a standard diff much more powerful.  It's unclear to me how the mixed raws might later be un-flattened, and whether this is important.  The cases where flattening means more context is required are somewhat concerning, but hopefully not a show stopper.  I suggest we basically set the flattening aside for now, since it can be added back into the middle of the logic later without too much trouble and having a working codebase to add it to will help. 

I propose, for version 2.0 - ie NOT YET -
 - adding to the standard the *exact* regexp used to flatten the raws, (once it's agreed after testing)
 - that flattened raws go in the mod's folder in "/raws_flat/",
 - once we have an exact diff that also becomes part of the standard
 - diff-only mods go in "/raws-diffs/"
 - packs can then include any or all of these
 - the launcher must be able to accept any of these formats
 - but may use any of those present (since it must make no difference to the outcome)

I think it's worth distinguishing between a no-conflict n-way merge, and sequential (overwriting) simple merges.  The GUI should highlight the former green, the latter orange, and impossible merges red; ideally this is also applied to subsets (eg mods 1-3 green, 4 and 5 orange, 6 red) to give an indication of compatibility.  A stretch goal might be to find the least-conflicting load order for the user.  Part of the goal of this system is to make mods easy, and this simple feedback would help a lot. 
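
A very rough sketch of the green/orange part of that feedback, working purely at the file level. The assumptions here are mine: `changed_files` is a hypothetical dict mapping each mod name to the set of raw files it changes, and "orange" simply means the mod touches a file an earlier mod already touched. Detecting "red" needs an actual merge attempt, so it is left out.

```python
def classify_load_order(load_order, changed_files):
    """Return {mod: 'green' or 'orange'} for mods applied in the given order."""
    touched = set()  # files changed by the mods loaded so far
    status = {}
    for mod in load_order:
        files = changed_files[mod]
        # Overlap with an earlier mod means this one would overwrite or need merging.
        status[mod] = "orange" if touched & files else "green"
        touched |= files
    return status


if __name__ == "__main__":
    mods = {
        "mod_a": {"objects/creature_standard.txt"},
        "mod_b": {"objects/entity_default.txt"},
        "mod_c": {"objects/creature_standard.txt"},
    }
    print(classify_load_order(["mod_a", "mod_b", "mod_c"], mods))
```

Running the same function over every permutation of a small mod list would be one naive way to approach the "least-conflicting load order" stretch goal.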

If at all possible, it would be good to post code snippets in Python - shell scripts or whatever are fine for testing, but it's all got to be Python eventually and in many cases you can just import $tool and avoid an ugly workaround. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 16, 2014, 07:34:30 pm
Just want to chime in with a hope for this great project...  Please make the process of integrating/formatting mods into this system very very easy.  I feel that over the years the DFHack and Masterwork projects have basically turned modding into programming, taking away what made it accessible and fun in the beginning.  It'd be a shame if there was yet another barrier to entry in the form of a standard which everyone had to learn and comply to. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 16, 2014, 07:58:47 pm
Please make the process of integrating/formatting mods into this system very very easy.

It'd be a shame if there was yet another barrier to entry in the form of a standard which everyone had to learn and comply to.

This is a core part of the project:  the format is simply a normal raw folder, containing only changed files.  The launcher should be able to deal with a full install as well, by simply deleting the other files.  This makes creating and adding mods very very easy.  The standard should be trivial to comply with, since it's only what a mod needs anyway, and all the potentially-difficult manipulation is done in the program. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 16, 2014, 08:37:46 pm
Hi all.  First code - this simplifies the mod folders and df folders down to a raw folder, but leaves top-level files whose names include 'readme' or 'config'.  Oh, and builds a list of mods and vanilla raws, but that doesn't come into this bit. 

Next plan is to remove any files in any raw folder under the mods list that exactly match the corresponding vanilla file, after checking that there is only one vanilla folder to compare to!

Code: [Select]
import os
import shutil

mod_folder = 'LNP/Mods/'

mod_folders_list = []
vanilla_folders_list = []
for mod in os.listdir(mod_folder):
    if mod.startswith('df_'):
        vanilla_folders_list.append(mod)
    else:
        mod_folders_list.append(mod)


def simplify_mod_and_df_folders():
    for mod in os.listdir(mod_folder):
        files_removed = 0
        folders_removed = 0
        for item in os.listdir(mod_folder + mod):
            # delete anything top-level not containing string 'raw', 'readme', or 'config'
            if 'raw' in item:
                pass
            elif 'readme' in item.lower():
                pass
            elif 'config' in item.lower():
                pass
            elif os.path.isfile(mod_folder + mod + '/' + item):
                os.remove(mod_folder + mod + '/' + item)
                files_removed += 1
            else:
                shutil.rmtree(mod_folder + mod + '/' + item)
                folders_removed += 1
        if files_removed + folders_removed == 0:
            print(mod, 'folder is already simplified')
        else:
            print(mod, 'folder simplified!  (removed', files_removed, 'files and', folders_removed, 'folders)')

simplify_mod_and_df_folders()
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 16, 2014, 09:15:23 pm
I don't think you want to modify the mods. Just ignore the parts that you don't care about. So you want to create a list of valid files that are to be merged, but you don't want to remove everything else.
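
For example, building such a list without touching the mod folder could look like this. It's just a sketch: `list_raw_files` is a hypothetical name, and it assumes each mod keeps its files under a top-level `raw` folder as in the proposed format.

```python
import os


def list_raw_files(mod_path):
    """Return the mod's raw files as sorted paths relative to its raw folder."""
    raw_root = os.path.join(mod_path, "raw")
    found = []
    for dirpath, _dirnames, filenames in os.walk(raw_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Use forward slashes so the list is identical on Windows and *nix.
            found.append(os.path.relpath(full, raw_root).replace(os.sep, "/"))
    return sorted(found)
```

Readmes, saves and anything else at the top level are simply never listed, so nothing has to be deleted.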
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 16, 2014, 09:40:46 pm
I don't think you want to modify the mods. Just ignore the parts that you don't care about. So you want to create a list of valid files that are to be merged, but you don't want to remove everything else.

The goal is to imitate the LNP graphics loading; it works with a full install being referenced but you can choose to simplify / delete inactive stuff if you want to save space.  Later the removal functions can be called on temporary copies for processing if the user doesn't want to delete this stuff, but for now I'm building them and don't mind testing directly. 

Next up is to detect and remove vanilla files (likewise movable to temp files later), and then the first merge logic: overwrite a copy of the vanilla folder with the mod.  Actual merging comes later ;D
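
A minimal sketch of that first "merge" step, i.e. a plain overwrite of a fresh vanilla copy with the mod's files. The function name `build_raw` and its three path arguments are placeholders rather than anything in the script so far; the old output folder is deleted first, as in the format proposal.

```python
import os
import shutil


def build_raw(vanilla_raw, mod_raw, output_raw):
    """Copy the vanilla raws to output_raw, then overwrite with the mod's files."""
    if os.path.isdir(output_raw):
        shutil.rmtree(output_raw)  # discard the previously generated folder
    shutil.copytree(vanilla_raw, output_raw)
    for dirpath, _dirnames, filenames in os.walk(mod_raw):
        rel = os.path.relpath(dirpath, mod_raw)
        target_dir = os.path.normpath(os.path.join(output_raw, rel))
        os.makedirs(target_dir, exist_ok=True)  # mods may add new subfolders
        for name in filenames:
            shutil.copy2(os.path.join(dirpath, name), os.path.join(target_dir, name))
```

Calling it once per mod in load order, and skipping the delete-and-copy step after the first mod, would give the sequential overwriting behaviour.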
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 16, 2014, 09:50:41 pm
So the pseudocode I imagine is this:
Code: [Select]
string vanilla_raw_location //constant
string generated_raw_location //constant
mod_folders_list = find_mod_folders()
mod_raw_paths = get_raw_paths(mod_folders_list)

copy_vanilla_raws(vanilla_raw_location, generated_raw_location)

for each mod_raw_path in mod_raw_paths
  for each raw_relative_path in get_valid_raw_files(mod_raw_path)

      from_file = mod_raw_path + raw_relative_path
      to_file = generated_raw_location + raw_relative_path
      vanilla_file = vanilla_raw_location + raw_relative_path

      if file(to_file) does not exist
         copy_file(from_file, to_file)
      else if file(vanilla_file) does not exist
         cleanup_and_abort()
      else if diff3(from_file, vanilla_file, to_file) does not merge smoothly
         cleanup_and_abort()
      else
         merge(from_file, to_file)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 16, 2014, 10:02:04 pm
I don't think you want to modify the mods. Just ignore the parts that you don't care about. So you want to create a list of valid files that are to be merged, but you don't want to remove everything else.

The goal is to imitate the LNP graphics loading; it works with a full install being referenced but you can choose to simplify / delete inactive stuff if you want to save space.  Later the removal functions can be called on temporary copies for processing if the user doesn't want to delete this stuff, but for now I'm building them and don't mind testing directly. 

Next up is to detect and remove vanilla files (likewise movable to temp files later), and then the first merge logic: overwrite a copy of the vanilla folder with the mod.  Actual merging comes later ;D
You're just doing extra work that way. Trimming functions may be useful for some things, but you don't need them here. You just need a list of valid files specified as a mod-relative path.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 16, 2014, 11:10:20 pm
You're just doing extra work that way. You just need a list of valid files specified as a mod-relative path.
Quite possibly.  However I'd rather have working code to refactor than spend years in design, so...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 12:04:16 am
Even better script, guys.

Thanks to briantist on stackexchange

http://stackoverflow.com/questions/25345739/powershell-regexp-to-replace-any-character-left-bracket-and-replace-wit

Code: [Select]
$files = @(get-childitem -include *.txt -recurse -path $path -filter $filter)
Write-Host "files loaded";
foreach ($file in $files) {
    $out1Pass = "$file" + ".1pass"
    $outFile = "$file" + "2"

    Get-Content $file | Foreach-object {
        $_ -replace '(?m)^\s*','' `
           -replace '(\[.+?\][^\[\r\n]*)(?=\[)' , "`$1`r`n"
    } | Set-Content $outFile
}
   
input: *.txt

output: *.txt2

This method should hopefully move comments onto their own lines as well.

Update

So... to test these changes out yourself, install a git client (I use GitHub, it's all I know).

You can then derive your own patch files after parsing your raws and replacing the old ones with the generated ones.

How?
install a git client
create a commit based on vanilla raws
run the above script on your copy of the vanilla raws
replace your raws with the "flattened" raws
recommit
create a branch based on your old client (can be done using the filter menu by typing in a new branch if using GitHub)
open a git console, and run this to derive a patch file

git diff a b > mynewpatch.patch

a being the hex commit # of pre changes
and
b being the hex commit # of post changes

ex.

a = vanilla hex commit
b = civforge hex commit

that should give you a contextual diff file

then you can use tortoisegitmerge to merge the diff file to a mod folder.

I would imagine that if there are no conflicts, that you could use diff command line to do all the merging

Update 2
I just made some merges using this parsing method, and it still requires manual intervention.  Maybe about 5 minutes' worth of work to apply a mod, then derive a patch.

With this system, whoever implements the mod options has to resolve all the manual merge conflicts themselves, then derive a patch that will be run over vanilla.

Which means version tracking for the mod creator, to fine-tune the commits/new branches and base patch files off those branches.

Here's my patch files:

http://dffd.wimbli.com/file.php?id=9418

...

In all honesty, it would almost be better to use a github repository to host all the mod options from...


Otherwise, you have to basically ship out a git offline client to the end user to be able to fire up and have all the local repositories available.

Or... one could make an elaborate setup of patch files to apply, but ultimately, someone somewhere has to keep track of it using a git version tracking.

Update again
Reuploaded a new set of patches, missed the 40_08 patch accelerated (applied)...

http://dffd.wimbli.com/file.php?id=9418
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 01:06:37 am
You're just doing extra work that way. You just need a list of valid files specified as a mod-relative path.
Quite possibly.  However I'd rather have working code to refactor than spend years in design, so...
Understood. I don't wanna tell you how to code.

Let me know if I can help with anything.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 01:37:15 am
I figured out what causes the merge conflicts...

It's when I derive a patch from:

34_11 to accelerated

then...

I try to apply it to

40_08

The conflicts arise because my patch file is confused when it expects a 34_11 line and sees a 40_08 line.  The only way to fix that is to incorporate into the patch the 34_11 to 40_08 changes as well as the accelerated patch.

so...

to alleviate this...  it's almost like we need to apply a diff patch of 34_11 to 40_08 + the accelerated patch in the same patch file somehow.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 01:51:09 am
To be honest, I think it's perfectly reasonable to ask people to update their mods for a major DF update.  The diff should cope with 40_01 to 40_08 raws and associated changes, but 34.11 to 40.01 is a bit much.  For that case, producing a diff against vanilla of the relevant version should let you manually update pretty easily - but it's not my problem for this tool.  The script I've got so far expects all mods to be for the latest minor version anyway, and that won't change until a configuration.xml (or similar) is implemented. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 02:56:38 am
Happily, it looks like there are some rather powerful tools for this kind of thing in the Python standard libraries (who am I kidding, of course there are). 

See especially difflib (https://docs.python.org/3.4/library/difflib.html), which looks like it calculates perfect git-style deltas and patch files.  I imagine that could save you a fair bit of clicking, Thistleknot.
filecmp (https://docs.python.org/3.4/library/filecmp.html) should also make it trivial to isolate the files which are different to the vanilla raws. 
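
As a quick illustration of the difflib side, here is a minimal sketch that produces a unified diff between a vanilla file's text and a mod's version. `make_patch` and the `vanilla/` and `mod/` path prefixes are made up for the example. As far as I can tell, difflib only generates diffs; applying a unified patch would still need separate code or an external tool.

```python
import difflib


def make_patch(vanilla_text, mod_text, rel_path):
    """Return a unified diff (as one string) turning vanilla_text into mod_text."""
    diff = difflib.unified_diff(
        vanilla_text.splitlines(keepends=True),
        mod_text.splitlines(keepends=True),
        fromfile="vanilla/" + rel_path,
        tofile="mod/" + rel_path,
    )
    return "".join(diff)


if __name__ == "__main__":
    vanilla = "[CREATURE:COW]\n[GRAZER:100]\n"
    modded = "[CREATURE:COW]\n[GRAZER:200]\n"
    print(make_patch(vanilla, modded, "objects/creature_domestic.txt"))
```

Identical inputs give an empty string, which makes it easy to skip files the mod didn't actually change.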
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 17, 2014, 03:53:35 am
Just a quick question... Valdemar's mod manager (http://www.bay12forums.com/smf/index.php?topic=74828.0) looks like it does the comparison in a better way by pulling out all the entities from the raws and comparing them on an individual basis... Since the code is already available, and this way would make resolving conflicts clearer and easier, why don't you guys use this instead of a text comparison that is, in computational terms, relatively meaningless?  I don't see the long term use of a diff patch.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 07:01:22 am
Just a quick question... Valdemar's mod manager (http://www.bay12forums.com/smf/index.php?topic=74828.0) looks like it does the comparison in a better way by pulling out all the entities from the raws and comparing them on an individual basis... Since the code is already available, and this way would make resolving conflicts clearer and easier, why don't you guys use this instead of a text comparison that is, in computational terms, relatively meaningless?  I don't see the long term use of a diff patch.

There are two main reasons to use diffs now (though I'm just going to get file overwrites working first):
 - fast, easy, standard logic means that it's easy to implement in this and also for anyone else to make their own compatible launcher
 - the goal of this tool is to act as a mod loader, not an advanced mod merge tool

I'm definitely interested in exploring more advanced merging eventually, but what this needs most of all is to get added to something like the LNP where anyone can use it and add content to it. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 07:47:16 am
I plan on using mod manager's merge for my own personal project BTW.

It is entirely possible to do the same thing that mod manager does using regular patch files and merely checking the states of objects before and after merges (python anyone?)

My idea of merging 34_11 mods into 40_08 was just an example of PORTING mods over and HOW it can be done, but expect to resolve some minor diff conflicts and then derive a final 40_08 diff for your project.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 08:15:59 am
Ok, I now have code that will turn a folder full of modded DF installs into a much smaller folder full of mods conforming with the format at the top of the thread.  That is, 'LNP/Mods/$mod' contains only files with 'readme' or 'config' in the name, and a raw folder containing those files that are not identical to vanilla. 

For Accelerated Modest Mod (40.08), that shrinks it from 10MB to 2MB.  Not bad for such naive code!

Code: [Select]
import os
import shutil
import filecmp

# path from script to mods folder
mods_folder = 'LNP/Mods/'

mod_folders_list = []
for mod in os.listdir(mods_folder):
    if mod.startswith('df_'):
        # there must be only one folder beginning 'df_', the basis of comparison
        vanilla_folder = mods_folder + mod
        vanilla_raw_folder = vanilla_folder + '/raw/'
    else:
        mod_folders_list.append(mod)


def simplify_mod_and_df_folders():
    for mod in os.listdir(mods_folder):
        files_removed = 0
        folders_removed = 0
        for item in os.listdir(mods_folder + mod):
            # delete anything top-level not containing string 'raw', 'readme', or 'config'
            if item == 'raw':
                pass
            elif 'readme' in item.lower():
                pass
            elif 'config' in item.lower():
                pass
            elif os.path.isfile(mods_folder + mod + '/' + item):
                os.remove(mods_folder + mod + '/' + item)
                files_removed += 1
            else:
                shutil.rmtree(mods_folder + mod + '/' + item)
                folders_removed += 1
        if not files_removed + folders_removed == 0:
            print(mod, 'folder simplified!  (removed', files_removed, 'files and', folders_removed, 'folders)')

def remove_vanilla_files_from_mod_raws():
    for mod in mod_folders_list:
        mod_raw_folder = mods_folder + mod + '/raw/'
        files_removed = 0
        for file_tuple in os.walk(mod_raw_folder):
            for item in file_tuple[2]:
                item_path_str = os.path.join(file_tuple[0], item).replace('\\', '/').replace(mod_raw_folder, '')
                if os.path.isfile(vanilla_raw_folder + item_path_str): # if the file exists in the vanilla raws
                    if filecmp.cmp(mod_raw_folder + item_path_str, vanilla_raw_folder + item_path_str): # and it's the same file
                        os.remove(mod_raw_folder + item_path_str)
                        files_removed += 1
        if files_removed > 0:
            print('\n' + mod + ':\n    removed', files_removed, 'files identical to vanilla raws')

def make_raws_with_mods():
    print('What mods do you want to load?')
    for mod in mod_folders_list:
        print('    ', mod_folders_list.index(mod), mod)
    mods_to_load = input('Enter the indices in load order separated by spaces.\n    ')

    # gives ordered list with indices as strings
    mods_to_load = mods_to_load.split(' ')

    print('Just kidding, this doesn\'t do anything yet!')


simple_folders_q = input('Do you want to simplify mod folders? (y/n)\n    ')
if simple_folders_q == 'y':
    simplify_mod_and_df_folders()
    print('  Done!\n')
else:
    print('Not simplifying mod folders\n')

remove_files_q = input('Do you want to remove files from mods which are identical to vanilla raws? (y/n)\n    ')
print(remove_files_q)
if remove_files_q == 'y':
    remove_vanilla_files_from_mod_raws()
    print('  Done!\n')
else:
    print('Not removing vanilla files\n')

make_raws_with_mods()
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 08:38:43 am
So... I think I may have a solution with how to apply older mods to newer versions of the game.

http://cyberelk.net/tim/patchutils/man/rn01re02.html

http://en.wikipedia.org/wiki/Quilt_%28software%29

Basically, merge patch files.

Instead of trying to apply a 34_11 mod to 40_08 as a patch of changes that were based on 34_11 rather than 40_08...

One could take a diff patch of 34_11 to 40_08,
  call this 34-11To40-08.patch
And a 34_11 to Accelerated,
  call this 34_11ToAccelerated.patch

Then, using the tool combinediff (I guess any conflicts would have to be addressed on the patch file level)

something like combinediff 34-11To40-08.patch 34_11ToAccelerated.patch > 34_11To40_08_Accelerated.patch

and apply this patch to a 34_11 version of the game to achieve a v40_08 mod of the game, then do a diff comparison from 40_08 to this mod of v40_08

In light of what that script is doing, I recommend this one now

Code: [Select]
$files = @(get-childitem -include *.txt -recurse -path $path -filter $filter)
Write-Host "files loaded";
foreach ($file in $files) {
    $out1Pass = "$file" + ".1pass"
    $outFile = "$file" + "2"

    Get-Content $file | Foreach-object {
        $_ -replace "`t","" `
           -replace '\]\[',"]`r`n["
    } | Set-Content $outFile
}

I was reading on what it does, and it rewrites the replacement operations.  I'm gonna push a better script in the next day or two if I don't learn a better way to write a script.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 08:51:25 am
Touching on Fricy's comment from the link on the 1st page

Quote
The best'd be if there were total conversions to choose from, and also mini mods/addon packs that expand the base game, like streamlined leather and display case for eg.

Total conversions almost always entail a full rewrite.  A patch file CAN handle it, but it's merely going to copy/replace existing data.  Small mods will hardly be noticeable, but mods that standardize the raw file, say, alphabetically, and rename reactions... I think are going to have huge diffs.

Btw.  I just realized a minor issue with that powershell script again... I described it in the original stackexchange thread.
http://stackoverflow.com/questions/25345739/powershell-regexp-to-replace-any-character-left-bracket-and-replace-wit

Basically it's not fully putting comments onto their own lines.  But to be honest, anyone who knows python and regular expressions can probably handle it a little better than myself.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 09:17:28 am
You can just use the 3 regular expressions I wrote. You do have to make sure for the first 2 that ^ and $ correctly match the beginning and end of a line, but they'll work on any regular expression parser, not just sed. Try them with -replace.

And I too encourage you to use Python. Getting windows scripts to run on my machine is a bit inconvenient. Last time I tried to boot into windows it was having graphics issues. Another scripting language is fine too. Even C/C++ is portable and C++11 has a regex library.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 09:24:00 am
how do I go about swapping your -e lines with the -replace function?

I don't see how the -e is specifying what to match and replace with.

As a workaround, I was going to install cygwin sed for gnuwin32

Update
okay,

sed -e "s/^[^[]*//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g" *.txt does something.

I suppose one would have to do a

sed -e "s/^[^[]*//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g" file.txt > file.out if wanted on a per file basis

Update
http://ask.metafilter.com/8335/Batchcorrect-files

A way to batch process and leave source files alone
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 09:35:53 am
Ok, I now have code that will turn a folder full of modded DF installs into a much smaller folder full of mods conforming with the format at the top of the thread.  That is, 'LNP/Mods/$mod' contains only files with 'readme' or 'config' in the name, and a raw folder containing those files that are not identical to vanilla. 

For Accelerated Modest Mod (40.08), that shrinks it from 10MB to 2MB.  Not bad for such naive code!

Code: [Select]
import os
import shutil
import filecmp

# path from script to mods folder
mods_folder = 'LNP/Mods/'

mod_folders_list = []
for mod in os.listdir(mods_folder):
    if mod.startswith('df_'):
        # there must be only one folder beginning 'df_', the basis of comparison
        vanilla_folder = mods_folder + mod
        vanilla_raw_folder = vanilla_folder + '/raw/'
    else:
        mod_folders_list.append(mod)


def simplify_mod_and_df_folders():
    for mod in os.listdir(mods_folder):
        files_removed = 0
        folders_removed = 0
        for item in os.listdir(mods_folder + mod):
            # delete anything top-level not containing string 'raw', 'readme', or 'config'
            if item == 'raw':
                pass
            elif 'readme' in item.lower():
                pass
            elif 'config' in item.lower():
                pass
            elif os.path.isfile(mods_folder + mod + '/' + item):
                os.remove(mods_folder + mod + '/' + item)
                files_removed += 1
            else:
                shutil.rmtree(mods_folder + mod + '/' + item)
                folders_removed += 1
        if not files_removed + folders_removed == 0:
            print(mod, 'folder simplified!  (removed', files_removed, 'files and', folders_removed, 'folders)')

def remove_vanilla_files_from_mod_raws():
    for mod in mod_folders_list:
        mod_raw_folder = mods_folder + mod + '/raw/'
        files_removed = 0
        for file_tuple in os.walk(mod_raw_folder):
            for item in file_tuple[2]:
                item_path_str = os.path.join(file_tuple[0], item).replace('\\', '/').replace(mod_raw_folder, '')
                if os.path.isfile(vanilla_raw_folder + item_path_str): # if the file exists in the vanilla raws
                    if filecmp.cmp(mod_raw_folder + item_path_str, vanilla_raw_folder + item_path_str): # and it's the same file
                        os.remove(mod_raw_folder + item_path_str)
                        files_removed += 1
        if files_removed > 0:
            print('\n' + mod + ':\n    removed', files_removed, 'files identical to vanilla raws')

def make_raws_with_mods():
    print('What mods do you want to load?')
    for mod in mod_folders_list:
        print('    ', mod_folders_list.index(mod), mod)
    mods_to_load = input('Enter the indices in load order separated by spaces.\n    ')

    # gives ordered list with indices as strings
    mods_to_load = mods_to_load.split(' ')

    print('Just kidding, this doesn\'t do anything yet!')


simple_folders_q = input('Do you want to simplify mod folders? (y/n)\n    ')
if simple_folders_q == 'y':
    simplify_mod_and_df_folders()
    print('  Done!\n')
else:
    print('Not simplifying mod folders\n')

remove_files_q = input('Do you want to remove files from mods which are identical to vanilla raws? (y/n)\n    ')
print(remove_files_q)
if remove_files_q == 'y':
    remove_vanilla_files_from_mod_raws()
    print('  Done!\n')
else:
    print('Not removing vanilla files\n')

make_raws_with_mods()
Looks good. One small fix:
xxx.replace(mod_raw_folder, '') -> xxx[len(mod_raw_folder):]
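To illustrate why slicing is the safer of the two, here is a minimal sketch. The helper names and the example path are made up for illustration and are not part of the script above. str.replace removes every occurrence of the prefix text, not just the leading one, so a path that happens to contain that text again gets mangled.

```python
def relative_by_replace(path, prefix):
    # Removes every occurrence of prefix, wherever it appears in path.
    return path.replace(prefix, '')


def relative_by_slice(path, prefix):
    # Removes only the leading prefix, and refuses paths that lack it.
    if not path.startswith(prefix):
        raise ValueError('%r does not start with %r' % (path, prefix))
    return path[len(prefix):]


# Hypothetical mod layout in which the prefix text appears twice.
prefix = 'raw/'
path = 'raw/graphics/raw/tiles.txt'

print(relative_by_replace(path, prefix))  # inner 'raw/' is lost as well
print(relative_by_slice(path, prefix))    # only the leading 'raw/' is dropped
```

os.path.relpath from the standard library is another option, if normalised paths are acceptable.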
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 09:38:23 am
how do I go about swapping your -e lines with the -replace function?

I don't see how the -e is specifying what to match and replace with.

As a workaround, I was going to install cygwin sed for gnuwin32
-e is not a function. It just tells sed that each of those is a separate command to apply. The "s" (substitute), however, is.

I'd expect that doing it with -replace would look something like this:
Code: [Select]
$_ -replace  "^[^[]*",""
$_ -replace  "][^\[]*$","]"
$_ -replace  "][^[]*\[","]\n\["

Installing cygwin is great, but a bit overkill for this. It's quite a large download under default settings. (and removing everything is a pain)
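For comparison, here is a rough Python sketch of the same three substitutions, applied to one line at a time the way sed does. The function name is invented for illustration, and the brackets are escaped a little more defensively than in the sed version. Note that a line with no [ at all (such as the file name on the first line of a raw file) is emptied by the first substitution.

```python
import re


def strip_comments(line):
    """Apply the three sed-style substitutions to one line (no trailing newline)."""
    line = re.sub(r"^[^\[]*", "", line)          # drop text before the first [
    line = re.sub(r"\][^\[]*$", "]", line)       # drop text after the last ]
    line = re.sub(r"\][^\[]*\[", "]\n[", line)   # between tokens: newline instead
    return line


print(strip_comments("###test###[A:1]###x###[B] trailing"))
print(repr(strip_comments("item_gloves")))
```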
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 09:48:51 am
I got your function to work, but by removing comments, it removes the 1st line of the file, the file_name.

I basically ran

sed -i -e "s/][^[]*\[/]\n\[/g" *.txt

I tried both scripts btw.

The original idea was to find each *[ or ]* (with * representing any character) and inject a newline between the bracket and the *.  This can be tricky due to whitespace, but generally tabs don't count.


To resolve it, the filename would have to be injected into the 1st line after the sed argument.

It also has the weird behaviour of putting in blank lines at the top of the outputted files.

Update
Okay... it seems to have worked with one file when it just outputs to the stream, but still has the issue of not properly putting comments on their own line

Code: [Select]
item_gloves

[OBJECT:ITEM]

###test###[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]
[NAME:gauntlet:gauntlets]
[ARMORLEVEL:2]
[UPSTEP:1]
[SHAPED]
[LAYER:ARMOR]
[COVERAGE:100]
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]

I ran

sed -e "s/][^[]*\[/]\n\[/g" *.txt
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 10:04:54 am
You can change "^[^[]*" to "^\s*" so it just removes leading whitespace, but not leading comments. Otherwise you would have to treat the first line specially. I suspect the blank lines are just the lines where it removed comments: it doesn't strip empty lines, so the comments are removed but the line stays.

And yeah, I forgot that on Windows you'll need \n to be \r\n, or I guess `r`n (if ` is PowerShell's escape character).

Actually, if replace isn't a line editor like sed, then you may just omit the first expression. The second will remove trailing comments only, and the last one will handle everything else. Maybe I should look up the documentation of replace instead of guessing...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 10:14:41 am
"^[^[]*"

I searched for that string, closest I found is
"s/^[^[]*//"

in
sed -e "s/^[^[]*//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g"

so

this?
sed -e "s/\s//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g"
yep

Still leaves the comments adjacent to tokens

try this as input data
Code: [Select]
item_gloves

[OBJECT:ITEM]

###test###
[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]###test###
[NAME:gauntlet:gauntlets]
###test###[ARMORLEVEL:2]
[UPSTEP:1]
###test###[SHAPED]
[LAYER:ARMOR]###test######test###
[COVERAGE:100]
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]

I'm looking at the regular expressions document on python's page
https://docs.python.org/2/howto/regex.html



may have an answer here
http://stackoverflow.com/questions/6125098/how-to-match-any-non-white-space-character-except-a-particular-one
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 10:32:02 am
So instead of mucking around with shell scripts, here's a Python script that will remove all internal comments (but not leading or trailing comments), and put each tag on its own line:
Code: [Select]
import re
import sys

print re.sub("][^[]*\[","]\n[",sys.stdin.read())


Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 10:39:05 am
Code: [Select]
C:\Games\Dwarf Fortress\github comparisons\BasedOnVanillaRaws\BasedOnVanillaRaws
\test_SafetoDeleteMe>py
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (In
tel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> import sys
>>>
>>> print re.sub("][^[]*\[","]\n[",sys.stdin.read())
  File "<stdin>", line 1
    print re.sub("][^[]*\[","]\n[",sys.stdin.read())
           ^
SyntaxError: invalid syntax
>>>

yeah, I'm just trying to nail down the "correct" parse method so I can do some diff comparisons and attempt some merges, but... comments are a big deal.  They add signature data to the file by being a contextual signature, and they are harmless/uniform if put on their own line like tokens are

preserving comments also preserves commented out tokens, as well as file_name information.

The problem I've noticed with most past solutions is that everything checks whether two brackets are next to each other, but fails to separate non-bracket comments from token lines.

This is important because if a comment is adjacent to a token, a diff will read that as more than it should, because diff reads the whole line.  For tokens, we only want a token read, not some comment[token] or [token]comment.  It should be

comment
[token]

or

[token]
comment
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 10:43:07 am
"^[^[]*"

I searched for that string, closest I found is
"s/^[^[]*//"

in
sed -e "s/^[^[]*//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g"

so

this?
sed -e "s/\s//" -e "s/][^\[]*$/]/" -e "s/][^[]*\[/]\n\[/g"
yep

Here's an explanation of what's going on with "^[^[]*"

The initial ^ matches the beginning. In sed this means the beginning of a line, because sed matches everything per line only. In Python ^ will by default match only the beginning of the whole string/file. I don't know about PowerShell.

The [^[]* matches all non-[ characters. This is a character set in brackets. The ^ inverts the character set. It contains the one member [. So with the * it matches all characters up until the first [.

So it removes leading comments.
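The difference between sed's per-line ^ and Python's default whole-string ^ can be seen in a small sketch. The function name and sample text are made up for illustration. With re.MULTILINE, ^ also matches right after each newline, and the character class has to exclude \n as well so that a match cannot run on into the next line.

```python
import re


def strip_leading_comments(text, per_line=False):
    """Remove text before the first [ of the string, or of every line."""
    if per_line:
        # ^ matches at the start of every line; \n is excluded from the
        # class so one match never swallows a following line.
        return re.sub(r"^[^\[\n]*", "", text, flags=re.MULTILINE)
    # Default: ^ matches only at the very start of the string.
    return re.sub(r"^[^\[]*", "", text)


sample = "comment one[A]\ncomment two[B]\n"
print(repr(strip_leading_comments(sample)))
print(repr(strip_leading_comments(sample, per_line=True)))
```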
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 10:50:08 am
would this match any character followed by ]?

And I DON'T want to remove comments (ATM)

/[\s\[]/


Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 11:01:07 am
Code: [Select]
C:\Games\Dwarf Fortress\github comparisons\BasedOnVanillaRaws\BasedOnVanillaRaws
\test_SafetoDeleteMe>py
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (In
tel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> import sys
>>>
>>> print re.sub("][^[]*\[","]\n[",sys.stdin.read())
  File "<stdin>", line 1
    print re.sub("][^[]*\[","]\n[",sys.stdin.read())
           ^
SyntaxError: invalid syntax
>>>

yeah, I'm just trying to nail down the "correct" parse method so I can do some diff comparisons and attempt some merges, but... comments are a big deal.  They add signature data to the file by being a contextual signature, and they are harmless/uniform if put on their own line like tokens are

preserving comments also preserves commented out tokens, as well as file_name information.

the problem I've noticed with most past solutions, is... everything is checking if two brackets are next to each other, but fail to separate non bracket comments apart from token lines.

This is important because if a comment is adjacent to a token.  a diff will read that as more than what it should, because diff reads the whole line.  For tokens, we only want a token read, not some comment[token] or [token]comment.  It should be

comment
[token]

or

[token]
comment
Your Python is version 3, where print is a function rather than a statement, so it needs parentheses. Adding ( after print, and ) at the end should fix that.

So you want to preserve comments but put them on their own line. Ok. This unfortunately generates an extra newline in the simple case.
Code: [Select]
import re
import sys
print(re.sub(r"\]\s*([^\[]*?)\s*\[", "]\n\\1\n[", sys.stdin.read()))

Or you can do it line by line, like sed.
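A line-by-line version could look roughly like the sketch below. It is one possible approach, not a tested tool: it splits each line into bracketed tokens and the comment text between them, then writes every piece on its own line. The function names are invented here. Blank lines are kept as they are, and a stray unmatched bracket is silently dropped, which may or may not be what is wanted.

```python
import re

# A bracketed token, or a run of text that contains no brackets at all.
PIECE = re.compile(r"\[[^\[\]]*\]|[^\[\]]+")


def flatten_line(line):
    """Split one line into tokens and comments, dropping empty pieces."""
    pieces = [piece.strip() for piece in PIECE.findall(line)]
    return [piece for piece in pieces if piece]


def flatten_raws(text):
    """Put every token and every comment on its own line."""
    out = []
    for line in text.splitlines():
        pieces = flatten_line(line)
        out.extend(pieces if pieces else [""])  # keep blank lines blank
    return "\n".join(out) + "\n"


print(flatten_raws("###test###[ARMORLEVEL:2][UPSTEP:1]###test###\n"), end="")
```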
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 11:05:14 am
I have no idea how to use this script in windows.

I tried creating a .py file with it and running it as

py test.py *.txt

or even py text.py

or just running the commands in a py console

but it just sits there blank looking at me as if I'm supposed to feed it data
I tried typing in *.txt...
still blank

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 11:07:58 am
would this match any character followed by ]?

And I DON'T want to remove comments (ATM)

/[\s\[]/
That matches any whitespace or [, so no.

All characters followed by ] is /[^\]]*]/ (the final ] matches itself). But you don't want to touch the inside of the token, so you're more likely to want to match everything before [, which is /[^[]*/
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 11:13:58 am

but it just sits there blank looking at me as if I'm supposed to feed it data
I tried typing in *.txt...
still blank
You need to feed it input with <raw_file.txt, and write the output to file with >outputfile.txt. so the command might look like "python pyscript.py <creature_birds.txt >creature_birds.test.txt". You can also paste the script directly into the terminal with python -c 'script_body', again specifying input and output files with < and >. 

This has the effect of feeding standard input from the input file, and sending standard output to the output file.

EDIT: The "unexpected end of regular expression" error is because I forgot to escape the last [. I fixed it, but apparently not in the script I posted. Look at it now.
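As an alternative to redirection, a script can take file names as arguments. The Windows command prompt does not expand *.txt before the script sees it, so the script has to do that itself. Below is a minimal sketch using the standard glob module; the function names are made up, and the pass-through in main is only a placeholder for real processing.

```python
import glob
import sys


def expand_patterns(patterns):
    """Return the files matching each pattern, sorted within each pattern."""
    paths = []
    for pattern in patterns:
        paths.extend(sorted(glob.glob(pattern)))
    return paths


def main(argv):
    for path in expand_patterns(argv):
        with open(path, errors='replace') as handle:
            text = handle.read()
        # Placeholder: replace this with whatever per-file processing is wanted.
        sys.stdout.write(text)


if __name__ == '__main__':
    main(sys.argv[1:])
```

With this saved as a file, something like "py script.py *.txt" would process every matching file in the current folder.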
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 11:19:22 am
i'm working on the sed thing tbh with all the issues I'm having with python

Code: [Select]
C:\Games\Dwarf Fortress\github comparisons\BasedOnVanillaRaws\BasedOnVanillaRaws
\test_SafetoDeleteMe>py test.py < item_gloves.txt
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print(re.sub("]\s*([^[]*?)\s*[","]\n\\1\n[",sys.stdin.read()))
  File "C:\Python34\lib\re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "C:\Python34\lib\re.py", line 288, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python34\lib\sre_compile.py", line 465, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python34\lib\sre_parse.py", line 746, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python34\lib\sre_parse.py", line 358, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python34\lib\sre_parse.py", line 484, in _parse
    raise error("unexpected end of regular expression")
sre_constants.error: unexpected end of regular expression

C:\Games\Dwarf Fortress\github comparisons\BasedOnVanillaRaws\BasedOnVanillaRaws
\test_SafetoDeleteMe>

try2
Code: [Select]
C:\Games\Dwarf Fortress\github comparisons\BasedOnVanillaRaws\BasedOnVanillaRaws
\test_SafetoDeleteMe>python test.py <item_gloves.txt >test.txt
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print(re.sub("]\s*([^[]*?)\s*[","]\n\\1\n[",sys.stdin.read()))
  File "C:\Python34\lib\re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "C:\Python34\lib\re.py", line 288, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python34\lib\sre_compile.py", line 465, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python34\lib\sre_parse.py", line 746, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python34\lib\sre_parse.py", line 358, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python34\lib\sre_parse.py", line 484, in _parse
    raise error("unexpected end of regular expression")
sre_constants.error: unexpected end of regular expression

I don't know.  i've spent too much time on this already.  I could have just finished up my own manual parsing of tokens in c by now.

I have no idea how regular expressions work.  There were some fancy suggestions to use readahead, and I think negative readahead (that were mentioned on stackexchange)

This matches any character followed by a [

but... I'm not sure how to use sed to replace *[ with *NewLine[

sed -e "s/[^[]*//"

Here's the sample file
Code: [Select]
item_gloves

[OBJECT:ITEM]

###test###[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]###test###
[NAME:gauntlet:gauntlets]###test###
[ARMORLEVEL:2][UPSTEP:1]###test###
[SHAPED]
[LAYER:ARMOR]###test######test###
[COVERAGE:100]
###TEST
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]

desired output
Code: [Select]
item_gloves

[OBJECT:ITEM]

###test###
[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]
###test###
[NAME:gauntlet:gauntlets]
###test###
[ARMORLEVEL:2]
[UPSTEP:1]
###test###
[SHAPED]
[LAYER:ARMOR]
###test######test###
[COVERAGE:100]
###TEST
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]


I suppose since I'm using a wildcard, I have to store a variable...

SED s/abc/xyz/g filename

That means substitute abc with xyz throughout the file.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 11:30:23 am
The issue is not with python, I just had a bug in my regular expression.

sed just gets you succinctness at the cost of portability. This is how to swap with sed:
Code: [Select]
sed s/replace_me/replace_with/g <from_file.txt >to_file.txt
And this is the Python script (which can be run from a file):
Code: [Select]
import re
import sys

for line in sys.stdin:
  # each line keeps its own newline, so don't let print add a second one
  print(re.sub("replace_me","replace_with",line), end="")
(If run from the terminal be sure to use single quotes around the code or escape the double quotes.)

Of course, substitute replace_me and replace_with as appropriate.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 11:31:32 am
The issue is not with python, I just had a bug in my regular expression.

sed just gets you succinctness at the cost of portability. This is is how to swap with sed:
Code: [Select]
sed s/replace_me/replace_with/g <from_file.txt >to_file.txt
And this is the python script (which can be run from file)
Code: [Select]
import re
import sys

for line in sys.stdin.readlines():
  print(re.sub("replace_me","replace_with",line))
(If run from the terminal be sure to use single quotes around the code or escape the double quotes.)

Of course replace_me and replace_with as appropriate.

sed is gnuwin32 though.  It was like a 256kb download

I need to stop spamming this thread and sort through the answers I have atm and see if I can come up with something using sed
http://word.mvps.org/faqs/general/usingwildcards.htm
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 11:44:13 am

I don't know.  i've spent too much time on this already.  I could have just finished up my own manual parsing of tokens in c by now.

I have no idea how regular expressions work.  There were some fancy suggestions to use readahead, and I think negative readahead (that were mentioned on stackexchange)

Regular expressions are worth learning for their own sake, so don't think of it as wasted time.

Lookahead and negative lookahead (what you're calling readahead) constrain the match with a required or forbidden suffix without that suffix being part of the match; lookbehind does the same for a prefix. You'd have to look up the syntax for it.

There are online tools for testing regular expressions, like http://regexpal.com/. Might be easier than running a script. Note that sed matches line by line, like my for loop in python, whereas some tools will match on the whole file.
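As a small illustration of lookaround in Python's re module, the sketch below inserts a newline at every position that has a ] just before it or a [ just after it, without consuming any characters. The names are invented for illustration, and the text is assumed to use plain \n line endings (which is what Python gives you when a file is read in text mode).

```python
import re

# Position right after a ] that is not already at the end of a line.
AFTER_CLOSE = re.compile(r"(?<=\])(?=[^\n])")
# Position right before a [ that is not already at the start of a line.
BEFORE_OPEN = re.compile(r"(?<=[^\n])(?=\[)")


def split_tokens(text):
    """Insert newlines around tokens; nothing is removed from the text."""
    text = AFTER_CLOSE.sub("\n", text)
    return BEFORE_OPEN.sub("\n", text)


print(split_tokens("###test###[A:1]###test###\n[B][C]x"))
```

Because both patterns are zero-width, comments and tokens are preserved exactly; only line breaks are added.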
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 12:01:32 pm
well I may have something here

thanks to crowdsourcing & 0xdeadbeef

http://stackoverflow.com/questions/25350991/hold-variable-in-regular-expression-using-sed

sed -e "s/\]\(.\)/]\n\1/g;s/ *\[/[/g;s/\(.\)\[/\1\n[/g" *.txt

It does what I need for comments, but there is one additional newline (on Windows; the person on Stack Overflow is on Linux, and I don't think it produced the extra newline for them).

sample output based on input:

Code: [Select]
item_gloves

[OBJECT:ITEM]

###test###
[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]
###test###
[NAME:gauntlet:gauntlets]
###test###
[ARMORLEVEL:2]

[UPSTEP:1]
###test###
[SHAPED]
[LAYER:ARMOR]
###test######test###
[COVERAGE:100]
###TEST
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 12:33:06 pm
Here is the sed answer:

updated at 12:02PM PST
Code: [Select]
sed -e "s/\t//g" -e "s/(?m)^\s*//g" -e "s/\]\(.\)/]\r\1/g;s/ *\[/[/g;s/\(.\)\[/\1\r[/g" %%~na.txt > %%~na.out
requires gnuwin32 sed

http://sourceforge.net/projects/gnuwin32/files//sed/4.2.1/sed-4.2.1-setup.exe/download

Code: (ParseRaws.bat) [Select]
echo off
for /f %%a in ('dir /b *.txt') do sed -e "s/\t//g" -e "s/(?m)^\s*//g" -e "s/\]\(.\)/]\r\1/g;s/ *\[/[/g;s/\(.\)\[/\1\r[/g" %%~na.txt > %%~na.out
erase *.txt
ren *.out *.txt
REM remove all blanklines
REM -e  "s/^ *//; s/ *$//; /^$/d; s/\r//; /^\s*$/d"
echo on

doesn't parse subfolders due to the way it grabs a list of files...

before input:
Code: [Select]
item_gloves

[OBJECT:ITEM]

###test###[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]###test###
[NAME:gauntlet:gauntlets]###test###
[ARMORLEVEL:2][UPSTEP:1]###test###
[SHAPED]
[LAYER:ARMOR]###test######test###
[COVERAGE:100]
###TEST
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]

after output:
Code: [Select]
item_gloves

[OBJECT:ITEM]

[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]
###test###
[NAME:gauntlet:gauntlets]
###test###
[ARMORLEVEL:2]
[UPSTEP:1]
###test###
[SHAPED]
[LAYER:ARMOR]
###test######test###
[COVERAGE:100]
###TEST
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 17, 2014, 02:28:00 pm
Wow, I step away for an evening and come back to four pages of unread posts.  Impressive work so far, it looks like flattening has been solved.  We don't need an omnipotent solution that's going to mix Masterwork, OldGenesis and combat balancing from 23a into a single error-free 40.08 game... but a little simple merging would go a long way to making mini-mods play nice with each other.

I thought of something while I didn't have access to my machine.  There are two reasons why N-way merged files break.

Case 1: Mod A adds or deletes lines before mod B tries to change something.  Mod B's changes end up in the wrong place, which may or may not cause issues.

Case 2: Mod A and mod B try to change the same line.

Of course, both could happen in the same merge.  In this case mod B's changes happen at the wrong line and we could potentially end up with a duplicated raws problem.

I have a gut feeling that someone has already tackled case 1 in a standard library somewhere.  My initial guess of a solution would be to do everything in two passes.  Clone the original file to an intermediate file, tag the intermediate file with where each change is supposed to occur, then apply those changes wherever the tags ended up, stripping off the tags as we go.  If two mods try to change the same line, last in wins (and this can be logged for the player).

So, a snippet of the tagged intermediate file might look like

Code: [Select]
{RampageMod:13}[LARGE_ROAMING]
[BIOME:ANY_TROPICAL_FOREST]
{Zootastic:2}[BIOME:SHRUBLAND_TROPICAL]
{Zootastic:3}{RampageMod:14}[POPULATION_NUMBER:15:30]
{Zootastic:4}{RampageMod:15}[CLUSTER_NUMBER:3:7]
{RampageMod:16}[BENIGN]
[MEANDERER]
[NATURAL]
{Zootastic:5}[PREFSTRING:strength]

This should ensure that changes land where the modder thinks they will based on a vanilla diff.  Won't ensure that the changes are compatible, but at least they won't be off in some other object.

Does that make sense to anyone other than me?
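One way the tagging pass might be sketched in Python is with difflib from the standard library. This is only a guess at the idea, with invented names: each mod is diffed against vanilla, and every vanilla line the mod touches gets a {ModName:n} tag. A pure insertion is anchored to the vanilla line that follows it, and an insertion at the very end of the file gets no tag in this sketch.

```python
import difflib


def tag_lines(vanilla, mods):
    """Prefix each vanilla line with a tag for every mod change that hits it.

    vanilla is a list of lines; mods is a list of (name, lines) pairs.
    """
    tags = [[] for _ in vanilla]
    for name, lines in mods:
        counter = 0
        matcher = difflib.SequenceMatcher(None, vanilla, lines, autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op == "equal":
                continue
            if i1 == i2:
                # Pure insertion: anchor it to the vanilla line that follows.
                targets = range(i1, min(i1 + 1, len(vanilla)))
            else:
                targets = range(i1, i2)
            for index in targets:
                counter += 1
                tags[index].append("{%s:%d}" % (name, counter))
    return ["".join(t) + line for t, line in zip(tags, vanilla)]


vanilla = ["[BENIGN]", "[MEANDERER]", "[NATURAL]"]
mods = [("ModX", ["[BENIGN]", "[LARGE_ROAMING]", "[NATURAL]"]),
        ("ModY", ["[BENIGN]", "[MEANDERER]", "[NATURAL]", "[PREFSTRING:strength]"])]
print("\n".join(tag_lines(vanilla, mods)))
```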
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 02:49:32 pm
I'm working on this issue right now.

I talked earlier about how, if we merge patch files before applying them (for n-way merges), we can address the conflicts at the patch level.

there's a Linux tool called combinediff
http://cyberelk.net/tim/patchutils/man/rn01re02.html

I'm wondering if there's a way to do this with github...

anyways, my idea was:

34_11 to accelerated = diff1
34_11 to 40_08 = diff2

combinediff diff1 and diff2 to diff3

apply diff3 to 34_11
have 40_08 accelerated

why/who would do this?

Someone who wants a 40_08 accelerated to derive a patch based on 40_08.  The end user wouldn't need the 34_11 raws btw.  This would be done from the patch distributor's end
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 03:11:35 pm
Case 1 is a special case of case 2.

The plan to handle case 2 is to run a 3-way diff and reject the file if there are any overlaps. In my pseudocode I assumed that it's sufficient to compare the mod being added to the generated raws, with vanilla as a baseline. So there's no intermediate file except the output, but you do keep a copy of vanilla to compare with. A narrow test for overlapping would be to make sure the generated raws and the raws of a new mod being added don't have consecutive changes, that is, they don't both modify vanilla at the same place.

It occurs to me that a raw-aware check for duplicate raws may be unavoidable for correctness. Otherwise you could have mods that try to add the same thing in different files, and no differ is going to realize it, especially if the additions are not identical. Of course, one option is to ignore this case.
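The narrow overlap test could be sketched with difflib roughly as follows. The function names are invented, and this only detects conflicts; it does not merge anything. Each side is diffed against vanilla, and the two sides conflict if any pair of changed vanilla ranges overlaps or touches, so consecutive changes count as a conflict.

```python
import difflib


def changed_ranges(base, other):
    """Return the (start, end) ranges of base lines that other changes."""
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2) for op, i1, i2, j1, j2 in matcher.get_opcodes()
            if op != "equal"]


def ranges_conflict(a, b):
    """True if two half-open ranges overlap or touch (consecutive changes)."""
    return a[0] <= b[1] and b[0] <= a[1]


def mods_conflict(vanilla, generated, new_mod):
    """True if both sides modify vanilla at the same or adjacent places."""
    ours = changed_ranges(vanilla, generated)
    theirs = changed_ranges(vanilla, new_mod)
    return any(ranges_conflict(x, y) for x in ours for y in theirs)


vanilla = ["[A]", "[B]", "[C]", "[D]", "[E]"]
generated = ["[A]", "[B2]", "[C]", "[D]", "[E]"]
print(mods_conflict(vanilla, generated, ["[A]", "[B]", "[C]", "[D]", "[E2]"]))
print(mods_conflict(vanilla, generated, ["[A]", "[B3]", "[C]", "[D]", "[E]"]))
```

A merge tool would refuse the file whenever mods_conflict returns True and otherwise apply both sets of changes.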
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 05:35:55 pm
I do not think combinediff is exactly what I thought it was. I think instead it just merges two sequential patches into one patch.

So no silver bullet on that.

So... no "à la carte merge of mods, if those mods have conflicts" and being able to add on any combination of things you want.

Instead, I would recommend the mod author or tool author, take a poll and ask the community what base of mods should be offered to the community, because most likely, it will be a set # of options & combinations.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 17, 2014, 06:16:40 pm
I do not think combinediff is exactly what I thought it was I think instead it just merges two sequential merges into one merge.

So no silver bullet on that.

So... no "ala cart merge of mods, if those mods have conflicts" and being able to add on any combination of things you want.

Instead, I would recommend the mod author or tool author, take a poll and ask the community what base of mods should be offered to the community, because most likely, it will be a set # of options & combinations.
Well, an à la carte menu that resolves serious conflicts is outside the scope of this project.  It should be sufficient to use a method that handles having the insertion point move unexpectedly (which is what I tried to describe above).  In that example, both mods tried to adjust the elephant's POPULATION_NUMBER and CLUSTER_NUMBER.  The tool would note the multiple changes for the user, with last-in-winning if the merge is allowed to proceed (in this case, Zootastic values would end up in the generated raws).  Note that this is pulling back from my earlier insistence on regular expressions.  We get at least 80% of the usefulness here without turning modders into regex-warriors.

There are a couple other mod manager projects rattling around the forums.  So long as they all share at least one format (which appears to be some kind of diff-from-vanilla), and at least one can grind mod raws into that format, it will all work out in the end.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 06:29:53 pm
keeping track of insertion points and a cumulative score of line numbers that are removed/added while patching might alleviate the problem with multiple patches.

I was hoping to address this very issue with my mod merge tool thread.  Patches by themselves do not report whether an object's tokens have changed, merely which lines have.  I think keeping track of some of the contextual info in the <patch> files would allow one to keep track of tokens being removed/added for a particular object, and maybe even find out whether the area being modified is in conflict because another applied patch added to or removed from the same place.

So I flattened some of the raws and put them up on github, if anyone wants to work with them.

https://github.com/thistleknot/BasedOnVanillaRaws/blob/40-07-phoebus-flattened/raw/objects/plant_standard.txt

It was recommended I check out octopus merges, and rebasing.

However, this looks promising, "An interdiff is a text file in patch format that describes the changes between two versions of a patch" 

https://www.drupal.org/documentation/git/interdiff

If it does what it sounds like, it would help me see the difference between two diff files that are to be applied?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 17, 2014, 06:52:19 pm
The approach I recommend is looking at Python's difflib to build a 3-way merge, then examining each of the merge points for conflicts. Maybe I'll look at this myself sometime.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 08:04:36 pm
Wow, that's a lot of posts.  Experiments with regexp to flatten files, talk about diff mechanisms... and I haven't even finished a merge-by-overwrite yet!

I'm unlikely to do much for the next few days, busy updating the Starter Pack. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 17, 2014, 08:32:21 pm
I thought of something while I didn't have access to my machine.  There are two reasons why N-way merged files break.

Case 1: Mod A adds or deletes lines before mod B tries to change something.  Mod B's changes end up in the wrong place, which may or may not cause issues.

Case 2: Mod A and mod B try to change the same line.

....

This should ensure that changes land where the modder thinks they will based on a vanilla diff.  Won't ensure that the changes are compatible, but at least they won't be off in some other object.

Does that make sense to anyone other than me?

... Which is exactly why I'd go the route of Valdemar's stuff.  First flatten everything, then read/index all the objects/entities into structures, then do the diff patch generation on an object by object basis.  It's basically the same thing but with the added bonus of the program being able to say explicitly what objects were deleted, what's been added and so on.  You have two levels of differences - the objects present difference, and the difference between equatable objects themselves.  The process to resolve conflicts becomes much more mechanical then, rather than holding hands up and saying, "well, it doesn't work and we don't know why".

In case 1 you'd know that the added lines of B would at least be in the correct object, and if the program was smart it could drop those lines in right after the correct preceding line.

In case 2 you'd know in what object the conflict arose and could prompt the user with relevant information.

This way would make this basic stuff easier, and open the door for more advanced stuff like making connections throughout the raws to entirely remove problem objects.  The two pass method Dirst suggests is essentially the same thing but done in a roundabout way.

The diff patch stuff is a good start and totally necessary, but IMO it'd be pointless to do this without going the route of interrogating the raws in a raw-language manner.  PE said he was averse to an "advanced mod loader", which I understand... But I feel this way is the minimum, not an advanced feature.
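To make the object-by-object idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function names, and the notion that the caller supplies the set of token names that start a new object (ITEM_GLOVES in the sample). It indexes each raw file into a dictionary of objects, then reports which objects were added, removed, or changed.

```python
import re

TOKEN = re.compile(r"\[([^\[\]]*)\]")


def index_objects(text, headers):
    """Map (type, id) to the list of token bodies belonging to that object."""
    objects = {}
    current = None
    for body in TOKEN.findall(text):
        parts = body.split(":")
        if parts[0] in headers and len(parts) > 1:
            current = (parts[0], parts[1])  # a header token starts a new object
            objects[current] = []
        elif current is not None:
            objects[current].append(body)
    return objects


def compare_objects(base, mod):
    """Return sorted lists of added, removed and changed object keys."""
    added = sorted(set(mod) - set(base))
    removed = sorted(set(base) - set(mod))
    changed = sorted(key for key in set(base) & set(mod)
                     if base[key] != mod[key])
    return added, removed, changed


vanilla = "[OBJECT:ITEM]\n[ITEM_GLOVES:GAUNTLETS]\n[ARMORLEVEL:2]\n"
modded = "[OBJECT:ITEM]\n[ITEM_GLOVES:GAUNTLETS]\n[ARMORLEVEL:3]\n"
headers = {"ITEM_GLOVES"}
print(compare_objects(index_objects(vanilla, headers),
                      index_objects(modded, headers)))
```

Comments are ignored entirely here, so this works on flattened and unflattened files alike; a real tool would still need the per-object diff for the changed entries.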
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 17, 2014, 09:05:34 pm
The two pass method Dirst suggests is essentially the same thing but done in a roundabout way.

I had a feeling that about five minutes thought by a dabbling coder wasn't going to overtake an entire open source community.  Glad that someone has already optimized that idea.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 09:28:37 pm
I'd go the route of Valdemar's stuff.  First flatten everything, then read/index all the objects/entities into structures, then do the diff patch generation on an object by object basis.  It's basically the same thing but with the added bonus of the program being able to say explicitly what objects were deleted, what's been added and so on.  You have two levels of differences - the objects present difference, and the difference between equatable objects themselves.  The process to resolve conflicts becomes much more mechanical then, rather than holding hands up and saying, "well, it doesn't work and we don't know why".
<snip>
The diff patch stuff is a good start and totally necessary, but IMO it'd be pointless to do this without going the route of interrogating the raws in a raw-language manner.  PE said he was averse to an "advanced mod loader", which I understand... But I feel this way is the minimum, not an advanced feature.

I should clarify my position a little:
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 17, 2014, 09:50:26 pm
Merge logic should not present the user with choices beyond the list of mods and load order - if the logic cannot produce a known correct result, the merge should be refused.  This protects users from broken raws; put another way it's the difference between a mod loader which does some merging and a mod merge tool.  Details on what happened could be written to a log file, but probably shouldn't be shown to all users - remember that the target audience is people who are still new to DF and would otherwise avoid mods.

But that's the point!  If any mod removes something from vanilla raws, any other mod, whether applied before or after, could clash with that removal and with a blind diff patch there is no way to guarantee no clash... so you have to refuse the merge.  The diff patch should spot where a mod wanted to edit that removed entry, but it would never detect where that removed entry was referenced somewhere else by another mod.

I went through the exact same thought processes before making my mod loader, and that's why I just went with file overwrite detection.  You either do that, or go all the way and build the data structures.  Half way just lands you in all kinds of problems.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 17, 2014, 09:53:35 pm
Quote
could clash with that removal

not if you re-extract base raws with an overwrite, and that's always your input.

Then you work with a base patch system.

I thought about this as well (what if a user modifies his raws?)

One could allow a user to export their current raw set if they so chose, but anytime a mod is applied, it's rewritten over by a base zip
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 17, 2014, 11:36:38 pm
No, it could still be a problem.  Imagine vanilla has raws for A, B, and C.  Mod1 modifies A and removes C.  Mod2 doesn't modify A or C - if it did we would refuse the merge - but instead modifies B, which has C as a dependency.  The raws are now broken, unless we have some *really* impressive parsing to catch this kind of thing. 
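
To make the A/B/C case concrete, here is a rough Python sketch of the kind of check that would catch it.  It is only a heuristic and everything in it is an assumption for illustration: it treats any token argument that matches a removed object ID as a reference, which real raws may not always honour, and the caller has to supply the token names that define objects.

```python
import re

TOKEN = re.compile(r"\[([^\[\]]*)\]")

def defined_ids(raw_text, start_tokens):
    """Collect the IDs of all objects defined by tokens like [CREATURE:DOG]."""
    ids = set()
    for match in TOKEN.finditer(raw_text):
        parts = match.group(1).split(":")
        if parts[0] in start_tokens and len(parts) > 1:
            ids.add(parts[1])
    return ids

def dangling_references(vanilla_text, merged_text, start_tokens):
    """IDs defined in vanilla, no longer defined after merging, but still mentioned."""
    removed = defined_ids(vanilla_text, start_tokens) - defined_ids(merged_text, start_tokens)
    dangling = set()
    for match in TOKEN.finditer(merged_text):
        parts = match.group(1).split(":")
        # any argument naming a removed object counts as a (possible) reference
        dangling.update(removed.intersection(parts[1:]))
    return sorted(dangling)
```

A non-empty result would be a reason to refuse the merge; an empty one proves nothing, since references can be less direct than this.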

I don't know how common this might be, but for minor mods it doesn't seem too likely.  I think the best way to deal with this in the short to medium term is just to live with it; the risk is reduced because major mods, which are the ones more likely to do this, are generally incompatible with each other anyway. 

The format also assumes that any absent file is not deleted but rather identical to vanilla, which might help in this case but should probably be run past some modders to see if it would break things.  Maybe it just needs to be the full raw folder instead of files changed from vanilla...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 18, 2014, 12:09:54 am
No, it could still be a problem.  Imagine vanilla has raws for A, B, and C.  Mod1 modifies A and removes C.  Mod2 doesn't modify A or C - if it did we would refuse the merge - but instead modifies B, which has C as a dependency.  The raws are now broken, unless we have some *really* impressive parsing to catch this kind of thing. 

I don't know how common this might be, but for minor mods it doesn't seem too likely.  I think the best way to deal with this in the short to medium term is just to live with it and reduce the chances by way of major mods, which are more likely to do this, generally being incompatible.

Agreed.  I was being rather overly negative, the blind patch diff will work great on merging mods that don't remove anything, and will catch more problems than simply checking for file overwrites.

Quote
The format also assumes that any absent file is not deleted but rather identical to vanilla, which might help in this case but should probably be run past some modders to see if it would break things.  Maybe it just needs to be the full raw folder instead of files changed from vanilla...

That shouldn't be such a big problem as the modder/an overseer could just put a blank file in instead..?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 18, 2014, 12:35:58 am
OK, here's where I leave this for a while:  input format, folder structure, and a code skeleton that works... for everything except the actual merge bit.  You may want to expand the information passed through the functions to the merge logic, I kept it minimal so it doesn't seem to even hold folder variables.

Download link. (http://dffd.wimbli.com/file.php?id=9428)


Code: (just the mod merging functions)
import os
import shutil

# mod_folders_list, mods_folder and vanilla_raw_folder are not defined here;
# they are expected to exist at module level in the full script.

def make_raws_with_mods():
    print('What mods do you want to load?')
    for index, mod in enumerate(mod_folders_list):
        print('   ', index, mod)
    mods_to_load = input('Enter the indices of the mods to load, in order, separated by spaces:\n    ')
    # split() with no argument tolerates repeated or trailing spaces
    mod_load_order = [mod_folders_list[int(index)] for index in mods_to_load.split()]
    mixed_raws_folder = mods_folder + 'temp/raw/'
    # remove an old folder if exists.  Name reserved for this reason!
    if os.path.exists(mixed_raws_folder):
        if os.path.isfile(mixed_raws_folder):
            os.remove(mixed_raws_folder)
        else:
            shutil.rmtree(mixed_raws_folder)
    # create folder of vanilla raws to operate on
    shutil.copytree(vanilla_raw_folder, mixed_raws_folder)
    print('\nFolder for merging created - "'+mixed_raws_folder+'" - with vanilla raws.')
    # start merging mods!
    merge_next_mod('', mod_load_order, -1)
    # get back after looping through
    print('\nMod merging complete!  The merged mods can be found in the mod folder as "temp".')

def merge_next_mod(next_mod, mod_load_order, mods_merged):
    # merge the mod handed in (nothing on the first call), then recurse on the rest
    if next_mod != '':
        mod_merge_logic(next_mod, mods_merged)
    if mod_load_order:
        next_mod = mod_load_order.pop(0)
        mods_merged += 1
        merge_next_mod(next_mod, mod_load_order, mods_merged)

def mod_merge_logic(mod, mods_merged):
    print('\n(placeholder merge logic for', mod + '; no action taken;', mods_merged, 'mods merged already)')
    # see eg vanilla file removal function to do per-file comparisons

Spoiler: all code (click to show/hide)

The format also assumes that any absent file is not deleted but rather identical to vanilla, which might help in this case but should probably be run past some modders to see if it would break things.  Maybe it just needs to be the full raw folder instead of files changed from vanilla...

That shouldn't be such a big problem as the modder/an overseer could just put a blank file in instead..?

That is the obvious workaround, and if we do it in the code modders don't have to comply with the standard at all.  Just reverse the current comparison tool to compare "for each file in vanilla, if no such file in mod, create blank file by that name".  Identical files are still removed which saves a lot of space for micro mods, and the diff from vanilla to an empty file is tiny so no worries there either.  Included above.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 06:48:03 am

The format also assumes that any absent file is not deleted but rather identical to vanilla, which might help in this case but should probably be run past some modders to see if it would break things.  Maybe it just needs to be the full raw folder instead of files changed from vanilla...

That shouldn't be such a big problem as the modder/an overseer could just put a blank file in instead..?

That is the obvious workaround, and if we do it in the code modders don't have to comply with the standard at all.  Just reverse the current comparison tool to compare "for each file in vanilla, if no such file in mod, create blank file by that name".  Identical files are still removed which saves a lot of space for micro mods, and the diff from vanilla to an empty file is tiny so no worries there either.  Included above.
You could do this as a setting, but for minor mods, I think they're likely to just be the modified file in the proper raw folder and nothing else. So the default should be no file implies no changes.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 07:08:51 am
No, it could still be a problem.  Imagine vanilla has raws for A, B, and C.  Mod1 modifies A and removes C.  Mod2 doesn't modify A or C - if it did we would refuse the merge - but instead modifies B, which has C as a dependency.  The raws are now broken, unless we have some *really* impressive parsing to catch this kind of thing. 

I don't know how common this might be, but for minor mods it doesn't seem too likely.  I think the best way to deal with this in the short to medium term is just to live with it and reduce the chances by way of major mods, which are more likely to do this, generally being incompatible. 

The format also assumes that any absent file is not deleted but rather identical to vanilla, which might help in this case but should probably be run past some modders to see if it would break things.  Maybe it just needs to be the full raw folder instead of files changed from vanilla...

so you guys are going to try "a la carte adding".  I thought you would have static 1 time patches only.  Not multiple.  If you're doing multiple, you're going to need some sort of object tracking as stated beforehand.  However... that could be something done at a later time.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 18, 2014, 09:08:44 am
For clarity:

 - in the format, a missing file means "this file is identical to vanilla". Which means that minor mods can consist of a single file in the proper folder.

 - some mods actually do delete vanilla files though - so to cover that use case, files that are present in vanilla but need to be deleted in the mod are represented by an empty text file (equivalent to no file to DF).

 -  the code I've posted above will correctly convert full mod installs in the folder structure given in the download into our specified format, including blank placeholder file creation.
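
For anyone who doesn't want to open the download, the conversion described in the three points above could look roughly like this in Python.  This is a sketch of the rule, not the posted code: the function name and arguments are invented, and it assumes it is run exactly once on a full mod install (running it on an already-converted mod would turn "missing = same as vanilla" into "blank = deleted").

```python
import filecmp
import os

def convert_to_format(vanilla_raw, mod_raw):
    """Reduce a full mod install, in place, to only the files changed from vanilla."""
    for folder, _dirs, files in os.walk(vanilla_raw):
        rel = os.path.relpath(folder, vanilla_raw)
        mod_folder = os.path.normpath(os.path.join(mod_raw, rel))
        for name in files:
            vanilla_file = os.path.join(folder, name)
            mod_file = os.path.join(mod_folder, name)
            if not os.path.exists(mod_file):
                # the mod deleted a vanilla file: record that as a blank placeholder
                os.makedirs(mod_folder, exist_ok=True)
                open(mod_file, "w").close()
            elif filecmp.cmp(vanilla_file, mod_file, shallow=False):
                # identical to vanilla: drop it, absence means "unchanged"
                os.remove(mod_file)
```

Files that exist only in the mod are never visited by the walk over vanilla, so new files are left alone.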

It also includes a first stab at the structure of a mod merge tool, albeit one utterly without logic (it prints the name of each mod and nothing else). I'm hoping some of you can code something in the function I defined for merge logic.

I need to read up again on how passing arguments to functions works, in order to keep all the useful variables around. $var not defined errors I didn't have time to deal with put an end to my merge logic creation for the day, so I just posted the code I had.

A la carte mod merging is obviously risky in some cases, but in others - like merging minimal minor mods - it would be an awesome trick. Starting simple and then expanding, with our easy to make and use format as a base, also leaves us the potential to keep improving the logic. At this point we've seen enough interest that I think we might see a snowball of usage and improvements, which would be awesome. Loading just one mod without graphics isn't a big enough trick to really count as moving forward though, and that's all I've got so far. Soon...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 18, 2014, 12:30:16 pm
... Which is exactly why I'd go the route of Valdemar's stuff.  First flatten everything, then read/index all the objects/entities into structures, then do the diff patch generation on an object by object basis.  It's basically the same thing but with the added bonus of the program being able to say explicitly what objects were deleted, what's been added and so on.  You have two levels of differences - the objects present difference, and the difference between equatable objects themselves.  The process to resolve conflicts becomes much more mechanical then, rather than holding hands up and saying, "well, it doesn't work and we don't know why".

In case 1 you'd know that the added lines of B would at least be in the correct object, and if the program was smart it could drop those lines in right after the correct preceding line.

In case 2 you'd know in what object the conflict arose and could prompt the user with relevant information.

This way would make this basic stuff easier, and open the door for more advanced stuff like making connections throughout the raws to entirely remove problem objects.  The two pass method Dirst suggests is essentially the same thing but done in a roundabout way.

The diff patch stuff is a good start and totally necessary, but IMO it'd be pointless to do this without going the route of interrogating the raws in a raw-language manner.  PE said he was averse to an "advanced mod loader", which I understand... But I feel this way is the minimum, not an advanced feature.

I'm with Hermes.

IMO a tool that can only load a single minor mod is trivial, because most minor mods are set up to just overwrite the vanilla folder. If you want a mod loader that can load one mod at a time, it doesn't need all this logic - just a vanilla raw directory, a bunch of minor mod raw directories, and the ability to select which minor mod you want to load on top of the vanilla raw directory in the application raw folder. To load one mod, there's no need to diff. So I assume you want something that can load more than one set of raws... but merging multiple mods through diff would be a nightmare. The dupes alone... Not to mention the really problematic part of merging, creature variations.

For example, the vampires-drink-booze-fix I have uses creature variations to replace humanoids' blood with BLOOD2 or ICHOR2, which is default blood with a syndrome. Diff wouldn't find any problems loading the boozefix alongside another mod that did something to blood, but trying to load a creature with modifications to BLOOD after BLOOD has already been replaced with BLOOD2 (and so no longer exists in the creature definition)? There's no way to find those kinds of conflicts without loading raws.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 18, 2014, 12:45:44 pm
Thistleknot is working on a more robust merge tool (http://www.bay12forums.com/smf/index.php?topic=142188), and there's also Rubble (http://www.bay12forums.com/smf/index.php?topic=140853.0).

PeridexisErrant is aiming for a simple way to sample mods, and for a pre-1.0 it's perfectly fine for it to be one mod at a time.  I believe that a useable tool should be able to handle some simple merging, like putting two mods' worth of PERMITTED_REACTIONs into the same entity_default.txt file, or modifying one creature when a creature before it in the same file had a line added/removed.  In this regime, two mods are in conflict only in the narrow sense that they attempt to modify the same vanilla tag.  That could be an insta-fail or a last-in-wins depending on settings, especially since load order is user-definable.

A manifest file (which is planned) can list dependencies, and it could also list known semantic conflicts like the one you mentioned.  Such a manifest entry could specify whether the mod is completely incompatible, or there is a required load order.  That presents comprehensible guidance to the player while minimizing the complexity (i.e., no recursive searches through the raws to locate unused tissues).
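
As a rough illustration of how little logic that needs, here is a Python sketch.  The manifest layout (a dict per mod with "incompatible" and "load_after" lists) is purely hypothetical since no format has been agreed, and a mod with no manifest is simply not checked.

```python
def check_load_order(load_order, manifests):
    """Return human-readable problems for a chosen load order.

    load_order: list of mod names, first loaded first.
    manifests:  dict of mod name -> manifest dict (hypothetical format);
                mods without an entry are treated as having no known issues.
    """
    problems = []
    position = {mod: i for i, mod in enumerate(load_order)}
    for mod in load_order:
        manifest = manifests.get(mod, {})
        for other in manifest.get("incompatible", []):
            if other in position:
                problems.append(f"{mod} is incompatible with {other}")
        for other in manifest.get("load_after", []):
            # only a problem if the other mod is selected and comes later
            if other in position and position[other] > position[mod]:
                problems.append(f"{mod} must be loaded after {other}")
    return problems
```

The tool only compares names and positions, so there is no searching through the raws at all.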
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 18, 2014, 01:19:57 pm
PeridexisErrant is aiming for a simple way to sample mods, and for a pre-1.0 it's perfectly fine for it to be one mod at a time.  I believe that a useable tool should be able to handle some simple merging, like putting two mods' worth of PERMITTED_REACTIONs into the same entity_default.txt file, or modifying one creature when a creature before it in the same file had a line added/removed.  In this regime, two mods are in conflict only in the narrow sense that they attempt to modify the same vanilla tag.  That could be an insta-fail or a last-in-wins depending on settings, especially since load order is user-definable.

But the idea is to make it newb-friendly, which means that it needs to be able to identify potential problems and handle them gracefully:

if the logic cannot produce a known correct result, the merge should be refused.  This protects users from broken raws; put another way it's the difference between a mod loader which does some merging and a mod merge tool.

False negatives are OK from this perspective - just limiting what mods you can throw together - but false positives are a huge potential user experience problem, especially for a newbie. My position is that there are just too many conflicts that are invisible to a file-by-file diff to rule out false positives unless you first manually screen out any mods which are potentially problematic.

Quote
A manifest file (which is planned) can list dependencies, and it could also list known semantic conflicts like the one you mentioned.  Such a manifest entry could specify whether the mod is completely incompatible, or there is a required load order.

It's planned? Haven't seen it on this thread. As far as manifest files, PE said just the other day that:

I am strongly opposed to anything which would require a more complicated input than folders of changed raws, as outlined in OP.  This is because I feel having a simple, usable, standard format is very important for adoption from both users and modders.  If it becomes common to have a readme.txt and configuration.xml - to hold info such as author, name, homepage, base DF version, etc in the latter case - it would be good to use that information but their absence must be handled gracefully.

Which would seem to imply no manifest file required. If planning is going on/that requirement has been reversed elsewhere, I'll shut up and let you work, since I don't want to butt in when I don't have all the information.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 01:37:24 pm
Updated:

Code:
sed -r "s/\t//g;s/([^]].+)\[/\1\n[/g;s/\]([^[].+)$/]\n\1/g" testfile.txt
with batch

Still needs a little work:
broken tokens have odd behaviour

Code:
echo off
for /f %%a in ('dir /b *.txt') do sed -r "s/\t//g;s/([^]].+)\[/\1\n[/g;s/\]([^[].+)$/]\n\1/g" %%~na.txt > %%~na.out
REM erase *.txt
REM ren *.out *.txt
REM remove all blanklines
for /f %%a in ('dir /b *.out') do sed -e "s/^ *//; s/ *$//; /^$/d; s/\r//; /^\s*$/d" %%~na.out > %%~na.out2
echo on

input:

Code:
item_gloves

[OBJECT:ITEM]

###test###
[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]###test###
[NAME:gauntlet:gauntlets]
###test###[ARMORLEVEL:2]
[UPSTEP:1]
###test###[SHAPED]
[LAYER:ARMOR]###test######test###
[COVERAGE:100]
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]


output:
Code:
item_gloves
[OBJECT:ITEM]
###test###
[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]
###test###
[NAME:gauntlet:gauntlets]
###test###
[ARMORLEVEL:2]
[UPSTEP:1]
###test###
[SHAPED]
[LAYER:ARMOR]
###test######test###
[COVERAGE:100]
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 18, 2014, 01:47:48 pm
Quote
A manifest file (which is planned) can list dependencies, and it could also list known semantic conflicts like the one you mentioned.  Such a manifest entry could specify whether the mod is completely incompatible, or there is a required load order.

It's planned? Haven't seen it on this thread. As far as manifest files, PE said just the other day that:

I am strongly opposed to anything which would require a more complicated input than folders of changed raws, as outlined in OP.  This is because I feel having a simple, usable, standard format is very important for adoption from both users and modders.  If it becomes common to have a readme.txt and configuration.xml - to hold info such as author, name, homepage, base DF version, etc in the latter case - it would be good to use that information but their absence must be handled gracefully.

Which would seem to imply no manifest file required. If planning is going on/that requirement has been reversed elsewhere, I'll shut up and let you work, since I don't want to butt in when I don't have all the information.

A manifest is the third point on the OP, though I did overstate it a bit by implying it was a definite feature when it's just being looked at.

I do differ with PE a bit because I think asking the modder to document known incompatibilities isn't much of a burden.  If we get around to a stretch goal of using that manifest file to check for mod updates, it will ripple out to the users with minimal effort.  The thorny issue is what to do if the tool discovers an active game has a newly discovered conflict.

Warning:
You are using RampageMod v1.08 and Zootastic v2.0, and an incompatibility between these mods has been identified.

Severity:
Minor

The problem:
Both of these mods affect several mundane animals in different ways.  The combination of mods does not cause an error, but it does affect the game balance that each mod is attempting to achieve.

Recommended actions:
It is recommended that you take one of the following actions.
(1) Ensure that Zootastic is loaded before RampageMod.
(2) Update to RampageMod v1.09.
Unfortunately, none of the above changes can be applied to an active game.  We apologize for the inconvenience.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 06:45:59 pm
finally, here it is.

I found an easier way to extract tokens, went with it.  I didn't have to solve a lot of the other work (you can see it in the REM lines)

Code:
echo off
REM s/\][^]]*/&\n/g doesn't work right
REM to remove [[ and ]] s/\][^]]*/&\n/g; s/\[[^[]*/&\n/g
REM to split up ][
REM for /f %%a in ('dir /b *.txt') do sed -e "s/\]\[/\]\n\[/g" %%~na.txt > %%~na.out
REM put tokens on their own line
for /f %%a in ('dir /b *.txt') do sed -e "s/\[[^][]*\]/\n&\n/g" %%~na.txt > %%~na.out
REM remove tabs
for /f %%a in ('dir /b *.out') do sed -r "s/\t//g" %%~na.out > %%~na.out2
REM for /f %%a in ('dir /b *.out') do sed -r "s/\t//g;s/([^]].+)\[/\1\n[/g;s/\]([^[].+)$/]\n\1/g" %%~na.out > %%~na.out2
REM remove all blanklines
for /f %%a in ('dir /b *.out2') do sed -e "s/^ *//; s/ *$//; /^$/d; s/\r//; /^\s*$/d" %%~na.out2 > %%~na.out3
REM cleanup
REM erase *.txt
REM ren *.out3 *.txt
REM erase *.out
echo on

basically what it does is no matter how badly the tokens are broken, it puts them on their own line as a [token].

The rest of the broken comments are left alone on their own lines.
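
For anyone without sed on their machine, the same three passes (tokens onto their own lines, tabs removed, blank lines dropped) can be sketched in Python.  The function name is made up, and this works on a string rather than looping over *.txt files like the batch script does.

```python
import re

def flatten_raws(text):
    """Put every [TOKEN] on its own line; leave other text on its own lines."""
    # remove tabs, as the second sed pass does
    text = text.replace("\t", "")
    # wrap each complete [token] in newlines (same pattern as the sed call)
    text = re.sub(r"\[[^\[\]]*\]", r"\n\g<0>\n", text)
    # trim each line (this also drops stray \r) and discard empty ones
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

Doing it all in memory also avoids the .out / .out2 / .out3 intermediate files.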

input
Spoiler (click to show/hide)

output
Spoiler (click to show/hide)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 18, 2014, 07:36:18 pm
Very nice!  There are a couple of cosmetic issues though; the output contains the input twice over (I suspect a bug), and a final pass to remove any line consisting entirely of square brackets would be nice.  Once those are dealt with, I think that looks more human-readable as well as better for diffs. 

The problem with relying on a manifest is that it massively cuts down on your potential input.  At the moment, that's every mod out there - if it can work installed on vanilla DF (without graphics or dfhack), that install can be fed into the tool / converted to our format / stupidly merged.  If we require a manifest, that shrinks the pool to those mods made with this tool in mind or manually updated for compatibility by someone, which is to say no mods at all.  So requiring a manifest is fine, so long as we can derive a sensible one from the mod alone, and the defaults aren't too restrictive.  Good ways to use it would be to show extra information to the user (author, update link, etc), or helpful but non-critical info for the program - eg known non-conflicting mods (useful for mods that were split up, program then ignores detected conflicts between them as false), base raws version, etc.  Again though, the program should work well enough without a manifest file. 

Complicated problems in the raws following multiple merges:  there's probably something possible with comparing the before and after line identifiers in diffs and working out if two are intending to apply to the same area, even if the content at that line has changed.  This would deal with a couple of the problems noted so far.  The order might be:
 - transform all mod folders into our format, permanently (once ever per mod)
 - create (temporary) flattened raws for vanilla and all mods (once per run)
 - create a diff between flat vanilla and each flat mod (once per run)
 - select mods to load and load order, and each time this changes:
 * analyse diffs in order; if changed areas overlap reject merge and return first overlapping mod
 - attempt merges in order; if any fail reject the merge and return first non-merging mod
 - if merge was rejected inform the user which mod caused failure, otherwise offer new raws
I think at this point we're mostly talking over how to do the starred point to avoid false-positives in merge-ability.  While I think this would work enough of the time to ignore the edge cases, I don't have enough modding experience to really tell.  Trying for zero false positives (ie merges that should have been refused) is probably futile, because we'd be back to one mod at a time.  I'm still focussing on normal diffs rather than something raw-aware because it's so much faster to build; we can improve the system after it exists. 
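
For the starred step, a first attempt in Python might be no more than the following, using the standard difflib module on one flattened file at a time.  The function names are invented for this sketch, and it is deliberately conservative: two changes that merely touch the same spot in vanilla count as overlapping, so it will refuse some merges that would have been fine.

```python
import difflib

def changed_ranges(vanilla_lines, mod_lines):
    """Ranges (start, end) of vanilla line indices that the mod changes."""
    matcher = difflib.SequenceMatcher(None, vanilla_lines, mod_lines, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
            if tag != "equal"]

def touches(a, b):
    # Inclusive comparison: adjacent ranges and two insertions at the same
    # point (where start == end) are both treated as a clash.
    return a[0] <= b[1] and b[0] <= a[1]

def first_conflicting_mod(vanilla_lines, mods):
    """mods is a list of (name, lines) in load order; returns a name or None."""
    claimed = []
    for name, lines in mods:
        ranges = changed_ranges(vanilla_lines, lines)
        if any(touches(r, c) for r in ranges for c in claimed):
            return name
        claimed.extend(ranges)
    return None
```

Returning the name of the first clashing mod gives the user something concrete to remove from the load order, which is all the feedback the design calls for.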

And in the end we can add a label to use at own risk, explain that modding can lead to a broken game and while we try it's not perfect, and if the new world they generate is broken - that's what modding is like sometimes!
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 07:59:51 pm
the double type was an accident

as to:
"remove any line consisting entirely of square brackets would be nice"

that was done intentionally.  To keep the contextual information in place (such as commented out options like mw has).  The mess of brackets was due to me purposely putting a bunch in.

This format should work.  It merely extracts [tokens] and puts them on their own line.  Everything else is left to whatever line it was on.

There is a much simpler version of the script that just grabs the [tokens], but you'd have to reinject the filename at the top
Code:
sed -e "s/\[[^][]*\]/\n&\n/g"
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 18, 2014, 08:09:22 pm
The problem with relying on a manifest is that it massively cuts down on your potential input.  At the moment, that's every mod out there - if it can work installed on vanilla DF (without graphics or dfhack), that install can be fed into the tool / converted to our format / stupidly merged.  If we require a manifest, that shrinks the pool to those mods made with this tool in mind or manually updated for compatibility by someone, which is to say no mods at all.  So requiring a manifest is fine, so long as we can derive a sensible one from the mod alone, and the defaults aren't too restrictive.  Good ways to use it would be to show extra information to the user (author, update link, etc), or helpful but non-critical info for the program - eg known non-conflicting mods (useful for mods that were split up, program then ignores detected conflicts between them as false), base raws version, etc.  Again though, the program should work well enough without a manifest file. 

I understand the sentiment, but not the logic.  Namely, it doesn't matter what system you implement, modders still have to pack responsibly or you must include a manifest.  Users would have to download mods from DFFD into a Mods directory, or the launcher has a browser that automatically lists, downloads and unpacks mods - either way there is an onus on the modder to conform to the raw folder structure, which they currently do not do at all. 

Creating a catch-all formatter that would magically pull out all the relevant files is impossible.  The Wanderer's Friend mod dumps everything within the raw structure, but the user has to select which features they want to install by selectively copying files across.  Only way you can do this properly is to have a sentient/English speaking AI that doesn't mind spending its life reading mod readme.txt files.

Point is, somewhere in the chain a human has to check it over and understand if the mod is compatible with the loader.  This line...

Quote
The problem with relying on a manifest is that it massively cuts down on your potential input.  At the moment, that's every mod out there - if it can work installed on vanilla DF (without graphics or dfhack), that install can be fed into the tool / converted to our format / stupidly merged.

... is false.  Some mods, by chance, will work.  Many/most of them won't or will be incorrectly installed and produce unwanted/unintended behaviour.

If I was going to future proof this, I'd bite the bullet and put a manifest in from the start.  Make another program or a webform or something that can generate the manifest easily, or just some guidelines.  Set up some mods yourself this way and then encourage others to use it.  Make sure the manifest has version control, then later you can make it work with DFHack scripts and other things.

Spoiler: Simple Manifest (click to show/hide)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 18, 2014, 08:19:25 pm
the double type was an accident

as to:
"remove any line consisting entirely of square brackets would be nice"

that was done intentionally.  To keep the contextual information in place (such as commented out options like mw has).  The mess of brackets was due to me purposely putting a bunch in.

This format should work.  It merely extracts [tokens] and puts them on their own line.  Everything else is left on whatever line it was on.

There is a much simpler version of the script that just grabs the [tokens], but you'd have to reinject the filename at the top
Code: [Select]
sed -e "s/\[[^][]*\]/\n&\n/g"

Try this as a sed command

s/(.*?)(\[[^][]+\])/\1\n\2\n/g

and see if that works.  It looks for "something that's not a valid tag" followed by "something that is a valid tag" and inserts newlines after each.  Consecutive valid tags probably put a blank line in between, but a wrapper (or second pass) can weed all of those out.
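For comparison, here is a rough Python sketch of the same idea: split each tag onto its own line, then weed out the blank lines in a second pass. sed's regex dialects have no lazy `.*?`, so the pattern below only matches the tag itself. The function name `flatten` and the sample line are made up for illustration and come from nobody's actual script.

```python
import re

# One raw token: an opening bracket, anything except brackets, a closing bracket.
TOKEN = re.compile(r"\[[^\[\]]+\]")

def flatten(text):
    """Put every [TOKEN] on its own line; other text stays in its original order."""
    split = TOKEN.sub(lambda m: "\n" + m.group(0) + "\n", text)
    # Second pass: strip surrounding whitespace and drop the blank lines
    # that appear between consecutive tokens.
    lines = (line.strip() for line in split.splitlines())
    return "\n".join(line for line in lines if line)

sample = "gauntlets [NAME:gauntlet:gauntlets][ARMORLEVEL:2] worn on hands"
print(flatten(sample))
```

Text that is not a token (comments, stray brackets) is kept, just trimmed, so this matches the "keep the context" variant rather than the tokens-only one.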
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 08:31:28 pm
I tried

sed -r "s/(.*?)(\[[^][]+\])/\1\r\n\2\r/g" testfile.txt > testfile.out

and it didn't work on sample input data.

The solution posted 2-3 posts back actually works better, because it doesn't try to split the [[ and ]] runs onto new lines.  It reserves lines for actual tokens only; anything that isn't a token stays on whatever line it was already on, with no carriage returns inserted.

I could make a cleaner version that just deals with the actual tokens and reinject the filename on top... however... there are files in the ..\text folder that seem to have non token data that may actually be used by the program.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 18, 2014, 08:41:12 pm
Modders still have to pack responsibly or you must include a manifest.  ... either way there is an onus on the modder to conform to the raw folder structure, which they currently do not do at all. 

Creating a catch-all formatter that would magically pull out all the relevant files is impossible.  The Wanderer's Friend mod dumps everything within the raw structure, but the user has to select which features they want to install by selectively copying files across.  Only way you can do this properly is to have a sentient/English speaking AI that doesn't mind spending its life reading mod readme.txt files.

Point is, somewhere in the chain a human has to check it over and understand if the mod is compatible with the loader.  This line...
Quote
The problem with relying on a manifest is that it massively cuts down on your potential input.  At the moment, that's every mod out there - if it can work installed on vanilla DF (without graphics or dfhack), that install can be fed into the tool / converted to our format / stupidly merged.
... is false.  Some mods, by chance, will work.  Many/most of them won't or will be incorrectly installed and produce unwanted/unintended behaviour.

If I was going to future proof this, I'd bite the bullet and put a manifest in from the start.  Make another program or a webform or something that can generate the manifest easily, or just some guidelines.  Set up some mods yourself this way and then encourage others to use it.  Make sure the manifest has version control, then later you can make it work with DFHack scripts and other things.

You're right, most mods as distributed will be a steaming mess and completely broken.  That's not what I was talking about though - I was referring to mods after they're installed with DF (by either the sensible modder or the end user).  At this stage, they have to include a valid raw structure, and that we can process and use. 

It's the responsibility of whoever is adding the mod to the launcher to ensure that it's either:
 a) a full, working installation of raw-only mods; or
 b) a raw folder that can be overwritten over a vanilla install and have that become a working install of the mod

If someone wants a loader-compatible Wanderer's Friend option, they have to set up a working install of it themselves.  Or <checks DFFD> if they want something distributed as a "replace your options folder with this" mod like Eevee Fortress, they'd better do that.  I imagine some modders will conform to the format, some will be set up by pack distributors, and some will have to be done by the user.

I do plan to include manifest use from an early version, both reading if present and writing a default if not, but still not requiring one to be included for basic use.  Good idea on the version number.  We can certainly encourage modders to include a manifest and users to add to the generated default, but we have to be functional without it. 
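A minimal sketch of "read the manifest if present, write a default if not", assuming a JSON file. The file name `manifest.json`, the field names and the version number are all placeholders I made up for illustration; nothing about the manifest format has been decided in this thread.

```python
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical file name
MANIFEST_VERSION = 1             # hypothetical format version

def default_manifest(mod_dir):
    """Fields that can be derived from the mod folder alone (all hypothetical)."""
    return {
        "manifest_version": MANIFEST_VERSION,
        "title": Path(mod_dir).name,
        "author": "unknown",
        "update_url": "",
    }

def load_manifest(mod_dir):
    """Read the manifest if present; otherwise write and return a default."""
    path = Path(mod_dir) / MANIFEST_NAME
    manifest = default_manifest(mod_dir)
    if path.is_file():
        # Values supplied by the modder or user override the defaults.
        manifest.update(json.loads(path.read_text(encoding="utf-8")))
    else:
        path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

Because defaults are filled in first, a manifest that only sets one field (say, the author) still yields a complete record, and a mod with no manifest at all still loads.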
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 09:11:26 pm
this is a "cleaner" version that removes all comments and injects the filename at the top of the file.

interestingly enough, this guy was trying to solve exactly the issue I wanted with filename injections
http://www.computing.net/answers/programming/how-do-i-insert-a-filename-into-a-text-file/24247.htm

Either this method, or the one described in this post should do the trick:
http://www.bay12forums.com/smf/index.php?topic=142295.msg5584437#msg5584437

Code: [Select]
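@ECHO OFF
SETLOCAL EnableDelayedExpansion

REM Requires grep and sed on the PATH (e.g. from Cygwin).
FOR /f "tokens=*" %%a IN ('DIR /b /a-d "*.txt"') DO (
REM Start a fresh temp file with the filename (minus .txt) on the first line
SET Var=%%a
ECHO !Var:~0,-4!>TempFile.txt
ECHO.>>TempFile.txt
REM Keep only the [tokens]; comments and other text are dropped
grep -oE "\[[^][]+\]" "%%a" >>"TempFile.txt"
sed -e "s/\[[^][]*\]/\n&\n/g" "TempFile.txt" > "TempFile2.txt"
REM Strip carriage returns and surrounding spaces, then drop blank lines
sed -e "s/\r//; s/^ *//; s/ *$//; /^\s*$/d" "TempFile2.txt" > "TempFile3.txt"
REM Replace the original file with the flattened version
DEL "%%a"
REN "TempFile3.txt" "%%a"
DEL "TempFile2.txt"
DEL "TempFile.txt"
)
PAUSE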

input:

Spoiler (click to show/hide)

output (it's called test because that was the file name... it's not an error, it's just because I set the filename is all)
Code: [Select]
test
[OBJECT:ITEM]
[ITEM_GLOVES:ITEM_GLOVES_GAUNTLETS]
[NAME:gauntlet:gauntlets]
[ARMORLEVEL:2]
[UPSTEP:1]
[SHAPED]
[LAYER:ARMOR]
[COVERAGE:100]
[LAYER_SIZE:20]
[LAYER_PERMIT:15]
[MATERIAL_SIZE:2]
[SCALED]
[BARRED]
[METAL]
[LEATHER]
[HARD]
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 09:39:16 pm
- transform all mod folders into our format, permanently (once ever per mod)
 - create (temporary) flattened raws for vanilla and all mods (once per run)
 - create a diff between flat vanilla and each flat mod (once per run)
 - select mods to load and load order, and each time this changes:
 * analyse diffs in order; if changed areas overlap reject merge and return first overlapping mod
 - attempt merges in order; if any fail reject the merge and return first non-merging mod
 - if merge was rejected inform the user which mod caused failure, otherwise offer new raws
Regarding steps one and two, I still don't get why you want to transform mods into our format. Mods should work with the tool out of the box. You can do some caching as an optimization, so that you don't have to do the same work per mod, each time you merge mods, but this is a performance optimization, and it does not make sense to try to make the process faster at the cost of ease of use. If you do want caching like that, it would be better to use a system that looks at modification time instead of manually asking the user to pre-process the Mod.
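To illustrate the modification-time idea, here is a rough sketch of a staleness check. It assumes the processed form of each mod is kept in some separate cache folder; the function names and that layout are my own assumptions and not part of any existing tool.

```python
import os

def newest_mtime(folder):
    """Latest modification time of any file under folder (0.0 if it has none)."""
    times = [
        os.path.getmtime(os.path.join(root, name))
        for root, _dirs, files in os.walk(folder)
        for name in files
    ]
    return max(times, default=0.0)

def cache_is_stale(mod_dir, cache_dir):
    """True if the cache is missing or any mod file is newer than the cache."""
    if not os.path.isdir(cache_dir):
        return True
    return newest_mtime(mod_dir) > newest_mtime(cache_dir)
```

With a check like this the tool can redo the per-mod work only when `cache_is_stale` returns True, and the user never has to pre-process anything by hand.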
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 09:58:53 pm
Regarding a manifest file, I agree with PeridexisErrant that we don't want to have any special requirements on the format of a Mod. A mod, capable of being installed by copying over vanilla should be all that is needed for a Mod to work.

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: salithus on August 18, 2014, 10:04:56 pm
Do you guys have any specific mods you're using as a test bed? I've got some (crazy) ideas, but not sure what to best test with.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 10:17:31 pm
Regarding a manifest file, I agree with PeridexisErrant that we don't want to have any special requirements on the format of a Mod. A mod, capable of being installed by copying over vanilla should be all that is needed for a Mod to work.

all this stuff about a "manifest"... I've been saying can be updated from the patch files/raws themselves.  There is absolutely NO NEED to make anything OTHER than a patch file.  If a v2 or v3 or whatever comes along.  The objects can be parsed in place from raws and patches.  Since patches are ALWAYS editing a base vanilla branch.  This branch can be defaulted back to to figure out the exact line #'s that were to be edited, and what object it was to be applied to.  Then... return back to that location using the token-id (of course in a hypothetical future version).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 10:31:51 pm
I started working on a script to
A)Identify conflicting mod files based on a diff
B)Merge non-conflicting files.

The intent is for these to be two functions that work on multiple versions of the same file.

A simple implementation of this would be to implement A as assuming any two non-identical files are conflicting and B as moving the mod file to the generated mod. My script would be a drop-in replacement for that.
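The simple implementation described above could look roughly like this. The function names are placeholders, and the sketch copies the mod file instead of moving it so the mod folder stays intact.

```python
import filecmp
import shutil

def files_conflict(path_a, path_b):
    """Naive rule A: two versions conflict unless they are byte-for-byte identical."""
    return not filecmp.cmp(path_a, path_b, shallow=False)

def merge_file(mod_file, merged_file):
    """Naive rule B: the mod's version simply replaces the one in the output."""
    shutil.copyfile(mod_file, merged_file)
```

A diff-based version would keep the same two signatures and only change what happens inside them.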
Do you guys have any specific mods you're using as a test bed? I've got some (crazy) ideas, but not sure what to best test with.
No, but creating a testbed would be useful. Thistleknot suggested testing merging DF0.40.x raws and accelerated mod raws over DF0.34.11, but that's an ambitious goal. Another obvious goal is any particular mini-Mod on top of any graphics pack.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 10:59:24 pm
I'll get you 3 diff mods by night's end, all based on diffs from 40_09.

Maybe you'll have some luck with merging them :)

I'll host the files on github, the commits will show I flattened them.  I'll probably use the script that keeps comments in.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 18, 2014, 11:05:48 pm
OK, here's where I leave this for a while:  input format, folder structure, and a code skeleton that works... for everything except the actual merge bit.  You may want to expand the information passed through the functions to the merge logic; I kept it minimal, so it doesn't even hold folder variables.

Download link. (http://dffd.wimbli.com/file.php?id=9428)

Spoiler: code (click to show/hide)

I started working on a script to
A)Identify conflicting mod files based on a diff
B)Merge non-conflicting files.

The intent is for these to be two functions that work on multiple versions of the same file.

A simple implementation of this would be to implement A as assuming any two non identical files are conflicting and B as moving the mod file to the generated mod. My script would be a drop in replacement for that.

If you start with the code above, you can just fill out the function "def mod_merge_logic(mod, mods_merged):" - which was the idea of posting it :)  However I think that passing arguments to these functions may actually be counterproductive, as you lose all the other variables that have been set up (like, eg, paths and number of mods already merged). 

I plan on (sometime soonish I hope) fixing that, adding useful comments before that function explaining the variables you can use, and adding a flatten_raws function somewhere. 

Quote
testbed mods
I just visited DFFD for a couple of mods, I think Rise of the Mushroom Kingdom as a major mod, Accelerated Modest Mod in the middle, and Eevee Fortress which just adds them as a playable civ.  As noted above some mods require installation over vanilla before they're usable, but that's not much of a challenge. 

Quote
manifest
Reading everything relevant to merge logic from the content is the plan, but an optional manifest is still nice for stuff like displaying the author and an update link, maybe a one-sentence summary of the mod, that kind of thing.  Some fields can be autofilled from the raws / folder names / etc, but others need a human to write them. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 11:12:12 pm
Code: (just the mod merging functions) [Select]
def merge_next_mod(next_mod, mod_load_order, mods_merged):
    if not next_mod == '':
        mod_merge_logic(next_mod, mods_merged)
    if not mod_load_order == []:
        next_mod = mod_load_order.pop(0)
        mods_merged += 1
        merge_next_mod(next_mod, mod_load_order, mods_merged)
I should have caught this earlier. This is a horribly wasteful way to loop, and may cause a stack overflow. Just write a for loop that loops over each mod. No need for recursion.
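Something like the following loop does the same job without recursion. `mod_merge_logic` here is only a stand-in that records its calls, and `merge_all_mods` is a name I made up; the counter stays 1-based like in the skeleton.

```python
merge_log = []

def mod_merge_logic(mod, mods_merged):
    # Stand-in for the real merge step; here it only records the call.
    merge_log.append((mods_merged, mod))

def merge_all_mods(mod_load_order):
    """Merge each mod in load order; returns how many were merged."""
    mods_merged = 0
    for mod in mod_load_order:
        mods_merged += 1
        mod_merge_logic(mod, mods_merged)
    return mods_merged
```

Unlike the recursive version, this does not pop entries off `mod_load_order`, so the caller's list is left unchanged.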
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 11:17:47 pm
I think there are some files that may "break" if flattened the way we do.

this is taken from vanilla.
seek out [CONTEXT:HIST_FIG:TRANS_NAME] at [CONTEXT:ABSTRACT_BUILDING:TRANS_NAME] over in [CONTEXT:SITE:TRANS_NAME]

If Dwarf Fortress reads this as one line... then my parse breaks it

that's in data\speech
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 11:18:11 pm
OK, here's where I leave this for a while:  input format, folder structure, and a code skeleton that works... for everything except the actual merge bit.  You may want to expand the information passed through the functions to the merge logic; I kept it minimal, so it doesn't even hold folder variables.

Download link. (http://dffd.wimbli.com/file.php?id=9428)

Spoiler: code (click to show/hide)

I started working on a script to
A)Identify conflicting mod files based on a diff
B)Merge non-conflicting files.

The intent is for these to be two functions that work on multiple versions of the same file.

A simple implementation of this would be to implement A as assuming any two non identical files are conflicting and B as moving the mod file to the generated mod. My script would be a drop in replacement for that.

If you start with the code above, you can just fill out the function "def mod_merge_logic(mod, mods_merged):" - which was the idea of posting it :)  However I think that passing arguments to these functions may actually be counterproductive, as you lose all the other variables that have been set up (like, eg, paths and number of mods already merged). 

I plan on (sometime soonish I hope) fixing that, adding useful comments before that function explaining the variables you can use, and adding a flatten_raws function somewhere. 
Yeah that's one of the functions I'm basically working on. The other is to see if merging is safe in the first place.

EDIT: except mod_merge_logic takes a mod, I'm working on single files.

Quote
Quote
manifest
Reading everything relevant to merge logic from the content is the plan, but an optional manifest is still nice for stuff like displaying the author and an update link, maybe a one-sentence summary of the mod, that kind of thing.  Some fields can be autofilled from the raws / folder names / etc, but others need a human to write them.
As long as it's optional.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 18, 2014, 11:20:41 pm
I think there are some files that may "break" if flattened the way we do.

this is taken from vanilla.
seek out [CONTEXT:HIST_FIG:TRANS_NAME] at [CONTEXT:ABSTRACT_BUILDING:TRANS_NAME] over in [CONTEXT:SITE:TRANS_NAME]

If Dwarf Fortress reads this as one line... then my parse breaks it

that's in data\speech
Yeah, it seems some files can be flattened, others cannot. You'd have to just keep track of which files cannot, based on their name and directory.
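A rough sketch of keeping track by name and directory. Only `data/speech` comes from the example above; the empty file-name set, the `.txt` check and the function name are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Folders whose files hold free text with inline tokens. Only data/speech is
# taken from the thread; anything added later would need checking by hand.
NO_FLATTEN_DIRS = {"data/speech"}
# Individual file names to skip (hypothetical, empty for now).
NO_FLATTEN_FILES = set()

def can_flatten(relative_path):
    """True if the file is a .txt outside every folder on the skip list."""
    path = PurePosixPath(relative_path.replace("\\", "/"))
    if path.suffix.lower() != ".txt":
        return False
    if path.name.lower() in NO_FLATTEN_FILES:
        return False
    parents = {str(parent).lower() for parent in path.parents}
    return NO_FLATTEN_DIRS.isdisjoint(parents)
```

Paths are relative to the DF install, and backslashes are normalised so the same list works for paths produced on Windows.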
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 18, 2014, 11:46:53 pm
Okay.

Here's the testbed:

Based on: 40_09 flattened
zip can be dl'd from here (look for zip link in bottom right):
https://github.com/thistleknot/df_40_09_Flattened/tree/f18f2555afb309ced4fdeac5f2644f028939a2e6

I picked the first 3 mods that were compatible with 40_09 that had active development

Advanced Civilizations
Dire Forged (most changes)
Plant Fixes

https://github.com/thistleknot/df_40_09_Flattened/commits/master
I only flattened raw\objects\*.* (no subfolders)

Patch files
http://dffd.wimbli.com/file.php?id=9439

Spoiler: SCRIPTUSEDTOPARSE.BAT (click to show/hide)

UPDATE
I was able to merge the plant and advciv ones fairly easily.  However, the direforge and plantfix modified the same area of permitted reactions of some types of drinks.

I used tortoisegitmerge.

oh yeah, I had to delete the diff part for the additive pdf file, it was causing tortoisegitmerge headaches

well gnight guys

Update 2
Apparently, if one creates the patch file using diff -r -u dir1 dir2, one gets a patch:)

In cygwin, one can take two patches based on the same source, provided both patches are UNIFIED, and create an interdiff from them.  Which is basically a merge of the two!

conflicts will be flagged upon merge... but... this gives one the opportunity to address n way merges on the patch level!
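As a rough illustration of how a conflict could be detected before attempting a merge, here is a sketch that compares which vanilla lines each mod touches. This is not interdiff; it is a stand-in using Python's difflib, and the function names are made up. Changes that merely sit next to each other are treated as conflicts, which is deliberately conservative.

```python
import difflib

def changed_ranges(vanilla_lines, mod_lines):
    """Half-open (start, end) ranges of vanilla lines that the mod changes."""
    matcher = difflib.SequenceMatcher(None, vanilla_lines, mod_lines, autojunk=False)
    return [
        (i1, i2)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

def ranges_overlap(a, b):
    """True if two ranges share or touch vanilla lines (adjacency counts)."""
    (a1, a2), (b1, b2) = a, b
    return a1 <= b2 and b1 <= a2

def mods_conflict(vanilla_lines, mod_a_lines, mod_b_lines):
    """True if both mods change the same (or adjacent) region of vanilla."""
    ranges_a = changed_ranges(vanilla_lines, mod_a_lines)
    ranges_b = changed_ranges(vanilla_lines, mod_b_lines)
    return any(ranges_overlap(a, b) for a in ranges_a for b in ranges_b)
```

Flattened raws help here: with one token per line, two mods that edit different tokens of the same object land on different lines instead of colliding.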
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 12:45:06 am
I was able to merge the plant and advciv ones fairly easily.  However, the direforge and plantfix modified the same area of permitted reactions of some types of drinks.

Did you happen to see if they were the same/similar modifications? I know at least one mod, which I think had Forge in the name, was going to include/expand upon the plant fixes.



By the way, what do you guys think about mods that are essentially bundles of small tweaks? Would it be helpful to users to separate them a tweak at a time, or do you think that would overwhelm with too much choice & all the tweaks should be bundled together?

Without a mod loading tool bundling seems to be the clear winner, but with a whole list to select from it might be better to include more modularity?



I'll break apart/format some of my personal mods into small feature-set mods for trying out the tools on, particularly with regards to creature variations so we can see how much of a problem that's going to be.



What are y'all's thoughts on applying graphics packs to the resulting raw monstrosity? (And I use the term affectionately). Are mods going to have to be included in graphics'd versions in order to be used with a tileset; will graphics packs be treated like just another mod, merged in at the end; or will there be special handling for graphics packs?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 01:17:42 am
By the way, what do you guys think about mods that are essentially bundles of small tweaks? Would it be helpful to users to separate them a tweak at a time, or do you think that would overwhelm with too much choice & all the tweaks should be bundled together?

Without a mod loading tool bundling seems to be the clear winner, but with a whole list to select from it might be better to include more modularity?
I think this would depend on the mod. Tweaks that have the same objective probably should be bundled. But unrelated tweaks, may be better off separated.

Quote
What are y'all's thoughts on applying graphics packs to the resulting raw monstrosity? (And I use the term affectionately). Are mods going to have to be included in graphics'd versions in order to be used with a tileset; will graphics packs be treated like just another mod, merged in at the end; or will there be special handling for graphics packs?
Ideally graphics will be treated like any other mod, meaning mods should be applied to vanilla, and the Mod Starter Pack would merge the mod with the graphics. It remains to be seen how practical that is.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 19, 2014, 01:31:33 am
Awesome!   It looks like we're onto something  :D

@Thistleknot: 
Is that a working multimerge?  Wow. 
Now there are just the comparatively simple tasks of implementing the flatten and merge functions in Python, rejecting failed merges, and then opening it up!  (Seriously, if the logic is confirmed to work that's a massive step forward)

@Button:
I think that given the novelty of easy patch based loading we might as well err on the side of many tiny patches for once - just to see what that's like!  Medium term, I can imagine that some things might amalgamate a little; to take the example of Accelerated DF it might want to divide materials into 'Generic creatures mod' and 'Generic stone and wood mod', but splitting blood from leather from shell etc would be too much. 

Quote
What are y'all's thoughts on applying graphics packs to the resulting raw monstrosity? (And I use the term affectionately). Are mods going to have to be included in graphics'd versions in order to be used with a tileset; will graphics packs be treated like just another mod, merged in at the end; or will there be special handling for graphics packs?
Ideally graphics will be treated like any other mod, meaning mods should be applied to vanilla, and the Mod Starter Pack would merge the mod with the graphics. It remains to be seen how practical that is.
Due to practical concerns, I'd lean the other way - treat graphics as a special case and assume that we're using ASCII with [graphics:no]; note that tilesets are still compatible, just not graphics. 

Some graphics packs are based on standard raws and others aren't.  This is historically to free up the tiles for accented letters for other things, but with the Text will be Text plugin that could be reversed - which would mean that the raw folder could be left entirely alone by graphics packs.  They also have to mess around with the /data folder a lot, and if we can avoid needing to touch that we probably should - separation of concerns to avoid causing yet more conflicts.  I'm aware that this limits the range of compatible mods somewhat, but I don't see encouraging a return to the baseline standard as a terrible position to take; and we can always extend later. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 01:52:57 am
Quote
What are y'all's thoughts on applying graphics packs to the resulting raw monstrosity? (And I use the term affectionately). Are mods going to have to be included in graphics'd versions in order to be used with a tileset; will graphics packs be treated like just another mod, merged in at the end; or will there be special handling for graphics packs?
Ideally graphics will be treated like any other mod, meaning mods should be applied to vanilla, and the Mod Starter Pack would merge the mod with the graphics. It remains to be seen how practical that is.
Due to practical concerns, I'd lean the other way - treat graphics as a special case and assume that we're using ASCII with [graphics:no]; note that tilesets are still compatible, just not graphics. 

Some graphics packs are based on standard raws and others aren't.  This is historically to free up the tiles for accented letters for other things, but with the Text will be Text plugin that could be reversed - which would mean that the raw folder could be left entirely alone by graphics packs.  They also have to mess around with the /data folder a lot, and if we can avoid needing to touch that we probably should - separation of concerns to avoid causing yet more conflicts.  I'm aware that this limits the range of compatible mods somewhat, but I don't see encouraging a return to the baseline standard as a terrible position to take; and we can always extend later.
Graphics packs can change raws quite a bit. A quick look at Phoebus's shows that creature tiles, language files, inorganic stones, and plant files are all changed by the pack.

Graphics packs may need to be special cases, but ignoring them completely would not allow mods to work with graphics, which is a big problem.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 19, 2014, 02:43:52 am
Graphics packs can change raws quite a bit. A quick look at Phoebus's shows that creature tiles, language files, inorganic stones, and plant files are all changed by the pack.

Graphics packs may need to be special cases, but ignoring them completely would not allow mods to work with graphics, which is a big problem.

As far as I'm aware this is mostly changes related to replacing tiles for accented letters with walls, floors, trees, and so on.  The CLA graphics pack I believe has a single change from vanilla raws, and still plenty of creatures. 

Basic graphics processing can probably be added within the existing framework, though I assume we'll need a non-diff compare logic for image files (messy but not hard).  I'd do that at the same general stage we apply the first round of upgrades, like avoiding flattening / destroying the book title files. 

More advanced logic might be required later, but I favour just asking graphics people to go old school + TwbT instead of modifying the raws.  There are hundreds of tile sets (which don't require graphics), thousands of mods, and maybe ten major graphics packs. 

As usual, it's not that I don't like the idea - here I even think we really need to be able to handle graphics before 2.0 - I just think we don't have the rest nailed down enough yet for it to be useful. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 06:12:02 am
Gfx should be easy. Just do a diff between ASCII n a gfx mod. And apply it to a 3rd party mod.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 19, 2014, 06:57:50 am
Gfx should be easy. Just do a diff between ASCII n a gfx mod. And apply it to a 3rd party mod. :-*
 :'( :'( :-*

> non-text files
> important stuff outside the raw folder

But yeah, as long as it's last in the load order that should work. 
A manifest should probably include whether it's graphics or not, so we can enforce last-on graphics merging in known cases. 
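A small sketch of how last-on graphics merging could be enforced from the manifest. The `is_graphics` field and the function name are hypothetical; a mod without a manifest entry is treated as not being a graphics pack.

```python
def enforce_graphics_last(load_order, manifests):
    """Stable reorder: non-graphics mods first, graphics packs at the end."""
    def is_graphics(mod):
        return bool(manifests.get(mod, {}).get("is_graphics", False))
    # sorted() is stable and False sorts before True, so the relative order
    # inside each group is whatever the user chose.
    return sorted(load_order, key=is_graphics)
```

For example, a load order of Phoebus, Plant Fixes, Dire Forged with only Phoebus flagged as graphics comes back as Plant Fixes, Dire Forged, Phoebus.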
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 19, 2014, 07:12:48 am
Basic graphics processing can probably be added within the existing framework, though I assume we'll need a non-diff compare logic for image files (messy but not hard).  I'd do that at the same general stage we apply the first round of upgrades, like avoiding flattening / destroying the book title files. 

Don't graphics packs normally only *add* image files, not change them? Unless they change an existing file, you really just need to copy over those extra PNGs (and if we want to store them in a single patch file, we could always do something like using Base64 to store the entire file in a plain-text format).

There's certainly a use-case for providing individual tiles to be replaced, but simple image copying should get us pretty far, I'd think, with or without TwbT.
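The Base64 idea could be sketched like this, assuming we just want a text form of an image that can sit inside a plain-text patch. The function names are made up, and how the text would be delimited inside a patch file is left open.

```python
import base64

def image_to_text(path):
    """Return the file's bytes as Base64 text, wrapped at 76 characters per line."""
    with open(path, "rb") as handle:
        return base64.encodebytes(handle.read()).decode("ascii")

def text_to_image(text, path):
    """Write the bytes encoded in Base64 text back out to path."""
    with open(path, "wb") as handle:
        handle.write(base64.decodebytes(text.encode("ascii")))
```

Base64 output is roughly a third larger than the binary file, which should not matter much for a handful of tileset PNGs.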
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 19, 2014, 07:43:53 am
Yes, we shouldn't ever have to modify image files, just add and remove them. Given that we're comparing to vanilla DF anyway the only graphics files are the examples, and we could probably justify a special case to discard those.

Adding the image to a text diff hadn't occurred to me, but could be useful. Maybe we do want to cover the data folder as well as the raw folder!
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 08:21:33 am
Guys guys guys. I'll update w a Phoebus patch. It ... Should.... Do the binary files as well. Idk. That may require manual copying... But the creature tiles hopefully will be taken care of via patch file.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 08:31:16 am

Don't graphics packs normally only *add* image files, not change them? Unless they change an existing file, you really just need to copy over those extra PNGs (and if we want to store them in a single patch file, we could always do something like using Base64 to store the entire file in a plain-text format).

There's certainly a use-case for providing individual tiles to be replaced, but simple image copying should get us pretty far, I'd think, with or without TwbT.

Phoebus, which is what I use, changes most of the vermin & plant tiles as well in the raws. I'm.... not sure why? But yeah.

Separating my stuff out into diffable modular mods is ongoing, and also a pain in the butt, but I should have a couple potential-conflict mods that I know for a fact work together later today.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 08:58:25 am
Phoebus-Direforge.

I don't know why, but the phoebus fonts come out screwy EVERY TIME I try to merge phoebus with another mod.  The names of dwarves come up with weird symbols.

It merged anyways though!

https://github.com/thistleknot/df_40_09_Flattened/tree/40-09_Phoebus_Direforged

If anything, due to github NOT COPYING binary files, I'd start with phoebus 40_09 as a base and extract these files over it.

Here's some of the merge conflicts I get with plant and direforge
http://imgur.com/Rjvo7RL
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 19, 2014, 09:55:09 am
In a graphics pack, the majority of raw changes are altering display tiles and replacing accented characters in the language files.  The latter frees up a lot of space in the tileset to allow more variety in the former.

Two quick notes:

1. We can't assume someone is running TWBT, especially if we aren't going to assume DFHack is installed.

2. Graphics files in /data tend not to collide... though personally I find the Starter Pack's habit of wiping out my font.ttf file a little annoying.  A diff-from-vanilla based off of a graphics pack should exclude things like the font and cursor because those weren't changed.  The ticklish bit is applying a diff to the init file since it's not in the raw or data folder.

That said, the same logic should be able to apply graphics packs like any other mod, though I'd recommend applying them last.  That could be suggested/recommended/coerced through the optional manifest file.  I'd go one further and bring along a Stonesense folder since it would be ignored if that tool isn't installed.  I'm not aware of any mods that play with Soundsense, but if there are then it ought to get similar treatment.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 10:10:54 am
Phoebus-Direforge.

I don't know why, but the phoebus fonts have screwy ascii fonts EVERY TIME I try to merge phoebus with another mod.  The names of dwarves come up with weird symbols.

That's odd, it really should be the other way around. Unless the original language files are being prioritized?

Also, are you sure the normal and phoebus raws are using the same encoding scheme? Could be something to do with that.

Here's some of the merge conflicts I get with plant and direforge
http://imgur.com/Rjvo7RL

Oh, those should be able to merge well enough together one after the other. I mean, from a logic perspective. Obviously merge tools tend to be... a little simplistic.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 10:44:02 am
Changes to d_init.txt and init.txt may need to be treated specially. Maybe just let graphics modify them, then post-process the files with user settings in a script.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 10:47:31 am
so...

I'm thinking on this idea.

There is a way to COMBINE patch files.  If we combined them, we could apply them once instead of retrying on every iteration with a new patch.  Maybe even let a player pick which patches/mods they want to test together, find the ones that are incompatible with each other, and show the player how they could use a 3rd party tool like diffmerge or tortoisegitmerge, giving them the two patch files and the base folder if they choose to do the merge themselves.

HEAD Patch file info:

tortoisegitmerge will throw an error about the two [patches] not being able to be applied neatly WITHOUT PROPER HEAD INFO?  So yeah.  I'm not 100% sure, but the Head info is in the top of each file, but I believe the paths in my output are relative to MY GIT INSTALLATION.  So... if we derived patches off of working actual folders... the head might actually exist on the disk, the head being the base files I believe (ex 40_09 ascii).  So the conflicts may/may not be resolved automatically with a merge?

I'm going to try some tests.

Anyways. 

As to applying grx patches last, 100% agree on that.

Also...

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 11:04:13 am
I wrote up a simple python function that tests if merging is possible. I test it by passing strings, making it diff on individual letters, but if you pass it a list of strings, it will diff that as lines.
Code: [Select]
import difflib

def can_merge_seq(mod_text, vanilla_text, into_text):
    """Return False when both versions changed the same stretch of vanilla_text."""
    van_mod_match = difflib.SequenceMatcher(None, vanilla_text, mod_text)
    van_gen_match = difflib.SequenceMatcher(None, vanilla_text, into_text)

    # Blocks of vanilla that each version left untouched. Each list ends
    # with a zero-length sentinel block at len(vanilla_text).
    van_mod_seqs = van_mod_match.get_matching_blocks()
    van_gen_seqs = van_gen_match.get_matching_blocks()

    cur_v = 0  # position in vanilla_text checked so far
    while cur_v < len(vanilla_text):
        (i1, j1, n1) = van_mod_seqs[0]
        (i2, j2, n2) = van_gen_seqs[0]
        # Neither version kept vanilla at cur_v, so both changed this spot.
        if i1 > cur_v and i2 > cur_v:
            return False
        # Advance past whichever unchanged block ends first.
        if i1 + n1 - cur_v < i2 + n2 - cur_v:
            cur_v = i1 + n1
            van_mod_seqs.pop(0)
        else:
            cur_v = i2 + n2
            van_gen_seqs.pop(0)
    return True

# print() with a single argument works under both Python 2 and Python 3.
print(can_merge_seq('anything at all', 'vanilla', 'vanilla'))
print(can_merge_seq('oooh', 'vanilla', 'vanilla'))
print(can_merge_seq('vanil', 'vanilla', 'nilla'))
print(can_merge_seq('van', 'vanilla', 'la'))
print(can_merge_seq('van', 'vanilla', 'la'))
print(can_merge_seq('vani', 'vanilla', 'lla'))
print(can_merge_seq('vonilla', 'vanilla', 'banana'))
print(can_merge_seq('vannilla', 'vanilla', 'banana'))
print(can_merge_seq('vonilla', 'vanilla', 'venilla'))
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Pidgeot on August 19, 2014, 11:07:48 am
Changes to d_init.txt and init.txt may need to be treated specially. Maybe just let graphics modify them, then post-process the files with user settings in a script.
Ideally the graphics packs will only change the fields they actually need to change, making this a non-issue.

The way I'm handling it in my launcher (http://www.bay12forums.com/smf/index.php?topic=140808.0) is to just read specific fields from the d_init.txt and init.txt files, and only overwrite those. For a full-blown patch system, the easiest way is probably to just make a full patch and then filter out any extraneous changes to those files (or filter them out during patch creation).
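
To illustrate the "only overwrite specific fields" approach, here is a minimal sketch. It is not the launcher's actual code; copy_init_fields and the GRAPHICS_FIELDS list are assumptions made up for this example, and the real set of fields a graphics pack needs would have to be checked.

```python
import re

# Assumed list of init.txt fields a graphics pack cares about (illustrative only).
GRAPHICS_FIELDS = ('FONT', 'FULLFONT', 'GRAPHICS', 'GRAPHICS_FONT', 'GRAPHICS_FULLFONT')

def copy_init_fields(user_text, pack_text, fields=GRAPHICS_FIELDS):
    """Copy the listed [FIELD:value] settings from pack_text into user_text."""
    for field in fields:
        pattern = re.compile(r'\[' + re.escape(field) + r':([^\]]*)\]')
        match = pattern.search(pack_text)
        if match is None:
            continue  # the pack does not set this field, keep the user's value
        new_token = '[%s:%s]' % (field, match.group(1))
        # Use a function as the replacement so backslashes in values stay literal.
        user_text = pattern.sub(lambda m: new_token, user_text, count=1)
    return user_text
```

Everything not named in the field list, such as sound or window settings, is left exactly as the user had it.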
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 11:24:23 am
As to what Button was saying about the merge conflict:

http://www.bay12forums.com/smf/index.php?topic=142295.msg5586461#msg5586461
Quote
"Oh, those should be able to merge well enough together one after the other. I mean, from a logic perspective. Obviously merge tools tend to be... a little simplistic."

The fact that we could have just "added those lines on top of each other" means that could be an option to present the user.

I think the reason that happened is because one mod added 1 line, and the other mod added say 5 lines to the same spot.  Both mods/patches read the original location of that FILE AS BLANK.  Since we are only ever merging two mods at a time, since VANILLA is always our base.  If both lines expected a blank spot (can be read from patch file), and they found something there (in the case of merge conflicts).  It's safe to assume that we CAN ADD BOTH PATCHES, ONE ON TOP OF THE OTHER.

Actually, you know... now that I think about it.  Our base files don't have blank spots.  So what's happening is... there is a token mismatch.  At that point we can introduce logic on how to deal with token mismatches.  It's probably one mod realizing another mod wrote to that spot.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 19, 2014, 11:37:14 am
As to what Button was saying about the merge conflict:

http://www.bay12forums.com/smf/index.php?topic=142295.msg5586461#msg5586461
Quote
"Oh, those should be able to merge well enough together one after the other. I mean, from a logic perspective. Obviously merge tools tend to be... a little simplistic."

The fact that we could have just "added those lines on top of each other" means that could be an option to present the user.

I think the reason that happened is because one mod added 1 line, and the other mod added say 5 lines to the same spot.  Both mods/patches read the original location of that FILE AS BLANK.  Since we are only ever merging two mods at a time, since VANILLA is always our base.  If both lines expected a blank spot (can be read from patch file), and they found something there (in the case of merge conflicts).  It's safe to assume that we CAN ADD BOTH PATCHES, ONE ON TOP OF THE OTHER.

Actually, you know... now that I think about it.  Our base files don't have blank spots.  So what's happening is... there is a token mismatch.  At that point we can introduce logic on how to deal with token mismatches.  It's probably one mod realizing another mod wrote to that spot.

This is the problem I was trying to fix a bunch of posts ago by doing two passes.  Slip in bookmarks where each change is supposed to occur secure in the knowledge that no lines will actually move, then perform the actual changes at those bookmarks which could move things around quite a bit.
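
A rough sketch of the two-pass idea, assuming changes are recorded as positions in the vanilla file. The names collect_bookmarks and apply_bookmarks are made up for this example, and it does not check whether two mods' bookmarks overlap or land on the same spot.

```python
import difflib

def collect_bookmarks(vanilla_lines, mod_lines):
    """Pass 1: record each change as (start, end, replacement) in vanilla line numbers."""
    matcher = difflib.SequenceMatcher(None, vanilla_lines, mod_lines)
    return [(i1, i2, mod_lines[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != 'equal']

def apply_bookmarks(vanilla_lines, bookmarks):
    """Pass 2: apply changes bottom to top, so earlier line numbers stay valid."""
    result = list(vanilla_lines)
    ordered = sorted(bookmarks, key=lambda b: (b[0], b[1]), reverse=True)
    for start, end, replacement in ordered:
        result[start:end] = replacement
    return result
```

Bookmarks from several mods can be concatenated into one list before the second pass, since every position refers to the untouched vanilla file.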

This still doesn't do a good job with re-ordering things.  To get that, you'd probably have to nuke the vanilla file completely and replace it using a different filename.  The launcher would detect that one mod's changes got deleted by another, but absent some advanced raw-aware logic it just isn't possible to merge the two.  That advanced logic comes... later.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 11:43:55 am
As to what Button was saying about the merge conflict:

http://www.bay12forums.com/smf/index.php?topic=142295.msg5586461#msg5586461
Quote
"Oh, those should be able to merge well enough together one after the other. I mean, from a logic perspective. Obviously merge tools tend to be... a little simplistic."

The fact that we could have just "added those lines on top of each other" means that could be an option to present the user.

I think the reason that happened is because one mod added 1 line, and the other mod added say 5 lines to the same spot.  Both mods/patches read the original location of that FILE AS BLANK.  Since we are only ever merging two mods at a time, since VANILLA is always our base.  If both lines expected a blank spot (can be read from patch file), and they found something there (in the case of merge conflicts).  It's safe to assume that we CAN ADD BOTH PATCHES, ONE ON TOP OF THE OTHER.

Actually, you know... now that I think about it.  Our base files don't have blank spots.  So what's happening is... there is a token mismatch.  At that point we can introduce logic on how to deal with token mismatches.  It's probably one mod realizing another mod wrote to that spot.
The problem is, sometimes additions in the same spot don't conflict, sometimes they do. Two mods adding a creature in the same spot are probably ok. Two mods adding the same tag to the same creature might cause problems. A simple solution is to not allow adding to the same spot.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 11:48:33 am

This still doesn't do a good job with re-ordering things.  To get that, you'd probably have to nuke the vanilla file completely and replace it using a different filename.  The launcher would detect that one mod's changes got deleted by another, but absent some advanced raw-aware logic it just isn't possible to merge the two.  That advanced logic comes... later.
We've been talking about this in the abstract, but is there a test case of two mods that we'd like to be able to merge with one doing complex reordering, and the other adding or removing stuff? At the very least we need to make sure that we properly detect that the mods can't be naively merged.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 11:59:26 am
All right, I've got a little pack of minor mods with potentially interesting interactions uploaded now. http://dffd.wimbli.com/file.php?id=9443 .

These features coexist happily in my raws, but may be a little challenging to merge together automatically, so I hope you find them useful :).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 19, 2014, 12:05:54 pm

This still doesn't do a good job with re-ordering things.  To get that, you'd probably have to nuke the vanilla file completely and replace it using a different filename.  The launcher would detect that one mod's changes got deleted by another, but absent some advanced raw-aware logic it just isn't possible to merge the two.  That advanced logic comes... later.
We've been talking about this in the abstract, but is there a test case of two mods that we'd like to be able to merge with one doing complex reordering, and the other adding or removing stuff? At the very least we need to make sure that we properly detect that the mods can't be naively merged.

Spacefox did a lot of tidying up inside the gem raws in DF2012, but since the vanilla raws are now tidier I'm not seeing the same kind of re-organizing.  Still could use the 2012 versions as a test case.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 12:49:21 pm
"Two mods adding the same tag to the same creature might cause problems"
Couldn't this type of conflict be caught by reading the two patch files and being like, "hey, BOTH OF THESE are adding the same token in the same contextual match".

Of course it might be a little simplistic.

Two mods might be incorporating modest mod fixes but reordered the tokens!  Aghast!

That would still result in duplicates.

To address that situation, advanced mod merging would have to be implemented and detect object individual token +/- changes

update on interdiff combining patches
btw, I'm not having much luck merging patch files.  I think I have them "merged" but can never get them to apply correctly.  I know it can be done with github somehow and auto merge conflicts.  However, trying to get something like that done outside of github in diff patch gnu tools, idk.

I'm going to write up a stackexchange problem specifically targeting github users and seeing if there's an equivalent way of doing the same thing at the command line.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 12:54:20 pm
All right, I've got a little pack of minor mods with potentially interesting interactions uploaded now. http://dffd.wimbli.com/file.php?id=9443 .

These features coexist happily in my raws, but may be a little challenging to merge together automatically, so I hope you find them useful :).
Thanks, looking at it now. Already clear to me that two mods that add the same file cannot be trivially merged.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 12:56:13 pm

This still doesn't do a good job with re-ordering things.  To get that, you'd probably have to nuke the vanilla file completely and replace it using a different filename.  The launcher would detect that one mod's changes got deleted by another, but absent some advanced raw-aware logic it just isn't possible to merge the two.  That advanced logic comes... later.
We've been talking about this in the abstract, but is there a test case of two mods that we'd like to be able to merge with one doing complex reordering, and the other adding or removing stuff? At the very least we need to make sure that we properly detect that the mods can't be naively merged.

One solution to this problem (and what I asked the RawExplorer author a while back to incorporate) was object tag-id alphabetizing.

However, comments kind of create havoc with that.  However, if WE DID ALL THIS ON OUR END.  We could alphabetize our flattened raws based on some parse token-id trick.

The great thing about it is, the FIRST <token> is our object:type, so immediately, we can determine what we should be alphabetizing on.
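
A very rough sketch of that alphabetizing, assuming the top-level token has the same name as the object type (true for CREATURE, not for every raw file). sort_raw_objects is a hypothetical helper, and any comment lines simply travel with the object above them, which is exactly the havoc mentioned above.

```python
import re

def sort_raw_objects(text):
    """Sort top-level objects in one raw file by their [TYPE:ID] line."""
    object_type = None
    header, objects = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if object_type is None:
            # Everything up to and including [OBJECT:TYPE] is header.
            header.append(line)
            match = re.match(r'\[OBJECT:([^\]]+)\]', stripped)
            if match:
                object_type = match.group(1)
        elif stripped.startswith('[%s:' % object_type):
            objects.append([line])      # start of a new top-level object
        elif objects:
            objects[-1].append(line)    # belongs to the current object
        else:
            header.append(line)         # blank lines before the first object
    objects.sort(key=lambda obj: obj[0].strip())
    return '\n'.join(header + [line for obj in objects for line in obj])
```

Running vanilla and every mod through the same normalization before diffing would mean pure reordering no longer shows up as a change.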
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 01:03:35 pm
"Two mods adding the same tag to the same creature might cause problems"
Couldn't this type of conflict be caught by reading the two patch files and being like, "hey, BOTH OF THESE are adding the same token in the same contextual match".

Of course it might be a little simplistic.

Two mods might be incorporating modest mod fixes but reordered the tokens!  Aghast!

That would still result in duplicates.

To address that situation, advanced mod merging would have to be implemented and detect object individual token +/- changes
If two mods add the same content in the same spot, we may be able to detect that and allow it. In the general case there's the possibility that they are changing the same token in incompatible ways.

EDIT: Button's mods modify c_variation_default.txt by adding the same content in two mods. So that's a test case for that.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 01:12:59 pm

This still doesn't do a good job with re-ordering things.  To get that, you'd probably have to nuke the vanilla file completely and replace it using a different filename.  The launcher would detect that one mod's changes got deleted by another, but absent some advanced raw-aware logic it just isn't possible to merge the two.  That advanced logic comes... later.
We've been talking about this in the abstract, but is there a test case of two mods that we'd like to be able to merge with one doing complex reordering, and the other adding or removing stuff? At the very least we need to make sure that we properly detect that the mods can't be naively merged.

One solution to this problem (and what I asked the RawExplorer author a while back to incorporate) was object tag-id alphabetizing.

However, comments kind of create havoc with that.  However, if WE DID ALL THIS ON OUR END.  We could alphabetize our flattened raws based on some parse token-id trick.

The great thing about it is, the FIRST <token> is our object:type, so immediately, we can determine what we should be alphabetizing on.
You'd have to figure out top level tokens for that. That's not in PeridexisErrant's version 1 objectives. It also means you have to track two kinds of changes separately: those that add/remove/reorder top level tokens, and those that modify existing entities.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 01:43:37 pm
All right, I've got a little pack of minor mods with potentially interesting interactions uploaded now. http://dffd.wimbli.com/file.php?id=9443 .

These features coexist happily in my raws, but may be a little challenging to merge together automatically, so I hope you find them useful :).
So as an update on this here's what I found:
1)the two larger mods modify creature_bug_slug_new.txt in incompatible ways.
2)the two larger mods modify c_variation_default.txt with the same changes, which should be ok.
3)two mods add body_snail.txt, which cannot be allowed
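
Case 3 is the easiest one to catch before any merging happens. A minimal sketch, assuming each mod is represented as a set of paths relative to its raw folder; list_raw_files and files_added_by_several_mods are names invented for this example.

```python
import os

def list_raw_files(folder):
    """Return the relative paths of all files under folder, using '/' separators."""
    found = set()
    for root, _dirs, files in os.walk(folder):
        for name in files:
            full = os.path.join(root, name)
            found.add(os.path.relpath(full, folder).replace(os.sep, '/'))
    return found

def files_added_by_several_mods(vanilla_files, mods):
    """mods maps mod name -> set of paths. Return {path: [mods that add it]}."""
    added = {}
    for name in sorted(mods):
        # A file counts as "added" if vanilla does not have it at all.
        for path in mods[name] - vanilla_files:
            added.setdefault(path, []).append(name)
    return dict((path, names) for path, names in added.items() if len(names) > 1)
```

Any path that comes back from the second function would be refused outright, matching the "cannot be allowed" rule above.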
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 01:53:52 pm
So... If I can get an answer to this... maybe... just maybe... I can do a CL equivalent.

http://stackoverflow.com/questions/25388806/github-3-way-merge-patch

If the only solution is a github solution and manual editing of raws... then nm.  But hopefully, if this works by merging non conflicting merge conflicts, then maybe we can merge say plantfixes and direforge.

non conflicting meaning, things like different tokens being written to the same line should equal write both to their own lines.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 02:02:32 pm
EDIT: Didn't understand the problem.

The short answer is no, because we want to be conservative about allowing conflicting mods. If you don't care about failing on true conflicts, then you may have some luck making a smarter merge tool that sometimes does the wrong thing.

I'm working on a 3 way merge script atm, (testing it at this point) but I don't plan to make it too fancy, because I don't want to merge mods that really do conflict.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 02:09:56 pm
So as an update on this here's what I found:
2)the two larger mods modify c_variation_default.txt with the same changes, which should be ok.

Really? They should have compatible but different changes, in different locations.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 02:18:52 pm
So as an update on this here's what I found:
2)the two larger mods modify c_variation_default.txt with the same changes, which should be ok.

Really? They should have compatible but different changes, in different locations.
Whoops. Good catch. You're right. Changes are in different locations, so there is no problem.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 02:40:25 pm
EDIT: Didn't understand the problem.

The short answer is no, because we want to be conservative about allowing conflicting mods. If you don't care about failing on true conflicts, then you may have some luck making a smarter merge tool that sometimes does the wrong thing.

I'm working on a 3 way merge script atm, (testing it at this point) but I don't plan to make it too fancy, because I don't want to merge mods that really do conflict.

I'm sorry, I've been keeping up with like 80% of the posts since about page 5 or 6, I parsed through page 5 and didn't see where you reference n way merges.  The thought is familiar, I'd love to see your solution.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 02:51:24 pm
EDIT: Didn't understand the problem.

The short answer is no, because we want to be conservative about allowing conflicting mods. If you don't care about failing on true conflicts, then you may have some luck making a smarter merge tool that sometimes does the wrong thing.

I'm working on a 3 way merge script atm, (testing it at this point) but I don't plan to make it too fancy, because I don't want to merge mods that really do conflict.

I'm sorry, I've been keeping up with like 80% of the posts since about page 5 or 6, I parsed through page 5 and didn't see where you reference n way merges.  The thought is familiar, I'd love to see your solution.
N-way merge algorithm here:
http://www.bay12forums.com/smf/index.php?topic=142295.msg5579007#msg5579007

Basically just keep doing a 3 way merge between the mod being added, vanilla, and the merge of all previous mods.
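
In code, that loop is just a fold over the mods. This is a sketch, not the linked algorithm verbatim: merge_all is a made-up name, and it assumes a merge3(mod, vanilla, merged_so_far) function that returns None when it refuses to merge. The whole_file_merge3 below is a deliberately crude stand-in so the loop can be run at all.

```python
def whole_file_merge3(mod_text, vanilla_text, merged_text):
    """Crude stand-in: only merges when at most one side differs from vanilla."""
    if merged_text == vanilla_text:
        return mod_text
    if mod_text == vanilla_text or mod_text == merged_text:
        return merged_text
    return None  # both sides changed the file differently: refuse

def merge_all(vanilla_text, mods, merge3=whole_file_merge3):
    """mods is a list of (name, text). Returns (merged, applied, rejected)."""
    merged = vanilla_text
    applied, rejected = [], []
    for name, mod_text in mods:
        # Always diff against vanilla, and merge into everything applied so far.
        result = merge3(mod_text, vanilla_text, merged)
        if result is None:
            rejected.append(name)
        else:
            merged = result
            applied.append(name)
    return merged, applied, rejected
```

A finer-grained three-way merge could be passed in as merge3 without changing the loop.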
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 03:26:38 pm
I looked at the pseudocode.

My hope and goal was not to spend too much effort on testing merge conflicts.

But rather, to do a simple if check, if the point of conflict is due to the patch file assuming the source was unchanged

If the patch file can somehow just find the correct INSERTION point.  It could insert the data as long as the data it was inserting was not specifically REMOVED.  I think that can be adopted somehow, somewhere, not sure exactly though.

The case with direforge and the fixplants is the case I brought up.  Button said those two could have been merged automatically.  Github has a feature like that, so does tortoisegitmerge.  To lay one set of changes on top of the other.

This is what I was thinking.  Suppose there were a simple check at the point of insertion to verify whether the conflict arose only because the patch file assumed the insertion point looked one way but found something else there (because the patches are applied sequentially).  I would think you could check the original vanilla raws: if the vanilla raws match what the patch file expects, and the lines just before the insertion point still match, then maybe just insert at the correct insertion point anyway?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 03:36:11 pm
If two mods add content to the same spot, like to the same creature, there is a high chance that the merges conflict, especially if they also remove a tag in that spot.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 03:37:34 pm
so you want to avoid those merges?

if the contents of the patch don't match and/or conflict with the contents of the other patch that affect the same place, I think it could safely be applied.  The data would be within the two patch files.

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 03:38:19 pm
If two mods add content to the same spot, like to the same creature, there is a high chance that the merges conflict, especially if they also remove a tag in that spot.

The question is, though, what about two mods that add content to the same spot because that's the natural place to add things? Like, at the end of a creature variation definition for instance. It's usual to add new tags at the end of the variation, not in the middle somewhere.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 03:40:00 pm
If two mods add content to the same spot, like to the same creature, there is a high chance that the merges conflict, especially if they also remove a tag in that spot.

The question is, though, what about two mods that add content to the same spot because that's the natural place to add things? Like, at the end of a creature variation definition for instance. It's usual to add new tags at the end of the variation, not in the middle somewhere.

that's my point.  If the two patch files based on vanilla are MERELY ADDITIVE (for this Use Case), then I think both can be additive.

If one is SUBTRACTIVE, and the other doesn't subtract it.  Then... that could be cause for alarm, but I would think to subtract it if one mod did and the other didn't.

But...

if one added a line, and the other removed a line FROM VANILLA.  that would be a huge red flag, this mod can't be applied without manual intervention.

I think if one could keep track of the changes between two different modded versions, and figure if they added or deleted current vanilla tokens... (when the patch is being loaded).

Then that logic could be carried forward into insertion/deletions.

It would/could cause duplicates though.
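
A minimal sketch of that additive/subtractive check, done on line lists with difflib rather than by parsing patch files. classify_changes and its four labels are assumptions for this example, and a modified line counts as both a removal and an addition.

```python
import difflib

def classify_changes(vanilla_lines, mod_lines):
    """Label a mod's changes to one file relative to vanilla."""
    matcher = difflib.SequenceMatcher(None, vanilla_lines, mod_lines, autojunk=False)
    tags = set(op[0] for op in matcher.get_opcodes())
    adds = 'insert' in tags or 'replace' in tags
    removes = 'delete' in tags or 'replace' in tags
    if adds and removes:
        return 'mixed'
    if adds:
        return 'additive'
    if removes:
        return 'subtractive'
    return 'unchanged'
```

This only says what kind of change a mod makes; it does not say whether two additive mods are safe to stack.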
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 04:01:28 pm
Think about this simple case: 1 mod adds a creature specific token at the end of a creature definition. another adds a creature. If you add them in the wrong order, you could get the creature specific token to the wrong creature if you don't order the merge right.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 04:06:13 pm
If two mods add content to the same spot, like to the same creature, there is a high chance that the merges conflict, especially if they also remove a tag in that spot.

The question is, though, what about two mods that add content to the same spot because that's the natural place to add things? Like, at the end of a creature variation definition for instance. It's usual to add new tags at the end of the variation, not in the middle somewhere.
Then they might add conflicting content. I'd err on the side of safety and not allow the modification. Unless they are adding the same content, in which case you don't want to add it twice.
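
That rule could be sketched like this: find where each mod inserts lines relative to vanilla, then sort the shared spots into "same content" and "different content". The function names are invented here, and only pure insertions are looked at; changed or deleted lines would need their own check.

```python
import difflib

def insertions_by_position(vanilla_lines, mod_lines):
    """Map vanilla line position -> lines the mod inserts there."""
    matcher = difflib.SequenceMatcher(None, vanilla_lines, mod_lines, autojunk=False)
    return dict((i1, mod_lines[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag == 'insert')

def check_same_spot(vanilla_lines, mod_a, mod_b):
    """Return (duplicates, conflicts): positions where both mods insert lines."""
    a = insertions_by_position(vanilla_lines, mod_a)
    b = insertions_by_position(vanilla_lines, mod_b)
    duplicates, conflicts = [], []
    for pos in sorted(set(a) & set(b)):
        if a[pos] == b[pos]:
            duplicates.append(pos)   # same content: keep one copy
        else:
            conflicts.append(pos)    # different content: refuse to merge
    return duplicates, conflicts
```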
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 19, 2014, 04:17:01 pm
How automated do you want to have it? I have merged dozens of mods by hand, it's not that much work...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 04:25:07 pm
How automated do you want to have it? I have merged dozens of mods by hand, it's not that much work...
The idea is to have it completely automated and able to merge and remove mods.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 19, 2014, 04:28:22 pm
What happens to mutually exclusive mods?

One removes kimberlite, the other adds new uses for diamonds (glass cutting for example). No kimberlite = No diamonds. While the Raws would fit and the program would merge them, you would have an unusable mod in the end.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 05:01:17 pm
Think about this simple case: 1 mod adds a creature specific token at the end of a creature definition. another adds a creature. If you add them in the wrong order, you could get the creature specific token to the wrong creature if you don't order the merge right.

I think some simple way to either derive a precombined patch

or to keep track of line changes as you add/remove lines would be in order to address this concern

Basically, you would check against vanilla, and determine if the changes are merely additive or subtractive.

Then if they are additive, it's all green light.  Merge with rest of patches (barring any duplicate line additions?)

Yeah, it gets complicated apparently...

I was just assuming [by] always checking a patch against vanilla [we] would [edit:] be basing [the] "Additive" part of the mod [against Vanilla]; and if you could somehow map the BEFORE and AFTER line states of working mod vs vanilla, you could somehow accommodate future line injections based on tracking insertion points or something.

but apparently, it gets complicated, and one might as well do some object tracking at that point.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 06:10:04 pm
Think about this simple case: 1 mod adds a creature specific token at the end of a creature definition. another adds a creature. If you add them in the wrong order, you could get the creature specific token to the wrong creature if you don't order the merge right.

I think some simple way to either derive a precombined patch

or to keep track of line changes as you add/remove lines would be in order to address this concern

Basically, you would check against vanilla, and determine if the changes are merely additive or subtractive.

Then if they are additive, it's all green light.  Merge with rest of patches (barring any duplicate line additions?)

Yeah, it gets complicated apparently...

I was just assuming always checking a patch against vanilla would the base "Additive" part of the mod.  And if you could somehow map the BEFORE and AFTER line states of working mod vs vanilla, you could somehow accommodate future line injections based on tracking insertion points or something.

but apparently, it gets complicated, and one might as well do some object tracking at that point.
One mod adds [PET] at the end of a vanilla creature. Another mod adds a new creature at the same spot. Both are merely additive. But if merged in the wrong way, the result would add [PET] to the new creature instead of the vanilla creature.
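
A tiny demonstration of that ordering problem. The creature names and tokens are just an illustration, and merge_same_spot and owner_of are made-up helpers.

```python
def merge_same_spot(lines, spot, first_block, second_block):
    """Insert two blocks at the same vanilla position, in the given order."""
    return lines[:spot] + first_block + second_block + lines[spot:]

def owner_of(lines, token):
    """Return the nearest [CREATURE:...] line above the given token."""
    owner = None
    for line in lines:
        if line.startswith('[CREATURE:'):
            owner = line
        if line == token:
            return owner
    return None

vanilla = ['[CREATURE:DOG]', '[NATURAL]']
pet_mod = ['[PET]']                                     # meant for the vanilla creature
creature_mod = ['[CREATURE:WOLF_DIRE]', '[LARGE_PREDATOR]']
spot = len(vanilla)                                     # both mods append at the end

good = merge_same_spot(vanilla, spot, pet_mod, creature_mod)
bad = merge_same_spot(vanilla, spot, creature_mod, pet_mod)
```

Both results contain exactly the same lines, but in the second one [PET] has moved to the new creature, and nothing in a plain line diff says which order was intended.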
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 06:20:42 pm
What happens to mutually exclusive mods?

One removes kimberlite, the other adds new uses for diamonds (glass cutting for example). No kimberlite = No diamonds. While the Raws would fit and the program would merge them, you would have an unuseable mod in the end.
That's a rather benign error. The result is still a valid set of raws. So it's not a big problem.

Also, since the primary use of this tool is for including a lot of minor mods, hopefully 2 minor mods that only make those changes would obviously conflict from their description, so nobody would actually try to do that.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 19, 2014, 07:05:21 pm
Here's the code to merge mod files in Python. It merges non-conflicting sequences as proven by the previously published can_merge_seq function.

Neither of the functions can handle two mods adding the same content yet. Also, this still needs to be hooked up to PeridexisErrant's code.
Code: [Select]
import difflib

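def do_merge_seq (mod_text, vanilla_text, gen_text):
    """Merge mod_text into gen_text, using vanilla_text as the common base.

    gen_text is the output generated so far (vanilla plus earlier mods).
    Assumes can_merge_seq has already shown the two sets of changes
    do not conflict.
    """
    if vanilla_text == gen_text: #this should happen often
        return mod_text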
    van_mod_match = difflib.SequenceMatcher(None, vanilla_text, mod_text)   
    van_gen_match = difflib.SequenceMatcher(None, vanilla_text, gen_text)   
   
    van_mod_ops = van_mod_match.get_opcodes()
    van_gen_ops = van_gen_match.get_opcodes()
   
    output_file_temp = []   
    cur_v = 0
    while cur_v < len(vanilla_text) :
        (mod_tag, mod_i1, mod_i2, mod_j1, mod_j2) = van_mod_ops[0]
        (gen_tag, gen_i1, gen_i2, gen_j1, gen_j2) = van_gen_ops[0]
        #print van_mod_ops[0]
        #print van_gen_ops[0]
        #print cur_v
        if mod_tag == 'equal' and gen_tag == 'equal' :
            if mod_i2 < gen_i2:
                output_file_temp += vanilla_text[cur_v:mod_i2]
                cur_v = mod_i2
                van_mod_ops.pop(0)
            else:
                output_file_temp += vanilla_text[cur_v:gen_i2]
                cur_v = gen_i2
                van_gen_ops.pop(0)
                if mod_i2 == gen_i2 :
                    van_mod_ops.pop(0)           
        else:                                                 
            if mod_tag != 'equal' :
                output_file_temp += mod_text[mod_j1:mod_j2]
                cur_v = mod_i2
                van_mod_ops.pop(0)
                if mod_i2 == gen_i2 :
                    van_gen_ops.pop(0)   
            elif gen_tag!='equal':
                output_file_temp += gen_text[gen_j1:gen_j2]
                cur_v = gen_i2
                van_gen_ops.pop(0)
                if mod_i2 == gen_i2 :
                    van_mod_ops.pop(0)   
            #if neither gen_tag nor mod_tag is 'equal', this mod can't be merged
        #print (output_file_temp)
    if van_mod_ops:
        (mod_tag, mod_i1, mod_i2, mod_j1, mod_j2) = van_mod_ops[0]
        output_file_temp += mod_text[mod_j1:mod_j2]
    if van_gen_ops:
        (gen_tag, gen_i1, gen_i2, gen_j1, gen_j2) = van_gen_ops[0]
        output_file_temp += gen_text[gen_j1:gen_j2]
    return output_file_temp

vanilla_file = "vanilla"
output_file = "vanilla"

print ("----")
print (output_file)
output_file = do_merge_seq ('anything at all', vanilla_file, output_file)
print (output_file)
print ("----")
output_file = vanilla_file
print (output_file)
output_file = do_merge_seq ('nilla', vanilla_file, output_file)
print (output_file)
output_file = do_merge_seq ('vanill', vanilla_file, output_file)
print (''.join(output_file))
print ("----")
output_file = vanilla_file
print (output_file)
output_file = do_merge_seq ('vonilla', vanilla_file, output_file)
print (output_file)
output_file = do_merge_seq ('banana', vanilla_file, output_file)
print (''.join(output_file))
print ("----")
output_file = vanilla_file
print (output_file)
output_file = do_merge_seq ('banana', vanilla_file, output_file)
print (output_file)
output_file = do_merge_seq ('vonilla', vanilla_file, output_file)
print (''.join(output_file))
print ("----")
output_file = vanilla_file
print (output_file)
output_file = do_merge_seq ('banana', vanilla_file, output_file)
print (output_file)
output_file = do_merge_seq ('vanilla', vanilla_file, output_file)
print (''.join(output_file))
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: hermes on August 19, 2014, 07:13:47 pm
I don't understand why you guys are beating around the bush with this.  The way PE wants the version 1 to work you should be able to detect all these conflicts but you have to refuse the merge each time.  There is no way a blind diff patch can merge any mod that takes something away, guaranteed error free.

Appealing to PE again... you're in a much better position than I ever have been to impose a standard for mods, because the starter pack you curate is so popular.  I don't see what the rush is, if you spend some time trying to set up a single, decent system that is flexible enough to handle the diverse range of DF mods, that would be great.

However, I do agree that there should be a non-manifest way of merging, so I'd go the route of making a two tier mod-integration system where mods without a manifest are given lower priority and are more readily excluded if they remove objects from the raws.  As everyone seems to be slowly realizing, there are so many potential conflicts you have to go all the way to actually make the thing work properly.

That said I feel like I'm trolling a bit, so I'll bow out here for now, I guess you guys are going in a different direction than I'd anticipated.  Really hope it works out, good luck!   :)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 19, 2014, 07:58:41 pm
https://github.com/PeridexisErrant/Py-Mod-Loader

Now on GitHub, for easier collaboration and code compilation.  Note that the readme etc is not at all finished, but eventually I hope that'll have the definitive standard descriptions.

I don't understand why you guys are beating around the bush with this.  The way PE wants the version 1 to work you should be able to detect all these conflicts but you have to refuse the merge each time.  There is no way a blind diff patch can merge any mod that takes something away, guaranteed error free.

Appealing to PE again... you're in a much better position than I ever have been to impose a standard for mods, because the starter pack you curate is so popular.  I don't see what the rush is, if you spend some time trying to set up a single, decent system that is flexible enough to handle the diverse range of DF mods, that would be great.

However, I do agree that there should be a non-manifest way of merging, so I'd go the route of making a two tier mod-integration system where mods without a manifest are given lower priority and are more readily excluded if they remove objects from the raws.  As everyone seems to be slowly realizing, there are so many potential conflicts you have to go all the way to actually make the thing work properly.

That said I feel like I'm trolling a bit, so I'll bow out here for now, I guess you guys are going in a different direction than I'd anticipated.  Really hope it works out, good luck!   :)

It's a problem all right.  The solution seems to be either living with some working merges that produce non-functional raws, disallowing removals, only merging one mod, or full raw comprehension.  I think occasional errors is probably the way to go, and just warn people to be careful with mods.  I like the idea of trusting mods with a (non-tool-written) manifest more, we could do good stuff with that. 

There's no way I'm including a non-final system in the starter pack; that way lies madness through support enquiries.  You're probably right about imposing a standard - I don't want to end up making the IE6 of mods though, so there's a lot of care required.  No rush at all. 

I see no trolling, and it's good to hear an alternative view - especially from someone who can see us falling into traps.  Please stick around! 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 19, 2014, 08:56:01 pm
Well, the beauty of a mod system is this.

If you keep it to simple things like base ascii game changes, and offer as many mods as you can.  You keep the mods down to their ascii elements and allow players to choose from a wide range of mods that are applied on a batch level.

I mean, players can simply just check "one" mod and play that without ever wanting to merge them.

Barring most likely dfhack, unless of course you have dfhack and the flattening of raw files doesn't break dfhack.

and maybe last you offer a tileset change on top.

The "mix" of mods can be left up to the community.  Supposedly the system can work as long as one derives a diff from base to their mod, then the mod can be applied using this patch system.

So in the end, regardless of whether the "merging of mods" works right or not, you're allowing modders to port/be included in a mod system by merely supplying patches.  What's great about that is it's possible to extract patches ourselves and submit them using a git system.

If players so choose, the "merge" process can come with a disclaimer and "no silver bullet" kind of situation.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 19, 2014, 09:00:51 pm
I'm still completely in agreement with Hermes by the way, I just think that it's a conclusion y'all won't come around to until you've grappled with the problem a bit more :P. Until then, I'll help the diff-based approach as best I can.

I was thinking I might write a last-pass errorchecker which could do some basic raw comprehension and catch errors which the diff might miss - finding dupes across files, reactions with no possible reagents, things like that. Could be expanded into a more fully-functional syntactic parser later. It'll give me an excuse to pick up Python again - I've been stuck doing Java at work.
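
A rough sketch of the "dupes across files" part of such a checker. The regex, the OBJECT_TYPES subset and the function name find_duplicates are all assumptions made for illustration; real raws have more object types and edge cases than this handles.

```python
import re
from collections import defaultdict

# Object-defining tokens to check; an illustrative subset, not a complete list.
OBJECT_TYPES = {"CREATURE", "REACTION", "PLANT", "INORGANIC", "ENTITY"}
# Matches two-part tokens such as [CREATURE:DOG].
TOKEN_RE = re.compile(r"\[([A-Z_]+):([^\[\]:]+)\]")


def find_duplicates(files):
    """files maps filename -> raw text.

    Returns {(type, id): [filenames]} for every object defined more than once.
    """
    seen = defaultdict(list)
    for filename, text in files.items():
        for obj_type, obj_id in TOKEN_RE.findall(text):
            if obj_type in OBJECT_TYPES:
                seen[(obj_type, obj_id)].append(filename)
    return {key: names for key, names in seen.items() if len(names) > 1}


sample = {
    "creature_standard.txt": "[OBJECT:CREATURE]\n[CREATURE:DOG]\n[PET]\n",
    "creature_modded.txt": "[OBJECT:CREATURE]\n[CREATURE:DOG]\n[CREATURE:WOLF]\n",
}
print(find_duplicates(sample))
```

Checks like "reactions with no possible reagents" need actual raw comprehension and are not covered by a token scan like this.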
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 19, 2014, 09:39:43 pm
I was thinking I might write a last-pass errorchecker which could do some basic raw comprehension and catch errors which the diff might miss - finding dupes across files, reactions with no possible reagents, things like that. Could be expanded into a more fully-functional syntactic parser later. It'll give me an excuse to pick up Python again - I've been stuck doing Java at work.

You know, I was thinking that a post-processing pass might be the best way to catch some of this stuff, but I'm nowhere near making it happen.  That would be awesome!

I make no comment on the accuracy of observations that we may have underestimated the problem just a little  :o
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 20, 2014, 07:41:44 am
This short script should extract every single token in a file. Maybe someone will find it useful.
Code: [Select]
import re
import sys

if len(sys.argv) < 2:
    filepath = str(input("FILEPATH> "))
else:
    filepath = sys.argv[1]


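def reader(filepath):
    tokens = []
    name = ""
    # open the file that was asked for, not a hard-coded test file
    with open(filepath, 'r') as file:
        for line in file:
            if name == "":
                # the first line of a raw file is its name
                name = line.strip()
            # a token is anything between square brackets; this also catches
            # tokens containing underscores, spaces or other punctuation
            for i in re.findall(r"\[[^\[\]]*\]", line):
                tokens.append(i)
    return (name, tokens)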


def main():
    name, tokens = reader(filepath)
    for i in tokens:
        print(i)

if __name__ == "__main__":
    main()


BTW why not use something like robocopy /MIR or cp -rvu for major mods?
Or just add a small manifest file with information about compatibility. The number of major mods is not that big after all.
There are maybe 10 really big mods and no one will try to merge Masterwork with MLP  :P

Code: [Select]
{
  "name":<name of the mod>,
  "version":<version of the mod>,
  "compatible-major":{
     "name-of-major-mod":"version-of-major-mod"
   }
}

Something like this. Where * is wildcard.

{
  "name":"Mycrazymod",
  "version":"0.0.1",
  "compatible-major": {
     "MasterworkDF":"5.*"
   }
}
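
A small sketch of how a launcher might read such a manifest, assuming "compatible-major" means "this mod may be stacked on that major mod when the version matches the pattern". The function name is_compatible and that interpretation are assumptions; the wildcard matching uses the standard library's fnmatch.

```python
import json
from fnmatch import fnmatchcase

MANIFEST = """
{
  "name": "Mycrazymod",
  "version": "0.0.1",
  "compatible-major": {
    "MasterworkDF": "5.*"
  }
}
"""


def is_compatible(manifest_text, major_name, major_version):
    """True if the manifest lists major_name with a pattern matching its version."""
    manifest = json.loads(manifest_text)
    pattern = manifest.get("compatible-major", {}).get(major_name)
    if pattern is None:
        # Major mod not mentioned at all: treated as not known to be compatible.
        return False
    return fnmatchcase(major_version, pattern)


print(is_compatible(MANIFEST, "MasterworkDF", "5.09"))  # True
print(is_compatible(MANIFEST, "MasterworkDF", "4.2"))   # False
```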


For smaller mods you insist on merging the raws themselves. Why not merge the diffs first?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 20, 2014, 08:15:36 am
Update5
Asking if it's possible to apply nway merging on a patch level
http://stackoverflow.com/questions/25408998/octopus-merge-using-patch-files

Update4
Results of merge of advciv, plantfixes, and direforge (ascii) & flattened here:
https://github.com/thistleknot/df_40_09_Flattened/commits/3-way

I can do 3 way merges of conflicting mods if they share a common ancestor.
Code: [Select]
C:\Games\Dwarf Fortress\mod testbed\df_40_09_win [3-way]> git branch 40_09_DireForge-Flattened f18f255
C:\Games\Dwarf Fortress\mod testbed\df_40_09_win [3-way]> git merge 3459e01
Auto-merging raw/objects/plant_standard.txt
CONFLICT (content): Merge conflict in raw/objects/plant_standard.txt
Auto-merging raw/objects/entity_default.txt
CONFLICT (content): Merge conflict in raw/objects/entity_default.txt
Automatic merge failed; fix conflicts and then commit the result.
C:\Games\Dwarf Fortress\mod testbed\df_40_09_win [3-way +48 ~23 -0 !2 | +0 ~0 -0 !2]>
It's a git branching issue.  I DO NOT KNOW if it can be done offline.  What this really is, is a way for me to merge a ton of mods now.

Update3
Figured out how to do an octopus merge. It was about ensuring I had a common ancestor to base it on (I didn't realize anything about a common ancestor until I looked at the rejected merge* files that served as the 3-way comparison), and the common ancestor HAD TO BE flattened vanilla, which was a bit of work because the mods are applied, then flattened.  So my prior common ancestor was non-flattened ascii vanilla...  But I rebranched based on flattened ascii, replaced with already flattened advciv and plantfixes, and I was able to merge two mods together at the same time.
Code: [Select]
C:\Games\Dwarf Fortress\mod testbed\df_40_09_win [3-way +2 ~0 -0 !]> git merge 514bde0 13a563f
Trying simple merge with 514bde0
Trying simple merge with 13a563f
Merge made by the 'octopus' strategy.
 raw/objects/creature_darkdwarf.txt        | 363 +++++++++++++
 raw/objects/creature_firedwarf.txt        | 363 +++++++++++++
 raw/objects/creature_frostdwarf.txt       | 362 +++++++++++++
 raw/objects/creature_stonedwarf.txt       | 362 +++++++++++++
 raw/objects/creature_stormdwarf.txt       | 362 +++++++++++++
 raw/objects/creature_wilddwarf.txt        | 362 +++++++++++++
 raw/objects/descriptor_shape_standard.txt |  25 +
 raw/objects/entity_darkdwarf.txt          | 821 +++++++++++++++++++++++++++++
 raw/objects/entity_default.txt            |   9 +
 raw/objects/entity_firedwarf.txt          | 815 +++++++++++++++++++++++++++++
 raw/objects/entity_frostdwarf.txt         | 817 +++++++++++++++++++++++++++++
 raw/objects/entity_stonedwarf.txt         | 822 ++++++++++++++++++++++++++++++
 raw/objects/entity_stormdwarf.txt         | 815 +++++++++++++++++++++++++++++
 raw/objects/entity_wilddwarf.txt          | 816 +++++++++++++++++++++++++++++
 raw/objects/plant_crops.txt               | 218 +++++---
 raw/objects/plant_garden.txt              | 169 +++++-
 raw/objects/plant_standard.txt            | 128 ++++-
 raw/objects/reaction_plantfix.txt         |  92 ++++
 18 files changed, 7618 insertions(+), 103 deletions(-)
 create mode 100644 raw/objects/creature_darkdwarf.txt
 create mode 100644 raw/objects/creature_firedwarf.txt
 create mode 100644 raw/objects/creature_frostdwarf.txt
 create mode 100644 raw/objects/creature_stonedwarf.txt
 create mode 100644 raw/objects/creature_stormdwarf.txt
 create mode 100644 raw/objects/creature_wilddwarf.txt
 create mode 100644 raw/objects/entity_darkdwarf.txt
 create mode 100644 raw/objects/entity_firedwarf.txt
 create mode 100644 raw/objects/entity_frostdwarf.txt
 create mode 100644 raw/objects/entity_stonedwarf.txt
 create mode 100644 raw/objects/entity_stormdwarf.txt
 create mode 100644 raw/objects/entity_wilddwarf.txt
 create mode 100644 raw/objects/reaction_plantfix.txt
C:\Games\Dwarf Fortress\mod testbed\df_40_09_win [3-way]>

Update2
http://stackoverflow.com/questions/5292184/merging-multiple-branches-with-git
Apparently the first octopus merge didn't work very well, but a more indepth explanation is here:
Code: [Select]

    git checkout master
    git pull origin feature1 feature2
    git checkout develop
    git pull . master (or maybe git rebase ./master)

Quote
The first command changes your current branch to master.

The second command pulls in changes from the remote feature1 and feature2 branches. This is an "octopus" merge because it merges more than 2 branches. You could also do two normal merges if you prefer.

The third command switches you back to your develop branch.

The fourth command pulls the changes from local master to develop.

Hope that helps.

EDIT: Note that git pull will automatically do a fetch so you don't need to do it manually. It's pretty much equivalent to git fetch followed by git merge.

Update:
http://stackoverflow.com/questions/25388806/github-3-way-merge-patch

Someone may have given me an answer which echoes another person's answer on the same issue, Octopus Merge:
Code: [Select]
git merge B C
Quote
This short script should extract every single token in file. Maybe someone find it useful.
We already hit on token extraction in two versions, both sed regex commands: one that derives tokens using a regex like your example and then injects the filename on top, and another that puts tokens on their own lines, effectively keeping comments. Both versions remove all whitespace. :)

Please do tell. How does one combine diffs? I would love to know.

Combinediff combines two sequential commits of the same branch; however, these patches are all individual branches off the same base, vanilla. So far the best solution I've had is interdiff, and I have not figured out how to apply an interdiff merged patch file.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 20, 2014, 11:47:58 am
Silly question among all this talk of different technical problems and solutions: Has anyone asked around, written some PMs to mod authors about writing and packaging their mods in a standardized system?

Because I haven't seen a single modder here, besides Putnam and me, and both our usual projects are a bit large to be included as 'minor mods' that can be easily packaged.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 20, 2014, 11:53:26 am
Silly question among all this talk of different technical problems and solutions: Has anyone asked around, written some PMs to mod authors about writing and packaging their mods in a standardized system?

Because I havent seen a single modder here, besides Putnam and me, and both our usual projects are a bit large to be included as 'minor mods' that can be easily packaged.

What am I, chopped liver? :P
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 20, 2014, 12:01:32 pm
A button? :)

No, sorry, I don't connect your name with any mods I recall... and there are no threads in the mod release board by your name. So as unfortunate as it is, I have to declare you a button. ;)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 20, 2014, 12:05:49 pm
A button? :)

No, sorry, I dont connect your name with any mods I recall... and there are no threads in the mod release board by your name. So as unfortunate as it is, I have to declare you a button. ;)

Just cause I don't plaster my work all over my signature... :P

Plant Bugfix/Minor Mod (http://www.bay12forums.com/smf/index.php?topic=141440.0) is mine. You'll notice it has a thread on the mod release board. ;)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 20, 2014, 12:07:32 pm
If I am wrong somewhere in the middle please correct me.
The biggest diff issue is that, when you apply one mod as a diff file to vanilla, every other mod, or at least most of them, is invalidated.
That's because the line numbers change.
A 3-way diff sounds pretty nice, but the problem with conflicting mods still stays.

Of course the best thing is to have a context-dependent diff format, not a format that depends on line numbering.
Or a format that uses relative line numbering. For example, apply some patch n lines after point x.

And I think using git is not the best option here. Git is great, but IDK if this is the best use case for it, but if you insist. ;P


Why not just squash all patches into a single diff, like this: http://stackoverflow.com/questions/616556/how-do-you-squash-commits-into-one-patch-with-git-format-patch?rq=1
 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 20, 2014, 01:20:32 pm
Squash across branches?

squashing is for commits of the same branch.

we are aware that the lines and what not change with each successive patch btw.  The only way to address that is to do some type of octopus merge...  That's why I was hitting on this octopus merge solution.  If there was a way to ship git and derive patches from a base + a bunch of mods... that would be neat.  but I think it's a little much...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 20, 2014, 02:30:46 pm
Octopus Merges vs Sequential Patching

Contextual differences between mods will break when:

After flattening/parsing:

creatureA and creatureB look the same across, say, 10-20 lines of their token make-up.

So a mod comes in and INSERTS a new creature before creatures A and B.

So now the line references are messed up, and diff will most likely fall back on a contextual match at this point for any creature mods that affect creatureA or creatureB in the lines I mentioned above.

So a mod, say modCreatureB (which is a diff from vanilla:modCreatureB), comes in and mods creatureB at some odd lines.

Now the patch program, when applying modCreatureB, is confused since the line numbers have changed, so it does a contextual match.

This is where the issue arises: if creatureA and creatureB have similar token make-ups in the same area, the contextual match will most likely fall on the first match or something.
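
To make the ambiguity concrete, here is a small sketch that looks for every place a hunk's context lines occur. The function find_context and the toy raws are made up for illustration; real patch tools use fuzzier matching than this exact comparison.

```python
def find_context(lines, context):
    """Return every index where the run of context lines occurs in lines."""
    width = len(context)
    return [i for i in range(len(lines) - width + 1)
            if lines[i:i + width] == context]


raws = [
    "[CREATURE:A]", "[LARGE_ROAMING]", "[BENIGN]", "[NATURAL]",
    "[CREATURE:B]", "[LARGE_ROAMING]", "[BENIGN]", "[NATURAL]",
]

# Context taken from the middle of creatureB matches creatureA as well.
print(find_context(raws, ["[LARGE_ROAMING]", "[BENIGN]"]))  # [1, 5]

# Including the object header in the context makes the match unique.
print(find_context(raws, ["[CREATURE:B]", "[LARGE_ROAMING]", "[BENIGN]"]))  # [4]
```

More than one hit means the hunk cannot be placed safely by context alone, which is the argument for tracking the owning object.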

Solution
Keep track of lines changes (Which I believe King Mir said he was doing)

OR...
Better solution IMO (which requires some working version of offline git command line batch scripting, we would ship a repository with the front end tool basically).

Octopus Merges
The mod applied will always be compared with the original base vs the last mod applied.  When conflicts occur with other mods... the injection point will always be derived from looking at the base, and seeing what other changes are being made (insertion/deletion points), and MATCHING those [insertion/deletion] points to the before/after contextual locations of the original patch file makeup.  The reason this works is the way CONTEXTUAL DIFFS ARE OUTPUTTED.  They always show BEFORE/AFTER CHANGES.  As long as we can MATCH 1 of those sides.  We can insert based on one of those side's matches + also being able to reference the line and be able to keep track of any changes based on the way the other mods are applied and line changes are moved around.

I would think...

It's entirely possible that other mods change multiple tokens within an object at different spots, therefore destroying contextual matchings...?  Idk

However, ultimately.  If one wanted to, object tracking is what would resolve that.

Anyways, my point being, is if we can just match ONESIDE of the contextual match while ensuring we're at the right line #, we should be able to do a clean insert or delete option.

And on a final note:
Alphabetizing
I've been pondering this idea on how to alphabetize raws to address the issue of TC mods or mods that completely reorder the contents of the files.

It may be possible with regular expressions (sed, or python, or grep), but unfortunately, the solution I'm thinking of would break comments that are listed right before an object apart from that object... so, I'd rather just remove comments than try to parse them using some batch script.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 20, 2014, 02:57:09 pm
Highly detailed explanation

This is why I mentioned the idea of bookmarking an intermediate file, which fixes all issues with line numbers.  It does not perform well if two mods want to insert different stuff at the same point (it'd likely throw a false-positive conflict) and it completely craps its pants if the file is re-ordered, but it looks like it should be relatively simple and get us a fair percentage of what we need.

We can include a simple special case that two mods adding at the same point are not considered a conflict (important for adding custom buildings and reactions to entity_default.txt), but I don't think PE is looking to make the pinnacle of file merger technology especially if manifest files are adopted.  A manifest can declare dependencies and known conflicts and "recommended" load orders relative to other specific mods.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 20, 2014, 03:02:20 pm
Alphabetizing
I've been pondering this idea on how to alphabetize raws to address the issue of TC mods or mods that completely reorder the contents of the files.

It may be possible with regular expressions (sed, or python, or grep), but unfortunately, the solution I'm thinking of involves breaking comments that are listed right before an object would be broken apart from the next object... so, I'd rather just remove comments than try to parse them using some batch script.
A multi-line match for regular expressions should be able to scoop up everything between tags and associate it with the next valid tag.  The special cases are (1) the header up to the [OBJECT:FOO] token, (2) header comments after [OBJECT:FOO] but before the first [FOO:BAR] and (3) comments at the end of the file.  (1) and (3) should be easy.  We just have to pick a standard for (2), is it associated with the header or the first object?
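
A line-based sketch of that grouping (not a multi-line regex), written under a few assumptions: the object-starting token has the same name as the [OBJECT:...] type, which holds for CREATURE or ENTITY but not for every raw type; case (2) comments are attached to the first object; and case (3) comments are kept as a trailer. The names split_raw and alphabetize are invented here.

```python
import re

OBJECT_RE = re.compile(r"\[OBJECT:([A-Z_]+)\]")


def split_raw(text):
    """Split raw text into (header lines, [(object id, lines)], trailer lines)."""
    header, blocks, pending = [], [], []
    start_re = None
    for line in text.splitlines():
        if start_re is None:
            # Case (1): everything up to and including [OBJECT:FOO] is header.
            header.append(line)
            found = OBJECT_RE.search(line)
            if found:
                start_re = re.compile(r"\[" + found.group(1) + r":([^\]:]+)\]")
            continue
        start = start_re.search(line)
        if start:
            # Comments collected so far belong to the object that starts here.
            blocks.append((start.group(1), pending + [line]))
            pending = []
        elif "[" in line and blocks:
            # Another token of the current object, plus any comments above it.
            blocks[-1][1].extend(pending + [line])
            pending = []
        else:
            pending.append(line)
    # Case (3): whatever is left over is end-of-file comment.
    return header, blocks, pending


def alphabetize(text):
    """Reorder objects by id, keeping each object's leading comments with it."""
    header, blocks, trailer = split_raw(text)
    body = [line for _, lines in sorted(blocks, key=lambda b: b[0])
            for line in lines]
    return "\n".join(header + body + trailer)


sample = ("creature_test\n\n[OBJECT:CREATURE]\n\nwolf notes\n[CREATURE:WOLF]\n"
          "[PET]\n\ndog notes\n[CREATURE:DOG]\n[PET]\nend of file")
print(alphabetize(sample))
```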
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 20, 2014, 03:03:59 pm
well, I demonstrated earlier how the file_name can be injected back to the top of the .txt file... so that isn't a concern :)

If you say 1 and 3 are easy.

I think the inbetween ones can be matched to the last token they were adjacent to...

but... that only works BEFORE whitespace is removed.

If whitespace is removed... then there won't be blank lines inbetween tokens...

And... it's hard to say how to split comments.

I mean..

[token]

comment

comment

[object:bar]

can be tricky as well
Ultimately though, comments aren't important [for playing] other than for exporting merges.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 20, 2014, 07:57:17 pm
So someone got back to me on how a 3-way merge can be done using patch files. They said it was kind of hard to do without version control :) but that it would be possible with diff3 and the source folders of each version to be changed (which can be achieved by extracting and applying patch files).

http://stackoverflow.com/questions/25408998/octopus-merge-using-patch-files

Code: [Select]
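# Otherwise, a 3-way merge is probably going to require the source directories of each version and then in a shell script:

mkdir VersionABC
# find lists full relative paths (ls -R does not); paths with spaces are not handled
for i in $(cd VersionA && find . -mindepth 1); do
  if [ -f "VersionA/$i" ]; then
    # diff3 argument order is MINE OLDER YOURS, so the common ancestor goes in the middle
    diff3 --merge "VersionB/$i" "VersionA/$i" "VersionC/$i" > "VersionABC/$i"
  else
    mkdir -p "VersionABC/$i"
  fi
done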


I'm not 100% if they meant VersionB/$i VersionA/$i VersionC/$i vs ABC but it could be tested out.

Here's some documentation on diff3.

Looks like it does exactly what we have been concerned with, merge 2 descendents of a common ancestor.
http://www.chemie.fu-berlin.de/chemnet/use/info/diff/diff_8.html

Update
apparently diff3 sucks at merging.

It had LARGE swathes of sections that were repeated, so nm on that idea.

Actually, using the -m option, such as diff3 -m a b c > a-bc.patch seems to work pretty well.

Great tutorial/explanation of how diff3 works

http://www.thegeekstuff.com/2012/08/diff3-examples/
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Putnam on August 20, 2014, 09:11:25 pm
both our usual projects are a bit large to be included as 'minor mods' that can be easily packaged.

Pfah. My mod collection includes Sparking already. Including Masterwork stuff would simply be an ordeal rather than impossible at this stage.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 21, 2014, 12:00:03 am
Silly question among all this talk of different technical problems and solutions: Has anyone asked around, written some PMs to mod authors about writing and packaging their mods in a standardized system?

Because I havent seen a single modder here, besides Putnam and me, and both our usual projects are a bit large to be included as 'minor mods' that can be easily packaged.

I was basically thinking that there was enough feedback to ensure the basic concept worked, and then it would be better to get a working implementation before soliciting content.

The format is designed so that any simple / raw only mod - yeah, MW is unlikely to work - can just be dropped in.  Mods which go deliberately minimalist will obviously be more widely compatible for the merge stage, but there's plenty that works well enough for testing. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 21, 2014, 02:39:26 am
I've substantially refactored the code.  We now have one monster procedure that simplifies folders (deletes all files that aren't needed per our mod format), some placeholder input stuff to get an ordered list of mods to load, and then for each mod pass each filename to a merge function for one-file-at-a-time merging. 

So basically that function needs a lot more work, but the rest is at least good enough to work as a placeholder. 

Code: [Select]
import os
import shutil

def per_file_mod_merge_logic(vanilla_raw_folder, mod_raw_folder, mixed_raw_folder, file):
    # 'file' is appended directly to each folder path, so it must start with a separator
    if os.path.isfile(mixed_raw_folder + file):
        # preprocess files here
        pass
        # merge logic goes here
        # see https://docs.python.org/2/library/difflib.html
    else:
        pass
        # nothing for now, but later just copy file over:
        #shutil.copy(mod_raw_folder + file, mixed_raw_folder)

https://github.com/PeridexisErrant/Py-Mod-Loader

I had a look at the functions King Mir wrote earlier in the thread, but just got errors.  Do you mind having another look, and maybe trying to make them usable in the function above?  Once we get that working, we have a working prototype!

I'm also not quite sure what the final position on preprocessing of the files was - we have so many test snippets around I can't tell which would be good to use.  Thistleknot - could you post a description of how you think files should be processed before doing the diffs?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 21, 2014, 05:32:44 am
We still have problems with diff. Maybe we could use a small extension to the diff format, so that the script can find the correct line numbers and fill them in. That way we can easily generate a correct diff file without much additional work.
I think using git for mod management is overkill. The KISS rule wins ;P.

My idea is to use something like this:
Code: [Select]

Standard diff format looks like this

@@ -4,6 +4,6 @@
- something
+ something else
...

We can use something like that

@@ =[CREATURE:WOLVERINE]>[CASTE:MALE], 6 =[CREATURE:WOLVERINE]>[CASTE:MALE],6
-something
+something else
...


The part after the equals sign describes where the change will be placed.
The whole expression can be read as:
the first [CASTE:MALE] token after the [CREATURE:WOLVERINE] token.
Importantly, the modder doesn't even need to see these markers.
The script can find them and fill in the correct line numbers.
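
A minimal sketch of how a script might resolve such a marker to a line number, assuming the raws are already split into a list of lines. The function name `resolve_anchor` and the `>` separator handling are illustrative assumptions, not part of any existing tool.

```python
def resolve_anchor(lines, anchor):
    """Return the 1-based line number matched by an anchor such as
    '[CREATURE:WOLVERINE]>[CASTE:MALE]', or None if any step is missing.

    Hypothetical helper: each token is searched for after the previous match.
    """
    position = 0
    for token in anchor.split('>'):
        for index in range(position, len(lines)):
            if token in lines[index]:
                position = index + 1  # keep searching after this match
                break
        else:
            return None  # this token never appears after the previous one
    return position  # 1-based line number of the last matched token


raws = [
    "[CREATURE:WOLF]",
    "[CASTE:MALE]",
    "[CREATURE:WOLVERINE]",
    "[CASTE:FEMALE]",
    "[CASTE:MALE]",
]
print(resolve_anchor(raws, "[CREATURE:WOLVERINE]>[CASTE:MALE]"))  # 5
```

Note that this simple search does not stop at the next [CREATURE:...] token, so a missing caste could match inside a later creature; a real implementation would need to bound the search to the enclosing object.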

PROS:

CONS:
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 07:43:14 am
I've substantially refactored the code.  We now have one monster procedure that simplifies folders (deletes all files that aren't needed per our mod format), some placeholder input stuff to get an ordered list of mods to load, and then for each mod pass each filename to a merge function for one-file-at-a-time merging. 

So basically that function needs a lot more work, but the rest is at least good enough to work as a placeholder. 

Code: [Select]
def per_file_mod_merge_logic(vanilla_raw_folder, mod_raw_folder, mixed_raw_folder, file):
    if os.path.isfile(mixed_raw_folder + file):
        # preprocess files here
        pass
        # merge logic goes here
        # see https://docs.python.org/2/library/difflib.html
    else:
        pass
        # nothing for now, but later just copy file over:
        #shutil.copy(mod_raw_folder + file, mixed_raw_folder)

https://github.com/PeridexisErrant/Py-Mod-Loader

I had a look at the functions King Mir wrote earlier in the thread, but just got errors.  Do you mind having another look, and maybe trying to make them usable in the function above?  Once we get that working, we have a working prototype!
Spoiler: King Mir's Code (click to show/hide)
Yep, I was planning to do that already.

Quote
I'm also not quite sure what the final position on preprocessing of the files was - we have so many test snippets around I can't tell which would be good to use.  Thistleknot - could you post a description of how you think files should be processed before doing the diffs?
My position is to avoid preprocessing that changes the mod itself, but to do preprocessing during integration.

As for what's implemented, I think thistleknot figured out how to flatten raws, but it's in sh shell scripts and stray shell commands. But apparently it was helpful in updating his mods to 0.40, so we have a useful proof of concept. And you can lift the regular expressions from it.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 07:57:17 am
I would look into diff3 vs diff.  It might be a better way to merge.  If deprecated items are modded in 1st using diff3, and diff3 is always referring to a common ancestor, I think one might get better workable mods.  I tried it out with various text files as input and it does a pretty good job of cascading deletions.  It's just subtractive mods [I think] have to be applied first.

Code: [Select]
dir /b %2\raw\objects
mkdir VersionABC
mkdir VersionABC\raw
mkdir VersionABC\raw\objects
for /f %%f in ('dir /b %2\raw\objects') do diff3 --merge %1\raw\objects\%%f %2\raw\objects\%%f %3\raw\objects\%%f > VersionABC\raw\objects\%%f

Here's a working windows script
Code: [Select]
diff3Batch.bat accmod 34_11 civforge

Linux pseudocode that I derived it from
Code: [Select]
mkdir -p VersionABC
# find lists each directory before the files inside it
for i in $(cd VersionA && find . -mindepth 1); do
  if [ -f "VersionA/$i" ]; then
    diff3 --merge "VersionB/$i" "VersionA/$i" "VersionC/$i" > "VersionABC/$i"
  else
    mkdir -p "VersionABC/$i"
  fi
done

makes a new versionABC.

The format of diff3 is MyChanges Base TheirChanges

So...

MyChanges would be the parameter we keep swapping with each new rebuild

Base would be vanilla

TheirChanges would be the incoming mod.

I tested it out.

Automerges the text files in object

Update:
Apparently that won't work... Duplicates
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 06:13:02 pm
okay, I think i figured out what's up with diff3

I needed the -e or -a option.

Then I didn't get any output with <<<<< or ====

Code: [Select]
C:\temp>DIFF3 -e -m civforge\raw\objectS\entity_default.txt 34_11\raw\objects\entity_default.txt accmod\raw\objects\entity_default.txt >temp.txt

updated batch script
Code: [Select]
dir /b %2\raw\objects
mkdir VersionABC
mkdir VersionABC\raw
mkdir VersionABC\raw\objects
for /f %%f in ('dir /b %2\raw\objects') do diff3 -e --merge %1\raw\objects\%%f %2\raw\objects\%%f %3\raw\objects\%%f > VersionABC\raw\objects\%%f


I think correct order is
Code: [Select]
diff3batch modb moda modc
moda:modc changes will be merged into modb, moda serves as base.

Here's a relevant section and how it was handled by the merge; I think it merged wonderfully.

Left = 34_11; Middle = AccMod; Right = CivForge

You can see it respected the 34_11 - AccMod entity changes when it looked at Civilization Forge, and it added in an item from Civilization Forge, the jar.

So... diff3 diff3 diff3!
http://imgur.com/SOitufk

Note: It looks like the behavior dropped toy_doll from being added by CivForge due to the fact that the entire section was deleted by Accelerated mod.

The notepad++ window is the output of the diff3 -e -m merge operation.  Behind it, the center column in kdiff3 looks just like the column on the left.  The only difference is the right column (civforge) has a _Jar entry that is added in the final output.

Update:
I was able to merge three 34_11 mods together and then update them to 40_09 using diff3.  I went through and randomly verified some of the additions, which you can see were brought in from my github version tracking.  For example, the changes from 34_11:Fortress Defense entity_default can be seen here (although similar to accelerated mod, Fortress Defense entity_default optional changes also got rid of wheelbarrows for plainsmen, which accelerated mod did not, and those changes were merged in)  :)

https://github.com/thistleknot/df_34_11_win-Flattened-Bases/commit/8d71025dfc2b8996245f2b0ff4ee469b747233a8

I merged
34_11:Accelerated changes into CivForge = AccCivForge
34_11:Fortress Defense changes into AccCivForge = AccCivForgeFortressDefense
34_11:40_09 changes into AccCivForgeFortressDefense = AccCivForgeFortressDefense-40_09

The only caveat is that this batch file only works on the files in raw\objects.  The diff3 batch file can be modified to work with other files.

Also, the batch file only looks at the [directory listing of the] base folder's raw\objects\*.* for diff3 comparisons.  New files that other mods add, but that are not in this folder, won't have a diff3 merge run on them because they're absent from this initial directory listing (as a workaround, an empty file could be created for each missing file; diff3 should then create a new output file for it).  So if two mods change the same file but the base doesn't have it, a 2-way merge would have to be done between the two outside of this diff3 batch file, and that file would then have to be brought into the final output.
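
A rough sketch of how the per-file cases described above could be sorted out before any merging happens. It only works on sets of file names; the function name `plan_merges` and the three strategy labels are made up for illustration and do not correspond to anything in the batch file.

```python
def plan_merges(base_files, mine_files, theirs_files):
    """Decide, per file name, which kind of merge would be needed.

    'diff3'   - the base has the file, so a three-way merge is possible
    'two-way' - both mods add the file but the base lacks it
    'copy'    - only one mod has the file, so it can simply be copied
    """
    plan = {}
    for name in sorted(set(base_files) | set(mine_files) | set(theirs_files)):
        if name in base_files:
            plan[name] = 'diff3'
        elif name in mine_files and name in theirs_files:
            plan[name] = 'two-way'
        else:
            plan[name] = 'copy'
    return plan


base = {"entity_default.txt", "creature_standard.txt"}
mine = {"entity_default.txt", "creature_standard.txt", "creature_new.txt"}
theirs = {"entity_default.txt", "creature_new.txt", "item_extra.txt"}
for name, strategy in plan_merges(base, mine, theirs).items():
    print(name, strategy)
```

This does not cover a mod deleting a file that exists in the base; that case would still be handed to diff3 here and would need separate handling.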

On a final note, alphabetizing:
The only thing stopping us from doing something like this EFFICIENTLY on a total conversion mod... is a [command line] raw/text object alphabetizer.  Maybe with extended functionality to alphabetize castes within object:id's.

However.  If one wanted to incorporate a total conversion mod.  One could... alphabetize all the raws by hand using rawExplorer.  It can alphabetize a whole set of raws apparently and output it in a single go, then we parse them.  Then do the same with vanilla base.  And you could probably do better additive/subtractive diff3 comparisons.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 21, 2014, 07:49:08 pm
<use a nonstandard diff format>
If you want to write one, that would be awesome.  It should take two files as input, derive the diff, and be able to apply that to another file. 
Operating on the whole raw folder instead of per-file would be a nice bonus, but the former is enough to use it. 
The only catch is that so far no one who could wants to dedicate the time to do that.

<diff3 is awesome>
Amen.  Unfortunately I want this to work on random Windows computers too  :'(

However, I've since found... this 3-Way Text Merging Algorithm (http://www.stephanboyer.com/post/26/3-way-text-merging-algorithm).  It looks perfect for us!
Quote
For a class called 6.033 (Computer Systems Engineering) at MIT, I was required to design a collaborative, distributed text editor. It’s supposed to work like Git—if two users make concurrent changes to the document, the editor should try to automatically merge the two changes (or report a merge conflict).

However, the merge algorithm had to be more intelligent than Git’s line-by-line diff merger—if one user moved a paragraph of text (for example) to a different location in the document, and another user concurrently edited the wording of that paragraph, then the merger should be able to detect that the paragraph was both edited and moved, automatically.

I wanted the merge algorithm to be even more general. It shouldn’t have any notion of “paragraph” or “sentence.” Rather, if any piece of text is moved and edited concurrently, this should be resolved automatically. Furthermore, it should work even if two (or more) users edit the text and another user user moves it. Moreover, I wanted it to even work recursively—for example, if one user moves a chapter in a book, while another user moves a paragraph within that chapter, and yet another user moves a sentence within that paragraph, and still another user changes the wording of that sentence, the merge algorithm should be able to handle all of that without any user intervention, and without any notion of “chapter,” “paragraph,” “sentence,” etc. It should just “do the right thing,” whether you’re working on code, a novel, a scientific paper, or any other text-based document.

While I wasn’t required to actually implement the algorithm, I ended up doing it in Python anyway (https://gist.github.com/boyers/4713315). I later discovered that I had independently invented the operational transformation.
So far as I can tell, this solves basically all of our problems if we can get permission to use it.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 08:10:23 pm
Yeah I saw that project. I don't know how good it is. It might be too smart, or it might be too dumb. Thing is, just because two mods can be combined in some predictable way, doesn't mean that it's safe to do so. I didn't really look at what that code does in such cases.

I'll reiterate this test case:
Code: (vanilla) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
[COPY_TAGS_FROM:GECKO_LEOPARD]
[APPLY_CREATURE_VARIATION:GIANT]
[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
[APPLY_CURRENT_CREATURE_VARIATION]
[GO_TO_END]
[SELECT_CASTE:ALL]
[CHANGE_BODY_SIZE_PERC:400700]
[GO_TO_START]
[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[DESCRIPTION:A large monster in the shape of a gecko.]
[POPULATION_NUMBER:10:20]
[CLUSTER_NUMBER:1:1]
[CREATURE_TILE:'G']
[COLOR:6:0:1]
[PETVALUE:500]
[MOUNT_EXOTIC]
[GO_TO_END]
[PREFSTRING:amazing sticky feet]
[PREFSTRING:coloration]
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
Code: (mod adding pet tag) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
[COPY_TAGS_FROM:GECKO_LEOPARD]
[APPLY_CREATURE_VARIATION:GIANT]
[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
[APPLY_CURRENT_CREATURE_VARIATION]
[GO_TO_END]
[SELECT_CASTE:ALL]
[CHANGE_BODY_SIZE_PERC:400700]
[GO_TO_START]
[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[DESCRIPTION:A large monster in the shape of a gecko.]
[POPULATION_NUMBER:10:20]
[CLUSTER_NUMBER:1:1]
[CREATURE_TILE:'G']
[COLOR:6:0:1]
[PETVALUE:500]
[MOUNT_EXOTIC]
[GO_TO_END]
[PREFSTRING:amazing sticky feet]
[PREFSTRING:coloration]
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
        [PET]
Code: (mod adding creature) [Select]
[CREATURE:GIANT_LEOPARD_GECKO]
[COPY_TAGS_FROM:GECKO_LEOPARD]
[APPLY_CREATURE_VARIATION:GIANT]
[CV_REMOVE_TAG:CHANGE_BODY_SIZE_PERC]
[APPLY_CURRENT_CREATURE_VARIATION]
[GO_TO_END]
[SELECT_CASTE:ALL]
[CHANGE_BODY_SIZE_PERC:400700]
[GO_TO_START]
[NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[CASTE_NAME:giant leopard gecko:giant leopard geckos:giant leopard gecko]
[DESCRIPTION:A large monster in the shape of a gecko.]
[POPULATION_NUMBER:10:20]
[CLUSTER_NUMBER:1:1]
[CREATURE_TILE:'G']
[COLOR:6:0:1]
[PET_EXOTIC]
[PETVALUE:500]
[MOUNT_EXOTIC]
[GO_TO_END]
[PREFSTRING:amazing sticky feet]
[PREFSTRING:coloration]
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:657:438:219:1900:2900] 40 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2990:2257:1525:731:4300:6100] 12 kph
[CREATURE:DESERT TORTOISE]
[DESCRIPTION:A tiny shelled reptile that lives in the desert.]
[NAME:desert tortoise:desert tortoises:desert tortoise]
[CASTE_NAME:desert tortoise:desert tortoises:desert tortoise]
[CHILD:1][GENERAL_CHILD_NAME:desert tortoise hatchling:desert tortoise hatchlings]
[CREATURE_TILE:'t'][COLOR:6:0:0]
[PETVALUE:50]
[BENIGN][NATURAL][PET_EXOTIC]
[BIOME:ANY_DESERT]
[LARGE_ROAMING]
[POPULATION_NUMBER:10:30]
[CLUSTER_NUMBER:1:1]
[PREFSTRING:shells]
[PREFSTRING:longevity]
[CANNOT_JUMP]

Can these be merged? Does the order of merging affect the result? I posit that they must not be allowed to merge. Does diff3 merging allow it? Does Stephan Boyer's code?
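
One way to check a case like this mechanically, sketched with Python's difflib: find which regions of the ancestor each mod changes, and refuse to merge if two regions overlap or share an insertion point. The helper names are hypothetical, and the three short lists are cut-down stand-ins for the full raws above, not the real files.

```python
import difflib


def changed_regions(base, other):
    """Return (start, end) index ranges of `base` that `other` changes.
    A pure insertion has start == end."""
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2) for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != 'equal']


def regions_clash(a, b):
    """Regions clash if they overlap or touch the same insertion point."""
    return a[0] <= b[1] and b[0] <= a[1]


def mods_conflict(base, mod_a, mod_b):
    """True if any change in mod_a lands on or next to a change in mod_b."""
    return any(regions_clash(a, b)
               for a in changed_regions(base, mod_a)
               for b in changed_regions(base, mod_b))


base = ["[CREATURE:GECKO]", "[PETVALUE:500]", "[PREFSTRING:coloration]"]
pet_mod = base + ["[PET]"]
creature_mod = base + ["[CREATURE:TORTOISE]", "[PETVALUE:50]"]
value_mod = ["[CREATURE:GECKO]", "[PETVALUE:900]", "[PREFSTRING:coloration]"]

print(mods_conflict(base, pet_mod, creature_mod))  # both append at the end
print(mods_conflict(base, pet_mod, value_mod))     # changes are far apart
```

Both the pet mod and the creature mod insert at the end of the file, so a conservative check like this reports a conflict regardless of the order in which the mods are applied.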
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 08:16:23 pm
Diff3 works on windows. I'm on windows. Just include the executable.

When I get home I'll run your merge test case, but tell me which is the common ancestor?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 08:28:58 pm
Diff3 works on windows. I'm on windows. Just include the executable.

When I get home I'll do your test your merge test case but tell me which is the common ancestor?
The first one is the ancestor. If you do it, be sure to try the other two in both orders.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 21, 2014, 08:59:56 pm
Diff3 works on windows. I'm on windows. Just include the executable.
If it can be included in the compiled version of the PyLNP, that would be great.  See below though, I'm not sure we need it. 

Yeah I saw that project. I don't know how good it is. It might be too smart, or it might be too dumb. Thing is, just because two mods can by combined in some predictable way, doesn't mean that it's safe to do so. I didn't really look at what that code does in such cases.

I'll reiterate this test case:<snip>

Can these be merged? does the order or merging effect the result? I posit that it must not be allowed to be merged. Does diff3 merging allow it? Does stephan boyer's code?
Yes, they can be merged by either clever use of a two-way diff, diff3 or Boyer's tool.  (though I haven't tested this specifically)

According to https://en.wikipedia.org/wiki/Diff3 we could get a similar effect to diff3 by simply applying a diff from vanilla to the mod to the already-merged-mods wrapped in something to catch merge conflicts, and then we've got a basic mod merger.  It probably won't handle merging many big changes, but it would give a working script that can give feedback to the user about the validity of their load order.  eg:

Code: [Select]
>>> (input load order)
  $mod0 <GREEN> (because first mod is always OK)
  $mod1 <ORANGE> (because validity unknown)
  $mod2 cannot be merged, try a different load order. <RED> (merge failed)
>>> (input load order)

Raws post-merge can be:
1 - Error, merge is impossible (RED)
2 - Invalid raws; DF crashes because of improper formatting or the deletion of something crucial (ORANGE)
3 - Nonfunctional raws; DF can read them but some things have no effect (likely due to missing dependencies, which sometimes is the above) (YELLOW)
4 - Functional raws; everything works as it should.  (GREEN)
Then color pessimistically - anything of unknown validity is orange.  Anything of unknown functionality is at least yellow.
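
As a sketch of the pessimistic colouring rule above, assuming Python 3 (for the enum module) and assuming the validity and functionality checks report True, False, or None for "unknown". The names `MergeStatus` and `classify` are illustrative only.

```python
from enum import IntEnum


class MergeStatus(IntEnum):
    # Numbered to match the four post-merge outcomes listed above
    RED = 1      # merge is impossible
    ORANGE = 2   # raws invalid, or validity unknown
    YELLOW = 3   # raws nonfunctional, or functionality unknown
    GREEN = 4    # everything works as it should


def classify(merged_ok, valid=None, functional=None):
    """Colour a merge result pessimistically: unknown counts as failing."""
    if not merged_ok:
        return MergeStatus.RED
    if valid is not True:
        return MergeStatus.ORANGE
    if functional is not True:
        return MergeStatus.YELLOW
    return MergeStatus.GREEN


print(classify(True).name)                     # validity never checked
print(classify(True, valid=True).name)         # functionality never checked
print(classify(True, True, True).name)
```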

Right now, I just want to get *some* mods past stage 1.  More advanced standard diff/merge implementations will be able to squeeze more changes through stage 1 without problems (due to recognising movement etc).  Distinguishing two from three or four is harder, and I think it should be a post-1.0 goal.  My proposal further up the thread was to just guess based on how many mods were selected for now and fill this in later.  I'd probably try for a simple script where you feed it a raw folder and it just returns whether the raws are valid or not; extensible to also check functionality later, and call that on the mixed folder after each mod is added.  It can also be used to check that the mods themselves are valid, which would be nice!
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 09:11:40 pm
Update
I see the issue your test case raises.

It replaces the pet token with the new creature instead of injecting the new creature either before or after the pet token.

Spoiler (click to show/hide)

Sample 1 = base (common ancestor)

compare sample1:sample3 apply to sample2

diff3 -e -m sample2.txt sample1.txt sample3.txt > ...
Spoiler (click to show/hide)

compare sample1:sample2 apply to sample3

diff3 -e -m sample3.txt sample1.txt sample2.txt > diffs1s2applytos3.txt
Spoiler (click to show/hide)

by the way, that 3 way python diff tool looks handy.  I'm all up for better solutions.  I'm just glad we're thinking n way [common ancestor] merging at this point.  We obviously see it can be done without a git system.

Quote
we could get a similar effect to diff3 by simply applying a diff from vanilla to the mod to the already-merged-mods wrapped in something to catch merge conflicts

you mean revert a diff from currentVersion to commonAncestor?  Apply changes to commonAncestor, un-reverse reversed patch?

Quote
I'd probably try for a simple script where you feed it a raw folder and it just returns whether the raws are valid or not; extensible to also check functionality later, and call that on the mixed folder after each mod is added.  It can also be used to check that the mods themselves are valid, which would be nice!

What you're asking for is some post raw merge processing beyond simple patch filing.

As to merging using common ancestor and diff3, I was able to merge Fortress Defense, Accelerated Modest Mod, Civilization Forge 2.8, and 40_09 together with hardly any fuss.  I used the same method using git, but this time I used diff3.  When I tried it with git I checked the error log and the only errors I had were related to the new creatures that were brought in that didn't have their tokens updated to reflect 40_09.  Those kinds of things are the things I think you speak of when referring to is it compatible or not, otherwise I would think you would need to do some regexp raw object processing.  Things like new creatures from older mods that were never updated to 40_09 standards, so some template would have to be applied to ensure body detail plans were updated to include the new _neck, things of that nature.

But speaking from a tool that accomplishes merging raws as is with little to no error.  I would say diff3 can do that.

Mixing mods from 34_11 to 40_09 is where my errors came in.  However, I think one could use diff3 with even less drama if mixing mods of the same version (40_09) vs across versions.

Update
I think a lot of the concern with diff3 and its "block matching" can be alleviated if we parse the raws before we diff3 them.  I recommend (if possible) alphabetizing each .txt file's objects, and then parsing them to remove whitespace and put tokens on their own individual lines.  That way, when mod folders are processed in the same manner, the diff between base and the mod will be on an individual token level.  I think diff3 is smart enough to catch inserted blocks.
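
A small sketch of the "one token per line" part of that preprocessing, using a regular expression rather than the sed scripts mentioned earlier. It assumes every token is a bracketed run with no nested brackets, and it drops anything outside brackets (comments, indentation); the name `split_tokens` is hypothetical, and alphabetizing is not attempted here.

```python
import re

# A token is '[' ... ']' with no brackets inside (an assumption about the raws)
TOKEN = re.compile(r'\[[^\[\]]*\]')


def split_tokens(raw_text):
    """Return the tokens of a raw file, one list entry per token."""
    return TOKEN.findall(raw_text)


sample = (
    "[CHILD:1][GENERAL_CHILD_NAME:hatchling:hatchlings]\n"
    "\t[CREATURE_TILE:'t'] [COLOR:6:0:0] some comment\n"
)
print("\n".join(split_tokens(sample)))
```

Run on both the vanilla file and the mod file, this gives line-per-token text that a line-based diff can compare token by token.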
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 09:39:42 pm
Here's an updated merge script. I haven't tested my algorithm thoroughly yet, so it probably has bugs.

It can be run as a script that takes 3 files passed to as arguments: the mod file, the vanilla raw file, and the generated raw file. If the generated raw file does not exist, it will create it. It will do a 3 way merge of the mod into the generated_raw file. The script will return 1 if it fails to merge, 0 if it succeeds.

It can also be run with 0 arguments (or any number other than 3), in which case it tests the merger algorithm.

You can also import it as a module and call do_merge_files directly, with the same 3 arguments. do_merge_seq  is for comparing lists, which may be useful if you already have the contents of the file as a list of lines.

Spoiler (click to show/hide)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 09:51:46 pm
I tried the py file you guys are talking about and this latest, I just have no luck with python

Spoiler (click to show/hide)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 09:59:17 pm
Update
I see the issue your test case raises.

It replaces the pet token with the new creature vs injecting the newcreature either before/after pet token.
Ok, I think that's a problem. putting the pet token on the wrong creature is an incorrect merge.

Quote
Update
I think a lot of the concern with diff3 and it's "block matching" can be alleviated if we parse the raws before we diff3 them.  I recommend (if possible) to alphabetize each .txt file's objects, and then parse them to remove whitespace and put tokens on their own individual lines.  That way when mod folders are processed in the same manner, the diff between base and the mod will be on an individual token level.  I think diff3 is smart enough to catch inserted blocks
There are two problems with that:
1) The order of objects in the raws has in-game effects.
2) Parsing the raws and comparing them is more complicated. It'd be nice to have a dumber merge tool that doesn't mis-merge compatible mods first.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 10:01:19 pm
I tried the py file you guys are talking about and this latest, I just have no luck with python

Spoiler (click to show/hide)
I think the problem is I'm using Python 2.7 (because that's what's installed), and you probably have Python 3.x. But I'll fix those errors for you.

Here we are:
Spoiler (click to show/hide)

Sorry for the trouble. I should have made sure since it's the same problem as before.

EDIT: run as
python mergemod.py mod_file.txt vanilla_file.txt target_file.txt
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 10:12:59 pm
I'm kinda bummed that your [pet] newcreature situation kind of breaks merging.  I would think that if the diff program was smart enough to figure out what token it was next to, either before or after the tokens of the diff, it could still be added.  The fact that the other mod didn't add/remove that specific token bothers me.  The diff app should have seen that two mods were bringing different tokens to the same line.  So both should have been accommodated by checking the before/after tokens on adjacent lines...

Oh well.  1st world problems right.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 21, 2014, 10:26:29 pm
I'm kinda bummed that your [pet] newcreature situation kind of breaks merging.  I would think if the diff program was smart enough to figure out what token it was next to, either before or after token's of the diff.  That it could still be added.  The fact that the other mod didn't add/remove that specific token bothers me.  The diff app should have seen that two mods were bringing different tokens to the same line.  So both should have been accomodated by checking the before/after tokens on adjacent lines...

Oh well.  1st world problems right.
I can't think of a way diff3 could be smarter at merging.

PS
Just realized diff3 will write into the first argument. my script writes into the 3rd. I should change that.

PPS try diff -m without the -e. That seems to merge correctly but show conflicts as needed.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 21, 2014, 11:09:46 pm
well.  it seems merger fails with your sample test as well.

Seems like you could run a test such, as you've derived, something like if a conflict occurs for writing to the same line (as merger will throw), throw an error.

else do a diff3 -m -e merge?

...

I just realized that with the way we remove ALL WHITESPACE, could this type of issue occur outside of this context?  I'm thinking since we remove all whitespace, that when we change a token, we are always overwriting the old token.  However, since all our patches are based on before and after snapshots of whole files...  It's easier to get the exact diff from the document and merge them.  So the only time an issue may occur is when two files try to alter the exact same line.

Btw,
I tried diff -m and nogo with 3 files?

I also tried merger.py on entity_default in a 3 way and it crapped itself all over my console.  Something about maximum recursion depth.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 21, 2014, 11:21:47 pm
Code: [Select]
import difflib

context_lines = 2
# Read both versions, closing the files once their lines are loaded
with open(vanilla_raw_folder + file) as vanilla, open(mod_raw_folder + file) as mod:
    vanilla_lines, mod_lines = vanilla.readlines(), mod.readlines()
# Mode 'w' replaces any stale patch, so no explicit os.remove is needed
with open(mixed_raw_folder + file + '.patch', 'w') as patch:
    patch.writelines(difflib.unified_diff(vanilla_lines, mod_lines, n=context_lines))

Creating a unified patch file with a few lines of context (two lines matches within but not between objects) fixes the [pet] issue, but I can't work out how to apply a unified patch with python.  Argh.
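
For what it's worth, here is a minimal sketch of applying such a patch in pure Python. It is an assumption-laden illustration, not a replacement for the patch utility: hunks are applied only at the exact line numbers in their headers, with no fuzzy offset search, and any context or removed line that does not match raises an error. `apply_unified_patch` and `PatchConflict` are made-up names.

```python
import difflib
import re

HUNK_HEADER = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+\d+(?:,\d+)? @@')


class PatchConflict(Exception):
    """Raised when a hunk's context or removed line does not match."""


def apply_unified_patch(target, patch):
    """Apply unified diff lines (as produced by difflib) to a list of lines."""
    result, cursor, in_hunk = [], 0, False
    for line in patch:
        header = HUNK_HEADER.match(line)
        if header:
            start = int(header.group(1))
            count = 1 if header.group(2) is None else int(header.group(2))
            # A zero-length range names the line *before* the insertion point
            hunk_start = start if count == 0 else start - 1
            result.extend(target[cursor:hunk_start])
            cursor, in_hunk = hunk_start, True
        elif not in_hunk:
            continue  # skip the '---' / '+++' file header lines
        elif line.startswith('+'):
            result.append(line[1:])
        else:  # context (' ') and removed ('-') lines must match the target
            if cursor >= len(target) or target[cursor] != line[1:]:
                raise PatchConflict("mismatch at target line %d" % (cursor + 1))
            if line.startswith(' '):
                result.append(line[1:])
            cursor += 1
    result.extend(target[cursor:])
    return result


vanilla = ["[A]\n", "[B]\n", "[C]\n", "[D]\n"]
mod = ["[A]\n", "[B]\n", "[PET]\n", "[C]\n", "[D]\n"]
patch = list(difflib.unified_diff(vanilla, mod, n=2))
print("".join(apply_unified_patch(vanilla, patch)))
```

Because the hunk positions are taken literally, a target that already has lines added above the hunk will be rejected even when the change itself is compatible; searching nearby for matching context would be the next refinement.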

<Python 3.x compatible version>
Well, it no longer freaks out about the print statement  :)  Unfortunately it also outputs the contents of the vanilla file  :(

My testing, though I don't follow the various opcodes, shows that the output_file_temp returned by do_merge_seq() is the same as the contents of the vanilla file. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 05:15:49 am
I was thinking and thinking about it.

Your script to join adjacent tokens to nearby tokens only works if there is whitespace.  I have an idea to modify the regexp sed script to include default whitespace but remove the whitespace that is created when splitting tokens onto their own lines.  I can do so by replacing the initial whitespace with a special token marker that at the very end will be replaced with whitespace again?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 06:45:37 am
Code: [Select]
import os
import difflib

context_lines = 2
if os.path.isfile(mixed_raw_folder+file+'.patch'):
    os.remove(mixed_raw_folder+file+'.patch')
for line in difflib.unified_diff(open(vanilla_raw_folder + file).readlines(),
                                 open(mod_raw_folder + file).readlines(), n=context_lines):
    with open(mixed_raw_folder+file+'.patch', 'a') as item:
        item.write(line)

Creating a unified patch file with a few lines of context (two lines matches within but not between objects) fixes the [pet] issue, but I can't work out how to apply a unified patch with python.  Argh.
You can get difflib.unified_diff() to print out a unified diff, but merging isn't provided in the library.

Quote
<Python 3.x compatible version>
Well, it no longer freaks out about the print statement  :)  Unfortunately it also outputs the contents of the vanilla file  :(

My testing, though I don't follow the various opcodes, shows that the output_file_temp returned by do_merge_seq() is the same as the contents of the vanilla file.
It should return 1 though. If it returns 1, then the output is garbage; it detected a conflict and gave up. It prints out vanilla because it got to the end of vanilla before finding a conflict. To see that it returned 1, run "echo $?" immediately after running it. You can run it like this:
python mergemod.py mod_file.txt vanilla_file.txt target_file.txt ; echo $?

But I need to test it more. See why it's complaining about maximum recursion depth to thistleknot, and test it more for correctness.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: MagiX on August 22, 2014, 06:56:57 am
Spoiler (click to show/hide)
I haven't read the entire thread, just the last few pages... quite a discussion going on here :)

What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? This should look like that:
Code: (vanilla) [Select]
vanilla_dict={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here}}}
Code: (Pet) [Select]
Mod_1={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here,"PET":''}}}
Code: (Add animal) [Select]
Mod_2={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here}},
"Desert_tortoise":{all the key/value pairs from new animal here}}
So for every key, one could check if the value (i.e. a new dict) is the same or not and if it is not the same, one can do this recursively. A simple 2 dict comparison can be found here (https://stackoverflow.com/questions/1165352/fast-comparison-between-two-python-dictionary)
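We thus have some options:
  • Stuff that is unchanged is copied to the mixed mod dict
  • Stuff that is simply added (as the [PET] tag or the new creature) will be added to the mixed mod dict
  • Stuff that is changed --> check if the same key/value pair is changed in both mods --> yes: problem; no: copy the change to the mixed mod dict
  • Stuff that is removed in the mod --> remove from mixed mod dict
and as a final step, one should parse the mixed mod dict into a file.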
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 07:42:27 am
I haven't read the entire thread, just the last few pages... quite a discussion going on here :)

What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? This should look like that:
Code: (vanilla) [Select]
vanilla_dict={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here}}}
Code: (Pet) [Select]
Mod_1={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here,"PET":''}}}
Code: (Add animal) [Select]
Mod_2={"Creature":{"Giant_leopard_gekko":{all the key/value pairs from vanilla here},
"Desert_tortoise":{all the key/value pairs from new animal here}}}
So for every key, one could check if the value (i.e. a new dict) is the same or not and if it is not the same, one can do this recursively. A simple 2 dict comparison can be found here (https://stackoverflow.com/questions/1165352/fast-comparison-between-two-python-dictionary)
We thus have some options:
  • Stuff that is unchanged is copied to the mixed mod dict
  • Stuff that is simply added (as the [PET] tag or the new creature) will be added to the mixed mod dict
  • Stuff that is changed --> check if the same key/value pair is changed in both mods --> yes: problem; no: copy the change to the mixed mod dict
  • Stuff that is removed in the mod --> remove from mixed mod dict
and as a final step, one should parse the mixed mod dict into a file.
Separating it into two levels like that does solve that particular problem. But you can't just use a dict, because you need to preserve the order of some tags.

But using json or xml may mean we can find a good diff/merge tool, that is aware of how the order of some things doesn't matter and some things do. XML is more powerful than JSON here, because the same xml tag can have both attributes which are unordered, and nested tags that are ordered.
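
To illustrate the attribute/nested-tag distinction, here is a small sketch using Python's standard xml.etree.ElementTree. The XML layout (a creature element with flag attributes and caste children) is invented for this example and is not an agreed format.

```python
import xml.etree.ElementTree as ET


def same_creature(xml_a, xml_b):
    """Compare two elements: attributes are unordered, child tags are ordered."""
    a, b = ET.fromstring(xml_a), ET.fromstring(xml_b)
    same_flags = a.attrib == b.attrib  # dict comparison ignores order
    children_a = [(c.tag, c.attrib, c.text) for c in a]
    children_b = [(c.tag, c.attrib, c.text) for c in b]
    return same_flags and children_a == children_b  # list comparison keeps order


one = '<creature id="DOG" PET="" COMMON_DOMESTIC=""><caste name="FEMALE"/><caste name="MALE"/></creature>'
two = '<creature COMMON_DOMESTIC="" PET="" id="DOG"><caste name="FEMALE"/><caste name="MALE"/></creature>'
three = '<creature id="DOG" PET="" COMMON_DOMESTIC=""><caste name="MALE"/><caste name="FEMALE"/></creature>'

print(same_creature(one, two))    # attributes reordered only
print(same_creature(one, three))  # child tags reordered
```

Reordering the attributes leaves the two creatures equal, while swapping the nested caste tags does not.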
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 22, 2014, 07:49:51 am
Either sounds good, both are beyond my current skills. 

Go for it, and I'll keep writing documentation and design ideas for stuff I can't code yet :P
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: MagiX on August 22, 2014, 08:34:48 am
you need to preserve the order of some tags.
Is there some kind of rule for that? After just briefly scanning some of the files, I haven't seen a "clear" pattern, besides indentation and even that does not seem to be consistent in all cases.

beyond my current skills. 
Time to learn something new :)
I have no clue how to approach this either, but I thought about it, so why not share my idea.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 08:47:30 am
Well the first step would be to find an xml merge tool. You might also try to write an XSLT script that does such a merge and properly identifies conflicting mergers; writing an XSLT script may be easier than writing a merge algorithm from scratch in python. Without such a tool, taking a round-trip through xml is pointless.

IMO, a from scratch python script that does merging on a 2+ level structure is probably the best way to go eventually.

Anyway, I'm going to keep working on my merge algorithm for now. And maybe add more boilerplate.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 08:52:25 am
you need to preserve the order of some tags.
Is there some kind of rule for that? After just briefly scanning some of the files, I haven't seen a "clear" pattern, besides indentation and even that does not seem to be consistent in all cases.
I'm not a modder, so I don't know the details, but some tags clearly suggest that order matters for them, like [GO_TO_END]. Other tags, like [PET] can be put anywhere after the creature token.

Go for it, and I'll keep writing documentation and design ideas for stuff I can't code yet :P
Design is good. There's a lot of fairly straightforward stuff that needs to be done to manage everything. You need to be able to specify the list of mods. You need to be able to delete the output when merging fails. You probably want to figure out which two mods conflict when merging, which requires extra analysis. And of course the GUI -- designing and stubbing out the GUI can help plan what features you want even if they aren't immediately implemented.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Putnam on August 22, 2014, 09:30:04 am
you need to preserve the order of some tags.
Is there some kind of rule for that? After just briefly scanning some of the files, I haven't seen a "clear" pattern, besides indentation and even that does not seem to be consistent in all cases.
I'm not a modder, so I don't know the details, but some tags clearly suggest that order matters for them, like [GO_TO_END]. Other tags, like [PET] can be put anywhere after the creature token.

[GO_TO_END] is pretty much only there because castes are not declared at the start. Castes and non-creature tokens embedded in creatures (i.e. tissues and materials) are the only things where positioning matters.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 11:19:18 am
So I've been thinking. What we keep running into with a vs b vs c token replacement between mods (a being our base/common ancestor) is what I would call "collisions": both mods try to add to the same line (neither replaces anything, both are additive).

I was wondering whether we could somehow mark them in the patch files in some post-process. Maybe we can modify the changed token to be ]###newobject or ]#oldobjectToken


Since I figured out how diff3 works I've been reusing kdiff3 in a whole new way.

Most of these collisions are resolved by inserting b before c or c before b. If we could somehow incorporate that logic as a post-processor command in the patch files, maybe an ]###add or ]#replace or ###Del on each add or replace line (in a patch file),
we might be able to resolve this issue.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 22, 2014, 11:29:10 am
I think we'll end up with a full-featured parser. Anyone here with some knowledge of Haskell and Parsec? ;)

BTW I wrote my own small diff parser and ended up with 100 LOC.

I'll post it once I've done some bug testing and cleaned up this piece of... I mean beauty. :P
 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 22, 2014, 11:40:28 am
you need to preserve the order of some tags.
Is there some kind of rule for that? After just briefly scanning some of the files, I haven't seen a "clear" pattern, besides indentation and even that does not seem to be consistent in all cases.
I'm not a modder, so I don't know the details, but some tags clearly suggest that order matters for them, like [GO_TO_END]. Other tags, like [PET] can be put anywhere after the creature token.

[GO_TO_END] is pretty much only there because castes are not declared at the start. Castes and non-creature tokens embedded in creatures (i.e. tissues and materials) are the only things where positioning matters.
There are four kinds of order dependence in the raws, with an example for each at the end.

1. The header (filename and OBJECT: declaration) needs to come first in a file.
2. Variations need to be defined after the base creature.
3. Several tokens accept a list of subtokens to build a structure.  The structure closes when the parser hits the first token that isn't a valid subtoken in that context.
4. Castes are a special case of 3.  First, everything that appears before the first caste declaration is applied to ALL castes, nothing closes a caste structure except another caste declaration, and a caste declaration can be re-opened later in the same creature.

Example of 1: the creature_standard and [OBJECT:CREATURE] at the top of a file.
Examples of 2: a giant kea can't be defined until a kea is already defined, an olm man can't be defined before an olm (tiger men are an exception... they are made from scratch rather than being a tiger variant).
Examples of 3: each CREATURE sucks up all tags until it hits another CREATURE tag, and a SYNDROME sucks up all tags until it runs out of syndrome-defining or creature-effect-defining tags.
Example of 4: the intelligent creatures tend to have a lot of definition up front, briefly split into MALE and FEMALE castes, then select all castes again to finish up.

The easiest way to handle this is to hardcode in 1, and treat order within a top-level object as if it is critical to handle 3 and 4. Case 2 is the one that prevents us from alphabetizing things.

One way to handle that is to do a two-level sort.  Base creatures are listed alphabetically, then all variations are listed alphabetically.  The logic could be re-used later if we want to alphabetize gems within categories or something weird like that.
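
A minimal sketch of that two-level sort. It assumes a creature counts as a variation when its tokens include [COPY_TAGS_FROM:...]; that detection rule, the function names and the sample IDs are illustrative guesses, not something settled in this thread.

```python
def is_variation(tokens):
    """Assumption: a variant creature copies its base via [COPY_TAGS_FROM:...]."""
    return any(token.startswith("[COPY_TAGS_FROM:") for token in tokens)


def two_level_sort(creatures):
    """creatures maps a creature ID to its list of token strings.

    Returns the IDs with every base creature (alphabetical) ahead of
    every variation (alphabetical), so a base always precedes its variants.
    """
    # False sorts before True, so bases come first
    return sorted(creatures, key=lambda cid: (is_variation(creatures[cid]), cid))


creatures = {
    "OLM_MAN": ["[COPY_TAGS_FROM:OLM]"],
    "GIANT_KEA": ["[COPY_TAGS_FROM:BIRD_KEA]"],
    "OLM": ["[NAME:olm:olms:olm]"],
    "BIRD_KEA": ["[NAME:kea:keas:kea]"],
}
order = two_level_sort(creatures)
# order == ['BIRD_KEA', 'OLM', 'GIANT_KEA', 'OLM_MAN']
```

The same key function could take a category instead of a boolean if we later want to alphabetize gems within categories.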
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 12:18:19 pm
That is exactly the info we need to build a raw structure.

So...

I didn't have much luck tweaking my script to replace blank lines with [token] and then back-update with blank lines...

but... I did get the command down to 1 line in a for loop

ParseRawsv4a.bat
Code: [Select]
echo off
REM put tokens on their own line | REM remove tabs | remove all blanklines
for /f %%a in ('dir /b *.txt') do sed -e "s/\[[^][]*\]/\n&\n/g" %%~na.txt | sed -r "s/\t//g" | sed -e "s/^ *//; s/ *$//; /^$/d; s/\r//; /^\s*$/d" > %%~na.out |type %%~na.out
REM cleanup
ren *.out3 *.txt
erase *.out
echo on

I think this flatten should only be applied to the items within the [objects] folder.  Things that affect speech and text seem to be read per line vs per token.

I was hoping to address the whitespace removal, which might matter when two mods add trailing tokens at the end of objects.  For the dictionary/match approach PE was attempting, I assumed the whitespace trailing any token additions at the end of objects would be relevant and should NOT be deleted, but idk.  Either way, I had a bit of trouble.

There is one way to do it, but then I'd have to use cygwin...
http://stackoverflow.com/questions/11393616/replace-string-that-contains-crlf
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 22, 2014, 12:44:44 pm
What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? This should look like that:

Hey guys, sorry for not keeping you up to date on my postprocessor, blah blah work blah blah food poisoning. What I have so far works a lot like this but without needing XML (yet).

So far what I have is code to read a raw file and parse it into raw objects. The code is on my home computer, but here's the pseudocode as I remember it:

Spoiler (click to show/hide)

The idea is, that we parse each mod into a collection of raw objects, indexed by object type and object name. This catches duplicate raws during loading.

It can easily be expanded into comparing each collection of raw objects to each other. Mods which add new, raw objects would be trivial with this setup. Mods which remove or make changes to existing objects would require additional handling, but there's plenty of room for it.
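
A possible shape for that comparison step, as a sketch only. It assumes both collections were built from complete raw folders and are laid out as {object_type: {object_name: [raw lines]}}; compare_collections is a made-up name.

```python
def compare_collections(vanilla, mod):
    """Classify every object in a mod relative to vanilla."""
    added, removed, changed = [], [], []
    for obj_type in dict.fromkeys([*vanilla, *mod]):
        old = vanilla.get(obj_type, {})
        new = mod.get(obj_type, {})
        for name in dict.fromkeys([*old, *new]):
            if name not in old:
                added.append((obj_type, name))    # trivial case: brand-new object
            elif name not in new:
                removed.append((obj_type, name))  # needs additional handling
            elif old[name] != new[name]:
                changed.append((obj_type, name))  # needs additional handling
    return added, removed, changed


vanilla = {"CREATURE": {"DOG": ["[CREATURE:DOG]", "[PET]"],
                        "CAT": ["[CREATURE:CAT]"]}}
mod = {"CREATURE": {"DOG": ["[CREATURE:DOG]", "[PET]", "[TRAINABLE]"],
                    "DESERT_TORTOISE": ["[CREATURE:DESERT_TORTOISE]"]}}
added, removed, changed = compare_collections(vanilla, mod)
```

If a mod folder only ships the files it changed, a missing object does not mean a removal, so the removed list would need a different rule in that case.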

I was messing around with formats for defining legal raw objects of various types. Mainly what I found is that XML isn't great for it, because it doesn't deal gracefully with tags which are allowed in any order. Might be best to define a custom format if we want to go into it that far.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 22, 2014, 12:59:52 pm
Button: sounds great. I was thinking about something like that myself.
It may be a much better way.
Storing objects is not much of a problem.

But remember the tag order and the whole [CASTE] thing.
What kind of data structure do you use? A list with a tuple for each token, or an ordereddict with tuples or ordereddicts as values?

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 01:05:35 pm
I think we'll end up with a full-featured parser. Anyone here with some knowledge of Haskell and Parsec? ;)

Good point.

I have some experience with parsers and parser generators, but DF raws are so primitive that a parser generator seems overkill. There's very little grammar to DF raws, and checking the grammar is not important at all. On the other hand, maybe a lexer tool would be worthwhile. Thistleknot, you might want to look into this: if there's a "lex" or lexer generator tool that compiles into python or a portable language. Maybe ANTLR, if it has a sufficiently documented Python generator. It might make reading and de-serializing raws easier.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 01:06:20 pm
What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? This should look like that:
...

great work man!

Yeah, if it could do caste tokens

and maybe even alphabetize?

and basically follow the important stuff in Dirst's post
http://www.bay12forums.com/smf/index.php?topic=142295.msg5595599#msg5595599

then I think you would have a great way to do some diff comparisons between two objects.  One doesn't even have to use diff at this point. 

One could just note the sequential order of the tokens between
CommonAncestor
ModB
ModC

Note the sequential & instance additive/subtractive/replacing difference of tokens between two objects and track those changes.

Something like

Diff ancestor:mod c applied to modB.

King Mir
Yeah, I'll look into lexer for python sure.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 01:09:43 pm

I was messing around with formats for defining legal raw objects of various types. Mainly what I found is that XML isn't great for it, because it doesn't deal gracefully with tags which are allowed in any order. Might be best to define a custom format if we want to go into it that far.
What about using attributes for tags like that?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 22, 2014, 01:13:41 pm
Button: sounds great. I was thinking about something like that myself.
It may be a much better way.
Storing objects is not much of a problem.

But remember the tag order and the whole [CASTE] thing.
What kind of data structure do you use? A list with a tuple for each token, or an ordereddict with tuples or ordereddicts as values?

all_objects is a dict(string object_type,dict(string object_name,RawObject)).

RawObject is a custom class. It stores among other things the filename, the object name & type, and the complete text of that raw object as a (ordered) list of strings (including comments).

I figure as processing gets more in-depth we'll be storing more and more data about each raw object;  and it sucks to increase the complexity of a data structure that's being repeatedly searched by other sections of the code, when you could hide it inside an object instead. :)
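
For anyone following along, a guess at what such a structure could look like in Python. This is a sketch and not the actual code: the field names, load_raw and the TAG regex are made up, and it assumes each top-level object opens with a tag named after the file's OBJECT type (true for [OBJECT:CREATURE], not for every raw type).

```python
import re
from dataclasses import dataclass, field

# First bracketed tag on a line: group 1 is the tag name, group 2 the rest
TAG = re.compile(r"\[([^\[\]:]+):?([^\[\]]*)\]")


@dataclass
class RawObject:
    filename: str
    object_type: str
    name: str
    lines: list = field(default_factory=list)  # full text in order, comments included


def load_raw(filename, text, all_objects):
    """Parse one raw file into all_objects[object_type][object_name]."""
    object_type, current = None, None
    for line in text.splitlines():
        match = TAG.search(line)
        if match and match.group(1) == "OBJECT":
            object_type = match.group(2)
            continue
        if match and object_type and match.group(1) == object_type:
            name = match.group(2)
            by_name = all_objects.setdefault(object_type, {})
            if name in by_name:  # duplicate raws are caught during loading
                raise ValueError(f"duplicate {object_type}:{name} in {filename}")
            current = by_name[name] = RawObject(filename, object_type, name)
        if current is not None:
            current.lines.append(line)  # order within the object is preserved
    return all_objects


sample = """creature_example

[OBJECT:CREATURE]

[CREATURE:DOG]
    [NAME:dog:dogs:canine]
    [PET]

[CREATURE:CAT]
    [NAME:cat:cats:feline]
"""
all_objects = load_raw("creature_example.txt", sample, {})
```

The dict is only used for lookup; each RawObject keeps its own lines as an ordered list.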
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 01:25:20 pm
I was looking at lexers...

but... I was thinking we need a parser no?

http://pyparsing.wikispaces.com/

found pyparsing

http://stackoverflow.com/questions/1651487/python-parsing-bracketed-blocks
Code: [Select]
>>> from pyparsing import nestedExpr
>>> txt = "{ { a } { b } { { { c } } } }"
>>>
>>> nestedExpr('{','}').parseString(txt).asList()
[[['a'], ['b'], [[['c']]]]]
>>>

Lexer research:

If a parser isn't needed, lexer-wise I've found:
Pygments
http://pygments.org/docs/lexers/
Ply
http://www.dabeaz.com/ply/
ex: http://www.dabeaz.com/ply/example.html

and a big list
https://wiki.python.org/moin/LanguageParsing
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 01:28:31 pm
You can't store objects in a map/dict when order matters. Use a list or array. That means for creature tokens, because of variations like animal people. Alphabetizing is a problem for the same reason.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 22, 2014, 01:39:23 pm
What about writing a custom json/xml/whatever style parser that puts these things into (multi-level) dict structures and then comparing the dict structures? This should look like that:
...

Yeah, if it could do caste tokens

Caste tokens are going to be a ways off no matter how we slice it, since only creature objects use them and we'll need a mapping file listing which tokens are caste-level and which are creature-level.

Quote
and maybe even alphabetize?

The work required to alphabetize objects against each other is essentially already done. Alphabetizing tags within an object would be significantly harder, since we'd need to know which-all tags could be reordered safely.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 22, 2014, 01:43:04 pm
You can't store objects in a map/dict when order matters. Use a list or array. That means for creature tokens, because of variations like animal people. Alphabetizing is a problem for the same reason.
This generalizes for most raw files... there is a top-level object (like CREATURE or REACTION) that is usually, but not always, self-contained.  The biggest exception is that base creatures must exist "earlier in the raws" than any variants of that creature.  Raw files of a certain type are parsed in alphabetical order based on the internal name at the first line of the file, and tags within a file are parsed in order of appearance.

Most other dependencies are handled by using separate object types.  Materials are their own thing, body plans are their own thing, etc. and order makes no difference if you're "calling" an object from a different object type.  For example, creatures can "call" any tissue template and then modify it locally without any concern for what order the tissue templates are in.

The one object that hasn't been pulled out into its own files is the SYNDROME type.  I expect that to happen, eventually.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 01:54:33 pm
I was looking at lexers...

but... I was thinking we need a parser no?

http://pyparsing.wikispaces.com/

found pyparsing

http://stackoverflow.com/questions/1651487/python-parsing-bracketed-blocks
Code: [Select]
>>> from pyparsing import nestedExpr
>>> txt = "{ { a } { b } { { { c } } } }"
>>>
>>> nestedExpr('{','}').parseString(txt).asList()
[[['a'], ['b'], [[['c']]]]]
>>>

Lexer research:

If a parser isn't needed, lexer-wise I've found:
Pygments
http://pygments.org/docs/lexers/
Ply
http://www.dabeaz.com/ply/

and a big list
https://wiki.python.org/moin/LanguageParsing
So for a bit of background:
A lexer converts text into a sequence of tokens or glyphs. Using a series of regular expressions specified for each token, it would create a function that takes a raw file and puts out a token structure every time it's called. Effectively it is a stream of tokens. This is something that I think a tool may help with for DF raws.

A parser converts the output of a lexer into rules based on the language grammar. So it has a grammar file that maps the structure of a DF raw to snippets of code that are run whenever a particular token sequence is encountered. It could read in a token and figure out what level the token is (caste, creature, object, etc.), and have separate code for each, including cases like when a caste-level token is encountered at creature level.

Of course to parse the raws you need both. However, grammar-wise, the raws of DF are very simple. So what I'm saying is, you don't necessarily need a parser generator, and it may be simpler not to use one. Unless, that is, it comes bundled with the lexer anyway and is easy to use because of that. So definitely look at lexers. Maybe look at parsers too, while you're at it. Often they are bundled together.

EDIT:
On second thought, look at parsers too. If you're thinking of writing a parser, you should know what tools exist.
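
To make the "stream of tokens" idea concrete, here is a hand-rolled sketch with Python's re module and no generator tool. The token kinds (TAG, COMMENT) and the rule that anything outside brackets is a comment are assumptions for illustration.

```python
import re

# One regular expression per kind of token
TOKEN_SPEC = [
    ("TAG", r"\[[^\[\]]*\]"),            # a complete [BRACKETED:TAG]
    ("COMMENT", r"[^\[\]\s][^\[\]\n]*"),  # free text outside brackets
    ("SKIP", r"\s+|[\[\]]"),              # whitespace and stray brackets
]
LEXER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def lex(text):
    """Yield (kind, value) pairs one at a time, in file order."""
    for match in LEXER.finditer(text):
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group().strip()


sample = "creature_standard\n\n[OBJECT:CREATURE]\n[CREATURE:DOG] man's best friend\n\t[PET][TRAINABLE]\n"
tokens = list(lex(sample))
```

Because lex is a generator, a parser sitting on top of it can pull one token at a time and keep its own state about which object or caste it is currently inside.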
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 22, 2014, 01:55:16 pm
You can't store objects in a map/dict when order matters. Use a list or array. That means for creature tokens, because of variations like animal people. Alphabetizing is a problem for the same reason.
This generalizes for most raw files... there is a top-level object (like CREATURE or REACTION) that is usually, but not always, self-contained.  The biggest exception is that base creatures must exist "earlier in the raws" than any variants of that creature.  Raw files of a certain type are parsed in alphabetical order based on the internal name at the first line of the file, and tags within a file are parsed in order of appearance.

Don't worry, King Mir, the dict is just for object lookup. Order is preserved within the object.

I didn't realize that base creatures had to be earlier in the raws than their variations; I'll make a note of that for write-out logic.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 02:03:04 pm
You can't store objects in a map/dict when order matters. Use a list or array. That means for creature tokens, because of variations like animal people. Alphabetizing is a problem for the same reason.
This generalizes for most raw files... there is a top-level object (like CREATURE or REACTION) that is usually, but not always, self-contained.  The biggest exception is that base creatures must exist "earlier in the raws" than any variants of that creature.  Raw files of a certain type are parsed in alphabetical order based on the internal name at the first line of the file, and tags within a file are parsed in order of appearance.

Don't worry, King Mir, the dict is just for object lookup. Order is preserved within the object.

I didn't realize that base creatures had to be earlier in the raws than their variations; I'll make a note of that for write-out logic.

I believe that's what RawExplorer author was saying.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 02:11:59 pm
Do we really want to auto-alphabetize everything generally? Some things like gems are kinda convenient if sorted by worth. On the other hand, stockpiles are clearer when things are in alphabetical order.

Agreed: second-level tags whose order doesn't affect the game at all should be put in a dict/map to make it easy to check for changes to the same token.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 22, 2014, 02:21:01 pm
Do we really want to auto-alphabetize everything generally? Some things like gems are kinda convenient if sorted by worth. On the other hand, stockpiles are clearer when things are in alphabetical order.

Agreed: second-level tags whose order doesn't affect the game at all should be put in a dict/map to make it easy to check for changes to the same token.
Everything?  No.

For the dict/map idea, is there a reason to stop at two levels?  For most structures, it'd be sufficient to soak up all subtokens and just keep them with the parent.  Castes, however, are a major pain in the ass.  Or, they would be if Dwarves had asses (http://www.bay12forums.com/smf/index.php?topic=142739).

The only pseudo-simple solution I can think of is to treat caste declarations and caste selections as milestones within the CREATURE object.  A similar tag on the other side of such a milestone is considered a different tag rather than a duplicate one.  This will keep the caste-level tags from overwriting each other without requiring an exhaustive list of caste-level tags.
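
A sketch of the milestone idea, assuming a creature has already been split into an ordered list of (tag_name, args) pairs. The function name and the choice of CASTE and SELECT_CASTE as the only milestones are illustrative.

```python
MILESTONES = ("CASTE", "SELECT_CASTE")  # tags that open or re-select a caste


def key_by_milestone(tokens):
    """Group one creature's tokens under (segment_number, tag_name) keys.

    The segment number goes up at every caste declaration or selection, so
    the same tag on either side of a milestone gets a different key.
    """
    keyed, segment = {}, 0
    for name, args in tokens:
        if name in MILESTONES:
            segment += 1
        keyed.setdefault((segment, name), []).append(args)
    return keyed


tokens = [
    ("BODY_SIZE", ["0", "0", "3000"]),
    ("CASTE", ["FEMALE"]),
    ("BODY_SIZE", ["0", "0", "2500"]),
    ("CASTE", ["MALE"]),
    ("BODY_SIZE", ["0", "0", "3500"]),
]
keyed = key_by_milestone(tokens)
```

One caveat: a mod that inserts a new caste shifts every later segment number, so matching segments between mods would need something sturdier than a plain counter.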
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 02:32:00 pm
Actually, if objects are matched there's no need for alphabetizing. That was more a concern using a diff approach on an entire file when comparing to mods that significantly reordered their files.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 22, 2014, 02:47:07 pm
For parsing there are literally tons of tools and libs.
Even re may be all we need. The only problem I found was stuff like
[TILE:1:3:':':4]
It is just a sample. In C I would just parse char by char. In Python too.
If not for these ugly little things, the whole parser could be almost one line, like
Code: [Select]
l = l.strip('[').strip(']').split(':')
token = (l[0], l[1:])

Damn, I would love to know how you'd solve this...


Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 04:41:43 pm
For parsing there are literally tons of tools and libs.
Even re may be all we need. The only problem I found was stuff like
[TILE:1:3:':':4]
It is just a sample. In C I would just parse char by char. In Python too.
If not for these ugly little things, the whole parser could be almost one line, like
Code: [Select]
l = l.strip('[').strip(']').split(':')
token = (l[0], l[1:])

Damn, I would love to know how you'd solve this...
You solve this by writing a proper lexer, that converts a string into a list of tokens. Then write a state machine for each kind of token. So in your example the list of tokens starts like this: ['[',KEYWORD="TILE",':','1',':' ....
(Keywords actually can just be enum values, but strings in general can't, so you need the extra data for them.)
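
Here is one way the character-by-character approach could look for the [TILE:1:3:':':4] case. It is a sketch that assumes a character literal is always exactly three characters (quote, any character, quote); split_tag is a made-up name.

```python
def split_tag(tag):
    """Split '[NAME:arg:arg]' into (name, args), treating ':' inside '...' as data."""
    body = tag.strip()[1:-1]  # drop the surrounding brackets
    parts, current, i = [], "", 0
    while i < len(body):
        # Character literal: quote, any single character, quote
        if body[i] == "'" and i + 2 < len(body) and body[i + 2] == "'":
            current += body[i:i + 3]
            i += 3
        elif body[i] == ":":  # a real separator
            parts.append(current)
            current = ""
            i += 1
        else:
            current += body[i]
            i += 1
    parts.append(current)
    return parts[0], parts[1:]


name, args = split_tag("[TILE:1:3:':':4]")
# name == 'TILE', args == ['1', '3', "':'", '4']
```

An apostrophe in ordinary text (as in a description) is left alone, because it is not followed by a closing quote two characters later.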
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 22, 2014, 05:25:28 pm
So... yeah.  I'm not actually contributing to this project btw guys.  I've got my own reasons for chiming in, even if just as input from a mod merger.  I start school Monday and I don't have time to start on this.  I'm just glad to enjoy the conversational ride.  I hope you continue in setting a good standard for people to build on.

I wanted to call the type of "merging" you guys are proposing as a "blind merge"

I think a coined term is important, whatever you guys want to call it

[such as... ***cough cough***]
"blind merge"


Something that tells a user/mod merger that the merging process is best done manually [requires updating] to implement features that have been removed/added that need to be expanded to the rest of the entities/creatures/objects in the game in raw [token] format additions/subtractions.

If this project wishes to provide that, then I would think some fancy regexp magic would need to be implemented.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 22, 2014, 05:29:45 pm
Do we really want to auto-alphabetize everything generally? Some things like gems are kinda convenient if sorted by worth. On the other hand, stockpiles are clearer when things are in alphabetical order.

Agreed: second-level tags whose order doesn't affect the game at all should be put in a dict/map to make it easy to check for changes to the same token.
Everything?  No.

For the dict/map idea, is there a reason to stop at two levels?  For most structures, it'd be sufficient to soak up all subtokens and just keep them with the parent.  Castes, however, are a major pain in the ass.  Or, they would be if Dwarves had asses (http://www.bay12forums.com/smf/index.php?topic=142739).

The only pseudo-simple solution I can think of is to treat caste declarations and caste selections as milestones within the CREATURE object.  A similar tag on the other side of such a milestone is considered a different tag rather than a duplicate one.  This will keep the caste-level tags from overwriting each other without requiring an exhaustive list of caste-level tags.

Castes are going to be somewhat exhausting since they don't need to be declared before they happen. I'm thinking the best way to do them will probably be with two passes: first, find the castes; second, go through the raw in order and for all caste-level tokens not assigned to a caste explicitly, assign it to all the castes you found earlier. (This would of course require a file defining all caste-level tokens, so we know what to do that with.) The output raws would be significantly larger than the input raws, but separating out the all-caste tokens onto each caste individually seems to me to be the  way to prevent specific-caste mods from cocking everything up.

However, I don't think we'd need to put it into the lookup. Since all the different types of RawObjects have different requirements/formats for their individual needs, I figured we'd use good ol' polymorphism to handle the differences between object formats.
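
The two-pass caste handling could be sketched like this. The CASTE_LEVEL set is a stand-in for the file defining caste-level tokens (only a handful are listed), the "DEFAULT" fallback name is invented, and SELECT_ADDITIONAL_CASTE is not handled.

```python
# Assumption: stand-in for the mapping file of caste-level tokens
CASTE_LEVEL = {"CASTE_NAME", "POP_RATIO", "BODY_SIZE", "MALE", "FEMALE"}


def expand_castes(tokens):
    """tokens is one creature's ordered list of (tag_name, args) pairs.

    Returns (creature_level_tokens, {caste_name: [caste-level tokens]}).
    """
    # Pass 1: find every caste, wherever in the creature it is declared
    castes = {args[0]: [] for name, args in tokens if name == "CASTE"}
    if not castes:
        castes = {"DEFAULT": []}  # hypothetical name for the implicit single caste
    # Pass 2: walk in order, tracking which castes are currently selected
    creature_level, selected = [], list(castes)
    for name, args in tokens:
        if name == "CASTE":
            selected = [args[0]]
        elif name == "SELECT_CASTE":
            selected = list(castes) if args[0] == "ALL" else [args[0]]
        elif name in CASTE_LEVEL:
            for caste in selected:  # unassigned tokens go to every caste
                castes[caste].append((name, args))
        else:
            creature_level.append((name, args))
    return creature_level, castes


tokens = [
    ("NAME", ["dwarf", "dwarves", "dwarven"]),
    ("BODY_SIZE", ["0", "0", "3000"]),  # before any caste: applies to all
    ("CASTE", ["FEMALE"]), ("FEMALE", []),
    ("CASTE", ["MALE"]), ("MALE", []),
    ("SELECT_CASTE", ["ALL"]), ("POP_RATIO", ["50"]),
]
creature_level, castes = expand_castes(tokens)
```

As noted, the expanded form is larger than the input, since every all-caste token is copied onto each caste.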
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 22, 2014, 07:07:49 pm
I look forward to learning enough to contribute to these discussions - my first programming course just covered lists, and while I've been reading ahead I'm not quite up to this yet :P
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Roses on August 22, 2014, 09:31:04 pm
Honestly, if I were working on this I would start with a program that just divided the raws into dictionaries that would be easy to check against each other. For example:

Voilà, now you have a dictionary that you can just compare to another dictionary, and if it has extra entries or fewer entries you can adjust your output accordingly. An advantage to this is you can also specify exactly how you want the output to look, so you could, for instance, output into a broken-up structure that is easier to read, like
Code: [Select]
[CREATURE:FANTASY_DRAGON_ANCIENT]
[NAME:ancient dragon:ancient dragons:ancient dragon]
[CREATURE_TILE:255][COLOR:7:0:1]

===== BIOME AND NUMBERS
[BIOME:ANY_LAND]
[APPLY_CREATURE_VARIATION:POP_NUMBERS:1:3:1:1:5]

===== MEGABEAST INFO
[MEGABEAST]
[DIFFICULTY:11]
[ATTACK_TRIGGER:120:50000:1000000]
[SPHERE:FIRE]
[SPHERE:EARTH]
[SPHERE:SKY]
[SPHERE:MOUNTAINS]
[SPHERE:LIGHTNING]
[SPHERE:STORMS]
[SPHERE:RULERSHIP]

===== IMMUNITIES
[APPLY_CREATURE_VARIATION:IMMUNITIES_MEGABEAST]

===== TRAITS
[APPLY_CREATURE_VARIATION:TRAITS_MEGABEAST]
[APPLY_CREATURE_VARIATION:TRAITS_COLOSSAL]

[PETVALUE:15000]
[PET_EXOTIC]
[INTELLIGENT]
[TRAINABLE]
[MOUNT]
[FLIER]
[LARGE_PREDATOR]
[BONECARN]
[ALL_ACTIVE]
[SWIMS_INNATE]
[CHILD:0]

[PREFSTRING:unfathomable power]

===== SIZE, AGE, AND SPEED
[BODY_SIZE:0:0:2000000]
[BODY_SIZE:1000:0:2000000000]
[BODY_SIZE:10000:0:2100000000]
[APPLY_CREATURE_VARIATION:STANDARD_FLYING_GAITS:900:528:352:176:1900:2900] 50 kph
[APPLY_CREATURE_VARIATION:STANDARD_QUADRUPED_GAITS:900:711:521:293:1900:2900] 30 kph
[APPLY_CREATURE_VARIATION:STANDARD_SWIMMING_GAITS:2206:1692:1178:585:3400:4900] 15 kph
[APPLY_CREATURE_VARIATION:STANDARD_CLIMBING_GAITS:2206:1692:1178:585:3400:4900] 15 kph
[APPLY_CREATURE_VARIATION:STANDARD_CRAWLING_GAITS:2206:1692:1178:585:3400:4900] 15 kph

===== SENSES
[VIEWRANGE:#] -- default 20
[LOW_LIGHT_VISION:#] -- highest 10000, default 0
[VISION_ARC:#:#]

[ODOR_LEVEL:#] -- default 50
[SMELL_TRIGGER:#] -- no smell 10000, default 50

===== CREATURE CLASSES
[CREATURE_CLASS:ELEMENTAL_WARD]
[CREATURE_CLASS:MAGIC_RESIST]
[CREATURE_CLASS:MEGABEAST]
[CREATURE_CLASS:DRAGON]
[CREATURE_CLASS:COLOSSAL]

===== LAIR AND HABIT
[LAIR:SHRINE:100]
[LAIR_HUNTER]
[HABIT_NUM:TEST_ALL]
[HABIT:GIANT_NEST:100]

===== BODY
[BODY:DRAGON:ATTACHMENT_HEAD_4HORNS:ATTACHMENT_LIMBS_SPIKESLEG:ATTACHMENT_MISC_SPIKESTAIL]
[BODYGLOSS:CLAW_FOOT]
[BODY_DETAIL_PLAN:MATERIALS_LEVEL_9]
[REMOVE_MATERIAL:HAIR]
[USE_MATERIAL_TEMPLATE:SCALE:SCALE_TEMPLATE_LEVEL_9]
[USE_MATERIAL_TEMPLATE:TALON:NAIL_TEMPLATE_LEVEL_5]
[USE_MATERIAL_TEMPLATE:HORN:HORN_TEMPLATE_LEVEL_5]
[USE_MATERIAL_TEMPLATE:SPIKE:SPIKE_TEMPLATE_LEVEL_5]
[BODY_DETAIL_PLAN:TISSUES_LEVEL_9]
[REMOVE_TISSUE:HAIR]
[USE_TISSUE_TEMPLATE:SCALE:SCALE_TEMPLATE_LEVEL_9]
[USE_TISSUE_TEMPLATE:TALON:CLAW_TEMPLATE_LEVEL_5]
[USE_TISSUE_TEMPLATE:HORN:HORN_TEMPLATE_LEVEL_5]
[USE_TISSUE_TEMPLATE:SPIKE:SPIKE_TEMPLATE_LEVEL_5]
[BODY_DETAIL_PLAN:VERTEBRATE_TISSUE_LAYERS:SCALE:FAT:MUSCLE:BONE:CARTILAGE]
[TISSUE_LAYER:BY_CATEGORY:TOE:TALON:FRONT]
[SELECT_TISSUE_LAYER:HEART:BY_CATEGORY:HEART]
[PLUS_TISSUE_LAYER:SCALE:BY_CATEGORY:THROAT]
[TL_MAJOR_ARTERIES]
[BODY_DETAIL_PLAN:STANDARD_HEAD_POSITIONS]
[BODY_DETAIL_PLAN:HUMANOID_RIBCAGE_POSITIONS]
[USE_MATERIAL_TEMPLATE:SINEW:SINEW_TEMPLATE]
[TENDONS:LOCAL_CREATURE_MAT:SINEW:200]
[LIGAMENTS:LOCAL_CREATURE_MAT:SINEW:200]
[USE_MATERIAL_TEMPLATE:BLOOD:BLOOD_TEMPLATE]
[BLOOD:LOCAL_CREATURE_MAT:BLOOD:LIQUID]
[USE_MATERIAL_TEMPLATE:PUS:PUS_TEMPLATE]
[PUS:LOCAL_CREATURE_MAT:PUS:LIQUID]
[HAS_NERVES]
[GETS_WOUND_INFECTIONS]
[GETS_INFECTIONS_FROM_ROT]
[SELECT_TISSUE:SCALE]
[RELATIVE_THICKNESS:8]
[BODY_APPEARANCE_MODIFIER:LENGTH:90:95:98:100:102:105:110]
[BODY_APPEARANCE_MODIFIER:HEIGHT:90:95:98:100:102:105:110]
[BODY_APPEARANCE_MODIFIER:BROADNESS:90:95:98:100:102:105:110]

===== ATTRIBUTES AND SKILLS
[PHYS_ATT_RANGE:STRENGTH:4000:4100:4200:4500:4800:4900:5000]
[PHYS_ATT_RANGE:AGILITY:3000:3100:3200:3500:3800:3900:4000]
[PHYS_ATT_RANGE:TOUGHNESS:3500:3600:3700:4000:4300:4400:4500]
[PHYS_ATT_RANGE:ENDURANCE:3500:3600:3700:4000:4300:4400:4500]
[PHYS_ATT_RANGE:DISEASE_RESISTANCE:4000:4100:4200:4500:4800:4900:5000]
[PHYS_ATT_RANGE:RECUPERATION:3000:3100:3200:3500:3800:3900:4000]

[MENT_ATT_RANGE:WILLPOWER:5000:5000:5000:5000:5000:5000:5000]
[MENT_ATT_RANGE:SPATIAL_SENSE:3500:3600:3700:4000:4300:4400:4500]
[MENT_ATT_RANGE:KINESTHETIC_SENSE:5000:5000:5000:5000:5000:5000:5000]
[MENT_ATT_RANGE:FOCUS:5000:5000:5000:5000:5000:5000:5000]

[NATURAL_SKILL:BITE:12]
[NATURAL_SKILL:GRASP_STRIKE:12]
[NATURAL_SKILL:STANCE_STRIKE:12]
[NATURAL_SKILL:MELEE_COMBAT:12]
[NATURAL_SKILL:SITUATIONAL_AWARENESS:12]
[NATURAL_SKILL:DISCIPLINE:12]

===== ATTACKS
[APPLY_CREATURE_VARIATION:ATTACK_BITE]
[APPLY_CREATURE_VARIATION:ATTACK_TAIL_SWIPE]
[APPLY_CREATURE_VARIATION:ATTACK_CLAW]
[APPLY_CREATURE_VARIATION:ATTACK_HORN]
[APPLY_CREATURE_VARIATION:ATTACK_SPIKE]
[APPLY_CREATURE_VARIATION:SPECIAL_ATTACK_WING_BUFFET_COLOSSAL]
[APPLY_CREATURE_VARIATION:SPECIAL_ATTACK_STOMP_COLOSSAL]
[APPLY_CREATURE_VARIATION:SPECIAL_ATTACK_TAIL_SWEEP_COLOSSAL]
[APPLY_CREATURE_VARIATION:SPECIAL_ATTACK_CLEAVE_CLAW_COLOSSAL]
[APPLY_CREATURE_VARIATION:SPECIAL_ATTACK_IMPALE_HORN_COLOSSAL]

===== CASTES
[CASTE:FEMALE_RED]
[FEMALE]
[LAYS_EGGS]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGGSHELL:SOLID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_WHITE:LIQUID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_YOLK:LIQUID]
[EGG_SIZE:400000]
[CLUTCH_SIZE:1:2]
[CASTE:MALE_RED]
[MALE]
[CASTE:FEMALE_BLUE]
[FEMALE]
[LAYS_EGGS]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGGSHELL:SOLID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_WHITE:LIQUID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_YOLK:LIQUID]
[EGG_SIZE:400000]
[CLUTCH_SIZE:1:2]
[CASTE:MALE_BLUE]
[MALE]
[CASTE:FEMALE_WHITE]
[FEMALE]
[LAYS_EGGS]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGGSHELL:SOLID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_WHITE:LIQUID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_YOLK:LIQUID]
[EGG_SIZE:400000]
[CLUTCH_SIZE:1:2]
[CASTE:MALE_WHITE]
[MALE]
[CASTE:FEMALE_BROWN]
[FEMALE]
[LAYS_EGGS]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGGSHELL:SOLID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_WHITE:LIQUID]
[EGG_MATERIAL:LOCAL_CREATURE_MAT:EGG_YOLK:LIQUID]
[EGG_SIZE:400000]
[CLUTCH_SIZE:1:2]
[CASTE:MALE_BROWN]
[MALE]

===== CASTE DETAILS
[SELECT_CASTE:FEMALE_RED]
[SELECT_ADDITIONAL_CASTE:MALE_RED]
[DESCRIPTION:Very few ancient dragons exist today. These dragons are the original members of the elder race that has gone all but extinct. Those that do still exist are the ones that were in the middle of the breach when it closed. It is said that half of the mind of these once noble creatures was left in the other realm and now they are little more than intelligent beasts who rule the skies. Red - The red dragons are masters of fire.]
[CASTE_NAME:ancient red dragon:ancient red dragons:ancient red dragon]
[FIREIMMUNE_SUPER]

===== CASTE CLASSES
[CREATURE_CLASS:FIRE4]

===== CASTE INTERACTIONS
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_FIRE_FIRE_JET]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_FIRE_FIREBALL]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_FIRE_FLAME_BLAST]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_FIRE_INCINERATE]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_FIRE_DRAGON_FIRE]

===== CASTE TISSUE LAYER GROUPS
[SET_TL_GROUP:BY_CATEGORY:ALL:SCALE]
[TL_COLOR_MODIFIER:RED:1:CRIMSON:1:CARMINE:1]
[TLCM_NOUN:scales:PLURAL]
[SET_TL_GROUP:BY_CATEGORY:EYE:EYE]
[TL_COLOR_MODIFIER:BLACK:1]
[TLCM_NOUN:eyes:PLURAL]

===== CASTE DETAILS
[SELECT_CASTE:FEMALE_BLUE]
[SELECT_ADDITIONAL_CASTE:MALE_BLUE]
[DESCRIPTION:Very few ancient dragons exist today. These dragons are the original members of the elder race that has gone all but extinct. Those that do still exist are the ones that were in the middle of the breach when it closed. It is said that half of the mind of these once noble creatures was left in the other realm and now they are little more than intelligent beasts who rule the skies. Blue - The blue dragons are masters of ice.]
[CASTE_NAME:ancient blue dragon:ancient blue dragons:ancient blue dragon]

===== CASTE CLASSES
[CREATURE_CLASS:ICE4]

===== CASTE INTERACTIONS
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_ICE_ICICLE]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_ICE_COLD_SNAP]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_ICE_RAY_OF_FROST]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_ICE_FREEZE]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_ICE_FROST_NOVA]

===== CASTE TISSUE LAYER GROUPS
[SET_TL_GROUP:BY_CATEGORY:ALL:SCALE]
[TL_COLOR_MODIFIER:AZURE:1:AQUA:1:COBALT:1:CERULEAN:1:MIDNIGHT_BLUE:1]
[TLCM_NOUN:scales:PLURAL]
[SET_TL_GROUP:BY_CATEGORY:EYE:EYE]
[TL_COLOR_MODIFIER:BLACK:1]
[TLCM_NOUN:eyes:PLURAL]

===== CASTE DETAILS
[SELECT_CASTE:FEMALE_WHITE]
[SELECT_ADDITIONAL_CASTE:MALE_WHITE]
[DESCRIPTION:Very few ancient dragons exist today. These dragons are the original members of the elder race that has gone all but extinct. Those that do still exist are the ones that were in the middle of the breach when it closed. It is said that half of the mind of these once noble creatures was left in the other realm and now they are little more than intelligent beasts who rule the skies. White - The white dragons are masters of air.]
[CASTE_NAME:ancient white dragon:ancient white dragons:ancient white dragon]

===== CASTE CLASSES
[CREATURE_CLASS:AIR4]

===== CASTE INTERACTIONS
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_AIR_LIGHTNING]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_AIR_CHAIN_LIGHTNING]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_AIR_SHOCK]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_AIR_GUST]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_AIR_CYCLONE]

===== CASTE TISSUE LAYER GROUPS
[SET_TL_GROUP:BY_CATEGORY:ALL:SCALE]
[TL_COLOR_MODIFIER:WHITE:1:IVORY:1]
[TLCM_NOUN:scales:PLURAL]
[SET_TL_GROUP:BY_CATEGORY:EYE:EYE]
[TL_COLOR_MODIFIER:BLACK:1]
[TLCM_NOUN:eyes:PLURAL]

===== CASTE DETAILS
[SELECT_CASTE:FEMALE_BROWN]
[SELECT_ADDITIONAL_CASTE:MALE_BROWN]
[DESCRIPTION:Very few ancient dragons exist today. These dragons are the original members of the elder race that has gone all but extinct. Those that do still exist are the ones that were in the middle of the breach when it closed. It is said that half of the mind of these once noble creatures was left in the other realm and now they are little more than intelligent beasts who rule the skies. Brown - The brown dragons are masters of earth.]
[CASTE_NAME:ancient brown dragon:ancient brown dragons:ancient brown dragon]

===== CASTE CLASSES
[CREATURE_CLASS:EARTH4]

===== CASTE INTERACTIONS
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_EARTH_BOULDER]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_EARTH_STALAGMITE]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_EARTH_SAND_STORM]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_EARTH_PETRIFY]
[APPLY_CREATURE_VARIATION:SPELL_ELEMENTAL_EARTH_EARTHQUAKE]

===== CASTE TISSUE LAYER GROUPS
[SET_TL_GROUP:BY_CATEGORY:ALL:SCALE]
[TL_COLOR_MODIFIER:BROWN:1:DARK_BROWN:1:LIGHT_BROWN:1]
[TLCM_NOUN:scales:PLURAL]
[SET_TL_GROUP:BY_CATEGORY:EYE:EYE]
[TL_COLOR_MODIFIER:BLACK:1]
[TLCM_NOUN:eyes:PLURAL]

===== MISC
[SELECT_CASTE:ALL]
[SELECT_MATERIAL:ALL]
[MULTIPLY_VALUE:25]
[COLDDAM_POINT:NONE]
[HEATDAM_POINT:NONE]
[IGNITE_POINT:NONE]
[IF_EXISTS_SET_MELTING_POINT:55000]
[IF_EXISTS_SET_BOILING_POINT:57000]
[SPEC_HEAT:30000]
[SELECT_MATERIAL:BLOOD]
[PLUS_MATERIAL:PUS]
[MELTING_POINT:10000]
[EXTRA_BUTCHER_OBJECT:BY_TOKEN:HEART]
[EBO_ITEM:TOOL:ITEM_TOOL_CREATURE_HEART_DRAGON_ANCIENT:NONE:NONE]
[EXTRA_BUTCHER_OBJECT:BY_TOKEN:BRAIN]
[EBO_ITEM:SMALLGEM:NONE:INORGANIC:SOUL_GEM_DRAGON_ANCIENT]
That would be my suggestion.
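
As a rough illustration of the dictionary idea in this post, here is a minimal Python sketch. It is not the actual loader: the names parse_raws and compare are made up for the example, only CASTE opens a sub-dictionary, and [SELECT_CASTE:ALL] is assumed to return to creature level, which is a simplification of how DF scopes tokens.

```python
import re

# Anything between [ and ] is treated as a token; text outside brackets is a comment.
TOKEN_RE = re.compile(r"\[([^\[\]]*)\]")

def parse_raws(text, object_type="CREATURE"):
    """Return {object_id: {"tokens": [...], "castes": {caste_name: [...]}}}.

    Hypothetical helper. Naive split on ':' means a quoted ':' tile would be
    split incorrectly; good enough for a sketch.
    """
    objects = {}
    current = None   # dictionary of the object being filled
    target = None    # list that receives the next token
    for match in TOKEN_RE.finditer(text):
        parts = match.group(1).split(":")
        name, args = parts[0], tuple(parts[1:])
        if name == object_type and args:
            current = {"tokens": [], "castes": {}}
            objects[args[0]] = current
            target = current["tokens"]
        elif current is None:
            continue  # tokens before the first object, e.g. [OBJECT:CREATURE]
        elif name == "CASTE" and args:
            # Enter (or re-enter) the sub-dictionary entry for this caste
            target = current["castes"].setdefault(args[0], [])
        elif name == "SELECT_CASTE" and args == ("ALL",):
            # Assumed exit condition: back to creature-level tokens
            target = current["tokens"]
        else:
            target.append((name,) + args)
    return objects

def compare(old, new):
    """Report which object IDs were added, removed, or changed."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(key for key in shared if old[key] != new[key]),
    }
```

Calling compare(parse_raws(vanilla_text), parse_raws(mod_text)) then gives the extra, missing, and differing entries to adjust the output by.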
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 22, 2014, 10:16:42 pm
I think the technical name for that approach is "scope creep."  This sounds like a decent avenue to pursue, but not for a first version.  False-positive rejects are fine since the intended audience is newbs who will probably only load mods pre-loaded with the Starter Pack.  That handful can have their compatibility worked out in manifest files if absolutely necessary.

Once the idea of merge-in mods takes off, more intelligent parsing will be far more valuable.  The important thing for now is to make sure we don't make design decisions that tie our hands down the road.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 22, 2014, 10:50:10 pm
Honestly if I were working on this I would start with a program that just divided the raws into dictionaries that would be easy to check against each other. For example:
  • Create object type dictionary (e.g. CREATURE)
  • Create sub dictionary for each creature (e.g. TOAD)
  • Read raws from top to bottom, placing each entry into the dictionary for specific creature (e.g. MAXAGE:70:100)
  • If you hit a certain token (like USE_MATERIAL_TEMPLATE, SELECT_TISSUE_LAYER, or, the big one, CASTE) create a sub-sub dictionary for that tag
  • Continue adding to the sub-sub dictionary until you hit a token that exits out of it (e.g. another CASTE or USE_MATERIAL_TEMPLATE etc...)
Voilà, now you have a dictionary that you can just compare to another dictionary, and if it has extra entries or fewer entries you can adjust your output accordingly. An advantage to this is that you can also specify exactly how you want the output to look, so you could, for instance, output into a broken-up structure that is easier to read.
I have nothing at all against the idea of scope creep; more advanced parsing would be great.

That said, we don't even have a working basic merge yet, so I think it's a little premature to go quite this complex.  Current priorities:
  1. Get a basic diff-based system working.  We can live with a high rejection rate for now.
  2. Finish the structure of the code, so it can be extended like this later (mostly done).
  3. GUI-ify as tab in a fork of the PyLNP.
  4. Write a function that checks raws for validity and functionality. 

I agree that we should keep things generally future-proof, but so long as all the options people are proposing take inputs in the same format I don't think it's too much of a problem - maybe more work on our end to refactor, but no change for modders or users.  Note that we're not claiming there's a 'correct' or canonical way to merge a given set of raws; given that we're not modifying pre-existing worlds this shouldn't be an issue.  So long as the input is in the format this thread specifies and the output is valid to DF, anything goes.

Just please, let's get a basic model working soon. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 22, 2014, 11:19:28 pm
Yeah, that sounds like a good list. Here are my comments on it:
2) This is really #1. We need the boilerplate to take the mod folder structure and generate the merged DF raws.
1) This is what I'm working on -- testing my merge algorithm for correctness and expanding as needed.
3) Definitely up there. If there's somebody reading this thread looking to contribute, this would be a good way to help.
4) This will take more than a function. We need sample mods that we attempt to merge, to demonstrate the working program. We don't need an exhaustive unit test for the differ algorithm here, but we need enough tests to show the usefulness and limitations of merging.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 23, 2014, 12:44:05 am
3. GUI-ify as tab in a fork of the PyLNP.
3) Definitely up there. If there's somebody reading this thread looking to contribute, this would be a good way to help.
I should do some sketches soon to show what I have in mind.  If anyone wants to get started let me know and I'll prioritise them.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 23, 2014, 04:48:32 am
I am dumb...

Yep, this was easy.
For anyone interested

Code: [Select]
BSTART = '['
BEND = ']'
SEP = ':'
STR = ASCII+
TYPE = STR
TILE = "'" STR "'"
ARG = SEP (STR | TILE)
TOKEN = BSTART TYPE ARG* BEND
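
For anyone who wants to try the grammar out, here is a minimal Python sketch of a tokenizer built on it. It takes SEP to be ':' (the separator used in the raws quoted in this thread), treats a TILE as a single quoted character, and the function name tokenize is just a placeholder. Anything outside brackets is skipped as a comment.

```python
import re

# TOKEN = '[' TYPE (':' ARG)* ']', where ARG is a quoted tile such as ':' or a plain string.
TOKEN_RE = re.compile(r"\[([^\[\]:]+)((?::(?:'[^']'|[^\[\]:]*))*)\]")
# One ARG: the separator followed by a quoted single character or a run of plain characters.
ARG_RE = re.compile(r":('[^']'|[^\[\]:]*)")

def tokenize(text):
    """Return a list of (type, [args]) pairs for every token found in text."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        token_type = match.group(1)
        args = ARG_RE.findall(match.group(2))
        tokens.append((token_type, args))
    return tokens
```

Because the tile alternative is tried first, a quoted ':' or ']' is kept as one argument instead of ending the token early. Comments that share a line with a tag are simply ignored, since only bracketed text is matched.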

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 23, 2014, 10:27:50 am
You guys will have to address merge dependencies when addressing conflicts.

The issue happened to me when I merged CivForge with 40_09.  Both CHANGED the same areas.  The program has to be smart enough to know a "change" occurred, and decide which change to go with.

I would think merely referencing the order of the mods when merging them would be enough to show precedence.  For example, if I put 40_09 in front of CivForge, I would think 40_09 changes would have precedence over CivForge if both "mods" are affecting the same thing (this is from a base of 34_11, btw).

Update: that kind of swapping of "changed" values vs removed/added values may be tricky; some sort of heuristic comparison between the two candidates to be swapped in has to be done.

It could get very, very tricky with things like merging CivForge and the accelerated mod.  I've had kdiff3 decide that two lines were the same line because most of their text was the same.

Anyways, google's diffmatchpatch might come in handy for that. 

https://code.google.com/p/google-diff-match-patch/downloads/list

They may have a Python version; I'd be surprised if they didn't.  It might help identify those one-liners.

I figure it could be done with raw structure mapping, or: when a line is deleted and replaced, Google's diff-match-patch is run on the replacement lines to see if there's a match.
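
To make the last idea concrete, here is a rough Python sketch. Note that it swaps in difflib from the standard library rather than Google's diff-match-patch, and that the function name pair_replaced_lines and the 0.8 threshold are made up for the example. It pairs each deleted line with the most similar added line, so a "changed" value can be told apart from an unrelated removal plus addition.

```python
import difflib

def pair_replaced_lines(removed, added, threshold=0.8):
    """Pair each removed line with its most similar added line.

    Returns (pairs, unmatched_added). A removed line with no added line at or
    above the threshold is treated as a plain deletion and left out of pairs.
    """
    pairs = []
    unmatched = list(added)
    for old in removed:
        best, best_ratio = None, threshold
        for new in unmatched:
            # ratio() is 1.0 for identical strings and 0.0 for nothing in common
            ratio = difflib.SequenceMatcher(None, old, new).ratio()
            if ratio >= best_ratio:
                best, best_ratio = new, ratio
        if best is not None:
            unmatched.remove(best)
            pairs.append((old, best))
    return pairs, unmatched
```

The threshold is exactly where the kdiff3 problem shows up: set it too low and two different tokens that share most of their text get treated as one edited line, so it would need tuning against real mods.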
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: King Mir on August 23, 2014, 02:34:35 pm
So one of the goals of this is that mods that conflict are marked as conflicting, not silently resolved to favor one. Favoring one may be suitable for some merging applications, like making your own mod bundle, but for someone trying out a batch of mods, it's a good thing if the order the mods are loaded in doesn't matter.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 23, 2014, 09:16:20 pm
I think the merger should raise a warning if there is any overlap, but allow the user to proceed with some simple tiebreaker rules (either a priority list, or a load order with last-in-wins).

The (optional) manifest file can silently OK some of these collisions.

The load order can also be important if we end up with modular mods that require a core be in place before the module is applied.  I have a very specific example in mind, a mod I'm working on that is supposed to be adjusted based on the graphic pack loaded.  Right now my plan is to upload it configured for Phoebus (the Starter Pack's default) and instruct the user thru simple search-replaces for popular graphics.  I'd rather have a core mod tuned for vanilla with modmods for CLA, Ironhand, Mayday, Phoebus and Spacefox.

All of the modmods would only change a single file, one that is added by the core mod.  This sounds like something that should be achievable via a manifest file (and even without one if I publish them as distinct mods).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 23, 2014, 09:34:21 pm
A merge conflict - i.e. something couldn't be loaded - is marked as a conflict, as are all later mods.  This is a red/abort scenario.

However, if a merge over the top of a previous merge is possible, we do it - that's the point of the merge instructions being ordered.  Because this is in general likely to be broken, it's an orange/warning.  However, the manifest should eventually have a list of known-compatible parent mods, to help in this exact scenario.  Until we have a working merger this is all pretty academic though!
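
Academic or not, the colour scheme can be sketched in a few lines of Python. The red/orange statuses are from the post above; everything else is an assumption made for the example: the "green" status, the function names, working on a single file as a list of lines, and the rule that a mod is orange when it lands on top of an earlier mod without touching the same vanilla lines and red when it does touch them.

```python
import difflib

def changed_ranges(vanilla, modded):
    """Return (start, end) ranges of vanilla line indices that the mod alters."""
    matcher = difflib.SequenceMatcher(None, vanilla, modded, autojunk=False)
    return [(i1, i2) for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]

def overlaps(a, b):
    # Conservative check: ranges that merely touch also count as overlapping,
    # so an insertion next to someone else's edit is flagged too.
    return a[0] <= b[1] and b[0] <= a[1]

def classify(vanilla, mods):
    """mods is an ordered list of (name, lines); returns {name: status}."""
    statuses = {}
    touched = []     # vanilla ranges already altered by earlier mods
    failed = False   # once one mod conflicts, all later mods are red as well
    for name, lines in mods:
        ranges = changed_ranges(vanilla, lines)
        if failed or any(overlaps(r, t) for r in ranges for t in touched):
            statuses[name] = "red"
            failed = True
        elif touched and ranges:
            statuses[name] = "orange"   # merged over the top of an earlier mod
            touched.extend(ranges)
        else:
            statuses[name] = "green"
            touched.extend(ranges)
    return statuses
```

A manifest listing known-compatible parent mods could later downgrade specific orange results, but that part is not shown here.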
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 25, 2014, 05:09:14 am
So I haven't read all the comments since Friday, because it is so late it's early, but I just remembered I had yet to upload this, so here's the mod loader, guys: http://dffd.wimbli.com/file.php?id=9509

If you just run it off the bat it'll give errors on the plant files; the raw files need to be flattened out first, because it gets confused by comments being on the same line as a tag. It's just the load-in, not the write-out yet, but the write-out should be fairly obvious once you take a look at it.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 25, 2014, 05:47:56 am
Nice!  Unfortunately it doesn't seem to like the plant_crops, plant_garden, or plant_new_trees files - I got eleven thousand lines of complaining about them! :o  Should be easy enough to fix, I imagine, and I look forward to seeing a full merge based on this (or any other) concept. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 25, 2014, 10:54:59 am
It doesn't like them even after flattening? I thought the problem with them was - and I tested this by fixing it manually in plant_crops but it was late and I didn't feel like figuring out the flattener >> - is that they have comments on the same line as their [PLANT:whatever] tags and the parser isn't expecting that.

But if it throws all those errors even with flattening I'll have to take another look at that.



I'm planning to integrate the loader with the raw flattener, and then get it writing output, later this week. Then I'll set it up to pull in from other mods.

The first pass will probably just pull in new objects, and do a string-based diff/merge on the whole object body of each changed object. The framework is there for better tag handling but I get the feeling we want to have a working alpha asap.
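
For anyone wondering what "flattening" means in practice, here is a minimal sketch, not the flattener or the sed batch file mentioned in this thread. It assumes that anything between square brackets is a token, that everything outside brackets is a comment, and that the first line of an objects file is the bare file name and should be kept. The name flatten_raw is made up for illustration.

```python
import re

# Assumption: a token is anything between one pair of square brackets.
TOKEN_RE = re.compile(r"\[[^\[\]]*\]")

def flatten_raw(text):
    """Return the raw text with one token per line and comments dropped."""
    lines = text.splitlines()
    header = []
    # Assumption: a first line without brackets is the file-name header.
    if lines and lines[0].strip() and "[" not in lines[0]:
        header = [lines[0].strip()]
        lines = lines[1:]
    tokens = TOKEN_RE.findall("\n".join(lines))
    return "\n".join(header + tokens) + "\n"
```

This is only meant for the objects .txt files. It throws comments away entirely, so it would not be safe on files where the text outside the brackets matters.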
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 25, 2014, 11:56:06 am
I don't know how you're flattening them, but the last batch file I posted does it quite nicely using sed and leaves no comments on the same line as tokens...

http://www.bay12forums.com/smf/index.php?topic=142295.msg5595688#msg5595688

Alternatively I have some C logic I could offer to parse tokens... But I don't think that's needed.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 26, 2014, 04:28:34 am
I already wrote a basic raw parser with pyparsing.
I will post it here later.
For now I don't have access to my Linux machine.

UPDATE:

Ok. Here it goes :)
Link to gist (https://gist.github.com/Demagogue/af2c31d75a2bb390bf71)

The grammar looks like this:
Code: [Select]
from pyparsing import Word, ZeroOrMore, Suppress, Regex
from pyparsing import alphas, alphanums

# Brackets and colons delimit a token but are not part of its value.
lbracket = Suppress('[')
rbracket = Suppress(']')
# A tile argument is a single character in single quotes, e.g. 'd'.
tile = Regex("'.'")
# Underscores are allowed because tag names such as CREATURE_TILE contain them.
arg = Suppress(':') + (Word(alphanums + '_') ^ tile)
token = lbracket + Word(alphas + '_') + ZeroOrMore(arg) + rbracket


UPDATE2:

I am a moron.
The script had some problems handling comments and names with _ inside.
Now everything is working correctly.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 26, 2014, 09:02:21 am
I see your python script... And fold.

:)

Maybe Button is having an issue with parsing files where tokens and comments NEED to be on the same line. Such as the text subfolder and some items in the data subfolder?

Idk, but only the objects\*.txt files should be reformatted; reformatting other files by moving/deleting their comments may break the way the game reads the line.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 26, 2014, 09:08:25 am
Yeah, it looks like we should skip any flattening of the files in the text folder, since they are all one-entry-per-line already and the "comments" are actually literal text.  But standard diffs should be able to handle those files once they survive the pre-processing stage.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 26, 2014, 09:37:53 am
Maybe button is having an issue w parsing files where tokens and comments NEED to be on the same line. Such as the text subfolder and some items in the data subfolder?

No, I'm having an issue where work is preventing me from spending as much time on this as I'd like :P. That's why I didn't flatten the raws myself - had to get to sleep, had to get to work, and then yesterday work became the devourer of souls.

I'm just parsing .txt files from objects, and PeridexisErrant said he ran into problems in plants, which was something I believed would only happen if the raw files weren't flattened before running. So I asked him if he flattened them/if the plants problem happened even after flattening, but he hasn't replied about that yet.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on August 26, 2014, 11:50:22 am
Sorry I didn't tell you guys: my code still needs (probably) flattened raws.
I will post my script with everything in.

BTW the ./text/*.txt files should be treated differently, because they are completely different from standard raws.
In those files the tokens work as variables.
In the rest of the raws, they create a visible structure.
My approach works only on the normal raws, and removes the comments :(
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 26, 2014, 12:44:17 pm
There are also text files in the data subfolders that have tokens, I believe, and that are read line by line rather than by token.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 26, 2014, 05:45:27 pm
No flattening, I just ran the code as it came.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 26, 2014, 06:02:31 pm
No flattening, I just ran the code as it came.

OK, then the plant errors were expected. :) I did say you'd get plant errors if you ran without flattening first :P
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on August 28, 2014, 09:10:25 am
I hope I haven't killed this thread/discouraged anybody from working on it. As you may have noticed I don't have a ton of time to work on this myself so please don't think that I've claimed the project or anything.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 28, 2014, 06:26:35 pm
I know everyone is talking about automation and the best script methods to merge mods together, but why not have one person do it by hand? I have done it dozens of times, and its quite easy if you take minor mods and vanilla DF as base. Since you say that you dont even want to try MDF + Genesis + LfR or something similar, it might be that the entire process of writing the code to merge mods is much more work than merging them by hand.

I mean... this thread is over 2 weeks old. In that time I could have merged 100 minor mods into a modpack, using simple on/off toggles on the [OBJECT:X] line at the top, and using specific filenames, with only the current visual-basics based LNP/Starter Pack GUI as base. And I dont have coding knowledge, I would copy+paste the AQUIFER button you have for this.

I dont mean to intrude, but I have the feeling that you are overcomplicating stuff. Although I have to add that new additions would require recompiling the launcher. Splinterz would probably find a way around that, even using the system I described, by using a dropdown menu and an external txt that modders could add their mods to, he did the same for MDF.

Suggestion: Add a new tab to your normal DF Starter Pack, with 10 very popular minor mods, and wait for peoples feedback? Thats what I would do.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Putnam on August 28, 2014, 06:35:21 pm
Your way is ugly. I mean, that way lies madness, which is to say that way is why I was so confused with Masterwork in general.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 28, 2014, 06:46:23 pm
Yes its ugly. And primitive. But you know that only because you have modding/programming background knowledge. Users dont care.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Putnam on August 28, 2014, 06:54:08 pm
Users don't care either way; modders/programmers seem to prefer something more compartmentalized. A system to merge mods only needs to be made once while manually merging needs to be done many times. I'd say it's worth it.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 28, 2014, 08:07:23 pm
I hope I haven't killed this thread/discouraged anybody from working on it. As you may have noticed I don't have a ton of time to work on this myself so please don't think that I've claimed the project or anything.
"I don't have a ton of time to work on this" basically sums up my position, except I'm also bumping up against the limits of my programming ability.  My sneaky plan is to wait until my semester course covers something relevant, and then retry - but if someone else finishes first I'm not going to be sad!

I know everyone is talking about automation and the best script methods to merge mods together, but why not have one person do it by hand? I have done it dozens of times, and its quite easy if you take minor mods and vanilla DF as base. Since you say that you don't even want to try MDF + Genesis + LfR or something similar, it might be that the entire process of writing the code to merge mods is much more work than merging them by hand.

I mean... this thread is over 2 weeks old. In that time I could have merged 100 minor mods into a modpack, using simple on/off toggles on the [OBJECT:X] line at the top, and using specific filenames, with only the current visual-basics based LNP/Starter Pack GUI as base. And I dont have coding knowledge, I would copy+paste the AQUIFER button you have for this.

I dont mean to intrude, but I have the feeling that you are overcomplicating stuff. Although I have to add that new additions would require recompiling the launcher. Splinterz would probably find a way around that, even using the system I described, by using a dropdown menu and an external txt that modders could add their mods to, he did the same for MDF.

Suggestion: Add a new tab to your normal DF Starter Pack, with 10 very popular minor mods, and wait for peoples feedback? That's what I would do.

The medium-term goal is to add the mod loader as a new tab in the PyLNP, which I'm switching over to soon.  That will mean that the format is a standard available to new players on whatever OS, which should help adoption from users and modders.  Recompiling isn't much of an issue once the source code has been written.  For a while it might be a binary choice between mods and the ability to change graphics, but I'm hoping we can solve that eventually. 

You're right that a simple mod pack could have been assembled in less time, but the goal of this isn't just to make a mod pack available - we already have those.  I want to help new players get into mods, and that means they need to be able to add new mods to the pack as easily as anyone can add utilities or graphics to the LNP.  It's likely that we'll compile a basic content set for this - making some popular small mods as minimal as possible to reduce conflicts and so on - but that's not meant to be a core part of the program, which should be flexible enough to use whatever content is fed to it without having to be rebuilt each time. 

We may well be overcomplicating things, but I'm feeling ambitious about the long term. It's important to note that I've designed something that works at a basic level for the merging, which we can improve the logic of later.  Unfortunately I can't work out how to apply a unified diff with Python, or I'd have had a working implementation a week ago...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 28, 2014, 08:39:02 pm
Ah, so you want a random person who has this Mod Pack to download any mod he likes from DFFD, drag+drop it into some folder, and have the Mod Pack automatically add it to its list of available modules that it installs/deinstalls. Thats a pretty neat idea. :)

Only downside I see is that it requires more initiative than I attribute to the average currently-non-mod-using player. It would still require an active mindset of looking through the forum, seeing a mod they like by finding it in the modding board and reading the description, going to DFFD, download it, unpack it, drag+drop it into the Mod Loader. Reminds me of Nexus Mods, Oblivion or Skyrim. ^^

And cross-compatibility for Linux/Mac is a big plus of course. :)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 28, 2014, 09:24:46 pm
Yes its ugly. And primitive. But you know that only because you have modding/programming background knowledge. Users dont care.

I think Meph's way is kind of an intelligent post processing of raws.

I recommended a patch file that injected a post processor ###Flag### [based on how Meph has his various flagged raws load] a while back in this thread.

I also do think we are over-complicating what could be accomplished with just patching and scripting.  However, each mod would be static in such a scenario.

Trying to build a "merge" of mods... I'm not sure how Meph would go about resolving that with manual edits.  It would involve a method to track the merging of various mods and their effect on multiple tokens...  unless somehow that system already incorporated some sort of "classification" system for the ###Flags.

However, the only way to do it without "coding" is to do it by hand; which takes away half the fun imo.

Whatever you derive by hand, at least the batch file I derived for processing raws would keep your ###Tokens intact.

On further thought.

Your mod loader has code that parses tokens based on these "###Token's###"... what language is it in?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Meph on August 28, 2014, 09:27:23 pm
It's Visual Basic. It literally just goes through text files, looks for YES_SOMETHING[ and replaces it with NO_SOMETHING!, removing the [ , which makes DF not accept the tag as correct. It requires the .NET framework and only works on Windows, so it's not usable for what Peridexis and co. are talking about here. ;)
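
To show how little is involved, here is a rough Python sketch of the same search-and-replace idea. It is not the Masterwork launcher's code. The function name set_toggle and the HARDFARM flag in the usage note are invented for illustration; only the YES_...[ and NO_...! markers come from the description above.

```python
def set_toggle(text, name, enabled):
    """Switch every tag guarded by the given toggle name on or off.

    "YES_<name>[" leaves a valid opening bracket, so DF reads the tag.
    "NO_<name>!" removes the bracket, so DF ignores the tag.
    """
    on_marker = "YES_" + name + "["
    off_marker = "NO_" + name + "!"
    if enabled:
        return text.replace(off_marker, on_marker)
    return text.replace(on_marker, off_marker)
```

Calling set_toggle(raw_text, "HARDFARM", False) would turn YES_HARDFARM[GROWDUR:900] into NO_HARDFARM!GROWDUR:900], and calling it again with True restores the original line. Being plain string replacement, it runs anywhere Python does, but it only works on raws that were prepared with these markers in advance.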
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 28, 2014, 09:30:07 pm
Ah, so you want a random person who has this Mod Pack to download any mod he likes from DFFD, drag+drop it into some folder, and have the Mod Pack automatically add it to its list of available modules that it installs/deinstalls. Thats a pretty neat idea. :)

Only downside I see is that it requires more initiative than I attribute to the average currently-non-mod-using player. It would still require an active mindset of looking through the forum, seeing a mod they like by finding it in the modding board and reading the description, going to DFFD, download it, unpack it, drag+drop it into the Mod Loader. Reminds me of Nexus Mods, Oblivion or Skyrim. ^^

And cross compability for linux/mac is a big plus of course. :)
Yep, that's it exactly   :D

We can't make it entirely painless, but any improvement is good and this could be pretty big.  Many players will never go past the defaults, but if ten percent decide to add something that's still a serious influx of mod-users for the currently-neglected small mods.  The easy-add feature also makes it easier for whoever maintains and distributes the pack. 

My concept for the way load order works is uh, very similar, to the elder scrolls model - user chooses order, gets feedback on compatibility, retry until it works or you realise it's not going to work at all.  Early implementations should err on the side of rejecting merges that might be OK, but it should get smarter over time.  One important longer-term goal - which should be useful more generally - is to eventually parse and assess the internal state of the raws; and notify the user of problems with broken tags, references to nothing, missing dependencies, broken templates, and so on.  Like the rest, this will probably start very primitive and get upgraded over time. 

The search-and-replace would be easy to implement on a pre-compiled set of raws for however many mods (though tedious to do very many), but I can't see a robust way to add new content in the drag-and-drop way we're targeting. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 28, 2014, 09:45:46 pm
the 2 weeks of conversation is due to

Quote
My concept for the way load order works is uh, very similar, to the elder scrolls model - user chooses order, gets feedback on compatibility, retry until it works or you realise it's not going to work at all

you're talking about a mod tool for another game in who knows what language that can't just be "ported" over to DF.

So... I would propose something simple first.

v1.0
Static Mods (scripts, parsed raws [either python parsed or batch/shell parsed], and patch files)

Then...

v2.0
Do a mod merger (maybe using a system similar to Meph's search/replace concept; mods that don't merge though, would have to be figured out somehow so the proper flags wouldn't conflict)

Then maybe an

v3.0
advanced object merger.

We're waiting for a mansion when all we need is a shack atm.

We literally have ALL the tools to create a batch based mod system.  User clicks on a batch file or linux shell script, then gets a bunch of text prompts asking what mod he wants to load.  Then the mod is applied, just before the game is started from the gui or batch file.

We could EVEN HOST A GITHUB repository of all the mods so a user can go and download them :)

Hell, we could use github to create our PATCH files for us.

BTW, this is just a proof of concept, but it could easily be done rather than having a user have to hunt mods down and parse and merge.

However, in the meantime, I'd propose just including a few mods, say 5 to 10 in an initial release, and work on a repo where users can upload their mods.  Hell, the theory is easy enough for anyone to implement.  The batch file could simply list a subfolder's dir contents where the mods are stored... and parse from there.  The batch file would automagically parse and patch the mod compared to a base vanilla version, which is entirely possible using diff3.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 28, 2014, 10:54:02 pm
I was going to suggest something similar for a repository, with a directory of download locations hosted on DFFD or something, with a fallback to its last-downloaded copy on the local machine.  There is a DFHack scripts collection that uses some wiki tool to make a human-readable directory, but that might be overkill.  All we need is some XML that has the mod title, version number, author, two-sentence blurb, and URL.
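
As a rough illustration of how small that directory could be, here is a sketch that reads such a file with Python's standard library. The element names (mods, mod, title, version, author, blurb, url) and the sample entry are assumptions made up for this example, not an agreed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory file; every value here is a placeholder.
SAMPLE = """<mods>
  <mod>
    <title>Example Mod</title>
    <version>1.0</version>
    <author>someone</author>
    <blurb>Adds one thing. Changes nothing else.</blurb>
    <url>http://example.invalid/example_mod.zip</url>
  </mod>
</mods>"""

FIELDS = ("title", "version", "author", "blurb", "url")

def read_directory(xml_text):
    """Return one dict per <mod> entry, with missing fields as empty strings."""
    root = ET.fromstring(xml_text)
    return [
        {field: (mod.findtext(field) or "").strip() for field in FIELDS}
        for mod in root.iter("mod")
    ]
```

A launcher could call read_directory on the downloaded file and fall back to its last saved copy if the download fails.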

The mod tool may end up using a non-standard diff format, but an important design goal is to let the launcher accept raws as input (either making its own diffs, or merging things on the fly) so that a mod doesn't need to be specifically prepared to work with the launcher.  If we want launcher-prepped mods in a special format, we might as well go with Rubble because it already exists and it's kinda awesome.

When the user pastes in a URL for something not in the directory, it should suck the mod in (whether in raw or diff format) and treat it like anything else.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 28, 2014, 11:03:54 pm
I've already written a simple console-based version in Python (3.4).  It even goes through and creates a unified diff file for each file as it goes, and it would be easy to add a section that called the unix patch utility each time.  Unfortunately that doesn't come on windows, and I don't want to pursue something that might never work better.  Here it is:

https://github.com/PeridexisErrant/Py-Mod-Loader

(http://i.imgur.com/u70xWzo.png)

Rubble is indeed awesome, and does the whole "special input == awesome output" thing much better than we're able to.  A far-future stretch goal might be an option to switch over to rubble... after we put a roof on our modest house. 

Hosting is, by design, wherever mods are found.  I imagine that this will come with a decent set to start with though!
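
For reference, creating a unified diff needs nothing beyond the standard library. The following is a minimal sketch of that step, not the code from the repository above; the function name make_patch and the vanilla/ and mod/ path prefixes are assumptions for illustration.

```python
import difflib

def make_patch(vanilla_text, modded_text, rel_path):
    """Return a unified diff from the vanilla file to the modded file."""
    diff = difflib.unified_diff(
        vanilla_text.splitlines(keepends=True),
        modded_text.splitlines(keepends=True),
        fromfile="vanilla/" + rel_path,
        tofile="mod/" + rel_path,
    )
    # An empty string means the two files are identical.
    return "".join(diff)
```

The standard library has no matching function for applying a unified diff (difflib.restore only works on ndiff output), which is why the applying half still needs either an external patch tool or a hand-written applier.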
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 29, 2014, 08:49:18 am
Note:

The fear of a tool "never" getting better should never stop one from implementing it.  Things can be proof of concept until we get better features implemented.  A python based gui could be in order next.

I'm not 100% sure how your python script works.

I would imagine the concept would do 1 of two things.

A 'ModTool' (whatever name) which scans two folders:

One for dropped-in raws, say

\ImportedMods

and one for already created patch files, and maybe a matching zip of binary files that need to be included
\PatchFiles

Inside \ImportedMods\ would be:
  .\Base\ (Based on Vanilla)

Inside \PatchFiles\ would be
  at least 5 patch files for 5 different mods.

A User can drop a mod subfolder into \ImportedMods\ as
  .\<NameOfMod>\
subfolders that would [need to] be copied in would be data and raw.

When the script starts, it lists the choices for what to load based on the contents of these two subfolders.  So... when running, a user would choose 1 of two options:

1. Apply patch (Load Mod)
2. Process mod[ s] (Create Patch[es])

Option1:

Fairly straightforward.  Lists the contents of \PatchFiles\, lets a user pick a patch file and applies it against \ImportedMods\Base\, and outputs the results into a [different] folder which is basically the contents of Dwarf Fortress Vanilla but with empty subfolders for data and raw, which are to be populated by the patch output.

Option2:

a. Flattens all \ImportedMods\*\Raw\Objects\*.txt

b. and we do a diff comparison between (including all subfolders, missing files between base and <NameOfMod> have blank base placeholders created)

  \ImportedMods\Base\data\*.*
  \ImportedMods\Base\raw\*.*
and
  \ImportedMods\<NameOfMod>\data\*.*
  \ImportedMods\<NameOfMod>\raw\*.*

either as a single <NameOfMod>.patch patch file or <NameOfMod>data.patch and <NameOfMod>raw.patch set of patch files.  Also, the patch file(s) ignore binary files; instead, binary files that differ are saved into a matching zip file.

and dump the patch file (with the matching zip of binary files that were different) into

\PatchFiles\

The one exception to merges for now should be "tileset" patching on top of mods, similar to how the Linux Java-based LNP works.  However, we could simply do a diff3 comparison using a common ancestor.  There hopefully won't be any conflicts, and a matching diff3 output file can be created using the concept.

The beauty of this system is that it leaves the complicated [possibly broken] merging of mods up to mod mergers who know how to hand implement merged mods.  As we know, it's not a pretty process.  That way when a mod is intelligently merged, we have a format for them to create a patch file from it.  That patch file can BE PASSED AROUND.
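
Step b of Option 2 could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name classify_changes is invented, a file missing from Base is treated as if Base held an empty placeholder, and "binary" is guessed from the presence of a NUL byte, which is a heuristic and not a reliable test.

```python
import os

def classify_changes(base_dir, mod_dir):
    """Return (text_changed, binary_changed) lists of paths relative to mod_dir."""
    text_changed, binary_changed = [], []
    for folder, _dirs, files in os.walk(mod_dir):
        for name in files:
            mod_path = os.path.join(folder, name)
            rel = os.path.relpath(mod_path, mod_dir)
            base_path = os.path.join(base_dir, rel)
            with open(mod_path, "rb") as handle:
                mod_bytes = handle.read()
            in_base = os.path.exists(base_path)
            base_bytes = b""  # blank placeholder for files Base lacks
            if in_base:
                with open(base_path, "rb") as handle:
                    base_bytes = handle.read()
            if in_base and mod_bytes == base_bytes:
                continue  # unchanged file, nothing to record
            # Heuristic assumption: a NUL byte means the file is binary.
            if b"\0" in mod_bytes or b"\0" in base_bytes:
                binary_changed.append(rel)
            else:
                text_changed.append(rel)
    return sorted(text_changed), sorted(binary_changed)
```

The text list is what would be diffed into the patch file, and the binary list is what would go into the matching zip.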
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 29, 2014, 09:57:35 am
So thistleknot, if I read that correctly, you'd like the launcher to apply one and only one patch?

I'd prefer PE's idea of specifying the order and throwing warnings about potential conflicts, probably using some version of diff3.  That way someone who wants to use Modest Accelerated Mod and Mushroom Kingdom doesn't need to hunt around for a patch that pre-mixes the two.

That said, I have three future-proofing suggestions:

1. Treat graphics packs like any other mod.  In fact maybe define arbitrary roots for the mod's folders so that things like Stonesense and Soundsense content can come along for the ride.

2. Allow the optional manifest file to silently okay some apparent conflicts and explicitly reject others.  These should be version-number specific.  This allows a mod to have sub-mods that depend on the core.

3. Try to encourage PE's idea of modular mods.  This will reduce (though not eliminate) collisions.  That entity_default.txt file is going to be prodded a lot.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 29, 2014, 10:06:22 am
I'm merely suggesting if we're having a hold up based on what is ready vs not ready.  Currently, no one is 100% onboard with the way diff3 does merging.  It can drop lines that are meant to be merged together.

So, since that requires more advanced merging techniques, I was proposing in lieu of a fully working solution, to merely keep diff3 to applying tilesets atm and we only load 1 base mod.

I do see one issue with tilesets, and that's when we have mods that bring in new creatures, plants, etc... we can't rely on a universal tileset applier for all mods... (for the most part it might work though).

As to modularity:

I think if modders keep their changes to new_files vs changing vanilla files.  It would help with merging mods...  However, a lot of mods mod objects in vanilla files... so unless the object was removed and put into an external file to be modded...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 29, 2014, 10:21:21 am
Argh, didn't mean to imply we shouldn't go forward with a one-mod-only version to start.  I just don't see that as a viable end-goal.

If two mods try to change a single line (such as the grazing stat for a specific creature), it ought to end up with a last-in-wins scenario or at least an error.  Dropping lines would be a bug in diff3 and they should probably hear about it.

Some mods add stuff, and those should be in creature_mymod.txt and inorganic_mymod.txt and so on.  Some mods explicitly fiddle with vanilla stuff (such as the thirty-seven or so variations on hard farming), and those should change the tags they need to change without trying to tidy up the vanilla stuff around them.  It's unlikely you'd get two hard farming mods to co-exist, but you should at least be able to get a hard farming mod to work with an arbitrary graphics pack.

If a modder believes that his or her change to a vanilla object needs to be mutually exclusive of any other modding, then the mod should replace the vanilla entry with a comment and recreate the object in a new file.  That will give the patch loader a tummy ache and prevent the incompatible mixing.
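
The last-in-wins-or-error behaviour for two mods changing the same line can be sketched at the tag level like this. It is a toy model, not diff3: each mod is assumed to be a dict holding only the (object, tag) pairs it changes, deletions are not modelled, and the function name merge_values and the sample values are made up.

```python
def merge_values(base, first, second, last_in_wins=True):
    """Three-way merge of tag values; returns (merged, conflicts)."""
    merged = dict(base)
    conflicts = []
    for key in sorted(set(first) | set(second)):
        vanilla = base.get(key)
        a = first.get(key, vanilla)   # untouched keys fall back to vanilla
        b = second.get(key, vanilla)
        if a == vanilla:
            value = b                 # only the second mod changed it
        elif b == vanilla or a == b:
            value = a                 # only the first changed it, or both agree
        else:
            conflicts.append(key)     # both changed it, to different values
            if not last_in_wins:
                continue              # strict mode: keep the vanilla value
            value = b
        merged[key] = value
    return merged, conflicts
```

With vanilla GRAZER at 100, one mod setting it to 50 and a later mod setting it to 200, the merge reports the key as a conflict and, in last-in-wins mode, keeps 200. Changes that only one of the mods makes go through silently.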
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 29, 2014, 12:22:05 pm
Argh, didn't mean to imply we shouldn't go forward with a one-mod-only version to start.  I just don't see that as a viable end-goal.


I didn't intend it to be an end-goal, but rather a start

I know I said,
Quote
"I was proposing in lieu of a fully working solution, to merely keep diff3 to applying tilesets atm and we only load 1 base mod."


but... I didn't mean that be our "final" product; but, rather, due to issues with diff3 (for example) and other parsing methods that are preventing us from building a merger from the start... just do a simple patching system at first, then build from that.

Btw your idea of using blank placeholders for heavy diff files is similar in concept to if a file does not exist in vanilla.  How would one code that, like, right now.  Is that possible without having to implement ###comments... how would that be handled/flagged appropriately?

I think another cause for issue is the fact that some of us have ideas in one language (primarily myself in batch scripting and command lines) and others are using Python.  I'm all for Python; however, the py is weak in this one ;)  It does have the added effect that while we discuss the possible "end-case" scenarios on the thread... any one of us could have completed a simple batch solution in our own right that could have served as a base 1.  I do think if Python is more versatile, we should go that route vs shell scripting.  Just ensure that the end client dependencies are met when running Python (without conflict), then go for it (DFHack does it somehow).
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 29, 2014, 12:50:24 pm
Argh, didn't mean to imply we shouldn't go forward with a one-mod-only version to start.  I just don't see that as a viable end-goal.

I didn't intend it to be an end-goal, but rather a start

Okay, we got that miscommunication out of the way.  On to business.

Btw your idea of using blank placeholders for heavy diff files is similar in concept to if a file does not exist in vanilla.  How would one code that, like, right now.  Is that possible without having to implement ###comments... how would that be handled/flagged appropriately?

Suppose someone heavily modifies the Elven entity in their mod, like adding a dozen nobles and stylings while tweaking just about everything else.  The odds of that successfully merging with even a modest change in another mod (such as progress triggers) are close to nil.  So the idea would be for the modded file to have a comment in its entity_default.txt along the lines of

Note: The ENTITY:FOREST object has been moved to entity_woodstock.txt by Woodstock Mod v1.08.

and delete every line of the original entity.  This will cause a conflict with any other attempt to modify the Elves' entity (except someone trying to delete it completely).  It doesn't need any special logic in the launcher, just some advice for modders who want their stuff to work better with the launcher.  And it doesn't prevent someone else from modifying a different vanilla entity, either.

I think another cause for issue is the fact that some of us have ideas in one language (primarily myself in batch scripting and command lines) and others are using Python.  I'm all for Python; however, the py is weak in this one ;)  It does have the added effect that while we discuss the possible "end-case" scenarios on the thread... any one of us could have completed a simple batch solution in our own right that could have served as a base 1.  I do think if Python is more versatile, we should go that route vs shell scripting.  Just ensure that the end client dependencies are met when running Python (without conflict), then go for it (DFHack does it somehow).

Shell scripting is going to be difficult or impossible to keep cross-platform, so I would lean toward Python.  Or at least I would if I knew how to script in Python.  As it is I just point over to where the Python lives and encourage others to go over there and build a shining city on the hill.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 29, 2014, 07:19:29 pm
Note:

The fear of a tool "never" getting better should never stop one from implementing it.  Things can be proof of concept until we get better features implemented.  A python based gui could be in order next.

I'm not 100% sure how your python script works.

I would imagine the concept would do 1 of two things.

A 'ModTool' (whatever name) which scans two folders

One for dropped-in raws, say

\ImportedMods

and one for already created patch files and maybe a matching zip of binary files that need to be included
\PatchFiles

<snip>

The one exception to merges for atm should be "tileset" patching on top of mods.  Similar to how the linux java based lnp works.  However, we could simply do a diff3 comparison using a common ancestor.  There hopefully won't be any conflicts, and a matching diff3 output file can be created using the concept.

The beauty of this system, is it leaves the complicated [possibly broken] merging of mods up to mod mergers who know how to hand implement merged mods.  As we know, it's not a pretty process.  That way when a mod is intelligently merged, we have a format for them to create a patch file from it.  That patch file can BE PASSED AROUND

I object to the idea of distributing the mods as diffs, because the whole point of this thread is that we have a common input format that any tool or version of a tool can use, which won't change with improved logic.  This means that any patch files should be derived each time the tool is run from the basic format.  Graphics are currently beyond scope - we can handle those once the basic thing is working. 

How my script works:

Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 29, 2014, 10:19:38 pm
I thought we had agreed that diffs were the way to go because of the potential of 3-way merges using common ancestors, which addresses the concern:

Quote
because the whole point of this thread is that we have a common input format that any tool or version of a tool can use, which won't change with improved logic

However... it's this "merging" that is preventing a "base" version.  What you're asking for at this point is something that is NOT a diff comparison and is some sort of object state tracking.

The concept of "diff"s merely means you can reconstruct any mod using patches from a base version.

THEN FROM THERE you can do advanced merging by rebuilding mods from patches (all from a base version), and doing some fancy 3-way merging (using vanilla as base) from these rebuilt mods.

However, my point was... that we don't have anything.

However, since you have said you object to the diff approach, then I will not push it any further.

However, don't be upset if someone comes along and posts a diff patch solution in the meantime (whether I decided to try or not).

Either way.  It seems someone somewhere might get impatient and throw some stuff together; otherwise, unless we solve "the diff patch problem of the century using quite extensive object state tracking", who knows when.

However, this is your show, Peridexis.  I'm not going to act like I plan on taking the helm.  However, I may fancy some proof of concept diff patch batch solution.  I was on a roll until my damn laptop AC jack went out again for the 3rd time in 6 months...  However, most of the base ideas I already put up in this thread.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 29, 2014, 11:00:56 pm
I think PE's objection was just to pointing users at a URL with a diff file that wouldn't function without the launcher.  The mod would be raw files (and scripts and graphics) that the launcher processes using its latest and greatest logic.  Remember, the base on one player's machine might be different than someone else's.  For example, I always do that tweak to put beards on female dwarves.  Others might have other permanent tweaks they want like making microcline a less obnoxious color.

Do we want to encourage players to make their own personal mod of local tweaks so that everything can truly call vanilla its common ancestor?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on August 29, 2014, 11:35:19 pm
Yep. Using a standard diff to do the merge logic makes sense to me, but for interoperability mods should be stored and distributed whole instead of as patches.
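
For what it's worth, deriving the patch on the fly from whole files is only a few lines with Python's standard difflib.  This is a minimal sketch, not the loader's actual code: the function name derive_patch, the file name, and the [BEARD] tag in the sample text are all made up for illustration.

```python
import difflib

def derive_patch(vanilla_text, mod_text, name="creature_standard.txt"):
    """Return a unified diff (as one string) from the vanilla file to the modded file."""
    diff = difflib.unified_diff(
        vanilla_text.splitlines(keepends=True),
        mod_text.splitlines(keepends=True),
        fromfile="vanilla/" + name,
        tofile="mod/" + name,
    )
    return "".join(diff)

# Made-up sample raws; [BEARD] is a placeholder, not a real token.
vanilla = "[CREATURE:DWARF]\n[CASTE:FEMALE]\n[CASTE:MALE]\n"
modded = "[CREATURE:DWARF]\n[CASTE:FEMALE]\n[BEARD]\n[CASTE:MALE]\n"
patch = derive_patch(vanilla, modded)
print(patch)
```

Because the patch is rebuilt from the whole files every time the tool runs, nothing about the diff format ever needs to be distributed.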
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: thistleknot on August 30, 2014, 09:23:11 am
hrmmm...

Well.  I can see how basing it on a base would break if you added beards to females and every mod afterwards didn't (basically the mods would say, no we took off beards) when your intended behaviour is to draw what the mods changed from what they used as their base.  Which we all "assume" is vanilla, or at least [we assume we] can have a common ancestor derived; which is not always true, such as with total conversion mods.

However.  I guess it could be "rebased" on a bearded version as a base... but the mod load concept I had would break?

I was under the "mis-assumption" that since LNP pretty much shipped as a static build, that we could start with a static build of mods.  Things would most likely break if players tried to mod the vanilla base after we checked the mods worked with a common ancestor system.  However, in theory we shouldn't have to 'check' them if we plan on allowing importing of them.

If I loaded mods into an imported folder (via the whole mod set), derived diffs based off vanilla, and tried to apply them to a non-vanilla base... then I think there would be an issue at that point.   Because then all diffs would be contextual vs unified and then you'd have those issues because you don't have a common ancestor anymore.

Well either way.  I'm glad I learned some stuff about 3 way merges and kdiff3 at the very least.  My brain has been hardwired lately with github thinking, so that's why I'm so pro patches.  However, merge issues such as I've had with github would arise if applying to different bases.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on August 30, 2014, 10:43:50 am
hrmmm...

Well.  I can see how basing it on a base would break if you added beards to females and every mod afterwards didn't (basically the mods would say, no we took off beards) when your intended behaviour is to draw what the mods changed from what they used as their base.  Which we all "assume" is vanilla, or at least [we assume we] can have a common ancestor derived; which is not always true, such as with total conversion mods.

However.  I guess it could be "rebased" on a bearded version as a base... but the mod load concept I had would break?

I was under the "mis-assumption" that since LNP pretty much shipped as a static build, that we could start with a static build of mods.  Things would most likely break if players tried to mod the vanilla base after we checked the mods worked with a common ancestor system.  However, in theory we shouldn't have to 'check' them if we plan on allowing importing of them.

If I loaded mods into an imported folder (via the whole mod set), derived diffs based off vanilla, and tried to apply them to a non-vanilla base... then I think there would be an issue at that point.   Because then all diffs would be contextual vs unified and then you'd have those issues because you don't have a common ancestor anymore.

Well either way.  I'm glad I learned some stuff about 3 way merges and kdiff3 at the very least.  My brain has been hardwired lately with github thinking, so that's why I'm so pro patches.  However, merge issues such as I've had with github would arise if applying to different bases.

There are a couple different ways to handle this, but it only makes a difference in the outcome if we distribute patches.  If we distribute raws then all of these result in the same build.

1. Make a bunch of really simple mods that match what people tweak, and include them with the Starter Pack.  So there would be a Bearded Ladies mod and a Green Microcline mod and an English-only Names mod and so on.  This could cause... clutter.

2. Set up a "base" folder and a copy of that base called "My changes" that the user can tweak.  This one can be merged in like any other mod, but it should be checked for changes on every merge since it won't have the same kind of version discipline you'd get from a "formal" mod.  Obviously, no downloaded mod will have known compatibility with this mod, so expect some false positives that the user will need to figure out.

3. Include a tutorial for the adventurous to build their own mod.  Basically the same as 2 except the user makes the copy and applies a version number to it.

These aren't mutually exclusive, either.  If we want to combine 1 with another, the user should need to download the mini-mods from the repository (to encourage them to find everything else there).

Edit: And actually, in thistleknot's case the beards wouldn't be mentioned in the patches so everything should be fine and dandy.  Except now there is one more tag in creature_standard.txt which mis-aligns even flattened raws.  Contextual matching ought to fix it, but I wouldn't want to depend on it.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on September 02, 2014, 11:33:05 am
The beards-as-base problem isn't really a problem, tbh. You'd just have to change your installation strategy. Instead of tweaking beards and then installing, the installer should consider any deviations from vanilla in the user's raw files a mod, and install that mod last and/or with highest priority.

It does point to the biggest problem, though, which is multifeatured mods. Because mod installation is currently such a hassle, mods start packaging more and more features together, when an automatic mod installer would be significantly simpler if each feature were packaged as its own individual mod. Then we wouldn't have to worry about whether any given change is significant to the mod or not: if it deviates from the raws, it's significant.

I suppose we could just assume every change is significant to the mod, fail on any conflict, and hope that modders change their packaging to be more, well, modular. That would be a lot more helpful than a manifest file.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Merkator on September 02, 2014, 11:40:43 am
I've done some research on the parsing front. For now I can parse files and build nice groups of tags. Castes don't work correctly yet, sadly.
More interesting to me would be one template with every single possible token in each place it can exist.
For example, most object files have some major token like ENTITY, ITEM_*, PLANT.
After that there is almost always a `head` section of the raw, which has the name of the token and information about it.
Next goes the `body`, which describes the features the object has, like GRAZER or AMPHIBIOUS.
Lastly go the `caste` section and applied variations. And a few tokens can appear almost anywhere.
Having a dictionary of every token and the context where each token goes would make parsing raws into a workable form much easier.
But as far as I know this is a gaming community and we love to have Fun.
And writing schemas for an in-game text format isn't the fun that we want, sadly.

But if anyone has too much time: can you post here the tokens that occur in each section?
This gist has the list of every token (the only exception is the language files).
Gist (https://gist.github.com/Demagogue/c3e79ba1dc887cf15930)
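
To make the parsing idea a bit more concrete, here is a rough Python sketch of pulling tags out of a raw file, flattening it to one tag per line, and grouping tags under their object.  It is only an illustration under simple assumptions: the names parse_tags, flatten and group_by_object are made up, the sample creature is invented, and it does not handle the head/body/caste split described above or any tag whose arguments contain a literal bracket.

```python
import re

# One tag: everything between a "[" and the next "]".
TAG = re.compile(r"\[([^\[\]]+)\]")

def parse_tags(raw_text):
    """Return every [TOKEN:ARG:...] as a list of its colon-separated parts."""
    return [match.group(1).split(":") for match in TAG.finditer(raw_text)]

def flatten(raw_text):
    """Drop everything outside brackets and put one tag on each line."""
    return "\n".join("[" + ":".join(tag) + "]" for tag in parse_tags(raw_text))

def group_by_object(tags, object_type="CREATURE"):
    """Collect the tags that follow each [object_type:ID] tag under that ID."""
    objects = {}
    current = None
    for tag in tags:
        if tag[0] == object_type and len(tag) > 1:
            current = tag[1]
            objects[current] = []
        elif current is not None:
            objects[current].append(tag)
    return objects

# Invented sample file; TOAD_EXAMPLE is not a vanilla creature.
SAMPLE = """creature_example

[OBJECT:CREATURE]

[CREATURE:TOAD_EXAMPLE] a made-up creature
	[AMPHIBIOUS][GRAZER:100]
	[CASTE:FEMALE]
"""

tags = parse_tags(SAMPLE)
print(flatten(SAMPLE))
print(group_by_object(tags))
```

A per-context token dictionary like the one proposed would slot in at group_by_object, deciding which section each tag belongs to instead of lumping them all together.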
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on September 02, 2014, 11:55:46 am
The beards-as-base problem isn't really a problem, tbh. You'd just have to change your installation strategy. Instead of tweaking beards and then installing, the installer should consider any deviations from vanilla in the user's raw files a mod, and install that mod last and/or with highest priority.

That was Method 2 that I mentioned above.  The only cost is that you need some disk space for a virgin copy of the raws for comparison.

It does point to the biggest problem, though, which is multifeatured mods. Because mod installation is currently such a hassle, mods start packaging more and more features together, when an automatic mod installer would be significantly simpler if each feature were packaged as its own individual mod. Then we wouldn't have to worry about whether any given change is significant to the mod or not: if it deviates from the raws, it's significant.

I suppose we could just assume every change is significant to the mod, fail on any conflict, and hope that modders change their packaging to be more, well, modular. That would be a lot more helpful than a manifest file.
I am all for modular mods, but the manifest is useful to declare dependencies and known these-two-mods-do-not-play-well-together cases.

The hypothetical Hard Farming mod could have a core that changes the plant raws, then a module that adds a bunch of new plants, and another module that changes a bunch of reactions.  A tidy modder would build the reactions using reaction classes so that they would function with or without the new plants, but then it depends on the raw changes (adding reaction classes) in the core bit of the mod.

The core mod probably doesn't even need a manifest file (unless it wants to declare incompatibility with specific other mods), but the two modules could use manifests to declare dependency on the core mod.  In this case, load order of the mods wouldn't matter, but there are cases when it would.

To the end user, they would appear as three mods: HardFarming v1.02, HardFarming New Plants v1.02 and HardFarming Iron Chef v1.02.  If the second or third is added to the list without the first, the launcher can display a message to the user even before attempting the merge.
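
A rough sketch of that pre-merge check in Python, assuming the launcher has already read each manifest into a plain dictionary of mod name to required mod names.  The function name and the data layout are assumptions for illustration, not an agreed format.

```python
def missing_dependencies(selected, requires):
    """Map each selected mod to the mods it requires that are not selected."""
    chosen = set(selected)
    problems = {}
    for mod in selected:
        missing = [dep for dep in requires.get(mod, []) if dep not in chosen]
        if missing:
            problems[mod] = missing
    return problems

# Hypothetical dependency table, as it might be collected from manifests.
requires = {
    "HardFarming New Plants": ["HardFarming"],
    "HardFarming Iron Chef": ["HardFarming"],
}

for mod, missing in missing_dependencies(["HardFarming Iron Chef"], requires).items():
    print(mod + " needs: " + ", ".join(missing))
```

Since this only looks at names, it can run before any raws are touched, which is when the user would want the warning.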
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 02, 2014, 07:24:31 pm
If a user has several standard small tweaks - beards or similar - they can make a mod that makes those changes, and apply it with the others.  The launcher is not going to automatically detect user mods, but they can just copy their whole install to the folder and it'll extract the mod/s. 

The whole system is based on having a full copy of the vanilla raws in the Mods folder, which is the common ancestor of all mods.  Trying to mod this copy would be a bad idea, and reversed by the first merge anyway.  The disk space this takes is more than saved by discarding identical files in mods. 
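
As an illustration of extracting a mod from a full install by discarding identical files, here is a minimal Python sketch using the standard filecmp module.  The function name changed_files and the folder arguments are assumptions; it only lists the files and leaves the copying or deleting to the caller.

```python
import filecmp
import os

def changed_files(vanilla_dir, mod_dir):
    """Return relative paths under mod_dir that are new or differ from vanilla."""
    changed = []
    for root, _dirs, files in os.walk(mod_dir):
        for name in files:
            mod_path = os.path.join(root, name)
            rel = os.path.relpath(mod_path, mod_dir)
            vanilla_path = os.path.join(vanilla_dir, rel)
            # Keep the file unless vanilla has a byte-identical copy.
            same = (os.path.isfile(vanilla_path)
                    and filecmp.cmp(vanilla_path, mod_path, shallow=False))
            if not same:
                changed.append(rel)
    return sorted(changed)
```

Something like changed_files("Mods/df_40_08", "Mods/MyTweaks") would then give the list of files that actually make up the mod; everything else in the folder is a copy of vanilla and can be dropped.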

Having just finished an assignment on graph manipulation, I'm going to take another look at difflib soon.  It still makes no sense to me...  but maybe some focus can change that, after all my outstanding essays are done. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 03, 2014, 05:44:34 am
Great news!  I've tracked down the issue I was having with King Mir's merge code, and it's now working!  Specifically, the code currently on GitHub (https://github.com/PeridexisErrant/Py-Mod-Loader) now works to merge any number of non-conflicting mods on a line-by-line basis. 

Caveats:  the error handling is somewhere between pathetic and totally missing, and using more than one mod that adds files is almost guaranteed to mess stuff up.  The diff works line by line, and there's no flattening yet.
Upsides:  any number of non-conflicting minor mods *will* work, but any that changes a line which has already been modified will have that file skipped (silently at the moment, I did mention that the error handling was bad).
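
For anyone who wants to see the general shape of a line-by-line merge against a common ancestor, here is a small Python sketch built on difflib.  To be clear, this is not King Mir's code or what is on GitHub; it is a simplified stand-in with made-up names (edits_from, merge3) that returns None on any overlapping change, which is roughly the "skip the file" behaviour described above.

```python
import difflib

def edits_from(base, other):
    """List (start, end, new_lines) edits that turn base into other."""
    matcher = difflib.SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2, tuple(other[j1:j2]))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

def merge3(base, ours, theirs):
    """Merge two edited copies of base (lists of lines); None means conflict."""
    # Identical edits from both sides collapse into one via the set union.
    edits = sorted(set(edits_from(base, ours)) | set(edits_from(base, theirs)))
    merged, pos = [], 0
    for k, (start, end, new) in enumerate(edits):
        overlaps = start < pos
        same_start = k > 0 and start == edits[k - 1][0]
        if overlaps or same_start:
            return None  # both sides touched the same lines differently
        merged.extend(base[pos:start])
        merged.extend(new)
        pos = end
    merged.extend(base[pos:])
    return merged

# Placeholder lines standing in for raw tags.
base = ["[A]", "[B]", "[C]", "[D]"]
mod_one = ["[A]", "[B2]", "[C]", "[D]"]
mod_two = ["[A]", "[B]", "[C]", "[D2]"]
print(merge3(base, mod_one, mod_two))
```

Merging more than two mods would mean repeating this with the running result as one side and vanilla still as the base, which is where flattening starts to matter.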

Over the next week or so I'll try to improve the feedback passed to the user and the mod-level handling of merge failures, so that eg we don't say a mod was merged in OK when two of the files were skipped.  Accumulating and displaying / responding to problems should be a thing, since we get a per-file True/False for if the merge worked.  Once that's in, enough information can be displayed for the user to make this a potentially useful tool - at least as a CLI demonstration. 
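
A possible shape for that accumulation step, sketched in Python under the assumption that each merge attempt yields a True/False per file.  The function name summarise and the three status words are made up for illustration.

```python
def summarise(results):
    """Turn {mod: {file: merged_ok}} into {mod: (status, failed_files)}."""
    report = {}
    for mod, files in results.items():
        failed = sorted(name for name, ok in files.items() if not ok)
        if not failed:
            status = "merged"
        elif len(failed) == len(files):
            status = "failed"
        else:
            status = "partial"
        report[mod] = (status, failed)
    return report

# Hypothetical per-file results from one run.
results = {
    "Mod A": {"creature_standard.txt": True, "entity_default.txt": True},
    "Mod B": {"creature_standard.txt": False, "plant_standard.txt": True},
}
for mod, (status, failed) in summarise(results).items():
    print(mod, status, failed)
```

With a "partial" status available, the tool no longer has to claim a mod merged cleanly when some of its files were skipped.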

If anyone has or can easily make a collection of minor mods - which make minimal changes for their purpose, and only modify files (no additions or removals) - it would be great to share them as a test bed. 

I'm very excited about this - the remaining logic is well within my reach, and then it's ready to be GUI-ified and beta tested.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 03, 2014, 07:33:54 am
OK, I stayed up later than I should have and did some more work on the surrounding logic.  I'm happy to call it version 0.2 now, and I've rewritten the readme as well.  Anyone want to test it on some simple or not-so-simple mods?

https://github.com/PeridexisErrant/Py-Mod-Loader
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: expwnent on September 03, 2014, 09:12:40 am
I haven't read the whole thread, but is there a reason you don't just use git as a versioning system?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on September 03, 2014, 09:30:44 am
I haven't read the whole thread, but is there a reason you don't just use git as a versioning system?

 ::)
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 03, 2014, 09:36:53 am
I haven't read the whole thread, but is there a reason you don't just use git as a versioning system?

The idea is to make mods part of the starter pack, and as easy to use and add for yourself as graphics packs or utilities (and working in the same way to the user).  Git is great on the mod creation side, as is find and replace, and scripting, but it's not going to be a "drag, drop, click" solution which is what we're after.

I can't blame anyone for skipping the thread though, it kinda exploded...
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: expwnent on September 03, 2014, 09:49:10 am
My main line of reasoning is that if you can learn to play DF you can learn to use git. If you just make one "base" repo of default DF and do branches accordingly and merge in the mods you want it should work pretty well.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on September 03, 2014, 09:55:58 am
OK, I stayed up later than I should have and did some more work on the surrounding logic.  I'm happy to call it version 0.2 now, and I've rewritten the readme as well.  Anyone want to test it on some simple or not-so-simple mods?

https://github.com/PeridexisErrant/Py-Mod-Loader

I have an issue with Real Life sucking up a lot of my time lately, but I'll give this a shot when I'm able.  One request though: can we allow the mod to include other folders like Stonesense content?  The only collision should be with the top-level index.txt file.  The logic should allow the eventual inclusion of graphics packs as mods.  I don't use Soundsense (or any sound at all), but if sound packs can be dropped in then those should be allowed as well.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on September 03, 2014, 10:50:44 am
OK, I stayed up later than I should have and did some more work on the surrounding logic.  I'm happy to call it version 0.2 now, and I've rewritten the readme as well.  Anyone want to test it on some simple or not-so-simple mods?

https://github.com/PeridexisErrant/Py-Mod-Loader

I don't use Soundsense (or any sound at all), but if sound packs can be dropped in then those should be allowed as well.

Soundsense manages its own sound packs; it doesn't need to be "installed" like graphics packs do.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: IronSI on September 04, 2014, 11:56:58 pm
PeridexisErrant, I played around a bit with your mod loader.

Notes (Using the following mod packs: All Races Playable, DF Fine Polish, Plantfixes load in that order):

I got an improper indentation error, which I fixed.
I had to figure out whether it was python2 or python3 (Python3). Might want to include a note in the readme to that effect.
I had to figure out where to place the mod files (in LNP/Mods/). Might want to include a note in the readme to that effect.
Upon merging it created one complete set of raws in temp/raw/ and another in temp/raw/objects/, which were different from each other.
Upon merging it initially erased all files in the Plantfixes mod due to the folder being named objects instead of the standard raw.

After deleting the excess raws, I successfully created a pocket embark on a pocket world (With a human civ, no less, thanks to a successfully merged all races playable mod)

I have checked to see if the merges were 'clean' (that is, none of the features broke anything from the other mods that they shouldn't have) but it appears to work as advertised, but a bit buggy.

If you need any other help PeridexisErrant, let me know and I will be happy to help with my Rusty Competent Python Scripter and my Rusty Competent C Programmer skills.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 05, 2014, 01:46:37 am
PeridexisErrant, I played around a bit with your mod loader.

Notes (Using the following mod packs: All Races Playable, DF Fine Polish, Plantfixes load in that order):

I got an improper indentation error, which I fixed.
I had to figure out whether it was python2 or python3 (Python3). Might want to include a note in the readme to that effect.
I had to figure out where to place the mod files (in LNP/Mods/). Might want to include a note in the readme to that effect.
Upon merging it created one complete set of raws in temp/raw/ and another in temp/raw/objects/, which were different from each other.
Upon merging it initially erased all files in the Plantfixes mod due to the folder being named objects instead of the standard raw.

After deleting the excess raws, I successfully created a pocket embark on a pocket world (With a human civ, no less, thanks to a successfully merged all races playable mod)

I have checked to see if the merges were 'clean' (that is, none of the features broke anything from the other mods that they shouldn't have) but it appears to work as advertised, but a bit buggy.

If you need any other help PeridexisErrant, let me know and I will be happy to help with my Rusty Competent Python Scripter and my Rusty Competent C Programmer skills.

Awesome!  Thanks for the feedback.  I'll clarify the usage and version information in the readme. 

I think your problems with duplicate raws and the plantfixes mod have the same cause:  a mod used as input that wasn't in the expected format.  It sounds like the plantfixes mod is distributed as an 'objects' folder, which you renamed 'raw' - this would cause the duplication, as the objects files don't exist in the top-level folder in vanilla (and deleting them would also effectively remove that mod).  Try putting them in "LNP/Mods/Plantfixes/raw" (/objects/files) instead of renaming.  Remember that the input must be structured so you could drop the contents of the mods folder over the DF folder to install it.  This does make me think that checking the validity of the inputs should take a higher priority though - ideally it could detect and fix this kind of issue, and warn you in more complicated cases. 
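
A quick sketch of what detecting and fixing the simplest case could look like in Python.  The function names and status strings are made up, and it only handles the one situation discussed here (an 'objects' folder sitting at the top of the mod folder instead of inside 'raw'); anything else is just reported.

```python
import os
import shutil

def check_layout(mod_dir):
    """Classify a mod folder by whether it follows the raw/... structure."""
    if os.path.isdir(os.path.join(mod_dir, "raw")):
        return "ok"
    if os.path.isdir(os.path.join(mod_dir, "objects")):
        return "objects-at-top"
    return "unknown"

def fix_layout(mod_dir):
    """Move a top-level objects folder into raw/objects; report other cases."""
    status = check_layout(mod_dir)
    if status == "objects-at-top":
        os.makedirs(os.path.join(mod_dir, "raw"))
        shutil.move(os.path.join(mod_dir, "objects"),
                    os.path.join(mod_dir, "raw", "objects"))
        return "fixed"
    return status
```

Running fix_layout over every folder in LNP/Mods before merging would catch the Plantfixes situation automatically and leave an "unknown" result for the tool to warn about.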

Without any information, I'll assume that the improper indentation error was caused by indentation in one of the mods you merged.  If the mod changed indentation, it could be valid on its own but not once merged. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 05, 2014, 06:00:06 am
Hi all - I've posted a full folder structure with nine mods included and formatted as an example. 

Here it is:  http://dffd.wimbli.com/file.php?id=9428
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: IronSI on September 05, 2014, 10:18:23 am
The indentation error was in your python code, one of the triple quoted blocks was indented too much (Lines 212 and 213).

Upon review of the raw folders (putting everything into raw/objects folders), it doesn't appear to be caused by the Plantfixes mod, but Df-Fine-Polish has subfolders within the objects folder; would that affect anything?
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 05, 2014, 06:22:05 pm
Oops, I fixed that in the download above. 

Yes, any variation from the vanilla file structure will mean files have to be copied instead of merged - which massively increases the chance of conflicts.  You can't really avoid new files with some mods, but trying to 'fix' the folder structure should make them easier to spot. 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on September 06, 2014, 01:49:57 pm
Yes, any variation from the vanilla file structure will mean files have to be copied instead of merged - which massively increases the chance of conflicts.  You can't really avoid new files with some mods, but trying to 'fix' the folder structure should make them easier to spot.
This looks like a very good start, though I'm not sure why it couldn't handle stuff outside the raw folder (like Stonesense XML or a new embark profile) so long as it doesn't attempt any flattening there.

I tried to look through Python's documentation to see if I could be of any more help on the project, but it looks like it will be a while before I do anything other than tweak other people's code.  I did, however, see that Python has a nice ElementTree structure (https://docs.python.org/3/library/xml.etree.elementtree.html) that can be used to hold (among other things) file directory information or XML data.  Not sure if an ElementTree would be easier than shutil, but the XML parser massively simplifies the problem of using a manifest file.
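
From what I can tell in the docs, reading a manifest with ElementTree would look roughly like the sketch below.  Treat it as an untested-in-the-launcher illustration: the cut-down manifest string and the read_manifest function are made up for the example, and the tag and attribute names are just the ones proposed in this post.

```python
import xml.etree.ElementTree as ET

# Cut-down, made-up manifest using the proposed tag names.
MANIFEST = """<modpack token="Example">
  <title>The Example Mod</title>
  <version>0</version><subversion>90</subversion>
  <dependencies>
    <dfhack/>
    <mod token="Basic" min_version="0" min_subversion="01" order="before"/>
  </dependencies>
</modpack>"""

def read_manifest(xml_text):
    """Pull the basics out of a manifest: token, title, version, dependencies."""
    root = ET.fromstring(xml_text)
    deps = root.find("dependencies")
    has_deps = deps is not None
    return {
        "token": root.get("token"),
        "title": root.findtext("title"),
        "version": "v{}.{}".format(root.findtext("version"),
                                   root.findtext("subversion")),
        "needs_dfhack": has_deps and deps.find("dfhack") is not None,
        "requires": [mod.get("token") for mod in deps.findall("mod")] if has_deps else [],
    }

info = read_manifest(MANIFEST)
print(info)
```

The parser is strict, so a manifest with a malformed attribute or a bare ampersand would raise a ParseError rather than being read loosely; the tool would need to catch that and tell the user which mod's manifest is broken.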

Code: [Select]
<?xml version="1.0" ?>

<!-- This XML contains enough information to make an educated guess about how two mods would merge *without processing any files.*
     If it sees multiple mods affecting the same file, it notifies the user that it has to try a merge to figure out compatibility.
     This means that the tool can download a bunch of these manifests in bulk to preview mods.  -->

<modpack token="Example"> <!-- The token is the name of the folder holding the mod, and how mods refer to one another. -->
<title>The Example Mod</title>
<version>0</version><subversion>90</subversion> <!-- Would display as "v0.90".  Can also add a <revision> tag for "v0.90r2" -->
<author>Dirst</author>
<thread>http://www.bay12forums.com/smf/index.php?topic=000000</thread>
<download>http://dffd.wimbli.com/file.php?id=0000</download>
<description>This mod is an example used to illustrate the optional manifest file.</description>
<dependencies>
<dfhack/>
<mod token="Basic" min_version="0" min_subversion="01" order="before"/>
</dependencies>
<compatibility>
<core max_version="40" max_subversion="07">  <!-- The lack of min_ will match anything v40.07 or earlier. -->
<color>red</color>
<message level="3">The Example Mod is not compatible with Dwarf Fortress before v40.08.</message>
</core>
<core min_version="40" min_subversion="08" max_version="40" max_subversion="11">
<color>green</color>
</core>
<core min_version="40" min_subversion="12">
<color>orange</color>
<message level="3">The Example Mod has not been tested with this version of Dwarf Fortress.</message>
</core>
<dfhack min_version="40" min_subversion="08" min_revision="2" max_version="40" max_subversion="10" max_revision="1">
<color>green</color>
<message level="1">"The Example Mod" employs DFHack's interactionTrigger and lua scripts.</message>
</dfhack>
<dfhack min_version="40" min_subversion="10" min_revision="2">
<color>orange</color>
<message level="3">The Example Mod has not been tested with this version of DFHack.</message>
<message level="1">"The Example Mod" employs DFHack's interactionTrigger and lua scripts.</message>
</dfhack>
<mod token="CLA" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="Ironhand" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="MasterworkDF"> <!-- Lack of min_ and max_ will match any version -->
<color>red</color>
<message level="3">The Example Mod is known to be incompatible with Masterwork DF.</message>
</mod>
<mod token="Mayday" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="OldGenesis">
<color>red</color>
<message level="3">The Example Mod is known to be incompatible with Old Genesis.</message>
</mod>
<mod token="Phoebus" min_version="40" min_subversion="07" max_version="40">
<color>green</color>
</mod>
<mod token="Spacefox" min_version="40" max_version="40">
<color>green</color>
</mod>
<mod token="Other" min_version="1" min_subversion="0" max_version="1" max_subversion="35">
<color>yellow</color>
<message level="2">There have been no reports of conflicts with this version of That Other Mod.</message>
<message level="1">"The Example Mod" &amp; "That Other Mod" both add to entity_default.txt, but do not appear to affect one another's edits.</message>
</mod>
<mod token="Basic" min_version="0" min_subversion="01" max_version="0" max_subversion="20">
<color>green</color>
<message level="1">"The Example Mod" is an extension of "The Basic Mod" and will not function without it.</message>
</mod>
<mod token="Basic" min_version="0" min_subversion="21">
<color>orange</color>
<message level="3">"The Example Mod" has not been tested with this version of "The Basic Mod".</message>
<message level="1">"The Example Mod" is an extension of "The Basic Mod" and will not function without it.</message>
</mod>
</compatibility>
<manifest>
<!-- Upon merge, any files with no path info are copied into an LNP documentation subfolder named after the token. -->
<!-- The parse attribute defaults to "flatten" which also strips out anything outside of []s. -->
<file parse="text">manifest.xml</file> <!-- Encode linefeeds if needed, but do not flatten this file. -->
<file parse="text">readme.txt</file>
<file parse="none">The Example Mod v0.90.pdf</file> <!-- Pass this file exactly as-is. -->
<file parse="none">data/art/font.TTF</file>
<file>data/init/embark_profiles.txt</file> <!-- Should be able to add a profile to the end of the user's set. -->
<file parse="text">data/init/overrides.txt</file> <!-- TWBT uses # to comment out inactive lines, so can't trust flattening. -->
<file parse="text">raw/onLoad.init</file> <!-- Mods that use DFHack will probably want to add lines to this file. -->
<file>raw/graphics/graphics_example.txt</file>
<file parse="none">raw/graphics/example/example.png</file>
<file>raw/objects/creature_example.txt</file>
<file>raw/objects/entity_default.txt</file> <!-- Many mods will need to add lines into this file. -->
<file>raw/objects/interaction_example.txt</file>
<file>raw/objects/reaction_example.txt</file>
<file parse="text">raw/objects/text/secret_wisdom.txt</file>
<file parse="text">raw/scripts/example-script.lua</file>
<file parse="text">stonesense/index.txt</file> <!-- Any Stonesense content will need to add a line to this file. -->
<file parse="text">stonesense/example/index.txt</file>
<file parse="none">stonesense/example/example.png</file>
<file parse="text">stonesense/example/example.xml</file>
</manifest>
</modpack>

The first part contains some identifying information about the mod as it might appear in a GUI loader.  A set of <dependencies> lets the tool know if DFHack or any other mod is required for this mod to function.  A structure of <compatibility> data allows for hand-crafted overrides for specific combinations of mods.  This can be displayed to the user even before a merge is attempted.  The actual <manifest> at the end identifies all of the files.  We could hardcode how certain folders and certain extensions are handled, but this seems more future-proof.
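
As a rough illustration of how a loader might read this file, here is a minimal Python sketch using the standard library's xml.etree.ElementTree.  The function name read_manifest and the layout of the returned dictionary are made up for this example and are not part of PyLNP or any existing tool; the embedded manifest is a trimmed copy of the one above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example manifest (illustrative only).
MANIFEST = """<?xml version="1.0" ?>
<modpack token="Example">
<title>The Example Mod</title>
<version>0</version><subversion>90</subversion>
<dependencies>
<dfhack/>
<mod token="Basic" min_version="0" min_subversion="01" order="before"/>
</dependencies>
<manifest>
<file parse="text">readme.txt</file>
<file parse="none">data/art/font.TTF</file>
<file>raw/objects/creature_example.txt</file>
</manifest>
</modpack>
"""


def read_manifest(xml_text):
    """Return identifying info, dependencies and the file list of one mod.

    Hypothetical helper: the dictionary keys are this sketch's own choice.
    """
    root = ET.fromstring(xml_text)

    # Build the display string, e.g. "v0.90" or "v0.90r2".
    version = "v{}.{}".format(root.findtext("version", "0"),
                              root.findtext("subversion", "0"))
    revision = root.findtext("revision")
    if revision:
        version += "r" + revision

    deps = root.find("dependencies")
    has_deps = deps is not None

    # A missing parse attribute falls back to "flatten", as the manifest says.
    files = {f.text: f.get("parse", "flatten")
             for f in root.iterfind("manifest/file")}

    return {
        "token": root.get("token"),
        "title": root.findtext("title"),
        "version": version,
        "needs_dfhack": has_deps and deps.find("dfhack") is not None,
        "required_mods": [m.get("token") for m in deps.iterfind("mod")] if has_deps else [],
        "files": files,
    }


info = read_manifest(MANIFEST)
print(info["title"], info["version"])  # The Example Mod v0.90
```

Because nothing here touches the mod's actual files, a tool could run this over a batch of downloaded manifests to preview mods, which is the use case described in the XML comment at the top.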

One issue with <compatibility> is that, for any pair of mods, there could be zero, one or two relevant bits of XML, since each mod's manifest may or may not mention the other:
0: No pre-merge feedback to the user because these two mods have never heard of each other before.
1: Trust the override that is present and display the message in the GUI before the merge is attempted.  The minimum message level is user configurable.  All messages might get sent to a log.
2: Color the mods according to the "redder" data, and present both sets of messages to the user for him/her to decide.  Maybe cull duplicate messages.
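
The three cases could be handled along these lines.  This is only a sketch under several assumptions that the proposal does not spell out: "redder" is taken to mean the ordering green, yellow, orange, red; version parts are compared as integers; a missing min_ part counts as 0 and a missing max_ part as unbounded.  The names SEVERITY, in_range and combine are hypothetical.

```python
SEVERITY = ["green", "yellow", "orange", "red"]  # assumed order, least to most "red"


def _bound(attrs, prefix, fill):
    """(version, subversion, revision) for one end of a range; absent parts become `fill`."""
    keys = (prefix + "version", prefix + "subversion", prefix + "revision")
    return tuple(int(attrs[k]) if k in attrs else fill for k in keys)


def in_range(attrs, installed):
    """True if `installed`, a (version, subversion, revision) tuple, is inside the entry's range.

    `attrs` is the attribute dict of a <core>, <dfhack> or <mod> entry.
    """
    return _bound(attrs, "min_", 0) <= installed <= _bound(attrs, "max_", float("inf"))


def combine(entry_a, entry_b, min_level=1):
    """Merge what mod A says about mod B with what B says about A.

    Each entry is None (that mod has never heard of the other) or a dict
    {"color": str, "messages": [(level, text), ...]}.
    """
    present = [e for e in (entry_a, entry_b) if e is not None]
    if not present:
        return None  # case 0: no pre-merge feedback

    # Case 1 trusts the single entry; case 2 takes the "redder" colour.
    color = max((e["color"] for e in present), key=SEVERITY.index)

    messages = []
    for entry in present:
        for level, text in entry["messages"]:
            # Respect the user's minimum level and cull identical messages.
            if level >= min_level and (level, text) not in messages:
                messages.append((level, text))
    return {"color": color, "messages": messages}


a_about_b = {"color": "yellow",
             "messages": [(2, "No reports of conflicts."),
                          (1, "Both add to entity_default.txt.")]}
b_about_a = {"color": "orange",
             "messages": [(3, "Not tested together."),
                          (1, "Both add to entity_default.txt.")]}
result = combine(a_about_b, b_about_a, min_level=2)
print(result["color"])  # orange
```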

The manifest is the only thing that a modder would be asked to do specifically for the loader, and even that is optional.

Edit: Fixed missing word and added link to Python docs on ElementTree.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on September 08, 2014, 03:11:32 pm
Hrm. I was sort of hoping for a way to put more... granular merging instructions into a manifest file. For example, take my vampire fodder minor mod (one of the modular modpack I uploaded for testing), which replaces a number of creatures' BLOOD or ICHOR with a custom BLOOD2 or ICHOR2. It would be super sweet if I could tell the merger that, if another mod messes with BLOOD for [these creature objects], please change it so it messes with BLOOD2 instead.

Now that I write that out, though, it sounds unrealistically ambitious, so never mind.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Dirst on September 08, 2014, 04:20:13 pm
The goal is to parse things that could individually function if just dropped onto the DF folder.  Sometime after v1.0, it might make sense to fold in something more elaborate such as scripting.  But I think it'd be easier to use Rubble as a back end for that rather than re-inventing the wheel.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: Button on September 09, 2014, 02:12:13 pm
Skimmed over your script and the logic looks OK, but it's not really runnable right now >>. Herp derp it's supposed to be Python 3.



I think your observation about mods which put new entities in new files being more error-prone than those which put new entities in existing files is flawed. AFAICT the reason you think this is that you don't have any dupe-checking logic between raw files - but dupes could just as easily be placed in separate vanilla raw files as they could in separate modded raw files.
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 09, 2014, 06:58:44 pm
I do intend to make it usable in 2.7, but haven't gotten around to it yet. 

Duplication checking would certainly help, but it's not the main issue.  Basically the script treats adding files as a proxy for "this is a major mod, and therefore more likely to cause problems".  There are also a few edge cases like "Mod A adds creatures (in a new file), Mod B removes all creatures (but silently misses those in the new file)" - adding creatures in a block removed by mod B would be caught. 
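
For what it's worth, a duplicate check between raw files could look something like the sketch below.  It is not part of the script discussed here, and it is deliberately simplified: it only handles raw files whose definition token matches the [OBJECT:...] header (CREATURE, ENTITY, REACTION and the like) and takes its input as a plain {filename: text} dictionary.  The name find_duplicates is hypothetical.

```python
import re
from collections import defaultdict

OBJECT_RE = re.compile(r"\[OBJECT:([A-Z_]+)\]")


def find_duplicates(raws):
    """Return {(type, id): [filenames]} for every object id defined more than once.

    raws maps a raw file name to its full text.
    """
    seen = defaultdict(list)
    for name, text in sorted(raws.items()):
        header = OBJECT_RE.search(text)
        if header is None:
            continue  # not an object file (e.g. a readme)
        obj_type = header.group(1)
        # Simplification: assume definitions look like [TYPE:ID].
        pattern = r"\[%s:([^\]:]+)\]" % re.escape(obj_type)
        for obj_id in re.findall(pattern, text):
            seen[(obj_type, obj_id)].append(name)
    return {key: files for key, files in seen.items() if len(files) > 1}


raws = {
    "creature_standard.txt": "creature_standard\n[OBJECT:CREATURE]\n[CREATURE:DWARF]\n[CREATURE:ELF]\n",
    "creature_example.txt": "creature_example\n[OBJECT:CREATURE]\n[CREATURE:DWARF]\n",
}
dupes = find_duplicates(raws)
print(dupes)
```

This treats vanilla and modded files identically, so it would catch dupes regardless of which kind of file they were placed in.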
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 11, 2014, 07:46:52 am
Concept image: 
(https://i.imgur.com/3ZULgyC.png)

Click mod in bottom list to append to top list, click mod in top list to remove from top list.  Once you're happy with some combination of green and yellow, click install! 

The launcher rebuilds the merge every time the top list changes and provides live feedback via the colours (yes, it's that fast).  The logic is described higher in the thread, basically line-by-line changes where a second change from the common ancestor (vanilla) is rejected. 
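
To make the "second change is rejected" rule concrete, here is a minimal sketch for a single file using Python's difflib.  It is an illustration of the idea only, not the code in the launcher: the function name merge_mods, the (merged, rejected) return value, and the choice to diff every mod directly against vanilla are all assumptions of this example.

```python
import difflib


def merge_mods(vanilla, mods):
    """Merge each mod's copy of one file onto vanilla; the first change to a spot wins.

    vanilla: list of lines.  mods: list of line lists, in load order.
    Returns (merged_lines, rejected), where rejected counts the changes dropped
    because an earlier mod had already changed the same vanilla lines.
    """
    n = len(vanilla)
    slots = [[line] for line in vanilla]   # what replaces each vanilla line
    inserts = [[] for _ in range(n + 1)]   # lines added before vanilla line i
    changed, inserted_at = set(), set()
    rejected = 0

    for mod in mods:
        matcher = difflib.SequenceMatcher(None, vanilla, mod, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                continue
            if tag == "insert":
                if i1 in inserted_at:      # someone already inserted here
                    rejected += 1
                    continue
                inserts[i1] = mod[j1:j2]
                inserted_at.add(i1)
            else:                          # "replace" or "delete" of vanilla[i1:i2]
                span = range(i1, i2)
                if changed.intersection(span):
                    rejected += 1
                    continue
                slots[i1] = mod[j1:j2]
                for i in range(i1 + 1, i2):
                    slots[i] = []
                changed.update(span)

    merged = []
    for i in range(n):
        merged += inserts[i] + slots[i]
    merged += inserts[n]
    return merged, rejected


vanilla = ["[CREATURE:DWARF]", "[BLOOD]", "[SPEED:900]"]
mod_a = ["[CREATURE:DWARF]", "[BLOOD2]", "[SPEED:900]"]            # edits line 2
mod_b = ["[CREATURE:DWARF]", "[BLOOD]", "[SPEED:900]", "[FLIER]"]  # appends a line
mod_c = ["[CREATURE:DWARF]", "[ICHOR]", "[SPEED:900]"]             # also edits line 2

merged, rejected = merge_mods(vanilla, [mod_a, mod_b, mod_c])
print(merged, rejected)
```

In this run mod_a and mod_b combine cleanly, while mod_c's edit to the line mod_a already changed is rejected.  A GUI could map a non-zero rejected count to a warning colour.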

Until some more of the kinks get worked out, this will be hidden unless you enable it in the LNP options file (hopefully). 



Next steps, in no particular order: 
Title: Re: Proposal: a standard format for mods in a diff/patch Mod Starter Pack
Post by: PeridexisErrant on September 13, 2014, 07:46:28 am
Exciting news all - with help from Pidgeot, I got the mod merger working as part of a PyLNP fork! (https://github.com/PeridexisErrant/PyLNP-plus-mods)
With a working GUI, functional merge logic, and the input format all sorted out I think it's time to close this thread. 

Please join me in the new thread! (http://www.bay12forums.com/smf/index.php?topic=143662) 

It's going to be a party over there, with refinements to the logic and interface - and a call going out for suitable mods to be included  :D
Time to strike the mods!
