Bay 12 Games Forum


Author Topic: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]  (Read 11398 times)

daD

  • Escaped Lunatic
    • View Profile
Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« on: April 26, 2013, 12:49:51 am »

Hey guys, I noticed that DFWiki has been lagging heavily recently, so I decided to make an offline-usable HTML dump (aaand it seemed like a nice idea overall :P).

It took me some time (it was NOT fun :(), but in the end I had something that I could call passable.

This is what I want to share with you guys:



Key points of this package:
  • It looks like dwarffortresswiki.org
  • You can use it offline.
  • It is done using a ton of very dirty hacks (conclusion: it's kind of ugly, but has that ever stopped your dwarfy nature?), but at least you have something for when you are offline, DFWiki is lagging, or you want a small piece of info *fast* (yes, it's VERY fast to use).
  • It has almost all the information from DFW about the current version of DF (v0.34.11), but some (not many) links are broken, some pages are missing, and there is some other weird stuff.
  • A lot of images are absent (nothing can be done here, sorry :()
  • It's just bare HTML with some JS, so you don't have to install ANYTHING; just open index.html in your browser and you are done.
  • Search is working! (Almost; it's just very strict.)
  • It is *NOT* an official DFW release.
  • MediaWiki navigation bars and panels are stripped out (this is a very dirty hack; you just won't see them).
  • All RAWs are in place.

That's all, guys. I hope you like it and use it.
Thanks to all the maintainers and writers of the original wiki; without you the game would be impossible :)

p.s. Briess, sorry for messing with your wiki server and "copying" your hard work.

p.p.s. About the hotfix: it seems that one rather important page was lost at the last moment, and I've reuploaded it. Just put it in the proper articles folder, replacing the old file, and you'll be fine.
« Last Edit: April 26, 2013, 06:37:19 pm by daD »
Logged

Locriani

  • Bay Watcher
  • Locriani == Briess
    • View Profile
    • dwarf fortress wiki
Re: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« Reply #1 on: April 26, 2013, 01:49:24 am »

Or you could just use this with wikitaxi, which is generated on a different server and doesn't lag the wiki for everyone else: :P

http://dwarffortresswiki.org/images/dump.xml.bz2
Autogenerated daily.

The speed issues for the wiki have been fully resolved now, as well.
Logged
I am one of many administrators of the wiki.  Please use my user page (http://dwarffortresswiki.org/index.php/User_talk:Briess) on the wiki to contact me, as I check that more often than these forums.

Locriani

  • Bay Watcher
  • Locriani == Briess
    • View Profile
    • dwarf fortress wiki
Re: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« Reply #2 on: April 26, 2013, 02:13:07 am »

Let me clarify: If you want to continue to do these crawls of the wiki, that's fine - just please try to restrict them to off-peak (2AM-6AM CST) so I don't have to spend more money on additional nodes.

The xml dump is only the active revision of articles because a full dump of the wiki history and contents takes almost 40 hours to process and takes up around 18GB of space.
Logged

daD

  • Escaped Lunatic
    • View Profile
Re: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« Reply #3 on: April 26, 2013, 02:55:17 am »

The problem with wikitaxi is that:
1) It's Windows-only.
2) It doesn't follow redirects at all. That means ALL of the wiki is unusable.

I XML-dumped only certain namespaces and only once, so the load is not a big deal.

You can try to use the dumpHTML extension to generate neat and usable dumps, but it's really slow, and I managed to launch it only from root.

OR, if you can upload an archive of all the images for me, I could possibly make a v2 using dump.xml and the images, and that version would be nice and full and stuff.

p.s. I actually have a full "copy" of your wiki with all extensions and settings, and I've got a question: how did you solve automatic double redirects like Stuffs --> Stuff(#redirect cv:Stuff) --> DF2012:Stuff? My wiki just stops at the Stuff(#redirect df2012:Stuff) page =/
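
For an offline dump, one way around this is to resolve the whole chain ahead of time instead of relying on the wiki software. The following is only a rough Python sketch of that idea, not how dwarffortresswiki.org actually does it. It assumes the pages are already loaded into a dict of title to wikitext, that redirects use the usual `#REDIRECT [[Target]]` form, and that `cv` is an alias for the `DF2012` namespace as in the example above. The names `NAMESPACE_ALIASES`, `normalize` and `resolve` are made up for illustration.

```python
import re

# Matches "#REDIRECT [[Target]]" at the start of a page, case-insensitively.
REDIRECT_RE = re.compile(r"^\s*#redirect\s*\[\[([^\]|#]+)", re.IGNORECASE)

# Hypothetical alias table: "cv" stands for the current-version namespace.
NAMESPACE_ALIASES = {"cv": "DF2012", "df2012": "DF2012"}


def normalize(title):
    """Expand an aliased namespace prefix, e.g. 'cv:Stuff' -> 'DF2012:Stuff'."""
    title = title.strip()
    prefix, sep, rest = title.partition(":")
    if sep and prefix.lower() in NAMESPACE_ALIASES:
        return NAMESPACE_ALIASES[prefix.lower()] + ":" + rest
    return title


def resolve(title, pages, max_hops=5):
    """Follow redirects from `title` until a real page, a dead end, or a loop."""
    seen = set()
    current = normalize(title)
    for _ in range(max_hops):
        if current in seen or current not in pages:
            break  # loop detected, or the target page is missing
        seen.add(current)
        match = REDIRECT_RE.match(pages[current])
        if match is None:
            break  # reached a page with real content
        current = normalize(match.group(1))
    return current
```

With something like this, every internal link in the generated HTML could point straight at `resolve(link_target, pages)`, so the reader never lands on an intermediate redirect page.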
Logged

Locriani

  • Bay Watcher
  • Locriani == Briess
    • View Profile
    • dwarf fortress wiki
Logged

lethosor

  • Bay Watcher
    • View Profile
Re: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« Reply #5 on: April 26, 2013, 04:50:10 pm »

If you run an actual MediaWiki installation, this extension might help (I haven't tried it yet). Unfortunately, the majority of users don't have their own MediaWiki installation, and I haven't found a cross-platform offline wiki reader yet (there's Kiwix, but it's Wikipedia-only).
« Last Edit: May 05, 2013, 10:37:06 am by lethosor »
Logged
DFHack - Dwarf Manipulator (Lua) - DF Wiki talk

There was a typo in the siegers' campfire code. When the fires went out, so did the game.

mareck

  • Bay Watcher
  • Ahhhh, Dwarf meat!
    • View Profile
Re: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« Reply #6 on: May 16, 2013, 02:26:03 pm »

@Locriani (Lord of the Wiki)

Would it be possible to do an extract of all the 40D articles?
The reason I ask is that all I really want is that data, not the data for any other version.
And I don't want to spider your site at the moment, as it seems to be under very heavy load. (2013-05-16 19:24:20: Fatal exception of type MWException)
Logged
Art is a Blast

Locriani

  • Bay Watcher
  • Locriani == Briess
    • View Profile
    • dwarf fortress wiki
Re: Dwarf Fortress Wiki Offline Dump [UNOFFICIAL]
« Reply #7 on: May 17, 2013, 11:07:59 pm »

The site's not under heavy load; that error is due to a reverse proxy being in place between your computer and the edge node you communicate with on the wiki.  I haven't figured out a solution for that yet, aside from instructing people to use a webserve node (dfweb1 - 3) in the interim.

Also, I'm hardly the lord of the wiki by any means, I just make sure it doesn't break >.>

http://dwarffortresswiki.org/images/dump.xml.bz2 is a dump of all the wiki pages (current revision of the page only) that is autogenerated nightly.  You'll need wikitaxi or another similar program to use this dump.
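
If all you want is the raw wikitext for one version's articles, a small script can also read the dump directly. Below is a minimal Python sketch, assuming the file follows the standard MediaWiki XML export layout (`<page>` elements containing a `<title>` and a `<revision>` with `<text>`) and that 40d articles have titles starting with `40d:`. Both are assumptions worth checking against the real file, and the function names are invented for this example.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export."""
    title, text = None, ""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title, text = None, ""
            elem.clear()  # drop the finished page's contents


def pages_with_prefix(xml_stream, prefix):
    """Keep only pages whose title starts with `prefix` (case-insensitive)."""
    wanted = prefix.lower()
    for title, text in iter_pages(xml_stream):
        if title and title.lower().startswith(wanted):
            yield title, text


def list_titles(path, prefix):
    """Return matching titles from a bz2-compressed dump stored at `path`."""
    with bz2.open(path, "rb") as dump:
        return [title for title, _text in pages_with_prefix(dump, prefix)]
```

After downloading the file by hand, `list_titles("dump.xml.bz2", "40d:")` would give the list of matching article titles without touching the live wiki at all. The dump holds wikitext, not rendered HTML, so templates and redirects still need separate handling.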

If that doesn't work for you, spidering is okay as long as:

  • You spider in off peak hours: 2AM-6AM CST
  • You spider as a logged out user (logged out pages are heavily cached and cause less load)
  • You spider against a webserve node (dfweb1 - 3)
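
The off-peak rule is the easiest of these to enforce in a script. Here is a minimal Python sketch of a time gate, assuming "CST" is meant literally as UTC-6 with no daylight-saving shift (if the wiki's clock follows US Central daylight time, the window moves by an hour). The helper names are hypothetical. The sketch makes no requests itself; the crawler would still need to send them without login cookies and point at a webserve node.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "CST" is read literally as UTC-6, with no daylight-saving shift.
CST = timezone(timedelta(hours=-6), "CST")
OFF_PEAK_START_HOUR = 2  # 2AM CST, inclusive
OFF_PEAK_END_HOUR = 6    # 6AM CST, exclusive


def in_off_peak(moment):
    """Return True if the timezone-aware `moment` falls in 2AM-6AM CST."""
    local = moment.astimezone(CST)
    return OFF_PEAK_START_HOUR <= local.hour < OFF_PEAK_END_HOUR


def seconds_until_off_peak(moment):
    """Return how long to wait before the next off-peak window opens."""
    if in_off_peak(moment):
        return 0.0
    local = moment.astimezone(CST)
    start = local.replace(hour=OFF_PEAK_START_HOUR, minute=0, second=0, microsecond=0)
    if start <= local:
        start += timedelta(days=1)  # today's window is over, wait for tomorrow's
    return (start - local).total_seconds()
```

A crawler loop could call `seconds_until_off_peak(datetime.now(timezone.utc))` before each batch and sleep for that long, which keeps it from running into peak hours if a crawl takes longer than expected.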
Logged