Migrating from MediaWiki to BookStack

Posted by Brad on Mon 15 January 2018

I generally like MediaWiki, but I also like simple. So when I found BookStack and saw that it supported Markdown, I jumped on it. Sure, it doesn't have all the features that MediaWiki does, but I do think it has a much cleaner interface that puts the focus on the documentation instead of the application. My one complaint with BookStack originally was that it didn't have a true public mode, but then I found it in the global settings. The other con with BookStack is that it's a bit awkward to set up since it uses Composer, which makes the install harder to automate, but we'll let it slide since it's unlikely I'll be installing this all over the place.

So with that one complaint dealt with, I decided it was time to fully migrate from the wiki to BookStack. I was originally just going to move things around manually, but then I realized this would be the perfect project to do some real Python scripting. There are a few different ways this could be accomplished, but the two that stood out were going through the backend databases or going through the web interfaces. I'm not much of a database guy and I wasn't sure what all would need to be populated in the DB, so I decided to go through the frontend web interfaces.

I won't go into every little bit of the script, as it's mostly just a loop that replaces some bits with other bits, but you can find the complete script (and others) here. I ultimately ended up taking advantage of a few different Python modules: Beautiful Soup, Selenium, and requests. And while it's probably not the cleanest way of doing it and it's not perfect, I think it turned out pretty well. I ran into a few problems with Selenium's send_keys function and the long blocks of text I was dealing with. Most of the internet seemed to recommend JavaScript as an alternative, but I simply looped through the text so send_keys was only sending one character at a time. While not really practical, it was interesting to watch Python's typing speed. I think next time I would try doing a copy/paste operation instead.
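For anyone curious what that character-at-a-time loop might look like, here is a minimal sketch rather than the actual code from my script. The names send_in_chunks and RecordingElement are made up for illustration. The only thing assumed about the element is that it has a send_keys method, which Selenium's WebElement does; the stand-in class just lets the loop run without a browser.

```python
def send_in_chunks(element, text, chunk_size=1):
    """Send text to an element a little at a time.

    `element` is assumed to be anything with a send_keys(str) method,
    such as a Selenium WebElement. Returns the number of send_keys calls.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    calls = 0
    for start in range(0, len(text), chunk_size):
        # Slice out the next piece and hand it to the element.
        element.send_keys(text[start:start + chunk_size])
        calls += 1
    return calls


class RecordingElement:
    """Hypothetical stand-in for a WebElement, used for a dry run."""

    def __init__(self):
        self.received = []

    def send_keys(self, value):
        self.received.append(value)


if __name__ == "__main__":
    fake = RecordingElement()
    count = send_in_chunks(fake, "## Heading\nSome text", chunk_size=1)
    print(count, "calls,", len("".join(fake.received)), "characters sent")
```

With chunk_size=1 this matches the one-character-at-a-time approach. A larger chunk size would likely be faster, though I haven't tested where send_keys starts to struggle.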

Another fun peculiarity I hit with the send_keys method is that the BookStack Markdown editor auto-indents each new line based on the previous line's indentation. This caused a few formatting issues, as the indents made normal text appear to be part of a code block. The final fun bit was timing things so the script didn't start looking for parts of the page until it was fully loaded. I took the lazy way out and just put in a pause. I suppose I should have tested for the element and then waited if it wasn't there, but for a script like this I think the easy way is good enough.
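The "test for the element and wait if it isn't there" idea could be sketched roughly like this. It is not what my script does (mine just sleeps), and wait_until is a hypothetical helper name. Selenium also ships its own WebDriverWait class for this job, which is probably the better choice in a real script; the version below is plain Python so the polling logic is easy to see.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns something truthy.

    Returns that truthy value, or raises TimeoutError once `timeout`
    seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Assumed usage with Selenium (the element ID here is hypothetical):
#   from selenium.webdriver.common.by import By
#   matches = wait_until(lambda: driver.find_elements(By.ID, "editor-input"))
#   editor = matches[0]
```

Since find_elements returns an empty list when nothing matches, it works as a condition without any exception handling.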

So there you have it, my first real Python script that wasn't a practice problem from a book.