I have many text files that I want to upload to a wiki running MediaWiki. I don't even know if this is really possible, but I want to give it a shot.
Each text file's name will be the title of the wiki page.
One wiki page for one file.
I want to upload all text files from the same folder as the program is in.
Perhaps asking you to code it all is asking too much, so could you tell me at least which language I should look for to give it a shot?
What you probably want is a bot to create the articles for you using the MediaWiki API. Probably the best-known bot framework is pywikipedia (since renamed Pywikibot) for Python, but there are API libraries and bot frameworks for many other languages too.
In fact, pywikipedia comes with a script called pagefromfile.py that does something pretty close to what you want. By default, it creates multiple pages from a single file, but if you know some Python, it shouldn't be too hard to change that.
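If you'd rather not adapt pagefromfile.py, here is a minimal sketch of the same idea written directly against the MediaWiki API, using only the Python standard library. It is not how pywikipedia does it. API_URL is a placeholder for your wiki's api.php, the function names are made up for illustration, and it assumes the files are UTF-8 and that the account is allowed to log in through action=login (for example with a bot password). The createonly parameter makes the edit fail instead of overwriting a page that already exists.

```python
import json
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar
from pathlib import Path

API_URL = "https://wiki.example.org/w/api.php"  # placeholder: your wiki's api.php


def pages_from_folder(folder):
    """Return (title, text) pairs, one per .txt file in the folder."""
    pairs = []
    for path in sorted(Path(folder).glob("*.txt")):
        # The file name without its extension becomes the page title.
        pairs.append((path.stem, path.read_text(encoding="utf-8")))
    return pairs


def api_call(opener, params, post=False):
    """Send one request to the MediaWiki API and decode the JSON reply."""
    query = urllib.parse.urlencode({**params, "format": "json"})
    if post:
        request = urllib.request.Request(API_URL, data=query.encode("utf-8"))
    else:
        request = urllib.request.Request(API_URL + "?" + query)
    with opener.open(request) as response:
        return json.load(response)


def upload_folder(folder, username, password):
    """Log in, then create one page per text file; existing pages are not overwritten."""
    # The cookie jar keeps the login session between requests.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))

    reply = api_call(opener, {"action": "query", "meta": "tokens", "type": "login"})
    login_token = reply["query"]["tokens"]["logintoken"]
    api_call(opener, {"action": "login", "lgname": username,
                      "lgpassword": password, "lgtoken": login_token}, post=True)

    reply = api_call(opener, {"action": "query", "meta": "tokens"})
    csrf_token = reply["query"]["tokens"]["csrftoken"]

    for title, text in pages_from_folder(folder):
        reply = api_call(opener, {"action": "edit", "title": title, "text": text,
                                  "createonly": "1", "token": csrf_token}, post=True)
        # Print the API's verdict (or its error) for each page.
        print(title, reply.get("edit", reply.get("error")))
```

To upload everything in the folder the script is run from, you would call something like upload_folder(".", "YourBot@uploader", "bot-password") with your own credentials. For a large number of files you would probably also want to pause between edits and check the login reply before continuing.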
Actually, if the files are on the same server your wiki runs on (or you can upload them there), then you don't even need a bot at all: there's a MediaWiki maintenance script called importTextFile.php that can do it for you. You can run it for all files in a given directory with a simple shell script, e.g.:
for file in directory/*.txt; do
    php /path/to/your/mediawiki/maintenance/importTextFile.php "$file"
done
(Obviously, replace directory with the directory containing the text files and /path/to/your/mediawiki with the actual path of your MediaWiki installation.)
By default, importTextFile.php will base the name of the created page on the filename, stripping any directory prefixes and extensions. Also, per standard MediaWiki page naming rules, underscores will be replaced by spaces and the first letter will be capitalized (unless you've turned that off in your LocalSettings.php); thus, for example, the file directory/foo_bar.txt would be imported as the page "Foo bar". If you want finer control over the page naming, importTextFile.php also supports an explicit --title parameter. Or you could always copy the script and modify it yourself to change the page naming rules.
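If you want to preview which page titles your files would end up with before importing anything, the naming rule described above can be approximated in a few lines of Python. This is only a sketch of the rule as stated here, not a port of the PHP script; the function name and the capitalize_first switch are made up for illustration, and MediaWiki's real title normalization does more than this.

```python
import os


def title_from_filename(filename, capitalize_first=True):
    """Approximate the default page title for one file path."""
    base = os.path.basename(filename)   # strip directory prefixes
    name = base.split(".")[0]           # strip extensions (everything after the first dot)
    title = name.replace("_", " ")      # underscores become spaces
    if capitalize_first and title:
        # Upper-case only the first letter, leaving the rest untouched.
        title = title[0].upper() + title[1:]
    return title
```

For example, title_from_filename("directory/foo_bar.txt") returns "Foo bar". Passing capitalize_first=False mimics a wiki where first-letter capitalization has been turned off.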
P.S. There's also another MediaWiki maintenance script called edit.php that does pretty much the same thing as importTextFile.php, except that it reads the page text from standard input and doesn't have the convenient default page naming rules of importTextFile.php. It can be quite handy for automated edits using Unix pipelines, though.
Addendum: The importTextFile.php script expects the file names and contents to be in the UTF-8 encoding. If your files are in some other encoding, you'll have to either fix them first or modify the script to do the conversion, e.g. using mb_convert_encoding().
In particular, the following modifications to the script ought to do it:
To convert the file names to UTF-8, edit the titleFromFilename() function, near the bottom of the script, and replace its last line:
return $parts[0];
with:
return mb_convert_encoding( $parts[0], "UTF-8", "your-encoding" );
where your-encoding should be the character encoding used for your file names (or auto to attempt auto-detection).
To also convert the contents of the files, make a similar change higher up, inside the main code of the script, replacing the line:
$text = file_get_contents( $filename );
with:
$text = file_get_contents( $filename );
$text = mb_convert_encoding( $text, "UTF-8", "your-encoding" );
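If you'd rather fix the files first and leave the script untouched, a small Python sketch like the one below could write UTF-8 copies of the file contents into a separate folder, which you then import as usual. The function name and arguments are made up for illustration, and you have to know (and pass in) the encoding the files are currently in. It converts only the contents; the file names are copied as they are.

```python
from pathlib import Path


def convert_folder_to_utf8(source_dir, target_dir, source_encoding):
    """Write a UTF-8 copy of every .txt file in source_dir into target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    converted = []
    for path in sorted(Path(source_dir).glob("*.txt")):
        # Decode from the old encoding, then re-encode as UTF-8.
        # Working on bytes keeps the line endings exactly as they were.
        text = path.read_bytes().decode(source_encoding)
        (target / path.name).write_bytes(text.encode("utf-8"))
        converted.append(path.name)
    return converted
```

For example, convert_folder_to_utf8("directory", "directory-utf8", "iso-8859-1") would leave the originals alone and put the converted copies in directory-utf8. A file that is not valid in the given encoding raises UnicodeDecodeError instead of being silently mangled.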