Invoking the MediaWiki Page Parser to get HTML?

Tags: php, mediawiki

I'd like to get the HTML for a MediaWiki page; that is, I want to run the MediaWiki markup through the parser. I know I could use an external parser, but most of them do not support transclusion or (naturally) extensions, so the output would differ.

As I have access to the MediaWiki installation, I wonder whether I can use the built-in parser to render the page for me. I don't want to screen-scrape because of everything else on the page (navigation, sidebar, JavaScript and CSS includes, etc.); I literally just want the body.

If it matters, it is running MediaWiki 1.12 on PHP 5.2.

Michael Stum, asked Dec 17 '22 04:12


2 Answers

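Use action=render, e.g. index.php?title=Article_title&action=render. This returns only the rendered page content, without the surrounding skin (navigation, sidebar, and so on).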
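As a rough illustration of calling action=render from PHP, here is a minimal sketch. The helper names buildRenderUrl() and fetchRenderedHtml() and the example host are hypothetical, not part of MediaWiki, and the sketch assumes allow_url_fopen is enabled so that file_get_contents() can open URLs.

```php
<?php
// Hypothetical helper (not part of MediaWiki): builds an action=render URL.
// $indexUrl is assumed to point at the wiki's index.php.
function buildRenderUrl( $indexUrl, $title ) {
    $query = http_build_query(
        array(
            // MediaWiki page URLs use underscores in place of spaces.
            'title'  => str_replace( ' ', '_', $title ),
            'action' => 'render',
        ),
        '',
        '&'
    );
    return $indexUrl . '?' . $query;
}

// Hypothetical helper: fetches the rendered body HTML, or false on failure.
// Assumes allow_url_fopen is enabled; otherwise cURL would be needed instead.
function fetchRenderedHtml( $indexUrl, $title ) {
    return @file_get_contents( buildRenderUrl( $indexUrl, $title ) );
}
```

A call such as fetchRenderedHtml( 'http://wiki.example.org/index.php', 'Article title' ) would then request the body HTML for that page. Because the request goes through the wiki itself, transclusion and installed extensions are applied by the wiki rather than by the calling script.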

Bryan, answered Jan 03 '23 15:01


Yes, you can do that. In fact, I have done this very thing in many of my published extensions.

One of my extensions that does this is SecureTransclusion.

The relevant snippet follows:

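// Parser function hook: fetches $page and returns its rendered HTML.
public function mg_strans( &$parser, $page, $errorMessage = null, $timeout = 5 ) {

    // Bail out if the calling page may not use this function.
    if ( !self::checkExecuteRight( $parser->mTitle ) ) {
        return 'SecureTransclusion: ' . wfMsg( 'badaccess' );
    }

    $title = Title::newFromText( $page );
    if ( !is_object( $title ) ) {
        return 'SecureTransclusion: ' . wfMsg( 'badtitle' ) . " ($page)";
    }

    // Transcludable interwiki titles are fetched remotely; others are read locally.
    if ( $title->isTrans() ) {
        $content = $this->getRemotePage( $parser, $title, $errorMessage, $timeout );
    } else {
        $content = $this->getLocalPage( $title, $errorMessage );
    }

    // Run the wikitext through the parser and extract the HTML.
    $po = $parser->parse( $content, $parser->mTitle, new ParserOptions() );
    $html = $po->getText();

    // Flag the result as finished HTML so it is not parsed again as wikitext.
    return array( $html, 'noparse' => true, 'isHTML' => true );
}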
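Outside of a parser function hook, the same parse() and getText() calls could be wrapped in a standalone helper. The following is only a sketch under stated assumptions: renderLocalPageHtml() is a hypothetical name, the code must run inside a loaded MediaWiki environment (for example a maintenance script), and the Revision and Parser calls follow the MediaWiki 1.12-era API as I understand it, so they should be checked against the installed version.

```php
<?php
// Sketch only: must run inside a loaded MediaWiki environment.
// renderLocalPageHtml() is a hypothetical name; the MediaWiki calls below
// are assumed to match the 1.12-era API.
function renderLocalPageHtml( $pageName ) {
    $title = Title::newFromText( $pageName );
    if ( !is_object( $title ) ) {
        return false; // invalid title
    }

    // Load the current revision of the page to get its wikitext.
    $revision = Revision::newFromTitle( $title );
    if ( !is_object( $revision ) ) {
        return false; // page does not exist
    }

    // Use a fresh Parser instance rather than the shared global parser.
    $parser = new Parser();
    $output = $parser->parse( $revision->getText(), $title, new ParserOptions() );

    return $output->getText(); // body HTML only, no skin
}
```

The function returns false for an invalid or missing page, so a caller can distinguish those cases from an empty page body.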
jldupont, answered Jan 03 '23 15:01