
How does the "edit section" feature on Wikipedia work?

How does Wikipedia implement the "edit this section" feature for its articles, wherein a user can edit just a section of an article, rather than the whole article? I've tried crawling through MediaWiki's code by myself, but it's a bit dense for me to understand. Here's my guess (but only a guess):

The user clicks on [edit] in an article. This links to the regular edit page, but with an additional parameter passed via GET in the URL specifying which section to edit. Without this extra parameter, MediaWiki would normally simply present the user with a form editing the entire page. But specifying a section to edit causes MediaWiki to extract ONLY that section and present it for editing.

What stumps me is how MediaWiki parses out individual sections. From what I understand, MW doesn't store the sections individually - it stores the ENTIRE text of each version of the page as one big block of text (in addition to boatloads of metadata). Does MW simply look for H1, H2, H3, ... tags, and use those to split up the text into regions when it renders the page? And when a user saves a revised section, does it look at the current version, re-parse the text into sections, and just "inject" the new section text into a copy of the current version, which it then saves as a NEW version?

I assume my understanding of MediaWiki is grossly simplified, but I'm just trying to get a rough idea.

Thanks!

loneboat asked Sep 11 '10 02:09



1 Answer

This might be a clue - from http://en.wikipedia.org/wiki/Help:Section. The sections use a specific markup as such:

==Section==

===Subsection===

====Sub-subsection====
  • Using the same heading more than once on a page causes problems.
  • When a section with a duplicate name is edited, the edit history and summary will be ambiguous as to which section was edited.
  • When saving the page after a section edit, the editor's browser may navigate to the wrong section.

Sections can be separately edited by clicking special edit links labeled "[edit]" by the heading, or by right clicking on the section heading, depending on the preferences set. This is called "section editing feature" (Preferences -> Editing -> "Enable section editing via [edit] links"). Section editing feature will take you to an edit page by a URL such as

http://en.wikipedia.org/w/index.php?title=Help:Section&action=edit&section=2

Note that here section numbers are used, not section titles; subsections have a single number, e.g. section 2.1 may be numbered 3, section 3 is then numbered 4, etc. You can also directly type in such URLs in the address bar of your browser.
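As a rough illustration of the URL scheme quoted above, here is a minimal Python sketch that builds such a link and reads the section number back out of it. The helper names section_edit_url and parse_section_param are made up for this example and are not part of MediaWiki; only the title, action and section query parameters come from the URL shown above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base script path taken from the example URL above.
BASE = "http://en.wikipedia.org/w/index.php"

def section_edit_url(title, section, base=BASE):
    """Build an edit link for one numbered section (hypothetical helper)."""
    # safe=":" keeps the namespace colon readable, as in the example URL.
    query = urlencode({"title": title, "action": "edit", "section": section}, safe=":")
    return f"{base}?{query}"

def parse_section_param(url):
    """Return the section number from an edit URL, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("section")
    if values and values[0].isdigit():
        return int(values[0])
    return None

url = section_edit_url("Help:Section", 2)
print(url)
print(parse_section_param(url))
```

A URL without the section parameter yields None here, which corresponds to the "edit the whole page" case described in the question.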

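So it looks like the parser counts the section headings in the order they appear on the page (the same order as the TOC) and then uses the = heading markup to find where the requested section starts and ends, so that only that text is placed into the editor.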
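To make the idea concrete, here is a minimal Python sketch of heading-based section extraction and replacement. It is not MediaWiki's actual code (the real parser also has to deal with templates, comments, nowiki blocks and so on), and the function names are invented for this example. It assumes section 0 is the text before the first heading, that headings are numbered in document order, and that a section runs until the next heading of the same or a higher level, so it includes its own subsections.

```python
import re

# A heading line has the same run of "=" on both sides, e.g. "==Section==".
HEADING = re.compile(r"^(={1,6})\s*(.+?)\s*\1\s*$")

def find_sections(wikitext):
    """Return (lines, sections); sections is a list of (level, start_line)."""
    lines = wikitext.split("\n")
    sections = [(0, 0)]  # section 0: lead text before the first heading
    for i, line in enumerate(lines):
        match = HEADING.match(line)
        if match:
            sections.append((len(match.group(1)), i))
    return lines, sections

def section_span(lines, sections, n):
    """Return the (start, end) line range covered by section n."""
    level, start = sections[n]
    end = len(lines)
    for next_level, next_start in sections[n + 1:]:
        # The lead ends at the first heading; other sections end at the
        # next heading of the same or a higher level.
        if n == 0 or next_level <= level:
            end = next_start
            break
    return start, end

def get_section(wikitext, n):
    """Extract the wikitext of section n for the edit form."""
    lines, sections = find_sections(wikitext)
    start, end = section_span(lines, sections, n)
    return "\n".join(lines[start:end])

def replace_section(wikitext, n, new_text):
    """Splice edited section text back into the full page text."""
    lines, sections = find_sections(wikitext)
    start, end = section_span(lines, sections, n)
    return "\n".join(lines[:start] + new_text.split("\n") + lines[end:])

page = "Intro\n==A==\ntext a\n===A1===\ntext a1\n==B==\ntext b"
print(get_section(page, 2))
print(replace_section(page, 2, "===A1===\nnew text"))
```

Under these assumptions, saving a section edit amounts to calling something like replace_section on the current page text and storing the result as a new full revision, which matches the guess in the question.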

Here are some of the tables used:

Page Table - (http://www.mediawiki.org/wiki/Manual:Page_table) - Each page in a MediaWiki installation has an entry here which identifies it by title

Revision Table - holds metadata for every edit done to a page within the wiki. Every edit of a page creates a revision row, which holds information such as the user who made the edit, the time at which the edit was made, and a reference to the new wikitext in the text table.

Text Table - holds the wikitext of individual page revisions.

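The contents of pages are stored as BLOBs, but the BLOB only holds the wikitext itself. So the text is read back out as a string and parsed as wikitext, not as binary data.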
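Here is a small sketch of how those three tables fit together, using an in-memory SQLite database from Python. The schema is heavily simplified and only loosely modeled on the tables described above; the exact columns are assumptions for illustration, and save_revision and load_latest are made-up helper names. The point it tries to show is that every save stores the whole page text again and adds a new revision row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Simplified, assumed schema: page -> latest revision -> stored text.
db.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT UNIQUE, page_latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_text_id INTEGER,
                       rev_user TEXT, rev_timestamp TEXT);
CREATE TABLE "text" (old_id INTEGER PRIMARY KEY, old_text BLOB);
""")

def save_revision(title, wikitext, user, timestamp):
    """Store the full wikitext as a new revision and return its rev_id."""
    cur = db.cursor()
    cur.execute("INSERT OR IGNORE INTO page (page_title) VALUES (?)", (title,))
    page_id = cur.execute(
        "SELECT page_id FROM page WHERE page_title = ?", (title,)).fetchone()[0]
    # The whole page text goes in as a BLOB, even if only one section changed.
    text_id = cur.execute(
        'INSERT INTO "text" (old_text) VALUES (?)', (wikitext.encode("utf-8"),)).lastrowid
    rev_id = cur.execute(
        "INSERT INTO revision (rev_page, rev_text_id, rev_user, rev_timestamp) "
        "VALUES (?, ?, ?, ?)", (page_id, text_id, user, timestamp)).lastrowid
    cur.execute("UPDATE page SET page_latest = ? WHERE page_id = ?", (rev_id, page_id))
    db.commit()
    return rev_id

def load_latest(title):
    """Return the current wikitext of a page as a string, or None."""
    row = db.execute(
        'SELECT old_text FROM page '
        'JOIN revision ON rev_id = page_latest '
        'JOIN "text" ON old_id = rev_text_id '
        'WHERE page_title = ?', (title,)).fetchone()
    return row[0].decode("utf-8") if row else None

save_revision("Help:Section", "==A==\nold text", "alice", "2010-09-11T02:09:00Z")
save_revision("Help:Section", "==A==\nnew text", "bob", "2010-09-11T03:00:00Z")
print(load_latest("Help:Section"))
```

Older revisions stay in the table untouched; the page row just points at the newest one.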

Hope this helps.

Todd Moses answered Oct 14 '22 03:10