
Wikipedia API: search for famous people

I have the following Wikipedia API search query:

http://en.wikipedia.org/w/api.php?&action=query&generator=search&gsrnamespace=0&gsrlimit=20&prop=pageimages|extracts&pilimit=max&exintro&exsentences=1&exlimit=max&continue&pithumbsize=100&gsrsearch=Albert%20Einstein

I just want to list famous people - is there a way to do that?

rybo111 asked May 23 '15 22:05



2 Answers

There isn't an exact way to limit your search results to famous people only. However, you can use a few of the filters in Wikipedia's CirrusSearch to roughly narrow your results to people:

  • incategory: Can you find a category that includes the people you want? Categories may not be a great solution, since they may be inconveniently specific.
  • linksto: Do articles about people link to a common article?
  • hastemplate: Can you find a template that is used on biographies of famous people? The template {{birth date}} may be a good solution (if it's fine to limit your search to mostly non-fictional people with non-disputed known birthdates).
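As a rough sketch of how these filters combine with the search terms, the snippet below builds the API URL using only Python's standard library. The function name build_search_url and its parameters are illustrative, not part of any Wikipedia client, and format=json is an assumption added here (the URL in the question does not set it). Filter values that contain spaces, such as a category name, may need quotes or underscores, which this sketch does not handle.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(terms, filters=None, limit=20):
    """Build a generator=search URL, prefixing `terms` with CirrusSearch filters.

    `filters` maps a filter keyword (e.g. "hastemplate") to its value.
    This helper is hypothetical; it only assembles the query string.
    """
    filter_part = " ".join(f"{key}:{value}" for key, value in (filters or {}).items())
    search = f"{filter_part} {terms}".strip()
    params = {
        "action": "query",
        "format": "json",  # assumption: not present in the original URL
        "generator": "search",
        "gsrnamespace": 0,
        "gsrlimit": limit,
        "gsrsearch": search,
        "prop": "pageimages|extracts",
        "pilimit": "max",
        "pithumbsize": 100,
        "exintro": "",
        "exsentences": 1,
        "exlimit": "max",
        "continue": "",
    }
    return f"{API_URL}?{urlencode(params)}"


url = build_search_url("Albert Einstein", {"hastemplate": "Birth_date"})
print(url)
```

The resulting URL contains gsrsearch=hastemplate%3ABirth_date+Albert+Einstein; the + is just another encoding of the space that the question's URL writes as %20.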

For example, here is your same search with hastemplate:Birth_date added, which returns mostly people:

https://en.wikipedia.org/w/api.php?&action=query&generator=search&gsrnamespace=0&gsrlimit=20&prop=pageimages|extracts&pilimit=max&exintro&exsentences=1&exlimit=max&continue&pithumbsize=100&gsrsearch=hastemplate%3ABirth_date+Albert%20Einstein

{
"batchcomplete": "",
"continue": {
    "gsroffset": 20,
    "continue": "gsroffset||"
},
"query": {
    "pages": {
        "92733": {
            "pageid": 92733,
            "ns": 0,
            "title": "Albert A. Michelson",
            "index": 14,
            "thumbnail": {
                "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/9/9e/Albert_Abraham_Michelson2.jpg/71px-Albert_Abraham_Michelson2.jpg",
                "width": 71,
                "height": 100
            },
            "pageimage": "Albert_Abraham_Michelson2.jpg",
            "extract": "<p><b>Albert Abraham Michelson</b> (surname pronunciation anglicized as \"Michael-son\", December 19, 1852 \u2013 May 9, 1931) was an American physicist known for his work on the measurement of the speed of light and especially for the Michelson\u2013Morley experiment.</p>"
        },
        "736": {
            "pageid": 736,
            "ns": 0,
            "title": "Albert Einstein",
            "index": 1,
            "thumbnail": {
                "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3e/Einstein_1921_by_F_Schmutzer_-_restoration.jpg/76px-Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
                "width": 76,
                "height": 100
            },
            "pageimage": "Einstein_1921_by_F_Schmutzer_-_restoration.jpg",
            "extract": "<p><b>Albert Einstein</b> (<span><span>/<span><span title=\"/\u02c8/ primary stress follows\">\u02c8</span><span title=\"/a\u026a/ long 'i' in 'tide'\">a\u026a</span><span title=\"'n' in 'no'\">n</span><span title=\"'s' in 'sigh'\">s</span><span title=\"'t' in 'tie'\">t</span><span title=\"/a\u026a/ long 'i' in 'tide'\">a\u026a</span><span title=\"'n' in 'no'\">n</span></span>/</span></span>; <small>German:</small> <span title=\"Representation in the International Phonetic Alphabet (IPA)\">[\u02c8alb\u025b\u0250\u032ft \u02c8a\u026an\u0283ta\u026an]</span>; 14 March 1879&#160;\u2013 18 April 1955) was a German-born theoretical physicist.</p>"
        },
        "1139788": {
            "pageid": 1139788,
            "ns": 0,
            "title": "Alfred Einstein",
            "index": 6,
            "thumbnail": {
                "source": "https://upload.wikimedia.org/wikipedia/en/thumb/1/12/Alfred_Einstein.jpg/70px-Alfred_Einstein.jpg",
                "width": 70,
                "height": 100
            },
            "pageimage": "Alfred_Einstein.jpg",
            "extract": "<p><b>Alfred Einstein</b> (December 30, 1880&#160;\u2013 February 13, 1952) was a German-American musicologist and music editor.</p>"
        },

        ...
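Note that "pages" is an object keyed by page ID, so the entries above are not in search order (Albert Einstein has index 1 but appears second). A small sketch of restoring the ranking from the "index" field, assuming the response has already been decoded into a Python dict; the function name ranked_results is made up for illustration:

```python
def ranked_results(response):
    """Return (title, thumbnail URL or None) pairs ordered by search rank.

    `response` is the decoded JSON of a generator=search query. Pages
    without an "index" field are placed last.
    """
    pages = response.get("query", {}).get("pages", {})
    ordered = sorted(pages.values(), key=lambda page: page.get("index", float("inf")))
    return [
        (page["title"], page.get("thumbnail", {}).get("source"))
        for page in ordered
    ]


# Trimmed-down version of the response shown above.
sample = {
    "query": {
        "pages": {
            "92733": {"pageid": 92733, "title": "Albert A. Michelson", "index": 14},
            "736": {"pageid": 736, "title": "Albert Einstein", "index": 1},
            "1139788": {"pageid": 1139788, "title": "Alfred Einstein", "index": 6},
        }
    }
}

for title, thumbnail in ranked_results(sample):
    print(title, thumbnail)
```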

Someday, you should be able to use Wikidata to search for entities on Wikipedia that are an instance of human. For now, we'll have to work with search filters.

slaporte answered Sep 25 '22 01:09



My workaround for now is to filter the search results server-side, showing only articles that have birth_date in their revision content.
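Roughly, that filter could look like the sketch below. It assumes the wikitext of each result has already been fetched into a dict keyed by title; the names looks_like_biography and filter_people and the sample wikitext are hypothetical, and the fetching step is left out.

```python
import re

# Case-insensitive match for the infobox parameter name "birth_date".
BIRTH_DATE_PATTERN = re.compile(r"birth_date", re.IGNORECASE)


def looks_like_biography(wikitext):
    """Heuristic: treat an article as a biography if it mentions birth_date."""
    return bool(BIRTH_DATE_PATTERN.search(wikitext))


def filter_people(wikitext_by_title):
    """Keep only the titles whose wikitext passes the birth_date check."""
    return [
        title
        for title, wikitext in wikitext_by_title.items()
        if looks_like_biography(wikitext)
    ]


# Made-up wikitext snippets, for illustration only.
articles = {
    "Albert Einstein": "{{Infobox scientist | birth_date = {{Birth date|1879|3|14}} }}",
    "Einstein field equations": "The Einstein field equations relate geometry to matter.",
}
print(filter_people(articles))
```

This is only a heuristic: articles about people whose infobox lacks a birth_date field would be dropped.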

The bounty is still available if someone finds a way around this.

rybo111 answered Sep 25 '22 01:09
