
How do I access the StackOverflow API from Mathematica

I was wondering the other day if StackOverflow had an API I could access from Mathematica, and apparently it does: "Saving plot annotations"

What's the best way to get data from StackOverflow into Mathematica? Sjoerd used the information to make a plot. I'm interested in adding SO-related notifications into a docked cell I keep in my notebooks, so I can tell when there are updates or responses without leaving Mathematica.

Brett Champion asked Apr 21 '11 14:04




2 Answers

By popular demand, here is the code that generates the top-10 SO answerers plot (without the annotations) using the SO API. It's a pretty neat and complete API with lots of goodies, and it is easy to use, as the code shows.

Update: added an app key so that the code cooperates better with the SO API (higher daily call cap). Please use it only for this app.

[Image: plot, April 2011]

[Image: plot, August 2011]

MMA 8 version (the MMA 7 version is further down):

getRepChanges[userID_Integer] :=
 Module[{totalChanges},
  totalChanges = 
   "total" /. 
    Import["http://api.stackoverflow.com/1.1/users/" <> 
      ToString[userID] <> "/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=1&page=1",
      "JSON"
    ];
    Join @@ 
    Table[
      "rep_changes" /. 
         Import["http://api.stackoverflow.com/1.1/users/" <> 
                ToString[userID] <> 
                "/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=100&page=" 
                <> ToString[page], 
                "JSON"
         ],
         {page, 1, Ceiling[totalChanges/100]}
    ]
  ]

topAnswerers = 
  ({"display_name","user_id", "email_hash"} /. #) & /@ 
     ("user" /. 
      ("top_users" /. 
        Import[
          "http://api.stackoverflow.com/1.1/tags/mathematica/top-answerers/all-time",    
          "JSON"
        ]
       )
      )

topAnswerers = {#, #2, 
    Import["http://www.gravatar.com/avatar/" <> #3 <> ".jpg?s=36&d=identicon"]
    } & @@@ topAnswerers

repChangesTopUsers =
  Table[
    repChange = 
     ReleaseHold[
        (
         Hold[
           {
              DateList["on_date" + AbsoluteTime["January 1, 1970"]], 
             "positive_rep" - "negative_rep"
           }
         ] /. #
        ) & /@ getRepChanges[userID]
      ] // Sort;
      accRepChange = {repChange[[All, 1]],Accumulate[repChange[[All, 2]]]}\[Transpose],
      {userID, topAnswerers[[All, 2]]}
    ];

pl = DateListLogPlot[
  Tooltip @@@ 
   Take[({repChangesTopUsers, Row /@ topAnswerers[[All, {3, 1}]]}\[Transpose]), 
    10], Joined -> True, Mesh -> None, ImageSize -> 1000, 
  PlotRange -> {All, {10, All}}, 
  BaseStyle -> {FontFamily -> "Arial-Bold", FontSize -> 16}, 
  DateTicksFormat -> {"MonthNameShort", " ", "Year"}, 
  GridLines -> {True, None}, 
  FrameLabel -> (Style[#, FontSize -> 18] & /@ {"Date", "Reputation", 
      "Top-10 answerers", ""})]

EDIT
Note that you can plot up to and including a top-20 by changing the value in the Take function. It gets busy pretty soon.

Tried to improve the readability of Markup code somewhat. I'm afraid this will yield some spurious spaces when copied.

EDIT
Page size is back to 100 elements per page, which means fewer API calls.

Note that the first call to the API only serves to determine the number of posts the user has. This number is present no matter the page size, so the page size for that call is preferably chosen small (10 or so, possibly 1; I didn't check). The data is then fetched in successive pages until the last page is reached. You can use the maximum page size (100) for that. Just take care that the number of pages in the loop is adjusted accordingly.
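The two-step paging pattern described above can be sketched without touching the network. This is only an illustration under stated assumptions: fetchAll, fetchPage, fakePage, and fakeItems are hypothetical names that do not appear in the original code, and fakePage merely imitates the shape of an API response (a "total" field plus one page of items).

```mathematica
(* Sketch: two-step paging with the network call abstracted away.
   fetchPage is a hypothetical function taking (pageSize, page) and
   returning a list of rules shaped like the API's JSON response. *)
fetchAll[fetchPage_, key_String, pageSize_Integer : 100] :=
 Module[{total, pages},
  (* step 1: a one-element request is enough to read "total" *)
  total = "total" /. fetchPage[1, 1];
  (* step 2: number of pages needed at the chosen page size *)
  pages = Ceiling[total/pageSize];
  Join @@ Table[key /. fetchPage[pageSize, page], {page, 1, pages}]]

(* Stand-in for the API: 250 fake items, served page by page *)
fakeItems = Range[250];
fakePage[size_, page_] := {
  "total" -> Length[fakeItems],
  "items" -> Take[fakeItems,
    {(page - 1)*size + 1, Min[page*size, Length[fakeItems]]}]}

result = fetchAll[fakePage, "items"];
Length[result]   (* 250, fetched in Ceiling[250/100] = 3 pages *)
```

Passing the page fetcher in as an argument keeps the page-count arithmetic separate from the Import call, so the loop bound can be checked offline before spending any of the daily API quota.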

EDIT: better MMA 7 code (Fri Apr 22)

MMA 7 doesn't do JSON imports, so I do a text import instead followed by a bare-bones JSON translation. I've tested this version several times now (in MMA 8) and it seems to work without the errors I got yesterday.

getRepChanges[userID_Integer] :=
 Module[{totalChanges},
  totalChanges = 
   "total" /. 
    ImportString[
     StringReplace[(Import[
        "http://api.stackoverflow.com/1.1/users/" <> 
         ToString[userID] <> 
         "/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=1&page=1", "Text"]), {":" ->
         "->", "[" -> "{", "]" -> "}"}], "NB"];
  Join @@ 
   Table["rep_changes" /. 
     ImportString[
      StringReplace[
       Import["http://api.stackoverflow.com/1.1/users/" <> 
         ToString[userID] <> 
         "/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=100&page=" <> ToString[page],
         "Text"], {":" -> "->", "[" -> "{", "]" -> "}"}], 
      "NB"], {page, 1, Ceiling[totalChanges/100]}]]
topAnswerers = ({"display_name", "user_id", 
      "email_hash"} /. #) & /@ ("user" /. ("top_users" /. 
      ImportString[
       StringReplace[
        " " <> Import[
          "http://api.stackoverflow.com/1.1/tags/mathematica/top-answerers/all-time", "Text"], {":" -> "->", "[" -> "{", "]" -> "}"}], 
       "NB"]))
topAnswerers = {#, #2, 
    Import["http://www.gravatar.com/avatar/" <> #3 <> 
      ".jpg?s=36&d=identicon"]} & @@@ topAnswerers
repChangesTopUsers = 
  Table[repChange = 
    ReleaseHold[(Hold[{DateList[
             "on_date" + AbsoluteTime["January 1, 1970"]], 
            "positive_rep" - "negative_rep"}] /. #) & /@ 
       getRepChanges[userID]] // Sort;
   accRepChange = {repChange[[All, 1]], 
      Accumulate[repChange[[All, 2]]]}\[Transpose], {userID, 
    topAnswerers[[All, 2]]}];

DateListLogPlot[
 Tooltip @@@ 
  Take[({repChangesTopUsers, 
      Row /@ topAnswerers[[All, {3, 1}]]}\[Transpose]), 10], 
 Joined -> True, Mesh -> None, ImageSize -> 1000, 
 PlotRange -> {All, {10, All}}, 
 BaseStyle -> {FontFamily -> "Arial-Bold", FontSize -> 16}, 
 DateTicksFormat -> {"MonthNameShort", " ", "Year"}, 
 GridLines -> {True, None}, 
 FrameLabel -> (Style[#, FontSize -> 18] & /@ {"Date", "Reputation", 
     "Top-10 answerers", ""})] 
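The bare-bones JSON translation used in the MMA 7 version can be tried offline on a literal string. The following is a minimal sketch, not the code above: sampleJSON is made-up data, jsonToRules is a hypothetical helper name, and ToExpression is swapped in for ImportString[..., "NB"] to turn the rewritten text into an expression.

```mathematica
(* Sketch: rewrite JSON punctuation into Mathematica rule/list syntax.
   sampleJSON is invented for illustration; ToExpression replaces the
   ImportString[..., "NB"] step used in the original code. *)
sampleJSON = "{\"total\": 2, \"rep_changes\": [{\"post_id\": 11, \"positive_rep\": 10}, {\"post_id\": 12, \"positive_rep\": 25}]}";

jsonToRules[s_String] :=
 ToExpression[StringReplace[s, {":" -> "->", "[" -> "{", "]" -> "}"}]]

data = jsonToRules[sampleJSON];
"total" /. data                              (* 2 *)
"positive_rep" /. ("rep_changes" /. data)    (* {10, 25} *)
```

The translation is purely textual, so a colon or bracket inside a string value (a URL, for instance) is rewritten as well, and JSON's true, false, and null come out as plain symbols. ToExpression also evaluates whatever it is given, so this is only reasonable for responses from a source you trust.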

EDIT: auxiliary functions to filter on post tags

These functions can be used to filter reputation gains, in order to find the gains for certain tags only. tagLookup takes a post ID (an integer) as input and yields that post's tags. getQuestionIDs and getAnswerIDsFrom... go the other way: given a tag, they find all the question and answer IDs, so that one can test with MemberQ whether a given post ID belongs to this tag. Both tagLookup and the getAnswerIDsFrom... functions are slow, since many API calls are necessary. I couldn't test the last two functions, as either API access is down or my IP has been capped.

tagLookup[postID_Integer] :=
 Module[{im},
  im = Import["http://api.stackoverflow.com/1.1/questions/" <> ToString[postID],"JSON"];
  If[("questions" /. im) =!= {},
   First[("tags" /. ("questions" /. im))],
   im = Import["http://api.stackoverflow.com/1.1/answers/" <> ToString[postID],"JSON"];
   First[("tags" /. ("questions" /. Import["http://api.stackoverflow.com/1.1/questions/" <> 
          ToString[First["question_id" /. ("answers" /. im)]], "JSON"]))]
   ]
  ]

getQuestionIDs[tagName_String] := Module[{total},
  total = 
   "total" /. 
    Import["http://api.stackoverflow.com/1.1/questions?tagged=" <> 
      tagName <> "&pagesize=1", "JSON"];
  Join @@ 
   Table[("question_id" /. ("questions" /. 
        Import["http://api.stackoverflow.com/1.1/questions?key=NgVJ4Y6vFkuF-oqI-eOvOw&tagged=" <>
           tagName <> "&pagesize=100&page=" <> ToString[i], 
         "JSON"])), {i, 1, Ceiling[total/100]}]
  ]

getAnswerIDsFromQuestionID[questionID_Integer] :=
 Module[{total},
  total = 
   Import["http://api.stackoverflow.com/1.1/questions/" <> 
     ToString[questionID] <> "/answers?key=NgVJ4Y6vFkuF-oqI-eOvOw&pagesize=1", "JSON"];
  If[total === $Failed, Return[$Failed], total = "total" /. total]; 
  Join @@ Table[
    "answer_id" /. ("answers" /. 
       Import["http://api.stackoverflow.com/1.1/questions/" <> 
         ToString[questionID] <> "/answers?key=NgVJ4Y6vFkuF-oqI-eOvOw&pagesize=100&page=" <> 
         ToString[i], "JSON"]), {i, 1, Ceiling[total/100]}]
  ]

getAnswerIDsFromTag[tagName_String] :=
 Module[{},
  Join @@ (getAnswerIDsFromQuestionID /@ 
     Cases[getQuestionIDs[tagName], Except[$Failed]])
  ]
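The MemberQ test mentioned above might be used along these lines. This is a sketch on invented data: repChanges stands in for the output of getRepChanges[userID], taggedIDs stands in for Join[getQuestionIDs[tag], getAnswerIDsFromTag[tag]], and the "post_id" field name is an assumption about the API response.

```mathematica
(* Sketch: keep only the reputation changes whose post belongs to a tag.
   All data below is made up; "post_id" is an assumed field name. *)
repChanges = {
   {"post_id" -> 101, "positive_rep" -> 10, "negative_rep" -> 0},
   {"post_id" -> 202, "positive_rep" -> 15, "negative_rep" -> 2},
   {"post_id" -> 303, "positive_rep" -> 25, "negative_rep" -> 0}};
taggedIDs = {101, 303, 404};

(* select the changes whose post ID appears in the tag's ID list *)
tagged = Select[repChanges, MemberQ[taggedIDs, "post_id" /. #] &];

(* net reputation gained on posts carrying the tag *)
Total[(("positive_rep" /. #) - ("negative_rep" /. #)) & /@ tagged]   (* 35 *)
```

Collecting the ID list once and reusing it for every reputation change avoids calling tagLookup per post, which would cost one or two API requests each time.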
Sjoerd C. de Vries answered Oct 12 '22 00:10


Brett, this is unrelated to the SO API, but you could use the RSS feed for the newest Mathematica-tagged questions. Here is my naive implementation:

QuestionHyperlink[data_] := 
 Function[{name, title, link}, 
   Hyperlink[Tooltip[title, name], link]] @@ Join[
   Cases[data, 
    XMLElement[
      "author", _, {___, XMLElement["name", {}, {name_}], ___}] :> 
     name],
   Cases[data, XMLElement["title", _, {title_}] :> title],
   Cases[data, XMLElement["link", rules_, {}] :> ("href" /. rules)]]

Cases[Import[
  "http://stackoverflow.com/feeds/tag?tagnames=mathematica&sort=\
newest", "XML"], 
 XMLElement["entry", attrs_, data_] :> 
  QuestionHyperlink[data], Infinity]

Sasha answered Oct 12 '22 01:10