I was wondering the other day if StackOverflow had an API I could access from Mathematica, and apparently it does: "Saving plot annotations"
What's the best way to get data from StackOverflow into Mathematica? Sjoerd used the information to make a plot. I'm interested in adding SO-related notifications into a docked cell I keep in my notebooks, so I can tell when there are updates or responses without leaving Mathematica.
The Stack Overflow API lets users interact with the site through methods such as getting comments by ID and getting question summary information. The Stack Exchange API is based on standard HTTP and URLs, and responses are in JSON. With an API key, users are allowed 10,000 requests per day.
For authentication, the Stack Exchange API implements OAuth 2.0 (templated on Facebook's implementation in pursuit of developer familiarity). A number of methods in the Stack Exchange API accept dates as parameters and return dates as properties; the format of these dates is consistent and documented.
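The code further down in this thread treats the "on_date" values returned by the API as seconds since January 1, 1970. As a small illustration of that conversion, here is a sketch with a made-up timestamp; the helper name epochToDateList is hypothetical and not part of the API or of Mathematica.

```mathematica
(* Hypothetical helper: convert seconds since 1970-01-01 into a Mathematica date list. *)
(* Time zones are ignored; this is the same offset arithmetic the plotting code uses. *)
epochToDateList[seconds_Integer] :=
  DateList[seconds + AbsoluteTime[{1970, 1, 1, 0, 0, 0}]]

(* Made-up sample value, not taken from a real API response *)
epochToDateList[1303430400]
```

The result is an ordinary {year, month, day, hour, minute, second} list, so it can be passed straight to functions such as DateListPlot.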
The Stack Exchange API permits access to data within private Teams. Only Stack Overflow Business accounts have read and write access to the API.
By popular demand, the code to generate the top-10 SO answerers plot (except annotations) using the SO API (it's a pretty neat and complete API; lots of goodies there. Easy too - see my code).
Update: added App-key to ensure the code co-operates better with the SO-API (higher daily call cap). Please use it only for this app.
[Plot image: April 2011]
[Plot image: August 2011]
MMA 8 version! MMA7 version further down
getRepChanges[userID_Integer] :=
 Module[{totalChanges},
  (* First call: only the total count is needed, so use the smallest page size *)
  totalChanges =
   "total" /.
    Import["http://api.stackoverflow.com/1.1/users/" <>
      ToString[userID] <> "/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=1&page=1",
     "JSON"
    ];
  (* Then fetch every page at the maximum page size and join the results *)
  Join @@
   Table[
    "rep_changes" /.
     Import["http://api.stackoverflow.com/1.1/users/" <>
       ToString[userID] <>
       "/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=100&page="
       <> ToString[page],
      "JSON"
     ],
    {page, 1, Ceiling[totalChanges/100]}
   ]
 ]
(* Name, user ID and Gravatar hash of the all-time top answerers in the mathematica tag *)
topAnswerers =
 ({"display_name", "user_id", "email_hash"} /. #) & /@
  ("user" /.
    ("top_users" /.
      Import[
       "http://api.stackoverflow.com/1.1/tags/mathematica/top-answerers/all-time",
       "JSON"
      ]
    )
  )
(* Replace each hash by the corresponding 36-pixel Gravatar image *)
topAnswerers = {#, #2,
    Import["http://www.gravatar.com/avatar/" <> #3 <> ".jpg?s=36&d=identicon"]
   } & @@@ topAnswerers
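(* For every top answerer, build a sorted list of {date, net reputation change} pairs
   and turn it into a cumulative reputation series *)
repChangesTopUsers =
  Table[
   repChange =
    ReleaseHold[
      (
        (* Hold keeps DateList and the subtraction unevaluated until the field names are replaced by values *)
        Hold[
          {
           DateList["on_date" + AbsoluteTime["January 1, 1970"]],
           "positive_rep" - "negative_rep"
          }
        ] /. #
      ) & /@ getRepChanges[userID]
    ] // Sort;
   accRepChange = {repChange[[All, 1]], Accumulate[repChange[[All, 2]]]}\[Transpose],
   {userID, topAnswerers[[All, 2]]}
  ];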
pl = DateListLogPlot[
Tooltip @@@
Take[({repChangesTopUsers, Row /@ topAnswerers[[All, {3, 1}]]}\[Transpose]),
10], Joined -> True, Mesh -> None, ImageSize -> 1000,
PlotRange -> {All, {10, All}},
BaseStyle -> {FontFamily -> "Arial-Bold", FontSize -> 16},
DateTicksFormat -> {"MonthNameShort", " ", "Year"},
GridLines -> {True, None},
FrameLabel -> (Style[#, FontSize -> 18] & /@ {"Date", "Reputation",
"Top-10 answerers", ""})]
EDIT
Note that you can plot up to and including a top-20 by changing the value in the Take function. It gets busy pretty soon.
Tried to improve the readability of Markup code somewhat. I'm afraid this will yield some spurious spaces when copied.
EDIT
Page size back to 100 elements/page ==> fewer API calls
Please note that the first call to the API only determines the number of items the user has (reputation changes here). This total is present no matter the page size, so the page size for that call is preferably chosen small (10 or so, possibly 1, didn't check). Then the data is fetched in successive pages until the last page is reached. You can use the maximum page size (100) for that. Just take care that the number of pages in the loop is adjusted to the page size.
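The two-step scheme described above (one small call for the total, then full pages) can be sketched independently of the network. This is only a sketch: fetchAllPages and fakeFetch are hypothetical names, and the fake fetcher stands in for a real API call.

```mathematica
(* Hypothetical helper: fetchPage[page, pageSize] must return the items of one page *)
fetchAllPages[fetchPage_, total_Integer, pageSize_Integer : 100] :=
  Join @@ Table[fetchPage[page, pageSize], {page, 1, Ceiling[total/pageSize]}]

(* Stand-in for the API: 250 made-up items served page by page *)
fakeData = Range[250];
fakeFetch[page_, size_] :=
  Take[fakeData, {(page - 1) size + 1, Min[page size, Length[fakeData]]}]

(* 250 items at 100 per page means Ceiling[250/100] = 3 calls *)
result = fetchAllPages[fakeFetch, Length[fakeData]];
Length[result]
```

Passing the fetcher in as an argument keeps the page arithmetic separate from the HTTP request, so the loop bound can be checked without spending any of the daily API quota.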
EDIT: better MMA 7 code (Fri Apr 22)
MMA 7 doesn't do JSON imports, so I do a text import instead followed by a bare-bones JSON translation. I've tested this version several times now (in MMA 8) and it seems to work without the errors I got yesterday.
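To see what the bare-bones translation does, here is a small offline sketch on a made-up JSON string. It uses ToExpression in place of ImportString[..., "NB"], which is a substitution made for this illustration only. The approach is fragile by design: a colon or bracket inside a string value (a URL or a post title, say) would be rewritten as well.

```mathematica
(* Made-up JSON text standing in for an API response *)
json = "{\"total\": 3, \"rep_changes\": [{\"positive_rep\": 10}, {\"positive_rep\": 15}]}";

(* The same three replacements as in the MMA 7 code: ":" to "->", "[" to "{", "]" to "}" *)
translated = StringReplace[json, {":" -> "->", "[" -> "{", "]" -> "}"}];

(* ToExpression evaluates the text as Mathematica code, so only feed it trusted input *)
rules = ToExpression[translated];

(* The result is a nested list of rules that can be queried with /. *)
"total" /. rules
"positive_rep" /. ("rep_changes" /. rules)
```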
getRepChanges[userID_Integer] :=
Module[{totalChanges},
totalChanges =
"total" /.
ImportString[
StringReplace[(Import[
"http://api.stackoverflow.com/1.1/users/" <>
ToString[userID] <>
"/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=1&page=1", "Text"]), {":" ->
"->", "[" -> "{", "]" -> "}"}], "NB"];
Join @@
Table["rep_changes" /.
ImportString[
StringReplace[
Import["http://api.stackoverflow.com/1.1/users/" <>
ToString[userID] <>
"/reputation?key=NgVJ4Y6vFkuF-oqI-eOvOw&fromdate=0&pagesize=100&page=" <> ToString[page],
"Text"], {":" -> "->", "[" -> "{", "]" -> "}"}],
"NB"], {page, 1, Ceiling[totalChanges/100]}]]
topAnswerers = ({"display_name", "user_id",
"email_hash"} /. #) & /@ ("user" /. ("top_users" /.
ImportString[
StringReplace[
" " <> Import[
"http://api.stackoverflow.com/1.1/tags/mathematica/top-answerers/all-time", "Text"], {":" -> "->", "[" -> "{", "]" -> "}"}],
"NB"]))
topAnswerers = {#, #2,
    Import["http://www.gravatar.com/avatar/" <> #3 <>
      ".jpg?s=36&d=identicon"]} & @@@ topAnswerers
repChangesTopUsers =
Table[repChange =
ReleaseHold[(Hold[{DateList[
"on_date" + AbsoluteTime["January 1, 1970"]],
"positive_rep" - "negative_rep"}] /. #) & /@
getRepChanges[userID]] // Sort;
accRepChange = {repChange[[All, 1]],
Accumulate[repChange[[All, 2]]]}\[Transpose], {userID,
topAnswerers[[All, 2]]}];
DateListLogPlot[
Tooltip @@@
Take[({repChangesTopUsers,
Row /@ topAnswerers[[All, {3, 1}]]}\[Transpose]), 10],
Joined -> True, Mesh -> None, ImageSize -> 1000,
PlotRange -> {All, {10, All}},
BaseStyle -> {FontFamily -> "Arial-Bold", FontSize -> 16},
DateTicksFormat -> {"MonthNameShort", " ", "Year"},
GridLines -> {True, None},
FrameLabel -> (Style[#, FontSize -> 18] & /@ {"Date", "Reputation",
"Top-10 answerers", ""})]
EDIT: auxiliary functions to filter on post tags
These functions can be used to filter reputation gains, in order to find gains for certain tags only. tagLookup gets a post_ID integer as input and yields the specific post's tags. getQuestionIDs and getAnswerIDsFrom... go the other way. Given a tag, they find all the question and answer IDs so that one can test with MemberQ whether a given post_ID belongs to this tag. Both tagLookup and getAnswerIDs are slow since many API calls are necessary. I couldn't test the last two functions as either API access is down or my IP has been capped.
tagLookup[postID_Integer] :=
Module[{im},
im = Import["http://api.stackoverflow.com/1.1/questions/" <> ToString[postID],"JSON"];
If[("questions" /. im) != {},
First[("tags" /. ("questions" /. im))],
im = Import["http://api.stackoverflow.com/1.1/answers/" <> ToString[postID],"JSON"];
First[("tags" /. ("questions" /. Import["http://api.stackoverflow.com/1.1/questions/" <>
ToString[First["question_id" /. ("answers" /. im)]], "JSON"]))]
]
]
getQuestionIDs[tagName_String] := Module[{total},
total =
"total" /.
Import["http://api.stackoverflow.com/1.1/questions?tagged=" <>
tagName <> "&pagesize=1", "JSON"];
Join @@
Table[("question_id" /. ("questions" /.
Import["http://api.stackoverflow.com/1.1/questions?key=NgVJ4Y6vFkuF-oqI-eOvOw&tagged=" <>
tagName <> "&pagesize=100&page=" <> ToString[i],
"JSON"])), {i, 1, Ceiling[total/100]}]
]
getAnswerIDsFromQuestionID[questionID_Integer] :=
Module[{total},
total =
Import["http://api.stackoverflow.com/1.1/questions/" <>
ToString[questionID] <> "/answers?key=NgVJ4Y6vFkuF-oqI-eOvOw&pagesize=1", "JSON"];
If[total === $Failed, Return[$Failed], total = "total" /. total];
Join @@ Table[
"answer_id" /. ("answers" /.
Import["http://api.stackoverflow.com/1.1/questions/" <>
ToString[questionID] <> "/answers?key=NgVJ4Y6vFkuF-oqI-eOvOw&pagesize=100&page=" <>
ToString[i], "JSON"]), {i, 1, Ceiling[total/100]}]
]
getAnswerIDsFromTag[tagName_String] :=
Module[{},
Join @@ (getAnswerIDsFromQuestionID /@
Cases[getQuestionIDs[tagName], Except[$Failed]])
]
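As a rough illustration of the MemberQ test mentioned above, the following sketch filters made-up reputation events against a made-up list of post IDs. The "post_id" field name and the sample values are assumptions for this example; in real use the ID list would come from getQuestionIDs and getAnswerIDsFromTag.

```mathematica
(* Made-up reputation events; the "post_id" field name is assumed *)
events = {
   {"post_id" -> 11, "positive_rep" -> 10, "negative_rep" -> 0},
   {"post_id" -> 12, "positive_rep" -> 15, "negative_rep" -> 2},
   {"post_id" -> 13, "positive_rep" -> 25, "negative_rep" -> 0}
  };

(* Stand-in for the question and answer IDs that belong to one tag *)
taggedIDs = {11, 13};

(* Keep only the events whose post belongs to the tag *)
taggedEvents = Select[events, MemberQ[taggedIDs, "post_id" /. #] &];

(* Net reputation gained from the tagged posts *)
Total[(("positive_rep" /. #) - ("negative_rep" /. #)) & /@ taggedEvents]
```

Fetching the ID list once and filtering locally avoids calling tagLookup for every single post, which is the slow path.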
Brett, unrelated to the SO API, but you could use the RSS feed for the newest Mathematica-tagged questions. Here is my naive implementation:
QuestionHyperlink[data_] :=
Function[{name, title, link},
Hyperlink[Tooltip[title, name], link]] @@ Join[
Cases[data,
XMLElement[
"author", _, {___, XMLElement["name", {}, {name_}], ___}] :>
name],
Cases[data, XMLElement["title", _, {title_}] :> title],
Cases[data, XMLElement["link", rules_, {}] :> ("href" /. rules)]]
Cases[Import[
"http://stackoverflow.com/feeds/tag?tagnames=mathematica&sort=\
newest", "XML"],
XMLElement["entry", attrs_, data_] :>
QuestionHyperlink[data], Infinity]