My co-hackers and I are building a sort of trivia game inspired by this blog post: http://messymatters.com/calibration. The idea is to give confidence intervals and learn to be calibrated (when you're "90% sure" you should be right 90% of the time).
We're thus looking for, ideally, thousands of questions with unambiguous numerical answers. Also, they shouldn't be too boring. There are a lot of random statistics out there -- e.g., enclosed water area in different countries -- that would make the game mind-numbing. Things like release dates of classic movies are more interesting (to most people).
Other interesting ones we've found include Olympic records, median incomes for different professions, dates of famous inventions, and celebrity ages. Scraping things like the above, by the way, was my reason for asking this question: "Scrape HTML tables from a given URL into CSV".
So, if you know of other sources of interesting numerical facts (in a parsable form) I'm eager for pointers to them. Thanks!
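To make the calibration goal from the top of the question concrete, here is a minimal sketch of how a round could be scored. The function name `hit_rate` and the sample numbers are made up for illustration; they are not taken from the blog post or from our game.

```python
def hit_rate(guesses, answers):
    """Return the fraction of (low, high) intervals that contain the true answer."""
    if not answers or len(guesses) != len(answers):
        raise ValueError("need one interval per answer, and at least one answer")
    hits = sum(low <= truth <= high
               for (low, high), truth in zip(guesses, answers))
    return hits / len(answers)


# Made-up round: four "90% sure" intervals and the corresponding true values.
guesses = [(1930, 1945), (9.5, 10.0), (30, 50), (1870, 1880)]
answers = [1939, 9.58, 56, 1876]

rate = hit_rate(guesses, answers)
print(f"{rate:.0%} of intervals contained the answer (target: 90%)")
```

A player whose rate stays well below 90% over many questions is overconfident and should widen their intervals; a rate well above 90% suggests the intervals are wider than they need to be.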
vgchartz.com has various charts for video game titles and hardware performance.
Sample queries:
There's enough data for questions like:
billboard.com is all you need.
In addition to sales figures, you can also ask queries about chart positions, e.g.:
You can make unambiguous numeric Q/A out of most lists. Take, for example, a list like TIME.com's All Time 100 Novels.
Some generic questions that can be asked are:
You can do this with any given Top 100 list:
historyorb.com is just one example. The URLs and HTML are very scrape-friendly.
There are many similar sites, e.g. brainyhistory.com.
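As a rough illustration of what "scrape-friendly" buys you, here is a sketch that turns an HTML table into CSV using only the Python standard library. The `SAMPLE` markup is invented for the example and does not reflect the actual page structure of historyorb.com or brainyhistory.com; the names `TableParser` and `table_to_csv` are likewise hypothetical, and nested tables are not handled.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return all table rows found in the HTML string as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample markup, standing in for a downloaded page.
SAMPLE = """
<table>
  <tr><th>Year</th><th>Event</th></tr>
  <tr><td>1900</td><td>Example event A</td></tr>
  <tr><td>1950</td><td>Example event B, with a comma</td></tr>
</table>
"""

print(table_to_csv(SAMPLE))
```

Real pages often need per-site tweaks (dates in headings rather than tables, nested markup), so treat this as a starting point and not a general-purpose scraper.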
You can also use these dates to "cross" with the other data (e.g. the Top 100 Novels example above).
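One way to picture the "crossing" idea: pair every dated item from one list with every dated event from another and ask for the gap in years. The sketch below assumes both data sets have already been reduced to `(name, year)` pairs; the entries shown are placeholders, not real novels or events, and `cross_questions` is a hypothetical helper.

```python
from itertools import product


def cross_questions(works, events):
    """Build (question, answer) pairs for works published after an event."""
    questions = []
    for (title, pub_year), (event, event_year) in product(works, events):
        gap = pub_year - event_year
        if gap > 0:  # skip pairs where the work came first or the same year
            text = f"How many years after {event} was {title} published?"
            questions.append((text, gap))
    return questions


# Placeholder data; real entries would come from the scraped lists.
works = [("Novel A", 1925), ("Novel B", 1961)]
events = [("Event X", 1903), ("Event Y", 1945)]

for text, answer in cross_questions(works, events):
    print(text, "->", answer)
```

With n works and m events this yields up to n × m questions, which is a cheap way to get from hundreds of scraped facts to the thousands of questions asked for.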
The Internet Movie Database is of course... the internet movie database!