The ghtorrent-bq data is a great snapshot of GitHub to have. However, it is not clear when it is updated or how I could get more up-to-date data.
Query data at a point in time

You can query a table's historical data from any point in time within the time travel window by using a FOR SYSTEM_TIME AS OF clause. This clause takes a constant timestamp expression and references the version of the table that was current at that timestamp.
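As a rough illustration of the FOR SYSTEM_TIME AS OF clause, here is a minimal Python sketch that only builds the query text. The helper name `build_time_travel_query` and the table ID `my-project.my_dataset.my_table` are hypothetical placeholders, not names from the ghtorrent-bq dataset, and the query will only succeed if the timestamp falls inside the table's time travel window.

```python
from datetime import datetime, timezone


def build_time_travel_query(table: str, as_of: datetime) -> str:
    """Build a BigQuery SQL string that reads `table` as it was at `as_of`.

    `table` is a fully qualified table ID (hypothetical in this example).
    `as_of` must be timezone-aware so the timestamp is unambiguous.
    """
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    # Normalize to UTC and format as a BigQuery TIMESTAMP literal.
    ts = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S+00:00")
    return (
        "SELECT *\n"
        f"FROM `{table}`\n"
        f"FOR SYSTEM_TIME AS OF TIMESTAMP '{ts}'"
    )


# Hypothetical table ID, used only for illustration.
query = build_time_travel_query(
    "my-project.my_dataset.my_table",
    datetime(2024, 1, 15, 12, 0, 0, tzinfo=timezone.utc),
)
print(query)
```

The resulting string can be pasted into the BigQuery console or passed to whichever client you use to run queries. Note that this reads an earlier version of a table you can already query; it does not tell you when the dataset owner last refreshed it.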
In theory, it is updated every time a new GHTorrent MySQL dump is released. In practice, the generated CSVs still need manual adjustments, because fields such as user locations contain a lot of odd text that CSV parsers fail to handle.
http://ghtorrent.org/gcloud.html