Recently, I attended an onsite interview at a company and was asked design questions related to big data, e.g.: get the list of users who accessed a website (say, Google) between times t1 and t2. Which data structures to use, how to handle concurrency and stale data, how many servers are needed to store the data, the requirements (software, hardware) of each server, etc.
Please point me to some books or web references to increase my knowledge in this new area. Also, please share insights on how to answer this type of design question.
This book (free download) (Amazon: Mining of Massive Datasets) was just posted to HN (that thread also has some useful comments). From a first skim it looks really good. You could read that.
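For the example query in the question (users who accessed a site between t1 and t2), one possible starting point is a per-site log kept sorted by timestamp, so a range query becomes two binary searches. This is only a minimal single-process sketch: the `AccessLog` class and its method names are made up for illustration, timestamps are assumed to be plain numbers, and it does not address concurrency, stale data, or spreading the data across servers.

```python
import bisect
from collections import defaultdict


class AccessLog:
    """Hypothetical in-memory access log, one sorted timeline per site."""

    def __init__(self):
        # site -> (timestamps, users): two parallel lists ordered by timestamp
        self._sites = defaultdict(lambda: ([], []))

    def record(self, site, user_id, timestamp):
        """Insert one access event, keeping the site's timeline sorted."""
        timestamps, users = self._sites[site]
        idx = bisect.bisect_right(timestamps, timestamp)
        timestamps.insert(idx, timestamp)
        users.insert(idx, user_id)

    def users_between(self, site, t1, t2):
        """Return the set of users who accessed `site` with t1 <= timestamp <= t2."""
        if site not in self._sites:
            return set()
        timestamps, users = self._sites[site]
        lo = bisect.bisect_left(timestamps, t1)   # first event at or after t1
        hi = bisect.bisect_right(timestamps, t2)  # one past the last event at or before t2
        return set(users[lo:hi])


log = AccessLog()
log.record("google.com", "alice", 100)
log.record("google.com", "bob", 150)
log.record("google.com", "carol", 300)
log.record("google.com", "alice", 180)
print(log.users_between("google.com", 100, 200))
```

Each lookup costs two binary searches plus the size of the matching slice. In an interview answer, the same idea would likely be extended by partitioning the log by site or by time bucket across servers, which is an assumption here rather than something the sketch implements.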