
Am I exposing sensitive data if I put a BSON ID in a URL?

Tags: url, mongodb, bson

Say I have a Products collection in MongoDB. I'd like users to be able to see each product on its own page: http://www.mysite.com/product/12345/Widget-Wodget. Since each product doesn't have an incremental integer ID (12345) but has a BSON ID (5063a36bdeb13f7505000630) instead, I'd need to either add an integer ID or use the BSON ID.

BSON IDs include the process ID (PID), among other fields:

  • 4-byte timestamp,
  • 3-byte machine identifier,
  • 2-byte process id,
  • 3-byte counter.

Am I exposing sensitive information to the outside world if I use the BSON ID in my URL?
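To make the layout above concrete, here is a minimal sketch that splits the example ID into those four fields using only the Python standard library. `decode_object_id` is a hypothetical helper name, and reading the process id as big-endian is an assumption. Newer ObjectId layouts replace the machine and process fields with a random value, in which case only the timestamp and counter would carry meaning.

```python
from datetime import datetime, timezone


def decode_object_id(hex_id: str) -> dict:
    """Split a 24-character hex ObjectId into the fields listed above.

    Assumes the legacy 4/3/2/3-byte layout described in the question.
    """
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # 4-byte timestamp (Unix seconds)
    return {
        "created_at": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),                   # 3-byte machine identifier
        "pid": int.from_bytes(raw[7:9], "big"),      # 2-byte process id (byte order assumed)
        "counter": int.from_bytes(raw[9:12], "big"), # 3-byte counter
    }


fields = decode_object_id("5063a36bdeb13f7505000630")
for name, value in fields.items():
    print(name, value)
```

Anyone who can see the URL can run the same decoding, so whatever these fields reveal is effectively public.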

jcollum asked Dec 05 '12 17:12


1 Answer

I can't think of any way to use them to gain privileges on your machines. However, using ObjectIds everywhere discloses a lot of information nonetheless.

By crawling your website, one could:

  • find out about some hidden objects: for instance, if the counter part goes from 0x....b1 to 0x....b9 between times t1 and t2, one can guess the ObjectIds within that interval. However, guessing ids is most likely useless if you enforce access permissions
  • know the signup date of each user (not very sensitive info but better than nothing)
  • deduce actual (as opposed to publicly available) business hours from the timestamps of objects created by the staff
  • deduce in which timezones your audience lives from the timestamps of user-generated objects: if your website is one which people use mostly at lunchtime, then one could measure peaks of ObjectIds and deduce that a peak at 8 PM UTC means the audience is on the US West Coast
  • and more generally, by crawling most of your website, one can build a timeline of the success of your service, knowing for any given time: your user count, levels of user engagement, how many servers you've got, and how often your servers are restarted. PID changes occurring on weekends are more likely crashes, whereas those on business days are more likely either crashes or software revisions
  • and probably find other info specific to your business processes and domain
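The inferences in the list above take very little code. The following is a rough sketch, assuming the legacy timestamp/machine/pid/counter layout; `summarize` and `candidate_ids` are hypothetical names, and the two sample ids are made up for illustration (the second is not from any real site).

```python
from collections import Counter
from datetime import datetime, timezone


def split_object_id(hex_id):
    """Return (timestamp, machine_hex, pid_hex, counter) for a hex ObjectId."""
    raw = bytes.fromhex(hex_id)
    return (int.from_bytes(raw[0:4], "big"), raw[4:7].hex(),
            raw[7:9].hex(), int.from_bytes(raw[9:12], "big"))


def summarize(crawled_ids):
    """Count creations per UTC hour and collect distinct (machine, pid) pairs."""
    hours, processes = Counter(), set()
    for hex_id in crawled_ids:
        ts, machine, pid, _ = split_object_id(hex_id)
        hours[datetime.fromtimestamp(ts, tz=timezone.utc).hour] += 1
        processes.add((machine, pid))
    return hours, processes


def candidate_ids(first_id, last_id):
    """Yield guesses for ids created by the same process between two seen ids."""
    t1, machine, pid, c1 = split_object_id(first_id)
    t2, machine2, pid2, c2 = split_object_id(last_id)
    if (machine, pid) != (machine2, pid2):
        raise ValueError("ids come from different processes")
    for counter in range(c1 + 1, c2):   # counters skipped between the two ids
        for ts in range(t1, t2 + 1):    # every second in the time window
            yield f"{ts:08x}{machine}{pid}{counter:06x}"


seen = ["5063a36bdeb13f7505000630", "5063a36ddeb13f7505000633"]  # made-up pair
hours, processes = summarize(seen)
guesses = list(candidate_ids(seen[0], seen[1]))
print(len(processes), "process(es);", len(guesses), "candidate ids")
```

The number of guesses grows with both the counter gap and the time window, which is why this only matters when access permissions are not enforced.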

To be fair, even with random ids one can infer a lot. The main issue is that you need to prevent anyone from scraping a statistically significant part of your site. But if someone is determined, they'll succeed eventually, which is why providing them with all of this extra, timestamped info seems wrong.
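For comparison, a random id of the kind mentioned above might be generated as in the sketch below. This is one possible approach rather than anything the answer prescribes: `new_public_id` and the idea of storing the value in a separate `public_id` field are assumptions for illustration.

```python
import secrets


def new_public_id(n_bytes: int = 12) -> str:
    """Return a URL-safe random token carrying no timestamp or host details."""
    return secrets.token_urlsafe(n_bytes)


# Hypothetical usage: store this next to the ObjectId and use it in URLs.
public_id = new_public_id()
print(f"http://www.mysite.com/product/{public_id}/Widget-Wodget")
```

Such a value hides creation times and server details, but as noted, the number and popularity of pages can still be measured by crawling.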

guillaume answered Oct 19 '22 12:10