Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Obtain list of My Places from Google Maps

Tags:

google-maps

I am trying to obtain the list of places the user has saved on Google Maps. Now I know there isnt an API for this (for whatever reason), but I saw here:

"My Places" Google Maps API

That apparently there used to be a way to obtain the URL, but it does not seem to work with my list of places.

E.g.

https://www.google.com/maps/@46.889424,0.1194148,6z/data=!4m3!11m2!2s1KbZtik1IdXyNhwfXEb3P9vaZvzU!3e3

Does not seem to work if I append &output=kml or &output=json

I created this list on Google Maps, then hit share and obtained that link.

I even tried parsing the resulting HTML but it seems everything is handled by some Javascript Engine and I can't find any reference to Google Ids there --- I dont even know how they handle clicks!

Any help? There must be a way to retrieve this information programmatically!

EDIT:

I managed to get something working by visiting the shared link, then processing the html and storing the window.APP_INITIALIZATION_STATE variable. I then convert it to an javascript array and loop over it. Deep inside the array/map structure, I managed to get the google name and google place id out of that array. That seems to work a bit, but when trying with lists over 20 items long, google only gets the first 20 and is waiting for the user to 'scroll down' to get the next 20. That seems to trigger another call to get the next 20 results and looks a bit like:

https://www.google.com/search?tbm=map&fp=1&authuser=0&hl=en&gl=nl&pb=!4m8!1m3!1d54065472.4384380........

I can see the original feature id being included at the end of the url, but have no idea how to construct this url in full though to get the next 20 items.... Any ideas?

like image 263
Gabriel Sanmartin Avatar asked Jun 21 '18 09:06

Gabriel Sanmartin


3 Answers

Your saved places list actually has what you call a feature ID attribute, this isn't a common practice and Google frowns upon this technique but take a look at this URL:

https://www.google.com/maps/preview/entity?authuser=0&hl=en&gl=us&pb=!1m10!1s0x0%3A0x3743ae09a161976b!3m8!1m3!1d14318.72623152007!2d-98.2296425!3d26.2070353!3m2!1i1024!2i768!4f13.1!12m3!2m2!1i392!2i106!13m57!2m2!1i203!2i100!3m2!2i4!5b1!6m6!1m2!1i86!2i86!1m2!1i408!2i200!7m42!1m3!1e1!2b0!3e3!1m3!1e2!2b1!3e2!1m3!1e2!2b0!3e3!1m3!1e3!2b0!3e3!1m3!1e8!2b0!3e3!1m3!1e3!2b1!3e2!1m3!1e9!2b1!3e2!1m3!1e10!2b0!3e3!1m3!1e10!2b1!3e2!1m3!1e10!2b0!3e4!2b1!4b1!9b0!14m3!1snyc5W-WeHY3r5gLwkoRI!7e81!15i10112!15m19!2b1!5m4!2b1!3b1!5b1!6b1!10m1!8e3!14m1!3b1!17b1!24b1!25b1!26b1!30m1!2b1!36b1!52b1!53b1!21m28!1m6!1m2!1i0!2i0!2m2!1i458!2i768!1m6!1m2!1i974!2i0!2m2!1i1024!2i768!1m6!1m2!1i0!2i0!2m2!1i1024!2i20!1m6!1m2!1i0!2i748!2m2!1i1024!2i768!22m1!1e81!29m0!30m1!3b1

Highlighted is the feature ID from the link you posted: https://www.google.com/maps/@46.889424,0.1194148,6z/data=!4m3!11m2!2s1KbZtik1IdXyNhwfXEb3P9vaZvzU!3e3

Along with other maps parameters; when you hit that link you're actually manually triggering the same callback that Google's own scripts in maps use to parse the data to feed back to the maps UI; if you look at array item 2, or {c:..} you'll find a stringified array with the contents of your list, now depending on the program language you're using all it takes is a little tweaking (find/replace, loop through, lint and trim, etc.) to this array and you can pull your results; the cool thing is if you add or remove a place the next time you hit that end point it's updated in real-time.

Some people may call it a "hack"; but it gets the job done. :)

Hope I pointed you to a direction in the event you haven't found a solution; give this a shot.

Note the URL has to be pasted in its entirety, SO truncated the hyperlink; copy and paste the whole thing in one shot and a text file from Google with the arrays will be produced; in my case I curl the URLs I need and parse the returned strings as needed to pull data from Google where their API has limitations. Just a tip. :)

like image 86
Ilan P Avatar answered Nov 01 '22 19:11

Ilan P


This was the only API was able to find was this:

https://www.google.com/bookmarks/?output=xml

Used in a browser you would have to first log in through Google's OAuth. It would then return your saved places. Not sure at the moment how you would embedded the authentication to do this programmatically, but this might send you in the right direction.

like image 25
Niles Tanner Avatar answered Nov 01 '22 19:11

Niles Tanner


Pagination

You can use this tool to decrypt the pb-parameter. PB stands for protocol buffer (protobuf) and Google uses its own kind of it for maps. You can find different decoders for this by googling it.

In my case, the pagination was done via one parameter (8iX0). It seems, that it always comes with another similar parameter (7i20) but I don't know that it does. I can't yet confirm that this is always the case, but from my experience you're basically looking for two integers that are 20/40/60 etc. apart.

Here's what this looks like for me:

  • page 2 (7i20, 8i20)
  • page 3 (7i20, 8i40)
  • page 4 (7i20, 8i60)

From this information, I tried 7i20 8i00 for page 1, that seemed to work. For lists with >100 items, it just continues like that (8i120, 8i140 etc.)

Here's a code snippet in python (quick & dirty). Make sure to add (long) delays if your list has many pages as you will get rate-limited by captchas eventually if you don't. Notice the 8i%s0 in the url, make sure to put the %s back when you paste your pb-block.

url = "https://www.google.com:443/search?tbm=map&pb=!7i20!8i%s0!..."
headers = {"Referer": "https://www.google.com/"}

def fetch_stops_from_maps():
    new_results = -1
    page = 0
    results = []

    while new_results != 0:
        new_results = 0
        x = requests.get(url % page, headers=headers)
        txt = html.unescape(x.text)
        txt = txt.split("\n")[1]
        results = re.findall(r"\[null,null,[0-9]{1,2}\.[0-9]{4,15},[0-9]{1,2}\.[0-9]{4,15}]", txt)

        print(len(results))
        for cord in results:
            # curr = the description you can manually type in when saving
            curr = txt.split(cord)[1].split("\"]]")[0]
            curr = curr[curr.rindex(",\"") + 2:]

            cords = str(cord).split(",")
            lat = cords[2]
            lon = cords[3][:-1]

            results.append(s)
            new_results += 1
        page += 2

Actually getting the correct url

Getting the correct url currently seems to be the hardest part when doing this and I have not fully figured this out aswell. However, for my use-case this is not really important, so I extracted the correct pb-block once and called it a day.

As explained in the other answers, the id of the list is visible in the basic url (here, the 2sXX...) when you navigate to the list in your browser. It seems to usually be 24-32 (?) characters long.

.../maps/<coords>/data=!4m3!11m2!2sXXXX...XXXX!3e3 

If you have this id, you can put it into an existing protobuf-block and it may work (I only tested this with 3 different lists, which were all created by the same account, so this theory is far from proven).

Now, how do you get the block? I would just share the one I have, but because I only understand parts of what it does, I fear that it may contain some personal info. Instead, I will share my process of getting it. For this I use Burpsuite. It's a program mainly used for web-security testing and has a free community edition, however for our use-case it is the perfect tool, because with it you can easily tinker with requests, change small parts in the request, send it again and immediately see if your changes changed the response. However for extracting the pb-block, one should also be able to use any program that can intercept browser traffic.

Heres the basic rundown with burp:

  1. From GMaps, share a list that has >20 items (this is important) and copy the public link

  2. In Burp, go to the tab "Proxy", make sure "Intercept" is off and click "Open browser" to open the integrated chromium browser

  3. There, paste the link and wait until maps loaded completely

  4. In Burp, turn "Intercept" on, then in google maps, scroll down in the list, until it starts loading new results (always blocks of 20)

  5. Burp now intercepted all requests the browser made since you turned intercepting on. Click "Forward" and go through all requests, until you see a request in the format

    GET /search?tbm=map&authuser=0&hl=de&gl=de&pb=!7i20....

This is what you're looking for.

Optionally, you can now right click into the request-text and click "send to repeater", then switch to the repeater-tab. Here you can edit the request and then send it again, being able to see the response immediately. For example, removing the authuser, hl, gl, q, ech, psi url parameters, the request still works flawlessly. If you remove the tch=1 parameter, the response you get will be in a more human readable format.

In the request-text you should now be able to just search for the list-id you got from the link previously and replace it with the id of another list (search bar is at the bottom in burp). As I said, this worked for me, but it may be possible that the pb-block contains some additional metadata that makes lists from different google-accounts or different types of lists incompatible with specific pb-blocks. Just a theory though. Let me know how it goes!

Further automating

I have theorised that one could automate getting the pb-block using requests-html because it can load html-sites fully but it doesn't get updated anymore. Another option (probably the better one) is Selenium Wire, as you should be able to load the page and intercept the requests, like we did in burp. Seems like a whole lot of work tho :D

like image 40
bingo Avatar answered Nov 01 '22 19:11

bingo