I have an app running in Heroku. I am using sitemap_generator to generate sitemap and save it into s3. I have added the robots.txt to contain my sitemap location.
My question are.
How can I know my sitemap are successfully find by search engine like google?
How can I monitor my sitemap?
If my sitemap is located in my app server I can add the sitemap manually into google webmaster tools for monitoring. Because when I click on "Test/Add sitemap" in Google webmaster tools, it default to the same server.
Thanks for your help.
Do you need to submit a sitemap to Google Search Console? You should, but you don't have to submit a sitemap to Google. Google's bots will crawl your website eventually; submitting a sitemap just speeds the process along. Before you submit, ensure that your sitemap reflects what's currently on your website.
If you have a larger file or more URLs, you will have to break your list into multiple sitemaps. You can optionally create a sitemap index file (a file that points to a list of sitemaps) and submit that single index file to Google. You can submit multiple sitemaps and/or sitemap index files to Google.
I got it to work.
Google has something called cross submission: http://googlewebmastercentral.blogspot.com/2007/10/dealing-with-sitemap-cross-submissions.html
You might want to visit this blog as well: http://stanicblog.blogspot.sg/2012/02/how-to-add-your-sitemap-file-located-in.html
Thanks for your help, yacc.
Let me answer your two first questions, one at a time (I'm not sure what you mean by 'how can I monitor my sitemap' so I'll skip it):
If you can't use Google webmaster form to submit your sitemap, use an HTTP get request to notify Google of your new site map.
If your sitemap is located at https://s3.amazonaws.com/sitemapbucket/sitemap.gz
, first URL encode your sitemap URL (you can use this online URL encoder/decoder for that) then using curl or wget to submit your encoded URL to Google:
curl www.google.com/webmasters/tools/ping?sitemap=https%3A%2F%2Fs3.amazonaws.com%2Fsitemapbucket%2Fsitemap.gz
If your request is successful you'll get a 200 answer with a message like this:
... cut ...
<body><h2>Sitemap Notification Received</h2>
<br>
Your Sitemap has been successfully added to our list of Sitemaps to crawl.
... cut ...
Open Webmaster Tools, navigate to Site sonfiguration->Sitemaps, there you should see the sitemaps that you've submited. It might take sometime for a new sitemap to show up there, so check frequently.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With