I read an article about excluding robots from certain URLs in my ASP.NET MVC project. The author says we should add an action to one of the controllers, like this. In his example he adds the action to the Home controller:
#region -- Robots() Method --
// Renders the Robots.cshtml view and serves it as plain text.
public ActionResult Robots()
{
    Response.ContentType = "text/plain";
    return View();
}
#endregion
Then we should add a Robots.cshtml file to the project with this body:
@{
Layout = null;
}
# robots.txt for @this.Request.Url.Host
User-agent: *
Disallow: /Administration/
Disallow: /Account/
Finally, we should add this route to Global.asax:
routes.MapRoute("Robots.txt",
    "robots.txt",
    new { controller = "Home", action = "Robots" });
My question is: do robots crawl controllers that have the [Authorize] attribute, like Administration?
This simple piece of code worked for my ASP.NET Core 3.1 site:
// Requires: using System.Text; using Microsoft.AspNetCore.Mvc;
[Route("/robots.txt")]
public ContentResult RobotsTxt()
{
    // An empty Disallow permits everything; the sitemap URL is built from the current request.
    var sb = new StringBuilder();
    sb.AppendLine("User-agent: *")
        .AppendLine("Disallow:")
        .Append("sitemap: ")
        .Append(this.Request.Scheme)
        .Append("://")
        .Append(this.Request.Host)
        .AppendLine("/sitemap.xml");
    return this.Content(sb.ToString(), "text/plain", Encoding.UTF8);
}
Do robots crawl controllers that have the [Authorize] attribute, like Administration?
If they find a link to it, they are likely to try to crawl it, but they will fail just like anyone using a web browser who has not logged in. Robots have no special ability to access your website differently than a standard browser.
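To make that failure concrete, here is a minimal sketch, assuming ASP.NET MVC 5 with forms authentication configured. The controller name AdministrationController is an assumption chosen to match the /Administration/ path from the question; your project may be organized differently.

```csharp
using System.Web.Mvc;

// Hypothetical controller matching the /Administration/ path in the question.
// [Authorize] on the class applies to every action in it.
[Authorize]
public class AdministrationController : Controller
{
    // Reached only by authenticated users. An anonymous request, whether it
    // comes from a browser or a crawler, is rejected before this code runs.
    public ActionResult Index()
    {
        return View();
    }
}
```

With this in place, an anonymous request to /Administration/ gets a 401 result from the attribute, which forms authentication typically turns into a redirect to the login page. A crawler following a link there sees the login page at most, never the protected content.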
Note that robots that conform to the Robots Exclusion Standard request the exact URL
http://mydomain/robots.txt
You can create a response for that URL however you like. One approach is certainly to have a controller that handles that request. You can also just add a text file with the same content you would have returned from the controller, e.g.
User-agent: *
Disallow: /Administration/
Disallow: /Account/
to the root folder of your project and make sure it is marked as content so that it is deployed to the website.
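As a sketch of what "marked as content" means, assuming a classic (non-SDK-style) ASP.NET MVC project file: setting the file's Build Action to Content in Visual Studio typically adds an entry like the one below to the .csproj. The surrounding ItemGroup is shown only for context and will contain your project's other content files.

```xml
<!-- Fragment of the project's .csproj file; an assumption about a classic MVC project layout. -->
<ItemGroup>
  <!-- Build Action "Content" includes robots.txt when the site is published. -->
  <Content Include="robots.txt" />
</ItemGroup>
```

Once deployed, the file is served from the site root as /robots.txt without any controller or route being involved.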
Adding this robots.txt entry will prevent conforming robots from attempting to browse controllers that require authentication (and lighten the load on your website slightly), but without the robots file they will just try the URL and fail.