Is there a way to exclude a controller action from search engine crawling? Is there an MVC attribute that can be added above the action name?
I want to exclude the following URL from search engine crawling
Home/Secret?type=1
But I want this to be available to search engine crawling
Home/Search
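(For context, if a static file were enough, a robots.txt at the site root could express this — assuming every URL under /Home/Secret should be excluded, since Disallow matches URL prefixes and ignores query strings:)

```
User-agent: *
Disallow: /Home/Secret
```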
I think you need to dynamically generate a robots.txt file, served by a controller action.
A related question covers allowing the .txt extension to be served by an action: https://stackoverflow.com/a/14084127/511438
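One way to do that (a sketch along the lines of that answer — verify against your IIS setup) is to route the otherwise-static robots.txt path through managed handlers in web.config:

```xml
<system.webServer>
  <handlers>
    <!-- Route robots.txt through ASP.NET so MVC routing can pick it up
         (assumes IIS integrated pipeline mode, .NET 4.x). -->
    <add name="RobotsTxtHandler" path="robots.txt" verb="GET"
         type="System.Web.Handlers.TransferRequestHandler"
         preCondition="integratedMode,runtimeVersionv4.0" />
  </handlers>
</system.webServer>
```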
public ActionResult Robots()
{
    Response.ContentType = "text/plain";
    //-- Write a response listing the areas/controllers/actions
    //-- that search engines should not follow.
    return View();
}
Add a Robots.cshtml view.
Map a route so that a request for the file invokes the action above:
routes.MapRoute("Robots.txt",
    "robots.txt",
    new { controller = "Home", action = "Robots" });
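The Robots.cshtml view itself can be minimal. A sketch, assuming the action passes the list from NoRobotsAttribute.GetNoRobots() (shown below) as the model, i.e. `return View(NoRobotsAttribute.GetNoRobots());`:

```
@model List<string>
@{
    Layout = null;
}User-agent: *
@foreach (var path in Model)
{
    @:Disallow: @path
}
```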
Here is the NoRobots attribute, with code to get a list of the areas/controllers/actions that carry it. It works by interpreting the full namespace text; I'd welcome suggestions for doing the reflection more robustly.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class NoRobotsAttribute : Attribute
{
    public static IEnumerable<MethodInfo> GetActions()
    {
        return Assembly.GetExecutingAssembly().GetTypes()
            .Where(t => typeof(Controller).IsAssignableFrom(t))
            .SelectMany(type =>
                type.GetMethods(BindingFlags.Public | BindingFlags.Instance)
                    .Where(a => a.ReturnType == typeof(ActionResult)));
    }

    public static IEnumerable<Type> GetControllers()
    {
        return Assembly.GetExecutingAssembly().GetTypes()
            .Where(t => typeof(Controller).IsAssignableFrom(t));
    }

    //-- Strips the conventional "Controller" suffix so the emitted
    //-- path matches the route (e.g. HomeController -> /Home).
    private static string ControllerPath(List<string> namespaceSplit)
    {
        var controllersIndex = namespaceSplit.IndexOf("Controllers");
        if (controllersIndex == -1) return "";

        var name = namespaceSplit[controllersIndex + 1];
        if (name.EndsWith("Controller"))
            name = name.Substring(0, name.Length - "Controller".Length);
        return "/" + name;
    }

    public static List<string> GetNoRobots()
    {
        var robotList = new List<string>();

        //-- Controllers marked [NoRobots]: disallow the whole controller.
        foreach (var type in GetControllers())
        {
            if (!type.GetCustomAttributes(typeof(NoRobotsAttribute), false).Any())
                continue;

            var namespaceSplit = type.FullName.Split('.').ToList();
            robotList.Add(ControllerPath(namespaceSplit));
        }

        //-- Actions marked [NoRobots]: disallow area/controller/action.
        foreach (var methodInfo in GetActions())
        {
            if (!methodInfo.GetCustomAttributes(typeof(NoRobotsAttribute), false).Any())
                continue;

            var namespaceSplit = methodInfo.DeclaringType.FullName.Split('.').ToList();
            var areaIndex = namespaceSplit.IndexOf("Areas");
            var area = areaIndex > -1 ? "/" + namespaceSplit[areaIndex + 1] : "";
            robotList.Add(area + ControllerPath(namespaceSplit) + "/" + methodInfo.Name);
        }

        return robotList;
    }
}
Usage:
[NoRobots] //-- Can be applied at controller or action method level.
public class HomeController : Controller
{
    [NoRobots]
    public ActionResult Index()
    {
        ViewData["Message"] = "Welcome to ASP.NET MVC!";

        //-- Test code that writes the result to a web page.
        List<string> x = NoRobotsAttribute.GetNoRobots();
        return View(x);
    }
}
... and for Areas.
namespace MVC.Temp.Areas.MyArea.Controllers
{
    using MVC.Temp.Models.Home;

    [NoRobots]
    public class SubController : Controller
    {
        [NoRobots]
        public ActionResult SomeAction()
        {
            return View();
        }
    }
}
Keep in mind that this solution relies on namespace conventions; I'd welcome any improvements someone can offer.
Finally, you need to write the robots.txt file correctly, including any header information and wildcard support.
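As an alternative sketch to the view-based approach (a hypothetical variant, not part of the original answer), the action could build the file content directly:

```
public ActionResult Robots()
{
    var sb = new StringBuilder();
    sb.AppendLine("User-agent: *"); //-- applies to all crawlers
    foreach (var path in NoRobotsAttribute.GetNoRobots())
    {
        sb.AppendLine("Disallow: " + path);
    }
    return Content(sb.ToString(), "text/plain");
}
```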
If the URL is publicly accessible, and especially if it is linked from a page, a robot can and will find it. You can use rel="nofollow" on links, use a noindex meta tag on the page itself, or use a robots.txt file to Disallow indexing of the pages. This will stop the honest search engines (Google, Bing, Yahoo) from indexing or following the links, but it won't keep random bots from looking at the pages.
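Concretely, the per-link and per-page options look like this (illustrative markup):

```html
<!-- On pages that link to the URL: -->
<a href="/Home/Secret?type=1" rel="nofollow">Secret report</a>

<!-- In the <head> of the page you want kept out of the index: -->
<meta name="robots" content="noindex, nofollow" />
```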
Nevertheless, the URL remains accessible to the public: if a human can visit it, so can a computer. If you want to keep it from the general public entirely, you probably want to look into user authentication.