
Magento 1 - Removing numbers in url key/product url

As you may know, if you have products that share a url key, the url key will have a digit appended to it:

e.g.

http://www.example.com/main-category/sub-category/product-name-**6260**.html

How do I find the source of that 6260 (one of the numbers appended to my URLs)? I tried the product ID and the SKU, but I cannot find where it comes from. The reason I ask is that if I can find it, I can write a string-replace function to strip it out of the URLs before I echo them on certain product listing pages.

Thanks.

Joel asked Nov 28 '12 21:11


2 Answers

Before we get to the location in code where this happens, be advised you're entering a world of pain.

  1. There's no simple rule as to how those numbers are generated. There are cases where it's the store ID, there are cases where it's the simple product ID, and there are cases where it's neither.

  2. Even if there were, it's common for not-from-scratch Magento sites to contain custom functionality that changes this.

  3. Ultimately, since Magento's human readable/SEO-friendly URLs are located in the core_url_rewrite table, it's possible for people to insert arbitrary text.

Warnings of doom aside, the Model you're looking for is Mage::getSingleton('catalog/url'). This contains most of the logic for generating Magento Catalog and product rewrites. All of these methods end by passing the request path through the getUnusedPath method.

#File: app/code/core/Mage/Catalog/Model/Url.php
public function getUnusedPath($storeId, $requestPath, $idPath)
{
    //...
}

This method contains the logic for creating a unique number on the end of the URL. Tracing this in its entirety is beyond the scope of a Stack Overflow post, but this line in particular is enlightening/disheartening.

$lastRequestPath = $this->getResource()
    ->getLastUsedRewriteRequestIncrement($match[1], $match[4], $storeId);
if ($lastRequestPath) {
    $match[3] = $lastRequestPath;
}
return $match[1]
    . (isset($match[3]) ? ($match[3]+1) : '1')
    . $match[4];

In particular, these two lines:

$match[3] = $lastRequestPath;
//...
. (isset($match[3]) ? ($match[3]+1) : '1')
//...

In case it's not obvious, there are cases where Magento will automatically append a 1 to a URL, and then continue to increment it. This makes the generation of those URLs dependent on system state when they were generated — there's no simple rule.
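
To make the state dependence concrete, here is a rough standalone sketch of the same idea. It is not Magento's code: nextUnusedPath and the $takenPaths array are made up for illustration and stand in for the lookup against core_url_rewrite, and the store ID and id path are ignored entirely.

```php
<?php
declare(strict_types=1);

/**
 * Simplified illustration only, not Magento's implementation.
 * $takenPaths is a hypothetical stand-in for the request paths already
 * stored in core_url_rewrite for one store.
 */
function nextUnusedPath(array $takenPaths, string $requestPath, string $suffix = '.html'): string
{
    // No collision: the path is used as-is and no number is appended.
    if (!in_array($requestPath, $takenPaths, true)) {
        return $requestPath;
    }

    $quotedSuffix = preg_quote($suffix, '#');

    // Split "name-12.html" into the base "name", dropping any existing "-12".
    if (!preg_match('#^(.+?)(?:-\d+)?' . $quotedSuffix . '$#', $requestPath, $match)) {
        throw new InvalidArgumentException('Path does not end with the expected suffix');
    }
    $base = $match[1];

    // Find the highest number already used for this base.
    $last = 0;
    $takenPattern = '#^' . preg_quote($base, '#') . '-(\d+)' . $quotedSuffix . '$#';
    foreach ($takenPaths as $taken) {
        if (preg_match($takenPattern, $taken, $m)) {
            $last = max($last, (int) $m[1]);
        }
    }

    // The new number depends entirely on what was already in the table.
    return $base . '-' . ($last + 1) . $suffix;
}
```

With only shirt.html taken, this returns shirt-1.html for a second shirt.html. With shirt.html, shirt-1.html and shirt-7.html taken, it returns shirt-8.html. Same product, different number, depending on what was already stored, which is why the number cannot be derived from the product itself.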

Other lines of interest in this file are:

if (strpos($idPath, 'product') !== false) {
    $suffix = $this->getProductUrlSuffix($storeId);
} else {
    $suffix = $this->getCategoryUrlSuffix($storeId);
}    

This $suffix will be used on the end of the URL as well, so those methods are worth investigating.

If all you're trying to do is remove numbers from the URL, you might be better off with a regular expression or some explode/implode string jiggering.
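
A minimal sketch of the regular expression route, assuming the number always sits right before the URL suffix, as in the question's example. The function name stripUrlIncrement is hypothetical, and the default .html suffix is an assumption; pass in whatever your store's Product URL Suffix is.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: removes a trailing "-<digits>" that sits directly
 * before the URL suffix. Not part of Magento.
 */
function stripUrlIncrement(string $url, string $suffix = '.html'): string
{
    // Match "-6260" only when it is immediately followed by the suffix
    // at the very end of the string, and keep the suffix itself.
    $pattern = '#-\d+(' . preg_quote($suffix, '#') . ')$#';

    // preg_replace returns null on error; fall back to the original URL.
    return preg_replace($pattern, '$1', $url) ?? $url;
}
```

Called on http://www.example.com/main-category/sub-category/product-name-6260.html, this returns the same URL ending in product-name.html. Two caveats: a URL key that legitimately ends in a number (say, iphone-5.html) gets stripped as well, and the cleaned URL is not guaranteed to resolve to the same product, since the number was most likely added because another rewrite already owned the unnumbered path.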

Alan Storm answered Nov 19 '22 22:11


I have little to no idea why this works, but it worked for me. Most probably the new suffix keeps the regenerated URLs from colliding with the old ones. Magento ver. 1.7.0.2 had suddenly started adding numbers as suffixes to my new products' URLs, even though their URL keys and names were different from those of the old products.

On a hunch, I went to System -> Configuration -> Catalog -> Search Engine Optimizations -> Product URL Suffix and changed the default .html to -prod.html. I guess you could change it to any suffix you want. Then I re-indexed my website, refreshed the cache, and presto! All the numbers were gone from the product URLs. They now all have the format custom-product-name-prod.html. The canonical tag also shows custom-product-name-prod.html, so I'm doubly happy.

Don't know if it'll work for others, but I hope it does. Do note that I did have old and new products with duplicate URLs and that I had disabled old products before doing this procedure. So if you have 2 products with the same url key and both are enabled, then this may NOT work for you.

Agrim answered Nov 19 '22 21:11