As you may know, if you have products that share a URL key, Magento appends a number to the URL key, e.g.:
http://www.example.com/main-category/sub-category/product-name-**6260**.html
How do I find the source of that 6260 (one of the numbers appended to my URLs)? I tried the product ID and the SKU, but it matches neither. I ask because, if I can find it, I can write a string-replace function to strip it out of URLs before I echo them on certain product listing pages.
Thanks.
Before we get to the location in code where this happens, be advised you're entering a world of pain.
There's no simple rule as to how those numbers are generated. There are cases where it's the store ID, and cases where it's the simple product ID. There are cases where it's neither.
Even if there were a rule, it's common for not-from-scratch Magento sites to contain custom functionality that changes this.
Ultimately, since Magento's human-readable/SEO-friendly URLs are stored in the core_url_rewrite table, it's possible for people to insert arbitrary text.
Warnings of doom aside, the model you're looking for is Mage::getSingleton('catalog/url'). This contains most of the logic for generating Magento Catalog and product rewrites. All of these methods end by passing the request path through the getUnusedPath method.
#File: app/code/core/Mage/Catalog/Model/Url.php
public function getUnusedPath($storeId, $requestPath, $idPath)
{
    //...
}
This method contains the logic for creating a unique number on the end of the URL. Tracing this in its entirety is beyond the scope of a Stack Overflow post, but this section in particular is enlightening/disheartening.
$lastRequestPath = $this->getResource()
    ->getLastUsedRewriteRequestIncrement($match[1], $match[4], $storeId);
if ($lastRequestPath) {
    $match[3] = $lastRequestPath;
}
return $match[1]
    . (isset($match[3]) ? ($match[3]+1) : '1')
    . $match[4];
In particular, these two lines:
$match[3] = $lastRequestPath;
//...
    . (isset($match[3]) ? ($match[3]+1) : '1')
//...
In case it's not obvious, there are cases where Magento will automatically append a 1 to a URL, and then continue to increment it. This makes those URLs dependent on the state of the system at the time they were generated; there's no simple rule.
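To make that state dependence concrete, here is a minimal standalone sketch of the increment step. It is not Magento code: sketchUnusedPath and the $usedPaths array (standing in for one store's rows in core_url_rewrite) are made up for illustration, and the regular expression only approximates what core does.

```php
<?php
// Simplified imitation of the increment step; NOT the Magento implementation.
// $usedPaths is a hypothetical stand-in for the request paths already stored
// in core_url_rewrite for a single store.
function sketchUnusedPath(array $usedPaths, $requestPath, $suffix = '.html')
{
    if (!in_array($requestPath, $usedPaths, true)) {
        return $requestPath; // path is free, nothing gets appended
    }

    // Split "name(-123)(.html)" into base name, optional number, optional suffix.
    $pattern = '#^([0-9a-z/-]+?)(-([0-9]+))?(' . preg_quote($suffix, '#') . ')?$#i';
    if (!preg_match($pattern, $requestPath, $match)) {
        return $requestPath; // the sketch gives up; core handles this differently
    }
    $prefix = $match[1] . '-';
    $ext    = isset($match[4]) ? $match[4] : '';

    // Find the highest number already used for this prefix/suffix pair.
    $last     = 0;
    $numbered = '#^' . preg_quote($prefix, '#') . '([0-9]+)' . preg_quote($ext, '#') . '$#i';
    foreach ($usedPaths as $path) {
        if (preg_match($numbered, $path, $m)) {
            $last = max($last, (int) $m[1]);
        }
    }

    return $prefix . ($last + 1) . $ext;
}

// The same URL key produces different results depending on what already exists.
$used   = array('shoes/red-sneaker.html');
$first  = sketchUnusedPath($used, 'shoes/red-sneaker.html'); // shoes/red-sneaker-1.html
$used[] = $first;
$second = sketchUnusedPath($used, 'shoes/red-sneaker.html'); // shoes/red-sneaker-2.html
```

The number depends only on which rewrites already existed when the path was requested, which is why it can't be traced back to a product ID or SKU.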
Other lines of interest in this file are:
if (strpos($idPath, 'product') !== false) {
    $suffix = $this->getProductUrlSuffix($storeId);
} else {
    $suffix = $this->getCategoryUrlSuffix($storeId);
}
This $suffix will be used on the end of the URL as well, so those methods are worth investigating.
If all you're trying to do is remove numbers from the URL, you might be better off with a regular expression or some explode/implode string jiggering.
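A rough sketch of the regular expression route follows. The helper name stripUrlIncrement is hypothetical, and the default .html suffix is an assumption; pass in whatever your store's Product URL Suffix actually is.

```php
<?php
// Hypothetical helper: drops one trailing "-<digits>" group that sits
// directly in front of the URL suffix. It cannot tell an auto-appended
// number from one that is part of the real URL key (e.g. "iphone-4.html").
function stripUrlIncrement($url, $suffix = '.html')
{
    $pattern = '#-[0-9]+(' . preg_quote($suffix, '#') . ')$#';
    return preg_replace($pattern, '$1', $url);
}

$clean = stripUrlIncrement(
    'http://www.example.com/main-category/sub-category/product-name-6260.html'
);
// $clean is http://www.example.com/main-category/sub-category/product-name.html
```

Keep in mind that the stripped URL only resolves if a rewrite for that path exists, and where products share a URL key it may well lead to the other product.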
I have little to no idea why this works, but it worked for me, most probably because it makes URLs non-unique. Magento ver. 1.7.0.2 had suddenly started adding numbers as suffixes to my new products' names, even though their URL keys and names were different from the old products'. On a hunch, I went to System -> Configuration -> Catalog -> Search Engine Optimizations -> Product URL Suffix and changed the default .html to -prod.html. I guess you could change it to any suffix you wanted. Then I re-indexed my website, refreshed the cache, and presto! All the numbers were gone from the product URLs. The product URLs now all have the format custom-product-name-prod.html. The canonical tag also shows custom-product-name-prod.html, so I'm doubly happy.
I don't know if it'll work for others, but I hope it does. Do note that I had old and new products with duplicate URLs, and that I had disabled the old products before doing this procedure. So if you have two products with the same URL key and both are enabled, this may NOT work for you.