Question

UPDATE
Below is the original question. It is related to what the issue turned out to be, but only tangentially. Please see the edits starting with EDIT 2 for more useful background information.

On our site, we have some CMS pages that explain correlation between two different categories. As such, the URLs tend to be similar to those catalog page URLs.

  • An example CMS URL:
    • "brand/category.html"
  • The category which matches:
    • "category"

Is there a setting in Magento to force category route matching to be more strict?

EDIT: I should note, though it feels obvious: These are just example names

EDIT 2: If it's helpful, all catalog pages have URLs relative to root (website.com/subcat) where subcat is a child of another category. This behavior is different from the default in other Magento installs. (Note: this isn't preferred, and it's unclear as to why it's happening).

EDIT 3: After more digging, I found a quote from an article by Fabrizio Branca on URL keys in 1.13:

Before 1.13/1.8 any CMS page with a url-key that was also used as a category or product url-key would be evaluated first. This way you could easily replace the main categories by cms landing pages. This has changed now. Even though the CMS controller is processed first, the product and category urls will be evaluated before the routing process starts, making it much harder to display cms content in a clean way instead.

EDIT 4: Result of more research:

  • "legitimate category" exists, and by default is accessible at /a
  • "legitimate other category" exists as well, and is at b
  • regardless of the relationship between these two categories, either can be accessed using the other as its parent (a/b works fine, as does b/a).
    • note that a/b shows products of b and b/a products of a
  • However, b/b does not work, nor does non-existent-category/a

What I'm looking for is a URL structure similar to the previous Magento versions (i.e. category/subcategory), without losing the benefits of background indexing that 1.13 gives.


Solution

(Thought I'd posted an answer similar to Alan's, but I hadn't. Sitting here in LocalStorage. But, I can tag onto his answer with an interesting solution theory.)

The CMS router adds itself to the Front Controller instance by observing the controller_front_init_routers event after the Admin and Standard routers are added. With a little config XML, it would be possible to switch this to the controller_front_init_before event, thereby adding the CMS router first, meaning its match()ing logic will run before the others.

To test this theory, drop the following into app/etc/local.xml:

<!-- place inside the root <config> node; the CMS observer is registered in the global area -->
<global>
    <events>
        <!-- fire observer for different event -->
        <controller_front_init_before>
            <observers>
                <cms>
                    <class>Mage_Cms_Controller_Router</class>
                    <method>initControllerRouters</method>
                </cms>
            </observers>
        </controller_front_init_before>
        <!-- disable the original observer -->
        <controller_front_init_routers>
            <observers>
                <cms>
                    <type>disabled</type>
                </cms>
            </observers>
        </controller_front_init_routers>
    </events>
</global>

See if this solves the problem.

Incidentally, the CMS router will adjust the request path in the same way as the URL rewrite model.

OTHER TIPS

I've seen a lot of interesting implementations (some good, some bad) by SEO specialists who didn't have a Magento background throughout the years. It sounds like you may be running into problems with some custom code you don't understand. The high level answer to your question may be "Contact the person who wrote the SEO code and/or installed the extension you don't understand", or find a Magento consultant to take a look and quickly dissect it for you.

Your question, even with its clarification, is still too confusing. Literally speaking, no, there's no setting in Magento to "force category routing to be more strict". I'm going to explain, in broad terms, how category routing works vs. CMS routing in a standard Magento system. This will (hopefully) give you enough information to ask a new question in terms we'll be able to understand. Also, I've written extensively on Magento's request dispatch before, so if you're interested in the nitty-gritty details I'd start there.

Category Routing

There is, strictly speaking, no "category" routing in Magento. On a site with SEO Friendly URLs turned off, a category listing page looks like this.

http://magento.example.com/catalog/category/view/id/8

When SEO friendly URLs are on, Magento (in an indexing process) creates one or more entries in the core_url_rewrite table for that category. The request_path column is the important one here. When Magento is deciding how to handle a particular URL, it will first look in this table. If the current URL matches the request_path, Magento will change its internal representation of the URL so it looks like the target_path column.

So, in the sample data, there's a row that looks like this:

*************************** 1. row ***************************
url_rewrite_id: 17
      store_id: 1
   category_id: 8
    product_id: NULL
       id_path: category/8
  request_path: electronics/cell-phones.html
   target_path: catalog/category/view/id/8
     is_system: 1
       options: NULL
   description: NULL
1 row in set (0.01 sec)

When Magento sees the URL http://magento.example.com/electronics/cell-phones.html, it matches this row because the request_path column is electronics/cell-phones.html. It then changes its internal representation of the URL to the target_path (catalog/category/view/id/8). Then, Magento handles the URL normally.

That's probably a bit much to follow if you're not used to it, but the important thing to take away is that the system that decides how to handle the URL doesn't care that it's a category URL; it just cares that there's an entry in the core_url_rewrite table. This same table is used for product name URLs. Many SEO extensions and custom code solutions use this table as well.
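As a rough illustration of that lookup, here is a toy model in plain PHP. It is not Magento's actual code: the $rewrites array and the rewritePath() helper are made-up stand-ins for the core_url_rewrite table and the rewrite step.

```php
<?php
// Toy stand-in for rows of core_url_rewrite: request_path => target_path.
$rewrites = array(
    'electronics/cell-phones.html' => 'catalog/category/view/id/8',
);

/**
 * Hypothetical helper: return the internal path for a requested path.
 * If no row matches, the path is passed through unchanged.
 */
function rewritePath(array $rewrites, $requestPath)
{
    $requestPath = trim($requestPath, '/');
    return isset($rewrites[$requestPath]) ? $rewrites[$requestPath] : $requestPath;
}

echo rewritePath($rewrites, '/electronics/cell-phones.html'), "\n"; // catalog/category/view/id/8
echo rewritePath($rewrites, '/about-us'), "\n";                     // about-us
```

Note that the helper never asks whether the row belongs to a category, a product, or something custom; it only cares that a matching request_path exists.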

CMS Page Routing

After Magento finishes referencing the core_url_rewrite table, what happens next is

  1. It checks for any Admin application page matches (manage products, manage categories, etc.)

  2. It checks for any Frontend application page matches (product listing page, the above mentioned category listing page, etc.)

  3. If numbers 1 & 2 contain no matches, it then looks for a CMS page match.

Magento doesn't use the core_url_rewrite table for CMS pages. Instead, if it reaches step number 3, it tries to match the URL with the URL Key set on the CMS page object. (It would be more accurate to say that when Magento is looking for a CMS page match, it's operating on a URL already modified by the core_url_rewrite process, but things are already confusing enough.)

The important takeaway here is: CMS matching happens only after a category page match has failed.
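The order described above can be sketched with a small simulation. This is only a toy model under stated assumptions: the closures in $routers and the toyDispatch() function are hypothetical and merely imitate "rewrite first, then admin, then frontend, then CMS"; they are not Magento's router classes.

```php
<?php
// Each toy "router" returns a description of what it matched, or null.
$routers = array(
    'admin'    => function ($path) { return strpos($path, 'admin/') === 0 ? 'admin page' : null; },
    'standard' => function ($path) { return strpos($path, 'catalog/') === 0 ? 'frontend page' : null; },
    'cms'      => function ($path) { return $path === 'brand/category.html' ? 'CMS page' : null; },
);

function toyDispatch(array $rewrites, array $routers, $path)
{
    // The rewrite table is consulted before any router runs.
    if (isset($rewrites[$path])) {
        $path = $rewrites[$path];
    }
    // The first router that matches wins; CMS is tried last.
    foreach ($routers as $name => $match) {
        $result = $match($path);
        if ($result !== null) {
            return $name . ': ' . $result;
        }
    }
    return 'no match';
}

// A rewrite row that collides with the CMS URL key hides the CMS page.
$withRewrite = array('brand/category.html' => 'catalog/category/view/id/8');
echo toyDispatch($withRewrite, $routers, 'brand/category.html'), "\n"; // standard: frontend page
echo toyDispatch(array(), $routers, 'brand/category.html'), "\n";      // cms: CMS page
```

In this model the CMS page is only reachable when nothing earlier in the chain claims the path, which is the behavior described above.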

It sounds like you may have external processes modifying the core_url_rewrite table, or a custom router object added to your system that does extra routing, or maybe even a non-Magento system doing things to change URLs.

I'm afraid there's no quick and easy answer for your situation.

What I'm looking for is a URL structure similar to the previous Magento versions (i.e. category/subcategory), without losing the benefits of background indexing that 1.13 gives.

This is a problem I've been looking at, which so far doesn't seem to have a great solution. We have some deeply nested categories, for example:

Cat A
    Cat B
        Cat C
            Cat D

Prior to 1.13, the category URL would have been generated as www.domain.com/cat-a/cat-b/cat-c/cat-d/, but now it is generated as www.domain.com/catd. If you have multiple "Cat D"s, though, it could be generated as something like www.domain.com/catalog/category/view/s/cat-d/id/132/.

I've been tinkering with different ideas for addressing this. One thing I'm trying right now is to modify the loadByRequestPath method of Enterprise_UrlRewrite_Model_Resource_Url_Rewrite to first look for a full path before falling back to the default behavior. I did that by adding this method:

protected function tryLoadByFullPath($object, $paths)
{
    if (count($paths) > 1) {
        $_path = implode('/', $paths);

        $select = $this->_getReadAdapter()->select()
            ->from(array('m' => $this->getMainTable()))
            ->where('m.request_path = ?', $_path);

        $result = $this->_getReadAdapter()->fetchRow($select);

        if ($result) {
            $object->setData($result);
            $this->unserializeFields($object);
            $this->_afterLoad($object);

            return true;
        }
    }

    return false;
}

and then adding this code to the top of loadByRequestPath():

if ($this->tryLoadByFullPath($object, $paths)) {
    return $this;
}

It appears to work at first glance, though I haven't tested it very well yet. The downside is that the url_key has to be manually set to the full path for every category, so you would have to set the URL key for Cat D to "cat-a/cat-b/cat-c/cat-d". That's obviously not ideal.

Anyway, that's probably not very helpful, but maybe someone has a better take on this approach.

@benmarks' answer is already good, but you will still have issues if your product URLs match your CMS page URLs. The URL rewrites from the core_url_rewrite table are checked before the routers are checked; see Mage_Core_Controller_Varien_Front::dispatch:

public function dispatch()
{
    // [...]
    $this->_getRequestRewriteController()->rewrite();

    Varien_Profiler::start('mage::dispatch::routers_match');
    $i = 0;
    while (!$request->isDispatched() && $i++ < 100) {
        foreach ($this->_routers as $router) {
            /** @var $router Mage_Core_Controller_Varien_Router_Abstract */
            if ($router->match($request)) {
                break;
            }
        }
    }
    // [...]
}

If a product URL matches the current path info, the path info of the request will be changed in Mage_Core_Model_Url_Rewrite_Request::_rewriteDb, so that the CMS router will not match any more even though it is called before the catalog router (if you applied @benmarks config.xml).

Fortunately, there is a nice flag called straight, which can be set on a Mage_Core_Controller_Request_Http. If the flag is set, the URL rewrites from the database will not be checked. Hence, my solution is to observe an event, which is fired early enough (controller_front_init_before) and set the straight flag on the request in it if the current path info matches a CMS page identifier:

In your config.xml:

<global>
    <events>
        <controller_front_init_before>
            <observers>
                <namespace_module>
                    <class>namespace_module/observer</class>
                    <method>controllerFrontInitBefore</method>
                </namespace_module>
            </observers>
        </controller_front_init_before>
    </events>
</global>

Your observer method:

class Namespace_Module_Model_Observer
{

    public function controllerFrontInitBefore(Varien_Event_Observer $observer)
    {
        /** @var Mage_Core_Controller_Request_Http $request */
        $request = $observer->getFront()->getRequest();

        $identifier = trim($request->getPathInfo(), '/');
        $pageId = Mage::getModel('cms/page')->checkIdentifier($identifier, Mage::app()->getStore()->getId());

        if ($pageId) {
            $request->isStraight(true);
        }
    }

}

Since the catalog router will not match if the database URL rewrites are not applied, this solution should work on its own and you do not even need the config.xml of @benmarks.
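To make the effect of the flag easier to see, here is a self-contained toy model. Everything in it is hypothetical: ToyRequest only imitates the getter/setter style of isStraight(), and $cmsIdentifiers and $rewrites stand in for the CMS page identifiers and the core_url_rewrite rows. It is a sketch of the idea, not Magento code.

```php
<?php
// Toy request object; calling isStraight(true) sets the flag,
// calling isStraight() with no argument reads it.
class ToyRequest
{
    public $pathInfo;
    private $straight = false;

    public function __construct($pathInfo)
    {
        $this->pathInfo = $pathInfo;
    }

    public function isStraight($flag = null)
    {
        if ($flag !== null) {
            $this->straight = (bool) $flag;
        }
        return $this->straight;
    }
}

// Hypothetical data: one CMS identifier that collides with a rewrite row.
$cmsIdentifiers = array('brand/category.html');
$rewrites = array('brand/category.html' => 'catalog/category/view/id/8');

function resolvePath(ToyRequest $request, array $cmsIdentifiers, array $rewrites)
{
    $path = trim($request->pathInfo, '/');
    // Observer step: flag the request when the path is a CMS identifier.
    if (in_array($path, $cmsIdentifiers, true)) {
        $request->isStraight(true);
    }
    // Rewrite step: skipped entirely for straight requests.
    if (!$request->isStraight() && isset($rewrites[$path])) {
        $path = $rewrites[$path];
    }
    return $path;
}

echo resolvePath(new ToyRequest('/brand/category.html'), $cmsIdentifiers, $rewrites), "\n"; // brand/category.html
echo resolvePath(new ToyRequest('/brand/category.html'), array(), $rewrites), "\n";         // catalog/category/view/id/8
```

With the flag set, the path reaches the routers unchanged, so only the CMS identifier can match it; without the flag, the rewrite row wins.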

Hope this helps someone - that took quite some effort to debug!

Licensed under: CC-BY-SA with attribution
Not affiliated with magento.stackexchange