Some parts of Magento 2 are oriented towards flexibility, others seem to have been designed to cause headaches. One of those areas that causes me to frown is the Magento 2 robots router. Here's why.
The goal: Creating a robots.txt output
This story starts very simple: Let's create a `robots.txt` output when the URL `/robots.txt` is requested. While you might expect this to be accomplished via a simple file in the `pub/` folder, the `Magento_Robots` module actually takes a different route (pun intended). Here we go.
Stage 1: A router
The first step here is to realize that the file `robots.txt` does not exist by default in the `pub/` folder. This causes an HTTP request for the URL `/robots.txt` to be caught by Magento (either with an Apache `.htaccess` rule or with an Nginx location match), so that Magento is able to determine the right output. So far so good.
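On the Nginx side, this catch-all behaviour comes from the generic fallback to the front controller. A minimal sketch (not the official Magento sample configuration, which is considerably longer):

```nginx
# Sketch: any URL that does not resolve to a real file under pub/
# is handed to the Magento front controller. Because robots.txt
# does not exist on disk, it falls through to index.php.
location / {
    try_files $uri $uri/ /index.php$is_args$args;
}
```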
With Magento's default router, any URL is matched against the pattern `frontName/controllerPath/actionClass`, so that a URL like `checkout/index/index` ends up with the `Magento_Checkout` module. There is no module with the frontname `robots.txt`, so this request would normally die.
However, the `Magento_Robots` module also adds its own router (via `etc/frontend/di.xml`) which intercepts any request for `robots.txt` and forwards it to the path `robots/index/index`, which ends up being caught by the action class (aka controller) `\Magento\Robots\Controller\Index\Index`.
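The router registration roughly takes the following shape - a sketch of the `di.xml` wiring, not a verbatim copy of the module's file (the `sortOrder` value in particular is an assumption):

```xml
<!-- etc/frontend/di.xml (sketch) -->
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
    <type name="Magento\Framework\App\RouterList">
        <arguments>
            <argument name="routerList" xsi:type="array">
                <item name="robots" xsi:type="array">
                    <item name="class" xsi:type="string">Magento\Robots\Controller\Router</item>
                    <item name="disable" xsi:type="boolean">false</item>
                    <item name="sortOrder" xsi:type="string">10</item>
                </item>
            </argument>
        </arguments>
    </type>
</config>
```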
Stage 2: An action
The action class `\Magento\Robots\Controller\Index\Index` is quite simple: It creates a result page object with the layout handle `robots_index_index`. And note that the `Content-Type` header is set to `text/plain` - we're generating a page based on text, not HTML. The result page then calls upon the layout.
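Boiled down, the `execute()` method does something like the following (a paraphrase of the controller, with constructor injection omitted - not the literal core code):

```php
<?php
// Sketch of the action class body.
public function execute()
{
    // Build a result page; the handle robots_index_index is derived
    // from the route robots/index/index.
    $resultPage = $this->resultPageFactory->create();

    // Plain text output, not HTML.
    $resultPage->setHeader('Content-Type', 'text/plain');

    return $resultPage;
}
```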
Stage 3: The layout
The layout handle `robots_index_index` corresponds to an XML layout file `robots_index_index.xml`. But instead of extending the normal `default` layout, the file calls upon the XML page layout `robots` (`view/frontend/page_layout/robots.xml`). This makes sure that all regular containers and blocks are gone, with only one container remaining: `root`.
This `root` container is then filled with content from the block class `Magento\Robots\Block\Data`.
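Put together, the layout file could look roughly like this (a sketch; the block name is an assumption):

```xml
<!-- view/frontend/layout/robots_index_index.xml (sketch) -->
<page layout="robots" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">
    <body>
        <referenceContainer name="root">
            <!-- The block that generates the actual robots.txt body -->
            <block class="Magento\Robots\Block\Data" name="robots"/>
        </referenceContainer>
    </body>
</page>
```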
Stage 4: The block
The block class `Magento\Robots\Block\Data` renders non-HTML output via its `_toHtml()` method. Right. Luckily it does not extend the template class. Perhaps one of the reasons for this approach is caching, because if the page cache is enabled, the `/robots.txt` page will be cached as well.
The actual output is retrieved from the model `\Magento\Robots\Model\Robots`.
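Stripped to its essence, the block does little more than this (a paraphrase):

```php
<?php
// Sketch: the block bypasses templates entirely and returns the
// model output directly as "HTML" - which is really just plain text.
protected function _toHtml()
{
    return $this->robots->getData();
}
```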
Stage 5: The model
The model class `\Magento\Robots\Model\Robots` retrieves a value from the configuration path `design/search_engine_robots/custom_instructions`, which may contain whatever robots directives were entered in the backend. In my case, it is empty.
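A sketch of that lookup, assuming website-scoped configuration:

```php
<?php
// Sketch: the model is a thin wrapper around the scope configuration.
public function getData()
{
    return (string)$this->scopeConfig->getValue(
        'design/search_engine_robots/custom_instructions',
        \Magento\Store\Model\ScopeInterface::SCOPE_WEBSITE
    );
}
```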
My head hurts
If you have followed along, you can see that this entire functionality is quite complex, considering that the actual output is not dynamically created, but instead manually saved to the configuration table. Even worse, in my own case (my development environment, ok) it is empty. Which means that the entire architecture was meant to deliver an empty output.
And this took 152ms. Imagine all this CPU power across all of these Magento shops in production, and we have just burned down part of the Amazon forest.
Improvement: Create a file instead
One simple improvement is to throw away the entire module and simply add a file `pub/robots.txt` to your Magento installation and be done with it. This has two benefits: First of all, you'll understand things a lot better. Second, the request speed goes up, because any kind of page handling (Magento without FPC, Magento with FPC, or Varnish) is replaced with a simple file request (5ms at most).
But then we lack the functionality of overriding this value per Website scope.
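Such a static `pub/robots.txt` file is just plain robots directives; for example (hypothetical rules, pick your own):

```
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Sitemap: https://example.com/sitemap.xml
```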
Improvement: Create a better router instead
Another way might be to create a new router instead. Let that router be injected with the configuration and simply output that same configuration value when requested. Gone are the action, layout, block and model - and the page speed will still be ok-ish. I'm not sure about the cacheability of this, but at least things are less complex.
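A minimal sketch of such a router (the `Acme` class names are assumptions, and it would still need to be registered via `di.xml` just like the original):

```php
<?php
// Sketch (assumed class names): a leaner robots.txt router that
// short-circuits the action/layout/block/model chain.
declare(strict_types=1);

namespace Acme\Robots\App;

use Magento\Framework\App\ActionFactory;
use Magento\Framework\App\ActionInterface;
use Magento\Framework\App\RequestInterface;
use Magento\Framework\App\RouterInterface;

class RobotsRouter implements RouterInterface
{
    public function __construct(private ActionFactory $actionFactory)
    {
    }

    public function match(RequestInterface $request): ?ActionInterface
    {
        if (trim($request->getPathInfo(), '/') !== 'robots.txt') {
            return null;
        }

        // Hand the request to a single lightweight action class
        // (Acme\Robots\App\RobotsAction, assumed) that reads the
        // custom_instructions config value and returns it as a Raw
        // text/plain result - no layout, no block, no template.
        return $this->actionFactory->create(RobotsAction::class);
    }
}
```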
Anyway, I find this a good example of how Magento can sometimes be overengineered.
About the author
Jisse Reitsma is the founder of Yireo, extension developer, developer trainer and 3x Magento Master. His passion is for technology and open source. And he loves talking as well.