How to create XML sitemaps with Laravel - Tutorial

INTRODUCTION
A quick and easy way to create a Google-friendly XML sitemap with the Laravel framework and ping Google after you've updated it.
ABOUT THE AUTHOR
Kaustubh has been working with Laravel for several years now and wants to help newbie Laravel developers become awesome with Laravel.
Kaustubh Katdare
Electrical

Do you really need a sitemap?

Short Answer: Yes, you do! In the initial years of CrazyEngineers, we never bothered to have a proper XML sitemap for the site. We thought if we create great content, Google will index our site; and Google did. 

But times changed and so did Google. The search engine got a lot busier, and Google engineers started optimising their crawler for 'meaty' content. In my opinion, it's now more important than ever to proactively tell Google about the new content on your site.

That's where the XML sitemaps help.

Why not use a Composer package?

Sure, you can. However, I think it's super easy and quick to create your own sitemap with just a few lines of code. Give this tutorial a try and let me know your thoughts. 

Starting with Sitemap Template [with Blade!]

Start by creating a Blade template, say sitemap.blade.php, in your views directory. You could use this template -

<?php echo '<?xml version="1.0" encoding="UTF-8"?>'; ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   @foreach ($items as $item)
      <url>
         <loc>{{ $item->permalink }}</loc>
         <lastmod>{{ $item->updated_at->tz('UTC')->toAtomString() }}</lastmod>
         <changefreq>weekly</changefreq>
         <priority>0.8</priority>
      </url>
   @endforeach
</urlset>

We're going to feed the template the items for which we want to build the sitemap. These could be posts, articles, jobs, whatever.

Make sure that each 'item' you pass to the template has a permalink attribute. The permalink should be a fully qualified URL to the entity. Do not make the mistake of using relative URLs.
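
A quick way to catch relative permalinks before they reach the template is to check each one for a scheme and a host. The sketch below is plain PHP and is only an illustration; the helper name isAbsoluteUrl() is hypothetical and not part of Laravel.

```php
<?php
// Hypothetical helper: true only for fully qualified http(s) URLs.
function isAbsoluteUrl(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false) {
        return false; // seriously malformed URL
    }

    // A sitemap entry needs both a scheme and a host.
    return isset($parts['scheme'], $parts['host'])
        && in_array(strtolower($parts['scheme']), ['http', 'https'], true);
}

var_dump(isAbsoluteUrl('https://your-site.com/posts/1')); // bool(true)
var_dump(isAbsoluteUrl('/posts/1'));                      // bool(false)
```

You could run a check like this over your items while building the sitemap and skip or log anything that fails it.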

Create a Job to Update Sitemap

The idea is that we create a job that will process our sitemap every time we want it to be updated - either when a model is created or updated. It's easy to create a job in Laravel. All you need to do is issue the following command -

php artisan make:job UpdateSitemap

After running the above command, head over to the app/Jobs directory, where you should see the UpdateSitemap.php file.

We need to work with the handle() method of our UpdateSitemap job. Let me present the overall structure of the handle method -

// Needs at the top of the file: use GuzzleHttp\Client; use Illuminate\Support\Facades\Storage;
$items = Model::where('valid', true)->orderBy('updated_at', 'DESC')->get();
// Render the Blade view to a string instead of returning it as a response
$sitemap_contents = view('sitemap', compact('items'))->render();
Storage::disk('local')->put('public/sitemaps/sitemap.xml', $sitemap_contents);
$client = new Client(); // Guzzle HTTP client
$client->request('GET', 'http://google.com/ping?sitemap=https://your-site.com/sitemap.xml');

There are a few things to pay attention to in this approach. It differs from generating the sitemap on the fly each time Google fetches it.

Why this approach?

In general, creating a sitemap is a time-consuming process. I therefore do not advise creating it on the fly for every request. Rather, you should build the sitemap in memory once or twice daily and dump it into an XML file.

Laravel's beautiful render() method lets us get the content in a variable ($sitemap_contents) and dump it in the XML file. 
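
To make the "render to a string, then write the file" idea concrete, here is a framework-free sketch in plain PHP. It is not what render() does internally; the function renderSitemap(), the array shape of the items and the temp-file path are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for the Blade view: builds the sitemap as a string.
function renderSitemap(array $items): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

    foreach ($items as $item) {
        // Escape &, <, > and quotes so the URL is valid inside XML.
        $loc = htmlspecialchars($item['permalink'], ENT_XML1 | ENT_QUOTES, 'UTF-8');

        // Timestamp converted to UTC and formatted as an ATOM (W3C) datetime.
        $lastmod = $item['updated_at']
            ->setTimezone(new DateTimeZone('UTC'))
            ->format(DATE_ATOM);

        $xml .= "  <url><loc>{$loc}</loc><lastmod>{$lastmod}</lastmod></url>\n";
    }

    return $xml . "</urlset>\n";
}

// Placeholder data; in the tutorial these would be your models.
$items = [[
    'permalink'  => 'https://your-site.com/posts?id=1&lang=en',
    'updated_at' => new DateTimeImmutable('2019-01-06 10:00:00', new DateTimeZone('UTC')),
]];

// Build once in memory, then dump to a file (a temp path for this sketch).
$sitemap_contents = renderSitemap($items);
$path = sys_get_temp_dir() . '/sitemap.xml';
file_put_contents($path, $sitemap_contents);
```

The point is the same as in the job: the expensive work happens once, and every later request for the sitemap is just a static file read.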

Ping Google After Generating XML Sitemap

The last two lines are even more interesting. After building the sitemap we quickly need to inform Google that we've updated the sitemap and it's ready for Google to access. 

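We therefore initialise the Guzzle HTTP client, which makes it super easy to make GET/POST requests. Google's ping endpoint accepts the entire sitemap URL and acknowledges it with 'OK'. No further processing is required. Note that Google announced in 2023 that it was retiring this ping endpoint, so check Google's current documentation before relying on it.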
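
Since the sitemap URL travels as a query parameter, it is safer to URL-encode it when building the ping URL. A minimal sketch in plain PHP, where buildPingUrl() is a hypothetical helper name and the endpoint is the one used in this tutorial:

```php
<?php
// Hypothetical helper: builds the ping URL with the sitemap URL encoded.
function buildPingUrl(string $sitemapUrl): string
{
    return 'http://google.com/ping?sitemap=' . urlencode($sitemapUrl);
}

echo buildPingUrl('https://your-site.com/sitemap.xml'), "\n";
// http://google.com/ping?sitemap=https%3A%2F%2Fyour-site.com%2Fsitemap.xml
```

You would pass the result to the Guzzle request() call. Wrapping that call in a try/catch is a reasonable precaution so that a failed ping does not fail the whole job.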

Optimizing Sitemap Generation

On a production site, I advise running a task scheduler to update the sitemap. In general, it's advisable to update your sitemap at most once or twice a day. Let me know if you want me to write a detailed tutorial on how to set up task scheduling on production servers.
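
As a rough sketch of what that scheduling could look like, assuming a Laravel version that defines the schedule in app/Console/Kernel.php and the UpdateSitemap job created above (the run times are arbitrary choices):

```php
<?php
// app/Console/Kernel.php (excerpt)

namespace App\Console;

use App\Jobs\UpdateSitemap;
use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule)
    {
        // Dispatch the job at 01:00 and 13:00 every day.
        $schedule->job(new UpdateSitemap)->twiceDaily(1, 13);
    }
}
```

For this to fire, the server needs a cron entry that runs php artisan schedule:run every minute, and a queue worker must be running unless your queue driver is sync.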

If you have questions, ask them below. I'll try to respond as soon as I can.

Replies, Feedback and Questions

Kaustubh Katdare
Electrical

06 Jan, 2019

If you are interested in Laravel and want to connect with fellow Laravel developers from all over the world, we have a Laravel Developers group. Join in: https://www.crazyengineers.com/groups/laravel-developers.32/feed/

Mohit G

07 Jan, 2019

Excellent tutorial. Your approach is quite interesting. My current app has XML sitemaps rendered using DomDocument and it's quite complicated logic. I think I could replace it with your logic. How frequently are you updating your sitemap? And is it necessary to ping Google after rebuilding the sitemap or will Google crawl it on its own?

Kaustubh Katdare
Electrical

07 Jan, 2019

The render() method offered by Laravel is doing all the magic here. DomDocument would be quite complicated because you'd then have to take care of setting the headers as well.

I think frequently pinging Google for each new entry in the sitemap is not encouraged. You could update once or twice daily depending upon the frequency of new content generation and let Google do its own job.

From my experience, Google does crawl the XML sitemap on its own. Google has said time and again that XML sitemaps are not 'strictly' followed and are seen as a 'guideline' document. Google determines its own crawl frequency.

However, from my own observation, Google actually does crawl the XML sitemap as soon as you send an HTTP request (GET).

Aaron Hansel
Marine

15 Jan, 2019

Thank you for the tutorial. Is there any specific reason that you have ordered the models with the updated_at timestamp? I searched Google's documentation and they do not mention anything about the order of entries in XML sitemap.

Another question I have is about memory management while creating sitemap. Your approach seems to be workable when the overall count of models is limited that does not cause PHP to go out of memory. What's your take on it? 

Kaustubh Katdare
Electrical

15 Jan, 2019

Hi Aaron. Nice observation. It's true that Google does not mention anything about the order of entries in the sitemap. Perhaps that is one of the reasons Google pays special attention to the 'lastmod' tag in the sitemap.

Google mentions that the timestamp is used only as guidance to Googlebot about indexing and page update frequency, and that it does not affect the way Google indexes the page. My personal observation, however, is that if you put all the latest entries at the top of the sitemap, Google has to do less crawling of each entry in the XML sitemap.

That's why I recommend sorting the models by the updated_at timestamp. It's a personal choice though.

Second, you are right - if you have a really large number of models, you could run out of memory. But not always. I typically restrict my sitemaps to just about 25,000 entries. Google's official recommendation is that a sitemap should not have more than 50,000 entries per sitemap file.

However, the XML files can quickly grow bigger and cross the 50MB (uncompressed) size limit. I'm exploring other ways of building sitemaps and will update this tutorial with the latest methods. If you know any better method, please let me know. Thank you for the comment.
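
One common way to stay under both limits is to split the entries across several sitemap files and list those files in a sitemap index, which is part of the sitemaps.org protocol. The sketch below is plain PHP under stated assumptions: chunkUrls() and renderSitemapIndex() are hypothetical helpers, the URLs are placeholders and 25,000 is simply the per-file figure mentioned above.

```php
<?php
// Hypothetical helper: split URLs into groups of at most $perFile entries.
function chunkUrls(array $urls, int $perFile = 25000): array
{
    return array_chunk($urls, $perFile);
}

// Hypothetical helper: build a sitemap index that points at each sitemap file.
function renderSitemapIndex(array $sitemapUrls): string
{
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";

    foreach ($sitemapUrls as $url) {
        $loc = htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= "  <sitemap><loc>{$loc}</loc></sitemap>\n";
    }

    return $xml . "</sitemapindex>\n";
}

// 60,000 placeholder URLs end up in three files: 25,000 + 25,000 + 10,000.
$urls = [];
for ($i = 1; $i <= 60000; $i++) {
    $urls[] = "https://your-site.com/posts/{$i}";
}
$chunks = chunkUrls($urls);

// One public URL per generated sitemap file (the naming scheme is an assumption).
$files = [];
foreach (array_keys($chunks) as $n) {
    $files[] = 'https://your-site.com/sitemaps/sitemap-' . ($n + 1) . '.xml';
}
$index = renderSitemapIndex($files);
```

In a Laravel app, reading the models with Eloquent's chunk() or cursor() instead of get() should also help with memory, since the whole table is no longer loaded at once.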