Coffee Room
Discuss anything here - everything that you wish to discuss with fellow engineers.
12893 Members
The_Small_k • May 23, 2017

How to generate a sitemap?

Hello Engineers,
I have some questions about sitemaps.
  1. Even without a sitemap, Google's crawlers can detect my website's pages, but they don't seem to crawl more than 250 of them. So how important is a sitemap for a website?

  2. If I don't provide a sitemap, is there a limit on how many pages Google's crawler can detect (it seems capped at about 250 pages for my site)?

  3. My website has 1000+ pages and grows by about 2 pages/day. Do I have to write my own dynamic sitemap generator, or is ready-made code available?

  4. Since I will have many more pages in the future, is one sitemap enough, or do I have to split it into multiple files? If so, is there a standard design I can follow, e.g. one sitemap per section of the site?
If you find any informative links on these questions, please share.
Kaustubh Katdare • May 23, 2017
The_Small_k wrote:
So how important is a sitemap for a website?
Sitemaps are very important, but not absolutely necessary. A sitemap is basically a list of URLs that tells search engines which pages exist on your site, making it easier for them to crawl it.

As long as every page on your site is linked from an existing page, you're all set. As you'd guess, Google won't magically figure out that a page exists on your domain unless you link to it from another page or include it in the sitemap.
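To make "a list of URLs" concrete, here is a minimal sketch (the URLs are placeholders, not any real site's pages) that builds a tiny sitemap.xml in the standard sitemaps.org format:

```python
# Build a minimal sitemap.xml: an <urlset> of <url>/<loc> entries.
# The example.com URLs below are illustrative only.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    ET.register_namespace("", NS)            # serialize without a prefix
    root = ET.Element("{%s}urlset" % NS)
    for page in pages:
        url = ET.SubElement(root, "{%s}url" % NS)
        ET.SubElement(url, "{%s}loc" % NS).text = page
    return ET.tostring(root, encoding="unicode")

sitemap = build_sitemap([
    "https://example.com/",
    "https://example.com/about",
])
print(sitemap)
```

Save the output as sitemap.xml at the site root and point Google Search Console at it.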

The_Small_k wrote:
My website has 1000+ pages and grows by about 2 pages/day. Do I have to write my own dynamic sitemap generator, or is anything ready-made available?
Most modern content management systems offer this functionality out of the box or through plugins. I'd strongly recommend using WordPress or a similar CMS to manage your online content.

The_Small_k wrote:
Since I will have many more pages in the future, is one sitemap enough, or do I have to split it into multiple files? If so, is there a standard design I can follow, e.g. one sitemap per section of the site?
The sitemap protocol allows up to 50,000 URLs (and 50 MB uncompressed) per sitemap file. Beyond that, you build a 'master' sitemap index file that lists all the individual sitemap files, and submit that to Google.

There are sitemap validators available online. But I'd recommend against building your own from scratch; no point in reinventing the wheel. Existing solutions do a fantastic job.
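The splitting itself is easy to automate. As a minimal sketch (the domain and file names are made up for illustration): chunk the URL list at the protocol's 50,000-URL per-file limit, and list the resulting files in one index:

```python
# Split a large URL list into sitemap files of at most 50,000
# entries each (the sitemaps.org per-file limit), plus the list
# of index entries pointing at them. Illustrative names only.
MAX_URLS = 50000

def split_into_sitemaps(urls, base="https://example.com"):
    chunks = [urls[i:i + MAX_URLS] for i in range(0, len(urls), MAX_URLS)]
    index = ["%s/sitemap-%d.xml" % (base, n) for n in range(1, len(chunks) + 1)]
    return chunks, index

urls = ["https://example.com/page/%d" % i for i in range(120000)]
chunks, index = split_into_sitemaps(urls)
# 120,000 URLs -> three sitemap files, all listed in one index
```

Each entry in `index` would become a `<sitemap><loc>` element in the master sitemapindex file.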

I hope this answers your questions.
The_Small_k • May 30, 2017
Thanks for your answer, sir.
Kaustubh Katdare wrote:
As long as every page on your site is linked from an existing page, you're all set. As you'd guess, Google won't magically figure out that a page exists on your domain unless you link to it from another page or include it in the sitemap.
By the above statement, do you mean that drilling down to pages through links makes it hard for the crawler to reach them, so we should prefer pages that are rich with internal links?

Otherwise:
Let's take CrazyEngineers as an example. It has thousands of pages, all linked directly or indirectly from the domain's home page; so, per the statement quoted above, putting just the domain name in the sitemap should be enough to index every page.

As a best practice, which approach should be followed?
Kaustubh Katdare • May 30, 2017
The_Small_k wrote:
By the above statement, do you mean that drilling down to pages through links makes it hard for the crawler to reach them, so we should prefer pages that are rich with internal links?
Think of Googlebot as a human visiting your site: to reach any page, it must find a link to it. It's like a treasure hunt. The bot lands on the index page, which links to pages A, B, C and D; it visits each of them; each of those in turn links to further pages, say A1, A2, A3 and B1, B2, and so on. The crawler keeps following links. Google determines which pages to index and how often to revisit them.

We've no control over it.

If there's a page on your site, say 'Z', that isn't linked from any other page, Google will likely never find it. Best practice is to keep all your content-rich pages updated regularly with fresh content.
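The treasure hunt described above is essentially a breadth-first walk over the site's link graph. A toy sketch (the page names and links are hypothetical) shows why an orphan page like 'Z' is never discovered:

```python
from collections import deque

# Toy link graph: each page maps to the pages it links to.
links = {
    "index": ["A", "B"],
    "A": ["A1", "A2"],
    "B": ["B1"],
    "Z": [],          # orphan page: no other page links to it
}

def crawl(start):
    """Breadth-first traversal, like a crawler following links."""
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

reachable = crawl("index")
# "Z" is absent from reachable: no crawled page links to it
```

A sitemap entry is what rescues such orphan pages: it hands the crawler a starting point the link graph alone would never reveal.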

The_Small_k wrote:
Let's take CrazyEngineers as an example. It has thousands of pages, all linked directly or indirectly from the domain's home page; so, per the statement quoted above, putting just the domain name in the sitemap should be enough to index every page.
At CrazyEngineers, we have an auto-generated sitemap that lists tens of thousands of links. We handle it on the server side: the sitemap is regenerated automatically and fed to Google. In most cases, we've seen Google visit within 3 seconds of a page being updated.

Rely on automation for tasks like sitemap generation. Don't even think of doing it manually.
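The "fed to Google" step can itself be automated. A hypothetical sketch (the sitemap URL is illustrative, and the 'ping' endpoint shown was Google's documented notification mechanism at the time of this thread; it has since been deprecated in favour of regular re-fetching):

```python
from urllib.parse import urlencode

# After regenerating sitemap.xml on the server, build the URL
# that notifies Google of the update. The request itself is left
# commented out so this sketch has no network side effects.
def ping_url(sitemap_url):
    return "https://www.google.com/ping?" + urlencode({"sitemap": sitemap_url})

url = ping_url("https://www.crazyengineers.com/sitemap.xml")
# import urllib.request; urllib.request.urlopen(url)  # actually send it
```

Hook a call like this into whatever job regenerates the sitemap, and search engines learn about new pages without any manual step.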
