There are two types of Sitemap, which differ in format and purpose:
HTML Sitemap. It is compiled for users, to make it easier for them to navigate the pages of the site. It is an optional element: such a map is worth building on a large site with a complex hierarchy.
XML Sitemap. It is intended for search bots and matters for SEO, because it helps bots index the pages of a resource. The file tells search robots how the site is structured, so Yandex, Google, Bing and other search engines can index the project more completely.
In this article, we will look at how to create an XML Sitemap.
What a Sitemap is for:
Speed up the full indexing process.
Google crawlers may miss recently created or modified pages; the Map helps them find those pages.
Provide additional information.
The Map can carry additional hints for the robot: the relative importance of a page, how often it is updated, and the date of its last update.
Display the number of indexed URLs.
Sitemap data is also needed for Search Console to show how many of the submitted links have been indexed.
Speed up the batch de-indexing process.
To do this, create a temporary Sitemap file that lists the pages to be removed. John Mueller has described this approach: he recommends setting the last modification date of each page to the date on which it started returning a 404 error code or was given the noindex attribute. This prompts Google to re-crawl those pages. After a few months, the temporary Sitemap can be removed.
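The optional hints and the last modification date mentioned above correspond to the `lastmod`, `changefreq` and `priority` tags of the Sitemap protocol. Below is a minimal sketch in Python of how such a file could be generated. The function name `build_sitemap`, the `example.com` URLs and the dates are hypothetical and serve only as an illustration; the second entry stands in for a removed page in a temporary Sitemap.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as bytes.

    Each entry is a dict with a required 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys (hypothetical input shape).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        # Optional hints for the robot; a tag is written only when provided.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Illustrative data only: example.com and the dates are assumptions.
entries = [
    {"loc": "https://example.com/", "lastmod": "2024-01-15",
     "changefreq": "daily", "priority": 1.0},
    # A removed page: lastmod is the date it started returning 404.
    {"loc": "https://example.com/old-page", "lastmod": "2024-02-01"},
]
xml_bytes = build_sitemap(entries)
print(xml_bytes.decode("utf-8"))
```

Search engines treat these tags as hints rather than commands, so the sketch writes them only when a value is actually known.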
General requirements for Sitemap:
A single sitemap file must not contain more than 50 thousand URLs. If there are more, create several separate sitemaps of no more than 50 thousand links each and register them in a common parent Sitemap index file. According to an official Google representative, the total number of links across these files may then exceed the limit.
The maximum file size is 50 MB. The file can be compressed with gzip, but it must still be no larger than 50 MB when uncompressed.
The sitemap must be on the same domain as the website for which it was created.
The file must be UTF-8 encoded. Characters in URLs other than Latin letters and numbers must be properly escaped.
The server must respond to a request for the Sitemap file with the HTTP status 200 OK.
URLs in the Sitemap must not contain session identifiers.
All links must use the same syntax, for example the same protocol and the same form of the domain name throughout.
The Map must contain only canonical URLs.
The Map must not conflict with robots.txt: pages that are closed to robots in robots.txt should not appear in the Map.
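Several of these requirements can be checked automatically before the files are published. The following is a rough sketch, not a complete validator: it drops URLs that are on another host, carry a session parameter, or are blocked by robots.txt, then splits the remainder into gzip-compressed files with a parent Sitemap index. The names `is_eligible` and `split_into_sitemaps`, the list of session parameter names, the `sitemap-N.xml.gz` file naming and the `example.com` data are all assumptions made for illustration.

```python
import gzip
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlsplit
from urllib.robotparser import RobotFileParser

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # the limit applies to the uncompressed file
# Assumed parameter names; adjust to whatever the site actually uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def is_eligible(url, site_host, robots):
    """True if the URL is on site_host, has no session id, and is not blocked."""
    parts = urlsplit(url)
    if parts.hostname != site_host:
        return False
    params = {key.lower() for key in parse_qs(parts.query, keep_blank_values=True)}
    if params & SESSION_PARAMS:
        return False
    return robots.can_fetch("*", url)


def split_into_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return ({filename: gzip bytes}, index XML bytes) for the given URLs."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        data = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
        if len(data) > MAX_BYTES:
            raise ValueError("sitemap exceeds 50 MB uncompressed; use fewer URLs per file")
        name = f"sitemap-{start // max_urls + 1}.xml.gz"
        files[name] = gzip.compress(data)
        # Register each file in the parent index.
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = base_url + name
    return files, ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Illustrative data only; robots.txt rules are parsed from lines, not fetched.
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /admin/"])
candidates = [
    "https://example.com/",
    "https://example.com/admin/login",      # blocked by robots.txt
    "https://example.com/blog?sid=abc123",  # session identifier
    "https://other.example.org/",           # different host
    "https://example.com/blog",
]
eligible = [u for u in candidates if is_eligible(u, "example.com", robots)]
# max_urls=1 only to demonstrate splitting on a tiny list.
files, index_xml = split_into_sitemaps(eligible, "https://example.com/", max_urls=1)
```

The sketch does not check canonical URLs or the HTTP response of the published file; those depend on the site itself and would need a separate check against the live server.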