What is an XML Sitemap?

An XML sitemap is a structured file, usually named sitemap.xml, that lists the URLs you want search engines to know about. The “XML” stands for Extensible Markup Language, which is simply a format machines read well. You are writing a clean list of pages in a language crawlers understand.

Search engines find most pages by following links from one page to the next. A sitemap supports that process. Instead of leaving Google to stumble onto your URLs link by link, you hand it a direct, complete list.

Here is what a basic sitemap looks like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.example.com/</loc>
<lastmod>2026-05-14</lastmod>
</url>
<url>
<loc>https://www.example.com/blog/xml-sitemaps/</loc>
<lastmod>2026-06-01</lastmod>
</url>
</urlset>

Three pieces do the real work. The <urlset> wrapper opens the file and declares the protocol. Each <url> block represents one page. The <loc> tag holds that page’s full address. Everything else is optional, and most of it matters far less than people assume.
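To see those three pieces from the reader's side, here is a minimal Python sketch that pulls each <loc> and <lastmod> out of a sitemap using only the standard library. The SAMPLE string and the read_sitemap name are illustrative assumptions, not part of any official tool.

```python
import xml.etree.ElementTree as ET

# Namespace declared by the <urlset> wrapper in the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Illustrative sample only; swap in the contents of your own sitemap.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2026-05-14</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/xml-sitemaps/</loc>
    <lastmod>2026-06-01</lastmod>
  </url>
</urlset>
"""


def read_sitemap(xml_text):
    """Return a list of (loc, lastmod) pairs; lastmod is None when absent."""
    # Encode to bytes so the parser honors the encoding in the XML declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries


for loc, lastmod in read_sitemap(SAMPLE):
    print(loc, lastmod)
```

Notice that the lookup has to name the namespace from <urlset>. Forgetting it is a common reason a quick script finds zero URLs in a perfectly valid file.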

One nice detail: the format is shared. Google, Bing, and other search engines settled on the sitemaps.org standard years ago, so the same file works everywhere without extra effort from you.

What an XML sitemap does (and what it doesn’t)

This is where a lot of beginner advice goes sideways, so let’s be precise.

A sitemap solves exactly one problem: discovery. It gives search engines a tidy list of the URLs you care about, which can speed up how quickly new or updated pages get noticed. That matters most for pages that are brand new, buried deep in your site, or barely linked from anywhere.

Here is the part the old tutorials skip. A sitemap does not guarantee indexing, and it does not influence rankings. Listing a page invites Google to crawl it. Google still decides whether the page is worth crawling, worth indexing, and where it lands in results. The sitemap is an invitation, never a command.

Two situations make that invitation genuinely valuable. On large sites, Google works within a crawl budget, so a sitemap helps it spend that budget on pages that matter rather than wandering. On sites with thin internal linking, a sitemap reaches “orphan” pages that crawlers might otherwise miss for weeks.

If you take away one idea, take this: a sitemap helps Google find your pages. Strong content, fast load times, and solid internal links are what help those pages rank.

XML sitemap vs. HTML sitemap

People mix these up constantly, so here is the quick distinction.

An XML sitemap is built for search engines. Visitors rarely see it, and it usually sits at a predictable address like example.com/sitemap.xml.

An HTML sitemap is a normal web page built for humans, often a links page that helps visitors get around a big site. The two serve different audiences and can coexist happily. When people search for how to create an XML sitemap, they almost always mean the machine-readable version, which is what this guide covers.

Do you actually need an XML sitemap?

Probably yes, but the honest answer is “it depends on your site.” A sitemap helps a lot in some cases and very little in others.

A sitemap genuinely helps when:

  • Your site is large, with hundreds or thousands of pages, where efficient crawling actually pays off.
  • Your site is brand new with few external links, which makes it hard for search engines to find on their own.
  • Your internal linking is weak, leaving pages that are not linked from anywhere else.
  • You publish or update often, and you want changes noticed quickly.
  • You rely on rich media or news, since images, video, and news content can use specialized sitemap formats.

A sitemap matters far less when your site is small, well-organized, and thoroughly linked. A tidy ten-page site where everything sits in the main menu will usually get crawled in full without one.

Even then, the downside of having a sitemap is close to zero, and it takes minutes to set up. Here is the practical reality most guides skip past: modern platforms generate one for you automatically. For many people, the real task is not building a sitemap from scratch. It is confirming one exists and submitting it.

How to create an XML sitemap

There is no single correct method, only the one that fits how your site is built. Pick the path below that matches your setup. Most people should use the first or third option, and you almost never need to write a sitemap by hand.

Method 1: Use your CMS or an SEO plugin (easiest, most common)

If you run a content management system, you very likely do not need to write anything yourself.

On WordPress, you have two layers. The core software has generated a basic sitemap automatically since version 5.5, usually at yourdomain.com/wp-sitemap.xml. For real control, an SEO plugin like Yoast SEO or Rank Math replaces that with a richer sitemap, typically at yourdomain.com/sitemap_index.xml, and updates it every time you publish or edit a page. Install one, confirm the sitemap feature is switched on, and open the URL to see it live.

On Shopify, Wix, and Squarespace, a sitemap is generated for you with no plugin required, usually at yourdomain.com/sitemap.xml. You do not create it. You locate it and submit it.

The CMS route in three steps:

  1. Confirm an SEO plugin or built-in sitemap feature is active.
  2. Visit your sitemap URL in a browser to check it loads and lists your pages.
  3. Submit that URL to Google Search Console.

For most readers, that is the whole job, done in minutes.

Method 2: Use a sitemap generator tool (small or static sites)

If you have a hand-built or static site with no CMS, an online sitemap generator is the fastest route.

These tools crawl your site once you enter your homepage URL, then hand you a downloadable sitemap.xml. The workflow:

  1. Enter your homepage URL into the generator.
  2. Let it crawl your pages and build the file.
  3. Download the resulting sitemap.xml.
  4. Upload it to your site’s root directory so it is reachable at yourdomain.com/sitemap.xml.
  5. Submit it to Google.

One catch: a generated file is a snapshot in time. Every time you add or remove a page, you have to regenerate and re-upload it. That is why this method suits sites that rarely change.

Method 3: Generate it dynamically (developers and large sites)

If your site changes constantly or runs into the thousands of URLs, generating the sitemap dynamically is the durable solution.

Here, your application builds the sitemap on the fly from your database or content source, so it stays current without anyone touching it. Most frameworks support this directly or through a package. Next.js can generate a sitemap from a file in your app, Django ships with a built-in sitemap framework, Laravel handles it through community packages, and many headless setups generate one as part of the build.

The principle is the same on any stack: query your live content, write each URL into the sitemap structure, and serve the file at a stable address. Because it regenerates itself, you never have to remember to update it. Choose this when manual upkeep would become a chore or a liability.
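As a rough illustration of that principle, the sketch below builds a sitemap from a list of pages in plain Python. It assumes your content source can hand you each page's absolute URL and last-modified date; the build_sitemap name and the hard-coded pages list are stand-ins for a real database query, not any framework's API.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML (as bytes) from (absolute_url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped for you
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


# Hypothetical stand-in for "query your live content".
pages = [
    ("https://www.example.com/", date(2026, 5, 14)),
    ("https://www.example.com/blog/xml-sitemaps/", date(2026, 6, 1)),
]

print(build_sitemap(pages).decode("utf-8"))
```

In a real application you would return those bytes from a route at a stable address such as /sitemap.xml, with an XML content type, and likely cache the result so each crawler request does not hit the database.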

Method 4: Write it by hand (tiny sites only)

For a site with a handful of pages, you can write the sitemap yourself in any text editor. Use the structure shown earlier, add one <url> block per page, fill in each <loc>, save the file as sitemap.xml, and upload it to your root directory.

Honest advice: only do this for a small, rarely changing site, or to learn the format once. The moment your page count grows, manual editing turns error-prone, and one of the automated methods above will serve you far better.

Understanding the sitemap tags

You will rarely touch these if a tool generates your sitemap, but knowing what they mean helps you read and troubleshoot one.

  • <loc> (required): The full, absolute URL of the page. Always use the complete address starting with https://, never a relative path. Use the canonical version, and stay consistent about https versus http and www versus non-www.
  • <lastmod> (recommended): The date the page last changed, in W3C Datetime format. The simplest valid form is YYYY-MM-DD, and a full timestamp with a time zone is also allowed. Google does pay attention to this and uses it to decide when to re-crawl a page, but only while your dates stay honest. Stamp every page with today’s date to fake freshness, and Google learns your dates are unreliable and stops trusting them.
  • <changefreq> (optional): A hint about how often a page changes. Google ignores it. Skip it.
  • <priority> (optional): A value from 0.0 to 1.0 suggesting a page’s relative importance. Google ignores this too. Skip it.

If your generator adds changefreq and priority automatically, leave them. You do not need to strip them out, and you should not spend a second managing them by hand. If you get one thing right beyond the URL itself, make it an accurate <lastmod>.

Sitemap size limits and sitemap index files

Sitemaps have hard limits worth knowing as your site grows.

A single sitemap file can hold a maximum of 50,000 URLs and must stay under 50MB uncompressed. Hit either ceiling and you split your URLs across multiple files, then tie them together with a sitemap index: a sitemap that points to other sitemaps.

It looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://www.example.com/sitemap-posts.xml</loc>
<lastmod>2026-06-01</lastmod>
</sitemap>
<sitemap>
<loc>https://www.example.com/sitemap-products.xml</loc>
<lastmod>2026-05-20</lastmod>
</sitemap>
</sitemapindex>

This is exactly what WordPress SEO plugins produce: one index file linking to separate sitemaps for posts, pages, products, and so on. You submit the single index URL to Google, and it discovers every child sitemap from there. Splitting by content type also makes problems easier to diagnose later, because you can see which section a bad URL came from.
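If you generate sitemaps yourself, the splitting step can be sketched in a few lines of Python. This is a minimal example under stated assumptions: it enforces only the 50,000-URL limit (not the 50MB size limit), and the sitemap-1.xml style file names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into lists of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(child_sitemap_urls):
    """Return sitemap index XML (as bytes) pointing at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in child_sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# Example: 120,001 made-up URLs need three child sitemaps.
urls = [f"https://www.example.com/p/{i}" for i in range(120_001)]
chunks = chunk_urls(urls)
children = [
    f"https://www.example.com/sitemap-{n}.xml"  # hypothetical file names
    for n in range(1, len(chunks) + 1)
]
print([len(c) for c in chunks])
print(build_index(children).decode("utf-8"))
```

Each chunk would then be written out as its own sitemap file at the matching address, and only the index URL is submitted.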

How to submit your sitemap to Google

Creating a sitemap is only half the job. Telling search engines where to find it is the other half. Two methods work well, and you should use both.

1. Submit it in Google Search Console

  1. Sign in to Google Search Console and select your verified property.
  2. Open the Sitemaps report from the left menu.
  3. Enter your sitemap path, usually sitemap.xml or sitemap_index.xml, in the “Add a new sitemap” field.
  4. Click Submit.

Google fetches the file and, over the following days, reports how many URLs it discovered and flags any errors. Check back after a day or two to confirm it processed cleanly. If you make big changes later, you can resubmit the same URL to nudge a faster re-crawl.

2. Reference it in your robots.txt file

Add one line to the robots.txt file at your site’s root:

Sitemap: https://www.example.com/sitemap.xml

This points every crawler, not just Google, to your sitemap automatically, Bing included. Together, these two steps make your sitemap easy to find and easy to monitor.
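To double-check that the line is picked up, Python's standard robots.txt parser can read it back. The sketch below works on a sample robots.txt string; the /admin/ rule and the helper names are illustrative assumptions. It requires Python 3.8 or later for site_maps().

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content; the /admin/ rule is only illustrative.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://www.example.com/sitemap.xml
"""


def declared_sitemaps(robots_text):
    """Return the sitemap URLs named by Sitemap: lines (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.site_maps() or []  # site_maps() returns None when nothing is declared


def is_crawlable(robots_text, url, user_agent="*"):
    """Check that the robots.txt rules do not disallow the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


for sitemap_url in declared_sitemaps(ROBOTS_TXT):
    print(sitemap_url, "crawlable:", is_crawlable(ROBOTS_TXT, sitemap_url))
```

The second helper is a cheap guard against the classic mistake of declaring a sitemap in robots.txt while a Disallow rule blocks the same file.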

Skip the dead “ping” method

For years, tutorials told you to “ping” Google by loading a special URL whenever your sitemap changed. Google retired that endpoint in 2023, and those requests now return an error. If an old plugin is still pinging away, it does no harm, but it accomplishes nothing. Stick with Search Console and robots.txt.

One aside: Bing and several other engines support a protocol called IndexNow that notifies them the instant a page changes. Google does not use IndexNow, so for Google specifically, the two methods above are what matter.

Which pages belong in your sitemap?

A sitemap is a statement of intent. Every URL in it says, “This is a page I want in search results.” So be selective.

Include only pages that are:

  • Indexable. Leave out anything tagged noindex, blocked in robots.txt, or pointed elsewhere by a canonical tag.
  • Live and returning a 200 status. No redirects, no 404s, no server errors.
  • Canonical. List the single preferred version of each page, not parameter variants or near-duplicates of the same content.
  • On the same domain. A sitemap should list URLs from the site it lives on, not addresses from another domain.
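If you build your sitemap from crawl data or a database, those four rules translate into a simple filter. The sketch below is one way to express it, assuming you already know each page's status code, noindex flag, and canonical URL; the Page record and its field names are hypothetical, and the robots.txt check is left out for brevity.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    """Hypothetical record of what a crawl or database knows about a URL."""
    url: str
    status: int      # HTTP status code returned for the URL
    noindex: bool    # True if the page carries a noindex directive
    canonical: str   # URL named in the page's canonical tag


def belongs_in_sitemap(page, site_host):
    """Apply the checklist: live, indexable, canonical, and on this host."""
    return (
        page.status == 200
        and not page.noindex
        and page.canonical == page.url
        and urlsplit(page.url).hostname == site_host
    )


pages = [
    Page("https://www.example.com/", 200, False, "https://www.example.com/"),
    Page("https://www.example.com/old/", 301, False, "https://www.example.com/old/"),
    Page("https://www.example.com/thanks/", 200, True, "https://www.example.com/thanks/"),
    Page("https://www.example.com/shoes/?sort=price", 200, False, "https://www.example.com/shoes/"),
]

for page in pages:
    if belongs_in_sitemap(page, "www.example.com"):
        print(page.url)
```

Only the homepage survives here: the other three are a redirect, a noindex page, and a parameter variant whose canonical points elsewhere.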

Some pages should stay out even though they work fine. A thank-you page shown after a newsletter signup adds nothing in search results. Thin tag or category archives that list a single post are not useful yet, though you can add them once they hold enough content to earn a click.

One thing to remember: leaving a URL out of your sitemap does not hide it from Google. If the page is linked anywhere Google can reach, it can still be crawled and indexed. To keep a page out of results, use a noindex tag, not the sitemap.

Common XML sitemap mistakes to avoid

A sloppy sitemap can do more harm than good by sending crawlers to the wrong places. Watch for these:

  • Listing pages that should not be there. Redirected URLs, 404s, and noindex pages all erode Google’s trust in your list.
  • Including non-canonical or duplicate URLs. Parameter variants and near-duplicates waste crawl budget. List one canonical version of each page.
  • Using relative URLs. Always use full, absolute addresses in <loc>.
  • Faking <lastmod> dates. Inaccurate dates train Google to ignore the signal. Update the date only when the content actually changes.
  • Blocking the sitemap in robots.txt. Double-check that your robots.txt does not accidentally disallow the sitemap file or the pages it lists.
  • Ignoring the size limits. Cross 50,000 URLs or 50MB and your sitemap may be read only in part. Split it and use an index.
  • Setting it and forgetting it. A static, hand-built sitemap drifts out of sync with your real content fast. Automate generation wherever you can.

Audit your sitemap every month or two, and these issues stay easy to catch.
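Part of that audit can be automated. The Python sketch below checks a sitemap file for the mistakes that are visible without touching the network: size limits, relative URLs, duplicates, and <lastmod> values that are malformed or in the future. The audit_sitemap name and its messages are assumptions for illustration; catching redirects, 404s, and noindex pages would additionally require fetching each URL.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB uncompressed


def audit_sitemap(xml_bytes, today):
    """Return a list of human-readable problems found in one sitemap file."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file exceeds 50MB uncompressed")
    urls = ET.fromstring(xml_bytes).findall("sm:url", NS)
    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs exceeds the 50,000 limit")
    seen = set()
    for url in urls:
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        if not loc.startswith(("http://", "https://")):
            problems.append(f"relative or missing URL: {loc!r}")
        if loc in seen:
            problems.append(f"duplicate URL: {loc}")
        seen.add(loc)
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        if lastmod:
            try:
                # Only the date part is compared; any time portion is ignored.
                if date.fromisoformat(lastmod.strip()[:10]) > today:
                    problems.append(f"lastmod in the future for {loc}")
            except ValueError:
                problems.append(f"unparseable lastmod {lastmod!r} for {loc}")
    return problems
```

A typical use is to read your sitemap file as bytes, call audit_sitemap(data, date.today()), and treat an empty list as a clean bill of health for these particular checks.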

The bottom line

An XML sitemap is a small file with an outsized job: it hands search engines a clean, current list of the pages you want found. It will not rank your content for you, and it will not force anything into the index. What it does is remove the friction between publishing a page and getting it discovered, which matters most for new, large, and frequently updated sites.

The best part is how little it asks of you. If you use a CMS, you are probably most of the way there already. If you do not, a generator or a dynamic build will handle it.

Your next step takes about five minutes. Open Google Search Console, find your sitemap URL (try yourdomain.com/sitemap.xml first), and confirm it is submitted and error-free. Then add a quick sitemap check to your monthly routine so it never drifts out of date. Your pages cannot rank if they are never found, so give the crawlers the map.