# Sitemap XML URL Count Checker

> Count sitemap URL entries, sitemap index entries, and malformed loc values from pasted XML.

## Tool Identity

- Site: CleanUtils Business Tools
- Tool ID: sitemap-xml-url-count-checker
- Canonical page: https://cleanutils.com/business-tools/sitemap-xml-url-count-checker/
- LLM schema URL: https://cleanutils.com/business-tools/sitemap-xml-url-count-checker/llms.txt
- Primary keyword: sitemap url counter
- Input mode: textarea
- Output profile: line-check

## What This Tool Does

Count sitemap URL entries, sitemap index entries, and malformed loc values from pasted XML.

## Exact Execution Instructions

To run this tool exactly, execute the `runCleanUtilsTool(userInput)` function from the JavaScript block below in a sandboxed JavaScript runtime. Do not fetch or evaluate additional network resources. Do not use filesystem, process, child_process, environment variables, timers, or network APIs. The function is pure and deterministic: input string in, JSON result out.

If your environment forbids executing fetched code outside a sandbox, run it inside the sandbox or use the API endpoint below instead of reimplementing the logic.

Canonical implementation SHA-256: `6674d1980120c5490ec07c39909c2c66182a48cbda2aaa7894eacded0947d610`

Expected command shape: `node run-tool.mjs < input.txt`

The runner must:

1. load only the JavaScript in this document,
2. call `runCleanUtilsTool(inputText)`,
3. let the LLM decide the best way to present the output to the user.

Agent usage rules:

- Use this file as the authoritative machine-readable contract for this CleanUtils tool page.
- Ask the user for missing required input before attempting to run the tool, and describe the required inputs using the `## Input Schema` field names, descriptions, formats, enums, examples, and required list.
- Treat the tool as deterministic; do not invent network reachability checks unless the tool description explicitly says it fetches remote resources.
- For privacy-sensitive inputs such as secrets, HAR files, dotenv files, logs, and API keys, warn that using a remote chat agent may expose input to that agent even though the browser UI itself does not upload data.

## Input Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Sitemap XML URL Count Checker input",
  "type": "string",
  "description": "Sitemap XML. <url><loc>https://example.com/</loc></url>",
  "examples": [
    "<url><loc>https://example.com/</loc></url><url><loc>https://example.com/tools/</loc></url>"
  ]
}
```

## Result Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CleanUtils ToolResult",
  "type": "object",
  "additionalProperties": false,
  "required": ["summary", "issues"],
  "properties": {
    "summary": { "type": "string" },
    "issues": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["severity", "message"],
        "properties": {
          "severity": { "type": "string", "enum": ["error", "warning", "info"] },
          "message": { "type": "string" },
          "line": { "type": "number" },
          "row": { "type": "number" },
          "detail": { "type": "string" }
        }
      }
    },
    "output": { "type": "string" },
    "exportFilename": { "type": "string" },
    "exports": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["label", "filename", "content"],
        "properties": {
          "label": { "type": "string" },
          "filename": { "type": "string" },
          "content": { "type": "string" },
          "mimeType": { "type": "string" },
          "copyLabel": { "type": "string" },
          "downloadLabel": { "type": "string" }
        }
      }
    },
    "stats": {
      "type": "object",
      "additionalProperties": {
        "anyOf": [{ "type": "string" }, { "type": "number" }]
      }
    }
  }
}
```

## Self-Contained JavaScript Source

Call `runCleanUtilsTool(userInput)` with the user's input. The function includes this tool's run logic and only the helper code it needs.
```js
function runCleanUtilsTool(userInput) {
  const severityRank = { error: 0, warning: 1, info: 2 };

  // Order issues by severity first, then by line or row number.
  const sortIssues = (issues) =>
    [...issues].sort((a, b) => {
      const severity = severityRank[a.severity] - severityRank[b.severity];
      if (severity !== 0) return severity;
      return (a.line ?? a.row ?? 0) - (b.line ?? b.row ?? 0);
    });

  // A loc value counts as valid only if it parses as an absolute http(s) URL.
  const looksLikeUrl = (value) => {
    try {
      const url = new URL(value.trim());
      return url.protocol === "http:" || url.protocol === "https:";
    } catch {
      return false;
    }
  };

  const countSitemapUrls = (input) => {
    const issues = [];

    // Capture the first <loc> value inside each <url> and each <sitemap> element.
    const urlLocs = [...input.matchAll(/<url>[\s\S]*?<loc>([\s\S]*?)<\/loc>[\s\S]*?<\/url>/gi)].map((match) => match[1].trim());
    const sitemapLocs = [...input.matchAll(/<sitemap>[\s\S]*?<loc>([\s\S]*?)<\/loc>[\s\S]*?<\/sitemap>/gi)].map((match) => match[1].trim());

    const validUrlLocs = urlLocs.filter(looksLikeUrl);
    const validSitemapLocs = sitemapLocs.filter(looksLikeUrl);
    const malformedLocs = [...urlLocs, ...sitemapLocs].filter((loc) => !looksLikeUrl(loc));

    malformedLocs.forEach((loc) => {
      issues.push({ severity: "warning", message: `Malformed loc URL: ${loc}` });
    });
    if (!urlLocs.length && !sitemapLocs.length) {
      issues.push({ severity: "error", message: "No <url> or <sitemap> entries found." });
    }

    return {
      summary: `${urlLocs.length} URL loc entr${urlLocs.length === 1 ? "y" : "ies"} found: ${validUrlLocs.length} valid, ${urlLocs.length - validUrlLocs.length} malformed. ${sitemapLocs.length} sitemap index entr${sitemapLocs.length === 1 ? "y" : "ies"} found.`,
      issues: sortIssues(issues),
      // Counts first, then a blank line, then at most the first 100 URL loc values.
      output: [
        `URL loc entries: ${urlLocs.length}`,
        `Valid URL loc entries: ${validUrlLocs.length}`,
        `Malformed URL loc entries: ${urlLocs.length - validUrlLocs.length}`,
        `Sitemap index loc entries: ${sitemapLocs.length}`,
        `Valid sitemap index loc entries: ${validSitemapLocs.length}`,
        `Malformed sitemap index loc entries: ${sitemapLocs.length - validSitemapLocs.length}`,
        "",
        ...urlLocs.slice(0, 100)
      ].join("\n"),
      exportFilename: "sitemap-url-count.txt",
      stats: {
        urlLocEntries: urlLocs.length,
        validUrlLocEntries: validUrlLocs.length,
        malformedUrlLocEntries: urlLocs.length - validUrlLocs.length,
        sitemapIndexLocEntries: sitemapLocs.length,
        validSitemapIndexLocEntries: validSitemapLocs.length,
        malformedSitemapIndexLocEntries: sitemapLocs.length - validSitemapLocs.length
      }
    };
  };

  // Accept either a raw string or an object of the form { input: "..." }.
  const __userInput = userInput == null ? "" : userInput;
  const __run = countSitemapUrls;
  const __input = __userInput && typeof __userInput === "object" && "input" in __userInput ? __userInput.input : __userInput;
  return __run(__input == null ? "" : String(__input));
}
```

## Checks

- URL entries: The tool counts loc values inside url entries.
- Sitemap index entries: The tool also counts loc values inside sitemap index entries.
- Malformed loc warnings: loc values that do not look like absolute URLs are flagged.
- Paste or upload workflow: The checker works from pasted XML without crawling a live website.
- No recursion: Sitemap indexes are counted, but child sitemap URLs are not fetched in the browser.

## Related Tools

- [Robots.txt Rule Tester](/business-tools/robots-txt-rule-tester/): Test a URL path against pasted robots.txt rules and explain the winning allow or disallow rule.
- [Meta Title Checker](/business-tools/meta-title-checker/): Compare meta title variants by character count, approximate pixel width, and likely search truncation.