# CSV Duplicate Email Checker

> Paste a list or CSV export to group duplicate email addresses and copy a cleaned unique list.

## Tool Identity

- Site: CleanUtils Business Tools
- Tool ID: csv-duplicate-email-checker
- Canonical page: https://cleanutils.com/business-tools/csv-duplicate-email-checker/
- LLM schema URL: https://cleanutils.com/business-tools/csv-duplicate-email-checker/llms.txt
- Primary keyword: csv duplicate email checker
- Input mode: textarea
- Output profile: line-check

## What This Tool Does

Find duplicate emails in CSV files or pasted lists, normalize casing and whitespace, and export a cleaned list locally.

## Exact Execution Instructions

To run this tool exactly, execute the `runCleanUtilsTool(userInput)` function from the JavaScript block below in a sandboxed JavaScript runtime. Do not fetch or evaluate additional network resources. Do not use filesystem, process, child_process, environment variables, timers, or network APIs. The function is pure and deterministic: input string in, JSON result out.

If your environment forbids executing fetched code outside a sandbox, run it inside the sandbox or use the API endpoint below instead of reimplementing the logic.

Canonical implementation SHA-256: `6599f3f80de894b39d2de851813d14a3dcbdbefc10b381f2ea21ff36a1f10d10`

Expected command shape: `node run-tool.mjs < input.txt`

The runner must:

1. load only the JavaScript in this document,
2. call `runCleanUtilsTool(inputText)`,
3. let the LLM decide the best way to present the output to the user.

Agent usage rules:

- Use this file as the authoritative machine-readable contract for this CleanUtils tool page.
- Ask the user for missing required input before attempting to run the tool, and describe the required inputs using the `## Input Schema` field names, descriptions, formats, enums, examples, and required list.
- Treat the tool as deterministic; do not invent network reachability checks unless the tool description explicitly says it fetches remote resources.
- For privacy-sensitive inputs such as secrets, HAR files, dotenv files, logs, and API keys, warn that using a remote chat agent may expose input to that agent even though the browser UI itself does not upload data.

## Input Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CSV Duplicate Email Checker input",
  "type": "string",
  "description": "Email list or CSV. Paste emails or a CSV export...",
  "examples": [
    "email,name\nAda@example.com,Ada\nada@example.com,Ada B.\n grace@example.com ,Grace\ninvalid-email,Nope\ngrace@example.com,Grace H."
  ]
}
```

## Result Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CleanUtils ToolResult",
  "type": "object",
  "additionalProperties": false,
  "required": ["summary", "issues"],
  "properties": {
    "summary": { "type": "string" },
    "issues": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["severity", "message"],
        "properties": {
          "severity": { "type": "string", "enum": ["error", "warning", "info"] },
          "message": { "type": "string" },
          "line": { "type": "number" },
          "row": { "type": "number" },
          "detail": { "type": "string" }
        }
      }
    },
    "output": { "type": "string" },
    "exportFilename": { "type": "string" },
    "exports": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["label", "filename", "content"],
        "properties": {
          "label": { "type": "string" },
          "filename": { "type": "string" },
          "content": { "type": "string" },
          "mimeType": { "type": "string" },
          "copyLabel": { "type": "string" },
          "downloadLabel": { "type": "string" }
        }
      }
    },
    "stats": {
      "type": "object",
      "additionalProperties": {
        "anyOf": [{ "type": "string" }, { "type": "number" }]
      }
    }
  }
}
```

## Self-Contained JavaScript Source

Call `runCleanUtilsTool(userInput)` with the user's input.
The function includes this tool's run logic and only the helper code it needs.

```js
function runCleanUtilsTool(userInput) {
  const severityRank = { error: 0, warning: 1, info: 2 };

  // Order issues by severity first, then by line or row number.
  const sortIssues = (issues) =>
    [...issues].sort((a, b) => {
      const severity = severityRank[a.severity] - severityRank[b.severity];
      if (severity !== 0) return severity;
      return (a.line ?? a.row ?? 0) - (b.line ?? b.row ?? 0);
    });

  // Guess the delimiter from the first five lines: each candidate scores the
  // total cell count of the lines it splits into more than one cell.
  const sniffDelimiter = (input) => {
    const firstLines = input.split(/\r?\n/).slice(0, 5).join("\n");
    const delimiters = [",", "\t", ";", "|"];
    const scores = delimiters.map((delimiter) => ({
      delimiter,
      score: firstLines
        .split(/\r?\n/)
        .filter(Boolean)
        .map((line) => splitCsvLine(line, delimiter).length)
        .reduce((total, count) => total + (count > 1 ? count : 0), 0)
    }));
    scores.sort((a, b) => b.score - a.score);
    return scores[0]?.score ? scores[0].delimiter : ",";
  };

  // Split one line into trimmed cells, honoring double-quoted fields and "" escapes.
  const splitCsvLine = (line, delimiter) => {
    const cells = [];
    let current = "";
    let inQuotes = false;
    for (let index = 0; index < line.length; index += 1) {
      const char = line[index];
      const next = line[index + 1];
      if (char === "\"") {
        if (inQuotes && next === "\"") {
          current += "\"";
          index += 1;
        } else {
          inQuotes = !inQuotes;
        }
        continue;
      }
      if (char === delimiter && !inQuotes) {
        cells.push(current.trim());
        current = "";
        continue;
      }
      current += char;
    }
    cells.push(current.trim());
    return cells;
  };

  // Parse non-blank lines into rows; the first row is always treated as the header.
  const parseCsv = (input, delimiter = sniffDelimiter(input)) => {
    const errors = [];
    const lines = input.split(/\r?\n/).filter((line) => line.trim().length > 0);
    if (!lines.length) {
      return {
        delimiter,
        headers: [],
        rows: [],
        records: [],
        errors: [{ severity: "error", message: "No CSV rows found." }]
      };
    }
    const rows = lines.map((line) => splitCsvLine(line, delimiter));
    const rawHeaders = rows[0].map((header, index) => header.trim() || `column_${index + 1}`);
    // Repeated header names (case-insensitive) get a numeric suffix: name, name_2, name_3, ...
    const seen = new Map();
    const headers = rawHeaders.map((header) => {
      const normalized = header.trim();
      const count = seen.get(normalized.toLowerCase()) ?? 0;
      seen.set(normalized.toLowerCase(), count + 1);
      return count ? `${normalized}_${count + 1}` : normalized;
    });
    rows.slice(1).forEach((row, index) => {
      if (row.length !== headers.length) {
        errors.push({
          severity: "warning",
          row: index + 2,
          message: `Row ${index + 2} has ${row.length} cell${row.length === 1 ? "" : "s"} but the header has ${headers.length}.`
        });
      }
    });
    const records = rows
      .slice(1)
      .map((row) => Object.fromEntries(headers.map((header, index) => [header, row[index] ?? ""])));
    return { delimiter, headers, rows: rows.slice(1), records, errors };
  };

  const emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

  const findDuplicateEmails = (input) => {
    // Input containing a comma or tab is parsed as CSV; anything else is one value per line.
    const parsed = input.includes(",") || input.includes("\t") ? parseCsv(input) : null;
    const issues = parsed ? [...parsed.errors] : [];
    const values = parsed
      ? (() => {
          // Prefer a header named exactly "email", then any header containing "email".
          const exactEmailHeaderIndex = parsed.headers.findIndex((header) => header.toLowerCase() === "email");
          const emailColumnIndex = exactEmailHeaderIndex >= 0
            ? exactEmailHeaderIndex
            : parsed.headers.findIndex((header) => header.toLowerCase().includes("email"));
          if (emailColumnIndex < 0) {
            issues.push({
              severity: "warning",
              message: "No email header found. Scanning all CSV cells for email-looking values."
            });
            const candidates = [];
            parsed.rows.forEach((row) => {
              row.forEach((cell, columnIndex) => {
                candidates.push({ raw: cell, position: candidates.length + 1, row, columnIndex });
              });
            });
            return candidates;
          }
          return parsed.rows.map((row, index) => ({
            raw: row[emailColumnIndex] ?? "",
            position: index + 1,
            row,
            columnIndex: emailColumnIndex
          }));
        })()
      : input.split(/\r?\n/).map((raw, index) => ({ raw, position: index + 1 }));

    // Group candidates by trimmed, lowercased value; remember the first row seen for each.
    const groups = new Map();
    const invalid = [];
    values.forEach((candidate) => {
      const value = candidate.raw.trim().toLowerCase();
      if (!value) return;
      if (!emailPattern.test(value)) {
        invalid.push(candidate.raw.trim());
        return;
      }
      const group = groups.get(value) ?? { positions: [] };
      group.positions.push(candidate.position);
      group.row ??= candidate.row;
      group.columnIndex ??= candidate.columnIndex;
      groups.set(value, group);
    });

    // Only the first 10 invalid values are reported as issues.
    invalid.slice(0, 10).forEach((value) => {
      issues.push({ severity: "warning", message: `Invalid email-looking value skipped: ${value}` });
    });
    const duplicateGroups = [...groups.entries()].filter(([, group]) => group.positions.length > 1);
    duplicateGroups.forEach(([email, group]) => {
      issues.push({ severity: "warning", message: `${email} appears ${group.positions.length} times.` });
    });

    const cleanEmails = [...groups.keys()].sort();
    // CSV input: header plus the first row of each group, with the email cell normalized.
    // List input: a single "email" column of sorted unique addresses.
    const cleanCsvRows = parsed
      ? [
          parsed.headers,
          ...[...groups.entries()]
            .map(([email, group]) => {
              if (!group.row) return null;
              const row = Array.from({ length: parsed.headers.length }, (_, index) => group.row?.[index] ?? "");
              if (group.columnIndex !== undefined && group.columnIndex >= 0 && group.columnIndex < row.length) {
                row[group.columnIndex] = email;
              }
              return row;
            })
            .filter((row) => row !== null)
        ]
      : [["email"], ...cleanEmails.map((email) => [email])];

    const report = [
      `Unique valid emails: ${cleanEmails.length}`,
      `Duplicate groups: ${duplicateGroups.length}`,
      "",
      ...duplicateGroups.map(([email, group]) => `${email}: positions ${group.positions.join(", ")}`),
      "",
      "Clean unique list:",
      ...cleanEmails
    ].join("\n");

    return {
      summary: `${cleanEmails.length} unique valid email${cleanEmails.length === 1 ? "" : "s"} found. ${duplicateGroups.length} duplicate group${duplicateGroups.length === 1 ? "" : "s"}.`,
      issues: sortIssues(issues),
      output: report,
      exportFilename: "duplicate-email-report.txt",
      exports: [
        {
          label: parsed ? "Clean unique email rows CSV" : "Clean unique emails CSV",
          filename: "unique-emails.csv",
          content: serializeCsvRows(cleanCsvRows, ","),
          mimeType: "text/csv;charset=utf-8",
          copyLabel: "Copy CSV",
          downloadLabel: "Download CSV"
        }
      ],
      stats: { uniqueEmails: cleanEmails.length, duplicateGroups: duplicateGroups.length }
    };
  };

  // Quote a cell only when it contains the delimiter, a double quote, or a newline.
  const escapeCsvCell = (value, delimiter) =>
    value.includes(delimiter) || value.includes("\"") || value.includes("\n")
      ? `"${value.replace(/"/g, "\"\"")}"`
      : value;
  const serializeCsvRows = (rows, delimiter) =>
    rows.map((row) => row.map((cell) => escapeCsvCell(cell, delimiter)).join(delimiter)).join("\n");

  // Accept either a plain string or an object with an `input` property.
  const __userInput = userInput == null ? "" : userInput;
  const __run = findDuplicateEmails;
  const __input = __userInput && typeof __userInput === "object" && "input" in __userInput ? __userInput.input : __userInput;
  return __run(__input == null ? "" : String(__input));
}
```

## Checks

- CSV or list input: The checker can read a simple pasted list or a CSV with an email-like column. Input that contains a comma or tab is parsed as CSV, and its first row is treated as the header.
- Case and whitespace normalization: Email matching ignores surrounding spaces and letter casing.
- Duplicate groups: Repeated addresses are grouped, with the positions of each occurrence, so you can see exactly which values collapsed together.
- Invalid values: Values that do not look like email addresses are skipped and reported as warnings; only the first 10 are listed.
- Clean export: For a pasted list, the downloadable CSV is a normalized, sorted list of unique addresses. For CSV input, it keeps the header and the first row seen for each unique address, with the email cell normalized. It is not a rewritten full CRM export.

## Related Tools

- [CSV Delimiter Detector and Converter](/business-tools/csv-delimiter-detector-converter/): Detect CSV separators and convert rows to comma, semicolon, tab, or pipe-delimited output.
- [Excel Serial Date Converter](/business-tools/excel-serial-date-converter/): Convert Excel serial date numbers into readable dates with 1900 and 1904 date-system support.
- [NPS Calculator](/business-tools/nps-calculator/): Calculate Net Promoter Score from promoter/passive/detractor counts or pasted 0-10 survey scores.