# CSV Duplicate Email Checker

> Paste a list or CSV export to group duplicate email addresses and copy a cleaned unique list.

## Tool Identity

- Site: CleanUtils Business Tools
- Tool ID: csv-duplicate-email-checker
- Canonical page: https://cleanutils.com/business-tools/csv-duplicate-email-checker/
- LLM schema URL: https://cleanutils.com/business-tools/csv-duplicate-email-checker/llms.txt
- Primary keyword: csv duplicate email checker
- Input mode: textarea
- Output profile: line-check

## What This Tool Does

Find duplicate emails in CSV files or pasted lists, normalize casing and whitespace, and export a cleaned list locally.

## Exact Execution Instructions

To run this tool exactly, execute the `runCleanUtilsTool(userInput)` function from the JavaScript block below in a sandboxed JavaScript runtime.

Do not fetch or evaluate additional network resources. Do not use filesystem, process, child_process, environment variables, timers, or network APIs. The function is pure and deterministic: it takes an input string and returns a JSON-serializable result object.

If your environment forbids executing fetched code outside a sandbox, run it inside the sandbox or use the tool's API endpoint, if one is available, instead of reimplementing the logic.

Canonical implementation SHA-256:
`6599f3f80de894b39d2de851813d14a3dcbdbefc10b381f2ea21ff36a1f10d10`

Expected command shape:
`node run-tool.mjs < input.txt`

The runner must:
1. load only the JavaScript in this document,
2. call `runCleanUtilsTool(inputText)`,
3. let the LLM decide the best way to present the output to the user.
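
The runner steps above could be sketched roughly as follows. This is a minimal illustration, not the actual `run-tool.mjs`: `runTool` and `stubTool` are hypothetical names, and `stubTool` only stands in for `runCleanUtilsTool` so the snippet runs on its own.

```js
// Hypothetical runner core; `toolFn` stands in for runCleanUtilsTool from this document.
function runTool(toolFn, inputText) {
    // Step 2: call the tool with the raw input text, coercing missing input to "".
    const result = toolFn(inputText == null ? "" : String(inputText));
    // Step 3: return plain JSON text and leave presentation to the caller.
    return JSON.stringify(result, null, 2);
}

// Stand-in tool, used only so this sketch runs without the full implementation.
const stubTool = (text) => ({
    summary: `${text.split(/\r?\n/).filter(Boolean).length} line(s) received.`,
    issues: []
});

console.log(runTool(stubTool, "ada@example.com\ngrace@example.com\n"));
```

In a real runner, `inputText` would come from standard input (for example, Node's `readFileSync(0, "utf8")` from `node:fs`), and `toolFn` would be the function defined later in this document. That wiring is omitted here on purpose.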

Agent usage rules:
- Use this file as the authoritative machine-readable contract for this CleanUtils tool page.
- Ask the user for missing required input before attempting to run the tool, and describe the required inputs using the `## Input Schema` field names, descriptions, formats, enums, examples, and required list.
- Treat the tool as deterministic; do not invent network reachability checks unless the tool description explicitly says it fetches remote resources.
- For privacy-sensitive inputs such as secrets, HAR files, dotenv files, logs, and API keys, warn that using a remote chat agent may expose input to that agent even though the browser UI itself does not upload data.

## Input Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CSV Duplicate Email Checker input",
  "type": "string",
  "description": "Email list or CSV. Paste emails or a CSV export...",
  "examples": [
    "email,name\nAda@example.com,Ada\nada@example.com,Ada B.\n grace@example.com ,Grace\ninvalid-email,Nope\ngrace@example.com,Grace H."
  ]
}
```

## Result Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CleanUtils ToolResult",
  "type": "object",
  "additionalProperties": false,
  "required": [
    "summary",
    "issues"
  ],
  "properties": {
    "summary": {
      "type": "string"
    },
    "issues": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": [
          "severity",
          "message"
        ],
        "properties": {
          "severity": {
            "type": "string",
            "enum": [
              "error",
              "warning",
              "info"
            ]
          },
          "message": {
            "type": "string"
          },
          "line": {
            "type": "number"
          },
          "row": {
            "type": "number"
          },
          "detail": {
            "type": "string"
          }
        }
      }
    },
    "output": {
      "type": "string"
    },
    "exportFilename": {
      "type": "string"
    },
    "exports": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": [
          "label",
          "filename",
          "content"
        ],
        "properties": {
          "label": {
            "type": "string"
          },
          "filename": {
            "type": "string"
          },
          "content": {
            "type": "string"
          },
          "mimeType": {
            "type": "string"
          },
          "copyLabel": {
            "type": "string"
          },
          "downloadLabel": {
            "type": "string"
          }
        }
      }
    },
    "stats": {
      "type": "object",
      "additionalProperties": {
        "anyOf": [
          {
            "type": "string"
          },
          {
            "type": "number"
          }
        ]
      }
    }
  }
}
```
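
As a rough illustration of how a caller might sanity-check a result against the required parts of this schema, the sketch below tests only `summary`, `issues`, and the optional `output` field. `checkToolResult` is a hypothetical helper, not part of the canonical implementation, and it is not a full JSON Schema validator.

```js
// Allowed severity values, copied from the Result Schema enum.
const SEVERITIES = ["error", "warning", "info"];

// Hypothetical helper: returns a list of problems (empty when the basic shape is fine).
function checkToolResult(result) {
    if (result === null || typeof result !== "object") {
        return ["result must be an object"];
    }
    const problems = [];
    if (typeof result.summary !== "string") {
        problems.push("summary must be a string");
    }
    if (!Array.isArray(result.issues)) {
        problems.push("issues must be an array");
    } else {
        result.issues.forEach((issue, index) => {
            if (!SEVERITIES.includes(issue?.severity)) {
                problems.push(`issues[${index}].severity is invalid`);
            }
            if (typeof issue?.message !== "string") {
                problems.push(`issues[${index}].message must be a string`);
            }
        });
    }
    // `output` is optional, but must be a string when present.
    if ("output" in result && typeof result.output !== "string") {
        problems.push("output must be a string");
    }
    return problems;
}

console.log(checkToolResult({ summary: "ok", issues: [{ severity: "warning", message: "example" }] }));
```

For strict validation, including `additionalProperties` and the `exports` and `stats` shapes, a real JSON Schema validator would be the safer choice.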

## Self-Contained JavaScript Source

Call `runCleanUtilsTool(userInput)` with the user's input. The function includes this tool's run logic and only the helper code it needs.

```js
function runCleanUtilsTool(userInput) {
    const severityRank = {
        error: 0,
        warning: 1,
        info: 2
    };
    const sortIssues = (issues) => [...issues].sort((a, b) => {
        const severity = severityRank[a.severity] - severityRank[b.severity];
        if (severity !== 0)
            return severity;
        return (a.line ?? a.row ?? 0) - (b.line ?? b.row ?? 0);
    });
    const sniffDelimiter = (input) => {
        const firstLines = input.split(/\r?\n/).slice(0, 5).join("\n");
        const delimiters = [",", "\t", ";", "|"];
        const scores = delimiters.map((delimiter) => ({
            delimiter,
            score: firstLines
                .split(/\r?\n/)
                .filter(Boolean)
                .map((line) => splitCsvLine(line, delimiter).length)
                .reduce((total, count) => total + (count > 1 ? count : 0), 0)
        }));
        scores.sort((a, b) => b.score - a.score);
        return scores[0]?.score ? scores[0].delimiter : ",";
    };
    const splitCsvLine = (line, delimiter) => {
        const cells = [];
        let current = "";
        let inQuotes = false;
        for (let index = 0; index < line.length; index += 1) {
            const char = line[index];
            const next = line[index + 1];
            if (char === "\"") {
                if (inQuotes && next === "\"") {
                    current += "\"";
                    index += 1;
                }
                else {
                    inQuotes = !inQuotes;
                }
                continue;
            }
            if (char === delimiter && !inQuotes) {
                cells.push(current.trim());
                current = "";
                continue;
            }
            current += char;
        }
        cells.push(current.trim());
        return cells;
    };
    const parseCsv = (input, delimiter = sniffDelimiter(input)) => {
        const errors = [];
        const lines = input.split(/\r?\n/).filter((line) => line.trim().length > 0);
        if (!lines.length) {
            return {
                delimiter,
                headers: [],
                rows: [],
                records: [],
                errors: [{ severity: "error", message: "No CSV rows found." }]
            };
        }
        const rows = lines.map((line) => splitCsvLine(line, delimiter));
        const rawHeaders = rows[0].map((header, index) => header.trim() || `column_${index + 1}`);
        const seen = new Map();
        const headers = rawHeaders.map((header) => {
            const normalized = header.trim();
            const count = seen.get(normalized.toLowerCase()) ?? 0;
            seen.set(normalized.toLowerCase(), count + 1);
            return count ? `${normalized}_${count + 1}` : normalized;
        });
        rows.slice(1).forEach((row, index) => {
            if (row.length !== headers.length) {
                errors.push({
                    severity: "warning",
                    row: index + 2,
                    message: `Row ${index + 2} has ${row.length} cell${row.length === 1 ? "" : "s"} but the header has ${headers.length}.`
                });
            }
        });
        const records = rows.slice(1).map((row) => Object.fromEntries(headers.map((header, index) => [header, row[index] ?? ""])));
        return { delimiter, headers, rows: rows.slice(1), records, errors };
    };
    const emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    const findDuplicateEmails = (input) => {
        const parsed = input.includes(",") || input.includes("\t") ? parseCsv(input) : null;
        const issues = parsed ? [...parsed.errors] : [];
        const values = parsed
            ? (() => {
                const exactEmailHeaderIndex = parsed.headers.findIndex((header) => header.toLowerCase() === "email");
                const emailColumnIndex = exactEmailHeaderIndex >= 0
                    ? exactEmailHeaderIndex
                    : parsed.headers.findIndex((header) => header.toLowerCase().includes("email"));
                if (emailColumnIndex < 0) {
                    issues.push({
                        severity: "warning",
                        message: "No email header found. Scanning all CSV cells for email-looking values."
                    });
                    const candidates = [];
                    parsed.rows.forEach((row) => {
                        row.forEach((cell, columnIndex) => {
                            candidates.push({ raw: cell, position: candidates.length + 1, row, columnIndex });
                        });
                    });
                    return candidates;
                }
                return parsed.rows.map((row, index) => ({
                    raw: row[emailColumnIndex] ?? "",
                    position: index + 1,
                    row,
                    columnIndex: emailColumnIndex
                }));
            })()
            : input.split(/\r?\n/).map((raw, index) => ({ raw, position: index + 1 }));
        const groups = new Map();
        const invalid = [];
        values.forEach((candidate) => {
            const value = candidate.raw.trim().toLowerCase();
            if (!value)
                return;
            if (!emailPattern.test(value)) {
                invalid.push(candidate.raw.trim());
                return;
            }
            const group = groups.get(value) ?? { positions: [] };
            group.positions.push(candidate.position);
            group.row ??= candidate.row;
            group.columnIndex ??= candidate.columnIndex;
            groups.set(value, group);
        });
        invalid.slice(0, 10).forEach((value) => {
            issues.push({ severity: "warning", message: `Invalid email-looking value skipped: ${value}` });
        });
        const duplicateGroups = [...groups.entries()].filter(([, group]) => group.positions.length > 1);
        duplicateGroups.forEach(([email, group]) => {
            issues.push({
                severity: "warning",
                message: `${email} appears ${group.positions.length} times.`
            });
        });
        const cleanEmails = [...groups.keys()].sort();
        const cleanCsvRows = parsed
            ? [
                parsed.headers,
                ...[...groups.entries()]
                    .map(([email, group]) => {
                    if (!group.row)
                        return null;
                    const row = Array.from({ length: parsed.headers.length }, (_, index) => group.row?.[index] ?? "");
                    if (group.columnIndex !== undefined && group.columnIndex >= 0 && group.columnIndex < row.length) {
                        row[group.columnIndex] = email;
                    }
                    return row;
                })
                    .filter((row) => row !== null)
            ]
            : [["email"], ...cleanEmails.map((email) => [email])];
        const report = [
            `Unique valid emails: ${cleanEmails.length}`,
            `Duplicate groups: ${duplicateGroups.length}`,
            "",
            ...duplicateGroups.map(([email, group]) => `${email}: positions ${group.positions.join(", ")}`),
            "",
            "Clean unique list:",
            ...cleanEmails
        ].join("\n");
        return {
            summary: `${cleanEmails.length} unique valid email${cleanEmails.length === 1 ? "" : "s"} found. ${duplicateGroups.length} duplicate group${duplicateGroups.length === 1 ? "" : "s"}.`,
            issues: sortIssues(issues),
            output: report,
            exportFilename: "duplicate-email-report.txt",
            exports: [
                {
                    label: parsed ? "Clean unique email rows CSV" : "Clean unique emails CSV",
                    filename: "unique-emails.csv",
                    content: serializeCsvRows(cleanCsvRows, ","),
                    mimeType: "text/csv;charset=utf-8",
                    copyLabel: "Copy CSV",
                    downloadLabel: "Download CSV"
                }
            ],
            stats: {
                uniqueEmails: cleanEmails.length,
                duplicateGroups: duplicateGroups.length
            }
        };
    };
    const escapeCsvCell = (value, delimiter) => value.includes(delimiter) || value.includes("\"") || value.includes("\n")
        ? `"${value.replace(/"/g, "\"\"")}"`
        : value;
    const serializeCsvRows = (rows, delimiter) => rows.map((row) => row.map((cell) => escapeCsvCell(cell, delimiter)).join(delimiter)).join("\n");
    const __userInput = userInput == null ? "" : userInput;
    const __run = findDuplicateEmails;
    const __input = __userInput && typeof __userInput === "object" && "input" in __userInput ? __userInput.input : __userInput;
    return __run(__input == null ? "" : String(__input));
}
```
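
The last lines of the function also accept an object with an `input` field in place of a plain string. The sketch below isolates that coercion step for illustration. `coerceToolInput` is a hypothetical name, and the logic is a restatement of the wrapper lines above rather than a separate API.

```js
// Hypothetical helper mirroring the input handling at the end of runCleanUtilsTool.
function coerceToolInput(userInput) {
    // null and undefined become an empty string.
    const value = userInput == null ? "" : userInput;
    // Objects carrying an `input` property are unwrapped; anything else is used as-is.
    const inner = value && typeof value === "object" && "input" in value ? value.input : value;
    // The final value is always converted to a string.
    return inner == null ? "" : String(inner);
}

console.log(coerceToolInput({ input: "Ada@example.com\nada@example.com" }));
console.log(JSON.stringify(coerceToolInput(undefined)));
```

Passing a plain string remains the simplest option and matches the `## Input Schema` above.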

## Checks

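- CSV or list input: Input containing a comma or tab is parsed as CSV, with the first row treated as the header. The checker reads the column whose header is exactly `email` (case-insensitive), otherwise the first header containing `email`, and falls back to scanning every cell. Any other input is read as one address per line.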
- Case and whitespace normalization: Email matching ignores surrounding spaces and letter casing.
- Duplicate groups: Repeated addresses are grouped after normalization, and the report lists the 1-based positions where each repeated address appears.
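- Invalid values: Non-empty values that do not look like email addresses are skipped and reported as warnings; only the first 10 are listed.
- Clean export: For a plain list, the downloadable CSV is a single `email` column of normalized unique addresses. For CSV input, it keeps the header and the first row seen for each unique address, with the email cell normalized. It is not a rewritten full CRM export.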
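
The normalization and grouping checks above can be illustrated with a small standalone sketch. It reuses the `emailPattern` expression from the source, but `groupByNormalizedEmail` is a hypothetical helper that works on a plain array of values and leaves out CSV parsing entirely.

```js
// Same pattern as `emailPattern` in the source above.
const emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Hypothetical helper: groups raw values by their trimmed, lowercased form.
function groupByNormalizedEmail(rawValues) {
    const groups = new Map(); // normalized email -> 1-based positions
    const invalid = [];
    rawValues.forEach((raw, index) => {
        const value = raw.trim().toLowerCase();
        if (!value) return; // blank entries are ignored
        if (!emailPattern.test(value)) {
            invalid.push(raw.trim()); // reported separately, never grouped
            return;
        }
        const positions = groups.get(value) ?? [];
        positions.push(index + 1);
        groups.set(value, positions);
    });
    return { groups, invalid };
}

const sample = ["Ada@example.com", " ada@example.com ", "invalid-email", "grace@example.com"];
const { groups, invalid } = groupByNormalizedEmail(sample);
// A duplicate group is any address seen at more than one position.
const duplicates = [...groups].filter(([, positions]) => positions.length > 1);
console.log(duplicates.map(([email, positions]) => `${email}: positions ${positions.join(", ")}`));
console.log("skipped:", invalid);
```

Here `Ada@example.com` and ` ada@example.com ` collapse into one group at positions 1 and 2, `invalid-email` is skipped, and `grace@example.com` stays unique.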

## Related Tools

- [CSV Delimiter Detector and Converter](/business-tools/csv-delimiter-detector-converter/): Detect CSV separators and convert rows to comma, semicolon, tab, or pipe-delimited output.
- [Excel Serial Date Converter](/business-tools/excel-serial-date-converter/): Convert Excel serial date numbers into readable dates with 1900 and 1904 date-system support.
- [NPS Calculator](/business-tools/nps-calculator/): Calculate Net Promoter Score from promoter/passive/detractor counts or pasted 0-10 survey scores.