You cannot improve diversity without measuring it. But measuring it incorrectly — by collecting demographic data in ways that are not separated from hiring decisions, or retaining it beyond legal limits — creates legal exposure and ethical harm that can outweigh the benefit. Diversity hiring software solves this by providing the right framework: voluntary, anonymised, separated from evaluation, and legally compliant.
Why Diversity Data Collection Requires Care
The impulse to track diversity in hiring is correct: organisations that cannot measure where they are failing in representation cannot fix it. However, the mechanism of data collection is where many organisations make errors that create regulatory and ethical problems.
The three most common mistakes:
Using demographic data in screening decisions. Any system where a candidate's gender, ethnicity, disability status, or other protected characteristic influences whether they advance in a pipeline is unlawful in both the US and UK, regardless of intent. Trying to raise representation by intervening in individual hiring decisions is both legally prohibited and counterproductive, because it masks the root causes of underrepresentation.
Making self-identification mandatory. Requiring candidates to disclose protected characteristics as a condition of completing an application is illegal in most jurisdictions. Self-identification must be genuinely voluntary with no consequence for declining to answer. Systems that treat "prefer not to say" as a separate demographic category for tracking purposes must be careful that this choice is not inadvertently used in downstream analysis.
Retaining diversity data with candidate files. Demographic data collected for diversity monitoring should be stored separately from recruitment records and hiring decision data. Storing them together creates the structural possibility of contamination — where a reviewer sees demographic information alongside a CV — and creates GDPR retention complexity.
The Core Principle: Measurement Without Influence
Effective diversity tracking is built on one foundational principle: demographic data must be structurally impossible to use in hiring decisions. This is not just a policy statement — it must be enforced at the system level. Software that collects diversity data in a separate module, accessible only to HR analytics roles and never visible in the recruitment pipeline view, is architecturally compliant. Software that stores demographic tags on candidate records is not, regardless of what the policy manual says.
What Diversity Metrics You Should Track (and How)
Meaningful diversity measurement goes beyond counting. The metrics that reveal systemic issues and drive actionable interventions are funnel-based: they show where in the hiring process underrepresentation occurs.
The most valuable diversity metrics to track:
- Application rate by demographic group — are underrepresented groups applying at expected rates given the labour market? Low application rates indicate sourcing and attraction problems, not screening problems.
- Screening advancement rate by demographic group — are application-to-screen advancement rates consistent across demographic groups? Significant disparities indicate potential bias in initial screening criteria or processes.
- Interview-to-offer rate by demographic group — are interviewed candidates receiving offers at consistent rates across demographic groups? Disparities at the interview stage often point to gaps in structured interviewing.
- Offer acceptance rate by demographic group — if certain groups accept offers at lower rates, this may indicate compensation equity issues or cultural environment concerns surfaced during the interview process.
- Time-to-hire by demographic group — significant differences in time-to-hire may indicate process variation that correlates with demographics.
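The funnel metrics listed above can be computed from de-identified records. The following is a minimal sketch, assuming each record is only a (group, furthest stage reached) pair with no candidate identifier. The stage names and the `stage_counts` and `conversion_rates` helpers are hypothetical and used for illustration only.

```python
from collections import Counter

# Hypothetical funnel stages, ordered from earliest to latest.
STAGES = ["applied", "screened", "interviewed", "offered", "accepted"]


def stage_counts(records):
    """Count, per group, how many candidates reached each stage.

    `records` is an iterable of (group, furthest_stage) pairs that carry
    no candidate identifiers.
    """
    counts = {}
    for group, furthest in records:
        # A candidate who reached a stage also passed every earlier stage.
        reached = STAGES[: STAGES.index(furthest) + 1]
        counts.setdefault(group, Counter()).update(reached)
    return counts


def conversion_rates(counts):
    """Return stage-to-stage conversion rates per group.

    A rate is None when nobody in the group reached the earlier stage.
    """
    rates = {}
    for group, c in counts.items():
        rates[group] = {}
        for earlier, later in zip(STAGES, STAGES[1:]):
            denom = c[earlier]
            rates[group][f"{earlier}->{later}"] = (c[later] / denom) if denom else None
    return rates


# Illustrative data only; "A" and "B" are placeholder group labels.
records = [
    ("A", "applied"), ("A", "screened"), ("A", "offered"), ("A", "accepted"),
    ("B", "applied"), ("B", "applied"), ("B", "screened"), ("B", "interviewed"),
]
rates = conversion_rates(stage_counts(records))
```

Comparing the same transition across groups (for example `rates["A"]["applied->screened"]` against `rates["B"]["applied->screened"]`) shows where in the funnel a gap opens. Sample sizes this small are for illustration only.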
Statistical Significance Matters
Diversity metrics are only meaningful at sufficient sample sizes. A company hiring 10 people per year cannot draw statistically valid conclusions from its demographic data. Organisations with fewer than 50 annual hires should aggregate data across 12–18 months before drawing conclusions. Reporting on samples that are too small creates false precision and can lead to the wrong interventions.
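One way to guard against over-reading small samples is to refuse to compare rates below a minimum group size, and to test whether a difference is larger than chance would explain. The sketch below uses a standard two-proportion z-test. The `MIN_GROUP_SIZE` value of 30 is an illustrative assumption, not a legal or statistical standard, and the function name is hypothetical.

```python
import math

MIN_GROUP_SIZE = 30  # assumption: illustrative threshold only


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two advancement rates.

    Returns (z, p_value), or None when either group is smaller than
    MIN_GROUP_SIZE or the pooled rate is 0 or 1 (the test is undefined).
    """
    if min(total_a, total_b) < MIN_GROUP_SIZE:
        return None
    pooled = (success_a + success_b) / (total_a + total_b)
    if pooled in (0.0, 1.0):
        return None
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (success_a / total_a - success_b / total_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


too_small = two_proportion_z_test(3, 5, 1, 5)          # groups of 5: no result
large_gap = two_proportion_z_test(60, 100, 40, 100)    # 60% vs 40% advancement
```

The z-test relies on a normal approximation, so for small counts an exact test such as Fisher's exact test is usually the better choice.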
Self-Identification: The Right Way to Collect Demographic Data
Self-identification forms are the accepted mechanism for collecting demographic data in hiring. The following principles apply in both US and UK contexts:
Voluntary with genuine choice. Every question must include a "Prefer not to say" or "Decline to identify" option that is clearly presented and carries no implied consequence. The application process must be completable without answering any self-identification question.
Separated from the main application. Self-identification questions should appear after the core application is submitted, or on a clearly distinct separate section with explicit explanation of purpose. They must not appear alongside CV upload, work history, or screening questions.
Purpose explained. Candidates must be informed why their data is being collected, how it will be used, who will have access to it, and how long it will be retained. This is a GDPR requirement in the UK and EU, and best practice in the US under EEOC guidance.
Access controlled. Only HR analytics roles should have access to aggregate diversity reports. Individual demographic records should not be visible to hiring managers, recruiters, or anyone involved in making hiring decisions.
Aggregated reporting only. Diversity dashboards should show aggregate statistics — "42% of applicants identifying as women advanced to phone screen" — not individual-level demographic data linked to specific candidates.
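The collection principles above can be sketched as a small data model. This is a minimal illustration, assuming an in-memory store. `SelfIdResponse`, `SelfIdStore`, and the three fields are hypothetical names, not any vendor's API. Every field defaults to declining, and responses are keyed by a random reference rather than by candidate name or application ID. How that reference is linked to funnel stages for aggregate reporting is out of scope here.

```python
import secrets
from dataclasses import dataclass

PREFER_NOT_TO_SAY = "Prefer not to say"


@dataclass(frozen=True)
class SelfIdResponse:
    """One voluntary response; every field defaults to declining."""
    sex: str = PREFER_NOT_TO_SAY
    ethnicity: str = PREFER_NOT_TO_SAY
    disability: str = PREFER_NOT_TO_SAY


class SelfIdStore:
    """Hypothetical store kept apart from applicant tracking records."""

    def __init__(self):
        self._responses = {}

    def submit(self, response):
        # A random reference stands in for any name or application ID.
        reference = secrets.token_hex(16)
        self._responses[reference] = response
        return reference

    def answered(self, field_name):
        """Return only the values where the candidate chose to answer."""
        return [
            getattr(r, field_name)
            for r in self._responses.values()
            if getattr(r, field_name) != PREFER_NOT_TO_SAY
        ]


store = SelfIdStore()
ref = store.submit(SelfIdResponse())              # candidate declined everything
store.submit(SelfIdResponse(sex="Woman"))         # candidate answered one question
```

Because an empty `SelfIdResponse()` is valid, the application flow never has to block on these questions.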
Separating Diversity Data from Hiring Decisions
The structural separation of diversity data from hiring decision data is the most critical architectural requirement. This means:
- Demographic data is stored in a separate database table or system from candidate records
- The ATS pipeline view does not display demographic information on candidate cards
- Hiring managers and recruiters have no access to diversity reports that could be cross-referenced with individual applications
- Scoring and ranking algorithms have no access to demographic data inputs
- Diversity reports are available only in aggregate, with minimum cell sizes (typically n≥5) to prevent re-identification of individuals
Organisations that collect demographic data but do not enforce this structural separation are technically collecting the data for diversity tracking while creating the conditions for it to be misused. The legal protection of a voluntary self-ID process is substantially weakened if the data is accessible to people making hiring decisions.
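The minimum cell size rule in the list above is simple to enforce before any report is displayed. A minimal sketch, assuming per-group counts arrive as a dictionary; the function name is hypothetical and the threshold of 5 follows the n≥5 example given above.

```python
MIN_CELL_SIZE = 5  # follows the n>=5 example above; adjust to your own policy


def suppress_small_cells(counts, min_cell_size=MIN_CELL_SIZE):
    """Replace any group count below the threshold with None before display."""
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Illustrative counts only.
report = suppress_small_cells({"Group A": 42, "Group B": 5, "Group C": 3})
```

If a report also publishes the overall total, a single suppressed cell can be recovered by subtraction, so a real implementation may need to suppress a second cell or withhold the total as well.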
| Data Type | Where It Should Live | Who Should Access It |
|---|---|---|
| CV/resume and application responses | Candidate record (ATS pipeline) | Recruiter, hiring manager, interview panel |
| Interview scorecards and notes | Candidate record (ATS pipeline) | Interview panel, recruiter, HR |
| Self-identification demographic data | Separate diversity module, anonymised | HR analytics, DEI leadership only — never hiring panel |
| Aggregate diversity metrics | Diversity reporting dashboard | HR leadership, DEI committee, executive team |
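The table above can be expressed as a deny-by-default access policy. The sketch below is illustrative only: the role and data-type identifiers are hypothetical labels derived from the table, and a production system would enforce this in the database and API layers rather than in a single function.

```python
# Data type -> roles allowed to read it, following the table above.
ACCESS_POLICY = {
    "application": {"recruiter", "hiring_manager", "interview_panel"},
    "scorecards": {"interview_panel", "recruiter", "hr"},
    "self_id": {"hr_analytics", "dei_leadership"},
    "aggregate_metrics": {"hr_leadership", "dei_committee", "executive"},
}


def can_access(role, data_type):
    """Deny by default: unknown roles and unknown data types get no access."""
    return role in ACCESS_POLICY.get(data_type, set())


hiring_manager_sees_self_id = can_access("hiring_manager", "self_id")
analytics_sees_self_id = can_access("hr_analytics", "self_id")
```

Keeping the policy in one explicit structure makes it easy to audit that no hiring-panel role appears against `self_id`.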
US: EEOC Voluntary Self-ID Requirements
In the United States, EEOC regulations govern how employers collect demographic data. Key requirements:
EEO-1 reporting: Private employers with 100 or more employees, and federal contractors with 50 or more employees, must file annual EEO-1 reports that break down their workforce by race/ethnicity, sex, and job category. Employers typically gather this data through voluntary self-identification during hiring or onboarding.
Voluntary basis: The EEOC mandates that demographic data collection be voluntary. Candidates must be informed that the information is used for statistical purposes only and will not affect consideration for employment.
Race/ethnicity categories: The EEOC uses standardised categories: Hispanic or Latino, White, Black or African American, Native Hawaiian or Other Pacific Islander, Asian, American Indian or Alaska Native, Two or More Races. Employers should use these categories consistently for EEO-1 reporting compatibility.
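Using the categories consistently is easier when they are defined once and validated on input. A minimal sketch, assuming free-text labels arrive from a form; the `EeoRaceEthnicity` enum and `parse_category` helper are hypothetical names, and the labels are the ones listed above.

```python
from enum import Enum


class EeoRaceEthnicity(Enum):
    """Race/ethnicity labels as listed above."""
    HISPANIC_OR_LATINO = "Hispanic or Latino"
    WHITE = "White"
    BLACK_OR_AFRICAN_AMERICAN = "Black or African American"
    NATIVE_HAWAIIAN_OR_OTHER_PACIFIC_ISLANDER = "Native Hawaiian or Other Pacific Islander"
    ASIAN = "Asian"
    AMERICAN_INDIAN_OR_ALASKA_NATIVE = "American Indian or Alaska Native"
    TWO_OR_MORE_RACES = "Two or More Races"


def parse_category(label):
    """Return the matching category, or None for a declined or unrecognised answer."""
    try:
        return EeoRaceEthnicity(label.strip())
    except ValueError:
        return None


known = parse_category(" Asian ")
declined = parse_category("Prefer not to say")
```

Returning `None` for a declined answer keeps "prefer not to say" out of the category set itself.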
Federal contractors — Section 503: Federal contractors are additionally required to collect voluntary disability self-identification data under Section 503 of the Rehabilitation Act, with a 7% disability utilisation goal.
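VEVRAA: The Vietnam Era Veterans' Readjustment Assistance Act requires covered federal contractors to invite applicants to self-identify as protected veterans, using the specific protected veteran categories. The related VETS-4212 report is filed with the Department of Labor's Veterans' Employment and Training Service.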
UK: Equality Act Monitoring Best Practice
In the United Kingdom, diversity monitoring is not legally mandated for most private sector employers, but the Equality Act 2010 creates strong incentives to implement it correctly.
Protected characteristics under the Equality Act 2010: age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, sexual orientation. Monitoring should cover the characteristics most relevant to your organisation's inclusion priorities, typically race/ethnicity, sex, disability, and age.
GDPR compliance: Data revealing racial or ethnic origin, health, and sexual orientation is "special category data" under GDPR. Collecting it requires a lawful basis under Article 6 (typically legitimate interests) plus a separate Article 9 condition, such as explicit consent or a substantial public interest condition for equality monitoring. It must also be accompanied by a DPA-compliant privacy notice with explicit information about purpose, retention period, and data subject rights.
Retention limits: GDPR does not prescribe a fixed retention period, but its storage limitation principle requires that data be kept no longer than necessary. For diversity data collected from unsuccessful candidates, employers typically define a retention period of 6–12 months post-rejection. Systems should enforce automatic deletion or anonymisation at the retention limit.
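Automatic deletion at the retention limit can be sketched as a filter run on a schedule. This is a minimal illustration, assuming each record is a dictionary with a timezone-aware `rejected_at` timestamp. The `purge_expired` name and the 365-day default are assumptions, with the default chosen to match the upper end of the 6–12 month range.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)  # assumption: upper end of the 6-12 month range


def purge_expired(records, now=None, retention=RETENTION):
    """Return only the records still inside the retention period.

    Each record is a dict with a timezone-aware `rejected_at` datetime.
    """
    now = now or datetime.now(timezone.utc)
    return [r for r in records if now - r["rejected_at"] <= retention]


# Illustrative records only.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = [
    {"ref": "a1", "rejected_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
    {"ref": "b2", "rejected_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
]
kept = purge_expired(records, now=now)
```

In a real system the same cutoff would drive a delete or anonymise operation in the diversity data store, and the run would be logged for audit.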
The public sector equality duty: Public sector employers are required to monitor and publish diversity data as part of their Public Sector Equality Duty obligations. This creates a higher standard of transparency.
UK vs US: Key Differences in Diversity Data
US EEOC reporting uses standardised race/ethnicity categories and is legally mandated for employers over certain size thresholds. UK diversity monitoring is largely voluntary (except for public sector employers) but subject to stricter data protection requirements under GDPR. Organisations with operations in both markets should configure separate data collection forms that meet each jurisdiction's specific requirements, and ensure that data is not transferred across borders in ways that breach GDPR's international transfer rules, such as the adequacy requirements.
How Treegarden's EEO Module Handles Diversity Data
Treegarden's EEO and diversity module is designed around the principle of structural separation. Here is how it works:
- Separate collection flow — self-identification forms are presented to candidates after core application submission, in a clearly distinct interface with purpose explanation
- Genuinely voluntary — every field includes "Prefer not to say"; the application processes identically regardless of responses
- Anonymised storage — demographic responses are stored with a reference identifier, not the candidate's name or identifying information
- Pipeline inaccessible — diversity data does not appear in the candidate pipeline view at any stage; hiring managers and interviewers cannot see it
- Aggregate reporting only — the diversity dashboard shows funnel metrics by demographic group, with minimum cell size protection to prevent re-identification
- US EEOC-compatible categories — race/ethnicity, sex, disability (Section 503), and veteran status categories align with EEO-1 reporting requirements
- UK Equality Act categories — configurable to include UK-specific monitoring categories appropriate to the Equality Act 2010 protected characteristics
- GDPR retention enforcement — automatic deletion of diversity data at the configured retention period for unsuccessful candidates
Frequently Asked Questions
Can we use diversity data to increase representation in our shortlist?
No. Using demographic data to influence who is shortlisted or advanced in a hiring process constitutes unlawful discrimination in both the US (Title VII) and UK (Equality Act 2010), even with positive intent. The correct approach is to identify where in the funnel underrepresentation occurs and address the root causes — sourcing channels, job description language, interview panel composition — rather than using demographic data as a selection input.
Do we need to file an EEO-1 report?
US private employers with 100 or more employees are required to file an annual EEO-1 Component 1 report with the EEOC. Federal contractors and subcontractors with 50 or more employees and contracts of $50,000 or more are also required to file. If you meet either threshold, EEO-1 reporting is mandatory, not optional.
Is collecting diversity data under GDPR legal?
Yes, with appropriate legal basis. Race, ethnicity, health, and similar characteristics are "special category data" under GDPR, requiring explicit consent or an alternative lawful basis such as substantial public interest. The collection must be accompanied by a clear privacy notice, the data must be stored securely and separately from other personal data, and it must be deleted at the end of the lawful retention period.
What is the difference between EEO forms and diversity monitoring?
EEO forms (US) are the specific voluntary self-identification forms mandated by the EEOC for federal contractors and recommended for all covered employers. They collect race/ethnicity, sex, disability, and veteran status in standardised categories for EEO-1 reporting. Diversity monitoring is a broader concept that may include additional dimensions (age, sexual orientation, religion) beyond EEO form categories, used for internal inclusion analysis.
How long should we retain diversity data from unsuccessful candidates?
In the US, EEOC recordkeeping requirements mandate retention of employment records (including applications) for at least one year from the date of the personnel action. For federal contractors, this extends to two years. In the UK, GDPR guidance suggests retaining unsuccessful candidate data for no longer than 6–12 months, with diversity data subject to the same or shorter retention periods given its sensitivity as special category data.
Building a Compliant Diversity Tracking Programme
Effective diversity tracking is not about collecting more data — it is about collecting the right data in the right way, storing it with appropriate controls, and using it to identify systemic problems rather than to influence individual hiring decisions.
The organisations that do this well share a common approach: structural separation enforced at the system level, voluntary collection with genuine optionality, aggregate reporting only, and regular funnel analysis to identify where in the hiring process interventions are needed.
Treegarden's EEO module provides this framework out of the box — compliant with both EEOC requirements and GDPR, structurally separated from the hiring pipeline, and reporting on the funnel metrics that reveal actionable insights. Book a demo to see the diversity module in action alongside the core ATS pipeline.