What the AI Match Score actually measures
When a recruiter opens a job pipeline in Treegarden and sees a list of candidates each bearing a percentage score, a reasonable question is: what does that number represent? The answer is specific and important to understand before relying on it.
The AI Match Score measures the degree of alignment between what a candidate has stated in their CV and screening responses and what the job description specifies as requirements. It is a signal about documented fit — not about overall candidate quality, not about likely job performance, and certainly not about the many human dimensions of hiring that written profiles cannot capture.
More precisely, the score reflects how well the AI can match evidence in the candidate's profile to the requirements it has extracted from the job description. A candidate who held a relevant role at a comparable company with the exact tools and skills named in the job description will score high. A strong candidate who writes a minimal CV, comes from an adjacent industry, or describes their experience in language that differs from the job description's terminology may score lower despite being genuinely excellent.
Understanding this distinction — documented fit versus actual candidate quality — is the foundation for using the score well. Recruiters who treat it as a ranking of overall candidate merit will make different, often worse decisions than those who treat it as a filter to help them prioritise where to spend their limited review time.
The score is presented as a percentage alongside a colour-coded indicator in Treegarden's pipeline view: candidates scoring above a configured threshold are highlighted as strong matches, those in the middle range are flagged as worth reviewing, and those below the threshold are available but deprioritised for immediate attention. These thresholds are configurable for each role, which matters because a 70% match on a highly specialised technical role means something different from a 70% match on a generalist administrative role.
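To make the per-role thresholds concrete, here is a minimal sketch of how a percentage could be mapped to the three bands described above. The class name, function name and threshold values are hypothetical and chosen for illustration; they are not Treegarden's actual configuration or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleThresholds:
    """Hypothetical per-role cut-points, in percentage points."""
    strong: float  # at or above this: highlighted as a strong match
    review: float  # at or above this (but below strong): worth reviewing


def band(score: float, thresholds: RoleThresholds) -> str:
    """Map a percentage score to one of three illustrative bands."""
    if score >= thresholds.strong:
        return "strong match"
    if score >= thresholds.review:
        return "worth reviewing"
    return "deprioritised"


# Illustrative values only: the specialist role uses a lower bar.
specialist = RoleThresholds(strong=65, review=45)
generalist = RoleThresholds(strong=80, review=55)

print(band(70, specialist))  # strong match
print(band(70, generalist))  # worth reviewing
```

The same 70% lands in different bands depending on the role's cut-points, which is why thresholds are worth setting per role rather than globally.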
AI Match Score in Treegarden ATS
Every applicant receives an automatic match percentage against the job description, with colour-coded indicators in the pipeline view for immediate prioritisation. Recruiters see at a glance which candidates warrant immediate attention, which sit in the review zone, and which scored below threshold — without having to open a single CV to establish initial order of priority.
How Treegarden calculates the score
The AI Match Score is produced by a multi-step process that begins with the job description itself. When a job is created or updated in Treegarden, the AI parses the job description to extract a structured representation of requirements: specific skills, tools and technologies; required years of experience and seniority level; qualifications and certifications; and contextual signals about role scope and environment.
This requirement extraction is where job description quality has its greatest influence. A job description that says "we need a strong communicator with relevant experience" gives the AI very little to work with. A job description that says "5+ years of B2B SaaS sales experience, proven track record of closing enterprise deals above £100k, experience with Salesforce CRM" gives the AI a well-defined set of requirements to match against. The more specific and structured the job description, the more reliable the scores it generates.
Once requirements are extracted, each candidate's profile is processed in turn. The AI reads the CV and any screening responses, identifying evidence of the extracted requirements: mentions of specific skills, job titles indicating seniority, time periods in relevant roles, named tools and technologies, qualifications and certifications. It then calculates a weighted match score based on how many of the extracted requirements it found evidence for, and how strong that evidence is.
The weighting matters. Requirements the AI identifies as central to the role — typically those mentioned multiple times or with explicit emphasis in the job description — carry more weight than peripheral requirements. A candidate who satisfies the core requirements but lacks some secondary ones will score higher than one who satisfies many secondary requirements but is missing a core one.
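The effect of weighting can be illustrated with a simple weighted average. This is a sketch under stated assumptions, not Treegarden's actual formula: the weights, the 0.0 to 1.0 evidence scale and the names below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    weight: float    # assumed: higher for requirements central to the role
    evidence: float  # assumed: 0.0 (no evidence) to 1.0 (strong evidence)


def match_score(requirements: list[Requirement]) -> float:
    """Weighted average of evidence strength, expressed as a percentage."""
    total_weight = sum(r.weight for r in requirements)
    if total_weight == 0:
        return 0.0
    weighted = sum(r.weight * r.evidence for r in requirements)
    return round(100 * weighted / total_weight, 1)


# Candidate A satisfies the core requirement but lacks the secondary ones.
candidate_a = [
    Requirement("B2B SaaS sales experience", weight=3.0, evidence=1.0),
    Requirement("Salesforce CRM", weight=1.0, evidence=0.0),
    Requirement("Industry certification", weight=1.0, evidence=0.0),
]
# Candidate B satisfies the secondary ones but is missing the core one.
candidate_b = [
    Requirement("B2B SaaS sales experience", weight=3.0, evidence=0.0),
    Requirement("Salesforce CRM", weight=1.0, evidence=1.0),
    Requirement("Industry certification", weight=1.0, evidence=1.0),
]

print(match_score(candidate_a))  # 60.0
print(match_score(candidate_b))  # 40.0
```

Candidate A matches one requirement out of three and Candidate B matches two, yet A scores higher because the requirement A satisfies carries most of the weight.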
The output is a percentage, but thinking of it in broad bands is more useful than focusing on the precise number. A candidate at 82% and one at 78% are essentially indistinguishable from a scoring standpoint: the 4-point difference is within the natural variability of how different candidates document similar experience. What matters is the broad category: clearly strong match, worth reviewing, weak match.
What the score considers: hard skills, experience, context
The AI Match Score draws on several distinct types of signals from the candidate's profile, each contributing differently to the overall result.
Hard skills and technical competencies are the most directly matchable signal. When a job description requires proficiency with Python, SQL and Tableau, and a candidate's CV lists all three with evidence of use, that alignment contributes strongly to the score. Named tools, technologies, methodologies and certifications are high-confidence signals because they can be matched precisely: a CV either names Salesforce CRM or it does not, although naming a tool does not by itself show how deep the experience is.
Experience level and tenure indicators are the second major signal type. The AI looks at years of experience in relevant roles, seniority of the most recent and previous positions, and whether the candidate's career progression suggests the level of expertise the role requires. A job requiring a "senior" professional will score candidates differently based on whether their experience pattern suggests genuine seniority or recent title inflation.
Contextual signals form the third category — and the most nuanced one. These include the type and size of companies the candidate has worked at, the scope and scale of projects described, and whether the industry context of their experience is directly relevant or adjacent. A candidate from a direct competitor or a company of similar scale and complexity in the same sector will typically score higher than one from a very different context, even with identical hard skills.
Screening question responses feed into the score when configured. When candidates answer structured questions about their experience with specific tools, their salary expectations, their notice period or their specific achievements, those responses provide additional evidence beyond the CV that informs the score.
Score Breakdown by Requirement
View which specific job requirements contributed positively or negatively to each candidate's score, enabling transparent evaluation rationale. Rather than a single opaque number, the breakdown shows recruiters exactly where a candidate is strong, where they are weak, and which gaps are meaningful versus which are minor secondary requirements — giving human reviewers the context they need to make a well-informed decision about whether to advance the candidate.
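As an illustration of what a per-requirement breakdown could contain, the sketch below splits a weighted score into points earned and points missed for each requirement. The tuple layout, weights and evidence values are assumptions made for this example and do not describe Treegarden's internal data model.

```python
def breakdown(requirements):
    """Return (name, points_earned, points_missed) rows, largest gap first.

    Each requirement is a (name, weight, evidence) tuple, where evidence runs
    from 0.0 (nothing found) to 1.0 (strong evidence). Both fields are assumed.
    """
    total_weight = sum(weight for _, weight, _ in requirements)
    if total_weight == 0:
        return []
    rows = []
    for name, weight, evidence in requirements:
        share = 100 * weight / total_weight  # this requirement's maximum points
        rows.append((name, round(share * evidence, 1), round(share * (1 - evidence), 1)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


profile = [
    ("Python", 2, 1.0),
    ("SQL", 2, 0.75),
    ("Tableau", 1, 0.0),
]
for name, earned, missed in breakdown(profile):
    print(f"{name}: +{earned} earned, -{missed} missed")
```

Sorting by points missed puts the largest gap at the top, so a reviewer can see at once whether the shortfall sits in a core requirement or a secondary one.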
How to interpret scores correctly in your pipeline
A common mistake with AI scoring is treating the percentage as a precise measure of relative candidate quality. Two candidates at 85% are not identical. One may have scored highly on core technical requirements while missing soft secondary ones; another may have matched well across all categories at a moderate level. The number alone does not tell you this — the breakdown does.
The most productive interpretation of the score is as a triage tool: which candidates warrant immediate review, which should be reviewed after the immediate set, and which require a specific reason to look at them at all. In a typical hiring round, a recruiter might commit to reviewing all candidates above 75% first, then working through the 50-75% range for candidates whose profiles are otherwise interesting, and treating anything below 50% as requiring a specific flag (e.g. the recruiter noticed something in the name or headline) before investing review time.
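The triage order described above can be sketched as three review queues. The function name and the handling of the exact boundary values (75 counts as "review first", 50 as "review next") are assumptions for illustration.

```python
def triage(candidates, first=75, second=50):
    """Split (name, score) pairs into three review queues, highest score first."""
    ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
    return {
        "review first": [name for name, score in ordered if score >= first],
        "review next": [name for name, score in ordered if second <= score < first],
        "needs a specific reason": [name for name, score in ordered if score < second],
    }


pipeline = [("Asha", 62), ("Ben", 88), ("Chloe", 41), ("Dev", 79), ("Elif", 55)]
queues = triage(pipeline)
for label, names in queues.items():
    print(f"{label}: {names}")
```

Nobody is removed from the pipeline here; the function only decides the order in which profiles reach a human reviewer.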
Score distribution also tells you something about the job description and the applicant pool. If nearly all candidates score below 50%, the job description may be written too narrowly, or the role is being advertised in channels that don't reach the right audience. If nearly all candidates score above 80%, either the role requirements are very common in the applicant pool, or the job description is too vague to produce meaningful differentiation. A healthy distribution for a well-specified role typically shows a meaningful spread — a cluster of strong matches, a larger group in the mid-range, and a tail of weak matches.
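A quick distribution check along these lines might look like the following sketch. Treating "nearly all" as 90% of applicants is an assumption, as are the function name and the wording of the returned hints.

```python
def diagnose(scores, low=50, high=80, nearly_all=0.9):
    """Return a rough hint about what a score distribution suggests."""
    if not scores:
        return "no scores yet"
    below = sum(score < low for score in scores) / len(scores)
    above = sum(score > high for score in scores) / len(scores)
    if below >= nearly_all:
        return "mostly weak: description may be too narrow, or channels miss the audience"
    if above >= nearly_all:
        return "mostly strong: requirements may be very common, or description too vague"
    return "meaningful spread"


print(diagnose([30, 42, 45, 20, 48, 35, 41, 38, 44, 49]))
print(diagnose([85, 91, 88, 82, 95, 84, 90, 87, 83, 86]))
print(diagnose([92, 81, 74, 66, 63, 58, 55, 47, 39, 28]))
```

The hint is a prompt to look at the job description and sourcing channels, not a verdict on either.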
It is also worth examining your own shortlisting patterns against the score over time. If you consistently advance candidates from the 60-70% range while rejecting 85%+ candidates, something interesting is happening — either your job description is systematically not capturing what you actually value, or the score is not weighting your priorities correctly. Both are solvable problems, and both start with this kind of reflective analysis.
What the Score Cannot Measure
AI match scores assess what is on the CV: stated experience, qualifications and skills. They cannot measure motivation, cultural fit, leadership potential or the many soft factors that distinguish a good hire from a great one. Treat the score as a filter, not a hiring decision.
The score as a starting point, not a final decision
The most important principle for working with AI Match Scores is the simplest one: the score determines where you look, not what you decide. A recruiter who reviews the 20 highest-scoring candidates and shortlists the top five based on score alone has not used AI to improve their hiring — they have used AI to skip reviewing CVs, which is a different and considerably riskier thing.
The score's value is in focusing recruiter attention on a prioritised set of candidates, not in replacing the human review of those candidates. Every recruiter who uses Treegarden's AI Match Score still needs to read the profiles of the candidates they advance, but they read them in a more efficient order, having already filtered out the candidates with no reasonable prospect of matching the requirements.
This distinction matters especially for the mid-range — candidates scoring between 50% and 70%. In this zone, the score is telling you that the candidate has some relevant background but also significant gaps relative to the job description's stated requirements. The question a human reviewer should ask is: are those gaps real deal-breakers, or are they artefacts of how the candidate documented their experience? A genuinely strong candidate who works in an adjacent field and hasn't used the specific tools mentioned may score 60%, but a thirty-second scan of their profile might reveal experience that is directly applicable. The score surfaces the candidate for review; the human decides what the gap means.
For candidates below the threshold, the score suggests low priority rather than automatic rejection. In high-volume pipelines, some candidates below 50% may be worth a brief look — especially those who applied with a cover letter, those referred internally, or those from specific target companies. The score should govern the order of review, not permanently close a candidate off from consideration without human eyes on their profile.
Score Threshold Filters
Filter pipeline views by score range to focus recruiter attention on high-scoring candidates while keeping low-scoring applicants available for review. Thresholds are configurable per role, allowing teams to set different cut-points for specialist versus generalist roles, senior versus junior positions, and high-volume versus targeted search campaigns — ensuring the filter is always calibrated to the specific hiring context.
Calibrating the score for your specific roles and culture
Default scoring behaviour in any AI system reflects the assumptions embedded in its training. For most common role types — software engineers, sales executives, marketing managers — those defaults will perform reasonably well. For highly specialised roles, roles in niche industries, or roles where your organisation has unusual criteria, calibration is required.
Calibration in Treegarden starts with configuring must-have versus nice-to-have requirements. When a skill or qualification is genuinely non-negotiable (a specific professional certification, a minimum number of years of experience in a regulated domain, a language requirement), marking it as a must-have means the AI treats its absence as a significant negative signal rather than one factor among many. This prevents candidates who are missing a deal-breaker from scoring well overall based on strong matches elsewhere.
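One way to picture the effect of a must-have flag is a cap on the overall score when a must-have requirement has no evidence at all. This is a hypothetical mechanism chosen for illustration; the cap value, tuple layout and function name are assumptions, and Treegarden's actual handling may differ.

```python
def score_with_must_haves(requirements, cap=40.0):
    """Weighted score that is capped when any must-have has no evidence.

    Each requirement is (name, weight, evidence, must_have); evidence is 0.0-1.0.
    """
    total_weight = sum(weight for _, weight, _, _ in requirements)
    if total_weight == 0:
        return 0.0
    base = 100 * sum(weight * evidence for _, weight, evidence, _ in requirements) / total_weight
    missing_must_have = any(
        must_have and evidence == 0.0 for _, _, evidence, must_have in requirements
    )
    return round(min(base, cap) if missing_must_have else base, 1)


# Same candidate, scored with and without the certification marked as a must-have.
flagged = [
    ("Professional certification", 1, 0.0, True),
    ("Excel", 1, 1.0, False),
    ("SAP", 1, 1.0, False),
    ("Team leadership", 1, 1.0, False),
]
unflagged = [(name, weight, evidence, False) for name, weight, evidence, _ in flagged]

print(score_with_must_haves(unflagged))  # 75.0
print(score_with_must_haves(flagged))    # 40.0
```

Without the flag, three strong matches carry the candidate to 75%; with it, the missing certification holds the score at the cap regardless of strengths elsewhere.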
Score threshold calibration is the second lever. After your first round of hiring with AI scoring enabled, review the score distribution of candidates you shortlisted and candidates you rejected. If you consistently advanced candidates who scored below your configured threshold, the threshold is probably set too high. If you advanced very few below-threshold candidates, the threshold may be well-calibrated. This feedback loop is one of the most direct ways to improve score accuracy over time for your specific roles.
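The feedback loop can be reduced to one number: the share of advanced candidates who scored below the configured threshold. The sketch below assumes you can export past decisions as (score, was_advanced) pairs; that export format and the function name are hypothetical.

```python
def below_threshold_share(decisions, threshold):
    """Fraction of advanced candidates whose score was below the threshold."""
    advanced = [score for score, was_advanced in decisions if was_advanced]
    if not advanced:
        return 0.0
    return sum(score < threshold for score in advanced) / len(advanced)


# Illustrative history: (score, was_advanced)
history = [(82, True), (68, True), (64, True), (91, False), (55, False), (71, True)]

print(below_threshold_share(history, threshold=75))  # 0.75
print(below_threshold_share(history, threshold=60))  # 0.0
```

A high share at the current threshold suggests the cut-point sits above where your real shortlisting decisions fall; a share near zero suggests the threshold and your decisions already agree.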
Cultural and contextual criteria are harder to encode but worth attempting. If you specifically value experience at fast-growth companies, or experience managing cross-functional teams rather than just individual contributors, or experience in specific geographic markets, articulating these in the job description gives the AI a better chance of picking them up. The more of your actual evaluation criteria you can make explicit in the job description, the better the score reflects what you actually care about.
Known limitations and edge cases
Being clear-eyed about where AI Match Scores perform poorly is as important as understanding where they perform well. There are several known limitation categories that recruiters should be aware of.
Unconventional career paths are the most significant limitation. Candidates who have built relevant capability through non-standard routes — self-taught, project-based, via freelance work documented poorly on a CV, or through roles with unusual titles — often score lower than their actual capability warrants. Experienced recruiters tend to recognise these profiles immediately; the AI frequently does not.
CV writing quality introduces substantial variability. A well-structured, keyword-rich CV from an average candidate may outscore a sparse, poorly formatted CV from an excellent one. This is a genuine limitation: the score measures how well the candidate documented their experience against the job description, not just the experience itself. Candidates who are poor CV writers are systematically disadvantaged by automated scoring.
Niche or emerging technologies present a specific edge case. If a job description requires experience with a tool or technology that was recently introduced or is specific to a narrow industry segment, the AI may not weight it correctly or may not recognise equivalent adjacent technologies as relevant. Human review is essential for any role where cutting-edge or niche technical requirements are central.
Finally, role fit beyond the documented profile, the dimension that most determines long-term hiring success, is entirely outside the score's reach. Energy, drive, intellectual curiosity, resilience under pressure, cultural alignment: none of these appear on CVs, and none of them feed into the match score. The score is a necessary first filter for high-volume hiring; it is not, and should never be positioned as, a substitute for human evaluation of the whole person.
Frequently asked questions about AI Match Scores
How is the AI Match Score calculated in Treegarden?
Treegarden's AI Match Score is calculated by comparing the content of a candidate's CV and screening responses against the job description. The AI extracts required skills, experience levels, qualifications and contextual signals from the job description and evaluates how closely the candidate's stated background matches each requirement. The output is a percentage score accompanied by a breakdown showing which requirements contributed positively or negatively, so recruiters can see the rationale behind any given score.
What does a high AI Match Score actually mean?
A high AI Match Score — typically 80% or above — indicates that the candidate's stated background aligns closely with the requirements described in the job description. It means the AI found strong evidence of the skills, experience levels and qualifications specified. It does not mean the candidate is necessarily the best hire: motivation, cultural fit, work quality and soft skills are not assessed by the score. High scores should be treated as a reliable signal for prioritising review, not as a hiring recommendation.
Can the AI Match Score be wrong?
Yes. The score can produce false positives (high-scoring candidates who interview poorly) and false negatives (lower-scoring candidates who are actually strong). The most common causes are vague job descriptions that give the AI insufficient signal, CVs that do not accurately reflect what a candidate actually did, unconventional career paths that the AI's pattern recognition undervalues, and roles where the most important requirements are difficult to express in a job description. Using the score as a filter rather than a decision removes most of the risk from these edge cases.
How do I improve the accuracy of AI Match Scores for my roles?
The most effective lever is improving job description quality. Specific, structured job descriptions with clearly stated requirements — including required years of experience, explicit skill names and concrete responsibilities — produce substantially more accurate scores than vague descriptions. Additionally, configuring must-have versus nice-to-have criteria in Treegarden's AI settings and reviewing score threshold calibration after the first batch of candidates has been assessed both improve ongoing accuracy.
Write Better Job Descriptions to Get Better Scores
AI match quality depends heavily on job description quality. Vague descriptions produce unreliable scores. Specific, structured job descriptions with clear requirements produce scores that track recruiter shortlisting decisions far more closely.