What the AI recruiter actually does in your workflow
Before configuring an AI recruiter, it is worth being precise about what it does and does not do in a recruitment workflow. The AI recruiter in Treegarden automates the first-pass screening and ranking of candidates against the requirements of a specific role — the part of the process that traditionally requires a recruiter to read through every application and make an initial judgement about whether it warrants closer review.
In practical terms, this means that when applications arrive, the AI recruiter scores each one against the job description and configured criteria, ranks the pipeline by match quality, surfaces the highest-scoring candidates for immediate recruiter attention, and checks every candidate against the configured knockout criteria, deprioritising those who fail them. The recruiter still makes every actual hiring decision — who to advance, who to interview, who to offer. The AI determines where they look first.
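The shape of that first pass can be sketched in a few lines. The sketch below is purely illustrative: the Application structure and screen function are hypothetical stand-ins, not Treegarden's actual API.

```python
# Minimal sketch of the first-pass screening workflow: score, rank, flag.
# All names are illustrative, not Treegarden's actual API.
from dataclasses import dataclass

@dataclass
class Application:
    candidate: str
    match_score: float      # 0-100, AI match against the job description
    passes_knockouts: bool  # met every configured knockout criterion?

def screen(applications: list[Application]) -> list[Application]:
    """Rank the pipeline so recruiters see the strongest matches first;
    candidates who fail a knockout criterion sink to the bottom."""
    return sorted(
        applications,
        key=lambda a: (a.passes_knockouts, a.match_score),
        reverse=True,
    )

pipeline = screen([
    Application("A. Nguyen", 82.0, True),
    Application("B. Osei", 91.0, False),  # high score, but failed a knockout
    Application("C. Ruiz", 74.0, True),
])
for app in pipeline:
    print(app.candidate, app.match_score, app.passes_knockouts)
```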
This is an important distinction. AI recruiter configuration is not about teaching the system to make hiring decisions; it is about teaching it to direct recruiter attention accurately. A well-configured AI recruiter means recruiters spend their limited review time on the candidates most likely to merit it. A poorly configured one wastes that time by surfacing poor matches at the top of the pipeline and burying strong candidates below them.
Understanding the AI's role in the workflow also clarifies what good configuration looks like. The goal is not to set criteria so tight that the AI does the hiring; it is to set criteria specific enough that the AI's ranking of the pipeline reliably reflects what an experienced recruiter would produce if they reviewed all applications manually. That correlation — between AI ranking and experienced recruiter ranking — is the meaningful measure of configuration quality.
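That correlation can be measured directly rather than estimated by feel. As a sketch, assuming AI scores and a recruiter's manual ratings exist for the same batch of applications (the numbers below are invented), Spearman's rank correlation reduces calibration quality to a single figure:

```python
# Sketch: how closely does the AI's ranking track a recruiter's manual
# ranking of the same applications? Illustrative data only.
from scipy.stats import spearmanr

# Scores for the same ten applications, in the same order.
ai_scores = [88, 81, 79, 74, 70, 66, 62, 58, 51, 44]
recruiter_ratings = [9, 8, 6, 7, 5, 6, 4, 3, 3, 1]  # e.g. 1-10 manual ratings

rho, p_value = spearmanr(ai_scores, recruiter_ratings)
print(f"rank correlation: {rho:.2f}")  # closer to 1.0 = better-calibrated
```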
AI Recruiter Configuration Panel in Treegarden
Set must-have qualifications, experience thresholds, skill weights and screening questions that tune the AI to your specific role requirements. The configuration panel gives recruiting teams direct control over what the AI prioritises and how it ranks candidates — translating role-specific hiring criteria into AI behaviour that reflects the organisation's actual standards rather than generic defaults.
The foundation: job description quality determines AI quality
The AI recruiter configuration process begins with the job description — and no amount of subsequent configuration work compensates for a poor job description at this stage. The job description is the primary input from which the AI extracts the role's requirements; its quality directly determines the quality of the AI's understanding of what a good candidate looks like.
A job description that produces reliable AI screening results has several consistent characteristics. It is specific about requirements rather than vague: "5+ years of B2B enterprise sales experience with consistent quota attainment above 100%" rather than "extensive sales experience." It distinguishes explicitly between required and preferred qualifications, using clear language (required, essential, preferred, desirable) rather than listing everything in a single undifferentiated bullet list. It describes what the role actually does in concrete terms, which gives the AI context to assess whether a candidate's background is relevant to the actual work — not just the listed skills.
Specificity about tools, technologies and methodologies matters particularly for technical and specialist roles. The AI can match candidates who have named experience with specific tools against requirements for those tools; it cannot reliably infer that a candidate's generic "data analysis experience" includes the specific analytical tools the role requires. Where specific tools or platforms are genuinely important, naming them explicitly in the job description produces substantially more accurate screening.
Investing time in job description quality before activating AI screening is the highest-leverage configuration action available. The typical pattern for teams new to AI-assisted screening is to activate the AI with an existing job description, find the results unsatisfactory, and iterate on the configuration settings — when the actual issue is the job description quality. Reviewing and improving the job description first produces better initial results and reduces the iteration burden.
Configuring match criteria: must-haves vs nice-to-haves
The most important configuration decision in the AI recruiter setup is the distinction between must-have and nice-to-have requirements. Getting this distinction right determines whether the AI produces a useful, well-calibrated shortlist or either over-filters (missing strong candidates) or under-filters (surfacing too many weak ones).
A must-have requirement is one where the absence is an absolute disqualifier — a qualification that every candidate must have to be considered for the role. Professional certifications required by regulation (legal qualifications, medical licences, financial adviser certification) are genuine must-haves. Specific legal work rights (right to work in a specific jurisdiction without sponsorship, where sponsorship is not offered) are must-haves. Specific language requirements for roles where the language is essential to the work, not just a preference, are must-haves.
Nice-to-have requirements are everything else — skills, experiences and characteristics that the ideal candidate would have, that contribute positively to their suitability, but whose absence does not disqualify. These should be configured as weighted preferences, not filters. Experience with a preferred tool, industry experience in the target sector, educational background at certain institutions, experience at scale-up versus enterprise companies — all of these are typically preferences rather than requirements.
The most common configuration mistake is marking too many criteria as must-haves. Every requirement marked as must-have becomes a filter that excludes any candidate lacking it — and as must-have criteria accumulate, the pool of candidates who satisfy all of them shrinks exponentially. A role with ten must-haves is likely to have very few matching candidates in any realistic applicant pool, leading to the AI surfacing a tiny subset of applications regardless of their overall quality. Starting with only the genuinely non-negotiable requirements as must-haves and adding everything else as weighted preferences is the recommended configuration approach.
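As an illustration of that split, the hypothetical configuration below (not Treegarden's actual schema) keeps knockout filters to the genuine non-negotiables and expresses everything else as score weights:

```python
# Hypothetical role configuration illustrating the must-have / nice-to-have
# split -- illustrative structure, not Treegarden's actual schema.
role_config = {
    "must_have": [  # knockout filters: absence disqualifies
        "right to work in the UK without sponsorship",
        "financial adviser certification",
    ],
    "preferred": {  # weighted preferences: feed the score, never filter
        "Salesforce experience": 0.30,
        "fintech industry background": 0.25,
        "scale-up experience": 0.20,
        "second European language": 0.15,
        "relevant degree": 0.10,
    },
}
# Weights express relative importance; here they are normalised to sum to 1.
assert abs(sum(role_config["preferred"].values()) - 1.0) < 1e-6
```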
Start with Must-Haves Only
In initial configuration, only specify genuinely non-negotiable requirements as knockout criteria. Over-configuring produces over-filtering. Add nice-to-have weighting after you have seen real candidates come through.
Setting up screening questions that feed the AI
Screening questions are a direct input to the AI's candidate assessment — they provide structured, comparable data points across all applicants that supplement what can be inferred from CVs alone. Well-designed screening questions significantly improve AI accuracy for requirements that CVs document inconsistently or ambiguously.
The best screening questions are those that collect specific, verifiable information the AI can use: years of experience in a specific domain, direct experience with a specific tool or technology, availability or notice period, willingness to meet location or travel requirements, possession of specific certifications. These questions produce unambiguous responses that the AI can factor into scoring with confidence.
Screening questions to avoid are those that ask candidates to self-assess in ways that produce unreliable data. "How would you rate your experience with [skill] on a scale of 1-10?" produces responses that vary enormously based on candidate confidence and cultural norms around self-promotion — a highly competent but modest candidate may score lower than a less competent but more assertive one. Questions that invite candidates to describe competencies in open-ended ways produce long free-text responses that are hard to score consistently.
Knockout questions — those where a specific answer is a disqualifier — should be clearly connected to must-have requirements. If the role genuinely cannot proceed without the candidate meeting a specific criterion, a binary screening question (yes/no: do you have the right to work in this country without sponsorship?) that filters on the answer saves both recruiter time and candidate time. Knockout questions should never be used for preferences — using a knockout question to filter on a nice-to-have is equivalent to making it a must-have, with all the over-filtering consequences that entails.
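A minimal sketch of that logic, with hypothetical names throughout: knockout questions act as binary filters tied to must-haves, while preference questions feed the score and never filter.

```python
# Sketch: a knockout question is a binary filter tied to a must-have.
# Illustrative only; Treegarden's question model may differ.
def passes_knockouts(answers: dict[str, bool], knockouts: list[str]) -> bool:
    """A candidate proceeds only if every knockout question is answered yes."""
    return all(answers.get(question, False) for question in knockouts)

knockout_questions = [
    "Do you have the right to work in this country without sponsorship?",
]
candidate_answers = {
    "Do you have the right to work in this country without sponsorship?": True,
    # A preference, deliberately NOT in the knockout list -- it should
    # contribute to the score instead of filtering:
    "Do you have 3+ years of B2B SaaS sales experience?": False,
}
print(passes_knockouts(candidate_answers, knockout_questions))  # True
```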
Limit screening questions to those that provide genuinely useful information for the AI and the recruiter. Each additional question adds friction for the candidate and increases the application drop-off rate, so the right test for every question added is whether the information it collects is worth the drop-off it causes. Typically, five to seven well-designed questions is the optimal range — enough to provide meaningful additional data without creating a barrier that deters strong candidates from completing the application.
Calibrating score thresholds for your typical applicant pool
Score threshold configuration — deciding at what match percentage a candidate is considered a strong match, a borderline case or a weak match — requires understanding both the role's requirements and the realistic applicant pool for that role.
Different roles attract fundamentally different applicant pools in terms of match quality. A highly specialised technical role advertised in targeted channels will attract a pool where most applicants have genuine relevant experience — and a threshold of 75% might reasonably separate the strong candidates from the weak ones. A broadly advertised generalist role will attract a much more varied pool where the top candidates may only score 65%, because the role's requirements are wide-ranging and few candidates will match perfectly across all dimensions. The threshold that works for one role type will over-filter or under-filter for the other.
Initial thresholds should be set conservatively — erring on the side of reviewing more candidates rather than fewer. As the first batch of applicants is reviewed, comparing the AI's ranking against the recruiter's actual assessment of those candidates produces calibration data. If the recruiter is consistently advancing candidates from the 50-65% range and finding nothing useful in the 65-80% range, the threshold is not well-matched to how this specific role's requirements are expressed in candidates' profiles, and the job description should be reviewed.
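One way to produce that calibration data, sketched here with invented numbers, is to group the first reviewed batch by AI score band and count how often the recruiter actually advanced candidates in each band:

```python
# Sketch: compare recruiter advance decisions against AI score bands.
# Illustrative data only.
from collections import defaultdict

# (ai_score, recruiter_advanced) for the first reviewed batch
reviewed = [(84, True), (78, True), (72, False), (68, True),
            (61, True), (58, False), (55, True), (49, False)]

bands = defaultdict(lambda: [0, 0])  # band -> [advanced, total]
for score, advanced in reviewed:
    decade = score // 10 * 10
    band = f"{decade}-{decade + 9}"
    bands[band][0] += advanced
    bands[band][1] += 1

for band in sorted(bands):
    advanced, total = bands[band]
    print(f"{band}: {advanced}/{total} advanced")
# If most advances cluster below the configured threshold, the threshold
# (or the job description it scores against) needs revisiting.
```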
Market conditions affect threshold calibration over time. In a tight labour market with few available candidates, accepting a lower threshold is often necessary — the candidate pool is simply smaller and match scores will tend to be lower across the board. In a period of high candidate supply, the same threshold will surface a larger group of strong matches. Thresholds should be reviewed regularly against current market conditions rather than set once and left unchanged.
Test Mode for AI Configuration
Run historical candidates through your new AI configuration before going live, checking whether the settings would have correctly identified your previous hires. Treegarden's test mode allows teams to validate configuration quality against known outcomes — if the AI would have screened out your most recent successful hire at the configured settings, the settings need adjustment before the job goes live with real applicants.
Testing configuration before the job goes live
Testing AI configuration before a live role opens is one of the most valuable — and most underused — practices in AI-assisted recruitment setup. The concept is straightforward: before the job goes live, run a set of historical candidate profiles through the configured AI settings and check whether the output matches what experienced recruiters would have produced.
The most direct test is to use the profiles of previous successful hires for similar roles. If the AI configuration would have screened out someone who was ultimately hired and performed well in the role, the configuration has a meaningful flaw — it is filtering too narrowly in a way that would cost the organisation strong candidates. Conversely, if the AI consistently ranks highly profiles that an experienced recruiter would not have shortlisted, the configuration may be too broad.
Testing should also include deliberate edge cases: candidates from adjacent industries to assess whether the AI is correctly identifying transferable skills, candidates with unconventional career paths to assess whether the configuration is over-reliant on conventional experience markers, and candidates who are clearly unqualified to verify that the must-have criteria are correctly excluding profiles they should exclude.
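A backtest of this kind needs very little machinery. In the sketch below, score_candidate is a hypothetical stand-in for the configured AI scorer, and the threshold and profiles are invented:

```python
# Sketch: run historical profiles through the configured scorer and flag
# any past successful hire the settings would have screened out.
def score_candidate(profile: dict) -> float:
    # Placeholder: in practice this is the configured AI scoring step.
    return profile["mock_score"]

THRESHOLD = 70.0
historical_hires = [
    {"name": "hire_2023_backend", "mock_score": 81.0},
    {"name": "hire_2024_backend", "mock_score": 64.0},  # would be screened out
]

for profile in historical_hires:
    if score_candidate(profile) < THRESHOLD:
        print(f"WARNING: {profile['name']} scores below threshold; "
              f"the configuration is filtering too narrowly")
```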
The output of testing should be specific configuration adjustments rather than general impressions. If testing reveals that the AI is consistently undervaluing candidates from a specific industry background that recruiters consider relevant, the job description or weighting configuration should be adjusted to better represent that background as valuable. If testing reveals over-filtering on a specific criterion, that criterion should be moved from must-have to preferred. Specific observations produce specific improvements; vague impressions that "the scores don't feel right" do not.
Ongoing calibration: improving accuracy over time
AI recruiter configuration is not a one-time setup task — it is an ongoing practice of reviewing performance, identifying drift and making adjustments that keep the AI aligned with the team's actual hiring standards as those standards, the market and the role requirements evolve.
The most valuable ongoing calibration input is recruiter override data. Every time a recruiter advances a candidate who scored below threshold, or declines a candidate who scored above it, that is a signal about a gap between the AI's assessment and the recruiter's. Individual overrides are expected and acceptable; systematic patterns of overrides reveal configuration issues worth addressing.
Conversion rate analysis provides a second calibration input. If the AI is well-calibrated, candidates who score highly should interview better and convert to hires at a higher rate than those who score lower. If high-scoring candidates are consistently failing at the interview stage, the AI may be measuring something different from what actually matters for role success. Tracking interview-to-offer conversion rates by AI score band over time reveals this pattern.
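Sketched with invented data, the conversion-by-band check looks like this; if the bands do not order as expected, the configuration is measuring something other than role success:

```python
# Sketch: interview-to-offer conversion rate by AI score band.
# Illustrative data only.
interviews = [  # (ai_score_band, received_offer)
    ("80-100", True), ("80-100", True), ("80-100", False),
    ("60-79", True), ("60-79", False), ("60-79", False), ("60-79", False),
    ("40-59", False), ("40-59", False),
]

for band in ("80-100", "60-79", "40-59"):
    outcomes = [offer for b, offer in interviews if b == band]
    rate = sum(outcomes) / len(outcomes)
    print(f"{band}: {rate:.0%} interview-to-offer")
# Well-calibrated: higher bands convert at visibly higher rates.
```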
External factors require periodic configuration review even when the internal performance data looks stable. If the company expands into new markets, adding geographically specific requirements may be necessary. If a role is restructured with different responsibilities, the job description should be updated and configuration reviewed. If the team changes hiring managers and the new manager has different assessment preferences, aligning the AI configuration with those preferences — rather than the previous manager's — produces more accurate results under the new regime.
AI Configuration Is Not Set-and-Forget
The AI performs best when hiring criteria are regularly reviewed. As market conditions change, what constitutes a 'strong' candidate for a role can shift. Quarterly configuration reviews maintain accuracy.
Feedback Loop Integration
When recruiters override AI recommendations — advancing low-scorers or rejecting high-scorers — those decisions can be used to recalibrate model weights over time. Treegarden's feedback loop captures override patterns and surfaces systematic misalignments between AI behaviour and recruiter judgement, giving teams the data needed to make specific, evidence-based configuration adjustments rather than guessing at what needs to change.
Frequently asked questions about AI recruiter setup
How long does it take to configure the AI recruiter in Treegarden?
Initial configuration of Treegarden's AI recruiter for a specific role typically takes 30-60 minutes, with most of that time spent on the job description and criteria definition rather than navigating settings. For a team setting up AI configuration for the first time, allow a full working day to define the job description standards, configure must-have criteria, write screening questions and validate the output against historical candidates. Ongoing configuration for subsequent roles takes significantly less time once the team has established their criteria definition process.
What happens when recruiters disagree with the AI's candidate recommendations?
Recruiter overrides — advancing a candidate the AI scored low, or rejecting one it scored high — are a normal and expected part of the workflow. These overrides are valuable data: they reveal gaps between how the AI has understood the requirements and how the recruiter actually evaluates candidates. When overrides are logged and reviewed regularly, they identify systematic patterns that inform configuration improvements. An AI that is never overridden may indicate that recruiters are deferring to it rather than reviewing critically; one that is overridden constantly is not well-calibrated to the team's actual standards.
Should the same AI configuration be used for all roles, or configured per role?
Configuration should always be done at the role level, not applied uniformly across all hiring. Different roles require different skills, experience levels, seniority signals and screening questions. A generic configuration may work adequately for very similar roles within the same function, but applying a senior technical configuration to a junior commercial role — or vice versa — will produce poor results. Treegarden supports role-level configuration that is saved with the job and reused or cloned for similar roles in the future.
How do you know when the AI recruiter configuration needs to be recalibrated?
The clearest signals that recalibration is needed are: a systematic mismatch between AI recommendations and recruiter decisions (the AI consistently surfaces candidates that do not interview well, or misses candidates that do); a change in the labour market for the role that affects what a strong candidate profile looks like; a change in the role's requirements following a regrade or scope change; or a pattern of interview-to-offer conversion rates that suggests the screened candidates are not meeting the expected standard. Quarterly configuration reviews catch most of these drift scenarios before they accumulate into a significant problem.