The problem with generic AI scoring
Most ATS platforms that offer AI candidate scoring use a fixed formula. The AI evaluates every candidate against every job using the same set of criteria with the same weights. Skills might account for 25%, experience for 25%, education for 25%, and keywords for 25% — regardless of whether the role is a C-suite executive position or an entry-level internship.
This approach is simple to implement and easy to explain. It is also fundamentally wrong for the majority of roles. Here is why.
A senior infrastructure engineer with 12 years of experience, deep Kubernetes expertise, and a high school diploma is an outstanding candidate for a senior DevOps role. Under equal-weight scoring, they forfeit most of the 25% of the score allocated to education before their skills and experience are even considered. Meanwhile, a freshly graduated computer science PhD with no production experience might score higher because the education weight inflates their overall number.
The problem is not that education is irrelevant. For some roles, it is the most important factor. The problem is that the same weight applied to every role produces scores that misrepresent candidate suitability for most of them. Generic scoring creates a systematic distortion where the scores look precise but mean something different for every role.
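The distortion is easy to see with numbers. Here is a minimal sketch in plain Python, using invented per-dimension scores (0-100) for the two candidates described above, comparing an equal-weight formula against weights tuned for a senior DevOps role:

```python
# Invented per-dimension scores for illustration only.
veteran = {"skills": 90, "experience": 85, "education": 10, "keywords": 70}
graduate = {"skills": 70, "experience": 25, "education": 100, "keywords": 70}

def overall(candidate, weights):
    """Weighted sum of dimension scores; weights are fractions summing to 1."""
    return round(sum(candidate[d] * w for d, w in weights.items()), 2)

equal = {"skills": 0.25, "experience": 0.25, "education": 0.25, "keywords": 0.25}
senior_devops = {"skills": 0.40, "experience": 0.35, "education": 0.10, "keywords": 0.15}

# Equal weights rank the inexperienced PhD above the veteran engineer.
print(overall(veteran, equal))           # 63.75
print(overall(graduate, equal))          # 66.25

# Role-specific weights reverse the ranking decisively.
print(overall(veteran, senior_devops))   # 77.25
print(overall(graduate, senior_devops))  # 57.25
```

The underlying dimension scores never change; only the weights do. That is the whole argument: the same candidate data produces opposite rankings depending on whether the formula matches the role.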
Why different roles need fundamentally different scoring criteria
The diversity of hiring needs across even a single company is enormous. Consider three roles that a mid-sized technology company might hire for simultaneously:
Senior Backend Engineer. What matters: deep technical skills in specific languages and frameworks (skills: high), substantial production experience (experience: high), evidence of architectural thinking (keywords: moderate), formal education (education: low). An engineer with 10 years of relevant experience and no degree is likely a better hire than a recent PhD with no production work.
Graduate Business Analyst. What matters: strong analytical education (education: high), familiarity with relevant tools and methodologies (keywords: moderate), foundational analytical skills (skills: moderate), work experience (experience: low). This is a role where you are hiring for potential, not track record. Experience should have minimal weight because candidates are expected to have little.
VP of Sales, EMEA. What matters: extensive sales leadership experience with a proven record against revenue targets (experience: very high), industry knowledge and network (keywords: high), sales-specific skills and methodologies (skills: moderate), formal education (education: minimal). A VP of Sales is hired for what they have done, not what they studied.
If you score all three roles with the same formula, at least two of the three shortlists will be systematically wrong. The scoring system will present candidates as strong or weak based on criteria that do not match what the role actually requires.
The four dimensions of per-job configuration
Treegarden's AI scoring allows recruiters to configure weights across four dimensions for each individual job. Understanding what each dimension captures is essential for setting weights effectively.
Skills weight controls how much the candidate's demonstrated skills influence their score. This dimension evaluates whether the candidate possesses the technical and soft skills specified in the job requirements. Set this high for roles where specific capabilities are non-negotiable (engineering, design, data science). Set it lower for roles where skills can be trained on the job.
Experience weight controls how much the depth, breadth and relevance of the candidate's work history influences their score. This dimension assesses years of relevant experience, seniority progression, and how closely the candidate's previous roles align with the open position. Set this high for senior roles where track record matters. Set it low for graduate or career-change roles.
Education weight controls how much the candidate's academic credentials influence their score. This includes degree level, field of study, and relevant certifications. Set this high for roles where specific qualifications are required (medical, legal, accounting) or for graduate programmes. Set it low for experienced professional roles where practical skills outweigh academic background.
Keywords weight controls how much the overall language alignment between the CV and job description influences the score. This captures domain expertise, familiarity with industry terminology, and alignment with the specific tools and methodologies the role involves. Set this high for specialist roles where domain vocabulary signals genuine expertise. Set it moderate for generalist roles.
Weights Must Total 100%
The four weights always sum to 100%. When you increase one dimension, you implicitly decrease the influence of the others. This forces a deliberate prioritisation: what matters most for this specific hire? The constraint is a feature, not a limitation — it ensures the recruiter thinks explicitly about trade-offs.
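One way to picture the constraint is as a validated configuration object. This is an illustrative sketch, not Treegarden's actual API: the class name and integer-percentage representation are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScoringWeights:
    """Per-job scoring weights as integer percentages; must total exactly 100.

    Illustrative model only, not a real Treegarden interface.
    """
    skills: int
    experience: int
    education: int
    keywords: int

    def __post_init__(self):
        total = self.skills + self.experience + self.education + self.keywords
        if total != 100:
            raise ValueError(f"weights must total 100%, got {total}%")

# Valid: a senior technical configuration.
senior = ScoringWeights(skills=40, experience=35, education=10, keywords=15)

# Invalid: raising one dimension without lowering another breaks the budget.
try:
    ScoringWeights(skills=50, experience=35, education=10, keywords=15)
except ValueError as e:
    print(e)  # weights must total 100%, got 110%
```

The fixed 100% budget is what makes the trade-off explicit: you cannot say everything is important without the model telling you the numbers no longer add up.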
Real-world weight configurations that work
Here are proven weight configurations for common role categories. These are starting points, not rigid rules — adjust based on your specific requirements.
Senior Technical Role (Staff Engineer, Lead Developer, Tech Architect):
Skills: 40% | Experience: 35% | Education: 10% | Keywords: 15%
Rationale: Technical depth and proven experience are the primary hiring criteria. Education is a nice-to-have. Keywords capture familiarity with specific tech stacks.
Graduate Programme / Entry-Level (Analyst, Associate, Junior Developer):
Skills: 20% | Experience: 10% | Education: 45% | Keywords: 25%
Rationale: Candidates have limited experience, so education and domain awareness are the strongest signals available. Skills weight captures foundational abilities.
Sales Leadership (VP Sales, Head of Revenue, Regional Director):
Skills: 20% | Experience: 45% | Education: 5% | Keywords: 30%
Rationale: Track record is everything. Keywords capture industry knowledge and methodology familiarity (MEDDIC, Challenger, etc.). Education is largely irrelevant at this level.
Healthcare Professional (Nurse, Physician, Clinical Specialist):
Skills: 25% | Experience: 25% | Education: 35% | Keywords: 15%
Rationale: Credentials and qualifications are non-negotiable in regulated healthcare roles. Education weight reflects mandatory certifications and licensing requirements.
Creative Role (Designer, Content Strategist, Brand Manager):
Skills: 35% | Experience: 30% | Education: 10% | Keywords: 25%
Rationale: Portfolio and demonstrated capability matter more than formal education. Keywords capture familiarity with relevant tools and platforms.
Operations / Logistics (Warehouse Manager, Supply Chain Lead, Fleet Coordinator):
Skills: 30% | Experience: 35% | Education: 10% | Keywords: 25%
Rationale: Practical experience and operational skills outweigh academic credentials. Keywords capture industry-specific certifications and system knowledge.
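The six starting points above can be kept as a simple lookup table. The percentages are the ones from this article; the role-category keys are invented labels, not identifiers from the product.

```python
# Starting-point weight presets from the role categories discussed above.
# Keys are illustrative labels, not Treegarden identifiers.
PRESETS = {
    "senior_technical":   {"skills": 40, "experience": 35, "education": 10, "keywords": 15},
    "graduate_programme": {"skills": 20, "experience": 10, "education": 45, "keywords": 25},
    "sales_leadership":   {"skills": 20, "experience": 45, "education": 5,  "keywords": 30},
    "healthcare":         {"skills": 25, "experience": 25, "education": 35, "keywords": 15},
    "creative":           {"skills": 35, "experience": 30, "education": 10, "keywords": 25},
    "operations":         {"skills": 30, "experience": 35, "education": 10, "keywords": 25},
}

# Every preset respects the 100% budget.
assert all(sum(weights.values()) == 100 for weights in PRESETS.values())
```

Teams that hire repeatedly in the same categories can start from the nearest preset and adjust, rather than reasoning from scratch for every new job.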
Per-Job Configuration in Treegarden
Every job in Treegarden gets its own scoring configuration. Set custom weights for skills, experience, education and keywords before running AI scoring. The configuration takes under two minutes and ensures the AI evaluates candidates against what actually matters for each specific role. Configure your first job now.
The configuration-feedback loop
Getting weights right does not require perfection on the first attempt. The effective approach is iterative:
Configure. Set initial weights based on your understanding of what the role requires. Spend two minutes thinking about what matters most, second-most, and least.
Score. Run AI scoring on the candidate pool.
Review. Look at the top-scoring candidates. Do they match what you want to see? Are the green-badge candidates the ones you would want to interview? Are there candidates you expected to score well who did not?
Adjust. If the top-scoring candidates are not what you expected, adjust the weights. If too many candidates with weak experience are scoring high, increase the experience weight. If strong candidates without degrees are scoring too low, decrease the education weight.
Re-score. Run scoring again with the adjusted weights. Because scoring is user-initiated in Treegarden, you can iterate as many times as needed without affecting the candidates. No applicant is rejected or advanced during the configuration process — you are tuning the scoring model before making decisions.
Most recruiters find their optimal configuration within one or two adjustments. Over time, teams build intuition for what works for their most common role categories and can set effective weights on the first attempt.
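The adjust step is easiest to get right if every change preserves the 100% budget. A small hypothetical helper makes that mechanical: moving points from one dimension to another in a single operation, so the total can never drift.

```python
def shift(weights, source, target, points):
    """Move `points` percentage points from one dimension to another,
    keeping the total at 100. Illustrative helper, not a Treegarden API."""
    if weights[source] < points:
        raise ValueError(f"cannot take {points} points from {source} ({weights[source]}%)")
    adjusted = dict(weights)
    adjusted[source] -= points
    adjusted[target] += points
    return adjusted

# Start from the equal-weight default...
w = {"skills": 25, "experience": 25, "education": 25, "keywords": 25}

# ...then react to the review step: strong candidates without degrees
# scored too low, so move weight from education to experience.
w = shift(w, "education", "experience", 15)

assert sum(w.values()) == 100
print(w)  # {'skills': 25, 'experience': 40, 'education': 10, 'keywords': 25}
```

Because scoring is user-initiated, each adjusted configuration can simply be re-run against the same pool until the top of the list matches your judgement.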
Weight configuration as a team alignment tool
An underappreciated benefit of per-job weight configuration is that it forces alignment between the recruiter and the hiring manager about what actually matters for a role.
Without explicit scoring criteria, the recruiter and hiring manager often have different implicit priorities. The recruiter might focus on education and credentials because those are easy to verify. The hiring manager might care most about specific technical skills and relevant project experience. These different mental models produce disagreement at the shortlist stage — the recruiter presents candidates that the hiring manager does not want to interview.
When the recruiter and hiring manager agree on scoring weights before reviewing candidates, they are explicitly aligning on priorities. "For this role, we care 40% about skills, 35% about experience, 15% about keywords and 10% about education" is a concrete statement that both parties can agree or disagree with before any candidates are reviewed. This prevents the common scenario where a recruiter spends hours creating a shortlist that does not match the hiring manager's expectations.
Common weight configuration mistakes to avoid
Based on patterns across thousands of scored roles, here are the most common configuration errors and how to avoid them:
Over-weighting education for experienced roles. If you are hiring someone with 8+ years of experience, their degree from 10 years ago is probably not the strongest signal. Reduce education weight to 5-15% for senior roles.
Under-weighting experience for leadership roles. A VP or Director hire is primarily about track record. If experience is not your heaviest weight for a leadership hire, reconsider your configuration.
Equal weights across all dimensions. Setting everything to 25% is the default, and it is almost always wrong. Every role has a most-important dimension and a least-important one. Use your weights to reflect that hierarchy.
Ignoring keywords for specialist roles. For highly specialised positions (data engineering, regulatory compliance, clinical research), the specific vocabulary a candidate uses is a strong signal of genuine domain expertise. Keywords weight should be 20-30% for specialist roles.
Not adjusting between similar-sounding roles. "Product Manager" at a startup means something different from "Product Manager" at an enterprise company. Even roles with the same title may need different weight configurations based on the specific context and level.
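Several of these mistakes are mechanical enough to check automatically. The sketch below turns the checklist into a small lint function; the role flags (`senior`, `leadership`, `specialist`) are invented inputs for illustration, not fields Treegarden exposes.

```python
def lint_weights(weights, role):
    """Flag the common misconfigurations described above.
    `role` is a dict of invented boolean flags, purely illustrative."""
    warnings = []
    if len(set(weights.values())) == 1:
        warnings.append("all dimensions equal; pick a most-important dimension")
    if role.get("senior") and weights["education"] > 15:
        warnings.append("education likely over-weighted for a senior role")
    if role.get("leadership") and max(weights, key=weights.get) != "experience":
        warnings.append("experience should usually be heaviest for leadership hires")
    if role.get("specialist") and weights["keywords"] < 20:
        warnings.append("keywords under-weighted for a specialist role")
    return warnings

# The equal-weight default applied to a senior role trips two checks.
for warning in lint_weights(
    {"skills": 25, "experience": 25, "education": 25, "keywords": 25},
    {"senior": True},
):
    print(warning)
```

A check like this would not replace judgement, but it catches the default-equal-weights trap before scoring runs at all.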
Frequently asked questions
What are AI scoring weights in recruitment?
Scoring weights determine how much each evaluation dimension contributes to a candidate's overall match score. In Treegarden, the four dimensions are skills, experience, education and keywords, each expressed as a percentage totalling 100%.
Why do different jobs need different AI scoring configurations?
Different roles have fundamentally different success criteria. A senior engineer needs deep technical skills; a graduate trainee needs strong education; a sales director needs extensive experience. A single scoring formula cannot capture these differences accurately.
How long does it take to configure scoring weights for a job?
Typically under two minutes. You are making four decisions about relative importance. Most recruiters develop intuition for common role categories quickly and can reuse configurations across similar positions.
What happens if I configure the weights badly?
You get a shortlist that does not match what you want. The fix is simple: adjust the weights and re-run scoring. No candidates are harmed — they remain in the pipeline regardless of score changes. Scoring is user-initiated, so you can iterate freely.