eval
Rank all agent results for a session. Supports metric-based evaluation (run a command), LLM judge (compare diffs), or hybrid.
SkillJury keeps community verdicts, source metadata, and external repository signals in separate lanes so ranking data never pretends to be a review.
npx skills add https://github.com/alirezarezvani/claude-skills --skill eval
As of Apr 30, 2026, eval has 791 weekly installs and 0 community reviews on SkillJury. Community votes currently stand at 0 upvotes and 0 downvotes. Source: alirezarezvani/claude-skills. Canonical URL: https://skills.sh/alirezarezvani/claude-skills/eval.
Source description provided by the upstream listing. Community review signal and install context stay separate from this narrative layer.
Latest reviews
No community reviews yet. Be the first to review.
What does eval do?
Rank all agent results for a session. Supports metric-based evaluation (run a command), LLM judge (compare diffs), or hybrid.
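To make the metric-based mode concrete, here is a minimal Python sketch of the general idea: run one command per agent result, read a numeric score from its output, and sort. It is not the skill's actual implementation. The function names `run_metric` and `rank`, the agent labels, and the convention that the command prints its score on the last line of stdout are all assumptions made for illustration.

```python
import subprocess
import sys


def run_metric(command, workdir):
    """Run `command` in `workdir` and parse the last stdout line as a float.

    Assumption: the metric command prints its score as the final line.
    """
    proc = subprocess.run(
        command, cwd=workdir, capture_output=True, text=True, check=True
    )
    return float(proc.stdout.strip().splitlines()[-1])


def rank(scores, higher_is_better=True):
    """Turn {agent_name: score} into a list of (name, score), best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=higher_is_better)


if __name__ == "__main__":
    # Hypothetical setup: each agent's result would normally live in its own
    # directory; "." is used here only so the demo runs anywhere.
    command = [sys.executable, "-c", "print(0.75)"]
    scores = {name: run_metric(command, ".") for name in ("agent-a", "agent-b")}
    print(rank(scores))
```

For metrics where lower is better, such as latency or error count, pass `higher_is_better=False`. An LLM-judge or hybrid mode would need a different scoring step and is not covered by this sketch.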
Is eval good?
eval does not have approved reviews yet, so SkillJury cannot publish a community verdict.
Which AI agents support eval?
eval currently lists compatibility with Skills CLI.
Is eval safe to install?
eval has been scanned by security audit providers tracked on SkillJury. Check the security audits section on this page for detailed results from Socket.dev and Snyk.
What are alternatives to eval?
Skills in the same category include grimoire-morpho-blue, conversation-memory, second-brain-ingest, and zai-tts.
How do I install eval?
Run the following command to install eval: npx skills add https://github.com/alirezarezvani/claude-skills --skill eval
More from alirezarezvani/claude-skills
coverage
Map all testable surfaces in the application and identify what's tested vs. what's missing.
report
Generate test reports that plug into the user's existing workflow. Zero new tools.
ra-qm-skills
12 production-ready compliance skills for HealthTech and MedTech organizations.
init
Set up a production-ready Playwright testing environment. Detect the framework, generate config, folder structure, example test, and CI workflow.
Alternatives in Software Engineering
grimoire-morpho-blue
Query Morpho Blue deployment metadata and vault snapshots via the Grimoire CLI.
conversation-memory
Persistent memory systems for LLM conversations with tiered storage and intelligent retrieval.
second-brain-ingest
Process raw source documents into structured, interlinked wiki pages.
zai-tts
High-quality text-to-speech audio generation using GLM-TTS with customizable voices and playback parameters.