speculative-decoding

davila7/claude-code-templates

Installs: 164
Install command
npx skills add https://github.com/davila7/claude-code-templates --skill speculative-decoding
Security audits
Gen Agent Trust Hub: WARN
Socket: PASS
Snyk: WARN
About this skill
Use Speculative Decoding when you need to:
- Speed up inference by 1.5-3.6× without quality loss
- Reduce latency for real-time applications (chatbots, code generation)
- Optimize throughput for high-volume serving
- Deploy efficiently on limited hardware
- Generate faster without changing model architecture

Key Techniques: Draft model speculative decoding, Medusa (multiple heads), Lookahead Decoding (Jacobi iteration)
Papers: Medusa (arXiv 2401.10774), Lookahead Decoding (ICML 2024), Speculative Decoding Survey (ACL 2024)

Draft model speculative decoding
Idea: Use small draft model to generate candidates, large target model to verify in parallel.
Algorithm:
- Draft model generates K tokens speculatively
- Target model evaluates all K tokens in parallel (single forward pass)
- Accept tokens where draft and target agree
- Reject first disagreement,...
Further headings (no detail in this summary): Performance

Medusa
Source: arXiv 2401.10774 (2024)
Innovation: Add multiple prediction heads to existing model, predict future tokens without separate draft model.
Further headings (no detail in this summary): Architecture, Training, Tree-based Attention, Advantages

Lookahead Decoding
Source: ICML 2024
Core idea: Reformulate autoregressive decoding as solving system of equations, solve in parallel using Jacobi iteration.
Two branches:
- Lookahead Branch: Generate n-grams in parallel
- Verification Branch: Verify promising n-grams
Further headings (no detail in this summary): Mathematical formulation, Performance

Further headings (no detail in this summary): Draft Model Speculative, Medusa, Lookahead
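
The draft-and-verify steps in the description can be illustrated with a small toy sketch. It is not taken from the skill itself: `toy_target`, `toy_draft`, `speculative_step`, and `generate` are hypothetical names, the "models" are plain functions over integer tokens, and decoding is greedy. The description is cut off after "Reject first disagreement", so what happens next is an assumption here: the sketch substitutes the target's own token at the first mismatch, which is the usual choice for greedy verification.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token sequence to its greedy next token.
NextTokenFn = Callable[[List[int]], int]


def toy_target(seq: List[int]) -> int:
    """Toy 'large' model: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int]) -> int:
    """Toy 'small' model: agrees with the target except right after token 4."""
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10


def speculative_step(prefix: List[int], draft_next: NextTokenFn,
                     target_next: NextTokenFn, k: int) -> List[int]:
    """Run one draft-and-verify round and return the newly accepted tokens."""
    # Step 1: the draft model proposes k tokens, one after another.
    draft: List[int] = []
    for _ in range(k):
        draft.append(draft_next(prefix + draft))

    # Step 2: the target checks each drafted position. A real system would score
    # all k positions in a single batched forward pass; the loop here only
    # simulates that pass one position at a time.
    accepted: List[int] = []
    for i in range(k):
        expected = target_next(prefix + draft[:i])
        if draft[i] != expected:
            # Assumption: on the first disagreement, keep the target's token
            # and discard the rest of the draft.
            accepted.append(expected)
            return accepted
        accepted.append(draft[i])

    # Every drafted token matched, so the same target pass yields one extra token.
    accepted.append(target_next(prefix + draft))
    return accepted


def generate(prefix: List[int], draft_next: NextTokenFn,
             target_next: NextTokenFn, k: int, max_new: int) -> List[int]:
    """Repeat speculative rounds until max_new tokens have been produced."""
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[:len(prefix) + max_new]


if __name__ == "__main__":
    print(speculative_step([3], toy_draft, toy_target, k=4))
    print(generate([3], toy_draft, toy_target, k=4, max_new=12))
```

Because every accepted token is one the target would have chosen itself, the output under these assumptions matches plain greedy decoding with the target alone; the draft only changes how many target passes are needed.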

Source description provided by the upstream skill listing. Community reviews and install context appear in the sections below.
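
The Jacobi-iteration idea named in the Lookahead Decoding summary can also be sketched in a few lines. This is only the fixed-point part, under stated assumptions: it treats the n future tokens as unknowns, refreshes every position at once from the previous guess, and stops when nothing changes. It does not implement the lookahead and verification branches or any n-gram handling, and `toy_next` and `jacobi_decode` are hypothetical names, not functions from the skill or from any library.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model: maps a token sequence to its greedy next token.
NextTokenFn = Callable[[List[int]], int]


def toy_next(seq: List[int]) -> int:
    """Toy model: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def jacobi_decode(prefix: List[int], next_token: NextTokenFn,
                  n: int, pad: int = 0) -> Tuple[List[int], int]:
    """Solve for n future tokens by fixed-point (Jacobi) iteration.

    Returns the decoded tokens and the number of iterations used.
    """
    guess = [pad] * n  # arbitrary initial guess for all n positions
    iterations = 0
    while True:
        iterations += 1
        # Update every position from the PREVIOUS guess. The n calls are
        # independent of each other, so a real system could batch them
        # into one forward pass.
        updated = [next_token(prefix + guess[:i]) for i in range(n)]
        if updated == guess:
            # Fixed point: each token is the greedy choice given the tokens before it.
            return guess, iterations
        guess = updated


if __name__ == "__main__":
    tokens, iterations = jacobi_decode([0], toy_next, n=5)
    print(tokens, iterations)
```

Position i becomes correct no later than iteration i + 1, so the loop ends within n + 1 iterations and returns the same tokens as greedy decoding. Any speedup comes from positions that converge earlier than that worst case.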

Community Reviews

Latest reviews

No community reviews yet. Be the first to review.

FAQ
What does speculative-decoding do?

speculative-decoding is listed in SkillJury, but the source summary is still sparse.

Is speculative-decoding good?

speculative-decoding does not have approved reviews yet, so SkillJury cannot publish a community verdict.

What agent does speculative-decoding work with?

speculative-decoding currently lists compatibility with codex, gemini-cli, opencode, cursor, github-copilot, and claude-code.

What are alternatives to speculative-decoding?

Skills in the same category include telegram-bot-builder, flutter-app-size, sharp-edges, and iterative-retrieval.

How do I install speculative-decoding?

npx skills add https://github.com/davila7/claude-code-templates --skill speculative-decoding

Related skills

More from davila7/claude-code-templates

Alternatives in Software Engineering