
speech-to-text

Transcribe audio and video to text with speaker identification, word-level timestamps, and 90+ language support.

SkillJury keeps community verdicts, source metadata, and external repository signals in separate lanes so ranking data never pretends to be a review.

SkillJury verdict: Pending (no approved reviews yet)

Would recommend: Pending (waiting on enough review volume)

Install signal: weekly or total install activity from catalog data

0 review requests

Install command

npx skills add https://github.com/elevenlabs/skills --skill speech-to-text

SkillJury does not have enough approved reviews to publish a community verdict yet. Source metadata and repository proof are still available above.

SkillJury Signal Summary

As of May 1, 2026, speech-to-text has 3 weekly installs and 0 community reviews on SkillJury. Community votes currently stand at 0 upvotes and 0 downvotes. Source: elevenlabs/skills. Canonical URL: https://skills.sh/elevenlabs/skills/speech-to-text.

Security audits

Gen Agent Trust Hub: PASS

Socket: PASS

Snyk: WARN

About this skill

Transcribe audio and video to text with Scribe v2, which supports 90+ languages, speaker diarization, and word-level timestamps.

Setup: see the Installation Guide. For JavaScript, use @elevenlabs/* packages only.

Word-level timestamps include type classification and speaker identification.

Speaker diarization identifies who said what: the model labels each word with a speaker ID, which is useful for meetings, interviews, or any multi-speaker audio. For call recordings, the batch API can label diarized speakers as agent and customer when detect_speaker_roles=true is set alongside diarize=true. This option is not compatible with use_multi_channel=true.

To help the model recognize specific words it might otherwise mishear, such as product names, technical jargon, or unusual spellings, supply a list of terms (up to 100).

Language is detected automatically, with an optional language hint.

Supported formats:

Audio: MP3, WAV, M4A, FLAC, OGG, WebM, AAC, AIFF, Opus

Video: MP4, AVI, MKV, MOV, WMV, FLV, WebM, MPEG, 3GPP

Limits: up to 3GB file size and 10 hours duration

The skill also documents word types and common errors. Usage can be monitored via the request-id response header.

For live transcription with ultra-low latency (~150ms), use the real-time API. The real-time API produces two types of transcripts. A "commit" tells the model to finalize the current segment.
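The option constraints described above (speaker roles set alongside diarization, no multi-channel with speaker roles, at most 100 terms) could be checked before a request is sent. The following is a minimal sketch, not the ElevenLabs SDK: the TranscribeOptions class, the terms field name, and the validate helper are hypothetical names introduced here for illustration, and reading "alongside diarize=true" as a hard requirement is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MAX_TERMS = 100  # "up to 100 terms" per the skill description


@dataclass
class TranscribeOptions:
    """Hypothetical container for batch transcription options (not an SDK type)."""
    diarize: bool = False
    detect_speaker_roles: bool = False
    use_multi_channel: bool = False
    terms: List[str] = field(default_factory=list)  # hypothetical field name


def validate(opts: TranscribeOptions) -> List[str]:
    """Return human-readable problems; an empty list means the options look consistent."""
    problems = []
    # Assumption: speaker roles only make sense when diarization is enabled.
    if opts.detect_speaker_roles and not opts.diarize:
        problems.append("detect_speaker_roles should be set alongside diarize=true")
    # Stated in the description: roles are not compatible with multi-channel mode.
    if opts.detect_speaker_roles and opts.use_multi_channel:
        problems.append("detect_speaker_roles is not compatible with use_multi_channel=true")
    if len(opts.terms) > MAX_TERMS:
        problems.append(f"too many terms: {len(opts.terms)} > {MAX_TERMS}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every conflicting option at once.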
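Because each word carries a speaker ID and timestamps, a transcript can be folded into speaker turns for meeting or interview notes. The sketch below assumes a word is a dict with text, start, end, and speaker_id keys; those key names and the sample data are hypothetical and are not confirmed by this listing, so they may need adjusting to the actual response shape.

```python
from itertools import groupby

# Hypothetical sample shaped like a diarized, word-level transcript.
SAMPLE_WORDS = [
    {"text": "Hello", "start": 0.0, "end": 0.4, "speaker_id": "speaker_0"},
    {"text": "there", "start": 0.5, "end": 0.9, "speaker_id": "speaker_0"},
    {"text": "Hi", "start": 1.2, "end": 1.4, "speaker_id": "speaker_1"},
    {"text": "Thanks", "start": 1.8, "end": 2.2, "speaker_id": "speaker_0"},
]


def speaker_turns(words):
    """Merge consecutive words from the same speaker into one turn."""
    turns = []
    # groupby only merges adjacent items, so a speaker who talks twice
    # with someone else in between yields two separate turns.
    for speaker, group in groupby(words, key=lambda w: w["speaker_id"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],
            "end": group[-1]["end"],
            "text": " ".join(w["text"] for w in group),
        })
    return turns


if __name__ == "__main__":
    for turn in speaker_turns(SAMPLE_WORDS):
        print(f'{turn["speaker"]} [{turn["start"]}-{turn["end"]}]: {turn["text"]}')
```

Grouping only adjacent words preserves the order of the conversation, which matters more for readable notes than collecting everything one speaker said in a single block.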

Source description provided by the upstream listing. Community review signal and install context stay separate from this narrative layer.

Community reviews

Latest reviews

No community reviews yet. Be the first to review.

Browse this skill in context

Agents: Skills CLI

Source: elevenlabs/skills

FAQ

What does speech-to-text do?

Transcribe audio and video to text with speaker identification, word-level timestamps, and 90+ language support.

Is speech-to-text good?

speech-to-text does not have approved reviews yet, so SkillJury cannot publish a community verdict.

Which AI agents support speech-to-text?

speech-to-text currently lists compatibility with Skills CLI.

Is speech-to-text safe to install?

speech-to-text has been scanned by the security audit providers tracked on SkillJury. See the security audits section on this page for results from Gen Agent Trust Hub, Socket.dev, and Snyk; Snyk currently reports a warning.

What are alternatives to speech-to-text?

Skills in the same category include review-management, conversation-memory, coverage, and grimoire-aave.

How do I install speech-to-text?

Run the following command to install speech-to-text: npx skills add https://github.com/elevenlabs/skills --skill speech-to-text


Related skills

Alternatives in Software Engineering

review-management

Source details, install context, and public review data are available on the full page.

Categories: Software Engineering, Frontend and Design. No reviews yet. Source: eronred/aso-skills.

conversation-memory

Persistent memory systems for LLM conversations with tiered storage and intelligent retrieval.

Categories: Software Engineering, Frontend and Design. No reviews yet. Source: sickn33/antigravity-awesome-skills.

coverage

Map all testable surfaces in the application and identify what's tested vs. what's missing.

Categories: Software Engineering, Frontend and Design. No reviews yet. Source: alirezarezvani/claude-skills.

grimoire-aave

Query Aave V3 market data, reserve snapshots, and health metrics across supported chains.

Categories: Software Engineering, Frontend and Design. No reviews yet. Source: franalgaba/grimoire.

More from elevenlabs/skills

text-to-speech

agents

music

setup-api-key
