pdf-extraction

claude-office-skills/skills

This skill enables precise extraction of text, tables, and metadata from PDF documents using pdfplumber - the go-to library for PDF data extraction. Unlike basic PDF readers, pdfplumber provides detailed character-level positioning, accurate table detection, and visual debugging.

Installs: 177
Install command
npx skills add https://github.com/claude-office-skills/skills --skill pdf-extraction
Security audits
Gen Agent Trust Hub: PASS
Socket: PASS
Snyk: PASS
About this skill
How it works:
- Provide the PDF file you want to extract from
- Specify what you need: text, tables, images, or metadata
- I'll generate pdfplumber code and execute it

Example prompts:
- "Extract all tables from this financial report"
- "Get text from pages 5-10 of this document"
- "Find and extract the invoice total from this PDF"
- "Convert this PDF table to CSV/Excel"

Tips:
- Debug visually: use to_image() to understand PDF structure
- Tune table settings: adjust tolerances for your specific PDF
- Handle scanned PDFs: use OCR first (this skill is for native text)
- Process page by page: for large PDFs, avoid loading all at once
- Check for text: some PDFs are images, so verify that text exists

Limitations:
- Cannot extract from scanned/image PDFs (use OCR first)
- Complex layouts may need manual tuning
- Some PDF encryption types are not supported
- Embedded fonts may affect text extraction
- No direct PDF editing capability

Resources:
- pdfplumber Documentation
- Table Extraction Guide
- Visual Debugging
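The prompts and tips above (a page range, a check for missing text, a table converted to CSV) could be handled by code along these lines. This is a minimal sketch, not the code the skill itself generates: the helper names (extract_text_by_page, pages_without_text, tables_to_csv, run) are hypothetical, page numbers are assumed to be 1-based, and the only pdfplumber calls used are pdfplumber.open(), pdf.pages, page.extract_text(), and page.extract_tables().

```python
import csv
import io


def extract_text_by_page(pages, first=1, last=None):
    """Return {page_number: text} for 1-based pages first..last (inclusive)."""
    selected = pages[first - 1:last]
    result = {}
    for offset, page in enumerate(selected):
        # extract_text() can yield None or "" when a page has no text layer.
        result[first + offset] = page.extract_text() or ""
    return result


def pages_without_text(texts):
    """Page numbers with no extracted text; these may be scans that need OCR."""
    return [number for number, text in texts.items() if not text.strip()]


def tables_to_csv(pages, table_settings=None):
    """Collect every table found on the given pages into one CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for page in pages:
        # table_settings is an optional dict of pdfplumber tolerances/strategies.
        for table in page.extract_tables(table_settings=table_settings):
            for row in table:
                # Empty cells come back as None; write them as empty strings.
                writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


def run(path, first=1, last=None):
    """Hypothetical entry point: open a PDF and apply the helpers above."""
    import pdfplumber  # third-party dependency: pip install pdfplumber

    with pdfplumber.open(path) as pdf:
        texts = extract_text_by_page(pdf.pages, first, last)
        empty = pages_without_text(texts)
        csv_text = tables_to_csv(pdf.pages[first - 1:last])
    return texts, empty, csv_text
```

For example, run("report.pdf", first=5, last=10) (the file name is a placeholder) would return the text of pages 5 to 10, the page numbers that came back empty, and any tables on those pages as CSV. If tables come out merged or split, passing a table_settings dict such as {"vertical_strategy": "text", "horizontal_strategy": "text"} to tables_to_csv is one thing to try.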

Source description provided by the upstream skill listing. Community reviews and install context appear in the sections below.

Community Reviews

Latest reviews

No community reviews yet. Be the first to review.

FAQ
What does pdf-extraction do?

This skill enables precise extraction of text, tables, and metadata from PDF documents using pdfplumber - the go-to library for PDF data extraction. Unlike basic PDF readers, pdfplumber provides detailed character-level positioning, accurate table detection, and visual debugging.

Is pdf-extraction good?

pdf-extraction does not have approved reviews yet, so SkillJury cannot publish a community verdict.

What agent does pdf-extraction work with?

pdf-extraction currently lists compatibility with gemini-cli, opencode, kimi-cli, amp, github-copilot, and claude-code.

What are alternatives to pdf-extraction?

Skills in the same category include telegram-bot-builder, flutter-app-size, sharp-edges, iterative-retrieval.

How do I install pdf-extraction?

npx skills add https://github.com/claude-office-skills/skills --skill pdf-extraction
