spark-engineer

jeffallan/claude-skills

Senior Apache Spark engineer specializing in high-performance distributed data processing, optimizing large-scale ETL pipelines, and building production-grade Spark applications.

Installs: 558
Install command
npx skills add https://github.com/jeffallan/claude-skills --skill spark-engineer
Security audits
Gen Agent Trust Hub: PASS
Socket: PASS
Snyk: PASS
About this skill
Senior Apache Spark engineer specializing in high-performance distributed data processing, optimizing large-scale ETL pipelines, and building production-grade Spark applications.

Load detailed guidance based on context: When implementing Spark solutions, provide: Spark DataFrame API, Spark SQL, RDD transformations/actions, catalyst optimizer, tungsten execution engine, partitioning strategies, broadcast variables, accumulators, structured streaming, watermarks, checkpointing, Spark UI analysis, memory management, shuffle optimization

1. Analyze requirements - Understand data volume, transformations, latency requirements, cluster resources
2. Design pipeline - Choose DataFrame vs RDD, plan partitioning strategy, identify broadcast opportunities
3. Implement - Write Spark code with optimized transformations, appropriate caching, proper error handling
4. Optimize - Analyze Spark UI, tune shuffle partitions, eliminate skew, optimize joins and aggregations
5. Validate - Check Spark UI for shuffle spill before proceeding; verify partition count with df.rdd.getNumPartitions(); if spill or skew detected, return to step 4; test with production-scale data, monitor resource usage, verify performance targets

- Use DataFrame API over RDD for structured data processing
- Define explicit schemas for production pipelines
- Partition data appropriately (200-1000 partitions per executor core)
-...
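
A few of the practices listed above (explicit schemas, a broadcast join, and the partition check with df.rdd.getNumPartitions()) can be sketched in PySpark roughly as follows. This is a minimal illustration and not code from the skill itself: the orders and countries tables, the local[2] master, and the helper names skew_ratio and run_example are all hypothetical, and the max/median skew metric is only one possible way to flag uneven partitions.

```python
from statistics import median


def skew_ratio(partition_sizes):
    """Return max/median of the non-empty partition sizes.

    Hypothetical metric for illustration: a value near 1.0 suggests
    evenly sized partitions, a large value suggests skew.
    """
    non_empty = [n for n in partition_sizes if n > 0]
    if not non_empty:
        return 1.0
    return max(non_empty) / median(non_empty)


def run_example():
    # Imported lazily so skew_ratio() stays usable without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast
    from pyspark.sql.types import LongType, StringType, StructField, StructType

    # Assumption: a local two-core session, just for the sketch.
    spark = (
        SparkSession.builder.master("local[2]").appName("sketch").getOrCreate()
    )

    # Explicit schemas instead of schema inference.
    orders_schema = StructType([
        StructField("order_id", LongType(), False),
        StructField("country_code", StringType(), True),
    ])
    countries_schema = StructType([
        StructField("country_code", StringType(), False),
        StructField("country_name", StringType(), True),
    ])

    # Tiny made-up datasets standing in for a large fact table and a small lookup.
    orders = spark.createDataFrame(
        [(1, "US"), (2, "US"), (3, "DE")], orders_schema
    )
    countries = spark.createDataFrame(
        [("US", "United States"), ("DE", "Germany")], countries_schema
    )

    # Broadcast hint: ship the small table to every executor to avoid a shuffle.
    joined = orders.join(broadcast(countries), "country_code")

    # Partition check named in the Validate step.
    print("partitions:", joined.rdd.getNumPartitions())

    # glom() turns each partition into a list; only sensible on small data.
    sizes = joined.rdd.glom().map(len).collect()
    print("skew ratio:", skew_ratio(sizes))

    spark.stop()
```

Calling run_example() requires a working PySpark installation. On real data, per-partition counts would more likely be read from the Spark UI than collected with glom().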

Source description provided by the upstream skill listing. Community reviews and install context appear in the sections below.

Community Reviews

Latest reviews

No community reviews yet. Be the first to review.

FAQ
What does spark-engineer do?

Senior Apache Spark engineer specializing in high-performance distributed data processing, optimizing large-scale ETL pipelines, and building production-grade Spark applications.

Is spark-engineer good?

spark-engineer does not have approved reviews yet, so SkillJury cannot publish a community verdict.

What agent does spark-engineer work with?

spark-engineer currently lists compatibility with codex, gemini-cli, opencode, cursor, github-copilot, claude-code.

What are alternatives to spark-engineer?

Skills in the same category include telegram-bot-builder, flutter-app-size, sharp-edges, iterative-retrieval.

How do I install spark-engineer?

npx skills add https://github.com/jeffallan/claude-skills --skill spark-engineer
