
flash-moe-inference

Skill by ara.so — Daily 2026 Skills collection.

SkillJury keeps community verdicts, source metadata, and external repository signals in separate lanes so ranking data never pretends to be a review.

SkillJury verdict
Pending

No approved reviews yet

Would recommend
Pending

Waiting on enough review volume

Install signal
886

Weekly or total install activity from catalog data

0 review requests
Install command
npx skills add https://github.com/aradotso/trending-skills --skill flash-moe-inference
SkillJury does not have enough approved reviews to publish a community verdict yet. Source metadata and repository proof are still available above.
SkillJury Signal Summary

As of Apr 30, 2026, flash-moe-inference has 886 weekly installs and 0 community reviews on SkillJury. Community votes currently stand at 0 upvotes and 0 downvotes. Source: aradotso/trending-skills. Canonical URL: https://skills.sh/aradotso/trending-skills/flash-moe-inference.

Security audits
Gen Agent Trust Hub: FAIL
Socket: WARN
Snyk: WARN
About this skill
Flash-MoE is a pure C/Objective-C/Metal inference engine that runs Qwen3.5-397B-A17B (a 397B-parameter Mixture-of-Experts model) on a MacBook Pro with 48GB RAM at 4.4+ tokens/second. It streams 209GB of expert weights from NVMe SSD on demand: no Python, no ML frameworks, just C, Objective-C, and hand-tuned Metal shaders.

The Makefile compiles infer.m, chat.m, and main.m, with a Metal shader compilation step for shaders.metal, which contains the hand-written GPU kernels. The model has 60 transformer layers.

The core innovation is loading only the K=4 active experts per layer from SSD. Why pread() rather than mmap(): mmap incurs per-page fault overhead on cold data (roughly 5x slower), while direct pread() through the OS page cache achieves a ~71% hit rate naturally. The recurrence update uses Accelerate BLAS, 64% faster than scalar code. Key principle: on Apple Silicon, GPU DMA and SSD DMA share the same memory controller.

Source description provided by the upstream listing. Community review signal and install context stay separate from this narrative layer.

Community reviews

Latest reviews

No community reviews yet. Be the first to review.

FAQ
What does flash-moe-inference do?

flash-moe-inference packages Flash-MoE, a pure C/Objective-C/Metal inference engine that runs the Qwen3.5-397B-A17B Mixture-of-Experts model on a 48GB MacBook Pro by streaming expert weights from NVMe SSD on demand. It is published by ara.so in the aradotso/trending-skills (Daily 2026 Skills) collection.

Is flash-moe-inference good?

flash-moe-inference does not have approved reviews yet, so SkillJury cannot publish a community verdict.

Which AI agents support flash-moe-inference?

flash-moe-inference currently lists compatibility with Skills CLI.

Is flash-moe-inference safe to install?

flash-moe-inference has been scanned by the security audit providers tracked on SkillJury. Check the security audits section on this page for detailed results from Gen Agent Trust Hub, Socket.dev, and Snyk.

What are alternatives to flash-moe-inference?

Skills in the same category include grimoire-morpho-blue, conversation-memory, second-brain-ingest, zai-tts.

How do I install flash-moe-inference?

Run the following command to install flash-moe-inference: npx skills add https://github.com/aradotso/trending-skills --skill flash-moe-inference
