AI Tool That Summarizes and Tags Videos 36x Faster
Built an AI-powered tool for SWARM Community that summarizes, quotes, and tags YouTube event recordings reducing processing time by 97% and saving ~59 hours/month.

The Challenge
A small Startup with a BIG problem
At SWARM Community, we host numerous online events with technical professionals from diverse fields. These recordings are uploaded to our YouTube channel, but due to time constraints and limited resources, most videos are not summarized, tagged, or indexed. This makes it hard for community members, and even our team to find relevant content after events end.

Watching the recording of the event after it ended and writing descriptions and extracting important quotes and figuring out tags is a time consuming task that no one is interested in
So many videos uploaded on SWARM's YouTube channel without being tagged and having a description.
Customers asking about the previous events videos and can't find them on Youtube even with searching by the related keywords to the videos content.
Takes 2 - 3 hours per video

Low discoverability
Low Views

Not accessible

The Opportunity
How might we make hours of valuable educational YouTube content from expert events actually usable without adding manual labor?
As the AI Experience and Innovation Manager, I identified an opportunity to leverage AI to automate this process and reach the following goals:

Our goal for the team:

Our goal for the business:

Save time on manual tagging and summarization
Increase discoverability of video content on YouTube
Monetizing the YouTube channel
Support accessibility and knowledge-sharing across our customers and our global community
Our goal for our Users:
My Process Overview
I owned design, backend, frontend, and every prompt in between
I worked solo to build and deploy a fully functioning summarization tool in just one week.
* Mapped internal pain points and video workflow
* Tested multiple AI models to evaluate summary quality
* Designed backend data pipeline and prompt structure
* Prototyped UX/UI in Figma for simplicity and speed
* Coded and connected Python backend to Claude API
* Built and styled frontend, connected endpoints, tested with live data
I've faced so many backend errors, API problems and low quality summaries and tags during the process that I was getting disappointed on getting it done.
The Solution
A one-click tool that outputs quotes, tags, and summaries
AT the end, I built the Smart Video Summarizer for the SWARM Community to automate the process of extracting insights from event recordings. The backend pipeline includes:
Accepting a YouTube video URL
Downloading the audio using yt-dlp
Transcribing the audio to text using OpenAI’s Whisper
Sending the transcript to Anthropic’s Claude 4 Sonnet model API to:
Summarize the content
Extract 3–5 insightful quotes
Generate 5–7 topic tags
Result & Impact
Saved 59+ hours per month
Cut per-video processing time from 3 hours → under 5 minutes
Saved ~59 hours/month of manual effort
Reduced human error and improved consistency
Enabled the customers to rediscover and reuse content previously buried
Key Takeaways
Small tools can make BIG IMPACT
Clarity over complexity: A focused use case meant I could build fast and ship faster
Prompting = UX: Structuring AI outputs is part of the user experience — not just a dev task
Design + Dev made me faster: Owning the full stack meant I could problem-solve immediately
Personal InsightThis wasn’t just about building with Claude, it was about designing usefulness in a real-world, content-heavy workflow. The payoff came in hours saved, not features added.
What's Next?
What's Next put something else here
This project helped me build confidence in ML engineering and API orchestration. It challenged me to work across audio, transcription, and natural language generation. I learned to debug complex toolchains and design prompts that produce structured, reusable outputs.
Next steps:
Design a conversational AI tool that transforms expert video content into personalized learning experiences, adapting to individual preferences, styles, and accessibility needs.
Let's create future experiences together :)
Copyright © 2025 Mahnaz H.