MeetEase: AI-Native Video Conferencing

Web App (completed)

Nov 2025, 7 weeks

0 env vars: the app deploys and loads on Vercel with zero environment variables set. Clerk, Stream and Anthropic each fail soft and independently, so adding keys lights features up instead of flipping one big switch.

Overview

Most video apps treat AI like an afterthought — a summary email that lands an hour later. MeetEase flips that: transcription and the copilot aren't two features, they're two ends of one pipe. The hard part was never the model; it was making the plumbing feel instant, and making the app stay up even when its external services aren't configured.

Tech Stack

framework

Next.js 14 (App Router)

video

Stream Video SDK (WebRTC)

ai

Anthropic Claude (streamed)

transcription

Web Speech API

auth

Clerk (graceful degradation)

Challenges

Clerk validates its publishable key at the edge the moment it's invoked — a fresh deploy with no keys 500s before any page renders.
Awaiting the whole Claude response feels frozen for seconds; the copilot needed to start answering before the model finished.
Live transcription and the AI assistant started as two components with faked data — two demos, not one system.
Windows' default filesystem is case-insensitive and the Linux filesystem on Vercel is not, so a reference to /icons/video.svg whose casing didn't match the file on disk worked locally and 404'd in production.
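
The casing bug in the last item is cheap to catch before deploy. A minimal sketch, assuming a build step can list the files under the public directory; `findCaseMismatch` and the sample paths are hypothetical and not taken from the project:

```typescript
// Hypothetical helper: reports the on-disk path when a requested asset path
// matches a real file only if letter case is ignored. That is the mismatch
// that works on a case-insensitive filesystem and 404s on a case-sensitive one.
function findCaseMismatch(
  requested: string,
  actualFiles: readonly string[],
): string | null {
  const lowered = requested.toLowerCase();
  let nearMiss: string | null = null;
  for (const file of actualFiles) {
    if (file === requested) return null; // exact match: safe everywhere
    if (file.toLowerCase() === lowered) nearMiss = file;
  }
  return nearMiss; // null also means "no such file at all"
}

// Sample data, made up for illustration.
const filesOnDisk = ["/icons/Video.svg", "/icons/mic.svg"];
console.log(findCaseMismatch("/icons/video.svg", filesOnDisk)); // /icons/Video.svg
console.log(findCaseMismatch("/icons/mic.svg", filesOnDisk)); // null
```

Running a check like this over every asset reference in CI would surface the problem on any operating system, not just on the Linux deploy target.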

Solution

Every external dependency fails soft and independently. The Clerk handler is built only when a key exists and is wrapped in a try/catch that falls back to NextResponse.next(), so an auth misconfiguration cannot take the site down. The /api/ai route pipes Anthropic's SSE stream through a ReadableStream transformer that parses each event and re-emits only the text deltas, so tokens render as they arrive (with an honest offline fallback when there's no key). The real unlock was unifying transcription and the copilot behind one MeetingIntelligence context: the microphone writes to the exact state the copilot reads, turning "two demos" into "one pipeline."
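
The delta filtering can be sketched as follows. This is an illustration rather than the project's actual route: `TextDeltaParser` and `textDeltaStream` are hypothetical names, and the sketch assumes Anthropic's documented streaming format, where text arrives in `content_block_delta` events whose `delta.type` is `text_delta`, with events separated by a blank line and `\n` line endings.

```typescript
// Pulls the text out of one raw SSE event, or returns null if the event
// carries no text delta (pings, message_start, malformed JSON, and so on).
function extractText(rawEvent: string): string | null {
  for (const line of rawEvent.split("\n")) {
    if (line.indexOf("data:") !== 0) continue;
    try {
      const payload = JSON.parse(line.slice(5).trim());
      if (
        payload.type === "content_block_delta" &&
        payload.delta &&
        payload.delta.type === "text_delta"
      ) {
        return String(payload.delta.text);
      }
    } catch {
      // Malformed JSON: skip this event rather than break the stream.
    }
  }
  return null;
}

// Incremental parser: feed it text chunks of any size. It returns the deltas
// from every event completed so far and buffers a trailing partial event.
class TextDeltaParser {
  private buffer = "";

  push(chunk: string): string[] {
    this.buffer += chunk;
    const deltas: string[] = [];
    let boundary = this.buffer.indexOf("\n\n"); // blank line ends an event
    while (boundary !== -1) {
      const text = extractText(this.buffer.slice(0, boundary));
      this.buffer = this.buffer.slice(boundary + 2);
      if (text !== null) deltas.push(text);
      boundary = this.buffer.indexOf("\n\n");
    }
    return deltas;
  }
}

// Wraps the parser in a standard TransformStream: SSE bytes in, text bytes out.
function textDeltaStream(): TransformStream<Uint8Array, Uint8Array> {
  const parser = new TextDeltaParser();
  const decoder = new TextDecoder();
  const encoder = new TextEncoder();
  return new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      // stream: true keeps multi-byte characters split across chunks intact.
      for (const text of parser.push(decoder.decode(chunk, { stream: true }))) {
        controller.enqueue(encoder.encode(text));
      }
    },
  });
}

// Demo: one event split mid-way to mimic a network chunk boundary.
const demoParser = new TextDeltaParser();
const demoEvent =
  "event: content_block_delta\n" +
  'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}\n\n';
console.log(demoParser.push(demoEvent.slice(0, 50))); // empty: event incomplete
console.log(demoParser.push(demoEvent.slice(50))); // one delta: "Hello"
```

In a route handler, the upstream response body could be piped through it with `upstream.body.pipeThrough(textDeltaStream())` and returned as the body of the outgoing response. Buffering until the blank line matters because a network chunk can end in the middle of an event.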

Outcome

Ask the copilot to summarize or pull action items mid-call and it answers from the actual words spoken, streamed token by token. Guest "explore mode" means no login wall, and the whole thing is genuinely impossible to crash via missing config — which is why it deploys and loads on Vercel with no environment variables at all.
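
The "missing config cannot crash it" property comes down to one pattern applied per integration. A minimal sketch, using plain functions instead of the real Clerk and Next.js types so it runs on its own; `failSoft`, `strictAuth`, and `passThrough` are hypothetical stand-ins, with `passThrough` playing the role of `NextResponse.next()`:

```typescript
type Handler<Req, Res> = (req: Req) => Res;

// Builds the real handler only when `key` is present, and wraps it so a throw
// at build time or call time falls back instead of surfacing as a 500.
// Assumption: handlers here are synchronous. An async handler would need to be
// awaited inside the try block so that rejections are caught as well.
function failSoft<Req, Res>(
  key: string | undefined,
  build: (key: string) => Handler<Req, Res>,
  fallback: Handler<Req, Res>,
): Handler<Req, Res> {
  if (!key) return fallback; // no key: the integration is simply off

  let built: Handler<Req, Res> | null = null;
  try {
    built = build(key);
  } catch {
    built = null; // e.g. the SDK rejects a malformed key during setup
  }
  if (built === null) return fallback;

  const handler = built;
  return (req) => {
    try {
      return handler(req);
    } catch {
      return fallback(req);
    }
  };
}

// Stand-in for an auth SDK that validates its key eagerly.
const strictAuth = (key: string): Handler<string, string> => {
  if (key.indexOf("pk_") !== 0) throw new Error("invalid publishable key");
  return (path) => "auth-checked:" + path;
};

// Stand-in for "let the request through untouched".
const passThrough: Handler<string, string> = () => "next";

console.log(failSoft<string, string>(undefined, strictAuth, passThrough)("/")); // next
console.log(failSoft<string, string>("pk_test_demo", strictAuth, passThrough)("/room")); // auth-checked:/room
```

Because each integration gets its own wrapper and its own key check, a bad Clerk key has no effect on whether video or the copilot come up.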

What I'd do differently

The /api/ai route has no rate limiting yet — I'd protect it before it ever sees real traffic. Transcripts aren't persisted and speaker diarization is client-side only; server-side diarization and a stored transcript history are the obvious next steps.
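
As a starting point for the missing rate limiting, here is a sketch of a fixed-window limiter. It is an assumption about one possible approach, not something in the codebase; `FixedWindowLimiter` and its parameters are hypothetical. In-memory state like this lives in a single process, so on a serverless platform the counters would need a shared store to be enforced across instances.

```typescript
type WindowState = { start: number; count: number };

// Allows at most `limit` requests per `windowMs` milliseconds for each client.
// Assumes limit >= 1.
class FixedWindowLimiter {
  private windows: Record<string, WindowState> = Object.create(null);
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request may proceed. `now` is injectable for tests.
  allow(clientId: string, now: number = Date.now()): boolean {
    const current = this.windows[clientId];
    if (!current || now - current.start >= this.windowMs) {
      // First request from this client, or the previous window has expired.
      this.windows[clientId] = { start: now, count: 1 };
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Two requests per second per client; timestamps are passed in explicitly.
const demoLimiter = new FixedWindowLimiter(2, 1000);
console.log(demoLimiter.allow("guest-1", 0)); // true
console.log(demoLimiter.allow("guest-1", 10)); // true
console.log(demoLimiter.allow("guest-1", 20)); // false
console.log(demoLimiter.allow("guest-1", 1000)); // true (new window)
```

A route could call `allow(clientId)` before contacting the model and return a 429 status when it yields false. For guests in explore mode the client identifier would have to come from something other than a user ID, such as the request IP.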

Built with

Next.js, TypeScript, Stream Video SDK, Clerk, Anthropic Claude, Web Speech API, Tailwind