Here’s how I created this overview video in 5 minutes with just a single prompt.
Get Content
Gather the content for the overview video you want to create. The content can be local files (PDFs, text files, etc), copied text, and website URLs. In my case, I got these URLs to pages explaining how credit cards work:
In the left pane of NotebookLM, add your source content.
Add a Prompt
In the middle pane, add a prompt describing what you want NotebookLM to do. In my example, I asked ChatGPT to give me a prompt to tell NotebookLM to generate an overview video of the content in my sources, which I then pasted into NotebookLM.
Generate Overview Video
In the right pane, click “Video Overview” to have NotebookLM generate an overview video based on the content and your prompt. My 4-minute video was generated in a few minutes.
This is not a “build an app in 10 minutes” post. It’s a realistic guide to using AI coding agents the way you’d use a junior engineer: with specs, guardrails, and review.
In this post, I explain how to vibe code a web app using the following tools:
Render.com (for hosting both frontend and backend)
Stripe (for payments)
3rd-party APIs (like kie.ai for text-to-image AI generation)
VS Code (code editor)
Resend (send emails using an API)
ImageKit (image optimization)
Note
You can use Next.js for both front-end and backend code using React and Next.js server actions, but for simplicity, I prefer to use static HTML, CSS, and JS for the front-end and Express.js for the backend. Also, you can use TypeScript, but for simplicity, I prefer not to. I think these choices are fine for small apps. They allow for faster vibe coding and simplify the codebase, which reduces the surface area for errors, both during development and in production. If I used Next.js, I would host the app on Vercel, but since I’m not, I host both the static frontend and the Express.js backend/app server on Render.com, which simplifies deployment (push from GitHub, no CORS issues, free plan).
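To make that concrete, here’s a minimal sketch of what the Express.js app server could look like, assuming the static frontend lives in a public folder and the API routes sit under /api (the file and route names are illustrative, not prescriptive):

```javascript
// Minimal sketch: Express serves the static frontend and exposes JSON API routes
// on the same origin, so there are no CORS issues.
const express = require('express');
const path = require('path');

const app = express();
app.use(express.json());

// Serve the static HTML/CSS/JS frontend from /public
app.use(express.static(path.join(__dirname, 'public')));

// Example API route the frontend can call
app.get('/api/health', (req, res) => {
  res.json({ ok: true });
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Server listening on port ${PORT}`));
```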
There are many, many, many ways to code/vibe code a web app, including using Claude Code, Cursor, Continue.dev, Gemini Antigravity, Gemini Conductor, and Agent Zero. The approach in this post is just the way I’ve found to work for me for now.
App Idea
First, you need an app idea. Consult with ChatGPT regarding the business strategy, pros, cons, and anything else you have on your mind. If you are a solopreneur, it’s probably best to keep your app simple enough that you can manage it yourself. For example, you can create an app that lets people upload an image of the front of their home and redesign their curb appeal by changing exterior paint color, yard design, and driveway design using an AI text-to-image generator like Nano Banana Pro.
User Flow & Functionality
Discuss the user flow and app functionality with ChatGPT until you decide on the details that you want for your app.
Tech Stack
Discuss with ChatGPT what tech stack you should have for your app. In my case, I prefer the stack listed above.
UI
Once you’ve decided on an app idea, ask ChatGPT to give you basic grayscale page designs using shadcn/ui and Tailwind UI components/blocks. You can try to use Google Stitch to generate more polished designs, but if you can’t, just start with a basic grayscale design first. ChatGPT can give you HTML + Tailwind CSS for these designs. For dev purposes only, just use the Tailwind CSS CDN to save time. Here are some example basic designs. Before going live, switch to compiled CSS using Tailwind.
Unique Interactive Elements
To minimize errors and guessing when the AI coding agent codes your app, have ChatGPT update all interactive elements (buttons, links, etc.) with unique attributes so that the AI coding agent can uniquely target each element. For example, for buttons, have ChatGPT add a data-action attribute, e.g., <button data-action="upload-image" …>
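In the frontend JS, those attributes let the agent wire up behavior with one delegated listener instead of guessing at selectors. A minimal sketch (the handler names and the second action are hypothetical):

```javascript
// One delegated click handler keyed on data-action, so each interactive
// element can be targeted unambiguously.
document.addEventListener('click', (event) => {
  const el = event.target.closest('[data-action]');
  if (!el) return;

  switch (el.dataset.action) {
    case 'upload-image':      // matches <button data-action="upload-image">
      // handleImageUpload(); // hypothetical handler
      break;
    case 'generate-design':   // hypothetical second action
      // handleGenerateDesign();
      break;
  }
});
```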
Database
If your project requires a database, ask ChatGPT to propose a database schema. If you agree with the schema, have ChatGPT give you a SQL script to create all tables and constraints. Save the script in a file like 001_init.sql. For the database, I prefer Supabase PostgreSQL for its simplicity.
Create an account at Supabase.com.
Create a project
Click SQL editor
Paste the SQL script and run it to create the database tables.
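Once the tables exist, the Express backend can talk to them with the Supabase JS client. Here’s a minimal sketch, assuming a hypothetical designs table from 001_init.sql and the standard Supabase environment variables:

```javascript
// Minimal sketch of backend database access with the Supabase JS client.
// The "designs" table and its columns are hypothetical examples.
const { createClient } = require('@supabase/supabase-js');

const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_SERVICE_ROLE_KEY // server-only key; never expose it to the frontend
);

async function saveDesign(userId, imageUrl) {
  const { data, error } = await supabase
    .from('designs') // hypothetical table defined in 001_init.sql
    .insert({ user_id: userId, image_url: imageUrl })
    .select()
    .single();

  if (error) throw error;
  return data;
}
```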
File Storage
If you need to store files online, like PDFs or images, there are many options, like AWS S3 and Cloudflare R2 (similar to S3 but cheaper). If you’re already using Supabase for your database, you can just use Supabase Storage to store files.
Click “Storage”
Click “New Bucket”
Enter a bucket name
Make sure to make the bucket public if your users or app code needs to access files in it
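From the backend, uploading to the bucket looks something like this minimal sketch (the bucket name “uploads” and the file path are placeholders):

```javascript
// Minimal sketch of uploading an image buffer to Supabase Storage and
// getting its public URL (assumes the bucket was created as public).
const { createClient } = require('@supabase/supabase-js');

const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_SERVICE_ROLE_KEY);

async function uploadImage(fileBuffer, fileName) {
  const { error } = await supabase.storage
    .from('uploads') // placeholder bucket name
    .upload(`images/${fileName}`, fileBuffer, { contentType: 'image/jpeg' });

  if (error) throw error;

  const { data } = supabase.storage.from('uploads').getPublicUrl(`images/${fileName}`);
  return data.publicUrl; // store this URL in the database or return it to the frontend
}
```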
Authentication
Discuss with ChatGPT the best way to handle authentication and user account management. If you need the usual auth with logins, you can use Supabase Auth. But, if you just need a simpler auth mechanism, like using access tokens, you can do that as well. Discuss the pros and cons of each option with ChatGPT and pick one.
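If you go the Supabase Auth route, the “usual auth” looks roughly like this in the static frontend’s JS. This is only a sketch, assuming the supabase-js CDN bundle is loaded and using placeholder project values:

```javascript
// Minimal sketch of email/password auth with Supabase Auth in the frontend.
const SUPABASE_URL = 'https://your-project.supabase.co'; // placeholder
const SUPABASE_ANON_KEY = 'your-public-anon-key';        // placeholder

const supabase = window.supabase.createClient(SUPABASE_URL, SUPABASE_ANON_KEY);

async function signUp(email, password) {
  const { data, error } = await supabase.auth.signUp({ email, password });
  if (error) throw error;
  return data.user;
}

async function signIn(email, password) {
  const { data, error } = await supabase.auth.signInWithPassword({ email, password });
  if (error) throw error;
  return data.session; // the session's access token can be sent with API requests
}
```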
Payments
If you will be accepting payments in your web app, you’ll need a payment processor. Everyone seems to use Stripe for this. It has a test mode, which simplifies testing.
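For one-time payments, a common pattern is a backend route that creates a Stripe Checkout session and hands the frontend a URL to redirect to. A minimal sketch (the price ID and URLs are placeholders; test keys work the same way):

```javascript
// Minimal sketch of a Stripe Checkout route in the Express backend.
const express = require('express');
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY);

const app = express();
app.use(express.json());

app.post('/api/create-checkout-session', async (req, res) => {
  try {
    const session = await stripe.checkout.sessions.create({
      mode: 'payment',
      line_items: [{ price: 'price_XXXX', quantity: 1 }], // placeholder price ID from your Stripe dashboard
      success_url: 'https://your-app.onrender.com/success.html', // placeholder URLs
      cancel_url: 'https://your-app.onrender.com/cancel.html',
    });
    res.json({ url: session.url }); // frontend redirects the user to this URL
  } catch (err) {
    console.error('Stripe checkout error:', err);
    res.status(500).json({ error: err.message });
  }
});

app.listen(process.env.PORT || 3000);
```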
Emails
Your app will most likely need to send emails. There are many popular 3rd-party API-based email service providers, including Resend, Postmark, and Mailgun. I personally prefer Resend for its simplicity.
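With Resend, sending an email from the backend is only a few lines. A minimal sketch (the sender address is a placeholder and must be on a domain you’ve verified in Resend):

```javascript
// Minimal sketch of sending a transactional email with the Resend Node SDK.
const { Resend } = require('resend');

const resend = new Resend(process.env.RESEND_API_KEY);

async function sendWelcomeEmail(to) {
  const { data, error } = await resend.emails.send({
    from: 'Curb Appeal AI <hello@yourdomain.com>', // placeholder sender address
    to,
    subject: 'Welcome!',
    html: '<p>Thanks for signing up.</p>',
  });
  if (error) throw error;
  return data; // contains the email ID
}
```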
Image Optimization
If you use Next.js, it can automatically optimize images for you. If you don’t use Next.js, or if you prefer to separate the image optimization process, there are many options available, including ImageKit, Cloudinary, and TwicPics. I personally prefer ImageKit. You can still host your images in Supabase Storage or S3 and use them as a custom origin in ImageKit.
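With Supabase Storage or S3 set up as a custom origin in ImageKit, the frontend just requests images through your ImageKit URL endpoint and adds transformation parameters. A minimal sketch (the endpoint ID, image path, and element ID are placeholders):

```javascript
// Minimal sketch: build an ImageKit URL that resizes and compresses on the fly.
const IMAGEKIT_ENDPOINT = 'https://ik.imagekit.io/your_imagekit_id'; // placeholder endpoint

function optimizedImageUrl(path, width) {
  // tr=w-<width>,q-80 asks ImageKit for a resized, compressed version
  return `${IMAGEKIT_ENDPOINT}/${path}?tr=w-${width},q-80`;
}

// Example usage in the static frontend:
document.querySelector('#preview').src = optimizedImageUrl('designs/before.jpg', 800);
```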
3rd-party APIs
If you’ll be dependent on external APIs, like a text-to-image generator using Nano Banana Pro, decide on which APIs to use. I’ve been using KIE because the UI is simple and it’s cheaper than fal.
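Whichever provider you pick, the backend call usually boils down to an authenticated POST with a prompt and an optional reference image. The sketch below is purely illustrative; the endpoint and request/response fields are placeholders, not KIE’s actual API, so copy the real formats into AGENTS.md (see the Data Formats section below):

```javascript
// Illustrative sketch of calling a third-party text-to-image API from the backend.
// The endpoint and body fields are placeholders; use the provider's real docs.
async function generateImage(prompt, referenceImageUrl) {
  const response = await fetch('https://api.example-provider.com/v1/generate', { // placeholder endpoint
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.KIE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ prompt, image_url: referenceImageUrl }), // placeholder fields
  });

  if (!response.ok) throw new Error(`Image API error: ${response.status}`);
  return response.json(); // placeholder: the real response shape belongs in AGENTS.md
}
```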
Project Folder
On your computer, create a folder for your project, e.g., curb-appeal-ai. Add the SQL script to it in a folder called “sql”, e.g., curb-appeal-ai/sql/001_init.sql
Environment Variables
Ask ChatGPT for a list of all environment variables you’ll need for the app, e.g.,
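For the stack in this post, the list would look something like the following (names are illustrative; use whatever you and ChatGPT settle on for your actual providers):

```
SUPABASE_URL=
SUPABASE_ANON_KEY=
SUPABASE_SERVICE_ROLE_KEY=
STRIPE_SECRET_KEY=
RESEND_API_KEY=
KIE_API_KEY=
IMAGEKIT_URL_ENDPOINT=
PORT=
```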
Create a .env.example file and a .env file in your project folder. Create accounts with each 3rd-party provider (Supabase, KIE, Stripe, etc.), then add the values for each required variable to your .env file (not your .env.example file).
.gitignore
Create a .gitignore file in your project folder and add any files/folders you don’t want to commit, e.g.,
node_modules
.env
.DS_Store
AGENTS.md
Have ChatGPT create an AGENTS.md file in the project folder, following the format at AGENTS.md, containing all the coding guidelines the AI coding agent will need to successfully code the app without any guessing. Ask ChatGPT whether any part of the guidelines and initial setup would require the coding agent to guess/infer anything while coding. If ChatGPT identifies anything, resolve it now and add the necessary clarifications to AGENTS.md.
Data Formats
Different APIs have different request and response formats. For all APIs you’ll be using, have ChatGPT add each format, e.g., as a JSON schema, to the AGENTS.md file so the AI coding agent won’t have to guess at the structure.
Logging
Tell ChatGPT to add a logging specification to AGENTS.md so the AI coding agent will log EVERYTHING to the terminal, which will help significantly with debugging.
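For reference, the spec can be as simple as a convention like the sketch below, which the agent then applies everywhere (the helper and context tags are illustrative):

```javascript
// Illustrative logging helper: every log line gets a timestamp and a context tag,
// which makes it much easier to trace requests and errors in the terminal.
function log(context, message, data = {}) {
  console.log(`[${new Date().toISOString()}] [${context}] ${message}`, JSON.stringify(data));
}

// Example usage in a route handler:
// log('checkout', 'Creating Stripe session', { userId });
```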
Initialize your project
Open your project folder in VS Code
Open a terminal
git init
git add .
git commit -m "first commit"
OpenCode
Follow the instructions at OpenCode to install OpenCode. Or, just ask ChatGPT how to install OpenCode.
I like to run OpenCode in its own VS Code terminal, beside other terminals I use for other purposes, like starting a dev server. In the screenshot below, you can see I have a terminal window for OpenCode and one for Node. I like to have my file explorer on the left, code editor in the middle, and terminals on the right.
AI Model
To use OpenCode, you need to connect to a model. You can use the OpenRouter API with OpenCode to easily switch between models like Claude, Gemini, and GPT Codex. However, you’ll have to buy credits and pay per use. A cheaper solution is to use a ChatGPT Plus subscription, which is what I’ve been doing.
OpenCode + OpenRouter
You can use many different AI models with OpenCode; however, you’ll probably want to stick with the best models for web development, which are ranked here. Currently, Claude Opus 4.5 is the best model, but it’s expensive. I’ve been using GPT 5.2 Codex, which is good and cheaper than Opus. Gemini 3 is probably just as good. Instead of creating an account with each LLM provider, it’s easier to just create an account with OpenRouter, which lets you pay in one place and use a single API key while giving you access to all of the models it supports.
In VS Code,
open a terminal
type “opencode” to launch the OpenCode TUI (Terminal User Interface)
type “/connect” to connect to an LLM provider
type “openrouter” to filter the list and click on “OpenRouter”
copy/paste your API key from OpenRouter and hit Enter
type “/models” to view the list of available models
search/browse for the model you want and click on it
OpenCode has 2 modes: Plan and Build. Press the Tab key to switch between modes.
The Plan mode reads your codebase, creates an AGENTS.md file if you don’t already have one, asks questions, and creates a coding plan. It doesn’t modify any files.
The Build mode is for coding.
If you enter a prompt in OpenCode and you get a “User not found” error, then you pasted your API key incorrectly. Re-enter your API key and try again.
OpenCode + ChatGPT Plus/Pro
To connect OpenCode to a ChatGPT Plus web subscription, type /connect to open the “Connect a provider” modal. Search for “OpenAI (ChatGPT Plus/Pro or API key)” and click it. A link will appear. Open the link in a browser and log in to your ChatGPT account. That will connect OpenCode to your ChatGPT Plus/Pro web subscription account. I find this to be much cheaper than buying credits.
Plan
Switch to Plan mode by hitting the Tab key and ask the agent to guide you through planning and development. In Plan mode, the agent can’t edit any files. Whenever you start a new OpenCode session, always tell the agent to read through AGENTS.md and, optionally, to audit the codebase. You can say things like
Read the AGENTS.md file and all files in the repo to get up to speed on what this app is about and what has been done
Let’s build this app one step at a time following a typical user flow, e.g., 1) sign up, 2) log in, 3) use app, etc
The “Forgot Password” feature gives an error. See attached screenshot. Please debug and propose a fix.
Modularize all header and footer HTML into partials using EJS
Build
When you are ready to have OpenCode write code, switch to Build mode. You can say things like
implement the plan you just described in Plan mode
Test & Debug
When OpenCode is done coding, you will need to test your app. I like to use the following workflow:
1. Test the functionality in the browser.
2. Switch to Plan mode.
3. If there’s an error, tell the agent. Optionally include a screenshot.
4. The agent explains the error.
5. If the error is not related to code, e.g., a Supabase config error, the agent tells me what to fix myself.
6. If the error is a minor coding error, I’ll just fix it myself.
7. If the error is more than a simple one-line fix, I switch to Build mode and have the agent fix the error.
8. Once the fix has been made, I test the functionality again.
9. If the functionality is fixed, I switch to Plan mode, ask the agent to give me a commit message, and commit the code changes. Otherwise, I repeat from step 3 until the bug is fixed.
In this post, I’ll show how to easily create explainer videos using Google Slides and Google Vids. Here’s an example explainer video I made using these tools.
Create Your Slides
Go to Google Slides and create a slide for anything you want to show visually in your video. A slide can contain a background, text, images, videos, and more. When creating your slides, write your speaking script in the notes section below each slide preview. In my case, I took the slide design from here.
Import Slides into Google Vids
Click the “Convert to Video” button. This will open the presentation in Google Vids.
Choose the slides you want to import. You can also include AI voiceover, script, and background music. If you do that, Google Vids will use AI to create a voiceover script for you. If you want to use the notes you wrote in Google Slides, turn off this toggle.
Google Vids only supports videos up to 10 minutes long. If you have many slides with scripts that, once the voiceovers are generated, will result in a video exceeding 10 minutes, you will need to delete some slides and voiceovers. If this happens, convert your slides in batches, e.g., 20 or 30 at a time. When selecting which slides to import, you can click one slide, hold the “shift” key, then click the last slide to select all slides in between.
Edit the Video
Now that you’re in Google Vids, you can edit the video. You will see a video timeline at the bottom.
Move the playhead along the timeline to different sections of the video.
Click the “Play” button to preview the video.
Click “Voiceover” on the right to edit the script for each slide you imported, if needed. Here, you can change the narrator’s voice as well. At this time, Google Vids only supports English, Spanish, Portuguese, Japanese, Korean, French, Italian, and German. If you want a voiceover in a different language, you’ll have to import the audio or record your own voice.
If you want to edit the background music,
click on the background music track,
click the speaker icon to the right of the waveform icon
You will see the Sound panel. Click the “All tracks” tab. You will then see the background music and narration audio for each scene. Clicking on each one will allow you to edit the audio for each scene individually. You can also click the “Apply to all audio” button to have your changes apply to all scenes.
Export Video
When you’re done making your changes, click “Share > Download as MP4” to generate and download the video.
Note
In the example video I created, I used HeyGen to create a lip-sync video of the narrator speaking. Google Vids doesn’t offer this feature, and you don’t need it to create explainer videos. If your video is in a language that is unsupported in Google Vids, e.g., Indonesian, and you want to create a lip-sync video, then you can simply paste the script into Heygen, preview the voiceover, and then generate the lip-sync video. HeyGen only charges for video generation, not text-to-speech previews. For a female voice in Indonesian, for example, I chose “Peyton” (voice ID = 6dd171c356f94a138cdbb5bd11ea8ee8) with “Original accent”.
If you only want to lip-sync some scripts, then it’ll be cheaper to paste those scripts in ElevenLabs. For an Indonesian text-to-speech, I chose Hannah’s voice, using the “Eleven Multilingual V2” model.
You can also clone your own voice in HeyGen and use it to create a lip-sync video.
The Christmas season is here, and you may finally decide to put up some lights on your house. I personally like single-color, soft-white lights rather than multi-colored lights. Regardless of your color preferences, installing string lights can be difficult depending on your particular situation.
Though my string lights have holes every 2 feet for fastening them, those holes don’t line up with the rafters under the eaves of my house, so I couldn’t use them all.
Regardless of the hole distance, one way to securely fasten any string lights, or cable, for that matter, is by using zip ties (I prefer the releasable kind) with a base mount.
Zip tie base mount
Releasable zip ties
Just screw the base mount to your structure such that the slot where the zip tie would go in is in the direction you need it to be in.
Then, slide the zip tie through the base mount and around your string light cable.
This doesn’t just work for string lights. You can fasten other types of cables and even multiple cables. Unlike plastic cable clamps, which can quickly deteriorate and crack from UV exposure, zip ties can last a long time. Also, unlike cable clamps, which don’t grab cables tightly, resulting in some slack, zip ties can be pulled until the cable they’re securing is tight. This is important if you want your string lights to be straight.
If you’ve bought glue that uses a caulking gun, you’ve probably run into situations where you’ve wasted a lot of glue because it dried and clogged the nozzle. You’ve probably tried sticking a screw or nail in the tip to keep the glue from drying, only to find that that didn’t work either. You may have even tried covering the tip with tape, which also probably didn’t work. So, you ended up throwing away more than 50% of the glue just because the tip was clogged. Well, fortunately, that won’t happen if you buy glue that has a removable nozzle, as shown below. When I’m done using some glue, I’ll wrap the tip in duct tape so that glue doesn’t slowly ooze out, and then I’ll put it away for future use.
When I need to use the glue again, I’ll use a utility knife to remove the tape and then see if any glue comes out. Normally, the nozzle will be clogged. I’ll then use pliers to unscrew and remove the nozzle. I’ll stick the seal-puncture rod on the caulking gun through the nozzle, which easily removes any dried-up glue. With the nozzle clear, I just screw it back on and I’m good to go.
I like to make travel videos, but I don’t like to hold a camera or make it obvious that I’m filming people around me. I’ve been using the Insta360 X Series cameras with this magnetic chest mount, but it’s very heavy, bulky, and uncomfortable to wear.
Fortunately, DJI recently came out with their own 360 camera, and it’s lighter and more compact.
Just wear the magnetic chest mount necklace, placing the magnetic mount under your shirt. Then, screw the complementary magnetic mount to the camera and rotate the round magnetic surface to mount the camera as shown above.
Since the camera takes 360-degree videos, you can adjust the angle in post-production, so you don’t need to worry if the lens is facing down, up, or to the side.
With the Snapshot feature, filming is as easy as clicking a button. With the camera off, just press either the record or function button to turn the camera on and start filming. To stop and turn off the camera, just press either button again.
I like to have the screen facing my shirt so no one can see that I’m filming them. You can have the screen turn off automatically, but it will take at least 3 seconds before it turns off.
With this setup, discreet POV filming is super simple and hands-free. My workflow is
When I’m ready to start filming, e.g., while boarding a plane, I press the record button. The camera turns on and starts filming in 360.
When I’m ready to stop filming, I press the record button again. The camera stops filming and turns off.
When I’m done filming for the day, I transfer the 360 videos to my laptop using a USB cable and I import the videos into the DJI Studio app for post-production (changing angles, trimming, converting to a flat video, etc).
Lastly, I import the flat videos into a video editing app like CapCut to combine the footage with other footage, like footage from holding my Insta360 X5 on a selfie stick.
Let’s say you have the rights to a song, e.g., a song that’s in the public domain, and you want to create a cover for it by only replacing the lyrics. Here’s how you can do it using ChatGPT and Suno AI.
Upload the original song to Suno
For this example, I uploaded this song.
Suno will add your uploaded song to your workspace, as shown below. When you click on the song’s title, you’ll see an auto-generated style description of the song and the lyrics.
Song Style
A French pop song with a moderate tempo and a romantic, dreamy atmosphere, The instrumentation features a prominent acoustic guitar playing arpeggiated chords, a bass guitar providing a smooth, walking bass line, and a drum kit with a soft, brushed snare sound, Synthesizers contribute to the ethereal quality with pad sounds and occasional melodic lines, The female lead vocalist sings with a soft, breathy tone, employing a gentle vibrato, The song structure follows a verse-chorus pattern with a bridge, The chord progression is primarily diatonic, creating a sense of warmth and familiarity, Reverb is applied generously to the vocals and some instrumental elements, enhancing the dreamy quality, The overall mix is balanced, with the vocals sitting clearly in the foreground
To create new lyrics, the easiest way is to ask ChatGPT to do it for you. For example, here’s a prompt I gave ChatGPT to generate new lyrics that meet my specific criteria:
I want to create a cover of a French song. The lyrics are below. Give me English lyrics that have a similar theme and can be used as a drop-in replacement for the French lyrics so I can use the Suno cover feature. If the French song has a rhyming pattern, the new lyrics should rhyme as well. If possible, try to match the syllable count, but prioritize the flow and meaning of the lyrics over the syllable count.
[Verse 1] Quand tu me souris Mon cœur s’envole Tout devient doux Tout prend son rôle …
ChatGPT will then respond with the lyrics. Here’s what it gave me:
[Intro]
(Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa)
[Verse 1]
When you smile at me, My heart goes free, Everything glows, Warm like it knows, Your name rings on, Soft as a song, In dreams each night, You are my light.
[Chorus]
Your love lifts me higher each day, Like an angel learning to fly, Every touch sends my worries away, Under moonlight, you open the sky.
[Verse 2]
Your eyes shine bright, like summer air, Every night I find you there, Your soft voice brings me into dream, Calling you flows like a stream.
[Chorus]
Your love lifts me higher each day, When your hands are holding mine, Time stands still in your gentle way, In your heart, all my echoes align.
[Bridge]
Even when the morning shines, And the world moves on, Still your love stays in my mind, You’re the dream I lean upon.
[Chorus]
Your love lifts me higher each day, Every moment feels so true, I feel you— So high in love with you.
[Outro]
Floating through the sky (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa) (Para para pa)
Create cover song
Now that you have new lyrics, go back to Suno and click on the 3 dots next to the original song title, then click “Remix/Edit” > “Cover”
In the left pane, Suno will load the original song and lyrics. Replace the original lyrics with your new lyrics. In the “Style” field, paste the Suno-generated style description. Suno will also set some advanced options, like “weirdness”, “style influence”, and “audio influence”. You can keep the defaults.
Click “Create”. Suno will create 2 cover songs, as shown below.
Edit the cover song
After listening to the two cover songs, I like the second one more, but one section didn’t sound right. Some of the lines in verse 2 sounded rushed. To fix this, edit the song by clicking on “Open in Editor”.
With the editor open, you’ll see the song’s waveform, color-coded by section. Click on the section containing the lyrics you want to edit. In this example, that’s the pink section shown below. When you click on it, the lyrics for the section will be selected in the lyrics box on the left. You can then type revised lyrics in the “new lyrics” box below it. In this case, I made some of the lyrics shorter (fewer words).
Click the “Replace” button. Suno will generate two alternate versions of that section with the modified lyrics you provided. Click the play button beside each one to preview the alternate versions. If you don’t like either one, click “Regenerate” to generate more versions. When you like a version, click “Commit” to replace the section with the new section.
When you’re done editing, click “Save as new song”. The edited song will appear in your workspace.
You can then download the song.
Here’s the cover song in English. As you’ll hear, the backing instrumentals sound almost identical to the original French song, but the lyrics are new.
Recently, I needed some high-quality Mediterranean images. I tried searching stock photo libraries, but they were expensive, it took too long, and the images weren’t that good or what I was really looking for. I found several videos on YouTube that had the type of images I wanted, but I didn’t want to copy them exactly, so I used AI to create new images inspired by them. Here’s how I created them.
AI Model: SeeDream 4.0
Prompt: Create a Greek home using the same colors, lighting, and elements from the reference image, but it should look different from the reference image.
Reference Image:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image using the same colors, lighting, and elements from the reference image, but it should look different from the reference image. The perspective should be 45 degrees from the perspective of the reference image, facing the sea.
Reference Image:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same composition and view as the first one, but use colors and materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same composition and view as the first one, but use colors and materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same composition and view as the first one, but use colors and materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition and view as the first one, but use colors and materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition and view as the first one, but use colors and materials from the second one. Keep the blue and white tile in the first image as is.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
AI Model: SeeDream 4.0
Prompt: Create an image with the same, layout, composition, elements and view/framing as the first one, but use colors and building materials from the second one.
Reference Images:
AI-Generated Image:
I ended up using the images to create 5-second background videos using Kling AI for this music video:
There are many talking lipsync AI tools out there. The inputs are usually text and an image of a person. The results are almost indistinguishable from a non-AI talking video. But when it comes to lipsync videos involving singing, that’s a whole different story. Generating realistic singing lipsync videos is apparently very challenging. Tools like Kling AI and Runway ML, despite being very popular tools for video generation, do a horrible job at this. After trying a number of tools, the two best ones I’ve found are TopMediAI and HeyGen. In this post, I’ll share my experience using them.
UPDATE 12/19/2025: Longcat Avatar is a new option that is worth trying and comparing against.
UPDATE 12/8/2025: There’s a new singing lipsync generator called WaveSpeed MultiTalk (WAN 2.1). Preliminary testing indicates that, with respect to video quality, MultiTalk is better than TopMediAI but not as good as HeyGen. With respect to lipsync, MultiTalk is just as good as TopMediAI and better than HeyGen.
TopMediAI does a decent job at creating singing lipsync videos, and the interface is very simple and intuitive. Though it’s designed for singing, it’s far from perfect.
Inputs
upload an audio file (mp3) between 2 and 30 seconds
upload an image of the character you want to sing
When generating a video using TopMediAI, sometimes, generation will fail repeatedly. From my experience, you have to keep trying 3-5 times until generation succeeds. It’s annoying, but it’ll eventually work.
HeyGen was designed for creating talking lipsync videos, not for singing. Nevertheless, its most advanced motion engine (Avatar IV) does a pretty good job at generating a singing lipsync video if you choose the “Quality” mode with a “Custom motion” value of “singing”. If you use the “Avatar Unlimited” engine, the results are just not good enough, in my opinion.
Update 12/8/2025: If you use the “Faster” generation mode, the quality appears to be just as good as the “Quality” mode, so just choose that mode since it costs half as much as the “Quality” mode.
The process to create a lipsync video using HeyGen is more complex. Here are the steps:
Click “Avatars” > “Create New” > “Start from a photo”
Upload a photo and wait for it to be processed
Choose to create a new avatar or add the photo as a new “look” of an existing avatar. (One avatar can have multiple “looks”)
Click “Create with AI Studio”
Click “Audio” > “Upload Audio” , then upload your audio clip. You can upload a clip anywhere between 1 second and 3 minutes.
You can also choose from a previously uploaded audio.
Play and confirm the uploaded/selected audio.
HeyGen will attempt to transcribe the audio. If transcription fails, you won’t be able to proceed. In my experience, if it fails, it’s usually because the audio clip is too short. When I upload a longer clip, it usually can transcribe it. Note that the transcription can be wrong. This doesn’t appear to matter, as the video generation appears to be based on sound rather than words.
Click “Generate”.
Comparing HeyGen to TopMediAI
Body movements
Neither TopMediAI nor HeyGen will make your character dance, but they will animate your character’s body to some extent. This is good, because older technologies literally only animated the lips or face and left everything else frozen/static. I feel that TopMediAI generates stronger body and lip movements, which makes the results look more realistic from that perspective.
Lipsync accuracy
When uploading an audio clip, it’s better to isolate the vocals from the backing track to prevent TopMediAI and HeyGen from getting confused. Nevertheless, even when you upload the vocal track of a song, both AI tools occasionally produce inaccurate results, e.g., instead of lip movements to sing the word “hati”, TopMediAI made the lip movements as if to sing the word “hapi”; it wasn’t able to detect the difference between the “t” and “p” sounds. HeyGen seems to do a better job at lipsync accuracy.
Sustained vocal sounds
TopMediAI animates both the subject’s body and their lips to try to match the sounds in the audio file. This is particularly necessary for sustained vocal sounds, like in the following example.
Using the same inputs, and using HeyGen’s most advanced model (Avatar IV in “Quality” mode with a “Custom Motion” value of “Singing”), you can see below that HeyGen failed.
Video picture quality
With TopMediAI, if you upload an image of a zoomed-out character, even if it’s a hi-res image, the tool will have difficulty detecting the facial features, and the resulting video will be blurry with lots of artifacts. For that reason, I only upload images containing close-up shots of the character from the waist up. However, even then, the picture quality of the generated lipsync video deteriorates, sometimes significantly. For example, here’s the source image I uploaded to TopMediAI:
And here’s a frame from the generated video:
That’s a big difference.
HeyGen, on the other hand, does a much better job at preserving picture quality of the source. For example, compare the source and generated (screenshot) images below.
Teeth
TopMediAI can’t seem to produce consistent and natural-looking teeth. Sometimes, the results are acceptable, but other times, they are not. Compare the following.
Teeth appear okay, but still not perfect.
Teeth appear heavily chipped.
HeyGen, on the other hand, does a very good job at showing natural, and almost perfect, teeth, as in this example:
Output resolution
With HeyGen, you can export videos up to 4K quality. With TopMediAI, there are no resolution options.
Recommendations
I would definitely use HeyGen’s Avatar IV with the “Quality” mode first to generate singing lipsync videos. If the results don’t look good, then I’d use TopMediAI as a fallback.