Turn Your Photos Into a Music Video With AI Lip Sync Tech

In this post, I’ll share how I created this music video from just a few static photos.

Assets

To make a music video showing someone singing, you’ll need the following:

  • One or more photos of the people who will be singing (usually JPG format)
  • A song (usually MP3 format)
  • Background video footage

Tools

To make the video, I used the following tech tools.

  • Adobe Photoshop Image Editor
  • Audacity Audio Editor
  • Capcut Pro Video Editor (you can use this as an audio editor as well)
  • YT-DLP to download videos and music from YouTube
  • UVR Online to split a song into stems and extract the vocals as a separate audio file
  • HeyGen AI video generator to convert a photo into a lipsynced video
  • Topaz Photo AI to upscale low-res photos

There are many alternatives to the tools above, but I like these the best.

Instructions

Get a song audio file

One easy way to get your song audio file is to find a song on YouTube and download it. I wanted this song:

I downloaded the audio MP3 file using the YT-DLP command line tool. The URL of the song was https://www.youtube.com/watch?v=D0ru-GcBIr4, which shows a video ID of D0ru-GcBIr4. So, to download the audio, I downloaded YT-DLP and ran the following command:

yt-dlp --extract-audio --audio-format mp3 --audio-quality 0 https://www.youtube.com/watch?v=D0ru-GcBIr4
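For anyone who would rather script this step, here is a minimal Python sketch. It pulls the video ID out of a watch URL and builds the same yt-dlp argument list as the command above. The helper names `video_id_from_url` and `audio_download_command` are made up for illustration, and the sketch only builds the command; it doesn't run it.

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_url(url):
    """Return the value of the `v` query parameter of a YouTube watch URL."""
    query = parse_qs(urlparse(url).query)
    return query["v"][0]


def audio_download_command(url):
    """Build the yt-dlp argument list shown above (MP3, best quality)."""
    return [
        "yt-dlp",
        "--extract-audio",
        "--audio-format", "mp3",
        "--audio-quality", "0",
        url,
    ]


url = "https://www.youtube.com/watch?v=D0ru-GcBIr4"
print(video_id_from_url(url))  # D0ru-GcBIr4
print(" ".join(audio_download_command(url)))
# To actually download, pass the list to subprocess.run(..., check=True)
# on a machine where yt-dlp is installed.
```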

If the command line looks confusing to you, then just use the online version of YT-DLP, which currently looks like this:

Get background video footage

Since the name of the song I made into a music video was 你本來就很美, which means “You are beautiful”, and the original music video showed beautiful beach scenes, I looked for similar beach footage on YouTube to use as the background for my video. I liked this video (https://www.youtube.com/watch?v=0ZBqnOeIxbQ):

Since the video ID was 0ZBqnOeIxbQ, I downloaded it from YouTube using YT-DLP with the command

yt-dlp 0ZBqnOeIxbQ

You don’t have to put a video in the background of your music video; you can also just put one or more still photos, but I think a background video looks better.

Separate vocals from the song’s audio

To improve lipsyncing and transcribing, we’ll need to separate the song into stems, where each track is a separate sound, e.g. vocals, instruments, etc. Go to UVR Online and upload the song’s audio file (MP3). You can process up to 12 minutes of audio per day for free.

When done, download the vocals track. Here’s what I got.

Note: I used Audacity to trim the audio to remove the silent sections to speed up AI-processing in Heygen and because Heygen has a duration limit per video.
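To give a rough idea of what "removing the silent sections" means, here is a small Python sketch that finds the loud stretches in a list of audio samples. This is not how Audacity does it. The `threshold` and `min_silence` parameters are made-up values for illustration, and real audio would have thousands of samples per second.

```python
def non_silent_ranges(samples, threshold=0.02, min_silence=3):
    """Return (start, end) index pairs of stretches louder than `threshold`.

    `end` is exclusive. A run of at least `min_silence` quiet samples
    ends the current stretch; shorter pauses stay inside it.
    """
    ranges = []
    start = None      # index where the current loud stretch began
    quiet_run = 0     # consecutive quiet samples seen inside the stretch
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_silence:
                ranges.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # The audio ended during a loud stretch (or a short pause after it)
        ranges.append((start, len(samples) - quiet_run))
    return ranges


samples = [0, 0, 0.5, 0.6, 0, 0, 0, 0, 0.4, 0.3, 0]
print(non_silent_ranges(samples))  # [(2, 4), (8, 10)]
```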

Choose photos to lipsync

The photos that work best for lipsyncing are ones that

  • show the subject facing forward
  • are hi-res

These are the photos I picked for my video.

This last photo shows me looking to the side to add variety to the video. The AI lipsyncing results weren’t perfect, but they were acceptable, so I kept it.

Remove background from photos

I used Adobe Photoshop to remove the background from photos. For example, when I open a photo in Photoshop, I see a floating toolbar with a button called “Remove background”.

Clicking on it adds a layer mask to the image layer

which causes the background to be transparent.

If the mask isn’t perfect and you see some of the background showing or some of the subject removed, you can edit the mask by

  1. clicking on the mask thumbnail in the layer,
  2. clicking on the paintbrush tool,
  3. adjusting the size of the paintbrush,
  4. changing the color of the paint to either white or black (white reveals, black hides),
  5. painting on the area of the image you want to show or hide

Change the photo aspect ratio to 16:9

Since I wanted to put my music video on YouTube, I wanted the video to be landscape format, 16:9, and 4K. I used Adobe Photoshop’s “Crop” tool to convert my portrait 9:16 photos to 16:9. Notice that when I do that, I lose much of the photo, like the top of the hat and the shirt in the screenshot below.

To remedy this, first I expand the canvas wider than the original image and crop it.
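To put numbers on this, here is a minimal Python sketch that compares the two options: cropping a portrait photo straight to 16:9, or widening the canvas to 16:9 while keeping the full height. The 2160 x 3840 photo size is a hypothetical example, not the size of my actual photos.

```python
from fractions import Fraction

TARGET = Fraction(16, 9)


def crop_to_landscape(width, height):
    """Largest 16:9 crop that fits inside the photo; returns (w, h)."""
    if Fraction(width, height) >= TARGET:
        return round(height * TARGET), height
    return width, round(width / TARGET)


def expanded_canvas(width, height):
    """Canvas size that keeps the full photo height and widens to 16:9."""
    return max(width, round(height * TARGET)), height


w, h = 2160, 3840  # hypothetical 9:16 portrait photo
crop_w, crop_h = crop_to_landscape(w, h)
print(crop_w, crop_h)                        # 2160 1215
print(f"{crop_h / h:.0%} of the height kept")  # 32% of the height kept
print(expanded_canvas(w, h))                 # (6827, 3840)
```

A straight crop keeps only about a third of the photo's height, which is why expanding the canvas and filling in the sides works better.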

Since my arms are cut off, I select those two areas and click the “Generative Fill” button.

The generative fill produces 3 variations. I picked one that looked the best.

This made my 16:9 image look like this:

Change the background to neon green

Since we’ll want to replace the background behind our singing subjects with our background video, we’ll need to put our photos on a green background so we can chroma key the green out when editing the video in Capcut. In Photoshop, change the foreground color to neon green.

Then, use the paint bucket tool to paint the background green. You may need to create a new layer positioned below the subject layer.

I repeated these steps for all other images, as necessary.

Upscale the images

Since my target video platform is YouTube and many devices like TVs support 4K, I upscaled some of my images to 4K. I used Topaz Photo AI to do this, but there are many alternatives that may be cheaper. With Topaz Photo AI, you can also do other things like sharpen, denoise, etc.

Create an avatar in Heygen

Now that we’ve collected and prepared all of our assets, we can convert our photos to videos with lips synced to the lyrics. I created an account with Heygen and paid for the cheapest monthly plan ($29 / month). This outputs 1080p videos only. If I wanted to create 4K videos, I would need the next plan at $39 / month. However, I think Capcut was able to upscale my 1080p footage to 4K using AI.

In Heygen, click on “Avatars” and create a new avatar by uploading your photos.

Create lipsync video

In Heygen, click the “Create video” button, choose “Landscape” for the orientation, and change the default avatar to one of your avatar’s “looks”.

The next part is important. For the script, click the “Audio” link and “Upload Audio”.

Upload the vocals-only audio file (the one with the backing music removed) to Heygen.

You can leave “Voice Mirroring” turned off so that the voice will be the voice in the uploaded audio.

Heygen will then try to transcribe the audio and display the words (lyrics) for the AI to produce the lipsync. Depending on the quality of the uploaded audio file, this may or may not work. In my case, it worked, and Heygen even detected the language as Chinese.

Click the “Generate” button and choose the appropriate settings.

Heygen will take some time to render the video, which will show up under “Projects”. Repeat the above process for all photos (avatar looks).

Note: sometimes, some portions of a video will have good lipsync results while others will not. In this case, you can try recreating the lipsync video.

Create the final video

To create the final video, I used Capcut. Though Capcut can be used for free, I paid for “Pro” access so I could use some of the paid features that I needed. I won’t go through all steps in detail since there are many general video editing tutorials online.

  1. Import (drag) all media (audio, video) to the “Media” pane (top left) in Capcut
  2. Drag your media to the timeline (bottom pane) to the appropriate tracks

The screenshot below shows the following tracks from bottom to top

  1. Audio track (complete song, not just the vocals)
  2. Main video track (background video showing different beach scenes)
  3. Other video tracks (since the lipsync results weren’t perfect for the duration of each Heygen-produced video, and because I wanted to show different versions of me singing different parts of the song, I chopped the videos into sections where the lipsync results were good)
  4. Text track (I copied and pasted the Chinese lyrics so viewers who read Chinese can read the lyrics as the video plays).

The tracks behave like layers in Photoshop, so media on higher tracks appear above media on lower tracks, which is why the tracks containing the green screen lipsync videos are above the track containing the background video.

Remove the green screen

To remove the green background from the lipsync videos,

  1. click on the video clip in the timeline
  2. in the top-right pane, click Video > Remove BG, Chroma key, and then, using the color picker, click on any area of the green background in the “Player” pane in the middle. The green background will suddenly disappear, revealing the media in the lower track (the background video).
  3. you may see some green artifacts around the edge of the subject. To clean them up, adjust the following sliders until you see good results: “Clean up edge”, “Feature edge”, “Intensity”.
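If you’re curious what chroma keying does under the hood, here is a minimal Python sketch of the general idea: any pixel whose color is close enough to the key color becomes transparent. This is not Capcut’s actual algorithm. The `tolerance` parameter and the sample pixel values are made up for illustration.

```python
def chroma_key(pixels, key=(0, 255, 0), tolerance=100):
    """Make pixels close to the key color fully transparent.

    `pixels` is a list of (r, g, b) tuples; returns (r, g, b, a) tuples.
    """
    out = []
    for r, g, b in pixels:
        # Euclidean distance in RGB space between this pixel and the key color
        dist = ((r - key[0]) ** 2 + (g - key[1]) ** 2 + (b - key[2]) ** 2) ** 0.5
        alpha = 0 if dist <= tolerance else 255
        out.append((r, g, b, alpha))
    return out


# Pure green, slightly-off green, and a skin tone
frame = [(0, 255, 0), (20, 240, 30), (200, 150, 120)]
print(chroma_key(frame))
# [(0, 255, 0, 0), (20, 240, 30, 0), (200, 150, 120, 255)]
```

A hard on/off cutoff like this is what leaves green fringes around the subject, which is why editors offer edge clean-up sliders.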

To improve the final video, you can do some other things as well, like

  • add a crossfade (“mix”) transition between adjacent video clips
  • Add a “Fade in” and/or “Fade out” animation to clips that are not adjacent to any other clips (go to “Animation” > “In” > “Fade In” and “Animation” > “Out” > “Fade Out”)
  • adjust the color of a video clip by going to Video > Adjust > Auto Adjust or Manual Adjust

Read my other post on how to work with Capcut.

Here’s another example of a music video I created using the same method.

Here’s another way to create a music video. It requires more effort, but the result is more interesting. It uses Heygen, Krea, and Runway AI.

Generate a Driving Performance Video

Follow the same steps above to create a lipsync video in Heygen to be used to drive another video. The background doesn’t have to be green. Make sure the resulting video is as follows:

  • Well-lit with defined facial features 
  • Single face framed from around the shoulders and up
  • Forward-facing in the direction of the camera

The most important things are the facial features and lip movements. The clothes you wear and the background are irrelevant, as Runway will use only your facial expressions to drive the input video.

For example, here’s a driving video I created.

Split the driving video into 30-second clips

Runway’s Act-One feature only lets you create videos that are up to 30 seconds long. So, we’ll have to split the driving video into a series of 30-second clips. If your driving video is 2 minutes long, you’ll end up with 4 clips, e.g.

  1. Clip 1 (0:00 – 0:30)
  2. Clip 2 (0:30 – 1:00)
  3. Clip 3 (1:00 – 1:30)
  4. Clip 4 (1:30 – 2:00)
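The clip boundaries above can be worked out with a few lines of Python. This is just a sketch for planning the cuts; the function names are made up, and you’d still do the actual splitting in your video editor.

```python
def split_into_clips(total_seconds, clip_seconds=30):
    """Return (start, end) pairs in seconds covering the whole video."""
    clips = []
    start = 0
    while start < total_seconds:
        end = min(start + clip_seconds, total_seconds)
        clips.append((start, end))
        start = end
    return clips


def mmss(seconds):
    """Format a whole number of seconds as M:SS."""
    return f"{seconds // 60}:{seconds % 60:02d}"


for n, (start, end) in enumerate(split_into_clips(120), start=1):
    print(f"Clip {n} ({mmss(start)} - {mmss(end)})")
```

If the video length isn’t a multiple of 30 seconds, the last clip is simply shorter, e.g. a 70-second video gives clips of 30, 30, and 10 seconds.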

Train a model of your face in Krea

Go to https://www.krea.ai/train and follow the instructions to train a model of your face. You will need a subscription to do this. Use the Flux model.

  1. Train the AI model: Upload multiple photos of yourself (ideally around 20 with different angles and lighting) to the “train” section of Krea AI. You can specify whether you’re training a style, object, or character, but for faces, a default option is usually available. 
  2. Use the Flux model: Once your model is trained, utilize the Flux model, as it is currently the only one that supports direct integration with trained face models for personalized image generation. 
  3. Add your trained style: In the Flux model, click on “add style” and select your newly trained style from the “my styles” category. 
  4. Adjust influence: Use the provided slider to control how much your trained face style influences the generated images. Increasing the slider will make your face more recognizable, while decreasing it will reduce the resemblance. 

Generate a bunch of images of yourself singing

You can now use various prompts in Krea to generate diverse images incorporating your trained face. For example, you can make an image of you

  • singing in front of a mic facing the right with the camera close up
  • singing while holding a mic with the camera at a distance
  • playing drums
  • playing an electric guitar
  • etc

Here’s an example prompt:

subject is singing in front on stage in front of a band with a large audience in front of him. there are many multi-colored lights illuminated the stage.

And here’s one example output.

    Generate a bunch of 10-second videos

    Use Runway’s image-to-video feature to create a series of 10-second videos based on the images generated in Krea. Here’s an example.

    Since our song is 2 minutes long, we’ll need to create twelve 10-second videos.

    Create a lipsync video

    In Runway’s Act-One, upload your first driving video in the Driving Performance section on top. For the character reference below, upload your first character video produced in the previous step. Click the button to generate the lipsynced video

    Runway will take the facial expressions and lip movements from your driving video and apply them to the character video.

    Here’s an example using a simple character video for demo purposes.

    A Comparison of Video Quality at Different Bitrates

    Source Video

    The source video was taken with the Insta360 Ace Pro using PureVideo mode at 4K30fps. It’s 28 seconds long and 355 MB in size. HDR was disabled and can’t be enabled in this mode.

    I converted that video with Handbrake using the following presets and modifications. The presets all maintain the resolution (4K). In all cases, I changed the framerate from 60 to “Same as source”.

    Very Fast 2160p60 4K AV1, Quality = 63

    I set the quality to the lowest option (63).

    Very Fast 2160p60 4K AV1, Quality = 35

    I left the quality at the preset value (35).

    Super HQ 2160p60 4K AV1 Surround, Quality = 20

    I left the quality at the preset value (20).

    Super HQ 2160p60 4K AV1 Surround, Quality = 0

    I set the quality to the highest option (0).

    Here are the results

    Video    | Quality Setting (CRF) | Bitrate  | File Size | File Size vs. Original
    Original | n/a                   | 106 Mbps | 355 MB    | n/a
    Fast     | 63                    | 0.5 Mbps | 1.7 MB    | 99% smaller
    Fast     | 35                    | 4.7 Mbps | 16 MB     | 95.5% smaller
    Super HQ | 20                    | 20 Mbps  | 66 MB     | 82% smaller
    Super HQ | 0 (highest)           | 319 Mbps | 1092 MB   | about 208% larger (3.1x the original)

    According to ffmpeg, when encoding to H.264, a Constant Rate Factor (CRF) value of 17 or 18 is considered to produce visually lossless output. ffmpeg defaults to a value of 23. Looking at the table above, the nearest preset quality CRF value in Handbrake is 20, which produces a video with a 20 Mbps bitrate and an output that is 82% smaller.
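As a sanity check on numbers like these, here is a small Python sketch that derives the average bitrate from a file’s size and duration, and the percentage size change between two files. It assumes the file sizes are in binary megabytes (MiB), which is how Windows reports them; that assumption is mine, but it does reproduce the 106 Mbps figure for the source video.

```python
def bitrate_mbps(size_mib, duration_s):
    """Average bitrate in megabits per second for a file of `size_mib` MiB."""
    bits = size_mib * 1024 * 1024 * 8
    return bits / duration_s / 1_000_000


def size_change_percent(original, converted):
    """Negative means smaller than the original, positive means larger."""
    return (converted - original) / original * 100


print(round(bitrate_mbps(355, 28)))             # 106
print(round(size_change_percent(355, 16), 1))   # -95.5
```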

    To compare video playback quality, you can use GridPlayer to play multiple videos side by side. Of course, if the videos have 4K resolution, you should use a 4K monitor.

    LosslessCut to Quickly Extract Good Parts From Long Video Footage

    This post will explain how I easily made this short 5-minute 4K video using LosslessCut and VideoStudio Pro.

    Short videos like this are handy when you want to condense a lot of video footage into a short video. Unlike 1080p video footage, 4K footage is much larger in file size, so, depending on your computer, editing it can be slow and frustrating. I used LosslessCut to easily and quickly create many short video clips from several large video files. I then used Corel VideoStudio Pro to assemble the clips and add transitions, effects, music, etc.

    1. Take Video Footage

    I used the Insta360 Ace Pro action camera. This camera has a large flip screen, which made it super easy to get footage on my own using a tripod. The camera has pretty good low-light quality when you choose the PureVideo setting. I took 4K30fps video.

    2. Extract Clips From Video Footage

    Open your video footage in LosslessCut.

    Go to the beginning of the first segment you want to turn into a clip. You can click the play button to play the video or jump to different sections by clicking on the timeline. You can also click the left and right arrow keys to move one frame at a time. Mark the beginning of the segment by clicking the left-finger pointer button. Go to the end of the segment and click the right-finger pointer button. In my video above, I made each clip between 3 and 5 seconds long.

    You will see the segment in the right column.

    To create a new segment, click the plus (+) button below the segments in the right column. If you want to delete a segment, you can click the minus (-) button.

    Repeat this process until you have all the segments you want to export. When you’re ready to export, click the Export button in the bottom right corner. The segments will almost instantly be exported as individual clips without loss of quality.

    3. Compress Video Clips

    Use Handbrake to batch compress all video clips.

    Video    | Quality Setting (CRF) | Bitrate  | File Size | File Size vs. Original
    Original | n/a                   | 106 Mbps | 355 MB    | n/a
    Fast     | 63                    | 0.5 Mbps | 1.7 MB    | 99% smaller
    Fast     | 35                    | 4.7 Mbps | 16 MB     | 95.5% smaller
    Super HQ | 20                    | 20 Mbps  | 66 MB     | 82% smaller
    Super HQ | 0 (highest)           | 319 Mbps | 1092 MB   | about 208% larger (3.1x the original)

    I used the Super HQ (20) preset since that produced 82% smaller files while maintaining visual quality.
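I did the batch compression in the Handbrake app, but if you prefer scripting, here is a rough Python sketch that builds a HandBrakeCLI command for each clip. It assumes HandBrakeCLI is installed and that the preset name matches the one in the app. The folder names, the "-compressed" suffix, and the .mp4 extension are hypothetical. It doesn’t change the framerate setting, so you’d still need to handle “Same as source” yourself.

```python
from pathlib import Path


def handbrake_command(src, out_dir,
                      preset="Super HQ 2160p60 4K AV1 Surround", quality=20):
    """Build a HandBrakeCLI argument list for one clip (does not run it)."""
    src = Path(src)
    # Hypothetical naming scheme for the compressed copy
    dst = Path(out_dir) / f"{src.stem}-compressed.mp4"
    return [
        "HandBrakeCLI",
        "--input", str(src),
        "--output", str(dst),
        "--preset", preset,
        "--quality", str(quality),
    ]


# Hypothetical folders: every .mp4 in "clips" is compressed into "compressed"
for clip in sorted(Path("clips").glob("*.mp4")):
    print(handbrake_command(clip, "compressed"))
    # To run it: subprocess.run(handbrake_command(clip, "compressed"), check=True)
```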


    4. Assemble Clips in a Video Editor

    I used Corel VideoStudio Pro to create my final video.

    Import all the clips into VideoStudio Pro. Make sure to enable smart proxy to improve video editing performance.

    Since I made a 4K video, I made a poster image that was 3840 px by 2160 px.

    For the music, I chose the following

    For the transitions, I used Fade to Black. You can choose “Apply current effect to video track” to apply the transition to all clips at once.

    For the workout scenes, I added the duotone effect to the clips so the background wouldn’t be distracting.

    Drag the Duotone effect thumbnail to the clip on the timeline. You’ll see an “FX” label on the clip.

    Here’s how a clip looked before and after applying the duotone effect.

    Original clip
    With duotone effect applied

    When you click on a clip that has the duotone effect applied, you can customize the effect, e.g. by changing the color.

    5. Export the Video

    After assembling all assets (video clips, music, etc), transitions and effects, click the Share tab to export the video. Make sure the render video properties match the properties of the source video clips to maintain quality. In particular, pay attention to the resolution (3840 x 2160) and bitrate.

    Music Management with MusicBee and Omnia

    If you have a large collection of music files (mp3, etc) and want to easily manage them on desktop and mobile, you’ll have many apps to choose from. After testing some of the highly rated ones, I’ve (for now) settled on what I think are the two best: MusicBee for desktop and Omnia for mobile. This article will go over how I use these two apps to easily manage almost 2000 songs.

    MusicBee

    At this time, MusicBee is only available for Windows. Below is a screenshot of my MusicBee instance. The UI is customizable, which is great.

    Header Bar

    For the header bar, I customized it to have just what I care about, specifically

    • MUSIC (lists all music)
    • PLAYLISTS (lists all playlists in the left panel and all music within a playlist in the middle panel)
    • INBOX (I use this as a temporary staging location when I add new tracks to MusicBee)
    • NOW PLAYING
    • MUSIC EXPLORER (lets you browse by artist, showing albums for each artist)

    To edit the header bar tabs,

    • right-click on a tab and click “Close tab” to remove it
    • click the + icon to add a new tab

    “MUSIC” Tab

    When I click the MUSIC tab in the header bar, I see this:

    Left Sidebar

    The left sidebar shows a list of all artists. The very first option is “All Artists”. I click “All Artists” to show a list of all my music files in the middle pane.

    Middle Pane

    The middle pane shows the filtered music tracks. I customized the columns to just what I care about, specifically

    • Bitrate (I use this to check the encoding bitrate. If a bitrate is too low, I may replace the track with one with a higher bitrate).
    • Time (the song’s duration)
    • Year (the year the song was released)
    • Title (the title of the song)
    • Artist (the name of the song’s artist)
    • Playlist (a comma-delimited list of playlists a song is in)

    To change the columns, right-click on the header and click “Set Displayed Fields…”

    Bottom Middle Panel

    In the middle, below the track list, is an optional pane for editing a song’s properties. I normally enter the following metadata:

    • Title
    • Artist
    • Year
    • Comments (the YouTube video ID if I ripped the song from YouTube)

    Bottom Right Panel

    In the bottom right panel, you can see a song’s artwork. You can change the artwork by right-clicking and browsing to an image on your computer. The image should be a square, e.g. 500 x 500 px.

    “PLAYLISTS” Tab

    When I click the “PLAYLISTS” tab, I see the following:

    In the left sidebar, I see a list of playlists. In the middle panel, I see the list of song tracks. To create a playlist or add/remove a track to/from a playlist, right-click on a track, click “Include in Playlist”, and either

    • click “<New Playlist>” at the bottom to create a new playlist
    • click or ctrl+click one or more existing playlists to add the song to the playlist(s)

    Playlist Format

    My music files are all in a single folder called “Music”.

    Within that folder, I have a subfolder called “Playlists” containing all my playlist files.

    I export my playlists in m3u8 format with relative paths. This allows me to copy my entire “Music” folder, including the “Playlists” subfolder, to another device, like my phone or tablet, and the music player on the other device should be able to read my playlists and referenced music files without error.

    To set the playlist file format and path preference, click the hamburger menu in the top left corner, then click “Edit Preferences”.

    Then, click “Library” in the left sidebar and then select “M3U8” and check the “use relative file paths” as shown below.

    Now, if you export a playlist and open the playlist m3u8 file in a text editor, you’ll see relative paths to each song in the playlist like this
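As a rough illustration of what relative paths mean here, this Python sketch builds the text of a playlist file for songs in “Music” and a playlist stored in “Music/Playlists”. The song file names are hypothetical. The `#EXTM3U` header line and the forward slashes are my assumptions; MusicBee’s actual export may differ, e.g. by using backslashes on Windows.

```python
import posixpath


def playlist_text(song_paths, playlist_dir):
    """Return M3U8 text listing each song relative to the playlist folder."""
    lines = ["#EXTM3U"]  # extended-M3U header; assumed, not confirmed for MusicBee
    for song in song_paths:
        lines.append(posixpath.relpath(song, playlist_dir))
    return "\n".join(lines) + "\n"


# Hypothetical song files
songs = ["Music/Artist A - Song 1.mp3", "Music/Artist B - Song 2.mp3"]
print(playlist_text(songs, "Music/Playlists"))
```

Each entry comes out as `../Artist A - Song 1.mp3`, i.e. "go up one folder from Playlists", which is why the playlist keeps working after the whole “Music” folder is copied to another device.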

    Rename, Delete and Export a Playlist

    To rename or delete a playlist, click on the playlist in the left sidebar and click the corresponding option.

    Omnia

    Omnia is now my preferred app for mobile and tablet. It is currently only available on Android.

    Omnia is pretty simple and intuitive to use except when you update playlists.

    First-time use

    When you use Omnia for the first time, tell Omnia where your music and playlist files are.

    1. Click Settings > Music Folders and specify a folder path. Omnia will scan the folder for all music files.
    2. Once the files have been scanned, click on “Songs” tab in the main view to verify your songs are listed.
    3. Then, click on the “Playlists” tab to verify the playlists were loaded. If they weren’t, click on the 3 vertical dots in the top-right corner, click “Import”, click “SELECT ALL”, then check all playlists, and then click the “OK” button.

    Subsequent use

    If you’ve updated your music file library and playlists, you’ll need to

    1. click on the 3 vertical dots in the top-right corner, click “Rescan Library”, to get updated files.
    2. click the 3 vertical dots to the right of each playlist, and then click “Delete”, to delete the playlists that have been updated
    3. click on the 3 vertical dots in the top-right corner, click “Import”, click one or all playlists, then click “OK” to load the updated playlist.

    Workflow

    This is my workflow for acquiring music, adding it to MusicBee, and transferring it to Omnia on another device.

    1. Buy a song on Amazon Digital Music ($1 / song) or download a song as mp3 from YouTube using yt-dlp. I store the new files in a temporary “Music” folder.
    2. Edit the song in Audacity as necessary, e.g.
      • trim out start and end silence
      • normalize the volume
    3. Copy the music files to the folder containing all other music files.
    4. Import the new music files into the “Inbox” in MusicBee by clicking “MusicBee” > File > “Scan Folders for New Files…”

    I then specify the folder containing all my music, select “add to inbox”, which is a temporary staging area, and then click “Proceed”.

    The new music will appear in the “INBOX” tab in MusicBee.

    5. Edit each song’s metadata (title, artist, year, etc) and add the song to existing playlists
    6. Move (send) the songs from the “INBOX” to the main “MUSIC LIBRARY”.
    7. Export any playlists that have been updated
    8. Connect my other device (phone / tablet) to my laptop
    9. Copy the music files from the temporary folder on my laptop to the external device
    10. Copy the updated playlist files from my laptop to the external device
    11. In Omnia on the external device, rescan the music library, then delete and reimport any updated playlists

    Encrypt/Decrypt Files/Messages with GPG

    Let’s say that you have a very confidential message or file that you want to share with someone and you absolutely don’t want anyone else to be able to see it. Most people would just share the message via email or some other communication tool, but this isn’t necessarily secure, especially since the tool provider, like Gmail, could access your email communications. When you want to be 100% sure that only your recipient can view your message, the best practice is to encrypt/decrypt it using GPG.

    Encryption Variations

    Pretty Good Privacy (PGP) is a proprietary encryption program that provides cryptographic privacy and authentication for data communication.

    OpenPGP is the open, non-proprietary standard derived from PGP.

    GnuPG (GNU Privacy Guard / GPG) is a free, open-source implementation of the OpenPGP standard.

    GpgFrontend

    This post will show you how to share secret messages with a friend by encrypting/decrypting them using GPG. There are many tools that can encrypt/decrypt messages using GPG, including command line tools. I will use GpgFrontend because it appears to be the easiest for the general public to use. For this tutorial, I will use Windows.

    1. Download GpgFrontend

    Both you and your friend will need to perform this step.

    You can download the Windows installer or the portable version that doesn’t require installation. For simplicity, I will just download the portable version.

    1. Download the portable version from GpgFrontend’s latest releases, labeled GpgFrontend-*******-windows-x86_64-portable.zip.
    2. Extract the contents of the downloaded ZIP file.
    3. Run GpgFrontend.exe from the extracted Program/bin directory.

    2. Generate a Key Pair

    Your friend will need to perform this step.

    Open GpgFrontend

    Click on “Manage Keys”.

      Click “New keypair”.

        Fill in your details, such as name and email, and choose your desired key type and size. You can also add a comment if you wish. Set an expiration date for the key, or choose ‘Never Expire’ if you prefer. Optionally, create a passphrase for added security. I will click the Non Pass Phrase checkbox for simplicity. Once all details are filled in, click ‘OK’ to generate your key pair. You will then see your key listed in the KeyPair Management window.

          3. Share Public Key

          Your friend will need to perform this step.

          In the KeyPair Management window, select the key you want to export, right-click, and click “Show Key Details”.

          Click the “Operations” tab, then click the “Export Public Key” button.

          Save the public key somewhere on your computer. For me, the filename generated was Davidabdullah747@gmail.com_pub.asc. The “pub” stands for “public” because this is your public key that you can share with other people. If you open the public key file, it will look something like this.

          Your friend should send you this file so that you can use it to encrypt your message.

          4. Import Public Key

          You will need to perform this step.

          Click “Import Key” -> “File” and browse to the public key file your friend shared with you.

          Click the OK button to confirm importing the key.

          5. Encrypt a Message

          You will need to perform this step.

          Write a message in the main text area of GpgFrontend.

          Check the public key in the list of keys in the Key Toolbox pane. To do this, click in the leftmost column of the key you want to use, outlined in red in the screenshot below.

          Click the “Encrypt” button. You will see that your plaintext message was converted to a PGP message.

          6. Share the Encrypted Message

          You will need to perform this step.

          Copy the encrypted message and share it with your friend, e.g. via email.

          7. Decrypt the Encrypted Message

          Your friend will need to perform this step.

          Copy the encrypted message into GpgFrontend’s main text area.

          Click the “Decrypt” button. You’ll see the encrypted message get converted to plain text.
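If you’re comfortable with the command line, the same flow can be done with the gpg tool itself. Here is a rough Python sketch that builds the gpg commands for the export, import, encrypt, and decrypt steps. It assumes gpg is installed and on your PATH. The function names, the file names, and the friend@example.com address are hypothetical placeholders.

```python
def gpg_export_public_key(email):
    """Friend: print their ASCII-armored public key so it can be shared."""
    return ["gpg", "--armor", "--export", email]


def gpg_import(public_key_file):
    """You: import your friend's public key file into your keyring."""
    return ["gpg", "--import", public_key_file]


def gpg_encrypt(recipient, message_file):
    """You: encrypt a file for `recipient`, writing ASCII-armored output."""
    return ["gpg", "--armor", "--encrypt", "--recipient", recipient, message_file]


def gpg_decrypt(encrypted_file):
    """Friend: decrypt the message with their private key."""
    return ["gpg", "--decrypt", encrypted_file]


# Hypothetical names, printed in the order the steps happen
steps = [
    gpg_export_public_key("friend@example.com"),
    gpg_import("friend_pub.asc"),
    gpg_encrypt("friend@example.com", "message.txt"),
    gpg_decrypt("message.txt.asc"),
]
for step in steps:
    print(" ".join(step))
# To run a step: subprocess.run(step, check=True)
```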

          With GPG, you can do many other things as well, including


          Camera Tips For Making Travel Videos

          After making a number of vacation videos using

          • my phone (Google Pixel 8)
          • my action camera (Insta360 Ace Pro)
          • my two 360 cameras (Insta360 X3)

          I’ve found the following works best, depending on the scenario.

          Driving

          • If HDR is necessary and you only want to film in one direction, mount the Ace Pro on the windshield facing the front of the vehicle (FPV)
          • If you want to film both the vehicle occupants and the road, mount the Insta360 X3 on the windshield
          • Shoot either a regular video or a Timeshift video

          Walking (Daytime)

          • Mount one Insta360 X3 on your chest (FPV). The Ace Pro isn’t a 360 camera, so it will not allow you to reframe later on to get subjects that are not directly in front of the camera.
          • Hold a second Insta360 X3 or X4 on an extended selfie stick such that the camera is at eye level to film yourself

          Walking (Nighttime)

          • For good image quality, mount the Ace Pro either on your chest or on a selfie stick and use “Pure Video” mode to get good night shots.
          • If you use the Insta360 X3/X4, the image quality will be much worse.

          Hotel Room Tour

          • Mount the Ace Pro on a selfie stick and film as FPV or film with the screen flipped up so you can film yourself (selfie)
          • Use “Pure Video” mode because you will be indoors and with unnatural (low) light.
          • I tried using my phone with a gimbal, but I preferred the simplicity and the results of using the Ace Pro.

          Here’s a comparison of the picture quality from taking a snapshot of a video taken by both the X3 and the Ace Pro. As you can see, the Ace Pro picture quality is way better. Click on each image to open the original files.

          Snapshot from 360-deg video on X3 – View enlarged image
          Snapshot from 4K, 30 fps PureVideo using “Action” (wide) setting on Ace Pro – View enlarged image

          Here’s another comparison.

          X3 – 1080p
          Ace Pro – 4K PureVideo

          Eating

          • Use X3

          Airplane Take-off and Landing

          When filming a take-off/landing from a window seat of an airplane, the difficulty is keeping the horizon level. Also, with HDR technology, the sky can look overexposed, white, and washed out. If you use a 360 camera, you can adjust the horizon in post, but that’s a hassle. Ideally, use a gimbal, like the one integrated with the DJI Osmo Pocket.

          Replace Song Vocals With Any Instrument

          There may be times when you want to listen to a popular song, but you just want an instrumental version of it. There are many tools that use AI to remove vocals from a song, but unless you are making a track for karaoke, where you’d provide your own vocals, simply removing the vocals from a song will not result in music you’d care to listen to. For example, if you search YouTube for an instrumental version of a song, you’ll find many where the vocals were simply removed, and with poor quality at that. A better instrumental version of a song is one made by replacing the vocals with an instrument, e.g. piano or saxophone. Compare the 4 audio clips below.

          Audio clip with original vocals

          Audio clip with vocals simply removed (somewhat imperfectly)

          Audio clip with vocals replaced with Alto Saxophone instrument

          Audio clip with vocals replaced with Chill Wave Synth instrument

          Here’s how to replace vocals with any instrument.

          1. Create a BandLab account

          https://bandlab.com

          Some features require a subscription.

          2. Open the splitter

          Click the icon to the right of the BandLab logo, then click on the Splitter link. Or, go to https://www.bandlab.com/splitter.

          3. Split a song into tracks

          You will have the option to choose how many tracks to split the song into. Since we’re only interested in replacing the vocals, you can choose “4 Tracks”. Then, click the “Split” button.

          BandLab will upload the song audio file and take a minute to split the audio.

          When it’s done, click “Open in Studio”.

          You will see the 4 tracks in the Studio.

          4. Convert voice to MIDI

          Right-click on the “Vocals” track and click “Audio to MIDI” > “Voice to MIDI”.

          5. Change the MIDI track to one of 370+ instruments

          You’ll see a new track below the “Vocals” track called “Audio-to-MIDI”.

          On the “Vocals” track, click the “M” button to mute that track. The track’s soundwave will gray out.

          6. Choose an instrument

          1. Click on the “Audio-to-MIDI” track.
          2. Click “Studio Grand” (the default instrument) to open the “Browse Instruments” dialog. Click on the play button to the left of an instrument to preview the sound. To search for an instrument, type its name in the search field. For example, I typed “sax” and then clicked the “Alto Saxophone” instrument.

          7. Preview the instrumentalized song

          1. On the “Audio-to-MIDI” track, move the vertical playbar to the beginning of some signals in the track.
          2. Click the Play button to preview the song.
          3. Experiment with different instruments.

          Optionally, lower the volume of the “Audio-to-MIDI” track if that produces a better result.

          8. Tweak MIDI notes

          Some of the MIDI notes probably won’t sound right. You’ll need to tweak them.

          1. Click the “Audio-to-MIDI” track.
          2. Click the “MIDI Editor” button at the bottom left.
          3. Click on a MIDI note (a single blue bar to the right of the vertical piano keys) to hear the note.
          4. Left-click and drag the MIDI note up or down to change the sound.
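
          If it helps to know what moving a note actually does to the pitch, here is a minimal sketch of the standard MIDI note-to-frequency relationship. It assumes the MIDI Editor behaves like a typical piano roll, where each row is one semitone; that is my assumption, not something BandLab states here. The function names are hypothetical, and the octave labels use the convention where note 60 is C4 (some software calls it C3).

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_frequency(note: int) -> float:
    """Frequency in Hz of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


def midi_to_name(note: int) -> str:
    """Note name with octave, assuming the convention where note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


# Moving a note up one row (one semitone) raises its frequency by about 5.9%.
for note in (60, 61, 69):
    print(note, midi_to_name(note), round(midi_to_frequency(note), 2))
```

          So if a note sounds slightly flat or sharp compared with the original vocal, dragging it up or down by a single row is usually the first thing to try.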

          9. Export audio

          When you’re satisfied with the preview, download the audio by clicking Project > Download > Mixdown As.

          Choose one of the download options.

          Insta360 Ace Pro Tips

          Chest Mount

          Use this magnetic chest mount for hands-free FPV (first-person view) filming. The camera should be positioned upside down so it lies flat on your chest.

          Daytime Video Recording

          • Choose “Freeframe video” mode.
          • Ratio: 4:3 (in the Studio app, change it to 16:9)
          • Active HDR: On
          • Resolution: 4K (this is the only option)
          • Frame Rate: 30 fps is good enough (this is the highest frame rate that supports HDR)
          • Horizon lock: choose 360-degree horizon lock

          Slow Motion Video Recording

          Choose “slow motion” mode to record a video with a high frame rate. This mode lets you shoot at 240 frames per second in 1080p, or up to 120 frames per second in 4K.
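
          To get a feel for how slow the footage will look, divide the capture frame rate by the playback frame rate. The sketch below assumes 30 fps playback, which is my assumption rather than a setting mentioned in this post, and the function name is only for illustration.

```python
def slowdown_factor(capture_fps: float, playback_fps: float = 30.0) -> float:
    """How many times slower the action appears when played back.

    playback_fps defaults to 30, an assumed export/playback rate.
    """
    return capture_fps / playback_fps


for label, fps in (("1080p", 240), ("4K", 120)):
    factor = slowdown_factor(fps)
    print(f"{label} at {fps} fps -> {factor:g}x slower at 30 fps playback")
```

          Under that assumption, one second of real action lasts about 8 seconds in 1080p and about 4 seconds in 4K.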

          Night Shots and Low-Light Scenes

          Choose “Pure Video” mode.

          FlowState Stabilization

          There are 4 stabilization options: Off, Low, Standard, and High. Choose High for very shaky situations. Note that the High setting will result in more cropping.

          Field of View (FOV)

          Choose Dewarp to experience a wide FOV with minimal distortion. This is ideal for scenarios such as skiing and vlogging.

          Quickly cancel a recording

          If you’ve messed up, you can quickly cancel a recording by pressing and holding the shutter button while recording.

          Pause recording

          If you have some downtime during a shot, for example, you’re waiting at an intersection or for something to happen, tap the screen while recording (at the bottom) to pause your recording. You can then tap to resume at any time. 

          Quick Capture

          When the camera is off, press the shutter button to start shooting with QuickCapture. The camera will start shooting using the last-used shooting mode.

          Pre-recording

          Save 15 or 30 seconds of footage before you start recording. This can be handy if you want to catch something on film that is imminent.

          Interval (long period)

          Take a series of single photos over a long period of time. The photos can then be combined into a timelapse.

          Burst (short period)

          Take a series of photos in a short period of time (up to 10s). This is useful if you are alone and want to take a selfie, or if you are with someone who is not good at taking photos. You can set the burst setting to 30p/10s, meaning 30 pictures in 10 seconds. Then, you can stand, smile, try different poses, etc., and later pick the best shots.

          Timelapse (for static, fixed-position shots)

          Create a timelapse video that condenses a long scene into a short video by taking a photo at a set interval. Use a tripod or fix the camera in one spot.

          • Shoot length: infinite, 3m, 5m, 10m, … 10h
          • Interval: 0.5s, 1s, 2s, … 120s

          For example, let’s say you want to film yourself having dinner on an airplane. You expect you will take 30 minutes to eat, so you choose 30m for “shoot length”. Then, you can adjust the interval to see how long the timelapse video will be. For example,

          • interval = 0.5s, => timelapse duration = 2m
          • interval = 10s, => timelapse duration = 6s

          The timelapse duration is shown on the screen.
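
          If you want to estimate the duration before you pick up the camera, the arithmetic is simple: the number of photos is the shoot length divided by the interval, and the clip length is that count divided by the playback frame rate. The sketch below assumes 30 fps playback. I inferred that rate because it reproduces the two durations above; it is not a documented camera spec, and the function name is hypothetical.

```python
def timelapse_duration_seconds(
    shoot_length_s: float, interval_s: float, playback_fps: float = 30.0
) -> float:
    """Estimate the length of the finished timelapse clip in seconds."""
    frames = shoot_length_s / interval_s  # one photo per interval
    return frames / playback_fps  # playback_fps of 30 is an assumption


shoot_length_s = 30 * 60  # the 30-minute dinner from the example

for interval_s in (0.5, 10):
    duration = timelapse_duration_seconds(shoot_length_s, interval_s)
    print(f"interval = {interval_s}s -> timelapse duration = {duration:.0f}s")
```

          For the 30-minute dinner, this gives 120 seconds (2m) at a 0.5s interval and 6 seconds at a 10s interval, matching the values listed above.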

          Timeshift (for moving shots)

          This is like a moving time-lapse.

          Auto-stop Recording

          If you want to shoot a video and have the camera auto-stop recording, you can set the duration to one of the preset values. The default is infinity (until the battery dies). Other options include 15s, 30s, … 3h, etc.

          Inconspicuous shooting

          If you want to film without people knowing that you’re filming, you can do the following to make people think that the camera is turned off.

          • Turn off the indicator light so there’s no flashing red light on the camera.
          • Turn off “Front Screen Display”
          • Use the GPS remote to start and stop shooting without touching the camera.

          How to Add Lower Thirds Motion Graphics in Adobe Premiere Pro

          A lower third is a text title or graphic overlay placed in the lower region of the screen. Motion Graphics templates are a file type (.mogrt) that can be created in After Effects or Premiere Pro. Here’s an example of a motion graphic lower third.

          There are many motion graphic templates for lower thirds available online.

          In this tutorial, I’ll use a free one from Mixkit.

          Download MOGRT template

          Download this free template and unzip it. You’ll get a .mogrt file and an .mp4 file showing how the animated title looks.

          Add a video to your timeline in Premiere Pro

          Drag a video to the timeline pane.

          Open the Essential Graphics pane

          Click on Window > Essential Graphics. You’ll see the Essential Graphics pane on the right.

          Install the mogrt file

          In the Essential Graphics pane, click the “Install Motion Graphics template” button in the bottom right corner. Browse to the mogrt file and click Open. The template will appear at the top of the list of templates in the Essential Graphics pane.

          Drag the template to the timeline

          I dragged it to the V2 track.

          Edit the template

          Double-click the motion graphics clip in the timeline (pink bar) to edit the template. Depending on the template, you can edit the text, colors, size, positioning, etc.

          Preview the lower thirds title

          Click the play button to preview the lower thirds animated title. You may find the duration to be too short.

          Extend the duration of the lower thirds title

          Move the playhead in the timeline to the point where the title text is fully shown and not moving. At that point, right-click on the motion graphic clip in the video track (pink bar) and click on “Insert Frame Hold Segment”.

          This splits all clips at that timestamp. The motion graphic clip is split so that the middle subclip contains the frame showing the full title text, and extending that middle subclip increases the duration of the title. To make room for this, first move the right subclip of the animated title to the right.

          For the split video clip, just drag the right clips toward the left clip.

          You may end up with something like this.

          Preview the animated lower thirds title. If it looks good, export the video. Otherwise, tweak the title further.

          Remove an Object, Text or Watermark from a Video Using Adobe After Effects Content-Aware Fill

          The video below has some text added to it.

          Video with overlaid text / watermark

          The video below is the same video but with the text removed.

          Same video with text / watermark removed

          We can remove objects like text and watermarks from videos using the content-aware fill feature in Adobe After Effects (AE). Here’s how I did it for the video above.

          1. Open your video in Adobe After Effects

          Choose “New Composition From Footage” and select your video.

          2. Shorten work area

          The content-aware fill takes a long time to process. To speed up testing, shorten the work area to a 2-second section by dragging the blue start and end markers on the timeline.

          3. Create a mask

          Click a tool like the pen tool and create a mask area around the object (in this case, text) you want to remove.

          4. Adjust mask settings

          In the video track, expand the Masks and select “Subtract”. For the Mask Feather, choose 20 pixels for the vertical and horizontal feather. For the Mask Expansion, choose 20 pixels as well. Play with these settings until you find values that produce good results.

          These changes cause the mask to look like this

          5. Adjust content-aware fill settings

          If the content-aware fill pane isn’t open, open it by going to Window > Content-Aware Fill.

          For the alpha expansion, I set the value to 13. You can experiment with different values.

          For the fill method, choose “Object”.

          For the range, choose “Work Area” because we only want to apply the fill to the short 2-second work area for now.

          Click the “Generate Fill Layer” button. You will be asked to save the project if you haven’t already done so. You will then see AE analyze and generate the fill layer.

          You will see the fill layer above the video layer in the list of tracks / layers.

          6. Preview the content-aware fill layer

          Click the play button in the preview pane to preview the content-aware fill.

          This is how it looks for me. I think that looks good. If it doesn’t look good, go back and tweak some of the previous settings and try again.

          7. Apply content-aware fill to the entire range containing the text to remove

          Move the blue start and end work area markers to the beginning and end of the range where the object / text you want to remove appears.

          Disable the test content-aware fill layer by toggling the eye icon.

          In the content-aware fill pane, click “Generate Fill Layer”. AE will create a new content-aware fill layer above our test layer. If the work area duration is long, this will take time to process.

          Content-aware fill will go through an analyzing phase followed by a rendering phase.

          8. Preview complete content-aware fill effect

          Click the play button in the preview pane again to see how the fill effect looks in the entire clip. If it looks good, you can export the video.

          9. Export the video

          Click File > Export > Add to Render Queue.

          If the output location is unspecified, choose an Output To location. Then click the Render button.

          Update: Use a reference frame for better results. The more reference frames, the better the results.