For years, the solution to make large video libraries searchable was to add descriptive metadata such as a title, description, and a few keywords to ingested video. But this was a manual task: tedious, to say the least, and impossible to scale with the volume of video being produced today.
Next came AI-powered video indexing tools, promising to automate the process. They tagged videos with faces and objects, and provided transcripts—but without context. They simply flooded media asset management (MAM) and digital asset management (DAM) systems with irrelevant metadata, making search results messy and creating more issues than they were initially promising to solve.
It’s why we launched MXT in late 2023—an AI that actually understands and describes video moments like a human. It breaks videos down into meaningful scenes, recognizing who’s in them, what’s happening, where it’s taking place, and even what kind of shots are used. It can also pull the best soundbites from interviews, speeches, or press conferences—so users don’t have to waste time scrubbing through footage.
Customers have used MXT to find and repurpose content from their video libraries seven times faster. And now, with MXT-2, we’re making video indexing even smarter and more accurate.
Most AI-powered video indexing tools rely on labels to make footage searchable. While tagging a clip with “dog,” “beach,” or “crowd” can be helpful, it doesn’t tell you what’s actually happening in a scene. Instead of just assigning generic tags, our technology generates rich, timecoded descriptions that explain each moment like a human would. This level of detail and context makes it much easier to find the exact moment you need, rather than sifting through hundreds of vaguely tagged clips.
And with MXT-2, the descriptions are now even smarter and more precise.
MXT-2 is trained on three times more data than our previous model, making it much better at describing what’s happening in a video. It’s capable of capturing far greater detail and nuance in video content.
To illustrate this improvement, consider the difference between how the previous model and the new MXT-2 model describe the same image.
Another new addition to MXT-2 is a feature we’re calling Custom Moments. With Custom Moments you can now define exactly how you want a specific content type to be segmented—making it easier than ever to find specific moments at scale.
Instead of sifting through hours of footage, you can instantly pinpoint the clips that matter most for your project.
For example:
Custom Moments ensures that the most relevant, high-impact content is instantly accessible for whatever story is being crafted, reducing manual editing time.
Tagging and organizing video has always been a messy, inconsistent process. Even when organizations try to enforce naming conventions and taxonomies, some users still label clips one way while others do it another, leading to a confusing, disorganized mess.
That’s why with MXT-2, we’re introducing another feature called Custom Insights. Custom Insights will automatically classify video content with precision and consistency. By harnessing the metadata generated by MXT, Custom Insights can instantly categorize videos based on whatever parameter makes the most sense for you. Whether it's by topic, theme, content type, or something more specific, classification is no longer a manual burden.
It can also go even deeper. Take a boxing match, for example. You could ask Custom Insights to classify the video by:
With this level of insight, you could instantly search for Floyd Mayweather’s best knockouts within the first five rounds, making content discovery effortless.
This capability not only streamlines video organization but also makes exploratory searches far easier—whether you’re archiving footage, pulling clips for a project, or just trying to find the best moments in a massive video library.
Custom Insights doesn’t just help you find content—it also helps you use it. As well as generating media-level tags to classify video, it can automatically generate text ready for publishing, or simply summarize what’s in a video.
Here are just a few examples of how you can use Custom Insights to generate text from your video.
MXT-2 is set to transform how lean media teams discover and make the most of their extensive video content, speeding up production workflows and enhancing storytelling.
Organizations can search and discover MXT-2 indexed content using the Moments Lab video discovery platform, or integrate MXT-2 generated metadata into their current tool of choice.
As we continue to innovate and expand our AI technology, we remain dedicated to providing our clients with the tools they need to succeed.
Want to learn more about what success looks like with Moments Lab and MXT-2? Get in touch with us.