ThoughtLeaders supports advertisers and content creators by providing sponsorship intelligence. Our media buying customers are brands who advertise in media and want to understand where their competitors are advertising. Meanwhile, our media sellers are content creators who accept relevant sponsorships and share them with their audience. Creators want to know which brands would be interested and relevant sponsors for their content.
We aim to help both groups by providing accurate sponsorship data. To do so, we transcribe millions of YouTube videos, podcasts, and Twitch streams every day and then detect and extract sponsorships. In this article, we'll take a closer look at how we do so, starting at the beginning: what exactly is a sponsorship, and how can a machine detect one?
For media buyers and sellers (in our case, brands and creators), a sponsorship is a transaction. The brand pays the thought leader to create content related to the company’s product and to highlight its unique value proposition. It’s a win-win situation: the brand promotes its product to a loyal audience in a more controlled setting (the company sends the creator the key points to mention and can easily track the campaign’s success via UTMs), while the creator gets paid for the integration. Done correctly, brand sponsorships can also help strengthen the creator’s relationship with viewers.
But millions of pieces of content are added each day. How can each brand integration be tracked?
We built an artificial intelligence system that automatically detects sponsorships in transcripts. To teach the system to detect sponsorships, we had to understand which linguistic phenomena distinguish a ‘sponsorship’. Keep in mind that the machine has to learn when to expect an upcoming sponsorship mention, for example which terms are used when a brand is promoted, such as “brought to you by” or “sponsored by”. When we set out to build such a machine, we first turned to data and analyzed millions of pieces of content. We found endless variability but, over time, we were able to establish a specific pattern and ultimately map out the anatomy of a sponsorship.
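As a rough illustration only (a toy sketch, not the AI system described in this article), the simplest form of cue detection is a phrase lookup over the transcript. The first two phrases come from the text above; the others, and the names CUE_PHRASES and find_cues, are assumptions made up for this example.

```python
import re

# "brought to you by" and "sponsored by" are quoted in the article;
# the remaining phrases are assumed examples.
CUE_PHRASES = [
    "brought to you by",
    "sponsored by",
    "thanks to our sponsor",
    "offer for listeners",
]

# One case-insensitive pattern that matches any of the cue phrases.
CUE_PATTERN = re.compile(
    "|".join(re.escape(phrase) for phrase in CUE_PHRASES),
    re.IGNORECASE,
)


def find_cues(transcript):
    """Return a list of (character offset, lower-cased phrase) for each cue found."""
    return [(m.start(), m.group(0).lower()) for m in CUE_PATTERN.finditer(transcript)]
```

A fixed phrase list like this breaks down quickly against the variability described above, which is why a model that learns from annotated data is needed.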
We decided to run a scientific process and annotated a large swathe of our data using LightTag, a set of text annotation tools. By annotating and analyzing the data, we were able to recognize the anatomy of the sponsorship and pick up on patterns that were recurring in integrations across a handful of creators, hosts, channels and networks.
Let’s take a closer look at the example above, a sponsorship with Petco.
The sponsorship starts with a preamble indicating the beginning of a sponsorship, in this case “let you know about the offer for listeners”. The preamble lets the audience know that the topic has shifted to a sponsorship. Next comes the name of the sponsoring brand, e.g., “let you know about the offer for listeners from Petco.com”. Then there is a value proposition: what does Petco do, and what makes it unique? In the text, this is indicated with “Specialty retailers of premium pet food”. Most sponsorships also include a call to action (CTA) and a promotion or incentive. In this case the CTA is “Go to Petco.com/energy” and the special offer is “10% off your order and free shipping”.
Media buyers want to know how effective their sponsorships are. By adding a tracking mechanism, brands can easily track how many consumers reached the website via a specific creator. In this case we see a vanity URL, “Petco.com/energy”. In some cases, promotional codes are used instead of a vanity URL, such as “Enter code ENERGY at checkout”.
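To make the anatomy concrete, here is a minimal sketch of how its parts could be held in a record, with the two tracking mechanisms pulled out by regular expressions. This is an illustration under stated assumptions, not the detection system itself: the class SponsorshipAnatomy, the function extract_tracking, and both patterns (including the short list of domain endings) are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class SponsorshipAnatomy:
    """The parts of a sponsorship named in the article; any of them may be absent."""
    preamble: Optional[str] = None
    brand: Optional[str] = None
    value_proposition: Optional[str] = None
    call_to_action: Optional[str] = None
    incentive: Optional[str] = None
    vanity_url: Optional[str] = None
    promo_code: Optional[str] = None


# Assumed pattern: a domain followed by a single path segment, e.g. brand.com/show.
VANITY_URL = re.compile(
    r"\b([a-z0-9-]+\.(?:com|net|org|io|co)/[a-z0-9_-]+)", re.IGNORECASE
)
# Assumed pattern: the word "code" followed by an upper-case token of 3+ characters.
PROMO_CODE = re.compile(r"\bcode\s+([A-Z0-9]{3,})\b")


def extract_tracking(text):
    """Fill in only the tracking fields; the other parts are left as None."""
    url = VANITY_URL.search(text)
    code = PROMO_CODE.search(text)
    return SponsorshipAnatomy(
        vanity_url=url.group(1) if url else None,
        promo_code=code.group(1) if code else None,
    )
```

Calling extract_tracking on a sentence such as “Enter code ENERGY at checkout” would populate promo_code and leave vanity_url empty. Keeping every field optional reflects the point that real sponsorships often omit some parts.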
In cases where the content has a few sponsors, it’s common for creators to bundle their sponsors in their show notes or at the start of the show. In the example above, we see such a summary mentioning three sponsors: Geico, PAINT MY LIFE and Netsuite. Although it is fairly clear to us that the creator is highlighting sponsorships, the wording of the text does not follow the pattern explained in the first example. For instance, neither the word ‘sponsor’ nor the phrase ‘brought to you by’ appears at all. Furthermore, while Geico’s value proposition is mentioned, a tracking code and an incentive for the consumer are both missing.
As mentioned above, the host doesn’t clearly state that the brands mentioned are sponsoring this piece of content, so this sponsorship could fall through the cracks of a system that only looks for clearly flagged brand integrations. ThoughtLeaders’ AI overcomes this difficulty by generalizing the idea of the sponsorship’s anatomy, which helps it recognize sponsorships that don’t fit the mold perfectly, in the same way we can recognize a human even if they are missing a tooth or even, heaven forbid, an arm.
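One simple way to picture this kind of tolerance for missing parts is a score that adds up whichever parts of the anatomy are present, instead of requiring all of them. The sketch below is a hand-weighted stand-in for illustration only; the weights, the threshold, and the function names are assumptions, and a learned model would derive such signals from annotated data rather than from a fixed table.

```python
# Hypothetical weights in percentage points (they sum to 100).
# These values are assumed for illustration and do not come from the article.
WEIGHTS = {
    "preamble": 30,
    "brand": 25,
    "value_proposition": 15,
    "call_to_action": 10,
    "incentive": 10,
    "tracking": 10,
}


def sponsorship_score(parts_found):
    """Sum the weights of the anatomy parts detected in a passage."""
    unknown = set(parts_found) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown anatomy parts: {sorted(unknown)}")
    return sum(WEIGHTS[part] for part in set(parts_found))


def looks_like_sponsorship(parts_found, threshold=40):
    """Accept a passage when enough of the anatomy is present, even if some parts are missing."""
    return sponsorship_score(parts_found) >= threshold
```

Under these assumed weights, a passage with only a brand name and a value proposition, like the Geico mention, still clears the threshold, while a bare brand name does not.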
Another challenge that our AI overcomes is distinguishing between sponsorships, casual brand mentions and affiliate marketing mentions. Affiliate marketing is the process by which an affiliate earns a commission for marketing a company’s products: the creator promotes a product and adds their ‘affiliate link’, which allows the brand to track sales and ultimately give the affiliate a share of the profit from each sale.
Let’s look at an example of affiliate links to understand the distinction:
Just like sponsorship mentions, affiliate mentions include a brand name, a tracking URL and an incentive. But some key aspects differ, such as the sponsorship indicator, the brand’s value proposition, and personalized vanity URLs. As viewers, we can easily tell that these are affiliate links. The ThoughtLeaders team worked hard to teach our AI system not only to spot, collect and track these affiliate links but also to differentiate them from sponsorship mentions. This has ultimately allowed us to ensure that our clients have an accurate, high-quality pool of affiliate-free sponsorship data.
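The distinction can be sketched as a small rule over which parts of the anatomy are present. This is a deliberately naive illustration, not the classifier described here: the Mention record, its field names, and the rule order in classify are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """A brand mention with flags for the anatomy parts detected around it (hypothetical)."""
    brand: str
    has_sponsorship_indicator: bool = False  # e.g. "sponsored by"
    has_value_proposition: bool = False
    has_tracking_link: bool = False
    has_incentive: bool = False


def classify(mention):
    """Naive rule: sponsorship-only parts win, then a tracking link suggests an affiliate mention."""
    if mention.has_sponsorship_indicator or mention.has_value_proposition:
        return "sponsorship"
    if mention.has_tracking_link:
        return "affiliate"
    return "casual mention"
```

For example, a mention with a tracking link and an incentive but no indicator or value proposition would be labeled "affiliate" under this rule. Real transcripts blur these lines often enough that fixed rules like this would misfire, which is part of why the distinction was hard to teach.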
Our customers shouldn't concern themselves with the anatomy of sponsorship. That's our job. But why do we even care?
Having a model of what a sponsorship looks like helps us create accurate artificial intelligence systems that can automatically process millions of pieces of content a day and give our customers the information they care about: what type of channels a brand should sponsor, where competing brands have sponsored, how long the ad spot was, where in the content the ad appeared, and much more.
Breaking down the data into its anatomy also helps us distinguish between an affiliate link and a sponsorship, because the two have different anatomies. It also lets us use big data to extract deeper analytics about a sponsorship. This ever-growing data pool allows us to give our clients the best estimate of media value.