23 March 2023

ChatGPT: 4 sources of risk for inbound marketing

By Pierre-Nicolas Schwab, PhD in marketing, director of IntoTheMinds

Companies that rely on SEO to find customers may see their lead generation suffer. After a few weeks of using ChatGPT and the new Bing, I look at the dangers these conversational agents pose to companies that practice inbound marketing. Based on these tests, I have identified 4 issues likely to impact users negatively.

Contact our agency for your inbound marketing projects

Summary

  1. The opacity of the data used for training
  2. Wrong but very convincing answers
  3. Selection bias
  4. Risk to serendipity
  5. Conclusion

The opacity of the data used for training

The responses of conversational agents can evolve according to the data they absorb during their interactions with users. We remember the unfortunate episode of Tay, Microsoft’s chatbot that turned racist, which foreshadowed the possible excesses of this type of artificial intelligence. We know that OpenAI used Kenyan workers paid $2 an hour to supervise the training of ChatGPT and keep it from going off the rails. Nevertheless, the content users submit to ChatGPT is also integrated into its corpus. Major companies such as JPMorgan and Amazon have worried about this and have forbidden their employees to use ChatGPT for fear that confidential data might leak. Yet Microsoft claimed, following the release of GPT-4, that customer data was not used to train the models.


Example of a generative AI hallucination. Bing explains that ChatGPT was developed in China.

Whether your data is exploited or not is not the real problem. The problem is that the conversational agent is trained on an opaque dataset. You will tell me that Google’s algorithm is not transparent either, and you would be right. However, there are 2 differences:

  • the main factors taken into account by Google’s algorithm are more or less known (backlinks, freshness and length of content, etc.), and you can ask for your pages to be indexed
  • the information used to formulate answers in ChatGPT & Co seems quite random. It is impossible to know where this information comes from, and sometimes it is simply wrong. For example, the Bing conversational agent explained that IntoTheMinds had done market research for companies that were never our customers.

All this leads to the famous “hallucinations,” a real informational poison. They are nothing more nor less than fake news tolerated under the cover of technological limitations.

I have reproduced an anecdotal but symptomatic example. Bing explains that ChatGPT was developed in China.




Wrong but very convincing answers

LLMs (Large Language Models) can formulate false answers very convincingly. I used the new Bing in multiple contexts and submitted tasks of varying complexity. My queries were formulated in English and French, and each time I was impressed by the structure of the answer and the arguments put forward. Except that, on the subjects I knew well, I quickly realized the information was often wrong. When you rely on inbound marketing to find customers, these false answers can have negative consequences.

For example, here is what Bing answered to my question, “Why choose IntoTheMinds?”. Since it’s my company, I think I know it relatively well.


Example of a Bing (GPT-3) response containing objectively false data.

Objectively, I have nothing to say about the first 3 points since they come directly from our website. However, there is a problem with the fourth point. We have (unfortunately) never been contacted by Orange, L’Oréal, Nespresso, or Ikea (see part highlighted in yellow).

In this case, Bing’s answer doesn’t really bother me. These companies are prestigious, and being associated with them can only be positive. But imagine that Bing claimed you had worked with a cigarette manufacturer, an arms dealer, or a far-right political organization. The reputational risk is real, and there is no way to control the results or to complain.




Selection bias

Conversational agents like ChatGPT or Bing have the advantage of offering an immediate answer to your questions. In doing so, they tend to give brief answers, sometimes taking shortcuts and favoring brevity over exhaustiveness. The tests I conducted show that the answers are rarely complete, except for very specific, closed questions. As soon as the task entrusted to GPT is more complex, requiring an analysis for example, expect some surprises. You might be better served by a classic search that gives you access to various web pages.

The problem with generative AI is selection bias. ChatGPT doesn’t even tell us its sources, and Bing cites a few without explaining why or how they were selected. We also know that generative AI suffers from an “ideological bias” derived from the training data.

Here’s an example. I asked Bing for advice on market research firms in Belgium. For someone outside this market, the answer may seem convincing. But for someone who has been active in this market for 20 years (me, in this case), the proposed answer can only raise a smile.


Bing (GPT-3) sometimes relies on comparison sites to provide answers. Its answer then depends on the quality of the information these third-party sites provide.
 

You’ll first notice that it relies on a single source (a little-known site called GoodFirms). Bing then proposes 4 companies, one of which is located … in Bulgaria (highlighted in yellow), and another (which I have anonymized) is no longer active.

The selection bias is obvious, and the errors noted above follow from it. The great weakness of generative AI is indeed that it relies on other sites. For this kind of request, Bing does nothing more than “translate” your request into keywords and surface the results of one or a few pages. So there is no real intelligence in the selection of information, but rather a huge vulnerability to false or outdated results.
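To make this mechanism concrete, here is a minimal sketch in Python of what such a “keywords + single-source retrieval” pipeline could look like. It is purely illustrative and in no way Bing’s actual code: every identifier (Page, search_index, to_keywords, retrieve, answer) is a hypothetical stand-in, and the data in search_index is invented for the illustration.

from dataclasses import dataclass

# Words ignored when "translating" the question into keywords (illustrative list).
STOPWORDS = {"the", "a", "an", "in", "for", "of", "which", "are", "best"}

@dataclass
class Page:
    url: str
    text: str

# Hypothetical index: a single comparison site, possibly outdated.
search_index = [
    Page("https://example-comparator.test/market-research-belgium",
         "Top market research firms in Belgium: Firm A (based in Sofia), "
         "Firm B (ceased activity in 2021)."),
]

def to_keywords(query: str) -> set[str]:
    """Reduce the user's question to bare keywords."""
    return {w.lower().strip("?.,") for w in query.split()} - STOPWORDS

def retrieve(keywords: set[str]) -> Page | None:
    """Return the single page with the highest keyword overlap."""
    def score(page: Page) -> int:
        return sum(1 for kw in keywords if kw in page.text.lower())
    best = max(search_index, key=score, default=None)
    return best if best is not None and score(best) > 0 else None

def answer(query: str) -> str:
    page = retrieve(to_keywords(query))
    if page is None:
        return "No answer found."
    # Whatever is wrong or outdated on that one page ends up in the reply.
    return f"According to {page.url}: {page.text}"

print(answer("Which are the best market research firms in Belgium?"))

Run on the question about Belgian market research firms, this sketch returns whatever the single comparison site says, Bulgarian address and defunct company included. That is exactly the kind of error observed in the screenshot above: the answer is only as reliable as the one page that happens to be retrieved.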


Risk to serendipity

I am a great believer in the concept of serendipity on the internet. Serendipity is the ability to discover interesting information that one was not expecting. From this point of view, using a conversational agent presents a risk of intellectual impoverishment for the user.

By reducing the user’s effort to a minimum and short-circuiting the search, conversational agents undoubtedly reduce the user’s cognitive load, but also his ability to discover new information.

In a B2B search, this serendipity is essential. The user behind his screen who is looking for a supplier needs to learn more about the market and find out who to contact. A classic search engine will expose him to different results that will help him refine his request. In doing so, the classic search engine helps make the user better informed. He can then understand the differences and commonalities between competing providers and approach those that best meet his needs. This is the essence of inbound marketing.

With generative AI, these nuances disappear, resulting in biased and potentially false information (see examples above).




Conclusion

In this article, I have discussed 4 problems with generative AI. These problems will impact companies that use inbound marketing to find their customers:

  1. The opacity of the training data leads to “hallucinations,” which are nothing less than “fake news”
  2. Incorrect answers offered in very convincing forms that mislead the user
  3. Selection biases exist, which lead to the presentation of biased information to the user
  4. The operating mechanism of conversational agents eradicates any possibility of serendipity

Even if I don’t believe online search behaviors will change, we will have to deal with ChatGPT and its clones. So it’s better to prepare now to limit the impact on your inbound marketing. This is what I will cover in another article.



