Google Bard AI – What Sites Were Used To Train It?

Google’s Bard is based on the LaMDA language model, which was trained on a dataset of Internet content called Infiniset, about which very little is known regarding where the data came from and how it was obtained.
The 2022 LaMDA research paper lists percentages of different kinds of data used to train LaMDA, but only 12.5% comes from a public dataset of crawled content from the web and another 12.5% comes from Wikipedia.
Google is purposely vague about where the rest of the scraped data comes from, but there are hints about which sites are included in those datasets.
Google’s Infiniset Dataset
Google Bard is based on a language model called LaMDA, which is an acronym for Language Model for Dialogue Applications.
LaMDA was trained on a dataset called Infiniset.
Infiniset is a blend of Internet content that was deliberately chosen to enhance the model’s ability to engage in dialogue.
The LaMDA research paper (PDF) explains why they chose this composition of content:
“…this composition was chosen to achieve a more robust performance on dialog tasks …while still keeping its ability to perform other tasks like code generation.
As future work, we can study how the choice of this composition may affect the quality of some of the other NLP tasks performed by the model.”
The research paper refers to “dialog” and “dialogs,” which is the spelling conventionally used for these words within the realm of computer science.
In total, LaMDA was pre-trained on 1.56 trillion words of “public dialog data and web text.”
The dataset comprises the following mix (a rough word-count breakdown follows the list):
- 12.5% C4-based data
- 12.5% English language Wikipedia
- 12.5% code documents from programming Q&A websites, tutorials, and others
- 6.25% English web documents
- 6.25% Non-English web documents
- 50% dialogs data from public forums
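As a rough, derived illustration, applying those percentages to the 1.56 trillion-word total gives the approximate word counts below. The total and the shares come from the LaMDA paper; the per-source word counts are simply the product of the two.

```python
# Back-of-the-envelope breakdown of the LaMDA pre-training mix.
# The total and the percentages are from the LaMDA paper; the per-source
# word counts are derived, not stated in the paper.
TOTAL_WORDS = 1.56e12  # 1.56 trillion words

mix = {
    "C4-based data": 0.125,
    "English Wikipedia": 0.125,
    "Code documents (programming Q&A, tutorials, etc.)": 0.125,
    "English web documents": 0.0625,
    "Non-English web documents": 0.0625,
    "Dialogs data from public forums": 0.50,
}

for source, share in mix.items():
    print(f"{source}: ~{share * TOTAL_WORDS / 1e9:.0f} billion words")
```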
The first two parts of Infiniset (C4 and Wikipedia) consist of data that is known.
The C4 dataset, which will be explored shortly, is a specially filtered version of the Common Crawl dataset.
Only 25% of the data is from a named source (the C4 dataset and Wikipedia).
The rest of the data that makes up the bulk of the Infiniset dataset, 75%, consists of words that were scraped from the Internet.
The research paper doesn’t say how the data was obtained from websites, what websites it was obtained from or any other details about the scraped content.
Google only uses generalized descriptions like “Non-English web documents.”
The word “murky” describes something that is unexplained and mostly concealed, and it is the best word for the 75% of data that Google used to train LaMDA.
There are some clues that may give a general idea of what sites are contained within the 75% of web content, but we can’t know for certain.
C4 Dataset
C4 is a dataset developed by Google in 2020. C4 stands for “Colossal Clean Crawled Corpus.”
This dataset is based on the Common Crawl data, which is an open-source dataset.
About Common Crawl
Common Crawl is a registered non-profit organization that crawls the Internet on a monthly basis to create free datasets that anyone can use.
The Common Crawl organization is currently run by people who have worked for the Wikimedia Foundation, former Googlers, and a founder of Blekko, and it counts among its advisors people like Peter Norvig, Director of Research at Google, and Danny Sullivan (also of Google).
How C4 is Developed From Common Crawl
The raw Common Crawl data is cleaned up by removing thin content, obscene words, lorem ipsum placeholder text, and navigational menus, and by deduplicating content, in order to limit the dataset to the main content.
The point of filtering out unnecessary data was to remove gibberish and retain examples of natural English.
This is what the researchers who created C4 wrote:
“To assemble our base data set, we downloaded the web extracted text from April 2019 and applied the aforementioned filtering.
This produces a collection of text that is not only orders of magnitude larger than most data sets used for pre-training (about 750 GB) but also comprises reasonably clean and natural English text.
We dub this data set the “Colossal Clean Crawled Corpus” (or C4 for short) and release it as part of TensorFlow Datasets…”
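To make that filtering description more concrete, here is a simplified sketch of the kind of line-level heuristics the C4 paper describes (minimum word counts, terminal punctuation, blocklisted phrases). It is only an illustration under assumed thresholds, not the actual C4 pipeline.

```python
# Simplified illustration of C4-style line-level cleaning heuristics.
# The blocklist and thresholds are assumptions for demonstration only.
BLOCKLIST = {"lorem ipsum"}                  # stand-in for the paper's bad-word list
TERMINAL_PUNCTUATION = (".", "!", "?", '"')  # keep only sentence-like lines


def clean_page(text: str) -> str | None:
    """Return cleaned page text, or None if the page should be dropped."""
    kept_lines = []
    for line in text.splitlines():
        line = line.strip()
        if len(line.split()) < 5:                    # drop thin/menu-like lines
            continue
        if not line.endswith(TERMINAL_PUNCTUATION):  # drop non-sentences
            continue
        if any(term in line.lower() for term in BLOCKLIST):
            continue
        kept_lines.append(line)
    # Drop pages that end up with too little natural-language text.
    return "\n".join(kept_lines) if len(kept_lines) >= 3 else None
```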
There are other unfiltered versions of C4 as well.
The research paper that describes the C4 dataset is titled, Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (PDF).
Another research paper from 2021, (Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus – PDF) examined the make-up of the sites included in the C4 dataset.
Interestingly, the second research paper discovered anomalies in the original C4 dataset that resulted in the removal of webpages that were Hispanic and African American aligned.
Hispanic-aligned webpages were removed by the blocklist filter (swear words, etc.) at a rate of 32%.
African American-aligned webpages were removed at a rate of 42%.
Presumably those shortcomings have been addressed…
Another finding was that 51.3% of the C4 dataset consisted of webpages that were hosted in the United States.
Lastly, the 2021 analysis of the original C4 dataset acknowledges that the dataset represents just a fraction of the total Internet.
The analysis states:
“Our analysis shows that while this dataset represents a significant fraction of a scrape of the public internet, it is by no means representative of English-speaking world, and it spans a wide range of years.
When building a dataset from a scrape of the web, reporting the domains the text is scraped from is integral to understanding the dataset; the data collection process can lead to a significantly different distribution of internet domains than one would expect.”
The following statistics about the C4 dataset are from the second research paper that is linked above.
The top 25 websites (by number of tokens) in C4 are:
- patents.google.com
- en.wikipedia.org
- en.m.wikipedia.org
- www.nytimes.com
- www.latimes.com
- www.theguardian.com
- journals.plos.org
- www.forbes.com
- www.huffpost.com
- patents.com
- www.scribd.com
- www.washingtonpost.com
- www.fool.com
- ipfs.io
- www.frontiersin.org
- www.businessinsider.com
- www.chicagotribune.com
- www.booking.com
- www.theatlantic.com
- link.springer.com
- www.aljazeera.com
- www.kickstarter.com
- caselaw.findlaw.com
- www.ncbi.nlm.nih.gov
- www.npr.org
The research paper also charts the top 25 top-level domains represented in the C4 dataset; that breakdown is available in the paper itself.
If you’re interested in learning more about the C4 dataset, I recommend reading Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus (PDF) as well as the original 2020 research paper (PDF) for which C4 was created.
What Could Dialogs Data from Public Forums Be?
50% of the training data comes from “dialogs data from public forums.”
That’s all that Google’s LaMDA research paper says about this training data.
If one were to guess, Reddit and other top communities like StackOverflow are safe bets.
Reddit is used in many important datasets such as ones developed by OpenAI called WebText2 (PDF), an open-source approximation of WebText2 called OpenWebText2 and Google’s own WebText-like (PDF) dataset from 2020.
Google also published details of another dataset of public dialog sites a month before the publication of the LaMDA paper.
This dataset that contains public dialog sites is called MassiveWeb.
This is not to say that the MassiveWeb dataset was used to train LaMDA.
But it offers a good example of what Google chose for another language model focused on dialogue.
MassiveWeb was created by DeepMind, which is owned by Google.
It was designed for use by a large language model called Gopher (link to PDF of research paper).
MassiveWeb uses dialog web sources that go beyond Reddit in order to avoid creating a bias toward Reddit-influenced data.
It still uses Reddit. But it also contains data scraped from many other sites.
Public dialog sites included in MassiveWeb are:
- Quora
- YouTube
- Medium
- StackOverflow
Again, this isn’t suggesting that LaMDA was trained with the above sites.
It is simply meant to illustrate what Google could have used, by pointing to a dataset Google was working on around the same time as LaMDA, one that contains forum-type sites.
The Remaining 37.5%
The last group of data sources are:
- 12.5% code documents from sites related to programming, like Q&A sites, tutorials, etc.
- 12.5% Wikipedia (English)
- 6.25% English web documents
- 6.25% Non-English web documents.
Google does not specify what sites are in the Programming Q&A Sites category that makes up 12.5% of the dataset that LaMDA trained on.
So we can only speculate.
Stack Overflow and Reddit seem like obvious choices, especially since they were included in the MassiveWeb dataset.
Which “tutorials” sites were crawled? We can only speculate.
That leaves the final three categories of content, two of which are exceedingly vague.
English language Wikipedia needs no discussion; we all know Wikipedia.
But the following two are not explained:
English and non-English language web documents are a general description covering 12.5% of the content included in the dataset.
That’s all the information Google gives about this part of the training data.
Should Google Be Transparent About Datasets Used for Bard?
Some publishers feel uncomfortable that their sites are used to train AI systems because, in their opinion, those systems could eventually make their websites obsolete and cause them to disappear.
Whether that’s true or not remains to be seen, but it is a genuine concern expressed by publishers and members of the search marketing community.
Google is frustratingly vague about the websites used to train LaMDA as well as what technology was used to scrape the websites for data.
As was seen in the analysis of the C4 dataset, the methodology of choosing which website content to use for training large language models can affect the quality of the language model by excluding certain populations.
Should Google be more transparent about what sites are used to train their AI or at least publish an easy to find transparency report about the data that was used?
Featured image by Shutterstock/Asier Romero
New Ecommerce Exploit Affects WooCommerce, Shopify, Magento


A serious hacking attack has been exploiting ecommerce websites to steal credit card information from users and to spread the attack to other websites.
These hacking attacks, known as Magecart-style skimmers, are spreading worldwide across multiple ecommerce platforms.
Attackers are targeting a variety of ecommerce platforms:
- Magento
- Shopify
- WooCommerce
- WordPress
What Does the Attack Do?
The attackers have two goals when infecting a website:
1. Use the compromised site to spread the attack to other sites.
2. Steal personal information, such as credit card data, from customers of the infected website.
Identifying an infection is difficult because the code dropped on a website is encoded and sometimes masked as a Google Tag or a Facebook Pixel code.


What the code does, however, is target input forms for credit card information.
It also serves as an intermediary to carry out attacks on behalf of the attacker, thus covering up the true source of the attacks.
Magecart Style Skimmer
A Magecart attack is an attack that enters through an existing vulnerability on the ecommerce platform itself.
On WordPress and WooCommerce, it could be a vulnerability in a theme or plugin.
On Shopify, it could be an existing vulnerability in that platform.
In all cases, the attackers are taking advantage of vulnerabilities that are present in the platform the ecommerce sites are using.
This is not a case where there is one single vulnerability that can be conveniently fixed. It’s a wide range of them.
The report by Akamai states:
“Before the campaign can start in earnest, the attackers will seek vulnerable websites to act as “hosts” for the malicious code that is used later on to create the web skimming attack.
…Although it is unclear how these sites are being breached, based on our recent research from similar, previous campaigns, the attackers will usually look for vulnerabilities in the targeted websites’ digital commerce platform (such as Magento, WooCommerce, WordPress, Shopify, etc.) or in vulnerable third-party services used by the website.”
Recommended Action
Akamai recommends that all ecommerce sellers secure their websites. That means making sure all third-party apps and plugins are updated and that the platform is running its latest version.
They also recommend using a Web Application Firewall (WAF), which detects and prevents intrusions when hackers probe a site in search of vulnerabilities.
Users of platforms like WordPress have multiple security solutions, with popular and trusted ones being Sucuri Security (website hardening) and Wordfence (WAF).
Akamai recommends:
“…the complexity, deployment, agility, and distribution of current web application environments — and the various methods attackers can use to install web skimmers — require more dedicated security solutions, which can provide visibility into the behavior of scripts running within the browser and offer defense against client-side attacks.
An appropriate solution must move closer to where the actual attack on the clients occurs. It should be able to successfully identify the attempted reads from sensitive input fields and the exfiltration of data (in our testing we employed Akamai Page Integrity Manager).
We recommend that these events are properly collected in order to facilitate fast and effective mitigation.”
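For site owners who want a lightweight check on top of those measures, here is a minimal sketch that fetches a page and flags script sources loaded from domains you did not expect. It is not a substitute for a WAF or a dedicated client-side security product, and the URL and allowlist below are hypothetical placeholders.

```python
# Naive audit: flag <script src=...> tags whose host is not on an allowlist.
# The allowlist and the page URL are hypothetical examples, not a real config.
from html.parser import HTMLParser
from urllib.parse import urlparse
from urllib.request import urlopen

ALLOWED_SCRIPT_HOSTS = {
    "www.googletagmanager.com",  # expected tag manager
    "connect.facebook.net",      # expected pixel
    "cdn.example-shop.com",      # hypothetical first-party CDN
}


class ScriptSrcCollector(HTMLParser):
    """Collects the src attribute of every <script> tag on the page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def audit_page(url: str) -> list[str]:
    """Return script sources served from hosts outside the allowlist."""
    html = urlopen(url).read().decode("utf-8", errors="replace")
    parser = ScriptSrcCollector()
    parser.feed(html)
    return [
        src for src in parser.sources
        if urlparse(src).netloc and urlparse(src).netloc not in ALLOWED_SCRIPT_HOSTS
    ]


if __name__ == "__main__":
    for src in audit_page("https://example-shop.com/checkout"):  # hypothetical URL
        print("Review this script source:", src)
```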
Read the original report for more details:
New Magecart-Style Campaign Abusing Legitimate Websites to Attack Others
Google’s John Mueller On Domain Selection: gTLDs Vs. ccTLDs


Google Search Advocate, John Mueller, has shed light on the difference between generic top-level domains (gTLDs) and country code top-level domains (ccTLDs), offering practical advice to businesses and SEO professionals.
His comments arrive amidst the recent update by Google that categorizes .ai domains as gTLDs, moving away from their previous association with Anguilla, a British Overseas Territory in the Eastern Caribbean.
Understanding The gTLD & ccTLD Distinction
A website owner in a Reddit thread on the r/SEO forum asks about the SEO implications of choosing country-specific domains.
Responding to the thread, Mueller notes that ccTLDs, such as .nl, .fr, and .de, are advantageous if a business is targeting customers in that region.
However, for those aiming for a global market or targeting a different country than the ccTLD suggests, a gTLD or the relevant ccTLD might be a better choice.
Mueller explains:
“The main thing I’d watch out for is ccTLD (“country code” — like nl, fr, de) vs gTLD (“generic” – com, store, net, etc). ccTLDs tend to focus on one country, which is fine if you plan on mostly selling in that country, or if you want to sell globally. If you mostly want to target another country (like “nationwi.de” but you want to target the US), then make sure to get either that ccTLD or a gTLD.”
He further clarifies that new TLDs are all classified as gTLDs. Even those that seem geographically specific, like “.berlin,” are technically not considered ccTLDs.
Mueller continues:
“All of the new TLDs are gTLDs, for what it’s worth — some sound geo-specific, but they’re technically not (like “.berlin” — it’s a gTLD). Apart from ccTLD vs gTLD for SEO, there’s also the user-aspect to think about: will they click on a link that they perceive to be for users in another country?”
In another similar thread, Mueller warns against selecting TLDs predominantly used by spammers:
“From an SEO POV, I would just not pick a TLD that’s super-cheap and over-run with spam.” This comment underlines the importance of considering the reputation of TLDs when strategizing for SEO.
Google’s .ai Domain Update
Google recently updated its help documentation, specifying that it now treats .ai domain names as a gTLD, similar to .com, .org, and others.
This means Google Search won’t consider .ai domains geo-specific to Anguilla.
Gary Illyes from the Google Search Relations team provides the reason behind the change:
“We won’t infer the target country from the ccTLD so targeting Anguilla became a little harder, but then again there are barely any .ai domains that try to do that anyway.”
This update is significant for businesses and SEO professionals previously avoiding the use of .ai domain names for fear of Google associating them with Anguilla.
The new classification removes those concerns, and such domains can now be used without worrying about geo-specific targeting by Google’s algorithms.
In Summary
Choosing the right domain, whether country-specific (ccTLD) or generic (gTLD), makes a difference in reaching the right audience.
A ccTLD could be a good fit if a business mainly targets customers in a specific country. A gTLD might be a better choice if the goal is to reach a broader, global audience.
Additionally, it’s a good idea to avoid spammy TLDs that hurt your site’s reputation.
Mueller’s comments are a good reminder of the strategic decisions in registering your domain.
Featured image generated by the author using Midjourney.
11 Tips For Optimizing Performance Max Campaigns


Performance Max campaigns are the pinnacle of automation in PPC, so it’s no surprise they continue to be a major topic of debate for PPC professionals looking to balance time savings with peak campaign performance.
The primary goal of Performance Max campaigns is to drive conversions, such as sales, leads, or sign-ups, for your business while maintaining a competitive cost-per-action (CPA) or return-on-ad-spend (ROAS).
By utilizing Smart Bidding strategies and dynamically adapting ad creatives, these campaigns help advertisers reach a wider audience and boost the results obtained from traditional, single-channel campaigns.
But their high dependence on AI doesn’t mean these are set-it-and-forget-it campaigns.
Automation can still benefit from the touch of an expert PPC manager. But because they are so different from traditional campaigns, there are unique ways to optimize Performance Max (PMax) campaigns.
PMax optimization broadly falls into three categories:
- Setting them up for success.
- Monitoring that the AI is driving the right results.
- Tweaking the campaigns to further optimize their performance.
Read on to learn how to get the most out of your PMax campaigns by addressing each of these three areas of opportunity.
How To Set Up PMax Campaigns For Success
Let’s start with what can be done to set up Performance Max campaigns to be successful out of the gate.
Remember that one big risk of automated PPC is that machine learning algorithms can eat up a significant amount of budget during the learning phase, in which they establish what works and what doesn’t.
Many advertisers don’t have the patience or the deep pockets to pay for machines to learn what they already know from their own experience.
1. Run It In Addition To Traditional Campaign Types
This advice is straight from Google, which says:
“It’s designed to complement your keyword-based Search campaigns to help you find more converting customers across all of Google’s channels like YouTube, Display, Search, Discover, Gmail, and Maps.”
And while running Performance Max as a stand-alone campaign is better than not advertising on Google at all, for professional marketers, it should be seen as a supplement to existing campaign types.
Running PMax campaigns in conjunction with traditional search and display campaigns offers advertisers a more comprehensive and diversified marketing strategy.
This approach allows businesses to capitalize on the strengths of each campaign type while mitigating their limitations, resulting in a more balanced and effective promotional effort.
Traditional search campaigns are particularly effective at capturing user intent through keyword targeting, ensuring ads are shown to users actively searching for relevant products or services.
Traditional display campaigns, on the other hand, are excellent at raising brand awareness and reaching audiences across a vast network of websites and apps.
PMax campaigns complement these traditional approaches by utilizing machine learning to optimize ad targeting and placement across multiple Google platforms.
This broadens the reach of advertising efforts, tapping into new audience segments and driving conversions more efficiently.
Combining these campaign types allows advertisers to cover all stages of the customer journey, from awareness and consideration to conversion and retention, while maximizing their ROAS.
2. Exclude Brand Keywords From Performance Max
One keyword-targeted search campaign you should always have is a brand campaign.
Then, ask your Google rep to exclude your brand keywords from all PMax campaigns so they don’t cannibalize traffic from your brand campaign.
Brand traffic should be inexpensive because it’s leveraging the power of your own brand. When users search for that, your ads will be the best match with the highest Quality Score and hence should be discounted significantly.
But because Performance Max’s mission is to generate more conversions, it may actually end up bidding on really expensive brand-adjacent queries.
For example, if I bid on the keyword “optmyzr,” I’ll pay around $0.10 per click when someone searches for exactly that.
(Disclosure, I am the co-founder of Optmyzr.)
But if I show ads for the keyword “optmyzr ppc management software,” I’m competing against every advertiser who bids for ‘ppc management software,’ my brand discount disappears, and those clicks will cost several dollars each.
In a branded search campaign, I can control exactly which traffic to target using positive and negative keywords. But in Performance Max, there is no easy way to manage keywords, so Google may use the really cheap brand traffic to subsidize the much more expensive brand-adjacent traffic.
Ultimately, you will get results within your stated ROAS or CPA limits. And while that may be acceptable to some, many advertisers prefer to manage their brand campaign separately from everything else.
3. Create Multiple Performance Max Campaigns To Target Different Goals
The same reasons you would run more than one campaign in an account without Performance Max apply to why you should consider having multiple PMax campaigns.
For example, online retailers often set different goals for different product categories because they have different profit margins. By splitting these products into different campaigns with different ROAS targets, advertisers can maximize their profitability.


Maintaining multiple campaigns also supports seasonal advertising plans that may require different budgets at different times of the year.
Google supports up to 100 Performance Max campaigns per account, which suggests that it, too, recognizes there are many good reasons an advertiser might want to maintain more than one campaign.
4. Manage Final URL Expansion
When you create a PMax campaign, you tell Google what landing page to send traffic to. But you also get to decide if Google can expand to other landing pages on your domain.
Think of it a bit like dynamic search ads (DSAs), which automatically match your site’s pages to potentially relevant searches and automatically generate the ads to show.
Final URL expansion should be used cautiously.
At the campaign’s onset, consider focusing all your budget on the landing pages you care most about. If the results are good, you can then let Google expand to more final URLs automatically.
And always be sure to use rules and exclusions so that Google doesn’t show your ads for parts of your site you don’t want advertised. For example, exclude your login page (assuming it ranks well in organic search).
You can also exclude sections of your site that are the focus of other campaigns. A retailer could exclude all pages that include the path ‘electronics’ in their apparel campaign to ensure consumers interested in electronics are served ads from the most relevant campaign.
5. Add Audience Signals From The Start
Adding audiences to a Performance Max campaign helps enhance the targeting and performance of your marketing efforts.
While PMax campaigns already utilize machine learning to optimize ad targeting, incorporating audience information provides additional context that can further improve the campaign’s efficiency.
Adding audience information enables the machine learning algorithms in PMax campaigns to make more informed decisions when optimizing ad targeting and placements. This can lead to better campaign performance and a higher ROAS.
By specifying particular audience segments, such as in-market, affinity, or remarketing audiences, advertisers can tailor their campaign messaging and creative to resonate better with their target users. This enables more personalized and relevant ad experiences, resulting in higher engagement and conversion rates.
Advertisers should also attach their own audiences to Performance Max campaigns. For example, by attaching a list of all their existing customers, they can choose to have the PMax campaign prioritize new user acquisition.
Because it is generally harder and more expensive to find new users than to convince existing users to make another purchase, adding this setting can better focus the ad budget on what is most valuable to the business.
How To Monitor Performance Max Campaigns For Success
Even when campaigns are well set up, monitoring AI is always a smart idea because it can sometimes make questionable decisions.
When I accidentally turned on automatically applied recommendations from Google, I found that my brand keyword ‘optmyzr’ was removed by Google because the AI felt it was redundant to some other keywords in my campaign, particularly some misspellings of our brand name.
I investigated and found that the keywords Google preferred delivered fewer conversions and had a higher CPA than the keywords it removed. So not only was the AI semantically wrong, it also made a bad decision for my bottom line.
So let’s look at some ways to monitor Performance Max campaigns.
6. Report Where Your Performance Max Traffic Is Coming From
Just like you may have monitored clicks and impressions by device types or from different geographic areas, in PMax you should care about the performance of the various channels where your ads are shown.
If you only look at the overall performance of a PMax campaign, you may be falling into the trap of averages.
Relying solely on averages can be misleading and might not accurately represent the true nature of the underlying data.
Averages can oversimplify complex data, reducing it to a single value that may not capture important nuances or patterns within the dataset. This can mask the variability or range of values in the data, leading to false assumptions about its consistency or homogeneity.




For example, is low performance on the display network made up for by the great performance of ads on YouTube?
On average, the campaign drives the results you want. But by eliminating some wasteful portions, results could be even better than what you asked for.
Even if the campaign is delivering the desired results, knowing about possible inefficiencies puts you in a better position to address those and tilt the playing field back in your favor.
Tools like Optmyzr make it easy to see where your budget is spent in PMax, and there are also Google Ads scripts that will add this type of clarity to your data.
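As a rough illustration of the kind of analysis those tools and scripts enable, here is a minimal sketch that assumes you already have a per-channel export of PMax cost and conversion data in a CSV; the file name and column names are hypothetical.

```python
# Aggregate a hypothetical per-channel PMax export and compare each channel
# to the campaign's blended ROAS, to spot channels hidden by the averages.
# Assumed CSV columns: campaign, channel, cost, conversions, conv_value
import pandas as pd

df = pd.read_csv("pmax_channel_breakdown.csv")  # hypothetical export

summary = (
    df.groupby(["campaign", "channel"], as_index=False)
      .agg(cost=("cost", "sum"),
           conversions=("conversions", "sum"),
           conv_value=("conv_value", "sum"))
)
summary["cpa"] = summary["cost"] / summary["conversions"]   # inf if a channel has no conversions
summary["roas"] = summary["conv_value"] / summary["cost"]

# Blended campaign-level ROAS for comparison against each channel.
totals = summary.groupby("campaign")[["cost", "conv_value"]].transform("sum")
summary["campaign_roas"] = totals["conv_value"] / totals["cost"]
summary["roas_vs_campaign"] = summary["roas"] - summary["campaign_roas"]

print(summary.sort_values(["campaign", "roas_vs_campaign"]))
```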
7. Monitor For Cannibalization
Because PMax campaigns don’t include the traditional search terms report and only surface part of that data in Insights, it can be difficult to know when they are cannibalizing the other campaigns you’re running in parallel.
When it comes to standard shopping campaigns and PMax for retail (which replaced Smart Shopping campaigns), the PMax campaign always takes precedence over the traditional shopping campaign. For this reason, it’s important to segment products to avoid overlap.
For example, you could advertise shower doors in one campaign and bathroom vanities in another. But if there is any possible overlap, even segmenting campaigns may not lead to the desired result.
For example, shower wands advertised in a traditional shopping campaign may be closely enough related to shower doors to get mixed into the PMax campaign for shower enclosures.
Regarding keyword cannibalization, Google says if the user’s query is identical to an eligible Search keyword of any match type in your account, the Search campaign will be prioritized over Performance Max.
But if the query is not identical to an eligible Search keyword, the campaign or ad with the highest Ad Rank, which considers creative relevance and performance, will be selected.
And even a keyword that is an identical match may be ineligible due to a variety of factors and still get cannibalized.
The best way to monitor for cannibalization is to monitor campaign volumes and look for shifts. Does an unexpected drop in a search campaign correspond to an increase in traffic to the PMax campaign? If so, dig deeper and use our optimization tip for managing negative keywords that we’ll cover in the next section.
Optimizations For Performance Max
While PMax promises to optimize itself on an ongoing basis thanks to AI, there are some proactive ways you can still help the machines deliver better results.
8. Use Account-Level Negative Keywords
Unfortunately, it’s not possible to add negative keywords to a PMax campaign without the help of a Google rep. And even then, they will generally only add negative brand keywords to help prevent cannibalizing a brand campaign.
But PMax campaigns can work with shared negative keyword lists if you email Support and ask them to attach one of your shared negative lists to your PMax campaigns.
From that point forward, you can simply add negative keywords to the shared list, and they will instantly take effect on the PMax campaign that is associated with the shared negative list.
While Google doesn’t share full search term details for PMax the way it does for search campaigns, it will show keyword themes under insights. This is one good source for negative keyword ideas.
You should also leverage data from traditional search campaigns you’re running in parallel to PMax.
So mine your traditional search campaigns for negative keyword ideas: for example, queries containing terms like ‘free’ or ‘login’ that never convert well. Add these as negative keywords to the shared negative list attached to your PMax campaign (see the sketch below).
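Here is a minimal sketch of that mining step, assuming a search terms report exported to a CSV; the file name, column names, and thresholds are hypothetical and should be tuned to your account.

```python
# Surface search terms with meaningful spend and zero conversions as
# candidates for the shared negative keyword list.
# Assumed CSV columns: search_term, clicks, cost, conversions
import pandas as pd

MIN_CLICKS = 20    # illustrative threshold
MIN_COST = 10.0    # illustrative threshold

terms = pd.read_csv("search_terms_report.csv")  # hypothetical export

candidates = terms[
    (terms["conversions"] == 0)
    & ((terms["clicks"] >= MIN_CLICKS) | (terms["cost"] >= MIN_COST))
].sort_values("cost", ascending=False)

print(candidates[["search_term", "clicks", "cost"]].head(25))
```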
9. Use Account-Level Placement Exclusions
When it comes to placements, Google has a predefined report that shows placements where your Performance Max ads were shown.




This is a great starting point to find ideas for placements to exclude.
To exclude placements from PMax, you’ll need to exclude them at the account level, since it’s not possible to add negative placements to individual PMax campaigns. You’ll find this ability under the “Content” section of the Google Ads account.




Just like with negative keyword discovery, consider using your account-wide placement data from all campaigns to find placements to exclude in PMax.
And if you run multiple Google Ads accounts, you can get even better results by finding money-wasting sites and apps in the display network to exclude across all the accounts you manage.
Or when working with a tool provider, they may even be able to help you find negative placement ideas from their own vast network of data.
10. Exclude Non-Performing Geo Locations
Even though PMax uses automated bidding, which doesn’t support geo bid adjustments, you can still leverage geo data in two ways.
You can either exclude locations that don’t drive conversions or use conversion value rules to manipulate the value you report for conversions from different regions so that the bids will get adjusted accordingly.
For example, if you report conversions as soon as someone fills out your lead form, but you know that people in Munich become paying customers at a higher rate than people who fill out the same form from Berlin, you can set a conversion value rule to value conversions from Munich more highly.
This helps automated bidding make the right decisions about what CPC bid will likely have the desired ROAS.
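To illustrate the arithmetic with made-up numbers (not anything from Google’s documentation), a 1.5x value rule lets value-based bidding justify a proportionally higher bid for the higher-value region at the same target ROAS:

```python
# Illustrative arithmetic only: how a conversion value rule shifts the bid
# ceiling under target-ROAS bidding. All numbers below are hypothetical.
def max_cpc(conv_rate: float, reported_value: float, target_roas: float) -> float:
    """Rough bid ceiling such that expected value per click / CPC == target ROAS."""
    return (conv_rate * reported_value) / target_roas


base_lead_value = 100.0  # value reported when any lead form is submitted
target_roas = 4.0        # i.e. 400%
conv_rate = 0.05         # 5% of clicks submit the form

# Without a value rule, Berlin and Munich clicks look identical to bidding.
print(round(max_cpc(conv_rate, base_lead_value, target_roas), 2))        # 1.25

# With a 1.5x value rule for Munich (those leads close more often),
# automated bidding can justify a proportionally higher bid for Munich.
print(round(max_cpc(conv_rate, base_lead_value * 1.5, target_roas), 2))  # 1.88
```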
And that leads to our final optimization tip, which is a big one.
11. Feed Correct Conversion Data
AI can only do a good job for your account if you tell it what the goal is.
And the goal should be precise.
It shouldn’t be to get the most conversions possible if your real goal is to drive profits.
Or to get as many leads as possible if you want leads that turn into customers.
Setting up goals correctly can make a huge difference in how well PPC automation will perform.
Updating goals with margin data or with data from your sales team can be a significant effort, and that’s why I’ve listed this as an ongoing optimization strategy rather than an up-front setup task.
Get PMax up and running with the conversions you’ve already been operating with, and then work to constantly enhance that conversion data.
Conclusion
With these 11 tips to optimize your Performance Max campaigns, you can expect better results while also benefiting from the time savings promised by automated campaign types.
There are many more tips I didn’t cover here that you can discover by joining the dialogue online.
And there will be many more tips to come as PPC automation continues to evolve.
Featured Image: TippaPatt/Shutterstock