

Leaked Google Memo Admits Defeat By Open Source AI




A leaked Google memo offers a point by point summary of why Google is losing to open source AI and suggests a path back to dominance and owning the platform.

The memo opens by acknowledging their competitor was never OpenAI and was always going to be Open Source.

Cannot Compete Against Open Source

Further, they admit that they are not positioned in any way to compete against open source, acknowledging that they have already lost the struggle for AI dominance.

They wrote:

“We’ve done a lot of looking over our shoulders at OpenAI. Who will cross the next milestone? What will the next move be?

But the uncomfortable truth is, we aren’t positioned to win this arms race and neither is OpenAI. While we’ve been squabbling, a third faction has been quietly eating our lunch.

I’m talking, of course, about open source.

Plainly put, they are lapping us. Things we consider “major open problems” are solved and in people’s hands today.”

The bulk of the memo is spent describing how Google is outplayed by open source.

And even though Google has a slight advantage over open source, the author of the memo acknowledges that it is slipping away and will never return.

The self-analysis of the metaphoric cards they’ve dealt themselves is considerably downbeat:

“While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly.

Open-source models are faster, more customizable, more private, and pound-for-pound more capable.

They are doing things with $100 and 13B params that we struggle with at $10M and 540B.

And they are doing so in weeks, not months.”

Large Language Model Size is Not an Advantage

Perhaps the most chilling realization expressed in the memo is that Google’s size is no longer an advantage.

The outlandishly large size of their models is now seen as a disadvantage and not in any way the insurmountable advantage they thought it to be.

The leaked memo lists a series of events that signal Google’s (and OpenAI’s) control of AI may rapidly be over.

It recounts that barely a month ago, in March 2023, the open source community obtained a leaked large language model developed by Meta called LLaMA.

Within days and weeks the global open source community developed all the building parts necessary to create Bard and ChatGPT clones.

Sophisticated steps such as instruction tuning and reinforcement learning from human feedback (RLHF) were quickly replicated by the global open source community, on the cheap no less.

  • Instruction tuning
    A process of fine-tuning a language model to make it do something specific that it wasn’t initially trained to do.
  • Reinforcement learning from human feedback (RLHF)
    A technique where humans rate a language model’s output so that it learns which outputs are satisfactory to humans.
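To make the RLHF idea a little more concrete, here is a minimal sketch of one common way human ratings are turned into a training signal: a pairwise preference loss for a reward model. The memo does not describe this formula, and the reward scores below are made-up numbers for illustration only.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the output the human preferred receives the
    higher reward score, and large when the ranking is the wrong way round.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical reward scores for two answers to the same prompt.
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_human = preference_loss(reward_chosen=0.0, reward_rejected=2.0)

print(f"loss when the model agrees with the rater:    {agrees_with_human:.3f}")
print(f"loss when the model disagrees with the rater: {disagrees_with_human:.3f}")
```

Training pushes this loss down, which nudges the reward model toward scoring outputs the way the human raters did.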

RLHF is the technique used by OpenAI to create InstructGPT, which is a model underlying ChatGPT and allows the GPT-3.5 and GPT-4 models to take instructions and complete tasks.

RLHF is the fire that open source has taken from

Scale of Open Source Scares Google

What scares Google in particular is the fact that the Open Source movement is able to scale their projects in a way that closed source cannot.

The question and answer dataset used to create the open source ChatGPT clone, Dolly 2.0, was entirely created by thousands of employee volunteers.

Google and OpenAI relied partially on questions and answers scraped from sites like Reddit.

The open source Q&A dataset created by Databricks is claimed to be of a higher quality because the humans who contributed to creating it were professionals and the answers they provided were longer and more substantial than what is found in a typical question and answer dataset scraped from a public forum.

The leaked memo observed:

“At the beginning of March the open source community got their hands on their first really capable foundation model, as Meta’s LLaMA was leaked to the public.

It had no instruction or conversation tuning, and no RLHF.

Nonetheless, the community immediately understood the significance of what they had been given.

A tremendous outpouring of innovation followed, with just days between major developments…

Here we are, barely a month later, and there are variants with instruction tuning, quantization, quality improvements, human evals, multimodality, RLHF, etc. etc. many of which build on each other.

Most importantly, they have solved the scaling problem to the extent that anyone can tinker.

Many of the new ideas are from ordinary people.

The barrier to entry for training and experimentation has dropped from the total output of a major research organization to one person, an evening, and a beefy laptop.”

In other words, what took months and years for Google and OpenAI to train and build only took a matter of days for the open source community.

That has to be a truly frightening scenario to Google.

It’s one of the reasons why I’ve been writing so much about the open source AI movement as it truly looks like where the future of generative AI will be in a relatively short period of time.

Open Source Has Historically Surpassed Closed Source

The memo cites the recent experience of OpenAI’s DALL-E, the deep learning model used to create images, versus the open source Stable Diffusion, as a harbinger of what is currently befalling generative AI like Bard and ChatGPT.

DALL-E was released by OpenAI in January 2021. Stable Diffusion, the open source version, was released a year and a half later in August 2022 and in a few short weeks overtook the popularity of DALL-E.

This timeline graph shows how fast Stable Diffusion overtook Dall-E:

The above Google Trends timeline shows how interest in the open source Stable Diffusion model vastly surpassed that of Dall-E within a matter of three weeks of its release.

And though Dall-E had been out for a year and a half, interest in Stable Diffusion kept soaring exponentially while OpenAI’s Dall-E remained stagnant.

The existential threat of similar events overtaking Bard (and OpenAI) is giving Google nightmares.

The Creation Process of Open Source Model is Superior

Another factor that’s alarming engineers at Google is that the process for creating and improving open source models is fast, inexpensive and lends itself perfectly to a global collaborative approach common to open source projects.

The memo observes that new techniques such as LoRA (Low-Rank Adaptation of Large Language Models) allow language models to be fine-tuned in a matter of days at very low cost, with the resulting LLM comparable to the far more expensive LLMs created by Google and OpenAI.
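A rough sketch of why LoRA is so cheap: instead of updating a full weight matrix, it trains two small low-rank matrices whose product is added to the frozen weights. The layer size (4096 x 4096) and rank (8) below are illustrative assumptions, not figures from the memo.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in);
    the original matrix W stays frozen.
    """
    return rank * (d_out + d_in)


# Hypothetical layer dimensions and rank, chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}):    {lora:,} trainable parameters")
print(f"reduction:        {full / lora:.0f}x fewer")
```

Under these assumed sizes the trainable parameter count for that one layer drops from about 16.8 million to about 65 thousand, which is the kind of saving that puts fine-tuning within reach of a single machine.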

Another benefit is that open source engineers can build on top of previous work, iterate, instead of having to start from scratch.

Building large language models with billions of parameters in the way that OpenAI and Google have been doing is not necessary today.

That may be the point Sam Altman was hinting at when he recently said that the era of massive large language models is over.

The author of the Google memo contrasted the cheap and fast LoRA approach to creating LLMs against the current big AI approach.

The memo author reflects on Google’s shortcoming:

“By contrast, training giant models from scratch not only throws away the pretraining, but also any iterative improvements that have been made on top. In the open source world, it doesn’t take long before these improvements dominate, making a full retrain extremely costly.

We should be thoughtful about whether each new application or idea really needs a whole new model.

…Indeed, in terms of engineer-hours, the pace of improvement from these models vastly outstrips what we can do with our largest variants, and the best are already largely indistinguishable from ChatGPT.”

The author concludes with the realization that what they thought was their advantage, their giant models and concomitant prohibitive cost, was actually a disadvantage.

The global-collaborative nature of Open Source is more efficient and orders of magnitude faster at innovation.

How can a closed-source system compete against the overwhelming multitude of engineers around the world?

The author concludes that they cannot compete and that direct competition is, in their words, a “losing proposition.”

That’s the crisis, the storm, that’s developing outside of Google.

If You Can’t Beat Open Source Join Them

The only consolation the memo author finds in open source is that because the open source innovations are free, Google can also take advantage of them.

Lastly, the author concludes that the only approach open to Google is to own the platform in the same way they dominate the open source Chrome and Android platforms.

They point to how Meta is benefiting from releasing their LLaMA large language model for research and how they now have thousands of people doing their work for free.

Perhaps the big takeaway from the memo then is that Google may in the near future try to replicate their open source dominance by releasing their projects on an open source basis and thereby own the platform.

The memo concludes that going open source is the most viable option:

“Google should establish itself a leader in the open source community, taking the lead by cooperating with, rather than ignoring, the broader conversation.

This probably means taking some uncomfortable steps, like publishing the model weights for small ULM variants. This necessarily means relinquishing some control over our models.

But this compromise is inevitable.

We cannot hope to both drive innovation and control it.”

Open Source Walks Away With the AI Fire

Last week I made an allusion to the Greek myth of Prometheus stealing fire from the gods on Mount Olympus, casting open source as Prometheus against the “Olympian gods” of Google and OpenAI:

I tweeted:

“While Google, Microsoft and Open AI squabble amongst each other and have their backs turned, is Open Source walking off with their fire?”

The leak of Google’s memo confirms that observation, but it also points to a possible strategy change at Google to join the open source movement and thereby co-opt it and dominate it in the same way they did with Chrome and Android.

Read the leaked Google memo here:

Google “We Have No Moat, And Neither Does OpenAI”



Google Quietly Ends Covid-Era Rich Results





Google removed the Covid-era structured data associated with the Home Activities rich results that allowed online events to be surfaced in search since August 2020, publishing a mention of the removal in the search documentation changelog.

Home Activities Rich Results

The structured data for the Home Activities rich results allowed providers of online livestreams, pre-recorded events and online events to be findable in Google Search.

The original documentation has been completely removed from the Google Search Central webpages and now redirects to a changelog notation that explains that the Home Activities rich results are no longer available for display.

The original purpose was to allow people to discover things to do from home while in quarantine, particularly online classes and events. Google’s rich results surfaced details of how to watch, description of the activities and registration information.

Providers of online events were required to use Event or Video structured data. Publishers and businesses who have this kind of structured data should be aware that this kind of rich result is no longer surfaced. It’s not necessary to remove the structured data if doing so is a burden; publishing structured data that isn’t used for rich results won’t hurt anything.
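For readers who are unsure what this markup looks like, here is a minimal sketch that builds schema.org Event structured data for an online event as JSON-LD. The event name, date and URL are hypothetical, and the exact set of properties Google required for the retired feature is not listed in this article, so treat the fields as an illustration rather than the official requirement.

```python
import json


def online_event_jsonld(name: str, description: str,
                        start_iso: str, stream_url: str) -> dict:
    """Build a schema.org Event object describing an online-only event."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "description": description,
        "startDate": start_iso,
        "eventAttendanceMode": "https://schema.org/OnlineEventAttendanceMode",
        "location": {"@type": "VirtualLocation", "url": stream_url},
    }


# Hypothetical example values, for illustration only.
markup = online_event_jsonld(
    name="Beginner Yoga Livestream",
    description="A 45-minute yoga class streamed live.",
    start_iso="2020-09-01T18:00:00-05:00",
    stream_url="https://example.com/yoga-livestream",
)

# This JSON would normally sit inside a <script type="application/ld+json"> tag.
print(json.dumps(markup, indent=2))
```

Markup like this can stay on the page. It simply no longer triggers the Home Activities rich result.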

The changelog for Google’s official documentation explains:

“Removing home activity documentation
What: Removed documentation on home activity structured data.

Why: The home activity feature no longer appears in Google Search results.”

Read more about Google’s Home Activities rich results:

Google Announces Home Activities Rich Results

Read the Wayback Machine’s archive of Google’s original announcement from 2020:

Home activities

Featured Image by Shutterstock/Olga Strel



Google’s Gary Illyes: Lastmod Signal Is Binary





In a recent LinkedIn discussion, Gary Illyes, Analyst at Google, revealed that the search engine takes a binary approach when assessing a website’s lastmod signal from sitemaps.

The revelation came as Illyes encouraged website owners to upgrade to WordPress 6.5, which now natively supports the lastmod element in sitemaps.

When Mark Williams-Cook asked if Google has a “reputation system” to gauge how much to trust a site’s reported lastmod dates, Illyes stated, “It’s binary: we either trust it or we don’t.”

No Shades Of Gray For Lastmod

The lastmod tag indicates the date of the most recent significant update to a webpage, helping search engines prioritize crawling and indexing.

Illyes’ response suggests Google doesn’t factor in a website’s history or gradually build trust in the lastmod values being reported.

Google either accepts the lastmod dates provided in a site’s sitemap as accurate, or it disregards them.

This binary approach reinforces the need to implement the lastmod tag correctly and only specify dates when making meaningful changes.
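As a rough illustration of that advice, the sketch below only moves a page’s lastmod date forward when its text actually changes, then writes the date into a sitemap entry. The function names, URL and the “ignore whitespace-only edits” rule are assumptions made for this example; what counts as a meaningful change is up to each site, and this is not how WordPress 6.5 implements it.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def next_lastmod(old_body: str, new_body: str,
                 old_lastmod: date, today: date) -> date:
    """Return today's date only if the text changed beyond whitespace."""
    def normalize(text: str) -> str:
        return " ".join(text.split())

    changed = normalize(old_body) != normalize(new_body)
    return today if changed else old_lastmod


def build_sitemap(entries: dict) -> str:
    """Render {url: lastmod date} pairs as sitemap XML."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod uses the YYYY-MM-DD form of the W3C Datetime format.
        ET.SubElement(url, "lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page: one whitespace-only edit, then one real edit.
published = date(2024, 3, 1)
unchanged = next_lastmod("Hello world", "Hello   world\n", published, date(2024, 4, 2))
updated = next_lastmod("Hello world", "Hello world, updated", published, date(2024, 4, 2))

sitemap_xml = build_sitemap({"https://example.com/post": updated})
print(unchanged, updated)
print(sitemap_xml)
```

The point of the design is that cosmetic edits leave the date alone, so the lastmod values a site reports stay believable.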

Illyes commends the WordPress developer community for their work on version 6.5, which automatically populates the lastmod field without extra configuration.

Accurate Lastmod Essential For Crawl Prioritization

While convenient for WordPress users, the native lastmod support is only beneficial if Google trusts you’re using it correctly.

Inaccurate lastmod tags could lead to Google ignoring the signal when scheduling crawls.

Illyes’ confirmation of Google’s stance shows there’s no room for error when using this tag.

Why SEJ Cares

Understanding how Google acts on lastmod can help ensure Google displays new publish dates in search results when you update your content.

It’s an all-or-nothing situation – if the dates are deemed untrustworthy, the signal could be disregarded sitewide.

With the information revealed by Illyes, you can ensure your implementation follows best practices to the letter.

Featured Image: Danishch/Shutterstock



How to Persuade Your Boss to Send You to Ahrefs Evolve




There’s one thing standing between you and several days of SEO, socializing, and Singaporean sunshine: your boss (and their Q4 budget 😅).

But don’t worry—we’ve got your back. Here are 5 arguments (and an example message) you can use to persuade your boss to send you to Ahrefs Evolve.

About Ahrefs Evolve

  • 2 days in sunny Singapore (Oct 24–25)
  • 500 digital marketing enthusiasts
  • 18 top speakers from around the world

Learn more and buy tickets.

SEO is changing at a breakneck pace. Between AI Overviews, Google’s rolling update schedule, their huge API leak, and all the documents released during their antitrust trial, it’s hard to keep up. What works in SEO today?

You could watch a YouTube video or two, maybe even attend an hour-long webinar. Or, much more effective: you could spend two full days learning from a panel of 18 international SEO experts, discussing your takeaways live with other attendees.

Evolve speakers from around the world.

Our world-class speakers are tackling the hardest problems and best opportunities in SEO today. The talk agenda covers topics like:

  • Responding to AI Overviews: Amanda King will teach you how to respond to AI Overviews, Google Gemini, and other AI search functions.
  • Surviving (and thriving) Google’s algo updates: Lily Ray will talk through Google’s recent updates, and share data-driven recommendations for what’s working in search today.
  • Planning for the future of SEO: Bernard Huang will talk through the failures of AI content and the path to better results.

(And attendees will get video recordings of each session, so you can share the knowledge with your teammates too.)

View the full talk agenda here.

There’s no substitute for meeting with influencers, peers, and partners in real life. 

Conferences create serendipity: chance encounters and conversations that can have a huge positive impact on you and your business. By way of example, these are some of the real benefits that have come my way from attending conferences:

  • Conversations that lead to new customers for our business,
  • Invitations to speak at events,
  • New business partnerships and co-marketing opportunities, and
  • Meeting people that we went on to hire.

There’s a “halo” effect that lingers long after the event is over: the people you meet will remember you for longer, think more highly of you, and be more likely to help you out, should you ask.

(And let’s not forget: there’s a lot of information, particularly in SEO, that only gets shared in person.)

The “international” part of Evolve matters too. Evolve is a different crowd to your local run-of-the-mill conference. It’s a chance to meet with people from markets you wouldn’t normally meet—from Australia to Indonesia and beyond.

Evolve attendees by home country.

If you’re an Ahrefs customer (thank you!), you’ll learn tons of tips, tricks and workflow improvements from attending Evolve. You’ll have opportunities to:

  • Attend talks from the Ahrefs team, showcasing advanced features and strategies that you can use in your own business.
  • Pick our brains at the Ahrefs booth, where we’ll offer informal 1:1 coaching sessions and previews of up-coming releases (like our new content optimization tool 🤫).
  • Join dedicated Ahrefs training workshops, hosted by the Ahrefs team and Ahrefs power users (tickets for these workshops will be sold separately).

As a manager myself, there are two questions I need answered when approving expenses:

  • Is this a reasonable cost?
  • Will we see a return on this investment?

To answer those questions: early bird tickets for Evolve start at $570. For context, “super early bird” tickets for MozCon (another popular SEO conference) this year were almost twice as much: $999.

There’s a lot included in the ticket price too:

  • World-class international speakers,
  • 5-star hotel venue,
  • 5-star hotel food (two tea breaks with snacks & lunch),
  • Networking afterparty, and
  • Full talk recordings to later share with your team.

SEO is a crucial growth channel for most businesses. If you can improve your company’s SEO performance after attending Evolve (and we think you will), you’ll very easily see a positive return on the investment.

Traveling to tropical Singapore (and eating tons of satay) is great for you, but it’s also great for your team. Attending Evolve is a chance to break with routine, reignite your passion for marketing, and come back to your job reinvigorated.

This would be true for any international conference, but it goes double for Singapore. It’s a truly unique place: an ultra-safe, high-tech city that brings together dozens of different cultures.

Little India in Singapore

You’ll discover different beliefs, working practices, and ways of business—and if you’re anything like me, come back a richer, wiser person for the experience.

If you’re nervous about pitching your boss on attending Evolve, remember: the worst that can happen is a polite “not this time”, and you’ll find yourself in the same position you are now.

So here goes: take this message template, tweak it to your liking, and send it to your boss over email or Slack… and I’ll see you in Singapore 😉

Email template

Hi [your boss’ name],

Our SEO tool provider, Ahrefs, is holding an SEO and digital marketing conference in Singapore in October. I’d like to attend, and I think it’s in the company’s interest:

  • The talks will help us respond to all the changes happening in SEO today. I’m particularly interested in the talks about AI and recent Google updates. 
  • I can network with my peers. I can discover what’s working at other companies, and explore opportunities for partnerships and co-marketing.
  • I can learn how we can use Ahrefs better across the organization.
  • I’ll come back reinvigorated with new ideas and motivation, and I can share my top takeaways and talk recordings with my team after the event.

Early bird tickets are $570. Given how important SEO is to the growth of our business, I think we’ll easily see a return from the spend.

Can we set up time to chat in more detail? Thanks!
