

Google LIMoE – A Step Towards Goal Of A Single AI




Google announced a new technology called LIMoE that it says represents a step toward reaching Google’s goal of an AI architecture called Pathways.

Pathways is an AI architecture that is a single model that can learn to do multiple tasks that are currently accomplished by employing multiple algorithms.

LIMoE is an acronym that stands for Learning Multiple Modalities with One Sparse Mixture-of-Experts Model. It’s a model that processes vision and text together.

While there are other architectures that do similar things, the breakthrough is in how the new model accomplishes these tasks, using a neural network technique called a sparse model.

The sparse model approach was introduced in a 2017 research paper titled Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, which described the Mixture-of-Experts (MoE) layer.

In 2021, Google announced GLaM (Efficient Scaling of Language Models with Mixture-of-Experts), a MoE model that was trained on text only.

The difference with LIMoE is that it works on text and images simultaneously.


The sparse model differs from “dense” models in that, instead of devoting every part of the model to accomplishing a task, it assigns the task to various “experts” that each specialize in a part of the task.

This lowers the computational cost, making the model more efficient.

So, similar to how a brain sees a dog and knows it’s a dog, that it’s a pug, and that the pug has a silver fawn coat, this model can view an image and accomplish the task in a similar way, by assigning computational subtasks to different experts that specialize in recognizing a dog, its breed, its color, and so on.

The LIMoE model routes the problems to the “experts” specializing in a particular task, achieving similar or better results than current approaches to solving problems.
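The routing idea described above can be sketched in a few lines of code. The following is a toy illustration of top-k sparse Mixture-of-Experts routing, not the actual LIMoE implementation: the “experts” are plain linear maps and the gating network is a single matrix, all invented for the example.

```python
import numpy as np

def sparse_moe_layer(tokens, expert_weights, gate_weights, k=1):
    """Route each token to its top-k experts and combine their outputs.

    Toy sketch only: in a real MoE layer the experts are feed-forward
    networks and routing is batched, but the control flow is the same.
    """
    outputs = np.zeros_like(tokens)
    # Gating network scores every expert for every token.
    logits = tokens @ gate_weights                       # (n_tokens, n_experts)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    for i, token in enumerate(tokens):
        top_k = np.argsort(probs[i])[-k:]                # best-scoring experts
        for e in top_k:
            # Only the selected experts run, which is what saves compute.
            outputs[i] += probs[i, e] * (token @ expert_weights[e])
    return outputs

rng = np.random.default_rng(0)
d, n_experts, n_tokens = 8, 4, 3
tokens = rng.normal(size=(n_tokens, d))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate = rng.normal(size=(d, n_experts))
out = sparse_moe_layer(tokens, experts, gate, k=1)
print(out.shape)  # (3, 8)
```

With k=1, each token activates only one of the four experts per forward pass, which is the source of the efficiency gain over a dense model that would run every parameter for every token.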

An interesting feature of the model is how some of the experts specialize mostly in processing images, others specialize mostly in processing text and some experts specialize in doing both.

Google’s description of how LIMoE works shows how there’s an expert on eyes, another for wheels, an expert for striped textures, solid textures, words, door handles, food & fruits, sea & sky, and an expert for plant images.

The announcement about the new algorithm describes these experts:

“There are also some clear qualitative patterns among the image experts — e.g., in most LIMoE models, there is an expert that processes all image patches that contain text. …one expert processes fauna and greenery, and another processes human hands.”

Experts that specialize in different parts of the problems provide the ability to scale and to accurately accomplish many different tasks but at a lower computational cost.


The researchers summarize their findings:

  • “We propose LIMoE, the first large-scale multimodal mixture of experts models.
  • We demonstrate in detail how prior approaches to regularising mixture of experts models fall short for multimodal learning, and propose a new entropy-based regularisation scheme to stabilise training.
  • We show that LIMoE generalises across architecture scales, with relative improvements in zero-shot ImageNet accuracy ranging from 7% to 13% over equivalent dense models.
  • Scaled further, LIMoE-H/14 achieves 84.1% zero-shot ImageNet accuracy, comparable to SOTA contrastive models with per-modality backbones and pre-training.”
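The entropy-based regularisation mentioned in the second bullet can be illustrated with a toy computation. The sketch below is an assumption for illustration, not the paper’s actual loss: it just computes the two quantities such a regulariser typically balances, the per-token routing entropy (low when each token commits to few experts) and the entropy of the average routing distribution (high when experts are used evenly).

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-9):
    """Shannon entropy along an axis; eps guards against log(0)."""
    return -(p * np.log(p + eps)).sum(axis=axis)

def routing_entropy_terms(gate_probs):
    """Two entropy terms over a batch of routing distributions.

    gate_probs has shape (n_tokens, n_experts), each row summing to 1.
    """
    # Low mean per-token entropy -> each token commits to few experts.
    local = entropy(gate_probs, axis=1).mean()
    # High entropy of the averaged distribution -> balanced expert usage.
    global_ = entropy(gate_probs.mean(axis=0))
    return local, global_

probs = np.array([[0.90, 0.05, 0.05],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.05, 0.90]])
local, global_ = routing_entropy_terms(probs)
print(local, global_)
```

Here every token routes confidently (low per-token entropy) while the averaged distribution is uniform (maximal entropy, ln 3), which is roughly the regime an entropy-based regulariser pushes the router toward to keep training stable.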

Matches State of the Art

There are many research papers published every month. But only a few are highlighted by Google.

Typically, Google spotlights research because it accomplishes something new in addition to attaining state-of-the-art results.

LIMoE attains results comparable to today’s best algorithms but does so more efficiently.

The researchers highlight this advantage:

“On zero-shot image classification, LIMoE outperforms both comparable dense multimodal models and two-tower approaches.

The largest LIMoE achieves 84.1% zero-shot ImageNet accuracy, comparable to more expensive state-of-the-art models.

Sparsity enables LIMoE to scale up gracefully and learn to handle very different inputs, addressing the tension between being a jack-of-all-trades generalist and a master-of-one specialist.”

The successful outcomes of LIMoE led the researchers to observe that LIMoE could be a way forward for achieving a multimodal generalist model.

The researchers observed:


“We believe the ability to build a generalist model with specialist components, which can decide how different modalities or tasks should interact, will be key to creating truly multimodal multitask models which excel at everything they do.

LIMoE is a promising first step in that direction.”

Potential Shortcomings, Biases & Other Ethical Problems

There are shortcomings to this architecture that are not discussed in Google’s announcement but are mentioned in the research paper itself.

The research paper notes that, similar to other large-scale models, LIMoE may also introduce biases into the results.

The researchers state that they have not yet “explicitly” addressed the problems inherent in large scale models.

They write:

“The potential harms of large scale models…, contrastive models… and web-scale multimodal data… also carry over here, as LIMoE does not explicitly address them.”

The above statement references (in a footnote) a 2021 research paper titled On the Opportunities and Risks of Foundation Models.

That research paper from 2021 warns how emergent AI technologies can cause negative societal impact such as:

“…inequity, misuse, economic and environmental impact, legal and ethical considerations.”

According to the cited paper, ethical problems can also arise from the tendency toward homogenization of tasks, which can introduce a single point of failure that is then reproduced in the downstream tasks built on top of it.


The cautionary research paper states:

“The significance of foundation models can be summarized with two words: emergence and homogenization.

Emergence means that the behavior of a system is implicitly induced rather than explicitly constructed; it is both the source of scientific excitement and anxiety about unanticipated consequences.

Homogenization indicates the consolidation of methodologies for building machine learning systems across a wide range of applications; it provides strong leverage towards many tasks but also creates single points of failure.”

One area of caution is in vision related AI.

The 2021 paper states that, because cameras are ubiquitous, any advance in vision-related AI carries a concomitant risk of the technology being applied in unanticipated ways, with potentially “disruptive impact,” including on privacy and surveillance.

Another cautionary warning about advances in vision-related AI concerns problems with accuracy and bias.

They note:

“There is a well-documented history of learned bias in computer vision models, resulting in lower accuracies and correlated errors for underrepresented groups, with consequently inappropriate and premature deployment to some real-world settings.”

The rest of the paper documents how AI technologies can learn existing biases and perpetuate inequities.


“Foundation models have the potential to yield inequitable outcomes: the treatment of people that is unjust, especially due to unequal distribution along lines that compound historical discrimination…. Like any AI system, foundation models can compound existing inequities by producing unfair outcomes, entrenching systems of power, and disproportionately distributing negative consequences of technology to those already marginalized…”

The LIMoE researchers noted that this particular model may be able to work around some of the biases against underrepresented groups because of the nature of how the experts specialize in certain things.

These kinds of negative outcomes are not theories; they are realities that have already harmed lives in real-world applications, such as racial biases introduced by employment recruitment algorithms.

The authors of the LIMoE paper acknowledge those potential shortcomings in a short paragraph that serves as a cautionary caveat.

But they also note that there may be a potential to address some of the biases with this new approach.

They wrote:

“…the ability to scale models with experts that can specialize deeply may result in better performance on underrepresented groups.”

Lastly, a key attribute of this new technology that should be noted is that there is no explicit use stated for it.

It’s simply a technology that can process images and text in an efficient manner.

How it can be applied, if it ever is applied in this form or a future form, is never addressed.


And that’s an important factor raised by the cautionary paper (On the Opportunities and Risks of Foundation Models): researchers create capabilities for AI without considering how they might be used or the impact they may have on issues like privacy and security.

“Foundation models are intermediary assets with no specified purpose before they are adapted; understanding their harms requires reasoning about both their properties and the role they play in building task-specific models.”

All of those caveats are left out of Google’s announcement article but are referenced in the PDF version of the research paper itself.

Pathways AI Architecture & LIMoE

Text, images, and audio data are referred to as modalities, that is, different kinds of data or task specializations. Modalities can also include spoken language and symbols.

So when you see the phrase “multimodal” or “modalities” in scientific articles and research papers, what they’re generally talking about is different kinds of data.

Google’s ultimate goal for AI is what it calls the Pathways Next-Generation AI Architecture.

Pathways represents a move away from machine learning models that do one thing really well (thus requiring thousands of them) to a single model that does everything really well.

Pathways (and LIMoE) is a multimodal approach to solving problems.

It’s described like this:


“People rely on multiple senses to perceive the world. That’s very different from how contemporary AI systems digest information.

Most of today’s models process just one modality of information at a time. They can take in text, or images or speech — but typically not all three at once.

Pathways could enable multimodal models that encompass vision, auditory, and language understanding simultaneously.”

What makes LIMoE important is that it is a multimodal architecture that is referred to by the researchers as an “…important step towards the Pathways vision…”

The researchers describe LIMoE as a “step” because there is more work to be done, including exploring how this approach can work with modalities beyond just images and text.

This research paper and the accompanying summary article show what direction Google’s AI research is taking and how it is getting there.


Read Google’s Summary Article About LIMoE

LIMoE: Learning Multiple Modalities with One Sparse Mixture-of-Experts Model

Download and Read the LIMoE Research Paper

Multimodal Contrastive Learning with LIMoE: the Language-Image Mixture of Experts (PDF)






8 Pillar Page Examples to Get Inspired By




Pillar pages are high-level introductions to a topic. They then link to other pages, which are usually more detailed guides about parts of the main topic.

Altogether, they form a content hub.

Example of a content hub

But not all pillar pages look the same. 

In this guide, we’ll look at eight examples of pillar pages to get your creative juices flowing.

Excerpt of beginner's guide to SEO by Ahrefs

Key stats

Estimated organic traffic: 1,200
Backlinks: 6,900
Referring domains: 899

Overview of Ahrefs' beginner's guide to SEO in Ahrefs' Site Explorer

This is our very own pillar page, covering the broad topic of search engine optimization (SEO).

Why I like it

Besides the fact that I’m biased, I like the custom design we created for this page, which makes it different from the articles on our blog. 

Even though the design is custom, our pillar page is still a pretty classic “hub and spoke” style pillar page. We’ve broken the topic down neatly into six different chapters and internally linked to guides we’ve created about them. There are also custom animations when you hover over each chapter:

Examples of chapters in the SEO guide

We’ve also added a glossary section that comes with a custom illustration of the SERPs. We have explanations of what each element means, with internal links to more detailed content:

Custom illustration of the SERP

Finally, it links to another “pillar page”: our SEO glossary.


Consider creating a custom design for your pillar page so that it stands out. 

Excerpt of Doctor Diet's ketogenic diet guide

Key stats

Estimated organic traffic: 92,200
Backlinks: 21,600
Referring domains: 1,700

Overview of Diet Doctor's ketogenic diet guide in Ahrefs' Site Explorer

Diet Doctor is a health company focusing on low-carb diets. Its pillar page is a comprehensive guide on the keto diet. 

Why I like it

On the surface, it doesn’t exactly look like a pillar page; it looks like every other post on the Diet Doctor site. But that’s perfectly fine. It’s simply a different approach—you don’t have to call out the fact that it’s a pillar page. 


Diet Doctor’s guide is split into 10 different sections with links to its own resources. The links bring you to different types of content (not just blog posts but videos too).

Video course about keto diet for beginners

Unlike the classic pillar page, Diet Doctor’s guide goes into enough detail for anyone who is casually researching the keto diet. But it also links to further resources for anyone who’s interested in doing additional research.


Pillar pages need not always just be text and links. Make it multimedia: You can add videos and images and even link to your own multimedia resources (e.g., a video course).

Excerpt of Wine Folly's beginner's guide to wine

Key stats

Estimated organic traffic: 5,600
Backlinks: 2,800
Referring domains: 247

Overview of Wine Folly's beginner's guide to wine in Ahrefs' Site Explorer

Wine Folly is a content site devoted to wine knowledge and appreciation. Its pillar page, as expected, is about wine. 

Why I like it

Wine Folly’s pillar page is a classic example of a “hub and spoke” style pillar page—split into multiple sections, with some supporting text, and then internal links to other resources that support each subsection. 

Supporting text and links to other resources

This page doesn’t just serve as a pillar page for ranking purposes, though. Given that it ranks well and receives quite a significant amount of search traffic, the page also has a call to action (CTA) to Wine Folly’s book:

Short description of book; below that, CTA encouraging site visitor to purchase it


While most websites design pillar pages for ranking, you can also use them for other purposes: capture email addresses, sell a book, pitch your product, etc. 

Excerpt of A-Z directory of yoga poses

Key stats

Estimated organic traffic: 11,100
Backlinks: 3,400
Referring domains: 457

Overview of Yoga Journal's A-Z directory of yoga poses in Ahrefs' Site Explorer

Yoga Journal is an online and offline magazine. Its pillar page is an A-Z directory of yoga poses.

Why I like it

Yoga Journal’s pillar page is straightforward and simple: it lists all possible yoga poses (in both their English and Sanskrit names) in table form and links to them.

List of yoga poses in table form

Since it’s listed in alphabetical order, it’s useful for anyone who knows the name of a particular pose and is interested in learning more. 

What I also like is that Yoga Journal has added an extra column on the type of pose each yoga pose belongs to. If we click on any of the pose types, we’re directed to a category page where you can find similar kinds of poses: 

Examples of standing yoga poses (in grid format)


The A-Z format can be a good format for your pillar page if the broad topic you’re targeting fits the style (e.g., dance moves, freestyle football tricks, etc.).

Excerpt of Atlassian's guide to agile development

Key stats

Estimated organic traffic: 115,200
Backlinks: 3,200
Referring domains: 860

Overview of Atlassian's guide to agile development in Ahrefs' Site Explorer

Atlassian is a software company. You’ve probably heard of its products: Jira, Confluence, Trello, etc. Its pillar page is on agile development.

Why I like it

Atlassian’s pillar page is split into different topics related to agile development. It then has internal links to each topic—both as a sticky table of contents and card-style widgets after the introduction: 

Sticky table of contents
Card-style widgets

I also like the “Up next” feature at the bottom of the pillar page, which makes it seem like an online book rather than a page. 

Example of "Up next" feature


Consider adding a table of contents to your pillar page. 

Excerpt of Muscle and Strength's workout routines database

Key stats

Estimated organic traffic: 114,400
Backlinks: 2,900
Referring domains: 592

Overview of Muscle and Strength's workout routines database in Ahrefs' Site Explorer

Muscle and Strength’s pillar page is a massive database linking to various categories of workouts. 

Why I like it

Calling it a pillar page seems to be an understatement. Muscle and Strength’s free workouts page appears to be more like a website. 

When you open the page, you’ll see that it’s neatly split into multiple categories, such as “workouts for men,” “workouts for women,” “biceps,” “abs,” etc. 

Workout categories (in grid format)

Clicking through to any of them leads us to a category page containing all sorts of workouts:

Types of workouts for men (in grid format)

Compared to the other pillar pages on this list, which link directly to subpages, Muscle and Strength’s pillar page links to category pages, which then link to their own subpages, i.e., its massive archive of free workouts.


Content databases, such as the one above, are a huge undertaking for a pillar page but can be worth it if the broad topic you’re targeting fits a format like this. Ideally, the topic should be about something where the content for it is ever-growing (e.g., workout plans, recipes, email templates, etc.).

Excerpt of Tofugu's guide to learning Japanese

Key stats

Estimated organic traffic: 39,100
Backlinks: 1,100
Referring domains: 308

Overview of Tofugu's guide to learning Japanese in Ahrefs' Site Explorer

Tofugu is a site about learning Japanese. And its pillar page is about, well, learning Japanese.

Why I like it

This is an incredible (and yes, ridiculously good) guide to learning Japanese from scratch. It covers every stage you’ll go through as a complete beginner—from knowing no Japanese to having intermediate proficiency in the language. 

Unlike other pillar pages, where information is usually scarce and the page simply links out to further resources, this page holds nothing back. Under each section, there is great detail about what that section is, why it’s important, how it works, and even an estimate of how long that stage takes to complete.

Another interesting aspect is how Tofugu has structured its internal links as active CTAs. Rather than “Learn more” or “Read more,” it’s all about encouraging users to do a task and complete that stage.

CTA encouraging user to head to the next task of learning to read hiragana


Two takeaways here:

  • Pillar pages can be ridiculously comprehensive. It depends on the topic you’re targeting and how competitive it is.
  • CTAs can be more exciting than merely “Read more.”
Excerpt of Zapier's guide to working remotely

Key stats

Estimated organic traffic: 890
Backlinks: 4,100
Referring domains: 1,100

Overview of Zapier's guide to working remotely in Ahrefs' Site Explorer

Zapier allows users to connect multiple software products together via “zaps.” It’s a 100% remote company, and its pillar page is about remote work. 

Why I like it

Zapier’s pillar page is basically like Wine Folly’s: break a topic into subsections, add a couple of lines of text, and then add internal links to further resources.

In the examples above, we’ve seen all sorts of execution for pillar pages. There are those with custom designs and others that are crazily comprehensive.

But sometimes, all a pillar page needs is a simple design with links. 


If you already have a bunch of existing content on your website, you can create a simple pillar page like this to organize your content for your readers. 


Keep learning

Inspired by these examples and want to create your own pillar page? Learn how to successfully do so with these two guides:

Any questions or comments? Let me know on Twitter.  

