W3C Validation: What It Is & Why It Matters For SEO

You may have run across the W3C in your web development and SEO travels.

The W3C is the World Wide Web Consortium, and it was founded by the creator of the World Wide Web, Tim Berners-Lee.

This web standards body creates coding specifications for web standards worldwide.

It also offers a validator service to ensure that your HTML (among other code) is valid and error-free.

Making sure that your page validates is one of the most important things you can do to achieve cross-browser and cross-platform compatibility and provide an accessible online experience to all.

Invalid code can result in glitches, rendering errors, and long processing or loading times.

Simply put, if your code doesn’t do what it was intended to do across all major web browsers, this can negatively impact user experience and SEO.

W3C Validation: How It Works & Supports SEO

Web standards are important because they give web developers a standard set of rules for writing code.

If all code used by your company is created using the same protocols, it will be much easier for you to maintain and update this code in the future.

This is especially important when working with other people’s code.

If your pages adhere to web standards, they will validate correctly against W3C validation tools.

When you use web standards as the basis for your code creation, you ensure that your code is user-friendly with built-in accessibility.

When it comes to SEO, validated code is always better than poorly written code.

According to John Mueller, Google doesn’t care how your code is written. That means a W3C validation error won’t cause your rankings to drop.

You won’t rank better with validated code, either.

But there are indirect SEO benefits to well-formatted markup:

  • Eliminates Code Bloat: Validated code tends to avoid code bloat; it is generally leaner and more compact than unvalidated code.
  • Faster Rendering Times: Leaner markup can translate into better render times because the browser has less processing to do, and we know that page speed is a ranking factor.
  • Indirect Contributions to Core Web Vitals Scores: When you pay attention to coding standards, such as adding width and height attributes to your images (see the snippet below), you eliminate steps the browser must take to render the page. Faster rendering can contribute to better Core Web Vitals scores.
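To illustrate that last point, here is a minimal sketch of an image tag with explicit dimensions; the file name, alt text, and sizes are made up for illustration:

    <!-- Declaring width and height lets the browser reserve space before
         the image downloads, which reduces layout shift during rendering -->
    <img src="/images/team-photo.jpg"
         alt="Support team at the company retreat"
         width="800"
         height="600">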

Roger Montti compiled these six reasons Google still recommends code validation:

  1. Invalid code could affect crawl rate.
  2. Valid code improves browser compatibility.
  3. It encourages a good user experience.
  4. It helps ensure that pages function everywhere.
  5. Valid markup is useful for Google Shopping Ads.
  6. Invalid HTML in the head section breaks hreflang (see the head snippet below).
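On that last point, hreflang annotations live in the document head, and an invalid head can stop them from being read. A minimal sketch of a clean set of hreflang links might look like this (the URLs and title are placeholders, not from the original article):

    <head>
      <meta charset="utf-8">
      <title>Example product page</title>
      <!-- If the head fails to parse cleanly, browsers and crawlers may close it
           early and never see these alternate-language annotations -->
      <link rel="alternate" hreflang="en-us" href="https://www.example.com/en-us/widgets/">
      <link rel="alternate" hreflang="de-de" href="https://www.example.com/de-de/widgets/">
      <link rel="alternate" hreflang="x-default" href="https://www.example.com/widgets/">
    </head>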

Multiple Device Accessibility

Valid code also translates into better cross-browser and cross-platform compatibility because it conforms to the latest W3C standards, so the browser knows how to process it.

This leads to an improved user experience for people who access your sites from different devices.

If you have a site that’s been validated, it will render correctly regardless of the device or platform being used to view it.

That is not to say unvalidated code never works across multiple browsers and platforms, but there can be deviations in rendering across various applications.

Common Reasons Code Doesn’t Validate

Of course, validating your web pages won’t solve all problems with rendering your site as desired across all platforms and all browsing options. But it does go a long way toward solving those problems.

In the event that something does go wrong with validation on your part, you now have a baseline from which to begin troubleshooting.

You can go into your code and see what is making it fail.

It will be easier to find these problems and troubleshoot them with a validated site because you know where to start looking.

Having said that, there are several reasons pages may not validate.

Browser Specific Issues

It may be that something in your code will only work on one browser or platform, but not another.

This problem would then need to be addressed by the developer of the offending script.

This would mean having to actually edit the code itself in order for it to validate on all platforms/browsers instead of just some of them.

You Are Using Outdated Code

The W3C’s validation tests reflect standards that have been introduced over the past couple of decades.

If your page was created to validate in a browser that predates this time (IE 6 or earlier, for example), it will not pass these new standards because it was written with older technologies and formats in mind.

While this is a relatively rare issue, it still happens.

This problem can be fixed by reworking the code to make it W3C compliant. But if you want to maintain compatibility with older browsers, you may need to keep the code that works and forgo passing 100% of validation.

Both problems could potentially be solved with a little trial and error.

With some work and effort, both types of sites can validate across multiple devices and platforms without issue – hopefully!

Polyglot Documents

A polyglot document, in this context, is one that was carried over from an older version of the code and never reworked to be compatible with the new version.

In other words, it mixes code written for a different document type than the one the current document declares (say, HTML 4.01 Transitional markup inside an XHTML document).

Make no mistake: Even though both may be “HTML” per se, they are very different languages and need to be treated as such.

You can’t copy and paste one over and expect things to be all fine and dandy.

What does this mean?

For example, you may have seen situations where you validate code and nearly every single line of the document has something wrong with it according to the W3C validator.

This could be due to somebody transferring over code from another version of the site, and not updating it to reflect new coding standards.

Either way, the only way to repair this is to rework the code line by line (an extraordinarily tedious process).

How W3C Validation Works

The W3C validator is this author’s validator of choice for making sure that your code validates across a wide variety of platforms and systems.

The W3C validator is free to use, and you can access it at validator.w3.org.

With the W3C validator, it’s possible to validate your pages by page URL, file upload, or direct input.

  • Validate Your Pages by URL: This is relatively simple. Paste the URL into the Address field and click the Check button to validate your code.
  • Validate Your Pages by File Upload: When you validate by file upload, you upload the HTML files of your choice one file at a time. Caution: if you’re using Internet Explorer or certain versions of Windows XP, this option may not work for you.
  • Validate Your Pages by Direct Input: Copy and paste the code you want to validate into the editor (a minimal example follows this list), and the W3C validator will do the rest.
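For a quick test of the direct input option, a minimal HTML5 document along these lines should come back clean from the validator (the title and text are placeholders):

    <!DOCTYPE html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <title>Validation test page</title>
      </head>
      <body>
        <h1>Hello, validator</h1>
        <p>This document should pass the W3C checker without errors.</p>
      </body>
    </html>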

While some professionals claim that some W3C errors have no rhyme or reason, in 99.9% of cases, there is a rhyme and reason.

If there isn’t a rhyme and reason throughout the entire document, then you may want to refer to the section on polyglot documents above as a potential cause.

HTML Syntax

Let’s start at the top with HTML syntax. Because it’s the backbone of the World Wide Web, this is the most common coding that you will run into as an SEO professional.

The W3C has created a specification for HTML5, called the HTML5 standard.

This document explains how HTML should be written on an ideal level for processing by popular browsers.

If you go to their site, you can utilize their validator to make sure that your code is valid according to this spec.

They even give examples of some of the rules that they look for when it comes to standards compliance.

This makes it easier than ever to check your work before you publish it!

Validators For Other Languages

Now let’s move on to some of the other languages that you may be using online.

For example, you may have heard of CSS3.

The W3C maintains standards documentation for CSS3 as well, often referred to as the CSS3 standard.

This means that there is even more opportunity for validation!

You can validate your HTML against the HTML standard and then validate your CSS against the CSS standard to ensure conformity across platforms.

While it may seem like overkill to validate your code against so many different standards at once, each check is another chance to catch issues before they affect users.

And for those of you who only work in one language, you now have the opportunity to expand your horizons!

It can be incredibly difficult if not impossible to align everything perfectly, so you will need to pick your battles.

You may also just need something checked quickly online without having the time or resources available locally.

Common Validation Errors

You will need to be aware of the most common validation errors as you go through the validation process, and it’s also a good idea to know what those errors mean.

This way, if your page does not validate, you will know exactly where to start looking for possible problems.

Some of the most common validation errors (and their meanings) include:

  • Type Mismatch: When your code is trying to make one kind of data object appear like another data object (e.g., submitting a number as text), you run the risk of getting this message. This error usually signals that some kind of coding mistake has been made. The solution would be to figure out exactly where that mistake was made and fix it so that the code validates successfully.
  • Parse Error: This error tells you that there was a mistake in the coding somewhere, but it does not tell you where that mistake is. If this happens, you will have to do some serious sleuthing in order to find where your code went wrong.
  • Syntax Errors: These types of errors involve (mostly) careless mistakes in coding syntax. Either the syntax is typed incorrectly, or its context is incorrect. Either way, these errors will show up in the W3C validator.

The above are just some examples of errors that you may see when you’re validating your page.

Unfortunately, the list goes on and on – as does the time spent trying to fix these problems!
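To make these error types a little more concrete, here is a deliberately broken fragment; each line shows the kind of thing the W3C checker typically flags (the file names and values are invented for illustration):

    <body>
      <img src="logo.png">                               <!-- missing alt attribute -->
      <span width="300">width is not allowed on span</span>
      <a href="/home" href="/start">duplicate attribute</a>
      <li>list item outside of any ul or ol</li>
    </body>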

More Specific Errors (And Their Solutions)

You may find more specific errors that apply to your site. They may include errors that reference “type attribute used in tag.”

This refers to some tags like JavaScript declaration tags, such as the following: <script type="text/javascript">.

The type attribute of this tag is not needed anymore and is now considered legacy coding.

If you use that kind of coding now, you may end up unintentionally throwing validation errors all over the place in certain validators.
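As a rough before-and-after sketch (the file name is hypothetical), the legacy and modern forms look like this:

    <!-- Legacy form: the type attribute is unnecessary for JavaScript in HTML5
         and validators may flag it -->
    <script type="text/javascript" src="main.js"></script>

    <!-- Modern form: simply omit the type attribute -->
    <script src="main.js"></script>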

Did you know that not using alternative text (alt text) – also called alt tags by some – is a W3C issue? It does not conform to the W3C rules for accessibility.

Alternative text is the descriptive text coded into an image’s markup.

It is primarily used by screen readers for the blind.

If a blind person visits your site, and you do not have alternative text (or meaningful alternative text) in your images, then they will be unable to use your site effectively.

The way these screen readers work is that they speak aloud the words that are coded into images, so the blind can use their sense of hearing to understand what’s on your web page.
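For example (the file names and descriptions below are made up), meaningful alt text and an intentionally empty alt for decorative images might look like this:

    <!-- Descriptive alt text is read aloud by screen readers -->
    <img src="/images/golden-retriever-puppy.jpg"
         alt="Golden retriever puppy playing with a red ball in the grass">

    <!-- Purely decorative image: an empty alt tells screen readers to skip it -->
    <img src="/images/divider-flourish.png" alt="">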

If your page is not very accessible in this regard, this could potentially lead to another sticky issue: that of accessibility lawsuits.

This is why it pays to pay attention to your accessibility standards and validate your code against these standards.

Other types of common errors include using tags out of context.

For code context errors, you will need to make sure they are repaired according to the W3C documentation so these errors are no longer thrown by the validator.

Preventing Errors From Impacting Your Site Experience

The best way to prevent validation errors from happening is by making sure your site validates before launch.

It’s also useful to validate your pages regularly after they’re launched so that new errors do not crop up unexpectedly over time.

If you think about it, validation errors are the equivalent of spelling mistakes in an essay – once they’re there, they’re difficult (if not impossible) to erase, and they need to be fixed as soon as humanly possible.

If you adopt the habit of always using the W3C validator in order to validate your code, then you can, in essence, stop these coding mistakes from ever happening in the first place.

Heads Up: There Is More Than One Way To Do It

Sometimes validation won’t go as planned according to all standards.

And there is more than one way to accomplish the same goal.

For example, the W3C standards do not allow you to use <button> to create a button and then nest an <a> element with an href inside it.

But the same result is perfectly achievable with JavaScript, because there are ways to make a button behave like a link within the language itself.

This is an example of how we create this particular code and insert it into the direct input of the W3C validator:
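A rough reconstruction of the kind of markup being described (the link target and label are made up) might look like this:

    <!-- Nesting an <a> element inside a <button> is not allowed by the HTML spec,
         so the checker reports several errors for a line like this -->
    <button><a href="https://www.example.com/signup">Sign up</a></button>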

Screenshot from W3C validator, February 2022

In the next step, during validation, we find at least four errors within this code alone, indicating that it is not a particularly well-coded line:

Screenshot showing errors in the W3C validator tool, February 2022

While validation, on the whole, can help you immensely, it is not always going to be 100% complete.

This is why it’s important to familiarize yourself with the validator by using it as you code, as much as you can.

Some adaptation will be needed. But it takes experience to achieve the best possible cross-platform compatibility while also remaining compliant with today’s browsers.

The ultimate goal here is improving accessibility and achieving compatibility with all browsers, operating systems, and devices.

Not all browsers and devices are created equal, but validation gives you a cohesive set of instructions and standards that can make your page render consistently enough for all of them.

When in doubt, always err on the side of proper code validation.

By making sure that you work to include the absolute best practices in your coding, you can ensure that your code is as accessible as it possibly can be for all types of users.

On top of that, validating your HTML against W3C standards helps you achieve cross-platform compatibility between different browsers and devices.

By working to always ensure that your code validates, you are on your way to making sure that your site is as safe, accessible, and efficient as possible.


Technical SEO Checklist for 2024: A Comprehensive Guide


With Google getting a whopping total of six algorithmic updates and four core updates in 2023, you can bet the search landscape is more complicated (and competitive) to navigate nowadays.

To succeed in SEO this year, you will need to figure out what items to check and optimize to ensure your website stays visible. And if your goal is to not just make your website searchable, but have it rank at the top of search engine results, this technical SEO checklist for 2024 is essential.

Webmaster’s Note: This is part one of our three-part SEO checklist for 2024. I also have a longer guide on advanced technical SEO, which covers best practices and how to troubleshoot and solve common technical issues with your websites.

Technical SEO Essentials for 2024

Technical SEO refers to optimizations that are primarily focused on helping search engines access, crawl, interpret, and index your website without any issues. It lays the foundation for your site to be properly understood and served up by search engines to users.

1. Website Speed Optimization

A site’s loading speed is a significant ranking factor for search engines like Google, which prioritize user experience. Faster websites generally provide a more pleasant user experience, leading to increased engagement and improved conversion rates.

Server Optimization

Often, the reason why your website is loading slowly is because of the server it’s hosted on. It’s important to choose a high-quality server that ensures quick loading times from the get-go so you skip the headache that is server optimization.

Google recommends keeping your server response time under 200ms. To check your server’s response time, you need to know your website’s IP address. Once you have that, use your command prompt.

In the window that appears, type ping, followed by your website’s IP address. Press enter and the window should show how long it took your server to respond. 

If you find that your server goes above the recommended 200ms loading time, here’s what you need to check:

  1. Collect the data from your server and identify what is causing your response time to increase. 
  2. Based on what is causing the problem, you will need to implement server-side optimizations. This guide on how to reduce initial server response times can help you here.
  3. Measure your server response times after optimization to use as a benchmark. 
  4. Monitor any regressions after optimization.

If you work with a hosting service, then you should contact them when you need to improve server response times. A good hosting provider should have the right infrastructure, network connections, server hardware, and support services to accommodate these optimizations. They may also offer hosting options if your website needs more server resources to run smoothly.

Website Optimization

Aside from your server, there are a few other reasons that your website might be loading slowly. 

Here are some practices you can do:

  1. Compressing images to decrease file sizes without sacrificing quality.
  2. Minifying your code by eliminating unnecessary spaces, comments, and indentation.
  3. Using caching to store some data locally in a user’s browser to allow for quicker loading on subsequent visits.
  4. Implementing Content Delivery Networks (CDNs) to distribute the load, speeding up access for users situated far from the server.
  5. Lazy loading your web pages so the browser prioritizes only the resources your users need first (see the snippet after this list).
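As a small sketch of that last point (the file name, dimensions, and alt text are placeholders), native lazy loading can be added directly in the markup:

    <!-- Below-the-fold image: loading="lazy" defers the download until the user
         scrolls near it, and width/height reserve the space to avoid layout shift -->
    <img src="/images/case-study-chart.webp"
         alt="Chart showing traffic growth after site speed fixes"
         width="1200"
         height="675"
         loading="lazy">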

Common tools to evaluate your website speed are Google’s PageSpeed Insights and Google Lighthouse. Both tools can analyze the content of your website and then generate suggestions to improve its overall loading speed, all for free. There are also some third-party tools, like GTMetrix, that you could use as well.

Here’s an example of one of our website’s speeds before optimization. It’s one of the worst I’ve seen, and it was affecting our SEO.

Slow site speed score from GTMetrix

So we followed our technical SEO checklist. After working on the images, removing render-blocking page elements, and minifying code, the score greatly improved — and we saw near-immediate improvements in our page rankings. 

Site speed optimization results from GTMetrix

That said, playing around with your server settings, coding, and other parts of your website’s backend can mess it up if you don’t know what you’re doing. I suggest backing up all your files and your database before you start working on your website speed for that reason. 

2. Mobile-First Indexing

Mobile-first Indexing is a method used by Google that primarily uses the mobile version of the content for indexing and ranking. 

It’s no secret that Google prioritizes the mobile user experience. Beyond that, optimizing your website for mobile just makes sense, given that a majority of people now use their phones to search online.

This means a fundamental shift in your approach to website development and design is needed, and these checks should also be part of your technical SEO checklist:

  1. Ensure the mobile version of your site contains the same high-quality, rich content as the desktop version.
  2. Make sure metadata is present on both versions of your site (a sample head appears after this list).
  3. Verify that structured data is present on both versions of your site.
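As a rough sketch of the metadata point, the head served to both the mobile and desktop versions of a page should carry the same essentials (all values below are placeholders):

    <head>
      <meta charset="utf-8">
      <meta name="viewport" content="width=device-width, initial-scale=1">
      <title>Technical SEO Checklist for 2024</title>
      <meta name="description" content="A step-by-step technical SEO checklist covering speed, indexing, and structured data.">
      <!-- The same canonical and description should appear on the mobile version -->
      <link rel="canonical" href="https://www.example.com/technical-seo-checklist/">
    </head>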

Tools like Google’s mobile-friendly test can help you measure how effectively your mobile site is performing compared to your desktop versions, and to other websites as well.

3. Crawlability & Indexing Check

Always remember that crawlability and indexing are the cornerstones of SEO. Crawlability refers to a search engine’s ability to access and crawl through a website’s content. Indexing is how search engines organize information after a crawl and before presenting results.

  1. Utilizing a well-structured robots.txt file to communicate with web crawlers about which of your pages should not be processed or scanned.
  2. Using XML sitemaps to guide search engines through your site’s content and ensure that all valuable content is found and indexed. There are several CMS plugins you can use to generate your sitemap.
  3. Ensuring that your website has a logical structure with a clear hierarchy, so both users and bots can navigate to your most important pages easily.

Google Search Console is the tool you need to use to ensure your pages are crawled and indexed by Google. It also provides reports that identify any problems that prevent crawlers from indexing your pages. 

4. Structured Data Markup

Structured Data Markup is a coding language that communicates website information in a more organized and richer format to search engines. This plays a strategic role in the way search engines interpret and display your content, enabling enhanced search results through “rich snippets” such as stars for reviews, prices for products, or images for recipes.

Adding this markup allows search engines to understand your content better and display extra information directly in the search results.
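For illustration (the product details and values are invented), structured data is commonly added as a JSON-LD block inside the page’s HTML:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Acme Trail Running Shoes",
      "image": "https://www.example.com/images/acme-trail-shoes.jpg",
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "128"
      },
      "offers": {
        "@type": "Offer",
        "price": "89.99",
        "priceCurrency": "USD"
      }
    }
    </script>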

Key Takeaway

With all the algorithm changes made in 2023, websites need to stay adaptable and strategic to stay at the top of the search results page. Luckily for you, this technical SEO checklist for 2024 can help you do just that. Use this as a guide to site speed optimization, indexing, and ensuring the best experience for mobile and desktop users.


Why Google Seems To Favor Big Brands & Low-Quality Content


Many people are convinced that Google shows a preference for big brands and for ranking low-quality content, something that many feel has become progressively worse. This may not be just a matter of perception; something is going on, and nearly everyone has an anecdote of poor-quality search results. The possible reasons for it are actually quite surprising.

Google Has Shown Favoritism In The Past

This isn’t the first time that Google’s search engine results pages (SERPs) have shown a bias that favored big brand websites. During the early years of Google’s algorithm it was obvious that sites with a lot of PageRank ranked for virtually anything they wanted.

For example, I remember a web design company that built a lot of websites, creating a network of backlinks that raised their PageRank to a remarkable level normally seen only in big corporate sites like IBM. As a consequence, they ranked for the two-word keyword phrase "web design" and virtually every other variant, like "web design + [any state in the USA]".

Everyone knew that websites with a PageRank of 10, the highest level shown on Google’s toolbar, practically had a free pass in the SERPs, resulting in big brand sites outranking more relevant webpages. It didn’t go unnoticed when Google eventually adjusted their algorithm to fix this issue.

The point of this anecdote is to point out an instance of where Google’s algorithm unintentionally created a bias that favored big brands.

Here are other algorithm biases that publishers exploited:

  • Top 10 posts
  • Longtail “how-to” articles
  • Misspellings
  • Free Widgets in footer that contained links (always free to universities!)

Big Brands And Low Quality Content

There are two things that have been a constant for all of Google’s history:

  • Low quality content
  • Big brands crowding out small independent publishers

Anyone who’s ever searched for a recipe knows that the more general the recipe, the lower the quality of what gets ranked. Search for something like cream of chicken soup and the main ingredient for nearly every recipe is two cans of chicken soup.

A search for Authentic Mexican Tacos results in recipes with these ingredients:

  • Soy sauce
  • Ground beef
  • “Cooked chicken”
  • Taco shells (from the store!)
  • Beer

Not all recipe SERPs are bad. But some of the more general recipes Google ranks are so basic that a hobo can cook them on a hotplate.

Robin Donovan (Instagram), a cookbook author and online recipe blogger observed:

“I think the problem with google search rankings for recipes these days (post HCU) are much bigger than them being too simple.

The biggest problem is that you get a bunch of Reddit threads or sites with untested user-generated recipes, or scraper sites that are stealing recipes from hardworking bloggers.

In other words, content that is anything but “helpful” if what you want is a tested and well written recipe that you can use to make something delicious.”

Explanations For Why Google’s SERPs Are Broken

It’s hard to get away from the perception that Google’s rankings for a variety of topics always seem to default to big brand websites and low-quality webpages.

Small sites grow to become big brands that dominate the SERPs; it happens. But that’s the thing: even when a small site gets big, it’s now just another big brand dominating the SERPs.

Typical explanations for poor SERPs:

  • It’s a conspiracy to increase ad clicks
  • Content itself these days is low quality across the board
  • Google doesn’t have anything else to rank
  • It’s the fault of SEOs
  • Affiliates
  • Poor SERPs are Google’s scheme to drive more ad clicks
  • Google promotes big brands because [insert your conspiracy]

So what’s going on?

People Love Big Brands & Garbage Content

The recent Google anti-trust lawsuit exposed the importance of the Navboost algorithm signals as a major ranking factor. Navboost is an algorithm that interprets user engagement signals to understand what topics a webpage is relevant for, among other things.

The idea of using engagement signals as an indicator of what users expect to see makes sense. After all, Google is user-centric and who better to decide what’s best for users than the users themselves, right?

Well, consider that arguably the biggest and most important song of 1991, Smells Like Teen Spirit by Nirvana, didn’t make the Billboard top 100 for that year. Michael Bolton and Rod Stewart made the list twice, with Rod Stewart top-ranked for a song called “The Motown Song” (anyone remember that one?).

Nirvana didn’t make the charts until the next year…

My opinion, given that we know that user interactions are a strong ranking signal, is that Google’s search rankings follow a similar pattern related to users’ biases.

People tend to choose what they know. It’s called a Familiarity Bias.

Consumers have a habit of choosing things that are familiar over those that are unfamiliar. This preference shows up in product choices that prefer brands, for example.

Behavioral scientist, Jason Hreha, defines Familiarity Bias like this:

“The familiarity bias is a phenomenon in which people tend to prefer familiar options over unfamiliar ones, even when the unfamiliar options may be better. This bias is often explained in terms of cognitive ease, which is the feeling of fluency or ease that people experience when they are processing familiar information. When people encounter familiar options, they are more likely to experience cognitive ease, which can make those options seem more appealing.”

Except for certain queries (like those related to health), I don’t think Google makes an editorial decision to favor certain kinds of websites, like brands.

Google uses many signals for ranking. But Google is strongly user focused.

I believe it’s possible that strong user preferences can carry a more substantial weight than Reviews System signals. How else to explain why big brand websites with fake reviews seemingly rank better than honest independent review sites?

It’s not like Google’s algorithms haven’t created poor search results in the past.

  • Google’s Panda algorithm was designed to get rid of a bias for cookie cutter content.
  • The Reviews System is a patch to fix Google’s bias for content that looks like a review but isn’t necessarily one.

If Google has systems for catching low quality sites that their core algorithm would otherwise rank, why do big brands and poor quality content still rank?

I believe the answer is that users prefer to see those sites, as indicated by user interaction signals.

The big question to ask is whether Google will continue to rank whatever users’ biases and inexperience reward with satisfaction signals. Or, to put it another way, will Google continue serving the sugar-frosted bon-bons that users crave?

Should Google make the choice to rank quality content at the risk that users find it too hard to understand?

Or should publishers give up and focus on creating for the lowest common denominator like the biggest popstars do?



Google Announces Gemma: Laptop-Friendly Open Source AI


Google released an open source large language model based on the technology used to create Gemini. It is powerful yet lightweight, and optimized to be used in environments with limited resources, such as a laptop or cloud infrastructure.

Gemma can be used to create a chatbot, content generation tool and pretty much anything else that a language model can do. This is the tool that SEOs have been waiting for.

It is released in two versions, one with two billion parameters (2B) and another one with seven billion parameters (7B). The number of parameters indicates the model’s complexity and potential capability. Models with more parameters can achieve a better understanding of language and generate more sophisticated responses, but they also require more resources to train and run.

The purpose of releasing Gemma is to democratize access to state of the art Artificial Intelligence that is trained to be safe and responsible out of the box, with a toolkit to further optimize it for safety.

Gemma By DeepMind

The model is developed to be lightweight and efficient, which makes it ideal for getting it into the hands of more end users.

Google’s official announcement noted the following key points:

  • “We’re releasing model weights in two sizes: Gemma 2B and Gemma 7B. Each size is released with pre-trained and instruction-tuned variants.
  • A new Responsible Generative AI Toolkit provides guidance and essential tools for creating safer AI applications with Gemma.
  • We’re providing toolchains for inference and supervised fine-tuning (SFT) across all major frameworks: JAX, PyTorch, and TensorFlow through native Keras 3.0.
  • Ready-to-use Colab and Kaggle notebooks, alongside integration with popular tools such as Hugging Face, MaxText, NVIDIA NeMo and TensorRT-LLM, make it easy to get started with Gemma.
  • Pre-trained and instruction-tuned Gemma models can run on your laptop, workstation, or Google Cloud with easy deployment on Vertex AI and Google Kubernetes Engine (GKE).
  • Optimization across multiple AI hardware platforms ensures industry-leading performance, including NVIDIA GPUs and Google Cloud TPUs.
  • Terms of use permit responsible commercial usage and distribution for all organizations, regardless of size.”

Analysis Of Gemma

According to an analysis by Awni Hannun, a machine learning research scientist at Apple, Gemma is optimized to be highly efficient in a way that makes it suitable for use in low-resource environments.

Hannun observed that Gemma has a vocabulary of 250,000 (250k) tokens versus 32k for comparable models. The importance of that is that Gemma can recognize and process a wider variety of words, allowing it to handle tasks with complex language. His analysis suggests that this extensive vocabulary enhances the model’s versatility across different types of content. He also believes that it may help with math, code and other modalities.

It was also noted that the “embedding weights” are massive (750 million). The embedding weights are a reference to the parameters that help in mapping words to representations of their meanings and relationships.

An important feature he called out is that the embedding weights, which encode detailed information about word meanings and relationships, are used not just in processing the input but also in generating the model’s output. This sharing improves the efficiency of the model by allowing it to better leverage its understanding of language when producing text.

For end users, this means more accurate, relevant, and contextually appropriate responses (content) from the model, which improves its use in content generation as well as for chatbots and translations.

He tweeted:

“The vocab is massive compared to other open source models: 250K vs 32k for Mistral 7B

Maybe helps a lot with math / code / other modalities with a heavy tail of symbols.

Also the embedding weights are big (~750M params), so they get shared with the output head.”

In a follow-up tweet he also noted an optimization in training that translates into potentially more accurate and refined model responses, as it enables the model to learn and adapt more effectively during the training phase.

He tweeted:

“The RMS norm weight has a unit offset.

Instead of “x * weight” they do “x * (1 + weight)”.

I assume this is a training optimization. Usually the weight is initialized to 1 but likely they initialize close to 0. Similar to every other parameter.”

He followed up that there are more optimizations in data and training but that those two factors are what especially stood out.

Designed To Be Safe And Responsible

An important key feature is that it is designed from the ground up to be safe which makes it ideal for deploying for use. Training data was filtered to remove personal and sensitive information. Google also used reinforcement learning from human feedback (RLHF) to train the model for responsible behavior.

It was further debugged with manual red-teaming and automated testing, and checked for the capability to engage in unwanted and dangerous activities.

Google also released a toolkit for helping end-users further improve safety:

“We’re also releasing a new Responsible Generative AI Toolkit together with Gemma to help developers and researchers prioritize building safe and responsible AI applications. The toolkit includes:

  • Safety classification: We provide a novel methodology for building robust safety classifiers with minimal examples.
  • Debugging: A model debugging tool helps you investigate Gemma’s behavior and address potential issues.
  • Guidance: You can access best practices for model builders based on Google’s experience in developing and deploying large language models.”

Read Google’s official announcement:

Gemma: Introducing new state-of-the-art open models
