Does Googlebot care about valid HTML?


ADRIAN MADARAS: Today’s question comes from Bangalore. Hemanth asks, “Does the crawler really care about valid HTML? Validating google.com gives me 23 errors and 4 warnings.”

So there are plenty of reasons to write valid HTML, to pay attention to your HTML, and to make sure that it’s really clean and that it validates. It makes it more maintainable. It makes it easier whenever you want to upgrade. It makes it much better if you want to hand that code off to somebody else. There are just a lot of good reasons to do it.

At the same time, Google has to work with the web we have, not the web that we want to have. And the web that we have has a lot of syntax errors, a lot of invalid HTML. So we have to build the crawler to compensate for that and to deal with all the errors and weird syntax that people sometimes mistakenly write onto the web.

So Google does not penalize you if you have invalid HTML, because there would be a huge number of web pages like that. Some people know the rules and then decide to make things a little bit faster or to tweak things here or there, and so their pages don’t validate. And there are enough pages that don’t validate that we said, OK, this would actually hurt search quality if we said only the pages that validate are allowed to rank, or rank those a little bit higher.

First and foremost, we have to look at the quality of the information and whether users are getting the most relevant information they need, rather than whether someone has done a very good job of making the cleanest website they can.

Now, I wouldn’t be surprised if they correlate relatively well. Maybe it’s a signal we’ll consider in the future. But at least for right now, do it because it’s good for maintenance. It’s easier for you if you want to change the site in the future. Don’t just do it because you think it will give you higher search rankings.
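
As a rough illustration of what tolerating broken markup can look like, the sketch below uses Python's standard-library `html.parser`, which does not require well-formed input, to pull links and text out of a deliberately invalid page. This is not how Googlebot works internally; the class name `LinkAndTextExtractor`, the helper `extract`, and the sample page are all made up for this example.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collects link targets and visible text without requiring valid markup.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def extract(html):
    """Return (links, text) found in the given markup, valid or not."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


# Deliberately invalid sample: no doctype, an unquoted attribute value,
# unclosed <p> and <b>, and no closing </body> or </html>.
BROKEN_PAGE = "<html><body><p>Hello <b>world<p>See <a href=/about>about us</a>"

links, text = extract(BROKEN_PAGE)
print(links)  # ['/about']
print(text)   # Hello world See about us
```

The parser never raises on the missing closing tags; it simply reports what it finds, which is the kind of behavior a crawler needs when a large share of pages do not validate.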

44 Comments

  • HostTribe

    If you have a ton of HTML errors, I can guarantee your site will not rank well. Google takes into account site upkeep from what I have seen. Everybody has errors somewhere, though, and it's hard to have zero.

  • Kenneth von Rauch

    I guess it actually influences ranking, but in a somewhat indirect way. If your coding errors slow down your site, that is most definitely a factor that negatively influences rankings. At the same time, if those errors have zero effect on page loading speed, then it's a non-issue. What do you think, guys?

  • Matthew Potter

    Good content that is coded with good semantic tagging will simply result in better crawling for ALL search bots. Validation of HTML is good but there are also a lot of requirements in the validation that don’t make sense.

    As an example: Not all images require alt tags. If the image is used for visual design purposes only, alt tags are a hindrance for accessibility and screen readers.

  • Joey Altherr

    If the code is so bad that the page load time is slower than it should be, then THAT makes a bad user experience, and that's a ranking factor. Correct?

  • Mal Milligan

    Google is now like the Monolith in 2001: A Space Odyssey. It watches what we do and sometimes can change the way we behave. If Google added a validation signal to The Algorithm, the whole world would write cleaner code. While we spend lots of time and money working on the details regarding content and SEO, the coders get away with being lazy and minimally competent. Google is simply ignoring the fact that poorly written code is frequently an indication of poorer quality and less accessibility.

  • Hexanet Communications

    It's important to note that if you want the screen reader to skip the image, you need a 'null' alt text; otherwise it may read the link destination, which is worse. And neither is fully predictable, depending on the screen reader options.

    Thus, if the image is used for visual design only, you are better off using a CSS background instead.

  • Ivan Rakic

    Matt, you forgot to say why Google has invalid code. I'm just curious. It would be awesome to get input from the developers who actually made that first page.

  • Nate

    I would argue that someone who slapped a site together without paying as much attention to syntax as a "pro" would may well have higher-quality content than someone who spent time validating everything and no time on their content.

  • Adrian Madaras

    Even attributes like "rel" and "itemprop" (etc.), which Google has adopted to give a better experience on the web, are not valid HTML. What about that?

  • BluCoder

    Matt once said around 40% of the websites online have invalid code, so the short answer is NO, Googlebot won't give much importance to invalid HTML as long as you have good content on your website.

  • Anaglyph80

    I'd suggest that valid code (or very nearly valid code) will have secondary / flow-on effects that "can" boost rankings.
    Valid code means pages load faster and cleaner AND it will display better in more browsers.
    Website owners are more likely to link out to good-looking, fast-loading websites. People are also more likely to click social share buttons, talk about the site, spend longer on it, return to it, etc.
    Not paying attention to code is just lazy; it's not hard!
    Little things make big differences!

  • Alpha

    "Valid code means pages load faster"
    Sorry, but this simply isn't true. For example, if you leave out the /body and /html tags at the end of each document, the resulting filesize is lower which means the entire file will transfer faster. Take a look at one of Google's error pages. That code is far from valid (hell, it doesn't even have a head or body tag) but it works just fine in every modern browser.

  • Adrian Madaras

    Use “itemprop=…” and other attributes like “rel=…” as you wish (I use these attributes), as long as you don’t forget to close tags like table, div, ul, etc. Unclosed tags can be a problem for the browser, which means a “bad experience for users”, and for that Google will penalize the site.

  • Alexander Aranda

    "I wouldn't be surprised if they correlate relatively well", neither would I. A webmaster that spends a great deal of time and effort on their content is quite likely to have spent time creating a quality site (which is much easier).

  • Dwight Cocran

    That's a long video just to plainly say "no". Oh well, I never really cared for validation much, which can be applied to more than HTML 🙂

  • RankYa

    Good answer, Matt, and I'm sure if Google informed webmasters that "it is important" to have valid HTML (common sense), then most websites would get cleaned up. But I guess with so many different browsers out there, all working on different architectures, it is not as easy to have valid HTML that renders in all browsers. Yet I still think adhering to web standards (especially moving towards HTML5) should be emphasized so that we can all have ONE web experience that just works, always.

  • ヤンギレGaWd

    I wish that YT itself would allow HTML in comments, just another reason I hate YT imo xD (just saying, because it's what I searched for).

  • Ronan Le Glouannec

    There are many reasons why HTML validation is still worthwhile. Matt doesn't say which ones, but interoperability can be considered a major reason to respect the W3C standards. It is probably the best way to make sure the website displays correctly on the majority of browsers and devices (desktop, tablets, and mobile).

  • Nicolas Augé

    Personally, I sum it up as: "Build a website thinking of visitors first, before thinking of bots".

    Basically, it may be a signal that Google will consider in the future, but HTML validation (W3C) should in no way be considered a "ranking" factor. 😉

  • Michael Ecklund

    Don't "penalize" those who take shortcuts and break the rules of the web, but "award" those who actually do write VALID code and respect the WEB STANDARDS. What's the point of having "Web Standards" if no one abides by them? Google can actually help make the web a better place by awarding those who actually follow the rules of the web.
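
The distinction raised in the thread between an image with no alt attribute and one with a 'null' (empty) alt text can be checked mechanically. The following is a minimal sketch using Python's standard-library `html.parser`; the class name `AltAudit`, the three category names, and the sample file names are assumptions invented for this example, not part of any validator.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Sorts <img> tags by how their alt attribute is set (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.missing = []     # no alt attribute at all
        self.decorative = []  # alt="" (null alt text)
        self.described = []   # non-empty alt text

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src") or "(no src)"
        if "alt" not in attr_map:
            self.missing.append(src)
        elif not attr_map["alt"]:
            # Covers alt="" and a bare alt attribute (parsed as None).
            self.decorative.append(src)
        else:
            self.described.append(src)


# Made-up sample markup: a linked logo without alt, a decorative divider
# with null alt text, and a chart with descriptive alt text.
SAMPLE = (
    '<a href="/home"><img src="logo.png"></a>'
    '<img src="divider.png" alt="">'
    '<img src="chart.png" alt="Sales chart">'
)

audit = AltAudit()
audit.feed(SAMPLE)
audit.close()

print("missing alt:", audit.missing)
print("null alt:", audit.decorative)
print("described:", audit.described)
```

Images in the first group are the ones where, as noted above, a screen reader may fall back to reading the link destination; images in the second group are the ones it is meant to skip.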
