
Googlebot is Crawling JavaScript: Is Your Site Ready?

Google has been pursuing the ability to crawl, execute, and understand JavaScript for many years. We have seen evidence of this in the past: Jody O’Donnell wrote an RKG blog post over a year ago when we noticed that Google was finding links to a page that didn’t exist, links that could only have been discovered by piecing together JavaScript code found in the HTML of the page.

Google has been open about its desire to fully understand JavaScript on webpages. With the release of this new Google Webmaster Central blog post, we can see that Google is getting closer to this goal. Does this mean it will happen right away or further down the road? That is a question only Google gets to answer.


Does This Change My Entire SEO Strategy?

What does this mean to us? At RKG, we don’t believe it changes our best practices strategically. We will still concentrate on the Crawl->Index->Rank cycle of SEO by fixing the blocking-and-tackling aspects of a website’s technical foundation while improving its content, presentation, and marketing.

As Google acquires the ability to execute JavaScript, it will change the tactical ways we approach individual tasks. For example, if a linking widget on a page meets our criteria of being relevant, targeted, and not spammy, but is built with JavaScript, would we still recommend rebuilding it in plain HTML? If we know Google can read it, probably not, although we still have to consider Bing when making that decision.
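
For illustration, a linking widget built entirely in JavaScript might look something like the sketch below (the URLs and element ID are hypothetical). A spider that does not execute JavaScript sees only the empty container, while one that does also sees the injected links:

    <div id="related-links"></div>
    <script>
      // Hypothetical related-links widget injected entirely by JavaScript.
      // Crawlers that do not run JS see only the empty <div> above.
      var links = [
        { href: "/guides/javascript-seo", text: "JavaScript SEO Guide" },
        { href: "/guides/mobile-seo", text: "Mobile SEO Guide" }
      ];
      var list = document.createElement("ul");
      for (var i = 0; i < links.length; i++) {
        var item = document.createElement("li");
        var anchor = document.createElement("a");
        anchor.href = links[i].href;
        anchor.appendChild(document.createTextNode(links[i].text));
        item.appendChild(anchor);
        list.appendChild(item);
      }
      document.getElementById("related-links").appendChild(list);
    </script>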

What Does Google Say?

In its blog post, Google listed a few things to verify about your current site and to watch for in the coming weeks:

Ensure that CSS and JS resources are not blocked in a way that prevents Googlebot from retrieving them.

This is a standard recommendation that we made to all of our clients before this announcement. While blocking those assets through a robots.txt disallow or other methods might seem to make the crawl of a website more efficient, allowing spiders to crawl those resources helps search engines better understand the structure and architecture of the site. This has become even more important with the increasing amount of traffic that comes from mobile, as it helps search engines understand your optimizations for mobile devices.
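
A quick place to check is your robots.txt file. The snippet below is a hypothetical sketch (the /js/ and /css/ paths are invented): the commented-out pattern prevents Googlebot from fetching the assets it needs to render your pages, while the rules that follow leave them crawlable.

    # Problematic pattern: blocks the assets Googlebot needs to render pages
    #   User-agent: *
    #   Disallow: /js/
    #   Disallow: /css/

    # Preferred: leave script and style directories crawlable
    User-agent: *
    Allow: /js/
    Allow: /css/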

Ensure that your web server is able to handle the increased volume of crawl requests for resources.

With the improved crawling of JavaScript by Googlebot on the horizon, there will be an incremental increase in the number of requests made to your servers. If your servers are not able to handle this increase, it will hamper both Google’s and a browser’s ability to render your pages. This inability to render pages, alongside slow page load times (which are a ranking factor), could negatively affect the user experience as well.

Ensure that your website will degrade gracefully.

Simply put, you want the most modern browsers to render your pages as intended, and you want older browsers, users with JavaScript turned off, and other spiders that cannot crawl JavaScript to receive pages that are functional, albeit with fewer features, bells, and whistles. Pages that “break” with or without the execution of JavaScript are problematic for both users and search engines.
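
Here is a minimal sketch of graceful degradation, assuming a hypothetical /product/reviews URL: the link below works as an ordinary, crawlable href when JavaScript is unavailable, and is enhanced into an in-page load when it is.

    <!-- Without JavaScript, this is a normal, functional, crawlable link. -->
    <a id="reviews-link" href="/product/reviews">See all reviews</a>
    <div id="reviews-panel"></div>

    <script>
      // Enhancement layer: when JavaScript runs, load the reviews in place.
      // Older browsers and non-JS spiders simply follow the href above.
      document.getElementById("reviews-link").onclick = function () {
        var xhr = new XMLHttpRequest();
        xhr.open("GET", "/product/reviews?fragment=1", true); // hypothetical endpoint
        xhr.onload = function () {
          document.getElementById("reviews-panel").innerHTML = xhr.responseText;
        };
        xhr.send();
        return false; // cancel the default navigation
      };
    </script>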

Ensure that the JavaScript on your site isn’t too complex or arcane to be executed by Google.

This is something to monitor, and we will address it in more detail below. At this time, we do not know what is too complex or arcane for Googlebot, but Google has added new functionality to the Fetch as Googlebot tool so you can test your pages and ensure that they render as intended.

If JavaScript is removing content from the page, Googlebot will not be able to index that content.

Showing spiders content that users with JavaScript enabled would never see is cloaking. If your site is currently doing this, now is an opportune moment to stop, as cloaking can result in a manual penalty from search engines.
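
As a hypothetical illustration of the pattern to avoid (the class name and copy are invented for this sketch), content that ships in the HTML but is removed by script will no longer be indexed once Googlebot executes that script, and the mismatch between what users and spiders see is cloaking:

    <!-- Delivered in the HTML source, so non-JS spiders see it... -->
    <div class="spider-only">Keyword-stuffed copy intended only for crawlers.</div>

    <script>
      // ...but removed as soon as JavaScript runs, so real users never do.
      // Once Googlebot executes this script, it will not index the text either.
      var el = document.querySelector(".spider-only");
      if (el) {
        el.parentNode.removeChild(el);
      }
    </script>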

What Can We Expect?

Many websites have also used Google’s past disregard for JavaScript to their advantage. It will be important to understand the repercussions of Google being able to index all of the content and assets on your pages, since it will now be able to see your pages much as a user does.

This announcement creates more questions than it answers, especially considering the versatility and expanding capabilities of JavaScript. We don’t truly know what Google’s goals are for this update, but RKG is designing tests to better understand the ramifications.

A few scenarios to think about:

  • If your site is hiding duplicate content behind JavaScript, perhaps from a review aggregator, will Googlebot now crawl all of that duplicate content?
  • Will all JavaScript links in <a> tags or JavaScript jump menus built from forms suddenly count as links? (See the sketch after this list.)
  • How will faceted navigation within JavaScript be interpreted?
  • Will the execution of JavaScript by Googlebot be in small, incremental waves or will this new capability be a fire-hose that is simply turned on?
  • How will Bing respond?
  • Do the crawlers that you use internally for your own site analysis have the ability to execute JavaScript? Will they no longer be able to mimic Googlebot?
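
To make the second question above concrete, here is a hypothetical sketch of the two patterns in question (the URLs and categories are invented). Neither destination appears in a plain href, so crawlers that do not execute JavaScript have historically ignored them; how a rendering Googlebot will treat them remains to be seen.

    <!-- An anchor whose destination lives only in JavaScript -->
    <a href="#" onclick="window.location = '/category/widgets'; return false;">Widgets</a>

    <!-- A JavaScript "jump menu" driven by a form control -->
    <select onchange="if (this.value) window.location = this.value;">
      <option value="">Choose a category</option>
      <option value="/category/widgets">Widgets</option>
      <option value="/category/gadgets">Gadgets</option>
    </select>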

What Happens When This Ramps Up?

We will have to wait for the answers to all of these questions, but we should start thinking about how this changes the SEO landscape. Ultimately, the days of hiding anything in JavaScript were always numbered, and this doesn’t change any of the best practices that RKG has maintained since its inception.

In terms of understanding Googlebot’s changing capabilities, the Google Webmaster Central announcement mentioned additional tools arriving in the “coming days” to help debug any JavaScript issues your site might be having. As we noted above, this functionality is a new feature of the Fetch as Googlebot tool.

Traditionally, fetching as Googlebot returned the raw code and HTML of the page, but it will now also return a visual representation of what Googlebot sees. As Googlebot’s ability to understand JavaScript continues to evolve, the Fetch as Googlebot tool should be our clearest window into that spider’s current ability to render webpages. It will also help us better understand what is “too complex and arcane” for Googlebot.

TL;DR

While Google’s ability to crawl JavaScript is something that has been in the works for quite some time, it still presents challenges that may not be fully evident for many months. In the short term, this is an opportunity to better understand the visibility of your webpages for both users and spiders, and to make sure that those visions are one and the same.

  • Jamey Barlow
    Jamey Barlow is an SEO Team Director at RKG.
  • Comments
    5 Responses to “Googlebot is Crawling JavaScript: Is Your Site Ready?”
    1. Sujan Meko says:

      Very good article and a really nice share with us; first of all, thanks for that.
      Honestly speaking, I am a bit confused by this. Do you mean that Google now understands JavaScript links and AJAX-loaded content?
      JavaScript links are bad for SEO, as we have known for a long time, and even if Google now crawls, understands, and indexes those types of links and text, I would still prefer and suggest using normal HTML links for internal site navigation.
      Surely it can be helpful for websites that make extensive use of AJAX, but I am not sure how Google will collect and index the data, links, and images loaded by AJAX.

    2. Jamey Barlow says:

      Thanks, Sujan.

      We are hoping to hear more from Google about the specifics. We are doing our own testing in the meantime, and hopefully we can share those results in the near future.

      Until then, we are continuing to preach the standard best practice of using crawlable, HTML links on your website.

    3. Guilherme says:

      With this feature (Googlebot executing JS), is it still necessary to implement AJAX crawling for dynamic pages?
      https://developers.google.com/webmasters/ajax-crawling/

    4. Altaf Gilani says:

      Great article!! We had a feeling that JavaScript and jQuery were being indexed by Google. The Google PageSpeed tool suggests that external JavaScript files will increase load times.

    5. Jack says:

      Javascript should be fully avoided.
