SEO Consultant and Founder of Orainti, Aleyda Solis, reviews some of the main learnings from the SEO chapter in the Web Almanac 2020. Watch the video to learn how website SEO practices have changed since 2019 and what the trends were for 2020.

Video Transcript

I’m Aleyda Solis, SEO consultant and founder at Orainti. I was kindly invited by the Search Relations team to host this Search Central Lightning Talk to present the 2020 Web Almanac SEO chapter, as I had the wonderful opportunity to author it with the amazing Jamie Indigo and Mike King.

♪ [Music] ♪

What is the Web Almanac?

The Web Almanac is a comprehensive report on the state of the web backed by real data. It comprises 22 chapters spanning aspects from page content to user experience, including SEO. The goal of the Web Almanac SEO chapter is to identify and assess the main elements and configurations that play a role in a website’s search engine optimization. The Web Almanac analysis in 2020 was based on 7.5 million websites from the August 2020 datasets of the HTTP Archive, taking into consideration both raw and rendered HTML elements. It also included data from Lighthouse and the Chrome UX Report. It’s important to note that in the case of the HTTP Archive and Lighthouse, the data is limited to website home pages only, not site-wide crawls.

Crawlability and Indexability

Let’s go through some of the most important SEO findings, starting with the crawlability and indexability related configurations. In 2019, 72.16% of mobile sites had a valid robots.txt versus 75% in 2020. Meta robots tags were found in 28% of desktop and mobile pages. Interestingly, rendering changed the meta robots tag in 0.16% of pages. This percentage is not high, and there is no inherent issue with using JavaScript to add a meta robots tag to a page or change its content, but SEOs should be careful with this. Why? Because if a page loads with a noindex directive in the meta robots tag before rendering, Google Search, and potentially other search engines, won’t run the JavaScript that changes the tag value, and won’t index the page.
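
As a rough illustration of the risk described here, the following Python sketch compares the meta robots directives in a page's raw HTML with those in its rendered HTML. The function names and the sample markup are hypothetical, and the check is a simplification: it only looks at `<meta name="robots">` tags, not at crawler-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of meta robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def noindex_only_in_raw(raw_html, rendered_html):
    """True when the raw HTML says noindex but the rendered HTML no longer does.

    This is the risky pattern: a crawler that honors the raw noindex may never
    run the JavaScript that would have removed it.
    """
    return ("noindex" in robots_directives(raw_html)
            and "noindex" not in robots_directives(rendered_html))


# Hypothetical sample markup, for illustration only.
raw = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
rendered = '<html><head><meta name="robots" content="index, follow"></head></html>'
print(noindex_only_in_raw(raw, rendered))  # True
```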

In the 2019 chapter, it was identified that 48.3% of mobile pages were using a canonical tag. In 2020 the number of mobile pages featuring a canonical tag has grown to 53.6%. 45% of mobile pages’ canonical tags were self-referential and 8.5% were pointing to a different URL as the canonical one. 52% of desktop pages were found to be featuring a canonical tag in 2020, with 48% being self-referential and 4% pointing to a different URL. It’s understandable that a higher share of mobile pages is canonicalized to a different URL, because some of those sites use an independent mobile configuration. However, the 4% of canonicalized desktop pages could be considered a bit too high if we think that these are pages which are not self-canonicalizing.

When analyzing the canonical tags implemented in the raw HTML versus those relying on client-side rendered JavaScript, we identified that 0.7% of the mobile pages and 0.5% of the desktop ones include a canonical tag in the rendered but not in the raw HTML. This means that there’s only a very small number of pages that are relying on JavaScript to implement canonical tags. In 0.93% of the mobile pages and 0.76% of the desktop ones, we saw canonical tags implemented via both the raw and rendered HTML, with a conflict happening between the URLs specified in the raw versus the rendered HTML of the same pages. This can generate indexability issues, as mixed information is sent to search engines about which is the canonical URL for the same page. A similar conflict can be found with different implementation methods, with 0.15% of the mobile pages and 0.17% of the desktop ones showing conflicts between the canonical tags implemented via their HTTP headers and the HTML head.
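
The raw-versus-rendered comparison can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method used for the Web Almanac analysis: the category labels, function names, and example URLs are hypothetical, and canonicals sent via HTTP headers are not covered.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"].strip())


def canonical_urls(html):
    """Return the canonical URLs declared in an HTML string."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


def classify_canonical(raw_html, rendered_html):
    """Label how the canonical differs between raw and rendered HTML.

    The labels are illustrative names, not terms from the report.
    """
    raw = set(canonical_urls(raw_html))
    rendered = set(canonical_urls(rendered_html))
    if not raw and not rendered:
        return "missing"
    if not raw:
        return "rendered-only"  # canonical added by JavaScript
    if not rendered:
        return "raw-only"       # canonical removed during rendering
    if raw != rendered:
        return "conflict"       # raw and rendered disagree
    return "consistent"


# Hypothetical sample markup, for illustration only.
raw_html = '<head><link rel="canonical" href="https://example.com/a"></head>'
rendered_html = '<head><link rel="canonical" href="https://example.com/b"></head>'
print(classify_canonical(raw_html, rendered_html))      # conflict
print(classify_canonical("<head></head>", rendered_html))  # rendered-only
```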

Content Trends

What about the content optimization status? In 2020 the median desktop page was found to have 402 words and the median mobile page 348 words, while in 2019 the median desktop page had 346 words and the median mobile page had a slightly lower count at 306 words. This represents 16.2% and 13.7% growth respectively. It’s important to note that more content is not necessarily better; what is important is to satisfy user search needs with comprehensive information that is helpful.
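
The growth figures quoted here follow directly from the median word counts. A small Python check, with `percent_growth` as a hypothetical helper name:

```python
def percent_growth(old, new):
    """Year-over-year change from old to new, as a percentage of old."""
    return (new - old) / old * 100


# Median word counts quoted in the talk: 2019 versus 2020.
desktop = percent_growth(346, 402)
mobile = percent_growth(306, 348)
print(f"desktop: {desktop:.1f}%, mobile: {mobile:.1f}%")
# desktop: 16.2%, mobile: 13.7%
```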

If you want to learn more about this topic, I highly recommend watching Lily Ray’s conversation with Martin Splitt in the “Is more content better?” episode of SEO Mythbusting. We found that the median desktop site features 12% more words when rendered than it does in its raw HTML. We also found that the median mobile site displays 13% less content than its desktop counterpart. The median mobile site also displays 11.5% more words when rendered than its raw HTML counterpart. This might not necessarily be a problem. However, even though Google and other search engines are continually improving their capacity to render and index JavaScript content, some sites will be missing out on opportunities to improve their organic search visibility through a stronger focus on ensuring their content is always available and indexable.

When analyzing the usage of the title tag, we found that 99% of desktop and mobile pages are featuring one. This represents a slight improvement since 2019, when 97% of mobile pages had a title tag. The median page features a title that is six words long, with no difference in the word count between mobile and desktop content. The median page title character count is 38 characters on both mobile and desktop. Interestingly, this is up from 20 characters on desktop and 21 characters on mobile in the 2019 analysis.

When analyzing the use of meta descriptions, we found that 68.6% of desktop pages and 68.2% of mobile pages have one. Although this might be surprisingly low, it’s a slight improvement from 2019, when only 64% of mobile pages had a meta description. The median length of the meta description is 19 words. The only disparity in word count takes place on the 90th percentile, where the desktop content has one more word than mobile. The median character count for the meta description is 138 characters on desktop and 136 characters on mobile pages, which is under the 160 characters that is usually shared as a guideline or reference in SEO best practices.

Let’s see if things have also improved with links. The median desktop page includes 76 links while the median mobile page has 67. The median page has 61 internal links, going to pages within the same site, on desktop and 54 on mobile. This is down 12.8% and 10% respectively from the 2019 analysis, and might suggest that sites are not maximizing the ability to improve crawlability and link equity flow through their pages in the way that they did the year before. The median page is linking to external sites seven times on desktop and six times on mobile. This is a decrease from 2019, when it was found that the median number of external links per page was 10 on desktop and 8 on mobile. This decrease in external links could suggest that websites are now being more careful when linking to other sites, whether to avoid passing link popularity or to avoid referring users to them.

There is a disparity between the links of mobile and desktop pages that will negatively impact sites as Google becomes more committed to mobile-only indexing rather than mobile-first indexing. This is illustrated in the 62 links on mobile versus the 68 links on desktop for the median web page. 28.6% of pages include rel=”nofollow” attributes on desktop and 30.7% on mobile. However, rel=”ugc” and rel=”sponsored” adoption is quite low, with less than 0.3% of pages having either. Since these attributes don’t add any more value to publishers than rel=”nofollow”, it is reasonable to expect that the rate of adoption will continue to be slow.

However, it is important to mention that these attributes add semantic information that Google can use, so it’s advisable to make use of them. The link discoverability for major JavaScript frameworks used for single-page applications increased dramatically compared to 2019. By testing mobile navigation links for hash URLs, we saw 53% fewer instances of uncrawlable links from sites using React, 58% fewer from Vue.js-powered sites, and a 91% reduction from Angular single-page applications.
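
To make the idea of testing links for hash URLs concrete, here is a hedged Python sketch. The heuristic is an assumption made for illustration (a link is flagged when it has no href, an empty href, a fragment-only href, or a `javascript:` href); it is not the test used in the Web Almanac analysis, and all names and markup are hypothetical.

```python
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collects the href (or None) of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def is_hash_or_script_link(href):
    """Assumed heuristic: the link gives a crawler no separate URL to fetch."""
    if href is None:
        return True
    href = href.strip().lower()
    return href == "" or href.startswith("#") or href.startswith("javascript:")


def uncrawlable_share(html):
    """Fraction of <a> tags flagged by the heuristic (0.0 when there are none)."""
    parser = LinkParser()
    parser.feed(html)
    if not parser.hrefs:
        return 0.0
    flagged = sum(is_hash_or_script_link(href) for href in parser.hrefs)
    return flagged / len(parser.hrefs)


# Hypothetical navigation markup, for illustration only.
nav = (
    '<nav>'
    '<a href="#/products">Products</a>'
    '<a href="/about">About</a>'
    '<a onclick="go()">Contact</a>'
    '<a href="/blog">Blog</a>'
    '</nav>'
)
print(uncrawlable_share(nav))  # 0.5
```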

Structured Data

What’s the status of structured data? As part of our examination, we took a look at the incidence rates of different types of structured markup. JSON-LD has become the preferred format. It appears on 29.8% of mobile pages and 30.6% of desktop ones. 38.6% of desktop pages and 39.3% of mobile pages feature JSON-LD or microformat structured data in the raw HTML, while 40.1% of desktop and mobile pages feature structured data in the rendered DOM. When reviewing this in more detail, we found that 1.5% of desktop pages and 1.8% of mobile pages only feature this type of structured data in the rendered DOM due to JavaScript transformations, relying on search engines’ JavaScript execution capabilities. 4.5% of desktop pages and 4.6% of mobile pages feature structured data that appears in the raw HTML and is subsequently changed by JavaScript transformation in the rendered DOM.
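
One way to see which structured data only exists after rendering is to compare the JSON-LD types declared in the raw HTML with those in the rendered DOM. The sketch below assumes both versions are available as strings; the class and function names and the sample markup are hypothetical, and it ignores microformats, microdata, and nested `@graph` structures.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collects the parsed content of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD rather than fail


def jsonld_types(html):
    """Return the set of top-level @type values declared via JSON-LD."""
    parser = JsonLdParser()
    parser.feed(html)
    types = set()
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                declared = item["@type"]
                types.update(declared if isinstance(declared, list) else [declared])
    return types


# Hypothetical sample markup, for illustration only.
raw = (
    '<script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Organization", "name": "Example"}'
    '</script>'
)
rendered = raw + (
    '<script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}'
    '</script>'
)
# Types that appear only after JavaScript has run.
print(sorted(jsonld_types(rendered) - jsonld_types(raw)))  # ['FAQPage']
```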

Depending on the type of changes applied to the structured data configuration, this can generate mixed signals for search engines when rendering them. Also, despite the fact that reviews are not supposed to be associated with home pages, the data indicates an aggregate ratings usage of 23.9% on mobile and 23.7% on desktop. It’s also interesting to see the growth of the VideoObject structured data: its usage grew 30.11% on desktop and 27.7% on mobile. The growth of this object is a general indication of the increased adoption of structured data. There’s also an indication that what Google gives visibility to within search features increases the incidence rates of lesser-used objects. Google announced FAQPage, HowTo, and QAPage objects as search visibility opportunities in 2019, and they sustained significant year-over-year growth. FAQPage markup grew 3,261% on desktop and 3,000% on mobile. HowTo markup grew 605% on desktop and 623% on mobile. QAPage grew 167% on desktop and 192% on mobile. It’s important to note that this data might not necessarily be representative of their actual level of growth, since these objects are usually placed on internal pages.

Page Experience

Are websites ready for the page experience update? When analyzing Core Web Vitals, desktop continues to be the more performant platform for users, despite there being more users on mobile devices. 33.1% of websites score good Core Web Vitals on desktop, while only 20% of their mobile counterparts pass the Core Web Vitals assessment. From a security perspective, 77.4% of desktop pages and 73.2% of mobile pages have adopted HTTPS. This is up 10.4% from 2019.

It is important to note that browsers have become more aggressive in pushing HTTPS by signaling that pages are insecure when you visit them without HTTPS. Also, HTTPS is currently a requirement to capitalize on higher-performing protocols such as HTTP/2. All these things have likely contributed to the higher adoption rate year over year. From a mobile friendliness perspective, 42% of mobile pages, and 42% of desktop ones too, have a viewport meta tag with the right configuration. However, 11% of mobile pages and 16.2% of desktop ones are not including the tag at all, suggesting that they might not necessarily be mobile friendly yet. We found that 80.3% of desktop pages and 83% of the mobile ones are using either the height, width, or aspect-ratio CSS configurations, meaning that a high percentage of pages have responsive features.
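
A viewport check along these lines can be sketched in Python. The talk does not spell out what "the right configuration" means, so this sketch assumes the commonly recommended `width=device-width, initial-scale=1`; the status labels and function names are hypothetical.

```python
from html.parser import HTMLParser


class ViewportParser(HTMLParser):
    """Captures the content of the first <meta name="viewport"> tag."""

    def __init__(self):
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.content is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "viewport":
            self.content = attributes.get("content") or ""


def viewport_status(html):
    """Return 'missing', 'recommended', or 'other' for a page's viewport tag."""
    parser = ViewportParser()
    parser.feed(html)
    if parser.content is None:
        return "missing"
    settings = {}
    for part in parser.content.split(","):
        key, _, value = part.partition("=")
        settings[key.strip().lower()] = value.strip().lower()
    # Assumption: this pair is what counts as the "right" configuration.
    if (settings.get("width") == "device-width"
            and settings.get("initial-scale") in {"1", "1.0"}):
        return "recommended"
    return "other"


# Hypothetical sample markup, for illustration only.
print(viewport_status('<meta name="viewport" content="width=device-width, initial-scale=1">'))
# recommended
print(viewport_status("<head></head>"))
# missing
print(viewport_status('<meta name="viewport" content="width=1024">'))
# other
```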

On the other hand, desktop websites that have separate mobile versions are recommended to link to them using the rel=”alternate” tag in the head of their HTML. Only 0.64% of the analyzed desktop pages were found to be including this tag with the specified media value. It’s so much data. As you can see, some areas have improved, but others not so much.


Let’s wrap up. Consistent with what was found and concluded in 2019, most sites have crawlable, indexable desktop and mobile pages and are making use of the fundamental SEO-related configurations. It is important to highlight how the link discoverability for major JavaScript frameworks used for single-page applications increased dramatically compared to 2019.

Additionally, we identified that there has been a slight improvement from the 2019 findings across many of the analyzed areas. For example, robots.txt: in 2019, 72.2% of mobile pages had a robots.txt that was valid versus 75% in 2020. Canonical tag: in 2019 we identified that 48% of mobile pages were using a canonical tag versus 54% in 2020. Structured data: despite the fact that reviews are not supposed to be associated with home pages, the data indicates that aggregate ratings usage is at 24% on mobile and desktop pages. HTTPS usage: 77.4% of desktop pages and 73% of mobile pages have adopted HTTPS. This is up 10.4% from 2019.

However, not everything has improved over the last year. The median desktop page includes 61 internal links while the median mobile page has 54. This is down 13% and 10% respectively from 2019. Also, 5.6% of desktop pages and 6% of mobile pages contain no internal links. Despite the growing use of mobile devices and Google’s move to a mobile-first index, the following findings could negatively affect sites as Google completes its migration to a mobile-first index in March 2021. 10.8% of mobile pages and 16.2% of desktop ones are not including the viewport tag at all, suggesting that they are not yet mobile friendly. Non-trivial disparities were found across mobile and desktop pages, like the one between mobile and desktop links illustrated in the 62 links on mobile versus the 68 links on desktop for the median web page. 33.1% of desktop pages score good Core Web Vitals, while only 20% of their mobile counterparts pass the Core Web Vitals assessment, suggesting that desktop continues to be the more performant platform for users. Disparities were also found across rendered and non-rendered HTML. The median mobile page displays 11.5% more words when rendered than in its raw HTML, indicating a reliance on client-side JavaScript to show content.

These findings suggest that search engines are continually evolving in their capacity to effectively crawl, index, and rank websites, and that some of the most important SEO configurations are now also better taken into consideration. However, many sites across the web are still missing out on important search visibility and growth opportunities, which also shows the persisting need for SEO evangelization and best practice adoption across organizations.

Thank you very much for your attention. I hope these findings help you to prioritize your organic search optimization efforts this year. As you could see, the web needs SEO. If you have any questions or comments, please don’t hesitate to leave them here. Thank you very much, bye bye.

