chrislynch.link

Architecture for SEO

Google Says Code To Text Ratio Has Never Been An SEO Factor… but is that really true?

Moments of clarity from Google are rare, so it’s nice when they come out and say something simple and unambiguous like

The code to text ratio is not, and never has been, a factor in SEO.

John Mueller / Google

Except… is this really true?

Assuming it’s safe to take Google at face value here and that there is not a ranking factor specifically measuring how much “code” (the stuff that makes up the page and its design but isn’t the stuff you read) there is compared to “text” (the stuff you actually read) then is it safe to just stuff as much code as you want into a page?

Err… no. Definitely not.

The reason for this is our old friend from Core Web Vitals, Site Speed.

The more code on the page, the more bytes of data you need to transfer to load it. The more JavaScript libraries, CSS files, etc. that you load, the more bytes of data you need to transfer to load those as well. You can have a lot of code and not a lot of content on your web page and all of it will add to the amount of data you transfer and potentially slow your page down.
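
If you want a rough feel for how much of a page is "stuff you read" versus everything else, something like the sketch below can help. It is only an illustration: the names TextExtractor and page_weight are my own, it uses Python's standard html.parser, and it only measures the HTML document itself, not the external JavaScript and CSS files the page pulls in.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style block.
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_weight(html: str) -> dict:
    """Return total bytes, visible-text bytes, and the text share of the HTML."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not count as "text".
    text = " ".join(" ".join(parser.chunks).split())
    total_bytes = len(html.encode("utf-8"))
    text_bytes = len(text.encode("utf-8"))
    share = text_bytes / total_bytes if total_bytes else 0.0
    return {"total_bytes": total_bytes, "text_bytes": text_bytes, "text_share": share}
```

The number to watch is total_bytes rather than the ratio: the ratio is not a ranking factor, but every extra byte still has to be transferred.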

So, whilst the specific ratio of code to text isn’t a ranking factor, that’s no reason to let your pages get bloated with unnecessary code. This is especially important when working with a CMS like WordPress. In their efforts to make themes flexible and configurable, WordPress developers often have to write more complex code than would be required if they knew exactly the layout that they were producing ahead of time. More options means more code. If you know the layout you want, consider working with a developer to create a bespoke theme with just the options, and code, that you need.

You can find the quote from Google and more at https://www.seroundtable.com/google-code-to-text-ratio-seo-factor-32889.html

Keep Unique Content Above the Fold

Like the killer in an 80s slasher movie, The Fold just won’t die. What is it? It’s the invisible dividing line between what you can see when you first load up a webpage and what lies out of sight until you scroll downwards. Web designers used to obsess about the fold and what was, and wasn’t, above it.

Then, the mobile web came along and suddenly there were a lot more devices to worry about and designers stopped worrying about the fold quite so much. (In fact, designers now spend a lot of time telling people not to worry about the fold). Those cheeky designers… they do like an easy life.

Because, guess what? The fold isn’t dead. There’s a big space where its body should be. The fold is alive!

Whilst it’s arguably true that the average user is much more used to scrolling now than they ever were, and it may genuinely not be as important to keep your important information and calls to action “above the fold”, Google’s John Mueller has been uncharacteristically clear on the importance of putting unique content above the fold.

Mueller explains that when there is a lot of repeated template content above the fold, this can cause problems. It’s been a commonly held SEO belief for some time that content further up the page is more important, something that Google has occasionally denied. In this instance however, they are in a rarely transparent mode – you need unique content above the fold.

So if you have a banner on top, and you have generic URL image on top, that’s totally fine. But some of the above the fold content should be unique for that page. And that could be something like a heading that’s visible in a minimum case, but at least some of the above the fold content should be there,

John Mueller, Google Search Advocate. https://www.seroundtable.com/google-templated-content-above-fold-31559.html

What should you have above the fold?

You’re going to have a lot of standard stuff above the fold, potentially taking up a lot of space on mobile. Logos, contact details, menus, baskets… these all take up space. Add in that super-big hero image that marketing simply insisted you have on the page and there may not be any space left for any actual content.

But content is what you need – unique, useful content, right there above the fold.

H1 Headings… Are Over 50% of SEOs Doing it Wrong?

If this isn’t a saying by now, it should be…

If you want three different opinions on an SEO question, ask two different SEO consultants…

Search Engine Optimisation remains, in the vast majority of areas, a practice shrouded in mystery. Google’s guidance, where they give it, is often as vague as the predictions of a seaside fortune teller, and no matter what Google do say you will invariably find a fistful of SEO consultants who will say that Google aren’t telling the truth and who often have good worked examples of where they’ve achieved excellent results by doing the opposite of what Google suggested.

Even the most basic of questions like “How many H1s should there be on a page” seems open to debate. (If you’re not in the know, a H1 tag means “Heading 1” and is (or should be) the topmost and largest heading on the page.)

This is a great example of how SEOs don’t all agree and how Google’s vague advice can lead people down the wrong route. Cyrus Shepard got quite excited about more than half of SEOs giving the “wrong answer”, only to receive a small deluge of tweets pointing to articles where Google contradict themselves on this point and give better guidance on what headings are for and how they matter. (And they do matter.)

So it’s not so much that suddenly your page ranks higher because you have those keywords there. But suddenly it’s more well Google understands my content a little bit better and therefore it can send users who are explicitly looking for my content a little bit more towards my page.

John Mueller, via https://www.searchenginejournal.com/heading-tags-for-seo/341817

What do I recommend for Headings, including H1?

My own personal experience on this when guiding non-SEO people in doing their own SEO is that a one H1 strategy is very hard to get wrong. I’m not contradicting Google here and claiming that having more than one H1 is bad, but it does require a little more thought to get it “right”. You have to think about page structure more, you have to think about your content more, and it inevitably leads to more questions like – is the H1 nearer the top of the page more important? (Based on this advice, it might be.) Also, as a rule, insisting on one H1 per page focusses the minds of content creators on what they want to talk about and removes the get-out-of-jail-free card for any lazy developer who used a H1 tag where what they really wanted was just “some big text”. Getting your headings right is also really important for accessibility. Google’s fancy-pants AI may not need so much structure, but screen readers for visually impaired users really like it. That’s a good reason to get headings organised if nothing else.

Google may not be looking at H1s as a key instruction of what your page is about, but your users certainly are and, in this context, making one page about one thing is very sensible.

This survey also highlights one of the big problems in SEO. There are very few, if any, right or wrong answers. The answer to the vast majority of SEO questions is “it depends” (whether you ask an SEO consultant or Google themselves), so asking a “Yes or No” question like this is misleading. I’m one hundred percent sure that if you asked John Mueller “Is there ever a scenario when having just one H1 tag is best” the answer would be a heavily caveated, somewhat equivocating,

“Yes, it can be, because we don’t look at H1 tags like blah blah, blah blah blah blah so just blah and blah and make the best site you can.”

Not Google’s John Mueller, but it really could be

What should you do about multiple H1s?

Let’s start with what you shouldn’t do. Don’t go on some kind of Liam Neeson in Taken style killing spree trying to prune every page down to one H1 just because I said that that was a simple, hard to get wrong strategy. Also, don’t start sprinkling H1s over your page like Rip Taylor with a bucket of confetti just because Google said it was OK and now you’ve got one less thing to worry about.

Instead, start considering these questions:

  1. What is this page about?
  2. Does the first H1 make that clear to the user?
  3. Do the sub headings guide the user and provide more context?
  4. If there is a second H1 on the page, is it legitimately the start of a whole new topic?

Fiddling about with headings isn’t a silver bullet (nothing in SEO is…) but in terms of page construction it is something easy to get right and something your developer, designer, and content creator(s) should be considering.
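
To answer those questions you first need to see the heading outline of the page as it actually is. Here is a minimal sketch that pulls it out of the HTML, counts the H1s, and flags places where a level is skipped (an H2 followed directly by an H4, say). The names HeadingOutline and audit_headings are hypothetical, and it only sees headings present in the HTML source, not ones injected later by JavaScript.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Records every heading on the page as a (level, text) pair, in order."""

    def __init__(self):
        super().__init__()
        self._current = None
        self._buffer = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = " ".join("".join(self._buffer).split())
            self.headings.append((int(tag[1]), text))
            self._current = None


def audit_headings(html):
    """Return the outline, the number of H1s, and any skipped heading levels."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    h1_count = sum(1 for level, _ in parser.headings if level == 1)
    skipped = [
        (prev, cur)
        for (prev, _), (cur, _) in zip(parser.headings, parser.headings[1:])
        if cur > prev + 1
    ]
    return {"outline": parser.headings, "h1_count": h1_count, "skipped_levels": skipped}
```

A count of two H1s is not a failure in itself. It is a prompt to go and ask question 4 about the second one.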

Google say: It’s Fine if 30-40% of URLs in Search Console are 404s

Your website will throw a 404 error if someone tries to go to a webpage that isn’t there. Also known as “broken links”, 404s are generally considered to be bad for SEO.

However, in a recent update, Google said that SEOs were maybe worrying too much about 404s and it was “fine if 30-40% of URLs in Search Console are 404s”.

Source: https://www.searchenginejournal.com/google-fine-if-30-40-of-urls-in-search-console-are-404s/397313/amp/

Does this seem weird? Well, yes it does… so let’s break it down a little.

Why you might have lots of 404s in your Search Console

Google freely admit that they often come back to old URLs, even if they have been 404-ing for a long time. I’ve personally seen Googlebot request URLs on websites that are many years out of date. Of course, this can be because there’s a site out there, somewhere, that’s still linking to that out of date URL but more often than not it’s just that Googlebot never seems to quite give up on a URL.

Do you need to fix 404s or not?

It seems like a remarkably large cop-out by Google to say that having such a potentially large number of 404s in your Search Console results is OK and I’d be concerned about working with any SEO who took this figure and ran with it as a reason not to tackle 404s.

Sure, it’s great news that we need to worry less about those random URLs that turn up like Banquo at the feast to spoil our spotless Search Console, but if we turn a blind eye to the 404 report then we take a big risk that we are ignoring URLs that worked quite recently and have been lost. I’d also be concerned about this piece of advice being used as a reason not to do comprehensive redirects when migrating a site from one host to another.

Like many things in SEO, the answer to whether or not you need to fix a particular 404 is subjective and you need to consider things like:

  1. Have any actual users tried to use this URL?
  2. Should the URL work? Is something broken?
  3. When did this URL last work?
  4. Can the URL be fixed?
  5. Is there a suitable, meaningful replacement page for this URL on your site?

I’ve tried to prioritise these questions and #1 is almost certainly the most important check here – fixing 404s that real users are seeing is always worth the time and effort. You should be monitoring 404s within your website CMS to achieve this – don’t rely on Google as it won’t tell you the complete answer to this.
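
If your CMS doesn’t record 404s for you, your web server’s access log will. The sketch below is one way of answering question #1 from that log: it counts 404s per URL and splits them into "probably a person" and "probably a crawler". It assumes the common "combined" log format, and the BOT_HINTS list is a deliberately crude heuristic of my own, not a complete list of crawlers.

```python
import re
from collections import Counter

# Matches the request, status, referrer and user agent in a "combined" format
# access log line. Your server's format may differ; this is an assumption.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
BOT_HINTS = ("bot", "crawler", "spider")  # crude heuristic, not a complete list


def triage_404s(lines):
    """Count 404s per path, separately for likely humans and likely crawlers."""
    human, crawler = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match["status"] != "404":
            continue
        agent = match["agent"].lower()
        bucket = crawler if any(hint in agent for hint in BOT_HINTS) else human
        bucket[match["path"]] += 1
    return human, crawler
```

Calling human.most_common(20) on the result gives you a fix-these-first list; the crawler counter is where the years-out-of-date URLs tend to show up.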

Of course, if you have SEO problems then you may not be generating traffic to all of your URLs so you can’t rely on test #1 alone. Test #2 is where we pick this up and, again, I’d always encourage you to fix a problem like this.

The far-from-last word on 404s and Redirects

In short, I’d always encourage you to fix any 404 you reasonably can. The one area where I think we can start to change our advice here is on URLs where there is no logical replacement URL.

Getting a 404 for “big widgets” because you don’t sell “big widgets” any more? That’s fine. Let it hit your 404 or, maybe, be super-kind to your customer and redirect them somewhere where big widgets can still be found.

Why you need a web host with fast and reliable hardware

This should go without saying but… if you find that the host is having frequent problems (or worse, the same problem over and over again) when you’re investigating their support arrangement – bump them off the list.

It may seem harsh, but if hosting is their business and they aren’t doing it well, why should you let that affect your business? I can’t think of a single logical reason why you would do that.

Beyond this, assessing how reliable a particular infrastructure is can be an extremely complex task. The biggest hosts, such as Amazon, offer a plethora of hosting options and, whilst each has a solid underpinning from Amazon themselves, the way in which these are configured and connected together can have a huge impact on how reliable your website actually is.

In short – working out if a particular configuration is reliable or not is a topic for a whole other book.

If you’re working with a web development agency then they may have a preferred hosting platform 

Most web developers will have a relationship with a hosting provider and will either want to put you in touch with them or resell their services.

This is completely normal – but beware of a developer who sells “hosting” but won’t tell you where they get it from. Either their markup is astronomical, they are selling you short in terms of capacity, they’ve got multiple customers packed on the same server like bees in a hive, or there’s something else fishy about this arrangement. Find out who they use and then check them out.

Hosting is not an area to cut corners. The guys at Trotters Independent Hosting (Data Centres in New York, Paris, and Peckham) are not your friend. 

Leave the cheap hosting to the bloggers – we need servers ready to do business.

Use a fast server

“I feel the need… the need for speed”

Maverick, Top Gun

Website speed matters. An increase in page load time of 1 sec can impact your conversion rates by up to 90%. People are lazy, in a hurry, and don’t have time to wait for your website.

In 2020, Gary Illyes from Google confirmed that a server upgrade can change how frequently and rapidly Google scans your site for new content:

Fun fact: changing a site’s underlaying infrastructure like servers, IPs, you name it, can change how fast and often Googlebot crawls from said site. That’s because it actually detects that something changed which prompts it to relearn how fast and often it can crawl.

Gary Illyes

The Bottom Line

The quickest and easiest way to improve the performance of a website is to increase the specification of the hardware it is running on. More processing power, more memory, faster discs, and more (or faster) bandwidth are the order of the day – all of these things affect the speed at which your site will load. However, they are not the whole story…

Reviewing your Infrastructure and Platform

Today’s exciting new technology that you simply must use is tomorrow’s legacy platform that everyone winces when they talk about.

SEO is changing. Remember that? So is software. So is hosting infrastructure. The change is constant, relentless, and needs to be managed.

One of the things I hate most about the web development industry is the number of web developers who don’t explain to their clients the need to maintain a website’s software infrastructure after it’s been implemented. There are far too many providers who simply install the latest version of “CMS X”, configure it, apply a theme, and then walk away. Project complete. Job done.

And well it might be, but the job will be undone in a few weeks by the next security update or system patch that “CMS X” or one of the plugins are going to need.

This is why it is absolutely crucial that you have a support contract with whoever builds your website for you. 

Your support should include:

  1. Updates to the underlying server software – operating system, web server, database, etc.
  2. Updates to the CMS software.
  3. Updates to any plugins.
  4. Updates to any themes.
  5. Updates to any bespoke code that breaks down as a consequence of upgrades to 1, 2, 3, or 4 above.
  6. Any queries.

It’s reasonable to expect that your provider will want to be paid both for providing this support and for any work over and above what is agreed – you should note that changes to the website are not covered in the list above.

Backups! 

If you lose your site then you lose your rankings, as search engines quickly remove sites that won’t load and there’s nothing worse for a customer than clicking a link that goes to a dead or broken website. 

Make sure you have access to a recent backup of your website and make sure you download these backups and store them securely somewhere that you have access to.

If the worst happens and either your current web host or your website provider goes out of business, it could take weeks or even months to replicate all of the data in your website with a new provider.

My web developer says my site is Open Source, so I don’t have to worry

They’re lying; just because your website is open source, doesn’t mean you don’t need backups or that you shouldn’t have access to them.

Nightmare scenario part 1: the jigsaw puzzle of death

Your website is made of CMS “X” extended by a bunch of additional components –  plugins, themes, third-party plugins – all of which have been picked by your developer and all of which have their own particular version number. In addition to this, your developer may have written some code of their own, a bespoke theme (or tweaks to a bought-in design), changes to a plugin etc.

The way your website works is a product of that particular combination of components. Precisely that combination. 

So, if your developer is planning on rebuilding your website in the event of a problem simply by downloading the same components – they haven’t thought things through. What if the components or the CMS have changed? What if they aren’t available anymore? And what about all those little tweaks and changes, probably undocumented, that the web developer made to get everything just so?

You’d have more chance of recreating George’s Marvellous Medicine than you would have of recreating your website this way.

I’m a huge Open Source advocate, but just because something is Open Source does not mean it does not need to be maintained and properly looked after.

Nightmare scenario part 2: the empty CMS

While we’re on this topic, let’s also not forget that your data and content is not open source – that’s all sitting in the database, underneath the CMS. It belongs to you and data is more valuable than gold.

So, if by some miracle your no-backups developer manages to reconstruct your website like CSI Miami at a crime scene, no backups still means that your data – blog posts, pages, products, even customers – is all gone.

If you take nothing else away from this book, take this:

Your business does not “have” data, your business is the data.

My provider says that they are “Software as a Service”

This one is slightly more difficult. In this environment, you don’t “own” the code for your website at all, all you have is a license to use a service for a specified amount of time.

Don’t get me wrong – there are lots of great “Software as a Service” website platforms out there (I founded one of them!) and there are big advantages to working with a “Software as a Service” provider. It’s simply a matter of understanding what you have and what you don’t have.

If you are using a “Software as a Service” platform, here are the things you should ensure you will be able to access:

  1. Access to all of your data – every page, post, customer, order, contact form submission… everything.
  2. Access to all of your files – images, videos, logos, PDF attachments etc.

Make sure you understand how you can access this information. Ideally, you want to be able to download this information in bulk, not one item at a time, and you want to be able to download it in a format that will be easy to import to another system.

Never underestimate the power of the humble CSV file. It’s the “spare gun in the ankle holster” of file formats. You might sneer at it, but one day it’s going to save your digital backside.

CSV (Comma Separated Value) files are your friend – every spreadsheet application in the world can read them, every database can import them, and they are pretty much “human readable” if you want to open them in a text editor/word processor.
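
To show how little is involved, here is a minimal round trip using Python’s built-in csv module: a handful of records go out to CSV text and come back again unchanged. The field names and the export_rows / import_rows helpers are made up for the illustration; a real export would come from your platform.

```python
import csv
import io


def export_rows(rows, fieldnames):
    """Write a list of dicts to CSV text, header row first."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def import_rows(csv_text):
    """Read CSV text back into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text, newline="")))
```

The csv module takes care of quoting, so a product title with a comma or a quotation mark in it survives the trip. Note that everything comes back as text, so prices and dates need converting when you import them.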

You should also make sure that your data is going to be somewhere safe if your “Software as a Service” provider goes to the wall. Often, when this happens (and it does happen often) the company gives customers a small window of opportunity to grab their data before the service shuts down for good.

Be prepared.

What you need to do, today

Every business needs to:

  1. Ensure you have access to full data and file backups.
  2. If you’re working with an open source platform, ensure you can also access code backups.
  3. If you’re working with a proprietary platform or Software as a Service platform, ensure you have an agreement in place to gain access to source code in the event that the provider goes out of business.
  4. Ensure you have documentation on how to restore backups to a fully working state.
  5. Ensure you have tested the process in the documentation.

Brief notes on software “escrow”

When it comes to source code, providers who don’t want to give you access to the code in the normal run of business may talk about a software “escrow” arrangement.

In effect, this means lodging a copy of the software with a trusted third party who will ensure it is released to you if the vendor is no longer able to provide the service and, for whatever reason, is not able to facilitate the transfer of the code themselves.

A perfectly good arrangement, just make sure you’re working with a reliable escrow provider and that there is a documented and audited process in place for the software vendor to update the copy of the software that the escrow company is holding.

Make your site secure

You can tell the difference between a secure and an insecure site by the presence (or not!) of a padlock next to the URL in your browser. When you see the padlock it means that the data that you are exchanging with the website is being encrypted. Sites that have been through a process called “Extended Validation” or “EV” have had their real-world credentials checked and will present a green background on the URL/address bar to show their credentials.

We’ve had secure sites for a long time and the technology is essential in protecting credit card and other personal data as it moves across the web. 

Google has been pushing something they call “HTTPS Everywhere” for quite a while too – the idea that every site (not page!) should use encryption at all times.

So keen are Google on this that they’ve been very open in stating that they’ve made it a “ranking signal” – something that directly affects your search engine ranking. So – even if you’re not collecting personal data, there’s a good reason to get your site secured right there.

It’s also an incredibly cheap thing to do – Google, Facebook, and others have been funding projects like www.letsencrypt.org, which issues completely free security certificates, for some time.

The process to set up a new certificate takes around fifteen minutes tops for a developer or system administrator – so there’s really no excuse for not getting this done as a matter of urgency if you currently don’t have it.
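
Free certificates are typically short-lived and need renewing regularly, so it is worth checking how long yours has left rather than waiting for the browser warning. A rough sketch using Python’s standard ssl module follows; the function names are my own, and fetch_not_after needs a live network connection to the site you point it at.

```python
import socket
import ssl
import time


def days_remaining(not_after, now=None):
    """Days until a certificate's notAfter time, e.g. 'Jan 11 00:00:00 2030 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expires - now) / 86400


def fetch_not_after(hostname, port=443, timeout=10):
    """Connect over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

Something like days_remaining(fetch_not_after("www.example.com")) run once a day, with an alert when the number drops below a threshold you are comfortable with, is usually enough.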

Getting a padlock does not mean your site is secure

Let’s be clear – just because a website has a padlock, even if it comes on a fancy green background, doesn’t mean that it is “secure”. All it means is that data you are exchanging with the website is secured whilst it is in transit across the internet. What happens to it once it arrives at the website is a whole different story.

It’s like posting a letter in a very secure envelope. It’s safe between the postbox and the destination, but once the envelope is removed the information is vulnerable again.

I’ve picked on WordPress a few times already but it is one of the worst culprits for unfixed security problems – 268 are listed on cvedetails.com as I write this.

Two. Hundred. And. Sixty. Eight.

https://www.cvedetails.com/vulnerability-list/vendor_id-2337/product_id-4096/

And that’s just the core code – it doesn’t take into account plugins and themes that you may also be using on your website.

When pages go missing – the world of 301s and 404s

Over time, your website will change. You’ll add pages and you’ll take pages away. You will move pages and you will restructure. You may even rebuild, changing out your CMS for something new and better (and hopefully not WordPress).

And that’s a problem for search engines.

Search indexes maintain a list of all the pages that they’ve ever seen – that is, of course, effectively what the index is – a massive list of pages and information on what they contain that people then search.

As a consequence, search engines are rather touchy about the topic of pages going “missing” – probably because this reflects on them more than it does on you.

Imagine opening up an old-fashioned telephone directory, looking for a plumber and, when you call, finding that the number in the directory is out of date. You’re not going to blame the plumber, are you?

The SEO implications of missing pages

When a web page can’t be found, your web server returns what is called a “404 error”. Google Search Console records the number of 404 errors your site has (i.e. the number of pages that have gone missing), and this is believed to be a key metric in terms of search engine optimisation – the lower the better.

Even if Google weren’t counting up your missing pages, there’s another good reason to redirect missing pages, and that’s to pass on the benefits of any links or general SEO “goodness” that page had accrued on to its successor.

Imagine you had spent a year building backlinks to pages all through your website. Then, you get a CMS upgrade and your URLs change. All of those backlinks are pointing to pages that don’t exist and therefore have little (or probably absolutely no) value. Your site may just be about to tank.

The UX implications of missing pages

SEO aside, a 404 error is still something you should deal with – why would you leave a visitor to your website sitting on an error page when they could be looking at your product, your story, and engaging with your brand?


What You Should Do About Missing Webpages

Here are the things you should be doing to deal with missing pages:

1. Use 301 redirects

If you change the URL of a page, make sure you create a 301 redirect pointing to the new version of that page, or a replacement, to preserve SEO value and avoid losing customers.

If you don’t know how to do this on your current website/CMS then learn – as quickly as humanly possible. It’s vital that you are able to do this. 

You will often need to create multiple redirects at once, so check to see if there is an option to import multiple redirects at once as well. It’s not vital that you can do this, but it will save you a lot of time if you can.
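
How you import redirects depends entirely on your CMS, but the idea behind a bulk import is simple enough to sketch. The example below assumes a two-column CSV with the hypothetical headers old_path and new_path, loads it into a lookup table, and decides what to send back for a given request: the page itself, a 301 to its replacement, or a 404.

```python
import csv
import io


def load_redirects(csv_text):
    """Read 'old_path,new_path' rows into a dict, rejecting obvious mistakes."""
    redirects = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        old, new = row["old_path"].strip(), row["new_path"].strip()
        if old == new:
            raise ValueError(f"{old} redirects to itself")
        redirects[old] = new
    return redirects


def resolve(path, redirects, live_paths):
    """Return (status, location) for a request path."""
    if path in live_paths:
        return 200, path
    if path in redirects:
        return 301, redirects[path]
    # No page and no replacement: an honest 404 beats a redirect to the homepage.
    return 404, None
```

Keeping the redirect list in a spreadsheet-friendly format also means it can travel with you the next time you change platform.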

2. Use 301 redirects properly

Don’t redirect everything that you can’t find to your homepage. It’s lazy and Google will record it as a “soft 404”. If you genuinely don’t have a replacement for a page that’s gone walkabout either create one or …

3. Learn how to fail gracefully

404 pages are like noses – most people have one and most people wouldn’t pick theirs if anyone else was watching.

Joking apart, the 404 page is the “last chance saloon” of retaining a visitor to your website. You’ve already let them down – the thing they wanted is no longer there – the more gracefully, and usefully, you can do this the better.

“This isn’t flying – this is falling with style”
– Buzz Lightyear

Good 404 pages have even become a category in web design all of their own, with their own awards and frequent “Top 10 404 Pages” blog articles popping up across the internet.

If you’re stuck for inspiration, here are some things you can try on your 404 page to make it a little less… useless.

Give it personality and make it your own

Make sure your 404 page actually carries your branding – there are sites out there that either use the default 404 page for the CMS or the web server that they are hosted on. Disgraceful.

At the very least, say sorry that your visitor isn’t able to find the page that they wanted. It’s not an error – you’ve dropped the ball.

Give the user a chance to find what they wanted

Your 404 page should be topped and tailed with your normal site navigation.

Think about including latest or top selling products, most popular blog posts, and prominent links to pages where visitors can contact you, engage on social media, and find any terms and conditions or policy pages that the site contains.

Over and above that, offering the customer a search option is a great way of keeping them on your website rather than sending them running back to wherever they came from.

Remember that the user has been there

A good CMS will keep its own record of 404 errors, giving you a chance to create a redirect before the search engines have the chance to notice that there is a problem. The simplest place to do this is on the 404 page itself – make sure yours either records missed opportunities or passes this information on to your analytics platform (more on those later!)
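
As an illustration of a 404 page that remembers, here is a minimal sketch written as a plain WSGI application in Python. Everything in it is an assumption for the example: the page wording, the /search link, and the in-memory missed counter, which a real site would swap for a database table or an analytics event.

```python
from collections import Counter

missed = Counter()  # in-memory for the sketch; a real site would persist this

NOT_FOUND_PAGE = """<!doctype html>
<title>Sorry, we can't find that page</title>
<h1>Sorry, we can't find that page</h1>
<p>Try the <a href="/">homepage</a> or <a href="/search">search the site</a>.</p>
"""


def not_found_app(environ, start_response):
    """WSGI app that serves the 404 page and records what was asked for."""
    path = environ.get("PATH_INFO", "")
    referrer = environ.get("HTTP_REFERER", "")
    missed[(path, referrer)] += 1
    body = NOT_FOUND_PAGE.encode("utf-8")
    # Send a real 404 status; a friendly page served with a 200 is a soft 404.
    start_response("404 Not Found", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Recording the referrer alongside the path tells you whether the broken link lives on your own site, where you can fix it, or on someone else’s, where a redirect is the better answer.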


Migrating URLs to avoid 404s

At the start of this section, I said that one of the events that can generate a lot of 404 errors is a change of CMS or website construction.

I wrote this down because it happens, not because it has to happen.

If you’re changing your CMS and your URLs are already clean, keyword rich (this is discussed in the Content section) and indexed – do not change them.

Take it from someone who has used nearly every CMS available and has written more than three of his own. There is no good technical reason that you can’t copy your URLs from one system to another. I’m not saying that there’s not a reason – you just won’t convince me it’s a good reason.
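
Before a migration goes live, it is worth checking that every old URL either still exists on the new site or has a redirect waiting for it. A small sketch of that check follows. The function name migration_gaps is my own, and it assumes you can get a list of old paths (from your sitemap or a crawl), a list of paths on the new site, and your redirect table.

```python
def migration_gaps(old_paths, new_paths, redirects):
    """Return old paths that would 404 after launch, plus redirects to missing targets."""
    new_paths = set(new_paths)
    would_404 = sorted(
        path for path in old_paths
        if path not in new_paths and path not in redirects
    )
    broken_targets = sorted(
        old for old, target in redirects.items() if target not in new_paths
    )
    return would_404, broken_targets
```

Both lists should be empty, or contain only pages you have deliberately decided to retire, before you switch over.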

When Architecture Goes Bad

If you’re going to put in the hours of toil, the pints of blood, and the gallons of tears required to create a great website, why on Earth would you accept it going offline?

Not only will your site’s rankings be impacted if your site is regularly inaccessible, but you could be offline just when a customer needs you.

Ensuring that your website is running on fast, reliable infrastructure is a must – but it’s an area that a lot of website owners overlook, opting instead to go for the cheapest possible hosting they can find.

The first little pig ran his website on an out-of-date version of WordPress hosted on a cheap shared server with no support contract, no SLA, and no backups.

Wolves ate him.

The number of businesses I’ve come across who are having problems with their websites simply because of bad infrastructure would amaze you. Although this is a technical topic, it’s actually one of the easiest things to get right.

Here are my tips for making sure that your infrastructure is giving your website the support it deserves:

Use a web host with good support

Problems happen and no technology, no hardware or software, is fault free.

“If General Motors had developed technology like Microsoft, we would all be driving cars that would crash twice a day for no reason whatsoever.”

General Motors Press Release

More than anything else, making sure that your hosting provider has a robust support process and a skilled team of system administrators and engineers is a must. 

Of course, everyone will tell you that they have these things, so you’re either going to have to dig around for information online or get a recommendation from somebody. Personally, I prefer a word-of-mouth recommendation from somebody that I trust over anonymous content on “trust” and “ratings” websites.

Status stalking

Most good hosting companies have a “status” page on their website that lets customers know if there are problems with any of their systems.

If the hosting company that you are looking at doesn’t have one – bump them off the list. If they do have one – monitor it for a few days. Check to see if there are problems and, if there are, how quickly are they resolved? How good is the communication with customers when the problem is going on?

It’s also worth looking at the hosting company’s social media, in particular their Twitter feed; this is an easy way to find more historic information on problems that have occurred. Keep in mind that some savvy companies have a separate Twitter account for support and status announcements, so make sure you’re looking at the right one if you’re doing a little research.