What Is Latent Semantic Indexing And Why Is It Important For Your Google Rankings?


May 12, 2017




Marketing Professionals

Latent semantic indexing might sound confusing, but it’s actually the natural evolution of the humble search engine and therefore of search engine marketing.

Let’s start with a quick history lesson. Back at the turn of the millennium, search engines were much more rudimentary than they are today. In the year 2000, Google was only two years old and still fighting to establish itself in the market, and most other search engines were using keywords and meta titles to determine which results to surface.

But it didn’t take long for Google to take over the scene with its groundbreaking PageRank system, which ranked pages according to the number and quality of their inbound links. Search has continued to evolve ever since, and Google is now widely reported to use around 200 different factors to determine its rankings.
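For readers who want to see how ranking by inbound links can work, here is a minimal sketch of the published PageRank idea. It is a simplified illustration and not Google’s production system: the four-page site, the page names and the number of iterations are all made up for the example, and the 0.85 damping value is the one commonly quoted for the original algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the pages it links out to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page shares its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank


# Hypothetical site: three pages link to "home", nothing links to "spam".
links = {
    "home": ["blog", "about"],
    "blog": ["home"],
    "about": ["home"],
    "spam": ["home"],
}
ranks = pagerank(links)
best_page = max(ranks, key=ranks.get)
worst_page = min(ranks, key=ranks.get)
```

In this toy graph the page with the most inbound links ("home") ends up with the highest score, and the page that nobody links to ("spam") ends up with the lowest.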


Semantic search was the next logical step. Loosely speaking, the term refers to the process of teaching search engines to understand the intent and the meaning behind a search term, and not just the keywords that the user enters into the search bar.

Search Engine Watch has a fantastic article that goes in depth into how semantic search works, and it even provides a useful example. Keyword search is all well and good, but it gets stuck if one word has multiple meanings. For example, searching for ‘Portishead’ could deliver results on either the band or the town in Somerset.

Semantic search engines use context to understand exactly what it is that you’re looking for. If you’ve never been to England but you like listening to music, they can guess that you’re talking about the band. If you’re searching from an IP address that’s somewhere in Somerset, they’re more likely to deliver the location.
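The Portishead example can be sketched in a few lines of code. This is only a toy illustration of using context to pick between meanings: the signal words, the `SENSES` table and the `rank_senses` function are all hypothetical, and real search engines draw on far richer signals than a handful of keywords.

```python
# Hypothetical context signals for the two meanings of "Portishead".
SENSES = {
    "Portishead (band)": {"music", "album", "trip-hop", "concert"},
    "Portishead (town in Somerset)": {"somerset", "bristol", "marina", "council"},
}


def rank_senses(context_signals):
    """Order the meanings by how many context signals each one shares with the user."""
    signals = {signal.lower() for signal in context_signals}
    scores = {sense: len(signals & vocabulary) for sense, vocabulary in SENSES.items()}
    return sorted(scores, key=scores.get, reverse=True)


# A user whose history is full of music, and a user searching from Somerset.
music_fan = rank_senses(["music", "album", "headphones"])
local_user = rank_senses(["Somerset", "bus timetable"])
```

Here `music_fan[0]` is the band and `local_user[0]` is the town, which mirrors the guess described above.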

This is semantic search at its most basic, but it’s enough to give you a general idea of how search engines react to human language – and why they’re working hard to better understand it. After all, if search engines can understand what their users are searching for, they can provide the most relevant results and stop those users from switching to a different search engine.


Latent Semantic Indexing (LSI) goes a step further than semantic search. It’s a technique that looks at which words tend to appear together across many pieces of content in order to determine the relationships between different terms and ideas, and then uses those relationships to judge what a single piece of content is really about.

For example, if a website includes ‘Harry Potter’ in the URL and in the meta title, the search engine would expect to see words like ‘wand’, ‘magic’, ‘spells’ and ‘Rowling’. As the search engine spider crawls the web, it examines each webpage to identify common words and phrases and cross-checks them against what it already knows about the page’s topic.

This is all suggested by the title of the technique. Latent refers to the fact that the contextual information is latent – it’s there, but it’s buried, existing only as potential until search engines discover it. Semantics is the scientific or linguistic study of the relationships between words and their meanings. And search engines uncover the latent semantics by creating an index.
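To make the idea less abstract, here is a minimal sketch of LSI as it is usually described in information retrieval: build a table of which words appear on which pages, then compress it with a truncated singular value decomposition so that words used in similar contexts end up close together. Everything here is an assumption made for illustration: the four-page corpus is invented, the choice of two topics is arbitrary, and the simple power iteration stands in for a proper SVD routine from a numerical library. It says nothing about how any particular search engine is implemented.

```python
import math

# Hypothetical four-page corpus: two pages about Harry Potter, two about cheese.
docs = [
    "harry potter wand magic",
    "harry potter spells magic",
    "cheese milk rennet",
    "cheese milk curd",
]
terms = sorted({word for doc in docs for word in doc.split()})

# Term-document matrix: one row per term, one column per page.
A = [[doc.split().count(term) for doc in docs] for term in terms]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def top_singular_vectors(matrix, k, iterations=200):
    """Top-k right singular vectors, found by power iteration with deflation."""
    n = len(matrix[0])
    # Gram matrix (A^T A): how strongly each pair of pages overlaps.
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)] for i in range(n)]
    vectors = []
    for _ in range(k):
        v = normalize([1.0] * n)
        for _ in range(iterations):
            v = normalize([dot(row, v) for row in gram])
        eigenvalue = dot(v, [dot(row, v) for row in gram])
        vectors.append(v)
        # Remove the topic just found so the next pass finds the next strongest.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return vectors


# Keep only the two strongest latent topics.
topics = top_singular_vectors(A, k=2)
latent = {term: [dot(row, topic) for topic in topics] for term, row in zip(terms, A)}

raw_similarity = cosine(A[terms.index("wand")], A[terms.index("spells")])
latent_similarity = cosine(latent["wand"], latent["spells"])
unrelated_similarity = cosine(latent["wand"], latent["cheese"])
```

In this corpus ‘wand’ and ‘spells’ never appear on the same page, so `raw_similarity` is zero. Once the pages are compressed into two topics, `latent_similarity` comes out close to 1 because both words keep company with ‘harry’, ‘potter’ and ‘magic’, while `unrelated_similarity` stays close to 0. That hidden relationship is the "latent" part of the name.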

Search engines’ interest in LSI came about as a response to black hat SEOs who were trying to game the system by stuffing pages full of keywords and irrelevant text to trick search engines into thinking that those pages were relevant. It’s effectively an attempt to ensure that webmasters are rewarded for the quality of their content – instead of the quantity that they create.


The interesting thing about the switch to Latent Semantic Indexing is that if you’re doing things properly, you have nothing to worry about. In fact, LSI is just one step further than semantic search – instead of trying to understand the meaning behind the search term, it focuses on the meaning of the destination pages.

It basically means that as long as your content is all above board, the search engine will be better equipped to understand it. But if you skimped on budget and hired poor-quality writers, or if you simply spun content from other sites using automated software, you’ll get caught out and potentially penalized.

It isn’t exactly clear how LSI will affect your rankings, because search engines tend to be secretive about how they work. Their software is proprietary, and their algorithms are what set them apart from their competitors. On top of that, they know that if they explain how they work, people will try to exploit it by finding workarounds.

What is clear is that spammy websites will be less likely to be seen as authoritative, especially if they publish content about a wide range of topics. If you constantly talk about marketing, for example, then search engines will learn to expect content that’s about marketing. If you suddenly release a post about cheese-making, they’ll suspect that something’s up.

On top of that, search engines will be able to tell if a page is over-optimized, and so while you should continue to use keywords in the content that you release, you should never focus solely on a single keyword. Always make sure that keywords sound natural, and look for ways to use similar, related keywords in header tags and elsewhere in your article.
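One rough way to check your own copy for over-optimization is to measure how much of the text a single word takes up. The sketch below is only a self-check under stated assumptions: the `keyword_density` and `overused` helpers, the 20% threshold and the four-letter minimum are all hypothetical choices for illustration, and no search engine publishes a cut-off like this.

```python
import re
from collections import Counter


def keyword_density(text):
    """Return each word's share of the total word count."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def overused(text, threshold=0.2, min_length=4):
    """List words that make up more than `threshold` of the text (assumed cut-off)."""
    density = keyword_density(text)
    return sorted(
        word
        for word, share in density.items()
        if len(word) >= min_length and share > threshold
    )


stuffed = "cheap shoes cheap shoes buy cheap shoes online cheap shoes"
natural = "our guide compares running shoes trainers and sneakers for every budget"

stuffed_flags = overused(stuffed)
natural_flags = overused(natural)
```

The stuffed sentence is flagged for both ‘cheap’ and ‘shoes’, while the natural sentence, which uses related words such as ‘trainers’ and ‘sneakers’, is not flagged at all.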

Effectively, it’s no longer good enough to repeatedly create similar but slightly different content to try to rank for multiple keywords. Instead, you’ll want to accept that chasing rankings for individual keywords is no longer enough on its own – and that it’s better to focus on becoming a thought leader in your niche. Use keywords to inform your content – and not to dictate it.


It’s simple, really – focus on quality content, just like the search engines. With the rise of LSI, those spammy “get rich quick” sites will disappear, and so will those repositories of random articles that publish poor-quality content for cheap link-building firms.

Simply put, it means that webmasters will be rewarded for creating something of value: a repository of interrelated content that sets the site apart as an authority on a certain subject. This means that you’ll want to adapt your content strategy accordingly – if you’re not heading in this direction already – so that you focus on creating long-term value instead of short-term wins.

Of course, both now and in the future, there will always be disreputable SEO gurus who will promise you the world but fail to deliver on it. Some companies believe that they can work around semantic search by continuing to use old-school methods, but those companies will soon disappear from the results pages and leave only the very best sites at the top of the results.

Ultimately, it falls to marketers to think and act more like publishers, which is the way that the industry has been moving for some time. Monitoring conversion rates and targeting keywords are still important – as they always will be – but conversions and keyword rankings aren’t the only metrics to keep an eye on. Track average time on site and your bounce rate as well, as both can indicate how your visitors perceive your content.
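If you want to sanity-check these two metrics yourself, the arithmetic is straightforward. The sketch below assumes the common definitions: bounce rate is the share of visits that viewed only one page, and average time on site is total time divided by the number of visits. The `Session` record and the sample figures are made up, and your analytics tool may define both metrics slightly differently.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One visit to the site (hypothetical record for this example)."""
    pages_viewed: int
    seconds_on_site: float


def bounce_rate(sessions):
    """Share of visits that viewed a single page and then left."""
    bounces = sum(1 for session in sessions if session.pages_viewed == 1)
    return bounces / len(sessions)


def average_time_on_site(sessions):
    """Mean visit length in seconds."""
    return sum(session.seconds_on_site for session in sessions) / len(sessions)


# Four made-up visits: two bounce quickly, two read several pages.
sessions = [
    Session(pages_viewed=1, seconds_on_site=8),
    Session(pages_viewed=4, seconds_on_site=240),
    Session(pages_viewed=1, seconds_on_site=12),
    Session(pages_viewed=3, seconds_on_site=140),
]

rate = bounce_rate(sessions)           # 2 of 4 visits bounced -> 0.5
avg_time = average_time_on_site(sessions)  # 400 seconds over 4 visits -> 100.0
```

A high bounce rate paired with a very short average visit is a hint that visitors aren’t finding the content they expected.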


Will you be using semantic search and latent semantic indexing to influence your content strategy? And what other trends have you got your eyes on? Let us know what you think.