08.06.2022 Positioning 4 min

LSI keywords: what are they and do they have a real impact on SEO?

LSI keywords are considered extremely important phrases, especially when it comes to positioning. But does Google really work this way? Find the answer in the article below.

What is LSI based on?

Latent Semantic Indexing (LSI), as the name suggests, is a “hidden” semantic indexing. It is a process of semantic analysis of a website: the idea is that search engines use it to understand in detail the content published on a specific website.

In the past, search engines placed great emphasis on the presence of specific phrases (keywords) in a text in order to determine what a website was about. For this reason, the largest possible number of occurrences of a given “cluster” of words was desirable, because it indicated the topic of the content. Today this method can safely be called out of date.
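To make the old approach concrete, here is a minimal sketch of keyword counting in Python. It is only an illustration of the idea described above, not the code of any real search engine; the function names `count_phrase` and `phrase_density` are made up for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_phrase(text, phrase):
    """Count how many times `phrase` occurs in `text` as a sequence of whole words."""
    words = tokenize(text)
    target = tokenize(phrase)
    if not target:
        return 0
    n = len(target)
    return sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)


def phrase_density(text, phrase):
    """Return the share of all words in `text` taken up by occurrences of `phrase`."""
    words = tokenize(text)
    if not words:
        return 0.0
    return count_phrase(text, phrase) * len(tokenize(phrase)) / len(words)


sample = "Buy car parts. Cheap car parts for every car."
print(count_phrase(sample, "car parts"))  # 2
```

A page stuffed with the same phrase scores high on such a measure without being any more useful to the reader, which is exactly why this signal alone stopped being enough.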

Nowadays, Google (like every other major search engine) uses a number of advanced algorithms that process natural language and whose aim is to discover the topic of a specific website. What for?

Finding information on any topic more easily

First and foremost, if websites could only be found on the basis of an exact key phrase, we would have to know that phrase precisely in order to find a specific website. Yet information is often cataloged in the search engine under words other than those that a potential customer will type to find the website.

Need an example? Let’s assume that we purchase catalysts and set these two words as the key phrase for our website. Imagine that a customer has such parts for sale but types “purchase of car parts” into the search engine instead. As a result, our website will not be shown to that customer.

This “problem” seems quite easy to solve: we only have to fill our website with as many key phrases as possible. Unfortunately, natural language is much more complex than that.

Solving the problems of synonyms, polysemous words and homonyms

Language offers many ways to express a single concept, which is both a real richness and a useful tool.

It is rich in synonyms, polysemous words and homonyms. Therefore, it is no longer enough to create a pool of available meanings for a given word in the search engine; it is also necessary to understand the broader context.

Run a simple experiment yourself and you will see that Google really can “understand” synonyms and polysemy.

Example

Type “mouse” into the search engine. In addition to results related to the computer mouse, a “see also” button may catch your eye, in case you are looking for a mouse as in the animal.
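The role of context in the “mouse” example can be sketched in a few lines of Python. This is a deliberately simplified illustration: the `SENSES` dictionary is a hypothetical, hand-written list of context words, and picking the sense with the largest word overlap is an assumption made for this example, not a description of how Google resolves ambiguous queries.

```python
import re

# Hypothetical, hand-written context words for two senses of "mouse".
SENSES = {
    "computer device": {"computer", "wireless", "usb", "click", "keyboard"},
    "animal": {"field", "rodent", "cheese", "tail", "pet"},
}


def guess_sense(query, senses=SENSES):
    """Return the sense whose context words overlap most with the query.

    Returns None when no context word appears in the query at all.
    """
    words = set(re.findall(r"[a-z]+", query.lower()))
    best, best_overlap = None, 0
    for sense, context in senses.items():
        overlap = len(words & context)
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best


print(guess_sense("wireless mouse for my computer"))  # computer device
print(guess_sense("what does a field mouse eat"))     # animal
print(guess_sense("mouse"))                           # None
```

The bare query “mouse” gives no context to work with, which is why showing both meanings is a sensible answer to it.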

Is latent semantic indexing the answer to all linguistic nuances?

As you can see, Google handles rich natural language very well. The problem is that it does not do so with LSI.

John Mueller (@JohnMu), Webmaster Trends Analyst at Google, says it directly in his Twitter post: “There’s no such thing as LSI keywords -- anyone who’s telling you otherwise is mistaken, sorry”. LSI is a real information retrieval technique, of course, but it has its origins in the 1980s, before the era of the global Internet began! Moreover, the patent for it expired in 2008. Instead, Google uses other techniques, such as so-called word co-occurrence and bipartite graph co-clustering, which are, let’s be honest, completely unclear to most of us!
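The term “word co-occurrence” sounds more mysterious than the basic idea behind it. The sketch below counts how often two words appear in the same short text and uses those counts to list related words. It is a toy illustration in Python, assuming a tiny hand-made set of documents (`SAMPLE_DOCS`) and made-up function names; it says nothing about how Google actually implements this.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny hand-made corpus, used only for illustration.
SAMPLE_DOCS = [
    "computer mouse and keyboard",
    "wireless mouse for computer",
    "field mouse animal",
]


def cooccurrence_counts(documents):
    """Count in how many documents each unordered pair of distinct words appears together."""
    counts = Counter()
    for doc in documents:
        words = sorted(set(re.findall(r"[a-z]+", doc.lower())))
        counts.update(combinations(words, 2))  # pairs come out alphabetically ordered
    return counts


def related_words(word, documents, top_n=3):
    """Return the words that most often share a document with `word`."""
    scores = Counter()
    for (first, second), n in cooccurrence_counts(documents).items():
        if first == word:
            scores[second] += n
        elif second == word:
            scores[first] += n
    return scores.most_common(top_n)


print(related_words("mouse", SAMPLE_DOCS, top_n=1))  # [('computer', 2)]
```

Even on three sentences, “computer” comes out as the word most strongly tied to “mouse”. Real systems work with far more data and would also have to deal with filler words such as “and” or “for”.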

Anyway, let’s answer an important question: regardless of whether we call it LSI or not, can the use of related words or phrases improve the performance of our website? The answer is: absolutely!

How to find LSI keywords?

Let’s start from the beginning. It is probably worth taking a look at what Google itself tells us about its search process:

“Understanding the meaning of your search terms is crucial and influences the quality of the response. (…) Besides matching keywords, algorithms try to determine whether the potential search results respond effectively to the user’s needs. When you search for ‘dogs’, you are probably not expecting to see a website where this word is repeated one hundred times. We try to guess whether a given website contains an answer to your query and is not just a copy of it. So, search engine algorithms analyze the content published on the website, for example, whether it contains photos of dogs, videos, or even a list of dog breeds. We even take into account whether the website language matches the language of the query in order to prioritize websites in the preferred language.”

Google analyzes websites in terms of how fully they develop a topic. For this purpose, it uses websites that contain semantically similar content.

Therefore, the more comprehensive the material we place on our website, the greater the chance that it will appear high in the search results for similar terms. In other words, texts that cover a semantically broad cross-section of a given issue help our website rank higher in the results. The effect may not be direct, but it definitely matters to every entrepreneur. So, how do you find related phrases?

Autocomplete suggestions

Enter your phrase in Google and see which endings autocomplete suggests for the query. These suggestions show phrases that users have already searched for, so you should treat them as a “tool” for enriching your texts.
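Once you have noted down a handful of suggestions, a short script can show which of them your text does not cover yet. The Python sketch below assumes the suggestions were copied by hand from the search box (it does not call any Google service), and `missing_phrases` is a name invented for this example.

```python
import re


def normalize(text):
    """Lowercase the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def missing_phrases(article_text, suggestions):
    """Return the suggestions that do not appear as a whole phrase in the article."""
    haystack = f" {normalize(article_text)} "
    return [s for s in suggestions if f" {normalize(s)} " not in haystack]


article = "We buy used catalysts and other car parts."
collected = ["car parts", "catalyst prices", "used catalysts"]  # copied by hand
print(missing_phrases(article, collected))  # ['catalyst prices']
```

The check is intentionally strict: it only matches whole phrases in the same word order, so “catalyst prices” is reported as missing even though “catalysts” appears in the text. Whether a missing phrase actually deserves a place in the article is still a decision for the writer.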

Databases

Large databases that gather information, such as Wikipedia or Wikidata, link to related articles. What is more, you will be able to find many synonymous expressions for your phrase there.

Knowledge graph

For some search results, Google displays a text box that contains the most important information and associations. It is also a good source of potential testimonials and related concepts for the given keywords.

Summary

Search engines are still far from understanding exactly what the user is looking for. However, newer and newer ways of dealing with this issue keep being found. Understanding the mechanisms and algorithms that make this possible will certainly benefit future content writers.

Jan Susmaga

A philologist and linguist by education, a proofreader and copywriter by current profession, and interested in a bit of everything. He enjoys writing, reading both prose and non-fiction (mainly scientific), watching horror movies, training in martial arts and trying new things.