What is indexing in Solr?

Indexing is the process by which Solr adds the terms found in the specified files to an index.

Indexing in Solr is similar to creating the index at the back of a book, which lists the words that appear in the book and the pages where they appear. In essence, we take an inventory of the words in the book and an inventory of the pages where each word occurs.

That is, by including content in the index, we make that content available for search in Solr.

This type of index, called an inverted index, is a way of structuring the information that will be retrieved by a search engine.

What is an inverted index?

In an inverted index, the search engine extracts the index entries, or search terms, from a set of documents and records which documents contain each term.

[Figure: example of an inverted index]

In this way, when the user types a specific search term, the search engine built with Solr will indicate which documents the term appears in.
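To make the idea concrete, here is a minimal sketch of an inverted index in plain Python. It is not how Solr is implemented internally (Solr relies on Lucene and applies much richer text analysis); the function names `tokenize`, `build_inverted_index` and `lookup`, as well as the three sample documents, are invented for this illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def lookup(index, term):
    """Return the ids of the documents that contain the term."""
    return index.get(term.lower(), [])


# Hypothetical documents, invented for this illustration.
docs = {
    "doc1": "Solr is a search engine",
    "doc2": "An inverted index maps terms to documents",
    "doc3": "Solr builds an inverted index",
}

index = build_inverted_index(docs)
print(lookup(index, "solr"))      # ['doc1', 'doc3']
print(lookup(index, "inverted"))  # ['doc2', 'doc3']
```

Note that the terms are not declared anywhere in advance: they emerge from the documents themselves, and a term that appears in no document simply returns an empty list.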

In an inverted index, the index is created a posteriori, once the engine has analyzed the documents on which the search will be based.

Example of creating an inverted index with Solr: base documents

We are going to create an inverted index with Solr from the documents in the example folders that come with Solr.

We assume that Solr has already been downloaded. If not, we recommend visiting this page, where we explain how to do it.

If we open the folder solr > example > exampledocs, we will see that these files exist:

As we can see, the files are of various types: XML, JSON, Excel, etc.

These are the files on which Solr will create the inverted index.

Creating the inverted index: index folder

On this web page we detail, step by step, the instructions for creating the index from the example files that come, in this case, with Solr version 8.3.1. We encourage you to follow the example described there.

Once the indicated instructions have been followed, the following screen will appear:

On it we can see that the documents contained in the exampledocs folder have been indexed:

We also note that the indicated documents have been indexed correctly.
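As a complement, documents can also be sent to Solr over HTTP through its JSON update handler. The sketch below is not the procedure described on the linked page; it only builds the request so that its parts can be inspected. The address `http://localhost:8983/solr`, the core name `techproducts`, the helper `build_update_request` and the sample document with its field names are assumptions made for this illustration.

```python
import json
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumed default local address
CORE = "techproducts"                     # assumed core name


def build_update_request(docs, solr_url=SOLR_URL, core=CORE):
    """Build (but do not send) a POST request that indexes docs and commits."""
    url = f"{solr_url}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document; the field names are assumptions for illustration.
request = build_update_request([{"id": "demo-1", "name": "Example product"}])
print(request.full_url)

# To actually send it, with Solr running:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The `commit=true` parameter asks Solr to make the new documents searchable as soon as the request finishes.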

Inside the solr > techproducts > data folder we can check that the index folder has been created:

Thus, the base documents have been indexed and we can now search with Solr.

In this type of index, the terms that make up the index are not predetermined. On the contrary, they are derived once we have provided Solr with the base documents in which the search will take place. This characteristic distinguishes an inverted index from an index in an Access-type database, where we have to specify the fields beforehand.

To access the Solr Administration Panel, we will type, as indicated:

localhost:8983/solr

Once in the Panel, we will carry out our search.
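Searches can also be run outside the Panel by calling the core's `select` handler directly. The following sketch only builds the search URL; the helper name `build_select_url`, the core name `techproducts` and the example query `name:ipod` are assumptions for illustration, not values taken from the steps above.

```python
from urllib.parse import urlencode


def build_select_url(query, rows=10,
                     solr_url="http://localhost:8983/solr",
                     core="techproducts"):
    """Return the URL of a Solr select request for the given query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{solr_url}/{core}/select?{params}"


# Hypothetical query: documents whose "name" field contains "ipod".
url = build_select_url("name:ipod")
print(url)
# http://localhost:8983/solr/techproducts/select?q=name%3Aipod&rows=10
```

Pasting the resulting URL into a browser while Solr is running should return the matching documents, limited to the number given in `rows`.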