Add Documents to Index for Case‑Insensitive Search in Java

When you need to add documents to index and guarantee that users can find what they’re looking for regardless of letter case, a case‑insensitive search is essential. In this guide we’ll walk through configuring GroupDocs.Search for Java so that every document you add to the index is normalized during indexing, giving you reliable, case‑insensitive results every time.

Quick Answers

  • What does “add documents to index” mean? It means feeding your source files into a searchable index so they can be queried later.
  • Why use character replacement? It normalizes text (e.g., forces everything to lower‑case) so searches ignore case differences.
  • Do I need a license? A free trial works for development; a full license is required for production.
  • Which Java version is required? Java 8 or newer; the library targets Java 11+ for best compatibility.
  • Can I search case‑sensitively if needed? Yes—search options let you toggle case‑sensitive behavior per query.

What is “add documents to index” in GroupDocs.Search?

Adding documents to an index means loading files (PDFs, Word docs, plain text, etc.) into a data structure that GroupDocs.Search can query. The library parses each file, extracts searchable text, and stores it in a way that makes look‑ups fast and efficient.

Why enable character replacement during indexing?

Character replacement transforms every character to a predefined equivalent—commonly lower‑case—while the index is being built. This ensures that a query like “Promotion” matches “promotion”, “PROMOTION”, or any mixed‑case variant without extra effort from the caller.

Prerequisites

  • GroupDocs.Search for Java version 25.4 or newer.
  • Java Development Kit (JDK) 8 or later installed.
  • Basic familiarity with Maven (or ability to add JARs manually).

Setting Up GroupDocs.Search for Java

Maven Setup

Add the GroupDocs repository and dependency to your pom.xml:

<repositories>
    <repository>
        <id>repository.groupdocs.com</id>
        <name>GroupDocs Repository</name>
        <url>https://releases.groupdocs.com/search/java/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.groupdocs</groupId>
        <artifactId>groupdocs-search</artifactId>
        <version>25.4</version>
    </dependency>
</dependencies>

Direct Download

If you prefer not to use Maven, grab the latest JAR from the official site: GroupDocs.Search for Java releases.

License Acquisition

  • Free Trial – download a trial license to start experimenting.
  • Temporary License – request an extended testing license from the GroupDocs portal.
  • Full License – purchase a production license when you’re ready to go live.

Basic Initialization (Create the index)

The following snippet creates an index folder and enables character replacements:

import com.groupdocs.search.Index;
import com.groupdocs.search.IndexSettings;

String indexFolder = "YOUR_OUTPUT_DIRECTORY/AdvancedUsage/Indexing/CharacterReplacementDuringIndexing";
IndexSettings settings = new IndexSettings();
settings.setUseCharacterReplacements(true);
Index index = new Index(indexFolder, settings);

Implementation Guide

Enable Character Replacement in Index Settings

Activating this feature tells the engine to replace characters while indexing, which is the core step for case‑insensitive behavior.

Step 1: Configure IndexSettings

import com.groupdocs.search.Index;
import com.groupdocs.search.IndexSettings;

String indexFolder = "YOUR_OUTPUT_DIRECTORY/AdvancedUsage/Indexing/CharacterReplacementDuringIndexing";

// Create an instance of IndexSettings and enable character replacement.
IndexSettings settings = new IndexSettings();
settings.setUseCharacterReplacements(true);

// Initialize the index with these settings.
Index index = new Index(indexFolder, settings);

Configure Character Replacements

Map each character to its lower‑case counterpart (or any custom mapping you need).

Step 2: Define and Add Replacement Pairs

import com.groupdocs.search.dictionaries.CharacterReplacementPair;

// Access existing replacements and clear them.
index.getDictionaries().getCharacterReplacements().clear();

// Create an array for new replacements.
CharacterReplacementPair[] characterReplacements = new CharacterReplacementPair[Character.MAX_VALUE + 1];
for (int i = 0; i < characterReplacements.length; i++) {
    char originalChar = (char) i;
    char replacementChar = Character.toLowerCase(originalChar);
    characterReplacements[i] = new CharacterReplacementPair(originalChar, replacementChar);
}

// Add these replacements to the index's dictionary.
index.getDictionaries().getCharacterReplacements().addRange(characterReplacements);

Indexing Documents

Now that the index is ready, you can add documents to index from any folder.

Step 3: Add Documents for Indexing

import com.groupdocs.search.Index;

String documentFolder = "YOUR_DOCUMENT_DIRECTORY";

// Initialize the index and add documents.
Index index = new Index(indexFolder, settings);
index.add(documentFolder);

Perform Case‑Sensitive Search (Optional)

If a particular query must respect case, you can toggle it per request.

Step 4: Execute Case‑Sensitive Searches

import com.groupdocs.search.Index;
import com.groupdocs.search.SearchOptions;
import com.groupdocs.search.results.SearchResult;

String query = "Promotion";
SearchOptions options = new SearchOptions();
options.setUseCaseSensitiveSearch(true);

// Perform the search.
Index index = new Index(indexFolder, settings);
SearchResult result = index.search(query, options);

Practical Applications

  1. Marketing Campaigns – Normalize product names so sales teams can locate assets without worrying about case.
  2. Customer Support – Power help‑desk search boxes that return the right article whether the user types “login” or “Login”.
  3. E‑commerce Catalogs – Ensure shoppers find items regardless of how they type product titles.

Performance Considerations

  • Organize Source Files – A tidy folder hierarchy speeds up the add documents to index step.
  • Monitor Memory – Large corpora can consume significant RAM; consider incremental indexing or batch processing.
  • Asynchronous Indexing – If your version of GroupDocs.Search supports it, run indexing on a background thread to keep the UI responsive.

Common Issues & Troubleshooting

SymptomLikely CauseFix
No results returned for a known termCharacter replacements not enabledVerify settings.setUseCharacterReplacements(true) and that replacements were added.
Out‑of‑memory error during indexingIndexing too many large files at onceIndex in smaller batches or increase JVM heap (-Xmx).
Search returns case‑sensitive results unexpectedlySearchOptions.setUseCaseSensitiveSearch(true) was setRemove or set to false for default case‑insensitive behavior.

Frequently Asked Questions

Q: How do I handle special characters (e.g., “é”, “ß”) during indexing?
A: Include those characters in your replacement map. You can map them to their ASCII equivalents or keep them unchanged, depending on your search requirements.

Q: Can I limit character replacement to a specific language?
A: Yes. Build a custom replacement array that only contains characters for the target language before adding it to the dictionary.

Q: What should I do if the index takes a long time to load?
A: Optimize the folder structure, remove unnecessary files, and consider persisting the index on fast SSD storage.

Q: Is it possible to revert the character replacements after indexing?
A: No. Replacements are baked into the indexed data; you must rebuild the index with new settings to change them.

Q: Where can I find more detailed API documentation?
A: The official docs and API reference provide exhaustive details (see Resources below).

Resources


Last Updated: 2026-02-03
Tested With: GroupDocs.Search 25.4 for Java
Author: GroupDocs