Stop Words in Search: Add Documents to Index with GroupDocs.Search Java

If you need to add documents to an index without dropping common terms, this guide shows how to disable stop words in GroupDocs.Search for Java. With the filter off, every token (even “on”, “by”, or “the”) is indexed and searchable, so queries that depend on those words can return matches.

Quick Answers

  • What does “add documents to index” mean? It means loading your source files into a searchable index so they can be queried efficiently.
  • Why would I disable stop words? To include common words (e.g., “on”, “the”) in searches when those terms are meaningful for your domain.
  • Which library version is required? GroupDocs.Search for Java 25.4 or later.
  • Do I need a license? A free trial works for evaluation; a permanent license is required for production.
  • Can I use this in a Maven project? Yes – just add the repository and dependency shown below.

What are stop words in search and why might you want to disable them?

Stop words are frequent terms that many search engines automatically filter out to speed up queries. While this improves performance for generic web searches, it can hurt precision in specialized domains—legal contracts, e‑commerce catalogs, or technical manuals—where words like “on”, “by”, or “as” carry real meaning. Disabling stop words lets you treat every word as significant, ensuring that no relevant document is missed.
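
To make the effect of a stop-word filter concrete, here is a minimal sketch in plain Java. It is not how GroupDocs.Search tokenizes text internally; the class name, the tokenize helper, and the tiny stop-word list are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordFilterSketch {
    // Hypothetical stop-word list for illustration only;
    // it is NOT the list that GroupDocs.Search ships with.
    static final Set<String> STOP_WORDS = Set.of("on", "by", "as", "the", "a", "is");

    // Splits text into lowercase word tokens. When useStopWords is true,
    // tokens found in STOP_WORDS are dropped; when false, every token is kept.
    static List<String> tokenize(String text, boolean useStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // skip the empty string produced by leading punctuation
            }
            if (useStopWords && STOP_WORDS.contains(raw)) {
                continue; // filtered out: this token would never be searchable
            }
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String clause = "Payment is due on delivery by the buyer";
        System.out.println(tokenize(clause, true));  // [payment, due, delivery, buyer]
        System.out.println(tokenize(clause, false)); // [payment, is, due, on, delivery, by, the, buyer]
    }
}
```

With filtering on, a query for "on" has nothing to match because the token never reached the index; with filtering off, it does.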

How does adding documents to index work in GroupDocs.Search?

When you add documents, the library reads each file, tokenizes its content, and stores the tokens in an optimized data structure (the index). Once indexed, the engine answers a query by looking up its terms in the index instead of rescanning every file, which keeps searches fast even for large collections.
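
The read, tokenize, and store steps can be sketched as a toy inverted index in plain Java. This is only a conceptual model under simplifying assumptions (in-memory strings instead of files, no stop-word handling, exact single-term lookup); TinyInvertedIndex and its methods are hypothetical names and do not reflect the actual GroupDocs.Search index format.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TinyInvertedIndex {
    // Maps each token to the sorted set of document names that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Tokenizes the content and records the document under every token.
    void add(String docName, String content) {
        for (String token : content.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Looks up a single term; returns an empty set when nothing matches.
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("a.txt", "Shipped on Monday");
        index.add("b.txt", "Signed by the buyer");
        System.out.println(index.search("on")); // [a.txt]
    }
}
```

A lookup touches only the entry for the queried term, which is why searching an index scales better than scanning the source documents.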

Prerequisites

  • Required Libraries: GroupDocs.Search for Java 25.4 (or newer).
  • Development Environment: IntelliJ IDEA, Eclipse, or any Java IDE you prefer.
  • Basic Knowledge: Familiarity with Java syntax and the concept of indexing.

Setting Up GroupDocs.Search for Java

Maven Installation

If you’re using Maven, include the following in your pom.xml:

<repositories>
    <repository>
        <id>repository.groupdocs.com</id>
        <name>GroupDocs Repository</name>
        <url>https://releases.groupdocs.com/search/java/</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.groupdocs</groupId>
        <artifactId>groupdocs-search</artifactId>
        <version>25.4</version>
    </dependency>
</dependencies>

Direct Download

Alternatively, download the latest version from GroupDocs.Search for Java releases.

License Acquisition Steps

  • Free Trial – start testing right away.
  • Temporary License – obtain a time‑limited key for full functionality.
  • Purchase – secure a permanent license for production use.

Basic Initialization and Setup

Create an instance of IndexSettings to control how the index behaves:

import com.groupdocs.search.IndexSettings;

// Create an instance of IndexSettings
IndexSettings settings = new IndexSettings();

How to disable stop words in search (Java)

The following line turns off the built‑in stop‑word filter:

// Disable the use of stop words
settings.setUseStopWords(false);

Parameters: setUseStopWords accepts a boolean; false turns the stop-word filter off, true (the default) leaves it on.
Purpose: With the filter off, every word, including common stop words, is indexed and searchable.

How to add documents to index

Defining the Output Directory

import com.groupdocs.search.Index;

// Define the path to the output directory for indexing
String indexFolder = "YOUR_OUTPUT_DIRECTORY\\IndexingWithStopWords";

// Create an index at the specified location with the configured settings
Index index = new Index(indexFolder, settings);

Specifying the Document Directory

// Define the path to your document directory
String documentsFolder = "YOUR_DOCUMENT_DIRECTORY";

// Add all documents in the specified folder to the index
index.add(documentsFolder);

Now every file in YOUR_DOCUMENT_DIRECTORY has been added to the index and is ready for querying.

Performing a Search Query

import com.groupdocs.search.results.SearchResult;

// Define your search query
String query = "on";

// Perform the search operation using the index and the specified query
SearchResult result = index.search(query);

Because stop words are disabled, the term "on" will be considered during the search, returning matches that would otherwise be ignored.

Practical Applications

  1. Enterprise Document Search – Ensure critical terminology isn’t filtered out.
  2. E‑commerce Platforms – Improve product discovery by indexing every word in product descriptions.
  3. Legal Research Tools – Capture every legal term, even those commonly treated as stop words.

Performance Considerations

  • Optimization Tips: Regularly update and prune your index to keep search speed high.
  • Resource Usage: Monitor JVM heap size; large indexes may require tuning of garbage collection settings.
  • Java Memory Management: Use efficient data structures and consider off‑heap storage for very large corpora.
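
One simple way to keep an eye on heap usage while indexing is to log it between steps. The sketch below uses only the standard java.lang.Runtime API; HeapReport and its method names are hypothetical helpers, and the figures are approximate because they depend on when the garbage collector last ran.

```java
class HeapReport {
    private static final long MB = 1024L * 1024L;

    // Approximate heap currently in use by the JVM, in megabytes.
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    // Upper bound the JVM will try to use for the heap (controlled by -Xmx), in megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    public static void main(String[] args) {
        System.out.println("Heap used: " + usedHeapMb() + " MB of " + maxHeapMb() + " MB");
    }
}
```

Calling usedHeapMb() before and after a large indexing step gives a rough idea of whether the -Xmx setting leaves enough headroom.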

Common Issues and Solutions

| Symptom | Likely Cause | Fix |
| No results for common words | setUseStopWords(true) (default) | Call setUseStopWords(false) as shown above. |
| Out‑of‑memory errors during indexing | Indexing too many large files at once | Index files in batches; increase the -Xmx JVM option. |
| Search returns stale data | Index not refreshed after adding new files | Call index.update() or re‑add the changed documents. |
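
For the batching fix, one possible approach is to split the list of file paths into fixed-size chunks and index one chunk at a time. The helper below is a plain-Java sketch; BatchSketch, partition, and the batch size are hypothetical, and how each batch is handed to the index (for example, one add call per path) is an assumption you should check against the GroupDocs.Search API reference.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSketch {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sublist so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("a.pdf", "b.docx", "c.txt", "d.pdf", "e.txt");
        for (List<String> batch : partition(paths, 2)) {
            // Placeholder: pass each path in this batch to the index here.
            System.out.println("Indexing batch: " + batch);
        }
    }
}
```

Smaller batches lower peak memory use at the cost of more indexing calls, so the right size depends on your file sizes and heap limit.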

Frequently Asked Questions

Q: What are stop words?
A: Stop words are common terms (e.g., “the”, “is”, “on”) that many search engines ignore to speed up queries. Disabling them lets you treat every token as searchable.

Q: Why disable stop words in search indexes?
A: When exact phrase matching is required—such as in legal or technical documents—every word carries meaning, so you need to include stop words.

Q: How does GroupDocs.Search handle large datasets?
A: The library uses optimized data structures and incremental indexing to keep memory usage low, even with millions of documents.

Q: Can I integrate GroupDocs.Search with other Java applications?
A: Yes, the API is designed for easy embedding into any Java‑based system, from web services to desktop apps.

Q: What should I do if my search results are not accurate?
A: Verify that every required document has been added to the index, make sure stop‑word filtering is disabled if common words matter for your queries, and consider rebuilding the index after major changes.

By following this guide, you now know how to add documents to index and disable stop words in search to deliver more accurate results in your Java applications.


Last Updated: 2026-02-19
Tested With: GroupDocs.Search for Java 25.4
Author: GroupDocs