JavaでAspose.OCRとGroupDocs.Parserを使用して画像からテキストを抽出する

Java アプリケーションで extract text from image ファイルを効率的に抽出したいですか? デジタル時代において、ドキュメントの写真を検索可能で編集可能なテキストに変換することは必須の機能です。このチュートリアルでは、Aspose.OCR と GroupDocs.Parser for Java を組み合わせて、スキャンしたドキュメントのテキストを確実に文字列へ変換する手順をすべて解説します。

ライブラリのセットアップから特定領域の認識までカバーし、実際のシナリオでこの統合がどのように活躍するかをご紹介します。

Quick Answers

  • What library handles OCR? Aspose.OCR provides high‑accuracy optical character recognition.
  • Which component parses the result? GroupDocs.Parser extracts structured data from the OCR output.
  • Minimum Java version? JDK 8 or later.
  • Do I need a license? A trial works for testing; a full license unlocks all features.
  • Can I process streams? Yes—both libraries support image streams for web‑based uploads.

What is “extract text from image”?

画像からテキストを抽出するとは、スキャンしたページやレシートの写真などの視覚的文字を、コードで操作・検索・保存できるプレーンテキストに変換することを意味します。OCR(Optical Character Recognition)エンジンはピクセルパターンを解析し、グリフを認識して Unicode 文字列を出力します。

Why combine Aspose.OCR with GroupDocs.Parser?

  • Accuracy: Aspose.OCR delivers industry‑leading recognition rates.
  • Flexibility: GroupDocs.Parser can handle the OCR output, detect page layouts, and return structured results such as tables or form fields.
  • Stream‑friendly: Both libraries work directly with InputStream, making them perfect for web services that receive image uploads.

Prerequisites

  • Java Development Kit: JDK 8+ installed.
  • Maven: Preferred build tool (or manual JAR handling).
  • Aspose OCR Library: Add the JAR to your project.
  • GroupDocs.Parser for Java: Include via Maven (see below) or download the JAR.
  • Basic Java knowledge: Handling streams, exceptions, and collections.

Setting Up GroupDocs.Parser for Java

Maven Setup

Add the repository and dependency to your pom.xml:

<repositories>
   <repository>
      <id>repository.groupdocs.com</id>
      <name>GroupDocs Repository</name>
      <url>https://releases.groupdocs.com/parser/java/</url>
   </repository>
</repositories>

<dependencies>
   <dependency>
      <groupId>com.groupdocs</groupId>
      <artifactId>groupdocs-parser</artifactId>
      <version>25.5</version>
   </dependency>
</dependencies>

Direct Download

If you prefer not to use Maven, grab the latest JAR from GroupDocs Releases.

License Acquisition

A valid license unlocks the full feature set for both Aspose OCR and GroupDocs.Parser. You can start with a free trial or purchase a permanent license from the vendor websites.

Basic Initialization and Setup

  1. Set the License for Aspose OCR:
    import com.aspose.ocr.License;
    
    // Initialize and set the Aspose OCR license
    License license = new License();
    license.setLicense("YOUR_LICENSE_PATH/AsposeOcrLicensePath");
    
  2. Initialize GroupDocs.Parser: Ensure the parser JAR is on the classpath; no extra code is required for basic usage.

Implementation Guide

Feature: Recognize Text from Image Stream

This method lets you feed an InputStream (e.g., an uploaded file) directly into the OCR engine and receive the recognized text.

Overview

The process converts the incoming stream to a BufferedImage, configures optional recognition areas, and calls Aspose OCR’s RecognizePage method.

Step‑by‑step Code

  1. Create the AsposeOCR instance:

    import com.aspose.ocr.AsposeOCR;
    
    AsposeOCR api = new AsposeOCR();
    
  2. Read the image stream into a BufferedImage:

    import java.awt.image.BufferedImage;
    import javax.imageio.ImageIO;
    
    BufferedImage image = ImageIO.read(imageStream);
    
  3. Configure recognition settings (optional area selection):

    import com.aspose.ocr.RecognitionSettings;
    
    RecognitionSettings settings = new RecognitionSettings();
    
    // Example: limit OCR to a specific rectangle
    if (options != null && options.getRectangle() != null) {
        ArrayList<Rectangle> areas = new ArrayList<>();
        areas.add(new Rectangle(
            (int) options.getRectangle().getLeft(),
            (int) options.getRectangle().getTop(),
            (int) options.getRectangle().getSize().getWidth(),
            (int) options.getRectangle().getSize().getHeight()));
        settings.setRecognitionAreas(areas);
    }
    
  4. Run the recognition and handle warnings:

    import com.aspose.ocr.RecognitionResult;
    
    RecognitionResult result = api.RecognizePage(image, settings);
    
    if (options != null && options.getHandler() != null) {
        options.getHandler().onWarnings(pageIndex, result.warnings);
    }
    
    return result.recognitionText;
    

Feature: Recognize Text Areas from Image Stream

When you need each block of text (e.g., separate fields on a form), enable area detection.

Overview

Setting detectAreas tells Aspose OCR to return bounding rectangles for each recognized snippet, which you can then map to your data model.

Step‑by‑step Code

  1. Enable area detection:

    RecognitionSettings settings = new RecognitionSettings();
    settings.setDetectAreas(true);
    
  2. (Optional) Define specific regions – reuse the rectangle logic from the previous section if you only care about certain parts of the image.

  3. Execute OCR and collect area information:

    import java.awt.Rectangle;
    import java.util.ArrayList;
    
    ArrayList<PageTextArea> areas = new ArrayList<>();
    for (int i = 0; i < result.recognitionAreasRectangles.size(); i++) {
        Rectangle rect = result.recognitionAreasRectangles.get(i);
        String text = result.recognitionText;
    
        areas.add(new PageTextArea(
            text,
            new Page(pageIndex, pageSize),
            new Rectangle(
                new Point(rect.getX(), rect.getY()),
                new Size(rect.getWidth(), rect.getHeight()))));
    }
    
    return areas;
    

Practical Applications

  • Document Management Systems: Index scanned PDFs so users can search the full text.
  • Automated Data Entry: Pull fields from photographed receipts or forms.
  • Content Digitization: Convert printed books or manuals into searchable e‑books.

Performance Considerations

  • Batch Processing: Group images into batches to reduce JVM overhead.
  • Image Quality: Higher DPI (300 dpi or more) dramatically improves accuracy.
  • Memory Management: Dispose of BufferedImage objects promptly, especially when processing large volumes.

Common Issues & Troubleshooting

SymptomLikely CauseFix
Garbled charactersLow‑resolution imageUse a higher‑resolution scan (≥300 dpi)
No text returnedWrong image format (e.g., CMYK)Convert to RGB before OCR
Out‑of‑memory errorsVery large imagesProcess in smaller tiles or increase heap size

Frequently Asked Questions

Q: How do I install Aspose OCR in my Maven project?
A: Add the Aspose OCR dependency to your pom.xml (see the vendor’s Maven repository) or download the JAR from the Aspose website and place it on the classpath.

Q: Can I extract text from multi‑page PDFs?
A: Yes. Convert each PDF page to an image (e.g., using Aspose.PDF) and feed the resulting streams to the OCR method described above.

Q: Does this approach work with handwritten text?
A: Aspose OCR primarily targets printed text. For handwriting, consider a dedicated handwriting‑recognition service.

Q: Is a license required for production use?
A: A trial license works for evaluation, but a full license removes watermarks and unlocks all features for commercial deployments.

Q: How can I improve accuracy for a specific language?
A: Set the language in RecognitionSettings (e.g., settings.setLanguage(Language.Spanish);) to guide the engine.

Conclusion

By combining Aspose.OCR’s powerful recognition engine with GroupDocs.Parser’s flexible parsing capabilities, you now have a robust solution to extract text from image files and convert scanned document text into structured data. Experiment with the settings, integrate the code into your service layer, and watch your document workflows become fully searchable and automated.


Last Updated: 2026-01-29
Tested With: Aspose.OCR 23.12, GroupDocs.Parser 25.5
Author: Aspose