@polina123/node-red-ocr-image-to-text 1.0.5
Node-RED node for text recognition from images and PDF files using EasyOCR/PaddleOCR/Tesseract. Supports 20+ languages, parallel processing (up to 10x speedup), PDF text extraction, and automatic text orientation detection.
Node-RED Image OCR (Multi-Engine)
Node-RED node for automatic text recognition from images and PDF files using multiple OCR engines: PaddleOCR (default), EasyOCR, and Tesseract. Supports 20+ languages with parallel processing for up to 10x speedup.
Features
- Three OCR Engines: PaddleOCR (recommended), EasyOCR, Tesseract
- Text Recognition from JPG, JPEG, PNG images and PDF files
- PDF Support - extract text layer or OCR scanned PDFs
- Auto-Rotate - automatic text orientation detection and correction
- Multi-language - support for 20+ languages simultaneously
- Parallel Processing - up to 8-10x speedup for multiple images
- Speed Optimization - image resizing before processing
- Flexible Configuration - choose OCR engine, languages, GPU, quality control
- Error Handling - detailed processing status information
- Performance Monitoring - processing time and statistics
- Batch Processing - multiple images at once
OCR Engine Comparison
OCR Engine | Installation | Speed* | Quality | Languages | Recommendation |
---|---|---|---|---|---|
Tesseract | System install | Fastest (0.5s) | Good | 100+ languages | Best for speed |
EasyOCR | pip install | Medium (2.3s) | Excellent | 80+ languages | Best quality |
PaddleOCR | pip install | Slower (4.2s) | Good | Multilingual | Balanced option |
*Test conditions: Single image, CPU, resize_factor 0.5, languages: en+ru
Performance
NEW: Parallel Processing
The node now supports parallel processing of multiple images simultaneously!
Speedup: up to 8-10x when processing multiple images
// Example: 10 images processed in parallel
// Sequential: 20 seconds
// Parallel (4 threads): ~5 seconds
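The arithmetic behind this example can be sketched as follows. This is an idealized model, assuming every image takes the same time and threads add no overhead; `estimateParallelSeconds` is a hypothetical helper for illustration and not part of the node.

```javascript
// Idealized estimate: work is split evenly across the threads that can be used.
// Assumes equal per-image cost and no scheduling overhead, so real runs will be slower.
function estimateParallelSeconds(sequentialSeconds, imageCount, threads) {
    if (imageCount <= 0) return 0;
    // More threads than images cannot help, and at least one thread is always used.
    const usableThreads = Math.max(1, Math.min(threads, imageCount));
    return sequentialSeconds / usableThreads;
}

const sequentialSeconds = 20; // 10 images, as in the example above
const parallelSeconds = estimateParallelSeconds(sequentialSeconds, 10, 4);
console.log(`Parallel estimate: ${parallelSeconds} s`);
console.log(`Speedup: ${sequentialSeconds / parallelSeconds}x`);
```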
Recommended Settings
Scenario | Resize Factor | GPU | Mode | Speed | Quality |
---|---|---|---|---|---|
Maximum speed | 0.3 | Yes | Parallel | Very Fast | Medium |
Balanced (recommended) | 0.5 | Yes | Parallel | Fast | Good |
High quality | 0.7 | Yes | Batch | Medium | High |
Critical quality | 1.0 | Yes | Sequential | Slow | Maximum |
Recommendation: Use resize_factor: 0.5 with parallel processing for up to 8-10x speedup.
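One way to keep these profiles in a flow is a small lookup object. The sketch below copies the values from the table; the scenario keys, the property names (`resize_factor`, `gpu`, `mode`) and the `settingsFor` helper are assumptions for illustration, not the node's documented configuration schema.

```javascript
// Values copied from the "Recommended Settings" table.
// Keys and property names are hypothetical; adapt them to your own flow.
const RECOMMENDED_SETTINGS = {
    maximum_speed:    { resize_factor: 0.3, gpu: true, mode: "parallel" },
    balanced:         { resize_factor: 0.5, gpu: true, mode: "parallel" },
    high_quality:     { resize_factor: 0.7, gpu: true, mode: "batch" },
    critical_quality: { resize_factor: 1.0, gpu: true, mode: "sequential" }
};

function settingsFor(scenario) {
    const known = Object.prototype.hasOwnProperty.call(RECOMMENDED_SETTINGS, scenario);
    // Fall back to the balanced profile, which the table marks as recommended.
    return known ? RECOMMENDED_SETTINGS[scenario] : RECOMMENDED_SETTINGS.balanced;
}

console.log(settingsFor("high_quality"));
```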
Installation
Method 1: Via Palette Manager (recommended)
- Open Node-RED
- Go to menu → Manage Palette
- In the Install tab, search for @polina123/node-red-ocr-image-to-text
- Click Install
Method 2: Via npm
cd ~/.node-red
npm install @polina123/node-red-ocr-image-to-text
Method 3: Manual installation for development
# Clone the repository
git clone https://github.com/polinaSuvorova/node_red_ocr_image_to_text.git
cd node_red_ocr_image_to_text
# Create symlink
npm link
# In Node-RED folder
cd ~/.node-red
npm link @polina123/node-red-ocr-image-to-text
Installing Python Dependencies
OCR engines will automatically install dependencies on first use. For manual installation:
# Core dependencies (required)
pip install opencv-python pillow
# PaddleOCR (recommended)
pip install paddlepaddle paddleocr
# EasyOCR
pip install easyocr
# Tesseract (requires system installation)
pip install pytesseract
# PDF Support (optional)
pip install PyPDF2 pdf2image
# Also requires poppler-utils system package
Installing PDF Support (Optional)
Python packages:
pip install PyPDF2 pdf2image
System requirements for pdf2image:
Windows:
- Download poppler from https://github.com/oschwartz10612/poppler-windows/releases/
- Extract and add to PATH
Linux (Ubuntu/Debian):
sudo apt install poppler-utils
macOS:
brew install poppler
Installing Tesseract OCR
Tesseract requires separate system installation:
Windows:
- Download from https://github.com/UB-Mannheim/tesseract/wiki
- Install and add to PATH
Linux (Ubuntu/Debian):
sudo apt install tesseract-ocr tesseract-ocr-rus tesseract-ocr-eng
macOS:
brew install tesseract
Usage
Input Data
The node expects a JSON object with an array of files:
{
  "files": [
    {
      "FILENAME": "document.jpg",
      "MIMETYPE": "image/jpeg",
      "FILEBASE64": "base64_encoded_image_data..."
    },
    {
      "FILENAME": "screenshot.png",
      "MIMETYPE": "image/png",
      "FILEBASE64": "base64_encoded_image_data..."
    },
    {
      "FILENAME": "scanned.pdf",
      "MIMETYPE": "application/pdf",
      "FILEBASE64": "base64_encoded_pdf_data..."
    }
  ]
}
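A minimal sketch of producing this structure from raw bytes, using Node.js's built-in `Buffer` for the Base64 encoding. The helper names `toOcrFile` and `buildOcrPayload` are hypothetical; only the `FILENAME`, `MIMETYPE` and `FILEBASE64` keys come from the format above.

```javascript
// Wrap raw bytes in the { FILENAME, MIMETYPE, FILEBASE64 } shape shown above.
function toOcrFile(filename, mimetype, data) {
    // Accept either a Buffer or anything Buffer.from() understands (e.g. a string).
    const buffer = Buffer.isBuffer(data) ? data : Buffer.from(data);
    return {
        FILENAME: filename,
        MIMETYPE: mimetype,
        FILEBASE64: buffer.toString("base64")
    };
}

function buildOcrPayload(files) {
    return { files: files };
}

// Placeholder bytes only; a real flow would pass the actual file contents.
const payload = buildOcrPayload([
    toOcrFile("document.jpg", "image/jpeg", Buffer.from([0xff, 0xd8, 0xff]))
]);
console.log(JSON.stringify(payload, null, 2));
```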
Output Data
Image Output:
{
  "success": true,
  "results": [
    {
      "filename": "document.jpg",
      "mimetype": "image/jpeg",
      "status": "success",
      "text": "Recognized text from image combined into one line",
      "ocr_model": "paddleocr",
      "languages": ["en", "ru"],
      "resize_factor": 0.5,
      "processing_time": 1.23,
      "rotation_applied": 3.5,
      "error": null
    }
  ],
  "processed": 1,
  "total": 1,
  "errors": 0,
  "ocr_model": "paddleocr",
  "languages": ["en", "ru"],
  "resize_factor": 0.5,
  "performance": {
    "total_time": 1.45,
    "average_time_per_image": 1.23,
    "images_per_second": 0.81
  }
}
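The sample numbers in the performance block are consistent with `images_per_second` being the reciprocal of `average_time_per_image` (1 / 1.23 ≈ 0.81). The sketch below reproduces that relationship from the per-result `processing_time` values. It is an inference from the sample only; the node's actual formula is not documented here, and `summarizePerformance` is a hypothetical helper.

```javascript
// Round to two decimals, matching the precision of the sample output.
const round2 = (x) => Math.round(x * 100) / 100;

// Assumption: the average is taken over per-result processing_time values and
// images_per_second is 1 / average. This matches the sample but is not confirmed.
function summarizePerformance(results) {
    const times = results
        .map((r) => r.processing_time)
        .filter((t) => typeof t === "number" && t > 0);
    if (times.length === 0) {
        return { average_time_per_image: 0, images_per_second: 0 };
    }
    const average = times.reduce((sum, t) => sum + t, 0) / times.length;
    return {
        average_time_per_image: round2(average),
        images_per_second: round2(1 / average)
    };
}

console.log(summarizePerformance([{ processing_time: 1.23 }]));
```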
PDF Output:
{
"success": true,
"results": [
{
"filename": "document.pdf",
"mimetype": "application/pdf",
"status": "success",
"text": "Extracted text from PDF...",
"ocr_model": "tesseract",
"languages": ["en", "ru"],
"processing_time": 2.34,
"pdf_info": {
"pages": 5,
"method": "text_layer",
"pages_processed": 5
},
"error": null
}
],
"processed": 1,
"total": 1,
"errors": 0
}
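Both output shapes share the same `results` array, so downstream code can treat images and PDFs alike. A minimal sketch, assuming only the `filename`, `status` and `text` fields shown above; `textByFilename` is a hypothetical helper.

```javascript
// Collect the recognized text of every successful result, keyed by filename.
function textByFilename(payload) {
    const texts = {};
    for (const result of payload.results || []) {
        if (result.status === "success") {
            texts[result.filename] = result.text;
        }
    }
    return texts;
}

// Trimmed-down copy of the PDF output above, plus one failed entry for contrast.
const samplePayload = {
    success: true,
    results: [
        { filename: "document.pdf", status: "success", text: "Extracted text from PDF..." },
        { filename: "blank.png", status: "no_text", text: "" }
    ]
};
console.log(textByFilename(samplePayload));
```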
Processing Statuses
Status | Description |
---|---|
success | Text successfully recognized |
no_text | No text found in image |
error | Error processing image |
unsupported_format | Unsupported image format |
no_data | No image data provided |
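A short sketch of tallying results by these statuses, for example to decide whether a batch needs attention. The status strings come from the table; `countByStatus` and the extra `unknown` bucket are assumptions for illustration.

```javascript
const KNOWN_STATUSES = ["success", "no_text", "error", "unsupported_format", "no_data"];

// Count how many results ended in each documented status.
function countByStatus(results) {
    const counts = { unknown: 0 };
    for (const status of KNOWN_STATUSES) {
        counts[status] = 0;
    }
    for (const result of results) {
        // Anything outside the documented list lands in the "unknown" bucket.
        const key = KNOWN_STATUSES.includes(result.status) ? result.status : "unknown";
        counts[key] += 1;
    }
    return counts;
}

console.log(countByStatus([
    { status: "success" },
    { status: "success" },
    { status: "no_text" }
]));
```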
Node Settings
Parameter | Description | Default |
---|---|---|
Name | Node name (optional) | - |
OCR Model | Choose OCR engine: PaddleOCR, EasyOCR, Tesseract | paddleocr |
Languages | Languages for recognition (multiple selection) | en, ru |
Supported Formats | File formats to process: JPEG/JPG, PNG, PDF | jpeg, png |
Resize Factor | Image resize coefficient (0.1-1.0) | 1.0 |
Use GPU | Use GPU for acceleration (requires CUDA) | false |
Auto-Rotate | Automatic text orientation detection and correction | false |
Processing Mode | Sequential or Parallel processing | sequential |
Max Threads | Number of parallel threads (1-32) | 4 |
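The ranges in this table can be checked before a configuration is deployed. The sketch below is illustrative only: the property names (`ocr_model`, `resize_factor`, `processing_mode`, `max_threads`) and the `easyocr` identifier are assumptions, since the table lists display names rather than configuration keys.

```javascript
// Engine identifiers: "paddleocr" and "tesseract" appear in the output examples,
// "easyocr" is assumed by analogy.
const OCR_MODELS = ["paddleocr", "easyocr", "tesseract"];
const PROCESSING_MODES = ["sequential", "parallel"];

// Returns a list of human-readable problems; an empty list means the settings pass.
function validateSettings(settings) {
    const problems = [];
    if (!OCR_MODELS.includes(settings.ocr_model)) {
        problems.push("ocr_model must be one of: " + OCR_MODELS.join(", "));
    }
    const factor = settings.resize_factor;
    if (typeof factor !== "number" || factor < 0.1 || factor > 1.0) {
        problems.push("resize_factor must be between 0.1 and 1.0");
    }
    if (!PROCESSING_MODES.includes(settings.processing_mode)) {
        problems.push("processing_mode must be sequential or parallel");
    }
    const threads = settings.max_threads;
    if (!Number.isInteger(threads) || threads < 1 || threads > 32) {
        problems.push("max_threads must be an integer between 1 and 32");
    }
    return problems;
}

// Defaults taken from the table above.
const defaults = {
    ocr_model: "paddleocr",
    resize_factor: 1.0,
    processing_mode: "sequential",
    max_threads: 4
};
console.log(validateSettings(defaults));
```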
Supported Languages
- English (en)
- Russian (ru)
- German (de)
- French (fr)
- Spanish (es)
- And many more...
Usage Examples
Example 1: Fast processing with optimal settings
// Node configuration in Node-RED
{
  "id": "ocr-node",
  "type": "ocr-engine",
  "name": "Fast Recognition",
  "ocr_model": "paddleocr",
  "languages": ["en", "ru"],
  "resize_factor": 0.5,
  "gpu": true
}
Example 2: Preparing data for processing
// JavaScript function to prepare data
const files = [
    {
        FILENAME: "image1.jpg",
        MIMETYPE: "image/jpeg",
        FILEBASE64: msg.payload.base64data1
    },
    {
        FILENAME: "image2.png",
        MIMETYPE: "image/png",
        FILEBASE64: msg.payload.base64data2
    }
];
msg.payload = { files: files };
return msg;
Example 3: Processing results with performance metrics
// After OCR node
if (msg.payload.success) {
    node.log(`OCR engine: ${msg.payload.ocr_model}`);
    node.log(`Processed ${msg.payload.processed} of ${msg.payload.total} images`);
    // The PDF Output example has no "performance" block, so guard the access
    if (msg.payload.performance) {
        node.log(`Total time: ${msg.payload.performance.total_time} sec`);
        node.log(`Speed: ${msg.payload.performance.images_per_second} img/sec`);
    }
    msg.payload.results.forEach(function(result) {
        if (result.status === 'success') {
            node.log(`Text from ${result.filename}: ${result.text.substring(0, 100)}...`);
            node.log(`Processing time: ${result.processing_time} sec`);
        } else if (result.status === 'no_text') {
            node.warn(`No text found in ${result.filename}`);
        } else {
            // Covers error, unsupported_format and no_data
            node.error(`${result.status} in ${result.filename}: ${result.error}`);
        }
    });
} else {
    node.error('OCR error: ' + msg.payload.error);
}
// Pass the message on to the next node in the flow
return msg;
Requirements
- Node.js >= 14.0.0
- Node-RED >= 2.0.0
- Python >= 3.7
- Memory: ~2GB+ RAM (for recognition models)
Supported File Formats
- JPEG/JPG (.jpg, .jpeg)
- PNG (.png)
- PDF (.pdf) - supports both text-based and scanned PDFs
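When files arrive without a MIME type, the extension list above can be mapped to the `MIMETYPE` values used in the input examples. A minimal sketch; `mimeTypeFor` is a hypothetical helper, and returning `null` for other extensions is a design choice of this example, not behavior of the node.

```javascript
const path = require("path");

// Extensions from the list above, mapped to the MIME types used in the input examples.
const MIME_BY_EXTENSION = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".pdf": "application/pdf"
};

// Returns the MIME type for a supported file name, or null when unsupported.
function mimeTypeFor(filename) {
    const extension = path.extname(filename).toLowerCase();
    return MIME_BY_EXTENSION[extension] || null;
}

console.log(mimeTypeFor("scanned.pdf"));
console.log(mimeTypeFor("notes.txt"));
```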
Troubleshooting
Error: "Module not found"
# Reinstall Python dependencies
pip install --upgrade paddleocr easyocr pytesseract opencv-python pillow
Error: "GPU not available"
# Option 1: Install CUDA to use GPU
# Option 2: Disable "Use GPU" option in node settings
Slow Performance
Recommendations:
- Use parallel processing for multiple images
- Reduce image size (resize_factor: 0.5)
- Use GPU for acceleration (requires CUDA)
- Use PaddleOCR instead of EasyOCR
- Run performance test: npm run test:async
Tesseract not found
Windows:
- Download and install from https://github.com/UB-Mannheim/tesseract/wiki
- Add the installation directory to PATH
Linux:
sudo apt install tesseract-ocr tesseract-ocr-rus tesseract-ocr-eng
macOS:
brew install tesseract
Development
Project Structure
node_red_ocr_image_to_text/
├── nodes/
│ └── ocr-image-to-text/
│ ├── node_red_ocr_image_to_text.html # UI configuration
│ ├── node_red_ocr_image_to_text.js # Node.js logic
│ ├── node_red_ocr_image_to_text.py # Python main script
│ ├── ocr_engine.py # Abstract base class
│ ├── easyocr_engine.py # EasyOCR implementation
│ ├── paddleocr_engine.py # PaddleOCR implementation
│ ├── tesseract_engine.py # Tesseract implementation
│ └── pdf_processor.py # PDF processing (NEW)
├── test/
│ ├── .mocharc.json # Mocha configuration
│ ├── sample_test_data.json # Test data
│ ├── test_ocr.py # Test script
│ ├── test_performance_comparison.py # Performance comparison
│ ├── test_async_performance.py # Parallel processing test
│ ├── test_model_switching.py # Model switching test
│ ├── test_orientation.py # Orientation detection test
│ ├── test_pdf_ocr.py # PDF processing test
│ ├── node_spec.js # Node-RED unit tests
│ └── README.md # Testing documentation
├── documens_readme_md/
│ ├── PDF_SUPPORT.md # PDF feature documentation
│ ├── README_FOR_DEVELOPERS.md # Developer guide (Russian)
│ └── PERFORMANCE.md # Performance optimization
├── logs_test/ # Test logs (in .gitignore)
│ └── performance_test_results_*.json # Performance test results
├── .npmignore # npm package exclusions
├── package.json
├── LICENSE
└── README.md
Local Development
- Clone the repository
- Install dependencies: npm install
- Create symlink: npm link
- In Node-RED folder: npm link @polina123/node-red-ocr-image-to-text
- Restart Node-RED
Testing
# Basic test (requires test/sample_test_data.json)
npm run test:python
# Performance comparison of all engines
npm run test:performance
# Parallel vs Sequential processing test
npm run test:async
# Model switching test
npm run test:switching
# Text orientation detection test
npm run test:orientation
# PDF processing test
npm run test:pdf
# Node-RED unit tests
npm test
License
MIT License - see LICENSE file for details.
Support
If you have problems or questions:
- Create an issue on GitHub
- EasyOCR Documentation: https://github.com/JaidedAI/EasyOCR
- PaddleOCR Documentation: https://github.com/PaddlePaddle/PaddleOCR
- Tesseract Documentation: https://github.com/tesseract-ocr/tesseract
Acknowledgments
- Based on EasyOCR by Jaided AI
- Uses PaddleOCR by PaddlePaddle
- Integrates Tesseract OCR by Google
Additional Information
Tesseract Architecture
pytesseract is a Python wrapper for Tesseract OCR. Tesseract is an open-source optical character recognition (OCR) engine, originally developed at HP and later developed under Google's sponsorship.
How it works:
Python code → pytesseract → Tesseract OCR (system program) → result
Two-component system:
- pytesseract (Python package): installed via pip
- tesseract (system program): installed separately in the OS
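Because the system program is installed separately, it can help to confirm that it is reachable before selecting the Tesseract engine. The sketch below uses Node.js's `child_process.spawnSync` and the `tesseract --version` command; `isCommandAvailable` is a hypothetical helper and not something the node exposes.

```javascript
const { spawnSync } = require("child_process");

// Returns true when the command can be started and exits with status 0.
function isCommandAvailable(command, args = ["--version"]) {
    const result = spawnSync(command, args, { stdio: "ignore" });
    // result.error is set (for example ENOENT) when the program is not found on PATH.
    return !result.error && result.status === 0;
}

if (isCommandAvailable("tesseract")) {
    console.log("tesseract found on PATH");
} else {
    console.log("tesseract not found on PATH; install it as described in this README");
}
```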
Author: Polina Suvorova
Repository: https://github.com/polinaSuvorova/node_red_ocr_image_to_text