Open Source PDF Translation Tool (PDFMathTranslate) - Format-Preserving Multilingual PDF Translation Tool with AI Integration


PDFMathTranslate is an open-source PDF translation tool that preserves the original formatting while translating, including mathematical formulas, charts, tables of contents, and annotations. It supports multiple translation engines and can integrate with AI services such as Ollama and OpenAI. This article walks through its installation and usage.

Key Features

  • Supports multi-language translation
  • Preserves original document formatting
  • Offers both command-line and graphical interfaces
  • Supports multiple translation services (Google, DeepL, Ollama, etc.)
  • Supports deployment as a Docker container

Quick Start

1. Local Installation

Ensure your system meets the following requirements:

  • Python 3.8-3.11
  • pip package manager

Install using the following command:

   pip install pdf2zh
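
Before installing, it can help to confirm that the active interpreter falls within the supported range. The following is a minimal sketch rather than anything shipped with the project: the helper names `python_version_supported` and `install_pdf2zh_checked` are hypothetical, and the script assumes that `python3` and `pip` on your PATH refer to the same environment.

```shell
# Hypothetical helper (not part of pdf2zh): succeeds when a version string
# such as "3.10" or "3.10.4" falls within the 3.8-3.11 range stated above.
python_version_supported() {
  pv_major="${1%%.*}"        # text before the first dot
  pv_rest="${1#*.}"          # text after the first dot
  pv_minor="${pv_rest%%.*}"  # drop any patch component
  [ "$pv_major" -eq 3 ] && [ "$pv_minor" -ge 8 ] && [ "$pv_minor" -le 11 ]
}

# Hypothetical wrapper: install only when the interpreter is in range.
install_pdf2zh_checked() {
  pv_found="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  if python_version_supported "$pv_found"; then
    pip install pdf2zh
  else
    echo "Python $pv_found is outside the supported 3.8-3.11 range" >&2
    return 1
  fi
}
```

Calling `install_pdf2zh_checked` then either runs the install command shown above or stops with a message naming the detected version.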

2. Docker Deployment

If you prefer using Docker, deploy quickly with these commands:

   docker pull byaidu/pdf2zh
   docker run -d -p 7860:7860 byaidu/pdf2zh

After deployment, open the web interface at http://localhost:7860.
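
The container may need a few seconds before the interface responds. As a rough sketch, the hypothetical helper below polls the address until it answers or a retry budget runs out. It assumes `curl` is installed; the function name and the default of 30 attempts are illustrative choices, not project settings.

```shell
# Hypothetical helper: poll a URL once per second until it responds.
# $1 = URL (defaults to the address used in this article)
# $2 = maximum number of attempts (assumed default: 30)
wait_for_webui() {
  ww_url="${1:-http://localhost:7860}"
  ww_tries="${2:-30}"
  while [ "$ww_tries" -gt 0 ]; do
    # -f makes curl fail on HTTP errors; output is discarded.
    if curl -fsS -o /dev/null "$ww_url" 2>/dev/null; then
      return 0
    fi
    ww_tries=$((ww_tries - 1))
    if [ "$ww_tries" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}
```

For example, `wait_for_webui && echo "Web interface is up"` can follow the `docker run` command in a setup script.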


Usage Guide

1. Command Line Usage

Basic translation command:

   pdf2zh your_document.pdf

Specify translation languages:

   # Translate English document to Chinese
   pdf2zh your_document.pdf -li en -lo zh
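
To translate several documents, the same command can be wrapped in a loop. This is a minimal sketch using only the `-li` and `-lo` options shown above; the function name `translate_dir` and the `PDF2ZH_CMD` override are hypothetical conveniences for this example, not features of pdf2zh.

```shell
# Hypothetical helper: run pdf2zh on every PDF in a directory.
# $1 = directory, $2 = source language, $3 = target language
# PDF2ZH_CMD is an assumed override for dry runs; it defaults to the real CLI.
translate_dir() {
  td_dir="$1"
  td_in="$2"
  td_out="$3"
  for td_file in "$td_dir"/*.pdf; do
    [ -e "$td_file" ] || continue   # skip when the glob matched nothing
    # Left unquoted on purpose so an override like "echo pdf2zh" splits into words.
    ${PDF2ZH_CMD:-pdf2zh} "$td_file" -li "$td_in" -lo "$td_out"
  done
}
```

Setting `PDF2ZH_CMD="echo pdf2zh"` before calling `translate_dir ./papers en zh` prints the commands without running them, which is a cheap way to check the file list first.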

2. Graphical Interface Usage

Launch the graphical interface (built with Gradio):

   pdf2zh -i

3. Using Ollama Translation Service

To use the Ollama service, configure the following:

  1. Set the Ollama server address:
       export OLLAMA_HOST=http://your_ollama_server:11434
  2. In the graphical interface:
    • Select Ollama as the translation service
    • Enter the model name in Model ID (e.g., qwen2:7b-instruct)

Environment Variable Configuration

1. Model Storage Location

The project uses the DocLayout-YOLO-DocStructBench model from the Hugging Face model hub. To customize where the model is downloaded, set this environment variable:

   export HF_HOME=/path/to/your/models

This will download models to the specified directory instead of the default user directory.
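
If the target directory does not exist yet, it is safer to create it before pointing `HF_HOME` at it. A minimal sketch, with `use_model_dir` as a hypothetical helper name:

```shell
# Hypothetical helper: create the model directory if needed, then export
# HF_HOME as an absolute path so it stays valid after changing directories.
use_model_dir() {
  mkdir -p "$1" || return 1
  HF_HOME="$(CDPATH= cd -- "$1" && pwd)"
  export HF_HOME
}
```

For example, `use_model_dir /path/to/your/models` followed by `pdf2zh your_document.pdf` in the same shell session. To make the setting persistent, add the `export` line to your shell profile instead.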

2. Ollama Service Configuration

Set the Ollama server address:

   export OLLAMA_HOST=http://localhost:11434  # default address
   # or
   export OLLAMA_HOST=http://your_server:11434  # custom address
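
Before selecting Ollama in the interface, you may want to confirm that the server is reachable and already has the model. The sketch below is an assumption-laden example: both function names are hypothetical, it relies on `curl`, and it assumes the model name appears as a quoted string in the server's model list (`GET /api/tags`).

```shell
# Resolve the server address, falling back to the default given in this article.
ollama_base_url() {
  ob_url="${OLLAMA_HOST:-http://localhost:11434}"
  echo "${ob_url%/}"   # drop one trailing slash, if present
}

# Succeeds when the server's model list mentions the given model name.
# Fails if the server is unreachable or the name is not listed.
ollama_has_model() {
  curl -fsS "$(ollama_base_url)/api/tags" 2>/dev/null | grep -qF "\"$1\""
}
```

For example, `ollama_has_model qwen2:7b-instruct || ollama pull qwen2:7b-instruct` fetches the model only when it is missing.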

3. Other Translation Service Configuration

For other translation services, set the corresponding API keys:

   # DeepL
   export DEEPL_AUTH_KEY=your_key

   # DeepLX
   export DEEPLX_AUTH_KEY=your_key

   # Azure
   export AZURE_APIKEY=your_key

   # OpenAI
   export OPENAI_API_KEY=your_key
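
A quick way to see which of these keys are set in the current shell is sketched below. The function name `configured_services` is hypothetical; the variable names are the ones listed above.

```shell
# Print the names of the translation-service variables listed above that are
# currently set to a non-empty value, one per line.
configured_services() {
  for cs_name in DEEPL_AUTH_KEY DEEPLX_AUTH_KEY AZURE_APIKEY OPENAI_API_KEY; do
    eval "cs_value=\${$cs_name:-}"   # indirect lookup that works in POSIX sh
    if [ -n "$cs_value" ]; then
      echo "$cs_name"
    fi
  done
  return 0
}
```

Running `configured_services` prints only variable names, never the key values, so the output is safe to paste into a bug report.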