# Bank Statement Parsing with MiniCPM-V 4.5
Recipe for extracting transactions from bank statement PDFs using vision-language AI.
## Model

- Model: MiniCPM-V 4.5 (8B parameters)
- Ollama name: `openbmb/minicpm-v4.5:q8_0`
- Quantization: Q8_0 (9.8 GB VRAM)
- Runtime: Ollama on GPU
## Image Conversion

Convert the PDF to PNG at 300 DPI for optimal OCR accuracy:

```bash
convert -density 300 -quality 100 input.pdf \
  -background white -alpha remove \
  page-%d.png
```
Parameters:

- `-density 300`: 300 DPI resolution (critical for accuracy)
- `-quality 100`: maximum quality
- `-background white -alpha remove`: flatten transparency onto a white background
- `page-%d.png`: outputs page-0.png, page-1.png, etc.
Dependencies:

```bash
# Ghostscript is ImageMagick's PDF delegate; without it, PDF input fails
apt-get install imagemagick ghostscript
```
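If the conversion is driven from Python instead of the shell, the same command can be issued via `subprocess`. A minimal sketch, assuming ImageMagick is on `PATH` (the function names here are our own, not part of any tool):

```python
import subprocess
from pathlib import Path

def build_convert_cmd(pdf_path: str, pattern: str = "page-%d.png") -> list[str]:
    """Assemble the ImageMagick command from the recipe above."""
    return ["convert", "-density", "300", "-quality", "100",
            pdf_path, "-background", "white", "-alpha", "remove", pattern]

def pdf_to_pngs(pdf_path: str, out_dir: str = ".") -> list[str]:
    """Render each PDF page to a 300 DPI PNG and return the file paths."""
    pattern = str(Path(out_dir) / "page-%d.png")
    subprocess.run(build_convert_cmd(pdf_path, pattern), check=True)
    # `convert` numbers pages from zero: page-0.png, page-1.png, ...
    return sorted(str(p) for p in Path(out_dir).glob("page-*.png"))
```

The output pattern matches the `page-0.png`/`page-1.png` filenames the API call below expects.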
## Prompt

```text
You are a bank statement parser. Extract EVERY transaction from the table.

Read the Amount column carefully:
- "- 21,47 €" means DEBIT, output as: -21.47
- "+ 1.000,00 €" means CREDIT, output as: 1000.00
- European format: comma = decimal point

For each row output: {"date":"YYYY-MM-DD","counterparty":"NAME","amount":-21.47}
Do not skip any rows. Return the complete JSON array.
```
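The amount-normalization rule the prompt describes can be mirrored in plain Python, which is handy for spot-checking the model's sign and decimal handling against the raw statement text. A minimal sketch (the helper name is ours):

```python
def parse_amount(raw: str) -> float:
    """Convert a European-format amount like '- 21,47 €' or '+ 1.000,00 €'
    into a signed float, following the same rules given to the model."""
    s = raw.replace("€", "").replace(" ", "")
    sign = -1.0 if s.startswith("-") else 1.0
    s = s.lstrip("+-")
    # European format: '.' is the thousands separator, ',' the decimal point
    s = s.replace(".", "").replace(",", ".")
    return sign * float(s)
```

For example, `parse_amount("- 21,47 €")` yields `-21.47` and `parse_amount("+ 1.000,00 €")` yields `1000.0`.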
## API Call

```python
import base64
import requests

# `prompt` holds the parser prompt from the section above
with open('page-0.png', 'rb') as f:
    page0 = base64.b64encode(f.read()).decode('utf-8')
with open('page-1.png', 'rb') as f:
    page1 = base64.b64encode(f.read()).decode('utf-8')

payload = {
    "model": "openbmb/minicpm-v4.5:q8_0",
    "prompt": prompt,
    "images": [page0, page1],  # multiple pages in a single request
    "stream": False,
    "options": {
        "num_predict": 16384,
        "temperature": 0.1
    }
}

response = requests.post(
    'http://localhost:11434/api/generate',
    json=payload,
    timeout=600
)
result = response.json()['response']
```
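Even with a JSON-only prompt, vision-language models sometimes wrap the array in a markdown fence or add a sentence around it. A small, hedged sketch for pulling the array out of `result` (the helper name is ours):

```python
import json
import re

def extract_transactions(response_text: str) -> list:
    """Pull the JSON array out of the model's reply, tolerating an
    optional ```json fence or surrounding prose."""
    match = re.search(r"\[.*\]", response_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in model output")
    return json.loads(match.group(0))
```

`json.loads` will still raise if the model truncated the array, which is itself a useful signal to retry with a higher `num_predict`.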
## Output Format

```json
[
  {"date":"2022-04-01","counterparty":"DIGITALOCEAN.COM","amount":-21.47},
  {"date":"2022-04-01","counterparty":"DIGITALOCEAN.COM","amount":-58.06},
  {"date":"2022-04-12","counterparty":"LOSSLESS GMBH","amount":1000.00}
]
```
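Before importing extracted rows into accounting software, a quick schema check catches hallucinated keys or malformed dates. A minimal sketch under the output format above (the validator is our own suggestion, not part of the recipe):

```python
import re

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")

def validate_transactions(txs):
    """Sanity-check extracted rows against the expected schema."""
    for i, tx in enumerate(txs):
        if set(tx) != {"date", "counterparty", "amount"}:
            raise ValueError(f"row {i}: unexpected keys {sorted(tx)}")
        if not DATE_RE.fullmatch(tx["date"]):
            raise ValueError(f"row {i}: date not YYYY-MM-DD: {tx['date']!r}")
        if not isinstance(tx["amount"], (int, float)):
            raise ValueError(f"row {i}: amount is not numeric")
```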
## Running the Container

GPU (recommended):

```bash
docker run -d --gpus all -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  -e MODEL_NAME="openbmb/minicpm-v4.5:q8_0" \
  ht-docker-ai:minicpm45v
```

CPU (slower):

```bash
docker run -d -p 11434:11434 \
  -v ollama-data:/root/.ollama \
  -e MODEL_NAME="openbmb/minicpm-v4.5:q4_0" \
  ht-docker-ai:minicpm45v-cpu
```
## Hardware Requirements
| Quantization | VRAM/RAM | Speed |
|---|---|---|
| Q8_0 (GPU) | 10GB | Fast |
| Q4_0 (CPU) | 8GB | Slow |
## Test Results
| Statement | Pages | Transactions | Accuracy |
|---|---|---|---|
| bunq-2022-04 | 2 | 26 | 100% |
| bunq-2021-06 | 3 | 28 | 100% |
## Tips
- DPI matters: 150 DPI causes missed rows; 300 DPI is optimal
- PNG over JPEG: PNG preserves text clarity better
- Remove alpha: Some models struggle with transparency
- Multi-page: Pass all pages in single request for context
- Temperature 0.1: Low temperature for consistent output
- European format: Explicitly explain comma=decimal in prompt