Releases: eellak/glossAPI
Glossapi 0.1
Features
- Multi-GPU support for faster and more efficient processing.
- Text extraction with Docling from PDF and other file types.
- Greek OCR support: recognize Greek text from images and PDFs using DeepSeek OCR or RapidOCR.
- Formula recognition in LaTeX with Docling's math enhancement model or DeepSeek OCR.
- Fast CPU-only text extraction with self.batch_policy = "safe" using the pypdfium backend.
glossapi v0.0.5
Publishing through workflow
glossapi pipeline
Pipeline that extracts text from textbooks or academic papers (PDF only for now), splits it into sections, and annotates those sections (table of contents, bibliography, etc.), storing the result in a Parquet file.
glossapi pipeline
A pipeline for text extraction and annotation.
glossapi pipeline
Pipeline that extracts text from textbooks or academic papers (PDF only for now), splits it into sections, and annotates those sections (table of contents, bibliography, etc.), storing the result in a Parquet file.
v0.0.3.5.2-alpha
Update GitHub workflow to publish to TestPyPI