Releases: oomol-lab/pdf-craft

v1.0.1

05 Dec 07:56
1bf3c6a

What's New in v1.0.1

  • Enhanced Error Handling: Added structured error types (FitzError, OCRError, InterruptedError) with detailed page and step information for better debugging
  • Improved Stability: Fixed crashes caused by PyMuPDF errors on a single page; page-level failures are now handled gracefully
  • Online Demo: Try PDF Craft directly in your browser at pdf.oomol.com without any installation
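The idea behind page-level error handling can be sketched as follows. This is an illustration only: `ConversionError`, `convert_pages`, and `demo_convert` are hypothetical stand-ins written for this example, not pdf-craft's actual classes or functions. It shows an error that carries page and step information, and a loop that records a failed page instead of aborting the whole run.

```python
class ConversionError(Exception):
    """Hypothetical stand-in for a structured error; not pdf-craft's real class."""

    def __init__(self, kind: str, page_index: int, step: str, message: str):
        super().__init__(f"{kind} on page {page_index} during {step}: {message}")
        self.kind = kind              # e.g. "FitzError" or "OCRError"
        self.page_index = page_index  # which page failed
        self.step = step              # which processing step failed


def convert_pages(pages, convert_one):
    """Convert each page, collecting failures instead of stopping at the first one."""
    results, failures = [], []
    for index, page in enumerate(pages):
        try:
            results.append(convert_one(index, page))
        except ConversionError as error:
            failures.append(error)
    return results, failures


def demo_convert(index, page):
    # Hypothetical converter that pretends page 1 cannot be read.
    if index == 1:
        raise ConversionError("FitzError", index, "render", "cannot read page")
    return page.upper()


results, failures = convert_pages(["a", "b", "c"], demo_convert)
for error in failures:
    print(error)
```

Because each failure records its page and step, a caller can report exactly where a document went wrong while still keeping the pages that converted successfully.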

What's Changed

Full Changelog: v1.0.0...v1.0.1

v1.0.0

02 Dec 06:44
bb733c3

🎉 PDF Craft v1.0.0 Official Release

PDF Craft v1.0.0 is now officially released. This version includes major architectural changes and brings significant performance improvements.

🚀 Core Changes: Fully Embracing DeepSeek OCR

The biggest change in v1.0.0 is a complete rewrite built on DeepSeek OCR, which eliminates the dependency on an LLM for text correction.

DeepSeek OCR is a powerful open-source OCR engine that supports complex content recognition (tables, formulas, images, footnotes, etc.) with excellent document structure understanding capabilities. Thanks to DeepSeek OCR, pdf-craft now offers:

  • Fully Local Processing: The entire conversion runs locally without any network requests. There is no LLM API to configure, and no risk of a conversion failing because of network issues or API outages. In the old version, a single failed LLM request would halt the entire conversion.
  • Faster Speed: v0.2.8 required multiple LLM calls for text correction; the new version uses direct OCR recognition, which is significantly faster.
  • Higher Accuracy: DeepSeek OCR excels at document structure analysis, table recognition, and formula extraction, delivering high-quality results without secondary correction.
  • Simpler API: Removed the complex LLM configuration and multi-step processing workflows. A conversion now completes with a single function call.

Additionally, v1.0.0 has fully migrated to DeepSeek OCR (MIT License), removing the previous AGPL-3.0 dependency. The entire project now uses the more permissive MIT License, making it easier for commercial use and integration!

⚠️ Important Change: CUDA Environment Required

The new version requires a CUDA environment to run, because DeepSeek OCR depends on CUDA acceleration for efficient document recognition. The old version (v0.2.8) could run in CPU-only environments by relying on an LLM, but the new version cannot run without a GPU.

If your environment doesn't support CUDA, do not upgrade to v1.0.0. Continue using v0.2.8:

pip install pdf-craft==0.2.8

For specific CUDA environment installation instructions, please refer to the Installation Guide.

🚫 When NOT to Upgrade

Continue using v0.2.8 in the following situations:

  1. No GPU or CUDA Environment: The new version requires CUDA and cannot run without a GPU
  2. Need LLM Text Correction: The new version has removed LLM correction functionality. If your use case requires secondary correction of OCR results, continue using the old version or use it in combination with epub-translator
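The two conditions above can be turned into a small pre-install check. This is a minimal sketch, not part of pdf-craft: `has_cuda_driver` and `pick_requirement` are hypothetical helper names, and looking for `nvidia-smi` on `PATH` is only a rough heuristic for "an NVIDIA driver is installed", not proof that a working CUDA environment exists.

```python
import shutil


def has_cuda_driver() -> bool:
    """Rough heuristic (assumption): NVIDIA drivers usually put nvidia-smi on PATH."""
    return shutil.which("nvidia-smi") is not None


def pick_requirement(cuda_available: bool, needs_llm_correction: bool) -> str:
    """Return the pip requirement suggested by the two rules in the release notes."""
    if not cuda_available or needs_llm_correction:
        # No GPU/CUDA, or LLM correction is still needed: stay on the old line.
        return "pdf-craft==0.2.8"
    return "pdf-craft>=1.0.0"


# Prints the requirement string to pass to `pip install`.
print(pick_requirement(has_cuda_driver(), needs_llm_correction=False))
```

Keeping the decision in a pure function, separate from the environment probe, makes it easy to swap in a stricter CUDA check later.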

🙏 Acknowledgments

Thanks to DeepSeek OCR for being open source, and to all community members who have contributed code and feedback to pdf-craft!


If you have a CUDA environment, upgrade to v1.0.0 now and experience faster, more stable, and simpler PDF conversion! 🚀

v0.2.8

26 Sep 05:40
862487b

v0.2.7

23 Jul 03:14
cd03f4b

What's Changed

Full Changelog: v0.2.5...v0.2.7

v0.2.5

12 Jul 04:22
6a5aa1a

What's Changed

  • fix(analysers): some codes are out of the lock domain by @Moskize91 in #224
  • fix(analysers): generate a huge paragraph and it will make request oversize by @Moskize91 in #228
  • feat(project): support new dependency API & update it to fix bugs by @Moskize91 in #229
  • fix(analysers): cannot report with max_count by @Moskize91 in #230
  • chore(project): upgrade to 0.2.5 by @Moskize91 in #231

Full Changelog: v0.2.4...v0.2.5

v0.2.4

11 Jul 01:30
1521bbf

What's Changed

  • fix: #209
  • fix: #216
  • fix: some of chapters cannot be generated in EPUB file

Full Changelog: v0.2.3...v0.2.4