- 20 Sep, 2024 2 commits
-
-
Xiaomeng Zhao authored
fix(pdf_extract_kit): change unimernet base -> small
-
myhloli authored
-
- 19 Sep, 2024 4 commits
-
-
Xiaomeng Zhao authored
fix(pdf-extract): ensure model is set to evaluation mode before processing
-
myhloli authored
Add a `model.eval()` call to the pdf_extract_kit initialization sequence so that the model is in evaluation mode before processing. This is required for correct inference results and meaningful performance metrics when processing PDF content.
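As a rough illustration of why the mode switch matters, the sketch below uses a toy stand-in written in plain Python. It is not the pdf_extract_kit model or any real framework class: `ToyDropoutModel`, its `forward` method, and the drop probability are hypothetical names chosen for this example only.

```python
import random


class ToyDropoutModel:
    """Hypothetical stand-in for a model that contains a dropout-style layer."""

    def __init__(self, drop_prob=0.5, seed=0):
        self.training = True  # assumed default: the model starts in training mode
        self.drop_prob = drop_prob
        self._rng = random.Random(seed)

    def eval(self):
        """Switch to evaluation mode and return self so the call can be chained."""
        self.training = False
        return self

    def forward(self, values):
        if not self.training:
            # Evaluation mode: inputs pass through unchanged, so output is deterministic.
            return list(values)
        # Training mode: randomly zero some inputs and rescale the survivors.
        scale = 1.0 / (1.0 - self.drop_prob)
        return [0.0 if self._rng.random() < self.drop_prob else v * scale
                for v in values]


# Put the model in evaluation mode once, during initialization, before any processing.
model = ToyDropoutModel().eval()
inputs = [1.0, 2.0, 3.0]
first = model.forward(inputs)
second = model.forward(inputs)
```

In this toy, two forward passes over the same input agree only because `eval()` was called first; in training mode the random masking would generally make them differ.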
-
Xiaomeng Zhao authored
refactor(pdf_extract): use Image.crop directly with layout detection
-
myhloli authored
-
- 18 Sep, 2024 9 commits
-
-
Xiaomeng Zhao authored
feat(UNIPipe): change default drop_mode to NONE_WITH_REASON
-
myhloli authored
-
Xiaomeng Zhao authored
feat(ocr_mkcontent): support drop reason in none_with_reason mode
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
feat(pipeline): pass language parameter for parsing and markdown conversion
-
Xiaomeng Zhao authored
feat(gradio_app): add examples accordion to the PDF conversion interface
-
myhloli authored
-
myhloli authored
-
myhloli authored
feat(ocr_mkcontent): support drop reason in none_with_reason mode
Enable the `NONE_WITH_REASON` drop mode in `para_to_standard_format_v2` by updating the function signature to include the `drop_reason` parameter and handling it within the function logic. This enhancement allows the function to convey the reason for dropping content in the output.
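A minimal sketch of what such a signature change could look like, assuming a dictionary-based output. Only the names `NONE_WITH_REASON`, `drop_reason`, and `para_to_standard_format_v2` come from the commit message; the `DropMode` enum values, the `para_to_standard_format_sketch` function, and the output keys are hypothetical and do not reflect the project's actual code.

```python
from enum import Enum


class DropMode(str, Enum):
    # Hypothetical values; only the NONE_WITH_REASON name appears in the commit log.
    NONE = "none"
    NONE_WITH_REASON = "none_with_reason"


def para_to_standard_format_sketch(para_text, drop_mode, drop_reason=None):
    """Build an output record, attaching the drop reason when the mode asks for it."""
    content = {"type": "text", "text": para_text}
    if drop_mode == DropMode.NONE_WITH_REASON and drop_reason is not None:
        # Nothing is dropped in this mode; the reason is reported alongside the content.
        content["drop_reason"] = drop_reason
    return content


with_reason = para_to_standard_format_sketch(
    "some paragraph", DropMode.NONE_WITH_REASON, drop_reason="example reason"
)
without_reason = para_to_standard_format_sketch("some paragraph", DropMode.NONE)
```

Keeping `drop_reason` optional with a default of `None` means callers that never pass it keep working unchanged.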
-
- 13 Sep, 2024 2 commits
-
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-
- 12 Sep, 2024 20 commits
-
-
Xiaomeng Zhao authored
fix: remove useless files
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
fix: solve conflicts
-
myhloli authored
-
quyuan authored
-
Xiaomeng Zhao authored
fix: recover the lang option in tools/cli.py
-
-
quyuan authored
-
icecraft authored
-
Xiaomeng Zhao authored
fix: 1. resolve incorrect pair relation of figure and footnote, 2. re…
-
icecraft authored
fix: 1. resolve incorrect pair relation of figure and footnote, 2. resolve incorrect pair relation of table and caption #590
-
quyuan authored
-
quyuan authored
-
quyuan authored
-
quyuan authored
-
quyuan authored
-
myhloli authored
The pipeline now supports passing the language parameter to parsing functions and during markdown conversion to optimize processing based on the specified language. This enhancement allows for more accurate parsing and markdown generation, particularly when dealing with non-English content.
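The pass-through described above might be sketched as follows. Every function name here (`parse_pdf_sketch`, `to_markdown_sketch`, `run_pipeline_sketch`), the `lang` keyword, and the `"ch"` language code are assumptions made for illustration, not the project's real API.

```python
def parse_pdf_sketch(pdf_bytes, lang=None):
    """Hypothetical parsing stage that records which language it was asked to use."""
    return {"num_bytes": len(pdf_bytes), "lang": lang}


def to_markdown_sketch(parsed, lang=None):
    """Hypothetical markdown stage that receives the same language hint."""
    label = lang if lang is not None else "auto"
    return f"<!-- lang: {label} -->\n({parsed['num_bytes']} bytes parsed)"


def run_pipeline_sketch(pdf_bytes, lang=None):
    # Forward the single `lang` argument to both stages so they stay consistent.
    parsed = parse_pdf_sketch(pdf_bytes, lang=lang)
    return to_markdown_sketch(parsed, lang=lang)


markdown = run_pipeline_sketch(b"%PDF-1.7", lang="ch")
```

Accepting the language once at the pipeline entry point and forwarding it avoids the two stages disagreeing about which language is in effect.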
-
myhloli authored
Introduce an examples accordion within the Gradio application to provide users with a selection of sample PDFs for demonstration purposes. The added functionality allows users to quickly test the PDF conversion feature with various document types.
-
- 11 Sep, 2024 1 commit
-
-
Xiaomeng Zhao authored
-
- 10 Sep, 2024 2 commits
-
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-