- 27 Sep, 2024 6 commits
-
-
myhloli authored
# Conflicts: # magic_pdf/libs/draw_bbox.py
-
myhloli authored
Introduce an additional argument `draw_bbox` in the `draw_bbox_with_number` function to enable toggling the drawing of bounding boxes on or off. When set to `False`, no bounding box will be drawn, allowing for situations where only text
-
myhloli authored
Remove debug code related to layout bbox visualization and adjust drawing functions to support optional line sorting bboxes. This change includes the removal of `draw_layout_bbox` function and updates to `draw_bbox_with_number` to support variable line width for bbox drawing.
-
myhloli authored
Add a new function `draw_line_sort_bbox` to visualize the sorting of lines on each page. This includes indexing lines and handling both text and non-text elements such as tables and images for better content organization. Also, comment out GPU-related code for flexibility and remove overlaps in bounding box detection, which improves the accuracy of layout splitting.
-
myhloli authored
refactor(pdf_parse_union): integrate LayoutLMv3 for block ordering
Replace the heuristic-based block ordering algorithm with LayoutLMv3 model predictions to improve the accuracy of block ordering on PDF pages. Additionally, refactor the span handling during block filling to ensure spans are correctly assigned.
  - Introduce LayoutLMv3ForTokenClassification from 'hantian/layoutreader' to predict block order.
  - Implement span replacement strategy to use pymu spans for non-OCR content.
  - Enhance cleanup process to free GPU memory more effectively after model use.
  - Adjust block ordering logic to use median line index for text, title, and interline equation blocks.
  - Refactor page parsing core logic for better maintainability.
BREAKING CHANGE: The integration of LayoutLMv3 changes the internal block handling and ordering mechanism, which may affect downstream systems relying on the previous implementation. Be sure to test thoroughly before deployment.
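The median-line-index ordering mentioned in this commit can be illustrated with a minimal sketch. It assumes each block carries a list of per-line reading-order indices (for example, as predicted by a layout-reading model). The `Block` class, its `line_indices` field, and both function names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Block:
    """Hypothetical stand-in for a parsed page block; not the project's real type."""
    block_type: str
    # Assumed: one predicted reading-order index per line in the block.
    line_indices: list = field(default_factory=list)


def block_sort_key(block):
    # The median is less sensitive than the min or mean to a single
    # misplaced line. Raises statistics.StatisticsError for an empty list.
    return median(block.line_indices)


def sort_blocks_by_median_line_index(blocks):
    # sorted() is stable, so blocks with equal medians keep their input order.
    return sorted(blocks, key=block_sort_key)


blocks = [
    Block("text", [4, 5, 6]),
    Block("title", [0]),
    Block("interline_equation", [3]),
    Block("text", [1, 2, 9]),  # one line predicted late; the median is still 2
]
ordered = sort_blocks_by_median_line_index(blocks)
print([b.block_type for b in ordered])
# → ['title', 'text', 'interline_equation', 'text']
```

In this sketch the block whose lines were indexed 1, 2, and 9 is still placed second, because a single outlier line does not move the median.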
-
myhloli authored
- Added CUDA cache clearing after layoutreader prediction to free up GPU memory. - Modified the bbox sorting logic to sort text and title blocks separately. - Adjusted drawing colors for better distinction in debug visualizations.
-
- 26 Sep, 2024 2 commits
-
-
myhloli authored
- Added CUDA cache clearing after layoutreader prediction to free up GPU memory. - Modified the bbox sorting logic to sort text and title blocks separately. - Adjusted drawing colors for better distinction in debug visualizations.
-
myhloli authored
Implement a new function `draw_layout_sort_bbox` in `draw_bbox.py` to visualize the layout sorting results using the `LayoutLMv3ForTokenClassification` model. This function predicts the order of layout elements and draws them in the sorted sequence on the PDF pages.
-
- 25 Sep, 2024 1 commit
-
-
myhloli authored
Implement a new function `draw_layout_sort_bbox` in `draw_bbox.py` to visualize the layout sorting results using the `LayoutLMv3ForTokenClassification` model. This function predicts the order of layout elements and draws them in the sorted sequence on the PDF pages.
-
- 20 Sep, 2024 2 commits
-
-
Xiaomeng Zhao authored
fix(pdf_extract_kit): change unimernet base -> small
-
myhloli authored
-
- 19 Sep, 2024 4 commits
-
-
Xiaomeng Zhao authored
fix(pdf-extract): ensure model is set to evaluation mode before processing
-
myhloli authored
Add model.eval() invocation to pdf_extract_kit initialization sequence to ensure the model is set to evaluation mode. This is critical for proper inference and performance metrics when processing PDF content.
-
Xiaomeng Zhao authored
refactor(pdf_extract): use Image.crop directly with layout detection
-
myhloli authored
-
- 18 Sep, 2024 9 commits
-
-
Xiaomeng Zhao authored
feat(UNIPipe): change default drop_mode to NONE_WITH_REASON
-
myhloli authored
-
Xiaomeng Zhao authored
feat(ocr_mkcontent): support drop reason in none_with_reason mode
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
feat(pipeline): pass language parameter for parsing and markdown conversion
-
Xiaomeng Zhao authored
feat(gradio_app): add examples accordion to the PDF conversion interface
-
myhloli authored
-
myhloli authored
-
myhloli authored
feat(ocr_mkcontent): support drop reason in none_with_reason mode
Enable the `NONE_WITH_REASON` drop mode in `para_to_standard_format_v2` by updating the function signature to include the `drop_reason` parameter and handling it within the function logic. This enhancement allows the function to convey the reason for dropping content in the output.
-
- 13 Sep, 2024 2 commits
-
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-
- 12 Sep, 2024 14 commits
-
-
Xiaomeng Zhao authored
fix: remove useless files
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
-
Xiaomeng Zhao authored
fix: solve conflicts
-
myhloli authored
-
quyuan authored
-
Xiaomeng Zhao authored
fix: recover the lang option in tools/cli.py
-
-
quyuan authored
-
icecraft authored
-
Xiaomeng Zhao authored
fix: 1. resolve incorrect pairing of figure and footnote, 2. re…
-
icecraft authored
fix: 1. resolve incorrect pairing of figure and footnote, 2. resolve incorrect pairing of table and caption #590
-
quyuan authored
-