    feat(pdf_parse): improve span filtering and add new block types · 149132d6
    myhloli authored
    - Refactor remove_outside_spans function to filter spans more accurately
    - Add image_footnote, index, and list block types to output file documentation
    - Update draw_span_bbox to use preproc_blocks instead of para_blocks
    - Bump version to 0.9.0
Name
config
data
dict2md
filter
integrations
layout
libs
model
para
pipe
post_proc
pre_proc
resources
rw
spark
tools
utils
__init__.py
pdf_parse_by_ocr.py
pdf_parse_by_txt.py
pdf_parse_union_core.py
pdf_parse_union_core_v2.py
user_api.py