    refactor(para): improve paragraph splitting algorithm · 8cc76c49
    myhloli authored
    - Adjust the threshold for identifying index blocks from 3 lines to 2 lines
    - Add a new function __is_list_group to detect if a group of blocks is a list
    - Modify the paragraph merging logic to handle list groups differently
Name
.github
demo
docs
magic_pdf
old_docs
projects
signatures/version1
tests
.gitignore
.pre-commit-config.yaml
.readthedocs.yaml
Dockerfile
LICENSE.md
MinerU_CLA.md
README.md
README_ja-JP.md
README_zh-CN.md
magic-pdf.template.json
requirements-docker.txt
requirements-qa.txt
requirements.txt
setup.py
update_version.py