    refactor(para): improve paragraph splitting algorithm · 8cc76c49
    myhloli authored
    - Adjust the threshold for identifying index blocks from 3 lines to 2 lines
    - Add a new function __is_list_group to detect if a group of blocks is a list
    - Modify the paragraph merging logic to handle list groups differently
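
    The list-group check named above might look roughly like the sketch below. This is an illustration only, not the code from this commit: the function name `is_list_group`, the block layout (a dict with a "lines" key), and the `max_lines` default of 3 are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical block shape for this sketch: {"lines": [...]}.
Block = Dict[str, list]


def is_list_group(blocks: List[Block], max_lines: int = 3) -> bool:
    """Assumed heuristic: a non-empty group is a list if every block is short.

    `max_lines` is an illustrative cutoff, not a value taken from the commit.
    """
    if not blocks:
        return False
    return all(len(block.get("lines", [])) <= max_lines for block in blocks)


# Two short blocks, as a bulleted or numbered list would typically produce.
short_items = [
    {"lines": ["1. Install the package"]},
    {"lines": ["2. Configure", "the parser"]},
]

# One long block, as ordinary running text would produce.
prose = [
    {"lines": ["line a", "line b", "line c", "line d"]},
    {"lines": ["line e"]},
]

print(is_list_group(short_items))
print(is_list_group(prose))
```

    A caller could branch on this result during paragraph merging, for example by keeping the blocks of a list group separate instead of joining them into one paragraph. How the commit actually treats list groups is not stated here.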
Name
__init__.py
block_continuation_processor.py
block_termination_processor.py
commons.py
denoise.py
draw.py
exceptions.py
layout_match_processor.py
para_pipeline.py
para_split.py
para_split_v2.py
para_split_v3.py
raw_processor.py
stats.py
title_processor.py