    refactor(para): improve paragraph splitting algorithm · 8cc76c49
    myhloli authored
    - Adjust the threshold for identifying index blocks from 3 lines to 2 lines
    - Add a new function __is_list_group to detect if a group of blocks is a list
    - Modify the paragraph merging logic to handle list groups differently
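    A minimal sketch of how a list-group check could feed into the merging step. This is not the code from this commit: the block layout (a dict with a "lines" list), the 3-line cutoff, and the names `is_list_group` and `merge_group` are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical cutoff: a block longer than this is assumed not to be a list item.
MAX_LINES_PER_LIST_ITEM = 3

Block = Dict[str, List[str]]


def is_list_group(group: List[Block], max_lines: int = MAX_LINES_PER_LIST_ITEM) -> bool:
    """Return True when every block in a non-empty group is short enough to be a list item."""
    return bool(group) and all(len(block.get("lines", [])) <= max_lines for block in group)


def merge_group(group: List[Block]) -> List[Block]:
    """Keep list items as separate blocks; merge any other group into one paragraph block."""
    if is_list_group(group):
        # List groups: copy each block unchanged so the items stay distinct.
        return [{"lines": list(block.get("lines", []))} for block in group]
    # Non-list groups: concatenate all lines into a single paragraph block.
    merged = [line for block in group for line in block.get("lines", [])]
    return [{"lines": merged}]
```

    With these assumptions, a group of short blocks is returned item by item, while a group containing a longer block collapses into one paragraph.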
para_split_v3.py 13.1 KB