Qin Kaijie / pdf-miner
Commit 5b2d81aa
authored Apr 18, 2024 by 许瑞
feat: support get images and tables
parent 53a63316
Showing 3 changed files with 369 additions and 26 deletions
magic_pdf/libs/boxbase.py  +40 -1
magic_pdf/libs/math.py  +5 -0
magic_pdf/model/magic_model.py  +324 -25
magic_pdf/libs/boxbase.py
from loguru import logger
import math


def _is_in_or_part_overlap(box1, box2) -> bool:
    """
...
@@ -332,3 +332,42 @@ def find_right_nearest_text_bbox(pymu_blocks, obj_bbox):
        return right_boxes[0]
    else:
        return None
def bbox_relative_pos(bbox1, bbox2):
    x1, y1, x1b, y1b = bbox1
    x2, y2, x2b, y2b = bbox2

    left = x2b < x1
    right = x1b < x2
    bottom = y2b < y1
    top = y1b < y2
    return left, right, bottom, top


def bbox_distance(bbox1, bbox2):
    def dist(point1, point2):
        return math.sqrt((point1[0] - point2[0]) ** 2 + (point1[1] - point2[1]) ** 2)

    x1, y1, x1b, y1b = bbox1
    x2, y2, x2b, y2b = bbox2

    left, right, bottom, top = bbox_relative_pos(bbox1, bbox2)

    if top and left:
        return dist((x1, y1b), (x2b, y2))
    elif left and bottom:
        return dist((x1, y1), (x2b, y2b))
    elif bottom and right:
        return dist((x1b, y1), (x2, y2b))
    elif right and top:
        return dist((x1b, y1b), (x2, y2))
    elif left:
        return x1 - x2b
    elif right:
        return x2 - x1b
    elif bottom:
        return y1 - y2b
    elif top:
        return y2 - y1b
    else:
        # rectangles intersect
        return 0
\ No newline at end of file
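For comparison, the case analysis in `bbox_distance` can be collapsed into per-axis gaps. The sketch below is not part of this commit: `rect_gap_distance` is a hypothetical name, and it assumes each box is an `(x0, y0, x1, y1)` tuple with `x0 <= x1` and `y0 <= y1`. Under that assumption it should give the same distances as `bbox_distance`, returned as floats.

```python
import math


def rect_gap_distance(bbox1, bbox2):
    """Shortest distance between two axis-aligned boxes (hypothetical helper)."""
    x1, y1, x1b, y1b = bbox1
    x2, y2, x2b, y2b = bbox2
    # Horizontal gap: positive only when one box lies entirely left or right of the other.
    dx = max(0, x1 - x2b, x2 - x1b)
    # Vertical gap: positive only when one box lies entirely above or below the other.
    dy = max(0, y1 - y2b, y2 - y1b)
    # Both gaps are zero when the boxes touch or overlap, so the distance is 0.
    return math.hypot(dx, dy)


# Made-up sample boxes: a diagonal offset, a purely horizontal offset, and an overlap.
diagonal = rect_gap_distance((0, 0, 10, 10), (13, 14, 20, 20))
side_by_side = rect_gap_distance((20, 0, 30, 10), (0, 2, 15, 8))
overlapping = rect_gap_distance((0, 0, 10, 10), (5, 5, 15, 15))
```

The diagonal case corresponds to the corner branches of `bbox_distance`, the horizontal case to its `left`/`right` branches, and the overlap to its final `else`.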
magic_pdf/libs/math.py  0 → 100644
def float_gt(a, b):
    if 0.0001 >= abs(a - b):
        return False
    return a > b
\ No newline at end of file
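A short usage sketch for `float_gt`, not part of the commit. The function is repeated so the snippet runs on its own, and the sample values are made up to illustrate the 0.0001 tolerance.

```python
def float_gt(a, b):
    # Values within 0.0001 of each other are treated as equal, so neither is "greater".
    if 0.0001 >= abs(a - b):
        return False
    return a > b


rounding_noise = float_gt(0.1 + 0.2, 0.3)   # difference is far below the tolerance
within_tolerance = float_gt(1.00005, 1.0)   # difference of about 0.00005
clearly_greater = float_gt(1.0002, 1.0)     # difference of about 0.0002
smaller = float_gt(0.9, 1.0)                # a is below b
```

Only `clearly_greater` exceeds the tolerance in the positive direction, so it is the only comparison expected to hold.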
magic_pdf/model/magic_model.py
This diff is collapsed.