Commit dd787f46 authored by myhloli

Merge remote-tracking branch 'origin/master'

parents 1e3c1ef5 30f06136
...@@ -64,10 +64,9 @@ https://github.com/opendatalab/MinerU/assets/11393164/618937cb-dc6a-4646-b433-e3 ...@@ -64,10 +64,9 @@ https://github.com/opendatalab/MinerU/assets/11393164/618937cb-dc6a-4646-b433-e3
![Flowchart](docs/images/flowchart_en.png) ![Flowchart](docs/images/flowchart_en.png)
### Submodule Repositories ### Dependency Repositories
- [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) - [PDF-Extract-Kit : A Comprehensive Toolkit for High-Quality PDF Content Extraction](https://github.com/opendatalab/PDF-Extract-Kit) 🚀🚀🚀
- A Comprehensive Toolkit for High-Quality PDF Content Extraction
## Getting Started ## Getting Started
......
...@@ -6,6 +6,9 @@ from loguru import logger ...@@ -6,6 +6,9 @@ from loguru import logger
from magic_pdf.pipe.UNIPipe import UNIPipe from magic_pdf.pipe.UNIPipe import UNIPipe
from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter
import magic_pdf.model as model_config
model_config.__use_inside_model__ = True
try: try:
current_script_dir = os.path.dirname(os.path.abspath(__file__)) current_script_dir = os.path.dirname(os.path.abspath(__file__))
demo_name = "demo1" demo_name = "demo1"
......
#### Install Git LFS ### Install Git LFS
Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system. Install it using the following command: Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system. Install it using the following command:
```bash ```bash
git lfs install git lfs install
``` ```
#### Download the Model from Hugging Face ### Download the Model from Hugging Face
To download the `PDF-Extract-Kit` model from Hugging Face, use the following command: To download the `PDF-Extract-Kit` model from Hugging Face, use the following command:
```bash ```bash
...@@ -15,13 +15,37 @@ git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit ...@@ -15,13 +15,37 @@ git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit
Ensure that Git LFS is enabled during the clone to properly download all large files. Ensure that Git LFS is enabled during the clone to properly download all large files.
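One way to check the point above after cloning is to look for files that are still Git LFS pointer stubs. The sketch below is an illustration rather than part of this commit: the helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical, and it relies only on the fact that an LFS pointer file begins with the line `version https://git-lfs.github.com/spec/v1`.

```python
from __future__ import annotations

from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file starts with the LFS pointer header (hypothetical helper)."""
    with path.open("rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(repo_dir: str) -> list[Path]:
    """List files under repo_dir that are still pointer stubs, skipping the .git directory."""
    root = Path(repo_dir)
    return sorted(
        p
        for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and is_lfs_pointer(p)
    )
```

Calling `find_lfs_pointers("PDF-Extract-Kit")` (assuming the default clone directory name) and getting an empty list suggests the large files were fetched; any paths returned are files where only the pointer was downloaded.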
Move the 'models' directory to a disk with more free space, preferably an SSD.
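The move can also be scripted. The following is a minimal sketch under stated assumptions: `move_models` is a hypothetical helper, not part of this repository, and the paths passed to it are placeholders chosen by the user. It compares the size of the directory with the free space on the target before moving.

```python
import shutil
from pathlib import Path


def move_models(src: str, dest_parent: str) -> Path:
    """Move the directory at src into dest_parent if the target has enough free space."""
    src_path = Path(src)
    dest_root = Path(dest_parent)
    # Total size in bytes of all regular files in the source directory.
    needed = sum(p.stat().st_size for p in src_path.rglob("*") if p.is_file())
    free = shutil.disk_usage(dest_root).free
    if free < needed:
        raise OSError(f"need {needed} bytes, only {free} free on {dest_root}")
    # shutil.move falls back to copy-then-delete when crossing filesystems.
    return Path(shutil.move(str(src_path), str(dest_root / src_path.name)))
```

For example, `move_models("PDF-Extract-Kit/models", "/mnt/ssd")` would return the new location of the directory; both paths here are illustrative only.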
### Download the Model from ModelScope
#### SDK Download
```bash
# First, install the ModelScope library using pip:
pip install modelscope
```
```python
# Use the following Python code to download the model using the ModelScope SDK:
from modelscope import snapshot_download
model_dir = snapshot_download('wanderkid/PDF-Extract-Kit')
```
#### Git Download
Alternatively, you can use Git to clone the model repository from ModelScope:
```bash
git clone https://www.modelscope.cn/wanderkid/PDF-Extract-Kit.git
```
Put [model files]() here:
``` ```
./ ./
├── Layout ├── Layout
│ ├── config.json │ ├── config.json
│ └── model_final.pth │ └── weights.pth
├── MFD ├── MFD
│ └── weights.pt │ └── weights.pt
├── MFR ├── MFR
......
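#### Install Git LFS ### Install Git LFS
Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system. Install it using the following command: Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system. Install it using the following command: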
```bash ```bash
git lfs install git lfs install
``` ```
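#### Download the Model from Hugging Face ### Download the Model from Hugging Face
Use the following command to download the PDF-Extract-Kit model from Hugging Face: Use the following command to download the PDF-Extract-Kit model from Hugging Face: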
```bash ```bash
...@@ -15,6 +15,29 @@ git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit ...@@ -15,6 +15,29 @@ git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit
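Ensure that Git LFS is enabled during the clone so that all large files download correctly. Ensure that Git LFS is enabled during the clone so that all large files download correctly.
### Download the Model from ModelScope
#### SDK Download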
```bash
# First, install modelscope
pip install modelscope
```
```python
# Download the model using the ModelScope SDK
from modelscope import snapshot_download
model_dir = snapshot_download('wanderkid/PDF-Extract-Kit')
```
#### Git Download
Alternatively, use git clone to download the model from ModelScope:
```bash
git clone https://www.modelscope.cn/wanderkid/PDF-Extract-Kit.git
```
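Move the 'models' directory to a disk with more free space, preferably a solid-state drive (SSD). Move the 'models' directory to a disk with more free space, preferably a solid-state drive (SSD).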
......