Commit dd787f46 authored by myhloli

Merge remote-tracking branch 'origin/master'

parents 1e3c1ef5 30f06136
......@@ -64,10 +64,9 @@ https://github.com/opendatalab/MinerU/assets/11393164/618937cb-dc6a-4646-b433-e3
![Flowchart](docs/images/flowchart_en.png)
### Submodule Repositories
### Dependency repositories
- [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit)
- A Comprehensive Toolkit for High-Quality PDF Content Extraction
- [PDF-Extract-Kit : A Comprehensive Toolkit for High-Quality PDF Content Extraction](https://github.com/opendatalab/PDF-Extract-Kit) 🚀🚀🚀
## Getting Started
......
......@@ -6,6 +6,9 @@ from loguru import logger
from magic_pdf.pipe.UNIPipe import UNIPipe
from magic_pdf.rw.DiskReaderWriter import DiskReaderWriter
import magic_pdf.model as model_config
model_config.__use_inside_model__ = True
try:
current_script_dir = os.path.dirname(os.path.abspath(__file__))
demo_name = "demo1"
......
#### Install Git LFS
### Install Git LFS
Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system, then set it up for your user account with the following command:
```bash
git lfs install
```
#### Download the Model from Hugging Face
### Download the Model from Hugging Face
To download the `PDF-Extract-Kit` model from Hugging Face, use the following command:
```bash
git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit
```
......@@ -15,13 +15,37 @@
Ensure that Git LFS is enabled during the clone to properly download all large files.
Move the 'models' directory to a disk with more free space, preferably an SSD.
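Relocating the 'models' directory can be scripted. The following is a minimal sketch, not part of the repository: the helper names `dir_size_bytes` and `move_models` are hypothetical, and it assumes you want the move to be refused when the destination disk lacks room for the full directory.

```python
import shutil
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Return the total size of all regular files under `path`."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def move_models(src: Path, dest_parent: Path) -> Path:
    """Move `src` into `dest_parent`, checking free space first.

    Both names are illustrative helpers, not part of any library.
    """
    dest_parent.mkdir(parents=True, exist_ok=True)
    needed = dir_size_bytes(src)
    free = shutil.disk_usage(dest_parent).free
    if free < needed:
        raise OSError(f"need {needed} bytes, only {free} free on {dest_parent}")
    target = dest_parent / src.name
    if target.exists():
        # Refuse to overwrite or merge into an existing directory.
        raise FileExistsError(str(target))
    shutil.move(str(src), str(target))
    return target
```

A call might look like `move_models(Path("PDF-Extract-Kit/models"), Path("/mnt/ssd"))`, where both paths are placeholders for your own layout.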
### Download the Model from ModelScope
#### SDK Download
```bash
# First, install the ModelScope library using pip:
pip install modelscope
```
```python
# Use the following Python code to download the model using the ModelScope SDK:
from modelscope import snapshot_download
model_dir = snapshot_download('wanderkid/PDF-Extract-Kit')
```
#### Git Download
Alternatively, you can use Git to clone the model repository from ModelScope:
```bash
git clone https://www.modelscope.cn/wanderkid/PDF-Extract-Kit.git
```
Put [model files]() here:
```
./
├── Layout
│ ├── config.json
│ └── model_final.pth
│ └── weights.pth
├── MFD
│ └── weights.pt
├── MFR
......
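#### Install Git LFS
### Install Git LFS
Before you begin, make sure Git Large File Storage (Git LFS) is installed on your system, then set it up for your user account with the following command: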
```bash
git lfs install
```
#### Download the Model from Hugging Face
### Download the Model from Hugging Face
Use the following command to download the PDF-Extract-Kit model from Hugging Face:
```bash
git lfs clone https://huggingface.co/wanderkid/PDF-Extract-Kit
```
......@@ -15,6 +15,29 @@
Ensure that Git LFS is enabled during the clone so that all large files are downloaded correctly.
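If Git LFS was not active during the clone, large files are left behind as small text pointer stubs whose first line is `version https://git-lfs.github.com/spec/v1`. The sketch below scans a cloned directory for such stubs; the function name `find_lfs_pointers` is hypothetical, and the check relies only on that pointer prefix.

```python
from pathlib import Path

# Every Git LFS pointer file begins with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(root: Path) -> list[Path]:
    """Return files under `root` that are still LFS pointer stubs."""
    stubs = []
    for path in sorted(root.rglob("*")):
        # Skip Git's own metadata and anything that is not a regular file.
        if ".git" in path.relative_to(root).parts or not path.is_file():
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                stubs.append(path)
    return stubs
```

An empty result suggests the large files were fetched; any listed path is a file whose real content still needs to be downloaded, for example by running `git lfs pull` inside the clone.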
### Download the Model from ModelScope
#### SDK Download
```bash
# First, install modelscope
pip install modelscope
```
```python
# Download the model using the ModelScope SDK
from modelscope import snapshot_download
model_dir = snapshot_download('wanderkid/PDF-Extract-Kit')
```
#### Git Download
You can also use git clone to download the model from ModelScope:
```bash
git clone https://www.modelscope.cn/wanderkid/PDF-Extract-Kit.git
```
Move the 'models' directory to a disk with more free space, preferably a solid-state drive (SSD).
......