feat(model inference): add table recognition and conversion to LaTeX (#284)
* # add table recognition using struct-eqtable ## Changelog 31/07/20204 - Support table recognition. Table images will be converted into html. ### how to use the new feature: set the attribute 'table-mode' to 'true' in magic-pdf.json ### caution: it takes 200s to 500s to convert a single table image using cpu * # add table recognition using struct-eqtable ## Changelog 31/07/20204 - Support table recognition. Table images will be converted into LaTex. ### how to use the new feature: set the attribute 'table-mode' to 'true' in magic-pdf.json ### caution: it takes 200s to 500s to convert a single table image using cpu * # feat(model inference): add table recognition and convertion to LaTeX # What's Changed ### New Features - Add table content recognition, we use weights of [StructEqTable](https://github.com/UniModal4Reasoning/StructEqTable-Deploy) to convert table image to LaTex. ### Instruction - pip install pypandoc struct-eqtable==0.1.0 - Download [StructEqTable weights](https://huggingface.co/wanderkid/PDF-Extract-Kit/tree/main/models/TabRec) and put it under models/ directory. - Edit 'table-mode' value to turn on table recognition function which is turned off by default. - If you did not download any models before, refer to [how to download models](docs/how_to_download_models_zh_cn.md)。 * add table recognition and convertion to LaTeX * add table recognition and conversion to LaTeX * add table recognition and conversion to LaTeX * add table recognition and conversion to LaTeX --------- Co-authored-by:liukaiwen <liukaiwen@pjlab.org.cn>
Showing
... | @@ -8,4 +8,6 @@ fast-langdetect==0.2.0 | ... | @@ -8,4 +8,6 @@ fast-langdetect==0.2.0 |
wordninja>=2.0.0 | wordninja>=2.0.0 | ||
scikit-learn>=1.0.2 | scikit-learn>=1.0.2 | ||
pdfminer.six==20231228 | pdfminer.six==20231228 | ||
pypandoc | |||
struct-eqtable==0.1.0 | |||
# The requirements.txt must ensure that only necessary external dependencies are introduced. If there are new dependencies to add, please contact the project administrator. | # The requirements.txt must ensure that only necessary external dependencies are introduced. If there are new dependencies to add, please contact the project administrator. |
Please register or sign in to comment