Dedoc
This sample demonstrates the use of Dedoc in combination with LangChain as a DocumentLoader.
Overview
Dedoc is an open-source library/service that extracts texts, tables, attached files and document structure (e.g., titles, list items, etc.) from files of various formats.
Dedoc supports DOCX, XLSX, PPTX, EML, HTML, PDF, images and more.
The full list of supported formats can be found here.
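As a minimal sketch of how a file might be loaded, the snippet below picks one of the two loader classes listed in the integration table based on the file extension and returns LangChain documents. The helper names `pick_loader_name` and `load_with_dedoc` are hypothetical, and routing PDFs to `DedocPDFLoader` is an assumption made for illustration; actually running the load step assumes that `langchain_community` and the `dedoc` package are installed.

```python
from pathlib import Path


def pick_loader_name(file_path: str) -> str:
    """Return the name of the Dedoc loader class to use for this file.

    Assumption for this sketch: PDFs go to the PDF-specific loader,
    every other format goes to the generic file loader.
    """
    is_pdf = Path(file_path).suffix.lower() == ".pdf"
    return "DedocPDFLoader" if is_pdf else "DedocFileLoader"


def load_with_dedoc(file_path: str):
    """Load a file into a list of LangChain Document objects via Dedoc."""
    # Imported lazily so pick_loader_name() works even when the
    # optional dependencies are not installed.
    from langchain_community import document_loaders

    loader_cls = getattr(document_loaders, pick_loader_name(file_path))
    loader = loader_cls(file_path)
    return loader.load()
```

A call such as `docs = load_with_dedoc("./example.docx")` (the path is a placeholder) would then give a list of documents whose `page_content` holds the extracted text.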
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
DedocFileLoader | langchain_community | โ | beta | โ |
DedocPDFLoader | langchain_community | โ | beta |