llama-index readers pdf_table integration

PDF Table Loader

This loader extracts the tables contained in a PDF file.

You pass in the PDF file and the pages you want to extract tables from, and the loader returns the tables found on those pages.

Usage

Here's an example usage of the PDFTableReader. The pages parameter follows the same format as camelot's pages option, so you can use patterns such as all, 1,2,3, and 10-20.

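from pathlib import Path

# Import path of the llama-index-readers-pdf-table package (replaces the older llama_hub path).
from llama_index.readers.pdf_table import PDFTableReader

reader = PDFTableReader()
pdf_path = Path("/path/to/pdf")
# Extract the tables found on pages 80 through 90.
documents = reader.load_data(file=pdf_path, pages="80-90")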
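
Since pages is passed as a plain string, it can help to build that string programmatically when the page list comes from elsewhere. The following is a minimal sketch under that assumption; build_pages_string is a hypothetical helper introduced here for illustration and is not part of this loader or of camelot.

```python
from typing import Iterable, Tuple, Union

# A single page number, or an inclusive (start, end) range.
PageSpec = Union[int, Tuple[int, int]]


def build_pages_string(parts: Iterable[PageSpec]) -> str:
    """Hypothetical helper: join pages and ranges into a string like "1,2,3,10-20"."""
    tokens = []
    for part in parts:
        if isinstance(part, tuple):
            start, end = part
            # Page numbers are 1-based, and a range must not run backwards.
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part}")
            tokens.append(f"{start}-{end}")
        else:
            if part < 1:
                raise ValueError(f"invalid page number: {part}")
            tokens.append(str(part))
    if not tokens:
        raise ValueError("at least one page or range is required")
    return ",".join(tokens)


print(build_pages_string([1, 2, 3, (10, 20)]))
```

The resulting string could then be passed as the pages argument, for example reader.load_data(file=pdf_path, pages=build_pages_string([(80, 90)])).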

Example

This loader is designed to load data into LlamaIndex. The loaded data can also be used subsequently as a Tool in a LangChain Agent.
