llama-index readers remote integration

Project description

Remote Page/File Loader

This loader makes it easy to extract the text from any remote page or file using just its URL. If the URL points to a file, the loader downloads it to a temporary location and parses it using SimpleDirectoryReader. It is an all-in-one tool for (almost) any URL.

As a result, any page or file type that SimpleDirectoryReader can parse is supported. For instance, if a .txt URL such as a Project Gutenberg book is passed in, the text is parsed as is. If a hosted .mp3 URL is passed in, the file is downloaded and parsed using AudioTranscriber.
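The download-then-parse flow described above can be sketched with the standard library alone. This is an illustrative sketch, not the loader's actual implementation: the helper names pick_suffix, save_temp, and fetch_to_temp are hypothetical, and the ".bin" fallback suffix is an assumption made for the example.

```python
import mimetypes
import tempfile
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def pick_suffix(url: str, content_type: str) -> str:
    """Choose a file suffix from the URL path, falling back to the MIME type."""
    suffix = PurePosixPath(urlparse(url).path).suffix
    if suffix:
        return suffix.lower()
    # Drop parameters such as "; charset=utf-8" before guessing an extension.
    mime = content_type.split(";")[0].strip().lower()
    # ".bin" is an arbitrary fallback chosen for this sketch.
    return mimetypes.guess_extension(mime) or ".bin"


def save_temp(data: bytes, suffix: str, directory: str) -> Path:
    """Write downloaded bytes to a new file inside `directory`."""
    with tempfile.NamedTemporaryFile(
        dir=directory, suffix=suffix, delete=False
    ) as fh:
        fh.write(data)
        return Path(fh.name)


def fetch_to_temp(url: str, directory: str) -> Path:
    """Download `url` into `directory` (needs network access)."""
    with urllib.request.urlopen(url) as resp:
        content_type = resp.headers.get("Content-Type", "")
        return save_temp(resp.read(), pick_suffix(url, content_type), directory)
```

The suffix matters because a directory-style reader typically chooses its parser from the file extension, so a downloaded file needs a meaningful one before it is handed over.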

Usage

To use this loader, you need to pass in the URL of a remote page or file. Optionally, you may specify a file_extractor for the SimpleDirectoryReader to use in place of the default one.

from llama_index.readers.remote import RemoteReader

loader = RemoteReader()
documents = loader.load_data(
    url="https://en.wikipedia.org/wiki/File:Example.jpg"
)
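Conceptually, a file_extractor is a mapping from file extension to the parser that should handle it, with user-supplied entries taking precedence over defaults. The plain-Python sketch below illustrates that idea only. It does not use or reproduce the LlamaIndex API, and the names Parser, DEFAULT_EXTRACTORS, read_text, and parse_file are hypothetical.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# A parser takes a local file path and returns its text.
Parser = Callable[[Path], str]


def read_text(path: Path) -> str:
    """Default parser: return the file content as UTF-8 text."""
    return path.read_text(encoding="utf-8")


# Hypothetical defaults for this sketch only.
DEFAULT_EXTRACTORS: Dict[str, Parser] = {".txt": read_text, ".md": read_text}


def parse_file(
    path: Path, file_extractor: Optional[Dict[str, Parser]] = None
) -> str:
    """Parse `path` with the parser registered for its extension."""
    # Entries in file_extractor override the defaults for the same suffix.
    extractors = {**DEFAULT_EXTRACTORS, **(file_extractor or {})}
    parser = extractors.get(path.suffix.lower())
    if parser is None:
        raise ValueError(f"no parser registered for {path.suffix!r}")
    return parser(path)
```

Passing a mapping such as {".csv": my_csv_parser} (where my_csv_parser is any function with the Parser signature) would route .csv files to the custom parser while leaving the defaults in place for other extensions.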

This loader is designed to load data into LlamaIndex and/or to be used subsequently as a Tool in a LangChain Agent.


Source Distribution

llama_index_readers_remote-0.1.4.tar.gz (3.2 kB)
