youtube-transcript-plus
TypeScript icon, indicating that this package has built-in type declarations

1.0.2 • Public • Published

youtube-transcript-plus

npm version

A Node.js library to fetch transcripts from YouTube videos. This package uses YouTube's unofficial API, so it may break if YouTube changes its internal structure.

Note: This project was originally forked from https://github.com/Kakulukian/youtube-transcript.

Installation

$ npm install youtube-transcript-plus

or

$ yarn add youtube-transcript-plus

Usage

Basic Usage

import { fetchTranscript } from 'youtube-transcript-plus';

// Fetch transcript using default settings
fetchTranscript('videoId_or_URL').then(console.log).catch(console.error);

Custom User-Agent

You can pass a custom userAgent string to mimic different browsers or devices.

fetchTranscript('videoId_or_URL', {
  userAgent:
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
})
  .then(console.log)
  .catch(console.error);

Custom Fetch Functions

You can inject custom videoFetch and transcriptFetch functions to modify the fetch behavior, such as using a proxy or custom headers.

fetchTranscript('videoId_or_URL', {
  videoFetch: async ({ url, lang, userAgent }) => {
    return fetch(`https://my-proxy-server.com/?url=${encodeURIComponent(url)}`, {
      headers: {
        ...(lang && { 'Accept-Language': lang }),
        'User-Agent': userAgent,
      },
    });
  },
  transcriptFetch: async ({ url, lang, userAgent }) => {
    return fetch(`https://my-proxy-server.com/?url=${encodeURIComponent(url)}`, {
      headers: {
        ...(lang && { 'Accept-Language': lang }),
        'User-Agent': userAgent,
      },
    });
  },
})
  .then(console.log)
  .catch(console.error);

Language Support

You can specify the language for the transcript using the lang option.

fetchTranscript('videoId_or_URL', {
  lang: 'fr', // Fetch transcript in French
})
  .then(console.log)
  .catch(console.error);

Caching

You can provide a custom caching strategy by implementing the CacheStrategy interface. The library also provides default implementations for in-memory and file system caching.

In-Memory Cache

import { fetchTranscript, InMemoryCache } from 'youtube-transcript-plus';

fetchTranscript('videoId_or_URL', {
  lang: 'en',
  userAgent: 'FOO',
  cache: new InMemoryCache(1800000), // 30 minutes TTL
})
  .then(console.log)
  .catch(console.error);

File System Cache

import { fetchTranscript, FsCache } from 'youtube-transcript-plus';

fetchTranscript('videoId_or_URL', {
  cache: new FsCache('./my-cache-dir', 86400000), // 1 day TTL
})
  .then(console.log)
  .catch(console.error);

Custom Caching

If the default implementations don’t meet your needs, you can implement your own caching strategy:

import { fetchTranscript, CacheStrategy } from 'youtube-transcript-plus';

class CustomCache implements CacheStrategy {
  async get(key: string): Promise<string | null> {
    // Custom logic
  }

  async set(key: string, value: string, ttl?: number): Promise<void> {
    // Custom logic
  }
}

fetchTranscript('videoId_or_URL', {
  cache: new CustomCache(),
})
  .then(console.log)
  .catch(console.error);

Error Handling

The library throws specific errors for different failure scenarios. Make sure to handle them appropriately.

import {
  YoutubeTranscriptVideoUnavailableError,
  YoutubeTranscriptDisabledError,
  YoutubeTranscriptNotAvailableError,
  YoutubeTranscriptNotAvailableLanguageError,
} from 'youtube-transcript-plus';

fetchTranscript('videoId_or_URL')
  .then(console.log)
  .catch((error) => {
    if (error instanceof YoutubeTranscriptVideoUnavailableError) {
      console.error('Video is unavailable:', error.message);
    } else if (error instanceof YoutubeTranscriptDisabledError) {
      console.error('Transcripts are disabled for this video:', error.message);
    } else if (error instanceof YoutubeTranscriptNotAvailableError) {
      console.error('No transcript available:', error.message);
    } else if (error instanceof YoutubeTranscriptNotAvailableLanguageError) {
      console.error('Transcript not available in the specified language:', error.message);
    } else {
      console.error('An unexpected error occurred:', error.message);
    }
  });

Example Usage Files

The repository includes several example files in the example/ directory to demonstrate different use cases of the library:

  1. basic-usage.js: Demonstrates the simplest way to fetch a transcript using the default settings.
  2. caching-usage.js: Shows how to use the InMemoryCache to cache transcripts with a 30-minute TTL.
  3. fs-caching-usage.js: Demonstrates how to use the FsCache to cache transcripts on the file system with a 1-day TTL.
  4. language-usage.js: Shows how to fetch a transcript in a specific language (e.g., French).
  5. proxy-usage.js: Demonstrates how to use a proxy server to fetch transcripts, which can be useful for bypassing rate limits or accessing restricted content.

These examples can be found in the example/ directory of the repository.

API

fetchTranscript(videoId: string, config?: TranscriptConfig)

Fetches the transcript for a YouTube video.

  • videoId: The YouTube video ID or URL.
  • config: Optional configuration object with the following properties:
    • lang: Language code (e.g., 'en', 'fr') for the transcript.
    • userAgent: Custom User-Agent string.
    • cache: Custom caching strategy.
    • cacheTTL: Time-to-live for cache entries in milliseconds.
    • videoFetch: Custom fetch function for the video page request.
    • transcriptFetch: Custom fetch function for the transcript request.

Returns a Promise<TranscriptResponse[]> where each item in the array represents a transcript segment with the following properties:

  • text: The text of the transcript segment.
  • duration: The duration of the segment in seconds.
  • offset: The start time of the segment in seconds.
  • lang: The language of the transcript.

Errors

The library throws the following errors:

  • YoutubeTranscriptVideoUnavailableError: The video is unavailable or has been removed.
  • YoutubeTranscriptDisabledError: Transcripts are disabled for the video.
  • YoutubeTranscriptNotAvailableError: No transcript is available for the video.
  • YoutubeTranscriptNotAvailableLanguageError: The transcript is not available in the specified language.
  • YoutubeTranscriptInvalidVideoIdError: The provided video ID or URL is invalid.

License

MIT Licensed

Readme

Keywords

Package Sidebar

Install

npm i youtube-transcript-plus

Weekly Downloads

24

Version

1.0.2

License

MIT

Unpacked Size

23.6 kB

Total Files

11

Last publish

Collaborators

  • ericmmartin