@jackdbd/eleventy-plugin-text-to-speech
3.1.0 • Public • Published

Eleventy plugin that uses text-to-speech to generate audio assets for your website, then injects audio players in your HTML.

Installation

npm install @jackdbd/eleventy-plugin-text-to-speech

⚠️ Peer Dependencies

This package defines 6 peer dependencies.

| Peer | Version range |
| --- | --- |
| @11ty/eleventy | >=2.0.0 or 3.0.0-alpha.4 |
| @aws-sdk/client-s3 | >=3.0.0 |
| @aws-sdk/lib-storage | >=3.0.0 |
| @google-cloud/storage | >=7.0.0 |
| @google-cloud/text-to-speech | >=5.0.0 |
| debug | >=4.0.0 |

About

To synthesize text into speech you can use the Cloud Text-to-Speech API.

To host the generated audio assets you can self-host them or use a Cloud Storage bucket.

⚠️ The Cloud Text-to-Speech API accepts at most 5000 characters of input per request.
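
If a page contains more text than the limit allows, one way to stay under it is to split the text before synthesis. The following is a minimal sketch of that idea, not the plugin's own logic: `splitText` is a hypothetical helper, and it measures length with JavaScript's `String.length`, which may not match exactly how the API counts input.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Splits text into chunks of at most `maxLength` characters. It prefers to
// break at the end of a sentence, then at a space, and only as a last resort
// in the middle of a word.
const splitText = (text, maxLength = 5000) => {
  const chunks = []
  let rest = text.trim()

  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength)

    // Position of the last sentence terminator followed by a space.
    let cut = Math.max(
      head.lastIndexOf('. '),
      head.lastIndexOf('! '),
      head.lastIndexOf('? ')
    )

    if (cut !== -1) {
      cut += 1 // keep the punctuation mark in the current chunk
    } else {
      cut = head.lastIndexOf(' ')
    }

    if (cut <= 0) {
      cut = maxLength // no sensible break point: hard cut
    }

    chunks.push(rest.slice(0, cut).trim())
    rest = rest.slice(cut).trim()
  }

  if (rest.length > 0) {
    chunks.push(rest)
  }

  return chunks
}
```

Each chunk could then be synthesized on its own. Joining the resulting audio files is a separate problem that this sketch does not address.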

See also:

Docs

Docs generated by TypeDoc

📖 API Docs

This project uses API Extractor and api-documenter markdown to generate a bunch of markdown files and a .d.ts rollup file containing all type definitions consolidated into a single file. I don't find this .d.ts rollup file particularly useful. On the other hand, the markdown files that api-documenter generates are quite handy when reviewing the public API of this project.

See Generating API docs if you want to know more.

Preliminary Operations

Enable the Text-to-Speech API

Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

gcloud services enable texttospeech.googleapis.com

Set up authentication via a service account

This plugin uses the official Node.js client library for the Text-to-Speech API. In order to authenticate to any Google Cloud API you will need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.

First, create a service account that can use the Text-to-Speech API, or reuse an existing one. If you plan to host the audio files on Cloud Storage, this service account also needs the IAM permissions to create and delete objects in the bucket. You can grant it the Storage Object Admin predefined IAM role.

gcloud iam service-accounts create sa-text-to-speech-user \
  --display-name "Text-to-Speech user SA"

Second, download the JSON key of this service account and store it somewhere safe. Do not track this file in git.
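
Since a missing or truncated key usually surfaces as an opaque authentication error, it can help to check the key early in your Eleventy configuration. Below is a minimal sketch of such a check. `parseServiceAccountKey` is a hypothetical helper, not something this plugin exports, and how you load the file content (for example from a path stored in an environment variable) is up to you.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Parses the content of a service account JSON key and checks a few fields
// that every service account key file contains.
const parseServiceAccountKey = (jsonText) => {
  const key = JSON.parse(jsonText)

  if (key.type !== 'service_account') {
    throw new Error('not a service account key')
  }

  for (const field of ['project_id', 'private_key', 'client_email']) {
    if (typeof key[field] !== 'string' || key[field].length === 0) {
      throw new Error(`missing field: ${field}`)
    }
  }

  return key
}
```

You could call it with the text of the key file and let the build fail immediately if the key is not usable.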

Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

Create a Cloud Storage bucket in your desired location. Enable uniform bucket-level access and use the nearline storage class.

gsutil mb \
  -p $GCP_PROJECT_ID \
  -l $CLOUD_STORAGE_LOCATION \
  -c nearline \
  -b on \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files

If you want, you can check that uniform bucket-level access is enabled using this command:

gsutil uniformbucketlevelaccess get \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files

Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):

gsutil iam ch allUsers:objectViewer \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files

Usage

Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the environment variable CF_PAGES_URL.
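
CF_PAGES_URL is only set during a Cloudflare Pages build, so a local build needs a fallback. Here is a minimal sketch of how you might derive the public URL of an audio asset from it. The helper name `audioAssetUrl`, the `/assets/audio/` path and the localhost fallback are assumptions made for illustration, not values this plugin requires.

```javascript
// Hypothetical helper, NOT part of the plugin's API.
// Resolves the public URL of an audio file against the current deployment.
// `/assets/audio/` and the fallback origin are illustrative assumptions.
const audioAssetUrl = (
  fileName,
  env = process.env,
  fallback = 'http://localhost:8080'
) => {
  // Use the Cloudflare Pages deployment URL when available.
  const base = env.CF_PAGES_URL || fallback
  return new URL(`/assets/audio/${fileName}`, base).href
}
```

Passing `env` as a parameter keeps the function easy to test without touching the real environment.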

Self-hosting the generated audio assets

If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

import { textToSpeechPlugin } from '@jackdbd/eleventy-plugin-text-to-speech'

export default function (eleventyConfig) {
  // some eleventy configuration...
  
  eleventyConfig.addPlugin(textToSpeechPlugin, {
    // TODO: add config with process.env.CF_PAGES_URL here
  })

  // some more eleventy configuration...

}

Hosting the generated audio assets on Cloud Storage

If you want to host the audio assets on a Cloud Storage bucket and configure the rules that decide which text is converted into speech, you could register the plugin using something like this:

import { textToSpeechPlugin } from '@jackdbd/eleventy-plugin-text-to-speech'

export default function (eleventyConfig) {
  // some eleventy configuration...
  
  eleventyConfig.addPlugin(textToSpeechPlugin, {
    // TODO: add config with Cloud Storage bucket here
  })

  // some more eleventy configuration...

}

Multiple hosts

If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

  • Self-host some audio assets, and host others on a Cloud Storage bucket.
  • Host all audio assets on Cloud Storage, but split them across two different buckets.

Have a look at the Eleventy configuration of the demo-site in this monorepo.
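
As a rough sketch of what registering the plugin twice could look like: the option keys below (`collectionName`, `transformName`, `rules`, `regex`, `cssSelectors`, `synthesis`, `hosting`) are the ones listed under Configuration, while the function name, the regular expressions, the CSS selectors and the custom names are illustrative. Giving each registration its own `collectionName` and `transformName` is an assumption, made so the two registrations do not share a collection or a transform. The demo-site configuration remains the authoritative example.

```javascript
// Sketch only. `plugin` is the `textToSpeechPlugin` import. The `synthesis`,
// `selfHosting` and `bucketHosting` arguments stand for clients that you
// create elsewhere; how to create them is not shown here.
const registerForTwoHosts = (
  eleventyConfig,
  plugin,
  { synthesis, selfHosting, bucketHosting }
) => {
  // First registration: blog posts, self-hosted audio.
  eleventyConfig.addPlugin(plugin, {
    collectionName: 'audio-items-self-hosted',
    transformName: 'inject-audio-tags-self-hosted',
    rules: [
      {
        regex: /\/posts\/.*\.html$/,
        cssSelectors: ['article'],
        synthesis,
        hosting: selfHosting
      }
    ]
  })

  // Second registration: other pages, audio hosted on a bucket.
  eleventyConfig.addPlugin(plugin, {
    collectionName: 'audio-items-bucket',
    transformName: 'inject-audio-tags-bucket',
    rules: [
      {
        regex: /\/about\/.*\.html$/,
        cssSelectors: ['main'],
        synthesis,
        hosting: bucketHosting
      }
    ]
  })
}
```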

Configuration

Plugin options

| Key | Default | Description |
| --- | --- | --- |
| collectionName | "audio-items" | Name of the 11ty collection defined by this plugin |
| rules | undefined | Rules that determine which texts to convert into speech (1 to ∞ elements) |
| transformName | "inject-audio-tags-into-html" | Name of the 11ty transform defined by this plugin |

Rule

| Key | Default | Description |
| --- | --- | --- |
| audioInnerHTML | undefined | Function that returns some HTML from the list of hrefs where the generated audio assets are hosted |
| cssSelectors | [] | CSS selectors to find matches in an HTML document |
| hosting | undefined | Client that provides hosting capabilities |
| regex | {} | RegExp to find matches in the output path |
| synthesis | undefined | Client that provides Text-to-Speech capabilities |
| xPathExpressions | [] | XPath expressions to find matches in an HTML document |
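
To make the table more concrete, here is a hedged sketch of a rule with a custom `audioInnerHTML`. It assumes that the function receives an array of URL strings, which is an interpretation of the description above rather than a documented signature. The regular expression and CSS selector are illustrative, and the `synthesis` and `hosting` clients are left out because their construction is not covered in this section.

```javascript
// Assumption: `hrefs` is an array of URL strings, one per hosted audio file.
// Returns one <source> element per href, plus a fallback message for
// browsers that cannot play audio.
const audioInnerHTML = (hrefs) => {
  const sources = hrefs.map((href) => `<source src="${href}">`)
  const fallback = '<p>Your browser does not support the audio element.</p>'
  return [...sources, fallback].join('\n')
}

// Partial rule: a real rule also needs `synthesis` and `hosting` clients.
const rule = {
  regex: /\/posts\/.*\.html$/, // matched against the output path
  cssSelectors: ['article'], // text inside these elements is converted
  xPathExpressions: [],
  audioInnerHTML
}
```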

Troubleshooting

This plugin uses the debug library for logging. You can control what's logged using the DEBUG environment variable.

For example, if you set your environment variables in a .envrc file, you can do:

# print all logging statements
export DEBUG=11ty-plugin:*
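
If you are unsure which loggers a given DEBUG value enables, the following simplified model of the matching rules of the debug library may help: patterns are separated by commas or spaces, `*` is a wildcard, and a leading `-` excludes a namespace. `isEnabled` is a hypothetical function written for illustration, not the library's implementation, and the namespaces in the usage note are invented.

```javascript
// Simplified model of DEBUG pattern matching. NOT the debug library's code.
const isEnabled = (namespace, debugValue) => {
  const patterns = debugValue.split(/[\s,]+/).filter(Boolean)

  // Escape regex metacharacters, then turn `*` into a wildcard.
  const toRegExp = (pattern) =>
    new RegExp(
      '^' +
        pattern
          .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
          .replace(/\*/g, '.*') +
        '$'
    )

  const excluded = patterns
    .filter((p) => p.startsWith('-'))
    .some((p) => toRegExp(p.slice(1)).test(namespace))

  const included = patterns
    .filter((p) => !p.startsWith('-'))
    .some((p) => toRegExp(p).test(namespace))

  return included && !excluded
}
```

Under this model, `isEnabled('11ty-plugin:text-to-speech', '11ty-plugin:*')` is true, while a namespace outside the `11ty-plugin:` prefix is not enabled.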

Dependencies

| Package | Version |
| --- | --- |
| @jackdbd/zod-schemas | ^2.0.0 |
| html-to-text | ^9.0.5 |
| zod | ^3.22.4 |
| zod-validation-error | ^3.0.0 |
| jsdom | ^24.0.0 |
| specificity | ^1.0.0 |

Credits

I got the idea for this plugin while reading the code of the identically named eleventy-plugin-text-to-speech by Larry Hudson. Larry's plugin uses the Microsoft Azure Speech SDK.

License

© 2022 - 2024 Giacomo Debidda // MIT License
