DOR Services App

This Ruby application provides a REST API and a GraphQL API for DOR Services. An OAS 3.0 spec documenting the API lives in openapi.yml. You can browse the generated documentation at http://sul-dlss.github.io/dor-services-app/

Authentication

To generate an authentication token, run RAILS_ENV=production bin/rails generate_token on the prod server. This uses the HMAC secret to sign the token. It will prompt you for an "Account" value, which should be the name of the calling service, or a username if the token is to be used by a specific individual. This value is used for traceability of errors and can be seen in the "Context" section of a Honeybadger error. For example:

{"invoked_by" => "workflow-service"}

GraphQL

DSA exposes a limited GraphQL API at the /graphql endpoint. The API is implemented using graphql-ruby. The purpose of the API is to allow retrieving only the parts of cocina objects that are needed, in particular, to avoid retrieving very large structural metadata.

It is limited in that:

  • It only supports querying, not mutations.
  • Only the first level of attributes (description, structural, etc.) are expressed in the GraphQL schema; the contents of each of these attributes are just typed as JSON.
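As a sketch, a query for a couple of top-level attributes might look like this (the query and field names are illustrative; consult the schema served at /graphql for the real ones):

curl -s -H 'Authorization: Bearer <token>' -H 'Content-Type: application/json' \
  -d '{"query": "{ cocinaObject(externalIdentifier: \"druid:bc123df4567\") { externalIdentifier label description } }"}' \
  https://dor-services-app.example.edu/graphql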

Developer Notes

DOR Services App is a Rails app.

Background Jobs

Dor Services App uses Sidekiq to process background jobs, which requires Redis. You can either install Redis locally (if running services locally) or run it via docker compose. To spin up Sidekiq, run:

bundle exec sidekiq # use -d option to daemonize/run in the background

See the output of bundle exec sidekiq --help for more information.

Note that the application has a web UI for monitoring Sidekiq activity at /queues.
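For orientation, a background job is an ordinary Sidekiq worker class; a minimal sketch (the class name and arguments are hypothetical, so see app/jobs for the real jobs):

class ExampleJob
  include Sidekiq::Job

  # Perform the work for the object identified by druid.
  def perform(druid)
    Rails.logger.info("processing #{druid}")
  end
end

# Enqueue from anywhere in the app; Sidekiq picks it up via Redis.
ExampleJob.perform_async('druid:bc123df4567')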

Running Tests

First, ensure the database container is spun up:

docker compose up db # use -d to daemonize/run in background

And if you haven't yet prepared the test database, run:

RAILS_ENV=test bundle exec rails db:test:prepare

To run the tests:

bundle exec rspec

To run rubocop:

bundle exec rubocop

Console and Development Server

Using Docker

First, you'll need both Docker and Docker Compose installed.

Run dor-services-app and its dependencies using:

docker compose up -d

Update Docker image

docker build -t suldlss/dor-services-app:latest .
docker push suldlss/dor-services-app:latest

Without Docker

First, you'll need to set up configuration files to connect to valid Fedora and Solr instances. See the "config/settings.yml" file for a template. Create a folder called "config/settings", then copy that settings.yml file into it, renamed for the environment you wish to set up (e.g. "config/settings/development.local.yml").

Edit this file to add the appropriate URLs. You may also need certs to talk to actual Fedora servers. Once this file is in place, you can start your Rails server or console in development mode.
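A hypothetical "config/settings/development.local.yml" (the key names here are assumptions; mirror whatever keys appear in "config/settings.yml"):

solr:
  url: 'http://localhost:8983/solr/dorservices'
fedora_url: 'https://fedora.example.edu/fedora'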

To spin up a local rails console:

bundle exec rails c

To spin up a local development server:

bundle exec rails s

Setup RabbitMQ

You must set up the durable RabbitMQ queues that bind to the exchange where workflow messages are published:

RAILS_ENV=production bin/rake rabbitmq:setup

This creates this application's queues and binds them to the relevant topics on that exchange.

RabbitMQ queue workers

In a development environment you can start sneakers this way:

WORKERS=CreateEventJob bin/rake sneakers:run

but on the production machines we use systemd to do the same:

sudo /usr/bin/systemctl start sneakers
sudo /usr/bin/systemctl stop sneakers
sudo /usr/bin/systemctl status sneakers

This is started automatically during a deploy via Capistrano.
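For orientation, a queue worker is a class that Sneakers can run; a minimal sketch (the queue name and message shape are assumptions, so see the real jobs such as CreateEventJob in app/jobs):

class ExampleEventJob
  include Sneakers::Worker
  from_queue 'dsa.example-event'

  # Called once per message delivered to the queue.
  def work(msg)
    payload = JSON.parse(msg)
    Rails.logger.info("received event for #{payload['druid']}")
    ack! # acknowledge so RabbitMQ removes the message
  end
end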

Cron check-ins

Some cron jobs (configured via the whenever gem) are integrated with Honeybadger check-ins. These cron jobs will check-in with HB (via a curl request to an HB endpoint) whenever run. If a cron job does not check-in as expected, HB will alert.

Cron check-ins are configured in the following locations:

  1. config/schedule.rb: This specifies which cron jobs check in and which setting keys to use for the check-in key. See this file for more details.
  2. config/settings.yml: Stubs out a check-in key for each cron job. Since we may not want check-ins in every environment, the stub key is used in those environments and produces a null check-in.
  3. config/settings/production.yml in shared_configs: This contains the actual check-in keys.
  4. HB notification page: Check-ins are configured per project in HB. To configure a check-in, the cron schedule will be needed, which can be found with bundle exec whenever. After a check-in is created, the check-in key will be available. (If the URL is https://api.honeybadger.io/v1/check_in/rkIdpB then the check-in key will be rkIdpB).
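As a sketch, the wiring in config/schedule.rb might look like the following (the job type, task, and setting name are all hypothetical; see the real file for how check-ins are actually appended):

# Hypothetical job type that curls the Honeybadger check-in endpoint after the task succeeds.
job_type :rake_with_check_in,
         'cd :path && RAILS_ENV=:environment bundle exec rake :task && curl --silent https://api.honeybadger.io/v1/check_in/:check_in'

every :day, at: '2:00 am' do
  rake_with_check_in 'example:task', check_in: Settings.check_ins.example_task
end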

Rolling (Re)Indexer

(This runs here so it has efficient access to the cocina for each object.)

This helps keep the index fresh by reindexing the oldest data. It is managed as a systemd service. To interact with it from your machine, you can use Capistrano:

$ cap ENV rolling_indexer:status
$ cap ENV rolling_indexer:start
$ cap ENV rolling_indexer:stop
$ cap ENV rolling_indexer:restart

Or, if you're on a server that has the rolling_indexer Capistrano role, use systemd commands:

$ sudo systemctl status rolling-index
$ sudo systemctl start rolling-index
$ sudo systemctl stop rolling-index
$ sudo systemctl restart rolling-index

NOTE 1: The rolling indexer is automatically restarted during deployments.

NOTE 2: The rolling indexer runs only on one node per environment. Conventionally, this is the -a node, but for production, it is dor-services-worker-prod-b.

NOTE 3: The rolling indexer logs to {capistrano_shared_dir}/log/rolling_indexer.log

Other tools

Running Reports

There is information about how to run reports on the sdr-infra VM in the cocina-models README. This approach has two advantages:

  • sdr-infra connects to the DSA database as read-only
  • no resource competition with production DSA processing

Generating a list of druids from Solr query

$ bin/generate-druid-list 'is_governed_by_ssim:"info:fedora/druid:rp029yq2361"'

The results are written to druids.txt.

Removing deleted items from a list of druids

$ bin/clean-druid-list -h
Usage: bin/clean-druid-list [options]
    -i, --input FILENAME             File containing list of druids (instead of druids.txt).
    -o, --output FILENAME            File to write list of druids (instead of druids.clean.txt).
    -h, --help                       Displays help.

Solr is used to determine if an item still exists.

Find druids missing from the Solr index

Run the missing druid rake task:

RAILS_ENV=production bundle exec rake missing_druids:unindexed_objects

This produces a missing_druids.txt file in the application root.

Missing druids can be indexed with:

RAILS_ENV=production bundle exec rake missing_druids:index_unindexed_objects

Data migrations / bulk remediations

bin/migrate-cocina provides a framework for data migrations and bulk remediations. It supports optional versioning and publishing of objects after migration.

Usage: bin/migrate-cocina MIGRATION_CLASS [options]
        --mode [MODE]                Migration mode (dryrun, migrate, verify). Default is dryrun
    -p, --processes PROCESSES        Number of processes. Default is 4.
    -s, --sample SAMPLE              Sample size per type, otherwise all objects.
    -h, --help                       Displays help.

The process for performing a migration/remediation is:

  1. Implement a Migrator (app/services/migrators/). See Migrators::Base and Migrators::Exemplar for the requirements of a Migrator class; a sketch follows these steps. Migrators should be unit tested.
  2. Perform a dry run: bin/migrate-cocina Migrators::Exemplar --mode dryrun and inspect migrate-cocina.csv for any errors. This is a way to change the cocina and validate the new objects without saving the updated cocina or publishing or versioning.
  3. Perform migration/remediation: bin/migrate-cocina Migrators::Exemplar --mode migrate and inspect migrate-cocina.csv for any errors.
  4. Perform verification: bin/migrate-cocina Migrators::Exemplar --mode verify and inspect migrate-cocina.csv for any errors. (An error here means that an object matching .migrate? has been found ... which is presumably NOT desired after migration.)
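For orientation, a Migrator is a small class implementing that contract; a minimal sketch (the transformation and the ar_cocina_object accessor are assumptions, so check Migrators::Base for the actual interface):

class Migrators::ExampleLabelFix < Migrators::Base
  # True if this object still needs migration (the .migrate? check referenced above).
  def migrate?
    ar_cocina_object.label.include?('  ')
  end

  # Mutate the ActiveRecord-backed cocina object in place; the framework saves it.
  def migrate
    ar_cocina_object.label = ar_cocina_object.label.squeeze(' ')
  end
end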

Additional notes:

  • The dry run and the verification can be performed on sdr-infra. See the existing documentation on setting up db connections.
  • The migration/remediation must be performed on the DSA server since it requires a read/write DB connection. (sdr-infra has a read-only DB connection.)
  • Migrations are performed on an ActiveRecord object, not a Cocina object. This allows the remediation of invalid items (i.e., items that cannot be instantiated as Cocina objects).
  • Migrations can be performed against all items or just a list provided by the Migrator.
  • Breaking changes, especially breaking cocina model changes, are going to require additional steps, e.g., stopping SDR processing. The complete process is to be determined.

Reset Process (for QA/Stage)

Steps

  1. Reset the database