
Automate management of PII redacted schemas for dbt projects.


The Schema Builder tool creates dbt schema files, SQL models, and default PII / non-PII views for tables in the given Snowflake schemas.

For each specified application schema, the script generates dbt models for a <SCHEMA> and a <SCHEMA>_PII schema. Together with the raw source schema, we refer to these three schemas as a “trifecta”:

  • <SCHEMA>_<RAW_SUFFIX> contains the original source tables.

  • <SCHEMA>_PII contains views on the _RAW tables that have un-redacted PII.

  • <SCHEMA> contains views on the _RAW tables with sensitive data redacted.

Application schemas can be sourced from multiple raw schemas. This allows you to specify which tables should be pulled from which raw schema to construct the “trifecta”.

Schema Builder ensures that all three schemas provide the same interface to the data (number and order of columns match what is present in the _RAW schema).
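The idea of a shared interface can be sketched in a few lines of Python. This is only an illustration, not Schema Builder's actual implementation: the function name render_select, the NULL placeholder used for redaction, and the example schema, table, and column names are all assumptions made for this sketch. It shows how a PII view and a redacted view can both select every raw column, in the same order, so that they differ only in the values of the sensitive columns.

```python
from typing import List, Set

# Hypothetical placeholder for redacted values; the real tool may redact differently.
REDACTED_VALUE = "NULL"


def render_select(app_schema: str, raw_suffix: str, table: str,
                  columns: List[str], pii_columns: Set[str],
                  redact: bool) -> str:
    """Build a SELECT over the raw table that keeps every column, in order.

    With redact=False every column passes through unchanged (the _PII view).
    With redact=True the PII columns are replaced by a placeholder but keep
    their name and position (the redacted view).
    """
    raw_schema = f"{app_schema}_{raw_suffix}"
    select_items = []
    for column in columns:
        if redact and column in pii_columns:
            select_items.append(f"{REDACTED_VALUE} AS {column}")
        else:
            select_items.append(column)
    column_sql = ",\n    ".join(select_items)
    return f"SELECT\n    {column_sql}\nFROM {raw_schema}.{table}"


# Example with made-up names: one raw table, one sensitive column.
columns = ["ID", "EMAIL", "CREATED_AT"]
pii_sql = render_select("LMS", "RAW", "USERS", columns, {"EMAIL"}, redact=False)
redacted_sql = render_select("LMS", "RAW", "USERS", columns, {"EMAIL"}, redact=True)
print(redacted_sql)
```

Because both statements are built from the same ordered column list, the number and order of columns match the raw table in either case.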

Once the script has run successfully, you can execute a dbt run to create or update the views in <SCHEMA> and <SCHEMA>_PII. If the source data in the <SCHEMA>_<RAW_SUFFIX> schema changes, rerun Schema Builder regularly to pick up changes to the tables and columns stored there.

Schema Builder will also automatically create sources in one or more other dbt projects so that they can use the results of these models as sources.

See the docs for more information.

License

The code in this repository is licensed under the AGPL 3.0 unless otherwise noted.

Please see LICENSE.txt for details.

How To Contribute

Contributions are very welcome. Please read The Contribution Guide for details. Even though they were written with edx-platform in mind, the guidelines should be followed for all Open edX projects.

The pull request description template should be automatically applied if you are creating a pull request from GitHub. Otherwise you can find it at PULL_REQUEST_TEMPLATE.md.

The issue report template should be automatically applied if you are creating an issue on GitHub as well. Otherwise you can find it at ISSUE_TEMPLATE.md.

Reporting Security Issues

Please do not report security issues in public. Please email security@edx.org.

Getting Help

If you’re having trouble, we have discussion forums at https://discuss.openedx.org where you can connect with others in the community.

Our real-time conversations are on Slack. You can request a Slack invitation, then join our community Slack team.

For more information about these options, see the getting assistance page.


