Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Overhaul #55

Open
wants to merge 44 commits into
base: main
Choose a base branch
from
Open

fix: Overhaul #55

wants to merge 44 commits into from

Conversation

biscoe916
Copy link
Member

Proposed Changes

Checklist

  • A clear description of the change has been included in this PR.
  • A clear description of whether this change is a Major, Minor, Patch or cosmetic change as per the Versioning Guidelines has been included in this PR.
  • All schema validation tests have been updated appropriately and are passing.
  • MAJOR/MINOR VERSION CHANGES ONLY: This PR should be made in branches prefixed with draft-<change>
  • MAJOR/MINOR VERSION CHANGES ONLY: A link to a reference implementation (PR or set of PRs) of the change has been included in this PR.
  • MAJOR/MINOR VERSION CHANGES ONLY: A writeup has been included discussing the motivation and impact of this change.
  • MAJOR/MINOR VERSION CHANGES ONLY: The minimum wait time has elapsed.
  • DRAFT MERGE ONLY: Draft Semver has been updated in the VERSION file (optional)
  • DRAFT MERGE ONLY: Tagged this branch with new semver version and an annotation describing the change (ex: git tag -s 4.1.0 -m "Spec version 4.1.0 - did a thing")
  • DRAFT MERGE ONLY: Version numbers have been updated as per the Versioning Guidelines.
  • This change otherwise adheres to the project Contribution Guidelines.

@biscoe916 biscoe916 requested a review from a team as a code owner April 3, 2025 16:39
Copy link
Contributor

@cassandrabailey293 cassandrabailey293 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is looking great! i found a couple things that i wasn't sure we wanted to include but you would know better than I would. left those comments in case, and a few typos/links to fix but otherwise it looks good to me.


* As a Zip file with extension of `.tdf`. For example, if you are trying to protect a file named `demo.jpeg`, the file will be stored as `demo.jpeg.tdf` after encryption.
* As a HTML file with extension of `.html`. For example, if you are trying to protect a file named `demo.jpeg`, the file will be stored as `demo.jpeg.html` after encryption. An example HTML file is (here)[https://github.com/opentdf/spec/blob/master/schema/HtmlProtocolExample.html].
OpenTDF represents a modernization of data-centric security concepts originally established in the **IC-TDF** (Intelligence Community Trusted Data Format) specification. While IC-TDF utilized an XML-based structure, OpenTDF adopts a more contemporary approach using JSON for its manifest, enhancing flexibility and ease of integration with modern web technologies.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not necessary (or wanted?) but you could link to the ICTDF spec here

1. **`manifest.json`:** The metadata manifest described in the Key Concepts section. It holds instructions for decryption and access control.
2. **`payload`:** The encrypted data payload itself.

However, a TDF can be encoded in other ways. For example, as an HTML document:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this "true"? like technically someone could hand roll it but do we want to mention html at all, since i dont think opentdf supports ti directly?


While OpenTDF offers flexibility and rich metadata, NanoTDF prioritizes size efficiency for specific use cases.

**➡️ For details, please refer to the [NanoTDF Specification](../schema/nanotdf/README.md).**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getting a 404 on this link


## Contact

For questions regarding OpenTDF, interoperability, or the specification, please reach out to [support@opentdf.io](mailto:support@opentdf.io).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who is on this mailing list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK no one yet. We need to set it up.

* **Resource Attributes:** Defined within the TDF's [Policy Object](../schema/OpenTDF/policy.md) in the `dataAttributes` array. These specify the attribute instances *required* to access this specific piece of data.
* **Policies:** Defined by the combination of:
* The `dataAttributes` required by the specific TDF.
* The optional `dissem` list in the TDF's [Policy Object](../schema/OpenTDF/policy.md) acting as an initial filter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just checking that we want to talk about dissem, maybe people will want to use it, but wanted to make sure


1. **Request Initiation:** The client presents the relevant [Key Access Object(s)](../schema/OpenTDF/key_access_object.md) from the TDF manifest to the appropriate KAS, along with the client's credentials and asserted **Entity Entitlements** (Subject Attributes).
2. **PEP Evaluation:** The KAS performs the following checks based on the TDF's embedded [Policy Object](../schema/OpenTDF/policy.md) (extracted from the `policy` field):
* **Dissemination Check (if applicable):** If the policy's `dissem` list is present and non-empty, the PEP verifies if the requesting entity's identifier is in the list. If not, access is **denied**.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more mentions of dissem here, maybe we do want to include it but double checking

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we want to mention it, but we can run it by @willackerly

2. **Encrypt Payload:** Encrypt the original payload data using the DEK and an authenticated encryption mode (e.g., AES-256-GCM), generating an Initialization Vector (IV) and integrity tags per segment. Store the IV and segment information.
3. **Define Policy:** Construct the [Policy Object](../schema/OpenTDF/policy.md) JSON defining the required attributes (`dataAttributes`) and optional dissemination list (`dissem`). Base64 encode this JSON string.
4. **Generate Policy Binding:** Calculate the policy binding hash: `HMAC(DEK, Base64(policyJSON))` using a standard algorithm like HMAC-SHA256. Base64 encode the resulting hash.
5. **Prepare Optional Metadata:** If client-specific metadata needs to be passed securely to the KAS during decryption, prepare this data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to have an example of what this would be? there are some things in opentdf we ALWAYS pass right? or no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea this is worded poorly. I'll update it.

Technically this is always passed to the KAS, but it's optional in the sense that a user of the SDK doesn't need to put anything in it.

17. **Decrypt Payload:** Use the plaintext DEK and the parameters from `encryptionInformation.method` (IV) to decrypt the TDF payload. During decryption (especially with AES-GCM or streaming), simultaneously verify the integrity of each segment using the `hash` from the [Segment Object](./opentdf/segment.md) and finally verify the `rootSignature` from [`integrityInformation`](./opentdf/integrity_information.md). If any integrity check fails, abort decryption and report an error.
18. **Securely Discard DEK:** Once the payload is decrypted or decryption fails, securely erase the plaintext DEK and any intermediate shares from memory.

## Error Handling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we had an adr around this a while back... not sure if we'd wanna include that here, maybe not since we havent implemented against it much . https://github.com/virtru-corp/adr/pull/102

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might forgo linking to this ADR for a couple reasons.

  1. It's not public
  2. The error codes themselves aren't defined by the spec - we only define them for the purposes of our "reference implementation".

What do you think? Should be define the error codes formally here?

@@ -0,0 +1,18 @@
# Attribute Object (Structure)

This document describes the JSON structure representing an Attribute Instance when embedded within a [Policy Object](./policy.md). For a conceptual overview of attributes, and their role in access control, see [Access Control Concepts](../../concepts/access_control.md.md).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo here - double .md in the access control link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants