-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Consider using the debug_redact
field / enum option when marshaling into prototext
#1655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
At first, I suspected this were a duplicate of #1541 however, having read through the issue, it’s different, because this I would say, there is no condition where we would ever want to deprecate Next, we would not want to add any options that default to Now that I’ve covered all the “bad news”, here’s the good news from my side: we’ve held for quite a while that prototext is only intended to be a human-readable debugging format. People who do not want the fields redacted can always elect to simply not add the tag. And, if we implement it like C++, then it affects only prototext output anyways. Individuals remain on their own to implement any sort of redactor for JSON formatting. Pushing it as a universal default might be a bit much (though from my view, not entirely out of the question). But at worst, we could add something like a As for how to emit a message without redaction in a new world where redaction would be a possible default, that could be as simple as adding a new TL;DR: C++ and Java already support this, any change of behavior is in what we already consider human-oriented debug messages, we could provide reasonable knobs to set desired default behavior at runtime or compile time, and the redaction is already field-specific opt-in already. |
This seems reasonable. Actually, I'm surprised debug_redact was added without Go support--it seems like an oversight. I'd say that:
I'm not sure about prototext.Format. I lean towards saying that it should redact. |
I like those defaults and knobs. |
@ivucica Would you be able to send a Gerrit change? See https://github.com/protocolbuffers/protobuf-go/blob/master/CONTRIBUTING.md |
To confirm, consensus is default to redacting, and no envvar? If I’m implementing this, happy to add the envvar. Any particular documentation you’d like me to update? I can’t offer a timeline for this contribution, as this is a personal interest, but (aside from any tests I should update) it seems simple enough, so I can try giving it a spin. If someone beats me to it, I won’t complain, of course. |
I don’t think we need an ENV var, no. With the options as proposed by nield, we should have sane enough behavior overall to not need to offer a kill-switch ENV var. |
Can I ask that the same be applied to There should be a way in We log all received and sent proto messages as json, and are presently running into this exact issue of how to redact them. |
I'll give a pass to textprotos. I don't know what the intended practice would be for protojson (and deciding it should not block prototext work). Likely it should be a separate feature request. |
While 🤔 Hm… but I don’t see any reason why it couldn’t go into |
I've started looking at patching https://github.com/protocolbuffers/protobuf-go. Even putting it into the
It does look like it'll be quite a bit of work to update various callsites -- I'm nowhere near close, even though this seemed like it might be a small change. Once closer to being ready3, I may need to call upon experts for advice (or help) to update the codebase I'm looking at, and only then can I consider going via the Gerrit route.
Yes, whoever does the patching should likely keep the default off there in all cases. Footnotes
|
Thanks for putting such time and effort into looking into this feature, a few thoughts to help you out:
None of the defaults should be We discussed above that the field for this in
Extending interfaces is notoriously not backwards-compatible. Instead, if one wants to expand an interface, it’s best to use a separate interface that embeds the original interface (like
If this change requires a code generator update, that could also complicate the backwards compatibility… if regeneration of protobufs would be necessary to update the protobuf module version, then that will be very hard for users to work with, since they are often depending on protobufs outside of their control and ability to regenerate.
A heuristic that I often try and use is that if something is becoming overwhelmingly difficult or tedious, then maybe I’m attacking it from the wrong direction. You might be going after a few too many hydra heads with your approach so far, and if you’re having to change so much because backwards compatibility with the existing code breaks, then consider how much that would impact users as well.
As I note above, the default should be off here as well. Only for the |
I have a somewhat working version (with tests covering proto2 syntax for now, and failing on one case that I came up wiith). It's also very, very messy. I'll have to review how accessing I'll have to look for advice on how to clean it up. |
That's usually the case, but we intentionally designed the |
Thanks for the followups -- this is just a quick note that this is on my mind, and I just haven't gotten around to putting some time into it.
To be clear: the const defaultIndent = " "
const fieldValueReplacement = "[REDACTED]"
// Format formats the message as a multiline string.
// This function is only intended for human consumption and ignores errors.
// Do not depend on the output being stable. Its output will change across
// different builds of your program, even when using the same version of the
// protobuf module.
func Format(m proto.Message) string {
return MarshalOptions{Multiline: true, RedactDebugString: true}.Format(m)
} Based on the clearly visible comment, this is not meant to be stable and we are allowed to change the behavior -- and it seems like it's documented to have unstable output (just like I was considering whether updating These are probably not the only places to update: upon a closer look (since I hadn't actually reached the phase where I've tested what // MessageStringOf returns the message value as a string,
// which is the message serialized in the protobuf text format.
func (Export) MessageStringOf(m protoreflect.ProtoMessage) string {
return prototext.MarshalOptions{Multiline: false}.Format(m)
} Is this the right method that's called by a All in all, as you said above, I do think I'd leave that exact decision to the reviewer and to consensus.
There's a chance it is not avoidable. On the other hand, perhaps an interesting (but not nicest) fix might be to make sensitive fields implement an extra interface and thus make them identifiable by the I do also already have some notes in the working copy where I asked around for some advice regarding understanding of some of the descriptor code -- I really did go into this expecting I would make a minimal change to Besides, I didn't check, but hopefully
I made this note when filing the bug report originally:
I don't have the exact same logic to implement this, but the naming seems to make a bit of sense -- especially if it is at some point extended with other annotations that are not part of this change. The main downside is inconsistency; but the upside is being able to cover more than one option, should the need arise. Footnotes
|
👍
Yes, it is the correct place for
The only place we would need to change any defaults is when there’s zero possibility to modify the behavior as the caller. As you’ve found
As was pointed out, the API was apparently designed to be extendable without causing issues. 👍
This would be even more impossible. I mean, a direct type of In all, I’m positive about the progress you’re making, and not having any more reservations on your design. 👍 |
Is your feature request related to a problem? Please describe.
Field
debug_redact
has been added around Protobuf 22 to mark sensitive fields as being sensitive. (Amusingly, this meansprotoc
that ships in Debian, as late asunstable
, does not support this field yet.)Not using
debug_redact
means that fields that are tagged as sensitive will be redacted in some languages such as C++, but not in Go.Please note that
debug_redact
can be used on more than just fields; but, its purpose on fields is clearer than on enums.FieldOptions
): https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/src/google/protobuf/descriptor.proto#L739EnumValueOptions
): https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/src/google/protobuf/descriptor.proto#L865Describe the solution you'd like
A solution should be found so that serialization for purposes of text output results in fields being redacted, but otherwise not.
In C++,
TextFormat::Printer::PrintFieldValue
invokesTryRedactFieldValue
which uses this option if the serializer has been initialized withredact_debug_string_
option:https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/src/google/protobuf/text_format.cc#L3026-L3055
Java computes whether a field is sensitive based on this option: https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/java/core/src/main/java/com/google/protobuf/Descriptors.java#L1837-L1859
This is then used if
enablingSafeDebugFormat
is flipped true:https://github.com/protocolbuffers/protobuf/blob/5d58caeebc5c2779ab4db87285280784159f9ed4/java/core/src/main/java/com/google/protobuf/TextFormat.java#L628-L634
For Go, it is not completely clear to me when this should be used. Adding it to the right place seems trivial: it likely belongs in
marshalField
or wherevermarshalSingular
is called, and an option can be used at that point: https://github.com/protocolbuffers/protobuf-go/blob/b98563540c0a4edb38526bcd6e6c97f9fac1f453/encoding/prototext/encode.go#L210However, once the
encoder
supports this, should theStringer
interface (String()
) on a generated message proceed to output the redacted or the non-redacted version of the message?str(someMessage)
to generate a textproto and also use the relatively new fielddebug_redact
in order to experience this change in behavior. If I recall correctly, the textpb output is not defined to be stable -- so perhapsMessageStringOf(m protoreflect.ProtoMessage) string
is permitted to change behavior when invokingprototext.MarshalOptions{Multiline: false}.Format(m)
.RedactFieldsWarns
(toggle warning debugoutput for a field, possibly a comment in generated textpb; for default stringification, defaults to true) andRedactFields
(for default stringification, initially defaults to true, later default false); possibly later switching to panics.String()
by adding a relevant docstring to the generated code. This feels rather blunt, but ifdebug_redact
is important enough, it would hint to the users that they should avoid it. It would probably not affect the users widely enough, however: would casting withstr()
or using%s
+%v
show up in code health tools to indicate invoking a deprecatedStringer
?It feels like sensitive fields should be saved only when explicitly requested, but unfortunately, existing code before the field existed or before it was used may have different expectations.
Describe alternatives you've considered
Two main approaches:
str()
,.String()
or%s
/%v
on a generated proto message without using the custom formatter+redactor, just as one would have to do with a custom option. (This would then apply whether logging or not, as there can be no guarantees of what will happen with the proto isfmt.Sprintf
'd.) Process generated.pb.go
code to add// Deprecated: Using String() directly does not redact fields.
to the docstring so the tooling starts to output warnings about use ofString()
.debug_redact
broken and ignore the existence of this field option, accepting that potentially sensitive messages will end up in logs or other places where they should not.debug_redact
option to clarify that implementing is optional, implementation specific and that depending on it taking effect without examining the behavior in every used language is risky.As is, the option is present and usable, but will not actually result in redaction of fields in sensitive contexts.
Additional context
It may be worth considering otherwise allowing modification of the marshaling of the fields, in case other options / annotations affect the field in other, non-logging contexts.
As a example, customization might be useful because a field might not be sensitive for logging, but it might be too sensitive to display to some types of system administrators, or to send to end users.
This consideration on allowing customizing serialization would likely apply across languages.
The text was updated successfully, but these errors were encountered: