Skip to content

[DISCUSS] Prototype a Base64Encoder#2452

Draft
kinkie wants to merge 2 commits into
squid-cache:masterfrom
kinkie:base64encoder
Draft

[DISCUSS] Prototype a Base64Encoder#2452
kinkie wants to merge 2 commits into
squid-cache:masterfrom
kinkie:base64encoder

Conversation

@kinkie

@kinkie kinkie commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Stemming from the converstaions in PR #2447 which focuses on
addressing a problem as narrowly as possible , here
is what I would envision a base64 encoder to look like.

It's a prototype, vibe-coded with lots of hand-holding on my side,
so please don't focus too much on the code itself, it comes with
a comprehensive unit test which is meant to showcase the API.

Does anyone see anything obviously wrong with the approach or is
it worth investing into by reviewing, polishing, writing a decoder and using it?

@yadij yadij added the M-ignored-by-merge-bots https://github.com/measurement-factory/anubis/blob/master/README.md#pull-request-labels label Jun 27, 2026
@yadij

yadij commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

The only issue I see is that Squid base64 encoding is only used on buffers. So we would end up adding a stream just for the purpose of using the encoder.

@rousskov rousskov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had already flagged this API direction as problematic before this PR was posted. Will find time to detail that claim.

@rousskov rousskov left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the serious problems identified in this incomplete review are enough to warrant switching to a different approach to base64-encoding. There are other, mostly secondary problems in this PR that this review does not flag. Most will probably disappear on the adjusted path.

I can suggest starting with an API like this:

namespace AnyP {

/// A base64-encoded version of the given input.
SBuf Base64Encode(const SBuf &);

} // namespace AnyP

* This class inherits from std::ostream to provide a familiar streaming
* interface, similar to SBufStream.
*/
class Base64Encoder : public std::ostream

@rousskov rousskov Jun 29, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed inheritance is an API mistake for several reasons, including:

  • std::ostream is meant for writing bytes to various destinations (as defined by derived classes). A base64 encoder should not know or care about the destination of the encoded bytes. I should be able to use the same encoder for writing into a file, a socket, or an SBuf. Changing high-level information into written bytes is the domain of formatting I/O functions (e.g., STL-provided << operators) and I/O manipulators, not std::ostream kids.

  • Base64 encoding requires a special action to finalize the result. std::ostream does not have a caller-convenient, safe, and efficient way for triggering that action. Neither class destructor nor buf() are it, for various reasons.

  • Formatted and locale-specific output (e.g., various STL-provided << operators and manipulators) is not fully compatible with many (all?) known base64 use cases in Squid code, where callers must base64-encode values while following strict syntax rules. An extra space character or the wrong number representation is likely to violate the protocol requirements but is not going to be caught by the compiler. In this context, we probably want to highlight value conversions rather than hide them behind various convenience APIs.

Base64Encoder::Base64StreamBuf::sync()
{
encoder_.encodePending();
encoder_.finalize();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sync() implementation essentially violates STL streambuf::sync() API. STL code that we do not control should be able to call this method multiple times, as needed, without being worried that the very first call effectively places (or should place) the stream into a "no more updates" state, a state that Base64Encoder enters when its finalize() method is called.


/// Create a Base64Encoder with optional maximum encoded output size limit
/// \param maxEncodedSize maximum encoded output size (default: noLimit)
explicit Base64Encoder(size_t maxEncodedSize = noLimit);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Burdening this SBuf-based class with explicitly maintaining output size restrictions is probably wrong. The underlying code should not care. Modern users should not care either. Legacy users, if any, would be better off with checking when converting from SBuf into their own storage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

M-ignored-by-merge-bots https://github.com/measurement-factory/anubis/blob/master/README.md#pull-request-labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants