Skip to content
This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

Extract the framing functionality to its own package #48

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

luxas
Copy link
Contributor

@luxas luxas commented Jul 7, 2021

This is the first step in essentially making a "v2" rewrite of libgitops.

Progress Plan

The rewrite starts from the elementary operation of reading/writing YAML/JSON frames, which is needed everywhere else. After this, I'll take a stab at making the serializer package more robust, observable, unit-tested, and so on, more production grade. After the serializer is "production-grade" in turn, I'll progress towards making the storage system implement the controller-runtime Client interface so that we can have an object-oriented approach when reading/writing objects to Git and other backends. The serializer and storage improvements are already prototyped in #46, and will be ported from there in their own PRs, such that it is easier to review.

Why make/have a framing library?

Every time someone interacts with a Kubernetes configuration object, they are touching either JSON or YAML. If that configuration is declarative, it is stored somewhere in encoded form. If that storage is Git, for GitOps purposes, the configuration is stored in files exactly as the user wishes. The user can put many YAML objects in the same file by separating them in different YAML documents, there can be only one object per file, the formatting can be weird, etc. Parsing all this by hand is brittle and too complex to do at each call site where this functionality is needed.

JSON is a self-framing format, which means that no extra separator like --- is needed, but it is not straightforward for a reader of the file stream to know where the JSON object ends. Other than declarative configuration, one place where it is common to output JSON objects after each other is logging, where the stream looks like { ... }{ ... }{ ... }. If a machine would like to read that log, it'd need to be able to frame individual objects in one way or the other.

Solution

This frame package contains on a high-level a Reader and Writer that perform framing in either direction, respectively. They provide an easy-to-use interface for interacting with the underlying io.Reader or io.Writer stream (which can be a file, a HTTP request, a log, etc.).

The content package provides a means to automatically trace read/write requests and carry metadata about the source or the target byte stream.

The tracing package has higher-level utilities for tracing the frame/sanitation/content streams, or the library more generally.

The sanitize package allows for keeping YAML comments automatically, formatting JSON/YAML prettily, keeping sequence indentation as-is, and such with the goal of minimizing the textual diff when we've written back to Git.

Features and Design Goals

  • Crucial production-grade features such as limiting the byte count and the number of frames are included out of the box.
  • Thread-safe accesses to the underlying resource is automatically built-in.
  • Frames are automatically sanitized, and empty frames are skipped.
  • Traces are automatically created when reading/writing a frame and closing the stream and exported using OpenTelemetry to e.g. Jaeger. Errors during read/write/close are automatically propagated to Jaeger as well, and some metadata.
  • context.Context is used everywhere.
  • Using the functional Option pattern one can add more options and features over time without breaking the API surface.
  • If the maximum number of frames is just one, there's support for using any content type (think HCL, TOML or similar), if no framing is needed. In that case, all of the other listed features here are included "for free".
  • Sanitation: Comments can be copied over automatically, YAML/JSON can be pretty-printed in a standard way, and YAML indentation of lists can be automatically kept or force-formatted.

This PR also contains pkg/tracing which contains some higher-level tools for working with OpenTracing. For example, FuncTracer allows a function that is passed in to be automatically instrumented with a trace span, and then have its error reported to Jaeger or an other backend automatically. There is also a constructor of a TracerProvider that follows the builder-pattern, so setting up tracing for your application is as seamless/easy as possible. When this PR is merged, I'll also make a converter shim which automatically converts traces into logs to stdout or some other sink, without the app developer having to write any extra logging code.

Copy link

@bigkevmcd bigkevmcd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of cases where the comments don't really add much.

	// Catch if fn == nil
	if fn == nil {
		return &traceFuncResult{MakeFuncNotSuppliedError("FuncTracer.TraceFunc"), span}
	}

if fw.ContentType() != ContentTypeYAML {
logrus.Debugf("Asked to preserve comments, but ContentType is not YAML, so ignoring")
if fw.FramingType() != frame.FramingTypeYAML {
logrus.Debugf("Asked to preserve comments, but frame.FramingType is not YAML, so ignoring")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should accept a simple logging interface rather than imposing logrus on users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that'll be great 👍. logr is nice. After this PR is merged, I'll do an upgrade of pkg/serializer/*.go with those kinds of changes.

// context in a normalized way.
func MakeUnsupportedFramingTypeError(ct FramingType) error {
return fmt.Errorf("%w: %q", ErrUnsupportedFramingType, ct)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return an UnsupportedFramingType error with the FramingType as a field something like https://pkg.go.dev/io/fs#PathError ?

There's no way to extract the FramingType from the error without parsing the string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch; I'll turn the non-trivial errors into structs, thanks 👍

//
// Another way of defining a "frame" is that it MUST contain exactly one decodable object.
// This means that no empty (i.e. len(frame) == 0) frames shall be returned. Note: The decodable
// object might represent a list object (e.g. as Kubernetes' v1.List); more generally something

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many frames are in this document?

---
testing: value
---
---
another: test

Is this a valid file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a valid file, and it contains two non-empty frames that would be returned. There is also one empty frame that would automatically be ignored and never returned. I'll make an example of that in the godoc.

func sanitizeYAMLFrame(frame []byte) []byte {
prevLen := len(frame)
for {
// Trim spaces

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't add anything to the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback; I'll review my comments and make sure they describe "why" not "what". Indeed, some unnecessary "what" comments have slipped in in a fair amount of places.

func (r *singleReader) ReadFrame(context.Context) ([]byte, error) {
// Only allow reading once
if !r.hasBeenRead {
// Read the whole frame from the underlying io.Reader, up to a given amount

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just repeating what io.ReadAll says?


func (w *singleWriter) WriteFrame(_ context.Context, frame []byte) error {
// Only allow writing once
if !w.hasBeenWritten {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this code easier to read so that the comment isn't necessary?

spanName = "CloseNoop"
}
return fnTracer.TraceFunc(ctx, spanName, func(ctx context.Context, _ trace.Span) error {
// Don't close if a closer wasn't given

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could probably simplify this code and remove the comments by inverting the logic.

func ReadFrameList(ctx context.Context, fr Reader) (FrameList, error) {
var frameList FrameList
for {
// Read until we get io.EOF or an error

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could we simplify this code to allow the removal of this comment?

frame, err := fr.ReadFrame(ctx)
if errors.Is(err, io.EOF) {
break
} else if err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the else here, you've already executed the break if it was an io.EOF error?

https://gist.github.com/adamveld12/c0d9f0d5f0e1fba1e551#indent-error-flow

// If an error occurs, writing stops and the error is returned.
func WriteFrameList(ctx context.Context, fw Writer, frameList FrameList) error {
// Loop all frames in the list, and write them individually to the Writer
for _, frame := range frameList {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what we could do to simplify this code so that it didn't need a comment?


// 2 generic Reader constructors

func NewSingleReader(ct stream.ContentType, r stream.Reader, opts ...SingleReaderOption) Reader {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should have doc comments https://golang.org/doc/effective_go#commentary

Every exported (capitalized) name in a program should have a doc comment.

There are a lot of cases in this codebase.

// 1 single, 3 YAML and 1 recognizing stream.Reader helper constructors

/*func FromSingleBuffer(ct stream.ContentType, buf *bytes.Buffer, opts ...SingleReaderOption) Reader {
return NewSingleReader(ct, stream.FromBuffer(buf), opts...)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this come out?

// FrameCountOverflowError is returned when a Reader or Writer would process more
// frames than allowed.
type FrameCountOverflowError struct {
// +optional

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what optional means in this case?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants