refactor: use coder/slog + minor go style changes #107

Open
wants to merge 8 commits into
base: main

Conversation


@cstyan cstyan commented May 16, 2025

Changes are broken down into multiple commits to hopefully make reviewing easier: one commit for the slog change, then one commit per Go file for style changes.

Style changes are generally:

  • try to use full sentences for all comments
  • try to stick to 120 column lines (not strict) instead of 80
  • try to one-line as many "call a function, then check if err != nil" blocks as possible
  • stick var and const definitions near the top of the file
  • try to use err or errs for all error return names; the code previously used problems in some cases but errs in others
  • some minor optimizations, such as no longer declaring a new variable in each iteration of the line scanner loop
  • Todo -> TODO; it is sometimes also useful to write TODO (name): to make it easier to find things a specific author meant to follow up on
  • comments for types/functions should generally start with // FunctionName/TypeName ... though I'm now seeing places I didn't update that

In general there are very few tests for the Go code here. Would we like more, or is there some testing that spins up the entire registry to validate things? I didn't see any makefile.

@cstyan cstyan requested review from Parkreiner and bcpeinhardt May 16, 2025 19:16
Member

@Parkreiner Parkreiner left a comment

Just commenting for now, since it sounded like there might be some more changes you want to make, but I'm okay with making these changes

And to be clear, when I'm asking a question, that's not me trying to be defensive – I'm just trying to understand how big the gap is between my TypeScript way of doing things and how a Gopher usually does stuff


    lineScanner := bufio.NewScanner(strings.NewReader(trimmed))
    for lineScanner.Scan() {
        lineNum++
-       nextLine := lineScanner.Text()
+       nextLine = lineScanner.Text()
Member

@Parkreiner Parkreiner May 16, 2025

I'm not sure I understand the point of this change, since nextLine isn't ever used outside the loop. I'm not a fan of scope pollution, and try to keep scoping as aggressively small as possible, even in the same function

  1. Is this mostly a memory optimization?
  2. I feel like I see code all the time in Go that looks just like what we used to have. Especially with range loops. Does the below example have the same problems as the old approach, where we're declaring new block-scoped variables on the stack once per iteration?
for i, value := range exampleSlice {
  // Stuff
}

Is there an optimization that range loops have that doesn't exist with other loops?

Author

Yeah this is just an optimization to reduce memory allocations. Very minor in this case since I doubt this loop has a lot of iterations, but without this a new string for nextLine is allocated for each iteration of the loop.

The Go compiler already applies the same optimization itself to loops of the form for thing := range anotherThing, assigning to the same var for each iteration rather than allocating a new one every time.

Comment on lines +15 to +18
    var (
        err    error
        subDir os.FileInfo
    )
Member

Is this something that Go engineers typically do? I guess I just expected these parentheses declarations to be mainly used for declaring groups of related variables. Right now, the variables aren't directly related (aside from being scoped to the function), and take up more lines total now

Author

Personal preference in some cases. The convention is to group declarations either when the variables are logically related to each other, or to help with readability, such as when multiple variables are declared near each other and you want to avoid repeating the var keyword.

In this case, I wanted both to get their respective default values and allow for turning the previous lines 17-18 into one line.

The other option was to do:

    var subDir os.FileInfo
    var err error

}

errs := []error{}
Member

Question I've had for a bit: does it matter whether errs is defined with an allocation or as a nil slice, since we're not serializing it as JSON?

I know that Go recommends that you don't differentiate between a nil slice and an empty, allocated slice aside from JSON output, but aside from JSON, are there ever any times when you'd want to do an allocation for a slice that might stay empty?

Author

> Question I've had for a bit: does it matter whether errs is defined with an allocation or as a nil slice, since we're not serializing it as JSON?

Do you mean an empty slice, as opposed to a nil slice? Rather than doing slice := make(...)?

Usually I prefer doing make(...) with some specified length/capacity, since that allows for either starting out with a slice of the size you need, or at least of some reasonable size. Every call to append that exceeds the remaining capacity of the underlying memory reallocates the slice's storage with a larger capacity (roughly 2x the current capacity for small slices, with a smaller growth factor once the slice is large).

When we don't know what the final length might be and it is possible that it could be 0, using []error{} ensures we don't allocate any space for the item storage portion of the slice.

Even if we did errs := make([]error, 0, 10) the underlying item storage would still be allocated for 10 items.

In general it's best to avoid nil slices for return values, though they can be used for function parameters/optional values.

One important point to note is that you can append to a nil slice; it is treated as a zero-length empty slice on the first append.
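
The differences discussed in this thread can be seen in a small standalone sketch. It uses []string instead of []error so the JSON output is readable, and the describe helper is hypothetical, not part of the registry code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describe reports the properties that do, or do not, differ between slice values.
func describe(items []string) string {
	encoded, err := json.Marshal(items)
	if err != nil {
		return err.Error()
	}
	return fmt.Sprintf("nil=%t len=%d cap=%d json=%s", items == nil, len(items), cap(items), encoded)
}

func main() {
	var nilSlice []string          // No backing storage at all.
	emptySlice := []string{}       // Non-nil, zero length, zero capacity.
	sized := make([]string, 0, 10) // Non-nil, storage reserved for 10 items.

	fmt.Println(describe(nilSlice))   // nil=true len=0 cap=0 json=null
	fmt.Println(describe(emptySlice)) // nil=false len=0 cap=0 json=[]
	fmt.Println(describe(sized))      // nil=false len=0 cap=10 json=[]

	// Appending to a nil slice needs no special handling.
	nilSlice = append(nilSlice, "first problem")
	fmt.Println(describe(nilSlice))
}
```

For code that only ranges over the slice, checks len, or appends to it, the nil and empty forms behave the same; the visible difference shows up in the explicit nil comparison and in the JSON encoding.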

Comment on lines +66 to +70
    var (
        userDirs []os.DirEntry
        err      error
    )
    if userDirs, err = os.ReadDir(rootRegistryPath); err != nil {
Member

Is the point of this change to help insulate the function from needing to worry about when variables are created as things get refactored over time?

I feel like we got a lot out of that := before, especially the type inference it gives

Author

Just to allow for the assignment and err check in one line, while still being able to use the value of userDirs later on in the function. I may have gone a bit overboard with this type of change in this PR 😂

> I feel like we got a lot out of that := before, especially the type inference it gives

True, though it's safe in this case since we're calling a function from the stdlib. The function signature for os.ReadDir will not change without a major version change in Go. If changes were required before a Go 2.0, a new function would be introduced rather than os.ReadDir being modified in a way that changed its return types.

Comment on lines 11 to 32
    const (
        rootRegistryPath = "./registry"
        fence = "---"

        // validationPhaseFileStructureValidation indicates when the entire Registry
        // directory is being verified for having all files be placed in the file
        // system as expected.
        validationPhaseFileStructureValidation validationPhase = "File structure validation"

-   var supportedAvatarFileFormats = []string{".png", ".jpeg", ".jpg", ".gif", ".svg"}
        // validationPhaseFileLoad indicates when README files are being read from
        // the file system.
        validationPhaseFileLoad = "Filesystem reading"

        // validationPhaseReadmeParsing indicates when a README's frontmatter is
        // being parsed as YAML. This phase does not include YAML validation.
        validationPhaseReadmeParsing = "README parsing"

-   // readme represents a single README file within the repo (usually within the
-   // top-level "/registry" directory).
        // validationPhaseAssetCrossReference indicates when a README's frontmatter
        // is having all its relative URLs be validated for whether they point to
        // valid resources.
        validationPhaseAssetCrossReference = "Cross-referencing relative asset URLs"
    )
Member

On the topic of using declarations to group related variables, I feel like I'd want two const declarations here – one for the phases, and one for everything else

Member

@Parkreiner Parkreiner May 16, 2025

Two other things:

  • I'm hesitant about referencing something above where it's defined if I can help it (in this case, defining a string in terms of validationPhase before the type is declared), even if it'll still work. I generally like being able to read the code top-to-bottom, even if that means that files don't follow a pattern of always defining constants first
  • The phases were previously defined as ints via iota. I'm just now realizing: with the current setup, only the first phase is defined strictly as validationPhase, right? Everything else in the series has the type untyped string?

Author

> On the topic of using declarations to group related variables, I feel like I'd want two const declarations here – one for the phases, and one for everything else

Can you elaborate as to why? It's idiomatic, but not required, to group together const/var definitions like this. We could separate the const definitions, or simply separate them within the same parens block with a comment.

> I'm hesitant about referencing something above where it's defined if I can help it (in this case, defining a string in terms of validationPhase before the type is declared), even if it'll still work. I generally like being able to read the code top-to-bottom, even if that means that files don't follow a pattern of always defining constants first

I agree with you, for sure. I missed that validationPhaseFileStructureValidation was a validationPhase. I'll see if there's another refactor we can make here to keep the top to bottom organization in place.

> The phases were previously defined as ints via iota. I'm just now realizing: with the current setup, only the first phase is defined strictly as validationPhase, right? Everything else in the series has the type untyped string?

Well, string type, not untyped, but yes. How important is it that the validationPhase type exists? Could we use iota again or have these all just be strings?
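
The typing question can be checked with a small standalone sketch. As far as the Go specification goes, a constant declared from a string literal with no explicit type is an untyped string constant, which takes the default type string whenever it needs a concrete type. The constant names below are shortened, hypothetical stand-ins for the ones in the diff, and typeName is a helper added only for the demonstration.

```go
package main

import "fmt"

// validationPhase mirrors the named string type discussed above.
type validationPhase string

const (
	// Only this constant is given the named type explicitly.
	phaseTyped validationPhase = "File structure validation"
	// Without an explicit type, this is an untyped string constant.
	phaseUntyped = "Filesystem reading"
)

// Giving every constant the type keeps the whole group consistent.
const (
	phaseFileLoad      validationPhase = "Filesystem reading"
	phaseReadmeParsing validationPhase = "README parsing"
)

// typeName returns the type a value has once it is stored in an interface.
func typeName(v any) string { return fmt.Sprintf("%T", v) }

func main() {
	fmt.Println(typeName(phaseTyped))   // main.validationPhase
	fmt.Println(typeName(phaseUntyped)) // string
	fmt.Println(typeName(phaseFileLoad))

	// An untyped constant still converts implicitly where the named type is expected.
	var phase validationPhase = phaseUntyped
	fmt.Println(phase)
}
```

Because untyped constants convert implicitly, code that passes them to functions expecting validationPhase keeps compiling, which is why the mismatch is easy to miss.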

@@ -318,19 +310,18 @@ func validateAllContributorFiles() error {
            return err
        }

-       log.Printf("Processing %d README files\n", len(allReadmeFiles))
+       logger.Info(context.Background(), "Processing README files", "num_files", len(allReadmeFiles))
Member

@Parkreiner Parkreiner May 16, 2025

I'm still new to structured logging. Is there any special behavior/benefit you get if you use the same key multiple times? I guess I'm just wondering how much of a concern it is to make sure you're using the same keys each time you describe the same "resource", particularly for a function call that takes a variadic slice of empty interfaces (so basically zero type-safety)

Author

It's not the end of the world if you don't use the same key, but it does make searching for logs in some kind of log aggregation system much easier.

For example, a system I used to work on referred to the same internal tenant type as variations of user, tenant, id, etc. Remembering which key was used on which logged lines complicated searches: I knew I needed to see info for tenant="1234", but on some lines the key was user="1234".

Again this is likely less important in the case of the registry but still a good practice.
