
tagged provisioners started with --key do not pick up jobs tagged by coder_workspace data source #15047


Closed
bpmct opened this issue Oct 12, 2024 · 11 comments
Assignees
Labels
must-do Issues that must be completed by the end of the Sprint. Or else. Only humans may set this. s1 Bugs that break core workflows. Only humans may set this.

Comments

@bpmct
Member

bpmct commented Oct 12, 2024

Steps to reproduce

  1. Create a key for a tagged provisioner in a separate org

    coder provisioner keys create my-key \
    --org data-platform \
    --tag environment=kubernetes
  2. Start the provisioner with the key

    coder provisioner start --key <the-generated-key-value>
  3. Observe that the provisioner does not properly report that it is starting with the tags

    [info]  note: untagged provisioners can only pick up jobs from untagged templates
    [info]  provisioner key auth automatically sets tag scope empty
    [info]  starting provisioner daemon  tags={}  name=bens-m2

    🐛 This is bug 1. There is a second one too

  4. Modify an existing template to add tags to the template
    (screenshot: adding the tag in the template settings UI)

  5. Observe that this actually works fine and is assigned to the proper provisioner ✅

  6. Remove the manual tag from the template and switch to the coder_workspace_tags data source (example template)

    data "coder_workspace_tags" "custom_workspace_tags" {
      tags = {
        "environment"        = "kubernetes"
      }
    }
    (screen recording: Screen.Recording.2024-10-12.at.8.07.11.AM.mov)
  7. Notice that this does not assign the job to the proper provisioner. If you have a generic provisioner with no tags, the job may be assigned to that one, but it is definitely not attached to the provisioner we just started.

    🐛 This is bug number 2.

  8. Notice that starting an un-tagged provisioner will allow this build to complete

     coder login # as user with Owner role
     coder provisioner start --org <your-org>
  9. Observe that this also applies to workspaces started with templates with coder_workspace_tags. The tags seem to do nothing

To summarize

  • The coder_workspace_tags data source (example template) is designed to dynamically assign provisioner tags to a template
  • The way templates/workspaces are assigned to specific provisioners is documented here
  • 🐛 This does not seem to work when the provisioner is started with --key. Instead, the jobs seem to apply as if no tags are set, both for the workspace build and template version push
  • 🐛 When starting a provisioner with a tagged key, it does not report the tags it was started with; the log instead shows starting provisioner daemon tags={}
@coder-labeler coder-labeler bot added bug need-help Assign this label prompts an engineer to check the issue. Only humans may set this. labels Oct 12, 2024
@bpmct bpmct added the s1 Bugs that break core workflows. Only humans may set this. label Oct 12, 2024
@bpmct
Member Author

bpmct commented Oct 12, 2024

@stirby Let's look out for this one and do a patch once completed

@johnstcn johnstcn self-assigned this Oct 14, 2024
@johnstcn
Member

@dannykopping let's sync up on this!

@f0ssel
Contributor

f0ssel commented Oct 14, 2024

So unfortunately this is actually "expected behavior" for provisioners and tags, and is a limitation of how the terraform datasources are imported. What's happening:

  1. When you edit the untagged template in the UI it directly adds tags to the provisioner job - this works fine.
  2. When you add the workspace_tags datasource we must run a provisioner job to first read the HCL - this job is untagged because it hasn't yet parsed the tags from the HCL. Because this job is untagged, it won't be picked up by the tagged provisioner. Workspaces created by this template will be properly tagged and picked up, but the actual template import job is untagged.

A workaround would be to run an untagged provisioner to handle the original template import job, where the template transitions from untagged to tagged. After the template is tagged, future changes to the template and workspaces created from it should be picked up correctly. The other option would be to tag the template version in the UI first, then add the workspace_tags datasource.

This seems to come down to an unexpected behavior with the workspace_tags data source. There's a chicken and egg problem where we need a provisioner to read the HCL to know what the tags are, but the tags aren't applied until the next job.

This is obviously confusing to the end user and should be improved, but there's not an obvious answer to solve this with our current provisioner and template architecture. Some ideas:

  1. Parse the HCL manually for tags - we would be breaking the "provisioner barrier" to do so and would require reading each file and manually parsing (not relying on terraform plan and state file parsing).
  2. Allow any provisioner to pick up template import jobs - @dannykopping had a good observation that provisioner tags are really for provisioning workspaces, and end users might not even expect template import jobs to be constrained by tags in the first place. This will break the barrier tags provide for provisioners for template import jobs, but maybe that is OK since it's not a workspace and doesn't produce any output (besides DB data).
  3. We could possibly do something with terraform graph to parse the HCL without a provisioner job - @dannykopping had some ideas here and can add more info if we want to go down this route.
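
To spell out the chicken-and-egg problem, the lifecycle can be annotated on the repro template itself (a sketch, reusing the tag values from the steps above; the comments are a reading of the job lifecycle described in this thread, not Coder output):

```hcl
# Job 1: template import. This job has to *parse* the block below before the
# tags are known, so it is queued untagged and can only be picked up by an
# untagged provisioner.
#
# Job 2: workspace build. By this point the template has been parsed, so the
# build job is queued with environment=kubernetes and is picked up by the
# tagged provisioner as expected.

data "coder_workspace_tags" "custom_workspace_tags" {
  tags = {
    "environment" = "kubernetes"
  }
}
```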

@johnstcn johnstcn added needs-rfc Issues that needs an RFC due to an expansive scope and unclear implementation path. and removed need-help Assign this label prompts an engineer to check the issue. Only humans may set this. labels Oct 14, 2024
@f0ssel
Contributor

f0ssel commented Oct 14, 2024

Some more clarifying information:

  1. coder_workspace_tags only affects the workspaces created with the template, not the template itself. That HCL block never flows back up to the template; to apply tags to a template import job you must use the UI flow rather than the HCL. This is most likely a cause of confusion for end customers.
  2. We currently do introspect terraform files to read tags before workspace builds - feat: evaluate provisioner tags #13333. Unfortunately this behavior is actually broken, because we can't evaluate terraform in coderd without breaking the provisioner barrier and possibly evaluating the terraform in the "wrong place". For instance:
  • You have a template with a data source for a kubernetes cluster that is only reachable by an external provisioner
  • You use that data source to populate a tag that dictates where the workspace runs
  • We evaluate the data source to get the tag value in coderd, but coderd doesn't have access to the kubernetes cluster, leading to an error or, even worse, an incorrect value.
  • We either fail to run the workspace build job, or we place the workspace in the incorrect environment because we queried the data from coderd instead of the external provisioner.
  3. We do this introspection in coderd today because otherwise we run into the same chicken-and-egg problem this issue outlines. The introspection we do today in coderd is essentially a hack, and we should not continue doing it now that we know it can lead to incorrect behavior.

So this behavior has actually been incorrect for ~5 months, but we just haven't had customers hit these edge cases, or they have and haven't noticed because nothing is visibly broken; things may just have been provisioned incorrectly.
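
The failure mode in the bullets above can be sketched as follows (the data source and attribute names are hypothetical, for illustration only):

```hcl
# A cluster data source that only an external provisioner can reach.
# If coderd tried to evaluate this to compute the tag value, it would
# either error out or read the wrong data.
data "aws_eks_cluster" "prod" {
  name = "prod-cluster"
}

data "coder_workspace_tags" "custom_workspace_tags" {
  tags = {
    # Resolvable by the external provisioner, but not by coderd.
    "environment" = data.aws_eks_cluster.prod.tags["env"]
  }
}
```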

@bpmct
Member Author

bpmct commented Oct 14, 2024

Thanks for the write-up. I think it would be totally acceptable for template versions to provision without these tags. However, I am assuming we also need to solve this problem for workspace builds too?

@bpmct
Member Author

bpmct commented Oct 14, 2024

I am noticing that even when the template is pushed, the workspace build gets assigned the wrong tags too... Not just a template thing

@f0ssel
Contributor

f0ssel commented Oct 14, 2024

So the answer to the original bug report of "I run a provisioner with a key with tags and it isn't picking up this template that specifies tags in the HCL" is: that HCL doesn't apply to template import jobs. For template import jobs you would essentially need a pool of untagged provisioners, or you would need to add the tags to the template versions themselves outside of the HCL. This is why we just ran into issues with this in orgs: new orgs do not have built-in provisioners that would quietly run the template import job and mask the fact that it is not running on the provisioner the user would expect.

For workspace builds, we basically should not be introspecting terraform values from coderd, but it's currently the only way to support using coder parameters in the coder_workspace_tags data source in HCL.

@bpmct
Member Author

bpmct commented Oct 14, 2024

So does this work as expected for workspace builds (assigning to proper provisioner tags)? This did not happen in my testing

@bpmct
Member Author

bpmct commented Oct 15, 2024

Edit: I confirmed it works as expected with workspace builds

@spikecurtis
Contributor

It was a deliberate design decision to define coder_workspace_tags within the template so that it could be used with coder_parameters. Since the tags must be chosen before the provisioner is selected, the HCL needs to be parsed and evaluated on coderd. The evaluation needs to be based only on constants and coder_parameters data: stuff we know a priori before the terraform provisioner is called. That is, the example of using a Kubernetes data source to set the tags is not allowed and needs to generate an error at template import.

It's definitely a bug that we don't process coder_workspace_tags for template import jobs. A full fix would have us directly inspect the HCL on coderd to find the coder_workspace_tags and the coder_parameters, then resolve the tags based on the parameters passed to the import job. A cheaper fix would be to allow the template author to directly set the tags for an import job, but it puts more of a burden on them.

This does double-down on HCL and terraform, rather than being entirely agnostic to the provisioner, but I think that's the price we pay to allow template authors to define tags "in-line" with coder_parameters in the template files.


The only alternative I can see is a full redesign of the feature that deprecates coder_workspace_tags in the template and moves dynamic tags to a top-level property of the template version: something we would have to set via the REST API, not in the files of the template itself. It would mean building a separate configuration language to define what prompts to show the end user and how to translate those prompts into tags (i.e. a parallel system of parameters).
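
Under the design described above, a tag would have to be evaluable from constants and coder_parameter values alone, e.g. (a sketch, not a final design):

```hcl
# Allowed under the proposed rule: the tag depends only on a coder_parameter,
# which coderd knows before any provisioner runs.
data "coder_parameter" "cluster" {
  name    = "cluster"
  type    = "string"
  default = "kubernetes"
}

data "coder_workspace_tags" "custom_workspace_tags" {
  tags = {
    "environment" = data.coder_parameter.cluster.value
  }
}
```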

johnstcn added a commit that referenced this issue Oct 16, 2024
@johnstcn johnstcn removed the needs-rfc Issues that needs an RFC due to an expansive scope and unclear implementation path. label Oct 16, 2024
@bpmct bpmct added the must-do Issues that must be completed by the end of the Sprint. Or else. Only humans may set this. label Oct 17, 2024
@bpmct
Member Author

bpmct commented Nov 7, 2024

We're going to do the more expensive fix.

It's definitely a bug that we don't process coder_workspace_tags for template import jobs. A full fix would have us directly inspect the HCL on coderd to find the coder_workspace_tags and the coder_parameters, then resolve the tags based on the parameters passed to the import job.

Closing in favor of #15427, which tracks this

@bpmct bpmct closed this as not planned Won't fix, can't repro, duplicate, stale Nov 7, 2024