danielsidev/ai-gemini-guard-rails

Guard Rails on Gemini

This repository contains examples of guard rails for Gemini, written in Go.

Gemini Flash Go

To run them, you need a Gemini API key exported as an environment variable:

export GEMINI_API_KEY=my_gemini_api_key

Below are some ways to create guard rails.

The examples live in four folders, each with its own main.go file.

After exporting your API key, change into the folder of the example you want and run:

go run main.go

The result is printed to the terminal.

With Prompt Engineering

We can create a guard rail with a second prompt that wraps the message and asks the model to evaluate it:

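// Assumed import path: the source does not show it, but the types used
// below match github.com/google/generative-ai-go/genai.
import (
	"context"
	"errors"
	"fmt"
	"strings"

	"github.com/google/generative-ai-go/genai"
)

// validateWithPrompt asks the model to classify text.
// The model is instructed to return "OK" or "BIASED".
func validateWithPrompt(ctx context.Context, model *genai.GenerativeModel, text string) (string, error) {
	validationPrompt := fmt.Sprintf(`
	Analyze the following text to determine whether it has a clear political bias or promotes misinformation.
	Respond only with 'OK' if the text is neutral and informative, or with 'BIASED' if it is biased or promotes misinformation.
	Do not add explanations.

	Text for analysis:
	"%s"
	`, text)

	resp, err := model.GenerateContent(ctx, genai.Text(validationPrompt))
	if err != nil {
		// %w keeps the original error available to errors.Is and errors.As.
		return "", fmt.Errorf("content validation failed: %w", err)
	}

	// A blocked or empty response may carry no candidates or no content.
	if len(resp.Candidates) == 0 || resp.Candidates[0].Content == nil {
		return "", errors.New("no validation results returned")
	}

	// Concatenate the text parts of the first candidate.
	var validationResult string
	for _, part := range resp.Candidates[0].Content.Parts {
		if txt, ok := part.(genai.Text); ok {
			validationResult += string(txt)
		}
	}

	return strings.TrimSpace(strings.ToUpper(validationResult)), nil
}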

With a Simple Filter

We can create a filter that replaces or removes parts of a message.

In this case, if a message contains an email address or a phone number, we replace it with a tag:

import "regexp"

// Patterns are compiled once at package level instead of on every call.
var (
	// Simple email pattern; it can be improved to cover more cases.
	emailRegexp = regexp.MustCompile(`\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`)

	// Phone numbers such as (XX) XXXXX-XXXX, XX XXXX-XXXX, or XXXX-XXXX.
	// The area code is optional and the opening parenthesis is part of the match.
	phoneRegexp = regexp.MustCompile(`(?:\(\d{2}\)\s?|\b\d{2}\s?|\b)\d{4,5}-?\d{4}\b`)
)

// sanitizeResponse acts as our custom "guard rail".
func sanitizeResponse(text string) string {
	text = emailRegexp.ReplaceAllString(text, "[EMAIL REMOVED]")
	text = phoneRegexp.ReplaceAllString(text, "[PHONE REMOVED]")

	// You can add more filters here, such as personal or company documents, addresses, etc.
	// For example, a Brazilian personal document called CPF:
	// cpfRegexp := regexp.MustCompile(`\d{3}\.?\d{3}\.?\d{3}-?\d{2}`)
	// text = cpfRegexp.ReplaceAllString(text, "[CPF REMOVED]")

	return text
}
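As the number of filters grows, one option is to keep them in a table of rules applied in order. The sketch below is an illustration under assumptions, not the repository's code: `rule`, `rules`, and `redact` are hypothetical names, and the patterns are simple ones that will miss some formats. The CPF rule runs before the phone rule because an unformatted 11-digit CPF could otherwise be read as a phone number.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a compiled pattern with the tag that replaces each match.
type rule struct {
	pattern *regexp.Regexp
	tag     string
}

// rules are applied in order; order matters when patterns can overlap.
var rules = []rule{
	{regexp.MustCompile(`\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`), "[EMAIL REMOVED]"},
	{regexp.MustCompile(`\b\d{3}\.?\d{3}\.?\d{3}-?\d{2}\b`), "[CPF REMOVED]"},
	{regexp.MustCompile(`(?:\(\d{2}\)\s?|\b\d{2}\s?|\b)\d{4,5}-?\d{4}\b`), "[PHONE REMOVED]"},
}

// redact applies every rule to text and returns the sanitized result.
func redact(text string) string {
	for _, r := range rules {
		text = r.pattern.ReplaceAllString(text, r.tag)
	}
	return text
}

func main() {
	fmt.Println(redact("Email ana@example.com, CPF 123.456.789-09, phone (11) 91234-5678"))
}
```

Adding a filter then means adding one entry to the table, without touching the function that applies it.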

With Default and Custom Safety Settings

Categories

  • HARM_CATEGORY_HARASSMENT

    • Negative or offensive comments, intimidation, bullying.
  • HARM_CATEGORY_HATE_SPEECH

    • Hate speech: comments that attack or discriminate based on protected characteristics.
  • HARM_CATEGORY_SEXUALLY_EXPLICIT

    • Sexual content, explicit references.
  • HARM_CATEGORY_DANGEROUS_CONTENT

    • Content that encourages or facilitates dangerous activities, violence, or behaviors that may cause harm.
  • HARM_CATEGORY_TOXICITY

    • Content that is rude, disrespectful, or profane.

These are only some of the categories; there are others.

Threshold

  • BLOCK_NONE

    • No blocking for that category; content is not blocked regardless of the likelihood of harm.
  • BLOCK_ONLY_HIGH

    • Blocks only if the likelihood of harm is high (HIGH). Allows content rated medium or low.
  • BLOCK_MEDIUM_AND_ABOVE

    • Blocks if the likelihood is medium or high (MEDIUM or HIGH).
  • BLOCK_LOW_AND_ABOVE

    • Blocks if the likelihood is low, medium, or high; that is, any probability that is not "negligible".
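The threshold rules above can be restated as a small decision function. This is only an illustrative sketch of the semantics described in the list; the types and the `shouldBlock` helper are hypothetical and are not the SDK's own types or implementation.

```go
package main

import "fmt"

// Probability models the harm likelihood levels named in the list above.
type Probability int

const (
	Negligible Probability = iota
	Low
	Medium
	High
)

// Threshold models the four block thresholds listed above.
type Threshold int

const (
	BlockNone Threshold = iota
	BlockOnlyHigh
	BlockMediumAndAbove
	BlockLowAndAbove
)

// shouldBlock reports whether content with likelihood p is blocked under t.
func shouldBlock(t Threshold, p Probability) bool {
	switch t {
	case BlockOnlyHigh:
		return p >= High
	case BlockMediumAndAbove:
		return p >= Medium
	case BlockLowAndAbove:
		return p >= Low
	default: // BlockNone: never block.
		return false
	}
}

func main() {
	fmt.Println(shouldBlock(BlockMediumAndAbove, Low))  // false
	fmt.Println(shouldBlock(BlockMediumAndAbove, High)) // true
}
```

Under every threshold, content rated negligible passes; the thresholds differ only in how low a likelihood is enough to block.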
