gosh-darnit

A fast, efficient Go library for profanity detection and censorship.

Features

- Fast: Uses the Aho-Corasick algorithm to match all patterns in a single pass
- Smart word boundaries: Prevents false positives like "bass", "analyst", "assist", "Scunthorpe"
- Evasion resistant: Handles common obfuscation techniques:
  - Leetspeak: @ss, sh1t, fvck, a$$
  - Unicode homoglyphs: Cyrillic, Greek, fullwidth characters
  - Zero-width characters: U+200B, U+200C, U+200D, U+FEFF
  - Repeated characters: fuuuuck, shiiiit
  - NFKC Unicode normalization
- Flexible censoring: Multiple modes for replacing profanity
- Minimal dependencies: Only the Go standard library and golang.org/x/text
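
For readers unfamiliar with Aho-Corasick, the following is a minimal, byte-oriented sketch of the general technique: build a trie of the patterns, add failure links breadth-first, then scan the text once. It is not the code in aho_corasick.go; the names `node`, `build`, `search`, and `match` are made up for this illustration, and the patterns are the classic textbook set rather than the library's word list.

```go
package main

import "fmt"

// node is one state of the automaton (hypothetical type for this sketch).
type node struct {
	next map[byte]*node // trie transitions
	fail *node          // state for the longest proper suffix that is also a pattern prefix
	out  []string       // patterns that end at this state
}

func newNode() *node { return &node{next: map[byte]*node{}} }

// build inserts every pattern into a trie, then fills in failure links
// breadth-first so that shallower states are finished before deeper ones.
func build(patterns []string) *node {
	root := newNode()
	for _, p := range patterns {
		cur := root
		for i := 0; i < len(p); i++ {
			child, ok := cur.next[p[i]]
			if !ok {
				child = newNode()
				cur.next[p[i]] = child
			}
			cur = child
		}
		cur.out = append(cur.out, p)
	}

	var queue []*node
	for _, child := range root.next {
		child.fail = root
		queue = append(queue, child)
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for c, child := range cur.next {
			// Walk the parent's failure chain until a state with a c-transition is found.
			f := cur.fail
			for f != nil && f.next[c] == nil {
				f = f.fail
			}
			if f == nil {
				child.fail = root
			} else {
				child.fail = f.next[c]
			}
			// Inherit patterns that end at the suffix state.
			child.out = append(child.out, child.fail.out...)
			queue = append(queue, child)
		}
	}
	return root
}

// match records a pattern and the byte offset just past its last byte.
type match struct {
	word string
	end  int
}

// search reports every pattern occurrence while reading text exactly once.
func search(root *node, text string) []match {
	var found []match
	cur := root
	for i := 0; i < len(text); i++ {
		c := text[i]
		for cur != root && cur.next[c] == nil {
			cur = cur.fail
		}
		if nxt := cur.next[c]; nxt != nil {
			cur = nxt
		}
		for _, w := range cur.out {
			found = append(found, match{w, i + 1})
		}
	}
	return found
}

func main() {
	root := build([]string{"he", "she", "his", "hers"})
	for _, m := range search(root, "ushers") {
		fmt.Println(m.word, m.end)
	}
	// Prints:
	// she 4
	// he 4
	// hers 6
}
```

The point of the failure links is that the scan never backs up in the text: the cost is proportional to the text length plus the number of matches, regardless of how many patterns are loaded.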

Installation

go get github.com/geoherna/gosh-darnit

Usage

Basic Detection

package main

import (
	"fmt"

	goshdarnit "github.com/geoherna/gosh-darnit"
)

func main() {
	// Check if text contains profanity
	if goshdarnit.IsProfane("What the fuck?") {
		fmt.Println("Profanity detected!")
	}

	// Find which words matched
	words := goshdarnit.FindProfanity("This is some shit")
	fmt.Println("Found:", words) // Found: [shit]
}

Censoring

package main

import (
	"fmt"

	goshdarnit "github.com/geoherna/gosh-darnit"
)

func main() {
	text := "What the fuck is this shit?"

	// Replace all characters with asterisks
	fmt.Println(goshdarnit.Censor(text, goshdarnit.CensorAll))
	// Output: What the **** is this ****?

	// Keep first character visible
	fmt.Println(goshdarnit.Censor(text, goshdarnit.CensorKeepFirst))
	// Output: What the f*** is this s***?

	// Keep first and last characters visible
	fmt.Println(goshdarnit.Censor(text, goshdarnit.CensorKeepFirstLast))
	// Output: What the f**k is this s**t?
}

Evasion Detection

The library automatically handles common evasion techniques:

// Leetspeak
goshdarnit.IsProfane("@ss")      // true (@ -> a)
goshdarnit.IsProfane("sh1t")     // true (1 -> i)
goshdarnit.IsProfane("fvck")     // true (v -> u)
goshdarnit.IsProfane("a$$")      // true ($ -> s)

// Repeated characters
goshdarnit.IsProfane("fuuuuck")  // true
goshdarnit.IsProfane("shiiiit")  // true

// Unicode homoglyphs (Cyrillic 'а' looks like Latin 'a')
goshdarnit.IsProfane("аss")      // true
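
To give a rough idea of how this kind of normalization can work, here is a standard-library-only sketch. It is an illustration under assumptions, not the contents of normalize.go: the `substitutions` table is a small made-up subset, the function names are hypothetical, and NFKC folding (which needs golang.org/x/text) is left out. Repeated characters are handled by comparing run lengths, so a word with a genuine double letter still requires at least two of that letter.

```go
package main

import (
	"fmt"
	"strings"
)

// substitutions is an illustrative subset of look-alike mappings
// (assumed for this sketch, not the library's actual table).
var substitutions = map[rune]rune{
	'@': 'a', '$': 's', '1': 'i', '0': 'o', '3': 'e',
	'\u0430': 'a', // Cyrillic small a
}

// zeroWidth holds the invisible code points named in the feature list.
var zeroWidth = map[rune]bool{
	'\u200B': true, '\u200C': true, '\u200D': true, '\uFEFF': true,
}

// normalize lowercases s, drops zero-width characters, and applies the
// substitution table rune by rune.
func normalize(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if zeroWidth[r] {
			continue
		}
		if mapped, ok := substitutions[r]; ok {
			r = mapped
		}
		b.WriteRune(r)
	}
	return b.String()
}

// run is one rune and the number of times it repeats consecutively.
type run struct {
	r rune
	n int
}

// runs run-length encodes s.
func runs(s string) []run {
	var out []run
	for _, r := range s {
		if len(out) > 0 && out[len(out)-1].r == r {
			out[len(out)-1].n++
		} else {
			out = append(out, run{r, 1})
		}
	}
	return out
}

// matchesStretched reports whether token equals word once extra repeated
// characters are ignored: same runs in the same order, each at least as long.
func matchesStretched(token, word string) bool {
	t, w := runs(token), runs(word)
	if len(t) != len(w) {
		return false
	}
	for i := range w {
		if t[i].r != w[i].r || t[i].n < w[i].n {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(normalize("H3LL"))                  // hell
	fmt.Println(normalize("he\u200Bll"))            // hell
	fmt.Println(matchesStretched("heeell", "hell")) // true
	fmt.Println(matchesStretched("hel", "hell"))    // false
}
```

Comparing run lengths instead of collapsing every run to a single character avoids turning a short word with a double letter into an even shorter, much more common string.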

False Positive Prevention

Word boundary detection prevents common false positives:

goshdarnit.IsProfane("The bass is great")     // false
goshdarnit.IsProfane("She's an analyst")      // false
goshdarnit.IsProfane("I need to assist you")  // false
goshdarnit.IsProfane("Scunthorpe is a town")  // false
goshdarnit.IsProfane("The shitake mushrooms") // false
goshdarnit.IsProfane("Assess the situation")  // false
goshdarnit.IsProfane("Classic movie")         // false
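
The boundary idea can be sketched in a few lines: a candidate match only counts when the characters on either side of it are not letters or digits. The helper below is a hypothetical illustration using plain substring search, and it assumes `word` is already lowercase; the library's own rules may differ (for example in how it treats punctuation or digits).

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWordRune reports whether r can be part of a word.
func isWordRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r)
}

// containsWord reports whether word occurs in text as a whole word, meaning
// it is not immediately preceded or followed by a letter or digit.
func containsWord(text, word string) bool {
	if word == "" {
		return false
	}
	text = strings.ToLower(text)
	for from := 0; from < len(text); {
		i := strings.Index(text[from:], word)
		if i < 0 {
			return false
		}
		start := from + i
		end := start + len(word)
		// At the edges of the text these return utf8.RuneError,
		// which is neither a letter nor a digit.
		before, _ := utf8.DecodeLastRuneInString(text[:start])
		after, _ := utf8.DecodeRuneInString(text[end:])
		if !isWordRune(before) && !isWordRune(after) {
			return true
		}
		from = start + 1 // keep looking for a later occurrence
	}
	return false
}

func main() {
	fmt.Println(containsWord("The bass is great", "ass"))    // false
	fmt.Println(containsWord("I need to assist you", "ass")) // false
	fmt.Println(containsWord("Don't be an ass.", "ass"))     // true
}
```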

API Reference

Functions

Function	Description
`IsProfane(text string) bool`	Returns true if text contains profanity
`ContainsProfanity(text string) bool`	Alias for `IsProfane`
`Censor(text string, mode CensorMode) string`	Replaces profanity with asterisks
`CensorWithDefault(text string) string`	Censors with `CensorAll` mode
`FindProfanity(text string) []string`	Returns list of matched profane words

Censor Modes

Mode	Example	Description
`CensorAll`	`****`	Replace all characters
`CensorKeepFirst`	`f***`	Keep first character visible
`CensorKeepFirstLast`	`f**k`	Keep first and last characters visible
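
As an illustration of what the three modes do to a single matched word, here is a small rune-based masking sketch. `maskMode` and `mask` are hypothetical names, not the library's API, and the fallback for very short words (mask everything when the mode would hide nothing) is an assumption made for this example; the README does not say how the library treats them.

```go
package main

import "fmt"

// maskMode mirrors the three documented modes (hypothetical type).
type maskMode int

const (
	maskAll maskMode = iota
	maskKeepFirst
	maskKeepFirstLast
)

// mask replaces runes of word with asterisks according to mode.
// It works on runes so multi-byte characters count as one position.
func mask(word string, mode maskMode) string {
	r := []rune(word)
	from, to := 0, len(r) // half-open range of runes to hide
	switch mode {
	case maskKeepFirst:
		from = 1
	case maskKeepFirstLast:
		from, to = 1, len(r)-1
	}
	// Assumption for this sketch: if the mode would hide nothing,
	// hide the whole word instead of leaving it readable.
	if to-from < 1 {
		from, to = 0, len(r)
	}
	for i := from; i < to; i++ {
		r[i] = '*'
	}
	return string(r)
}

func main() {
	fmt.Println(mask("darn", maskAll))           // ****
	fmt.Println(mask("darn", maskKeepFirst))     // d***
	fmt.Println(mask("darn", maskKeepFirstLast)) // d**n
	fmt.Println(mask("ab", maskKeepFirstLast))   // **
}
```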

Performance

Benchmarks on Apple M4 Max:

Benchmark	Time	Allocations
CleanShort	~766ns	8 allocs
ProfaneShort	~839ns	9 allocs
Leetspeak	~847ns	9 allocs
RepeatedChars	~1.0µs	11 allocs
MixedText	~2.5µs	14 allocs

Run benchmarks yourself:

go test -bench=. -benchmem

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Special Shoutouts

Huge thanks to John Kim for the graphic design assets used in this project.

Note on content

This software contains a list of profanities, slurs, and other offensive terms solely for the purpose of detecting and filtering harmful language in user-generated content. These terms are included for harm-reduction, research, and moderation purposes only. Their presence in the source code does not constitute endorsement or promotion of such language by the authors.

License

MIT License - see LICENSE for details.
