YACAM

A bot designed to detect and handle (dumb) spam on jschan imageboards. Due to its nature, both false positives and false negatives are possible. If a user posts like a spammer (e.g., writes l*i*k*e t*h*i*s while sharing URLs), they will be treated as one. Read the "How it works" section to understand these and other limitations, or use the provided script to test whether a string or a post is considered spam.

Note that the current implementation should not be considered "production ready". Although it works and has been running for a few months at the time of writing, it was initially implemented as a proof of concept.

How it works

In short, the bot listens for new posts or threads through the global management socket. Posts and threads are handled similarly, and the terms are used interchangeably.

New posts are evaluated through a set of rules designed to reduce false positives. Posts without files, without a message, with a capcode, with a geo-flag from a whitelisted country, and without URLs are ignored.
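The pre-filter above can be sketched as follows. This is only an illustration, not the bot's actual code: it reads the list as "any one of these conditions is enough to ignore the post", which is an assumption, and the Post class and its field names are hypothetical rather than the jschan post schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Post:
    # Hypothetical, simplified post shape (not the jschan schema).
    message: str = ""
    files: list = field(default_factory=list)
    capcode: Optional[str] = None
    country_code: Optional[str] = None
    urls: list = field(default_factory=list)


def is_ignored(post: Post, whitelisted_countries: set) -> bool:
    """Return True if the post is skipped before any spam detection runs."""
    return (
        not post.files                                   # no files attached
        or not post.message.strip()                      # no message
        or post.capcode is not None                      # posted with a capcode
        or post.country_code in whitelisted_countries    # whitelisted geo-flag
        or not post.urls                                 # no URLs in the post
    )
```

Under this reading, only posts that have files, a message, and at least one URL, and that carry neither a capcode nor a whitelisted geo-flag, reach the detection step.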

If the post is not considered safe at this point, the message (excluding URLs) is evaluated. Currently, two modes of "spam detection" are supported: threshold and entries.
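As a rough sketch of what "excluding URLs" could look like, the snippet below removes URLs from a message before it is evaluated. The pattern is a deliberately simple assumption (anything starting with http:// or https:// up to the next whitespace) and the function name is hypothetical; the bot may recognise URLs differently.

```python
import re

# Assumed URL pattern: scheme followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def strip_urls(message: str) -> str:
    """Return the message with every matched URL removed."""
    return URL_PATTERN.sub("", message)
```

Removing URLs first keeps characters that are common in links (such as # or $) from being counted as obfuscation tokens.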

Threshold mode

In threshold mode, the bot detects spam messages by comparing the ratio of token count to message size against a threshold. A token is a character typically used by spammers to obfuscate a message (e.g., * or #). If the ratio exceeds the threshold, the post is flagged as spam.
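A minimal sketch of this check, assuming that "message size" means the number of characters in the message. The function names, the default token set, and the default threshold of 0.3 are illustrative assumptions, not values taken from the bot's configuration.

```python
def token_ratio(message: str, tokens: str = "*#$") -> float:
    """Fraction of characters in the message that are tokens."""
    if not message:
        return 0.0
    token_count = sum(1 for ch in message if ch in tokens)
    return token_count / len(message)


def is_spam_by_threshold(message: str, tokens: str = "*#$", threshold: float = 0.3) -> bool:
    """Flag the message when the token ratio exceeds the threshold."""
    return token_ratio(message, tokens) > threshold
```

For example, l*i*k*e contains 3 tokens in 7 characters, a ratio of about 0.43, so it would be flagged with a threshold of 0.3.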

Entries mode

In entries mode, the bot counts the number of consecutive entries. An entry is defined as a word character (as defined by Python regular expressions) followed by a token (e.g., a*). For example, with *#$ tokens, the message a*a#b$ would have three consecutive entries.

If the count surpasses a configurable amount, the post is considered spam.
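The entries check can be sketched with a regular expression. This sketch assumes the count that matters is the longest run of back-to-back entries in the message; the function names and the default limit are hypothetical.

```python
import re


def max_consecutive_entries(message: str, tokens: str = "*#$") -> int:
    """Length of the longest run of consecutive entries (word character + token)."""
    if not tokens:
        return 0
    # Build a character class from the tokens, escaping regex metacharacters.
    token_class = "[" + re.escape(tokens) + "]"
    run_pattern = re.compile(r"(?:\w" + token_class + r")+")
    # Each entry is exactly two characters, so a run of length n holds n // 2 entries.
    return max((len(m.group(0)) // 2 for m in run_pattern.finditer(message)), default=0)


def is_spam_by_entries(message: str, tokens: str = "*#$", max_entries: int = 2) -> bool:
    """Flag the message when the longest run surpasses the allowed amount."""
    return max_consecutive_entries(message, tokens) > max_entries
```

With *#$ as tokens, max_consecutive_entries("a*a#b$") returns 3, matching the example above.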

Finally, if a post is considered spam, the configured moderation action is performed, and the post is saved as a JSON file (currently experimental and not thoroughly tested).

The moderation action is performed by the account configured in the .env file, and the username is hidden by default. The following moderation actions are supported: ban and delete. The ban action also deletes the post.

Before you run

For reasons explained in the "How it works" section, you will need to provide the bot with credentials for an account with global staff permissions. The account must have 2FA disabled, since 2FA is not supported. Ideally, the account should be a global moderator, as no other type of account has been tested.

1. (Optional) Create a new virtual environment with python -m venv path/to/venv, then activate it with source path/to/venv/bin/activate.
2. Install the dependencies with pip install -r requirements.txt.
3. Copy .env.example to .env and update its values.
4. Copy example.ini to config.ini and tweak it as needed.

Running

1. (Optional) Activate the virtual environment with source path/to/venv/bin/activate.
2. Run python yacam/yacam.py to start the bot.

Testing a string or a post against the rules

You can test a string or a post against the rules by running python yacam/is_spam.py <mode> <file_path>.

Use python yacam/is_spam.py --help for more information.

The mode can be either string or post, the latter requiring a JSON file with the post data. The JSON content must follow the format provided by the jschan API in the global management socket or .../recents.json endpoint.
