AI Assisted echomail moderation #203

@awehttam

This issue originally covered "multipass moderation", where a message held for moderation could be kicked back to the user and re-submitted for further review, in essence allowing a back-and-forth loop until the message was deemed suitable to post.

I'm changing this issue to AI Assisted moderation and including the bit about multipass moderation.

AI assisted moderation would be particularly helpful for catching new users posting unwanted messages into echomail (spammers, trolls, etc.). At the moment BinktermPHP supports holding new user posts to networked areas, as well as manual registration review - which works fine for the moment.

An expanded AI assisted moderation tool could be configured to screen message posts for rogue content and automatically flag the message for moderator review. This would allow for more frictionless signups where new users are auto-validated and do not have a mandated review of their first posts. Simply put - they can get in right away, and take part in the conversation. The AI would screen their message.

I would be concerned about screening ALL posts, just due to API costs. We could keep the moderation threshold, or perhaps only moderate a user's first X posts with random sampling after that, or something like that.
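As a rough illustration of the "first X posts, then random sampling" idea, here is a minimal sketch. The function name, the default of 5 posts, and the 10% sample rate are all assumptions for illustration, not anything BinktermPHP currently implements.

```python
import random
from typing import Optional


def should_screen(prior_posts: int,
                  first_n: int = 5,
                  sample_rate: float = 0.1,
                  rng: Optional[random.Random] = None) -> bool:
    """Decide whether a post goes to the (paid) AI screen.

    prior_posts: number of posts the user has already made.
    first_n:     hypothetical threshold; every post below it is screened.
    sample_rate: hypothetical fraction of later posts screened at random.
    """
    if prior_posts < first_n:
        return True
    source = rng if rng is not None else random
    # random() returns a float in [0.0, 1.0), so a rate of 0.0 never
    # screens and a rate of 1.0 always screens.
    return source.random() < sample_rate
```

With these made-up defaults, a user's first five posts are always screened and roughly one in ten after that, which keeps API usage proportional to new-user activity rather than total traffic.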

It's also been suggested that, rather than having a hard pass/fail, a moderated message could be kicked back to the user for editing and refinement, multiple times if necessary.

This task could also be performed by AI.

From Paradigms Shifting:

A more complex but useful future feature would be some multipass systems. Here is what that looks like:

The user posts a new message, which is marked as pending. Local policy tools scan the post first. If they find any violations, they explain to the user what those are and instruct the user to make specific corrections. The user makes the corrections and saves the draft. The next pass is optional: an AI agent does a similar scan, but it is not allowed to take action. It reports its findings back to the non-AI policy tools. The policy tools check the AI's report against the policy itself to guard against AI hallucination. If the AI's report is internally consistent with no hallucinations, the report data is factored in. If hallucination is detected, the tools ignore the AI's feedback. Whether AI is used or not, the policy tools then check the user's edited draft. If it passes, the post is either approved or sent to a moderator, depending on configuration. If the post is still not right, it is sent to a human moderator for review. The moderator then types in feedback about what needs to be changed and why. If the user applies the changes correctly, the moderator approves the post. If the user refuses to make the necessary changes, the moderator rejects the post.

It's fine to start with the more absolute system you're building, because it's just easier to start there. As things evolve, however, please consider my suggestion for a multi-pass system that can optionally include an AI pass and local policy parsing tools.
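The multipass flow described above could be sketched roughly as follows. Everything here is hypothetical: the rule names, the `Finding` shape, and the consistency check (an AI finding must cite a known rule and quote text that really appears in the draft) are assumptions made for illustration. The sketch also simplifies the flow by factoring in the AI report whenever one is supplied.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    rule_id: str  # which policy rule is reported as violated
    excerpt: str  # offending text, quoted from the draft


def _find_links(text: str) -> List[str]:
    return [w for w in text.split() if w.startswith(("http://", "https://"))]


# Hypothetical local policy: rule id -> checker returning offending excerpts.
POLICY: Dict[str, Callable[[str], List[str]]] = {
    "no-links": _find_links,
    # A rule the local tools cannot detect; only the AI pass or a human can.
    "no-advertising": lambda text: [],
}


def local_scan(text: str) -> List[Finding]:
    return [Finding(rule, ex) for rule, check in POLICY.items() for ex in check(text)]


def ai_report_is_consistent(report: List[Finding], text: str) -> bool:
    # Reject the whole report if any finding cites an unknown rule or
    # quotes text that is not in the draft (treated as hallucination).
    return all(f.rule_id in POLICY and f.excerpt and f.excerpt in text for f in report)


def review(text: str, is_revision: bool,
           ai_report: Optional[List[Finding]] = None,
           moderator_on_pass: bool = False) -> Tuple[str, List[Finding]]:
    findings = local_scan(text)
    # The AI never acts directly; its report only adds findings if it validates.
    if ai_report is not None and ai_report_is_consistent(ai_report, text):
        findings += [f for f in ai_report if f not in findings]
    if not findings:
        return ("moderator" if moderator_on_pass else "approved", [])
    if is_revision:
        return ("moderator", findings)  # still not right: escalate to a human
    return ("returned_to_user", findings)  # first pass: explain and ask for edits
```

For example, `review("see https://x.example now", is_revision=False)` returns the draft to the user with the link finding attached, while the same text submitted as a revision is escalated to a moderator. An AI report quoting text that does not occur in the draft is discarded entirely.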
