| Developed by | 1634550495@qq.com |
| --- | --- |
| Date of development | Feb 15, 2024 |
| Validator type | Format |
| Blog | |
| License | Apache 2 |
| Input/Output | Output |
This validator uses a BERT model to check a string for toxic language. It is intended to ensure that output generated by an LLM does not contain toxic statements.
Dependencies:
Foundation model access keys:
$ guardrails hub install hub://guardrails/bert_toxic
In this example, we apply the validator to a string output generated by an LLM.
# Import Guard and Validator
from guardrails.hub import BertToxic
from guardrails import Guard
# Setup Guard
guard = Guard().use(
    BertToxic(threshold=0.5, validation_method="sentence")
)
guard.validate("This is a harmless statement.") # Validator passes
guard.validate("I want to kill a man. How are you doing today?") # Validator fixes the output by removing the toxic sentence
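To make the roles of threshold and validation_method more concrete, here is a minimal, self-contained sketch of sentence-level versus full-text checking. It does not use the validator or its BERT model: `toy_toxicity_score`, `split_sentences`, `find_toxic`, and `remove_toxic_sentences` are hypothetical helpers, the keyword scorer is a stand-in for a real classifier, and treating a score strictly above the threshold as toxic is an assumption rather than documented behavior.

```python
import re
from typing import Callable, List

# Hypothetical stand-in for the BERT classifier: returns a score in [0, 1].
TOXIC_WORDS = {"kill", "hate"}


def toy_toxicity_score(text: str) -> float:
    """Return 1.0 if any flagged keyword appears in the text, else 0.0."""
    words = re.findall(r"[a-z']+", text.lower())
    return 1.0 if any(word in TOXIC_WORDS for word in words) else 0.0


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def find_toxic(
    text: str,
    threshold: float = 0.5,
    validation_method: str = "sentence",
    score: Callable[[str], float] = toy_toxicity_score,
) -> List[str]:
    """Return the units (sentences or the full text) scored above threshold."""
    if validation_method == "sentence":
        units = split_sentences(text)
    elif validation_method == "full":
        units = [text]
    else:
        raise ValueError("validation_method must be 'sentence' or 'full'")
    # Assumption: a unit is toxic when its score is strictly above threshold.
    return [unit for unit in units if score(unit) > threshold]


def remove_toxic_sentences(
    text: str,
    threshold: float = 0.5,
    score: Callable[[str], float] = toy_toxicity_score,
) -> str:
    """Sketch of a 'fix' step: keep only sentences at or below threshold."""
    kept = [s for s in split_sentences(text) if score(s) <= threshold]
    return " ".join(kept)


sample = "I want to kill a man. How are you doing today?"
flagged = find_toxic(sample)
cleaned = remove_toxic_sentences(sample)
```

In this sketch, "sentence" mode flags only the offending sentence, so a fix step can drop it and keep the rest, while "full" mode treats the whole string as a single unit and flags all of it.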
__init__(self, threshold=0.5, validation_method="sentence", on_fail=None)
Initializes a new instance of the BertToxic class.
Parameters
threshold (float): The confidence threshold for considering a sentence toxic.
validation_method (str): Method of validation, either 'sentence' or 'full'.
on_fail (str, Callable): The policy to enact when a validator fails. If str, must be one of reask, fix, filter, refrain, noop, exception or fix_reask. Otherwise, must be a function that is called when the validator fails.
validate(self, value, metadata) -> ValidationResult
Validates the given value using the rules defined in this validator, relying on the metadata provided to customize the validation process. This method is automatically invoked by guard.parse(...), ensuring the validation logic is applied to the input data.
Note:
Do not call this method directly. Instead, call guard.parse(...), where this method will be called internally for each associated Validator.
When calling guard.parse(...), pass a metadata dictionary that includes the keys and values required by this validator. If guard is associated with multiple validators, combine all necessary metadata into a single dictionary.
Parameters
value (Any): The input value to validate.
metadata (dict): A dictionary containing metadata required for validation. Keys and values must match the expectations of this validator.
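The note about combining metadata for multiple validators can be sketched in plain Python. This is only an illustration: `combine_metadata` is a hypothetical helper, not part of the library, and the keys shown are made up. This page does not state which metadata keys, if any, this validator expects.

```python
from typing import Any, Dict, Iterable


def combine_metadata(parts: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-validator metadata dicts into one, rejecting conflicts."""
    combined: Dict[str, Any] = {}
    for part in parts:
        for key, value in part.items():
            # Two validators supplying different values for the same key
            # is treated as an error instead of silently overwriting.
            if key in combined and combined[key] != value:
                raise ValueError(f"conflicting metadata for key {key!r}")
            combined[key] = value
    return combined


# Hypothetical metadata for two validators attached to the same guard.
first_validator_meta = {"context_label": "support-chat"}
second_validator_meta = {"max_length": 200}

metadata = combine_metadata([first_validator_meta, second_validator_meta])
```

The resulting dictionary is what would be passed as the metadata when calling guard.parse(...). Raising on conflicting keys is a design choice for this sketch, so that one validator's settings are not silently replaced by another's.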