ValX Functions

Available functions:

  • detect_profanity(text_data, language="English"): Detect profanity in text using regex.

  • remove_profanity(text_data, output_file=None, language="English"): Remove profanity from text data.

  • detect_sensitive_information(text_data): Detect sensitive information in text data.

  • remove_sensitive_information(text_data, output_file=None): Remove sensitive information from text data.

  • detect_hate_speech(text): Detect hate speech or offensive language in a text string using AI.

  • remove_hate_speech(text_data): Remove hate speech or offensive language from text data using AI.
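
All of the functions above are assumed here to be top-level exports of the valx package; a minimal import sketch:

    from valx import (
        detect_profanity,
        remove_profanity,
        detect_sensitive_information,
        remove_sensitive_information,
        detect_hate_speech,
        remove_hate_speech,
    )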


Detect profanity

Detect profanity in text using regex.

Args:
        - text_data (list): A list of strings representing the text data to analyze.
        - language (str): The language used to detect profanity. Defaults to 'English'. Available languages include: All, Arabic, Czech, Danish, German, English, Esperanto, Persian, Finnish, Filipino, French, French (CA), Hindi, Hungarian, Italian, Japanese, Kabyle, Korean, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Thai, Klingon, Turkish, Chinese.

Returns:
        - list: A list of dictionaries, where each dictionary represents a detected instance of profanity and contains the following keys:
                - "Line" (int): The line number where the profanity was detected.
                - "Column" (int): The column number (position in the line) where the profanity starts.
                - "Word" (str): The detected profanity word.
                - "Language" (str): The language in which the profanity was detected.

Remove profanity

Remove profanity from text data.

Args:
        - text_data (list): A list of strings representing the text data to clean.
        - output_file (str): The file path to write the cleaned data. If None, cleaned data is not written to a file.
        - language (str): The language for which to remove profanity. Defaults to 'English'. Available languages include: All, Arabic, Czech, Danish, German, English, Esperanto, Persian, Finnish, Filipino, French, French (CA), Hindi, Hungarian, Italian, Japanese, Kabyle, Korean, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Thai, Klingon, Turkish, Chinese.

Returns:
        - list: A list of strings representing the cleaned text data.
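
A minimal usage sketch; the file path is hypothetical, and output_file can be omitted to clean in memory only:

    from valx import remove_profanity

    text_data = [
        "One line of text to clean.",
        "Another line of text.",
    ]

    # Clean in memory; nothing is written to disk when output_file is None.
    cleaned = remove_profanity(text_data, language="English")
    print(cleaned)

    # Also write the cleaned lines to a file (hypothetical path).
    remove_profanity(text_data, output_file="cleaned.txt", language="English")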

Detect sensitive information

Detect sensitive information in text data.

Args:
        - text_data (list of str): A list of strings representing the text data to be analyzed.

Returns:
        - list of tuple: A list of tuples containing detected sensitive information, each tuple representing (line number, column index, type, value).
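
A minimal usage sketch; the sample strings are illustrative, and the exact categories reported depend on the library's detection patterns:

    from valx import detect_sensitive_information

    text_data = [
        "Contact me at jane.doe@example.com.",
        "Call 555-123-4567 after noon.",
    ]

    # Each tuple is (line number, column index, type, value), per the docs above.
    for line_no, column, info_type, value in detect_sensitive_information(text_data):
        print(f"Line {line_no}, col {column}: {info_type} -> {value}")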

Remove sensitive information

Remove sensitive information from text data.

Args:
        - text_data (list of str): A list of strings representing the text data to be cleaned.
        - output_file (str, optional): Path to the output file where cleaned data will be saved. If None, cleaned data is not written to a file.

Returns:
        - list of str: A list of strings representing the cleaned text data.
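
A minimal usage sketch; output_file is optional, and the path shown is hypothetical:

    from valx import remove_sensitive_information

    text_data = [
        "Email: jane.doe@example.com",
        "No sensitive data on this line.",
    ]

    # Returns the cleaned lines and, because output_file is given, also saves them.
    cleaned = remove_sensitive_information(text_data, output_file="redacted.txt")
    print(cleaned)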

Detect hate speech or offensive language

Detect offensive language or hate speech in the provided text string using an AI model.

Args:
        - text (str): A string representing the text to analyze for hate speech or offensive language.
        
Returns:
        - list of str: A list of strings representing the outcome of the detection.
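
A minimal usage sketch; note that this function takes a single string rather than a list:

    from valx import detect_hate_speech

    # The returned list of strings describes the detection outcome.
    result = detect_hate_speech("An example sentence to classify.")
    print(result)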

Remove hate speech or offensive language

Remove offensive language or hate speech from the provided text data using an AI model.

Args:
        - text_data (list of str): A list of strings representing the text data to be cleaned.

Returns:
        - list of str: A list of strings representing the cleaned text data.
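
A minimal usage sketch, assuming text_data is a list of strings as in the other remove_* functions above:

    from valx import remove_hate_speech

    text_data = [
        "First example sentence.",
        "Second example sentence.",
    ]

    # Returns the text data with flagged content removed.
    cleaned = remove_hate_speech(text_data)
    print(cleaned)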
