# ValX Functions

**Available functions:**

* [`detect_profanity`](#detect-profanity)`(text_data, language="English", custom_words_list=None)`: Detect profanity in text using regex.
* [`remove_profanity`](#remove-profanity)`(text_data, output_file=None, language="English", custom_words_list=None)`: Remove profanity from text data.
* [`detect_sensitive_information`](#detect-sensitive-information)`(text_data, info_type=[])`: Detect sensitive information in text data.
* [`remove_sensitive_information`](#remove-sensitive-information)`(text_data, output_file=None, info_type=[])`: Remove sensitive information from text data.
* [`detect_hate_speech`](#detect-hate-speech-or-offensive-language)`(text)`: Detect hate speech or offensive language in a text string.
* [`remove_hate_speech`](#remove-hate-speech-or-offensive-language)`(text_data)`: Remove hate speech or offensive language in text data using AI.
* [`load_custom_profanity_from_file`](#load-custom-profanity-from-file)`(filepath)`: Load a custom profanity word list from a text file.

***

### Detect profanity

Detect profanity in text using regex.

{% code overflow="wrap" %}

```python
Args:
        text_data (list): A list of strings representing the text data to analyze.
        language (str, optional): The language used to detect profanity. Defaults to 'English'. Available languages include: All, Arabic, AR, Czech, CS, Danish, DA, German, DE, English, EN, Esperanto, EO, Persian, Finnish, FI, Filipino, FIL, French, FR, French (CA), FR-CA-U-SD-CAQC, Hindi, HI, Hungarian, HU, Italian, IT, Japanese, JA, Kabyle, KAB, Korean, KO, Dutch, NL, Norwegian, NO, Polish, PL, Portuguese, PT, Russian, RU, Spanish, ES, Swedish, SV, Thai, TH, Klingon, TLH, Turkish, TR, Chinese, ZH. If set to `None` and `custom_words_list` is provided, only the custom list will be used.
        custom_words_list (list[str], optional): A Python list of custom profanity words to detect. Defaults to `None`. If provided, these words will be used in addition to the selected language's wordlist, or exclusively if `language` is `None`.

Returns:
        list: A list of dictionaries where each dictionary represents a detected instance of profanity.
        Each dictionary contains the following keys:
        - "Line" (int): The line number where the profanity was detected.
        - "Column" (int): The column number (position in the line) where the profanity starts.
        - "Word" (str): The detected profanity word.
        - "Language" (str): Indicates the source of the profanity detection (e.g., "English", "Custom", or "Custom + English" if a custom list is combined with a language).

Raises:
        ValueError: If `language` is set to `None` and `custom_words_list` is not provided or is empty.
```

{% endcode %}
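
To make the documented return shape concrete, here is a minimal sketch of a regex-based detector that emits dictionaries with the keys listed above. It is not ValX's implementation: the function name `sketch_detect_profanity`, the placeholder word list, and the choice of 1-based line and column numbers are assumptions made purely for illustration.

```python
import re

def sketch_detect_profanity(text_data, words, language="Custom"):
    """Illustrative stand-in, not ValX's implementation."""
    if not words:
        raise ValueError("A non-empty word list is required.")
    # Whole-word, case-insensitive match over every word in the list.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    results = []
    # Assumption: lines and columns are reported 1-based.
    for line_no, line in enumerate(text_data, start=1):
        for match in pattern.finditer(line):
            results.append({
                "Line": line_no,
                "Column": match.start() + 1,
                "Word": match.group(0),
                "Language": language,
            })
    return results

# Placeholder words; ValX ships its own per-language lists.
hits = sketch_detect_profanity(["a clean line", "some darn text"], ["darn", "heck"])
print(hits)
```

Each entry in `hits` can be read by key, for example `hits[0]["Word"]`, which is how a result from the real function would presumably be consumed as well.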

### Remove profanity

Remove profanity from text data.

```python
Args:
        text_data (list): A list of strings representing the text data to clean.
        output_file (str, optional): The file path to write the cleaned data. If None, cleaned data is not written to a file. Defaults to `None`.
        language (str, optional): The language for which to remove profanity. Defaults to 'English'. Available languages include: All, Arabic, Czech, Danish, German, English, Esperanto, Persian, Finnish, Filipino, French, French (CA), Hindi, Hungarian, Italian, Japanese, Kabyle, Korean, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Thai, Klingon, Turkish, Chinese. If set to `None` and `custom_words_list` is provided, only the custom list will be used.
        custom_words_list (list[str], optional): A Python list of custom profanity words to remove. Defaults to `None`. If provided, these words will be used in addition to the selected language's wordlist, or exclusively if `language` is `None`.

Returns:
        list: A list of strings representing the cleaned text data.

Raises:
        ValueError: If `language` is set to `None` and `custom_words_list` is not provided or is empty (as this function internally calls `load_profanity_words`).
```
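
The interaction between `language` and `custom_words_list` described above can be sketched as a small selection routine. This is only an illustration of the documented rules, not the library's code: `sketch_resolve_words` and the `builtin_lists` dictionary are hypothetical names, and the de-duplication step is an assumption.

```python
def sketch_resolve_words(language, custom_words_list, builtin_lists):
    """Illustrative stand-in for the documented word-list selection rules."""
    custom = list(custom_words_list or [])
    if language is None:
        if not custom:
            # Mirrors the documented ValueError case.
            raise ValueError("Provide custom_words_list when language is None.")
        return custom  # custom list used exclusively
    # Language list first, then custom words that are not already present.
    words = list(builtin_lists[language])
    words.extend(w for w in custom if w not in words)
    return words

# Hypothetical placeholder data; ValX ships its own word lists.
builtin = {"English": ["darn", "heck"]}
combined = sketch_resolve_words("English", ["blast", "heck"], builtin)
custom_only = sketch_resolve_words(None, ["blast"], builtin)
print(combined, custom_only)
```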

### Detect sensitive information

Detect sensitive information in text data.

{% code overflow="wrap" %}

```python
Args:
        text_data (list of str): A list of strings representing the text data to be analyzed.
        info_type (str or list of str, optional): One or more types of sensitive info to detect. Available types are: "email", "phone", "credit_card", "ssn", "id", "address", "ip", "iban", "mrn", "icd10", "geo_coords", "username", "file_path", "bitcoin_wallet", "ethereum_wallet". Uses all info types by default.

Returns:
        list of tuple: A list of tuples containing detected sensitive information, each tuple representing (line number, column index, type, value).
```

{% endcode %}
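
As an illustration of the `(line number, column index, type, value)` tuples described above, the following sketch scans text with two deliberately simple patterns. The function `sketch_detect_sensitive`, the regular expressions, and the indexing convention (1-based lines, 0-based columns) are assumptions; ValX's own patterns and conventions may differ.

```python
import re

# Deliberately simple patterns for illustration only.
SKETCH_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def sketch_detect_sensitive(text_data, info_type=None):
    """Illustrative stand-in, not ValX's implementation."""
    # Accept a single type name, a list of names, or nothing (all types).
    if not info_type:
        types = list(SKETCH_PATTERNS)
    elif isinstance(info_type, str):
        types = [info_type]
    else:
        types = list(info_type)
    found = []
    # Assumption: line numbers are 1-based and column indexes are 0-based.
    for line_no, line in enumerate(text_data, start=1):
        for name in types:
            for match in SKETCH_PATTERNS[name].finditer(line):
                found.append((line_no, match.start(), name, match.group(0)))
    return found

sample = ["mail bob@example.com", "host 10.0.0.1"]
emails_only = sketch_detect_sensitive(sample, "email")
everything = sketch_detect_sensitive(sample)
print(emails_only, everything)
```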

### Remove sensitive information

Remove sensitive information from text data.

{% code overflow="wrap" %}

```python
Args:
        text_data (list of str): A list of strings representing the text data to be cleaned.
        output_file (str, optional): Path to the output file where the cleaned data will be saved. Defaults to `None`.
        info_type (str or list of str, optional): One or more types of sensitive info to detect and remove. Available types are: "email", "phone", "credit_card", "ssn", "id", "address", "ip", "iban", "mrn", "icd10", "geo_coords", "username", "file_path", "bitcoin_wallet", "ethereum_wallet". Uses all info types by default.

Returns:
        list of str: A list of strings representing the cleaned text data.
```

{% endcode %}

### Load custom profanity from file

Loads a custom list of profanity words from a text file.

The file should contain one profanity word per line. Lines starting with a hash symbol (#) are treated as comments and are ignored. Empty lines or lines containing only whitespace are also ignored.

```python
Args:
        filepath (str): The path to the text file containing profanity words.

Returns:
        list: A list of profanity words loaded from the file.
```
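
The file rules above (one word per line, `#` comments and blank lines ignored) can be sketched as follows. This is a stand-in written for illustration, not the library's loader: `sketch_load_words` is a hypothetical name, and stripping surrounding whitespace before checking for `#` is an assumption.

```python
import os
import tempfile

def sketch_load_words(filepath):
    """Illustrative stand-in that follows the file rules described above."""
    words = []
    with open(filepath, encoding="utf-8") as fh:
        for raw in fh:
            # Assumption: surrounding whitespace is stripped first.
            line = raw.strip()
            # Skip blank lines and comment lines starting with '#'.
            if not line or line.startswith("#"):
                continue
            words.append(line)
    return words

# Write a small sample file, load it, then clean up.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("# custom list\ndarn\n\n   \nheck\n")
loaded = sketch_load_words(tmp.name)
os.remove(tmp.name)
print(loaded)
```

A list loaded this way is the kind of value the `custom_words_list` parameter of the profanity functions is documented to accept.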

### Detect hate speech or offensive language

Detect offensive language or hate speech in the provided text string, using an AI model.

<pre class="language-python" data-overflow="wrap"><code class="lang-python">Args:
        text (str): The text string to check for hate speech or offensive language.

Returns:
        list of str: A list of strings representing the outcome of the detection.
</code></pre>

### Remove hate speech or offensive language

Remove offensive language or hate speech from the provided list of text strings, using an AI model.

```python
Args:
        text_data (list of str): A list of strings representing the text data to be cleaned of hate speech and offensive language.

Returns:
        list of str: A list of strings representing the cleaned text data.
```


