remove_sensitive_information(text_data, output_file=None, info_type=[]): Remove sensitive information from text data.
detect_hate_speech(text): Detect hate speech or offensive language in a text string.
remove_hate_speech(text_data): Remove hate speech or offensive language in text data using AI.
Detect profanity
Detect profanity in text using regex.
Args:
    text_data (list): A list of strings representing the text data to analyze.
    language (str): The language used to detect profanity. Defaults to 'English'. Available languages include: All, Arabic, Czech, Danish, German, English, Esperanto, Persian, Finnish, Filipino, French, French (CA), Hindi, Hungarian, Italian, Japanese, Kabyle, Korean, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Thai, Klingon, Turkish, Chinese.

Returns:
    list: A list of dictionaries, one per detected instance of profanity. Each dictionary contains the following keys:
        - "Line" (int): The line number where the profanity was detected.
        - "Column" (int): The column number (position in the line) where the profanity starts.
        - "Word" (str): The detected profanity word.
        - "Language" (str): The language in which the profanity was detected.
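To make the return structure concrete, here is a minimal regex-based sketch that produces dictionaries with the documented keys. It is not the library's implementation: the function name `detect_profanity_sketch`, the `SAMPLE_WORDS` list, and the choice of 1-based line and column numbers are all assumptions made for illustration.

```python
import re

# Hypothetical word list for illustration only; the library's real lists are not shown here.
SAMPLE_WORDS = {"English": ["darn", "heck"]}


def detect_profanity_sketch(text_data, language="English"):
    """Return one dict per match, using the documented keys."""
    words = SAMPLE_WORDS.get(language, [])
    if not words:
        return []
    # Whole-word, case-insensitive match on any listed word.
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, words)) + r")\b", re.IGNORECASE
    )
    results = []
    # Assumption: line and column numbers are 1-based.
    for line_no, line in enumerate(text_data, start=1):
        for match in pattern.finditer(line):
            results.append({
                "Line": line_no,
                "Column": match.start() + 1,
                "Word": match.group(0),
                "Language": language,
            })
    return results


hits = detect_profanity_sketch(["Well, heck.", "Nothing here", "Darn it, darn!"])
```

Each entry in `hits` records where a word was found and keeps the matched text in its original casing.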
Remove profanity
Remove profanity from text data.
Args:
    text_data (list): A list of strings representing the text data to clean.
    output_file (str): The file path to write the cleaned data. If None, cleaned data is not written to a file.
    language (str): The language for which to remove profanity. Defaults to 'English'. Available languages include: All, Arabic, Czech, Danish, German, English, Esperanto, Persian, Finnish, Filipino, French, French (CA), Hindi, Hungarian, Italian, Japanese, Kabyle, Korean, Dutch, Norwegian, Polish, Portuguese, Russian, Swedish, Thai, Klingon, Turkish, Chinese.

Returns:
    list: A list of strings representing the cleaned text data.
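The cleaning step can be sketched along the same lines. The example below is an illustration under stated assumptions, not the library's code: `remove_profanity_sketch` and `SAMPLE_WORDS` are hypothetical names, and masking each match with asterisks is only one possible meaning of "remove" (the documentation above does not say whether words are masked or deleted).

```python
import re

# Hypothetical word list for illustration only.
SAMPLE_WORDS = ["darn", "heck"]


def remove_profanity_sketch(text_data, output_file=None, words=SAMPLE_WORDS):
    """Mask listed words in each string and optionally write the result to a file."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, words)) + r")\b", re.IGNORECASE
    )
    # Assumption: each match is replaced by asterisks of the same length.
    cleaned = [
        pattern.sub(lambda m: "*" * len(m.group(0)), line) for line in text_data
    ]
    # Mirror the documented behaviour: write only when a path is given.
    if output_file is not None:
        with open(output_file, "w", encoding="utf-8") as fh:
            fh.write("\n".join(cleaned))
    return cleaned


cleaned = remove_profanity_sketch(["Well, heck.", "Darn it!"])
```

Keeping the mask the same length as the original word preserves column positions, which can help when comparing cleaned text against detection results.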
Detect sensitive information
Detect sensitive information in text data.
Args:
    text_data (list of str): A list of strings representing the text data to be analyzed.
    info_type (str or list of str, optional): One or more types of sensitive info to detect. Available types are: "email", "phone", "credit_card", "ssn", "id", "address", "ip", "iban", "mrn", "icd10", "geo_coords", "username", "file_path", "bitcoin_wallet", "ethereum_wallet". Uses all info types by default.

Returns:
    list of tuple: A list of tuples containing detected sensitive information, each tuple representing (line number, column index, type, value).
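The tuple layout is easier to read with a small example. The sketch below covers only two of the listed types, with deliberately simplified patterns; `detect_sensitive_sketch` and `PATTERNS` are hypothetical names, and the 1-based line number with a 0-based column index is an assumption rather than documented behaviour.

```python
import re

# Simplified, illustrative patterns for two of the documented types.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def detect_sensitive_sketch(text_data, info_type=None):
    """Return (line number, column index, type, value) tuples."""
    # Accept a single type, a list of types, or None for all types.
    if info_type is None:
        selected = list(PATTERNS)
    elif isinstance(info_type, str):
        selected = [info_type]
    else:
        selected = list(info_type)
    found = []
    # Assumption: 1-based line numbers, 0-based column indices.
    for line_no, line in enumerate(text_data, start=1):
        for name in selected:
            for match in PATTERNS[name].finditer(line):
                found.append((line_no, match.start(), name, match.group(0)))
    return found


rows = detect_sensitive_sketch(["mail bob@example.com", "host 10.0.0.1 is up"])
only_ip = detect_sensitive_sketch(["host 10.0.0.1 is up"], info_type="ip")
```

Passing a single string for `info_type` restricts the scan to that one type, mirroring the "str or list of str" parameter described above.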
Remove sensitive information
Remove sensitive information from text data.
Args:
    text_data (list of str): A list of strings representing the text data to be cleaned.
    output_file (str, optional): Path to the output file where cleaned data will be saved.
    info_type (str or list of str, optional): One or more types of sensitive info to detect and remove. Available types are: "email", "phone", "credit_card", "ssn", "id", "address", "ip", "iban", "mrn", "icd10", "geo_coords", "username", "file_path", "bitcoin_wallet", "ethereum_wallet". Uses all info types by default.

Returns:
    list of str: A list of strings representing the cleaned text data.
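A removal pass might look roughly like the following. This is a sketch under assumptions, not the library's implementation: `remove_sensitive_sketch`, `PATTERNS`, and the `placeholder` parameter are hypothetical, the patterns are simplified, and substituting a `[REDACTED]` marker is just one way to interpret "remove".

```python
import re

# Simplified, illustrative patterns for two of the documented types.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def remove_sensitive_sketch(text_data, output_file=None, info_type=None,
                            placeholder="[REDACTED]"):
    """Replace matches of the selected types and optionally save the result."""
    if info_type is None:
        selected = list(PATTERNS)
    elif isinstance(info_type, str):
        selected = [info_type]
    else:
        selected = list(info_type)
    cleaned = []
    for line in text_data:
        # Assumption: each match is swapped for a fixed placeholder string.
        for name in selected:
            line = PATTERNS[name].sub(placeholder, line)
        cleaned.append(line)
    if output_file is not None:
        with open(output_file, "w", encoding="utf-8") as fh:
            fh.write("\n".join(cleaned))
    return cleaned


redacted = remove_sensitive_sketch(["mail bob@example.com", "host 10.0.0.1 is up"])
```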
Detect hate speech or offensive language
Detect offensive language or hate speech in the provided text string, using an AI model.
Args:
    text (str): A string representing the text data to be used for hate speech detection and offensive language detection.

Returns:
    list of str: A list of strings representing the outcome of the detection.
Remove hate speech or offensive language
Remove offensive language or hate speech from the provided list of text data, using an AI model.
Args:
    text_data (list of str): A list of strings representing the text data to be cleaned of hate speech and offensive language.

Returns:
    list of str: A list of strings representing the cleaned text data.
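The model itself is not described here, so the sketch below only illustrates the general shape of a classifier-driven filter. Everything in it is an assumption: `remove_flagged_sketch` and `toy_classifier` are hypothetical names, the keyword check stands in for a trained model, and dropping whole entries is one possible cleaning strategy rather than confirmed behaviour.

```python
def remove_flagged_sketch(text_data, is_offensive):
    """Keep only the entries that the classifier does not flag."""
    return [entry for entry in text_data if not is_offensive(entry)]


# Stand-in classifier for demonstration; a real one would call a trained model.
def toy_classifier(text):
    return "idiot" in text.lower()


kept = remove_flagged_sketch(["Have a nice day", "You idiot"], toy_classifier)
```

Passing the classifier in as a callable keeps the filtering logic separate from whichever model performs the detection.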