Package documentation for ValX, a Python package for text-cleaning tasks, including the detection and removal of profanity and personally identifiable information (PII). ValX now also includes AI-based detection of hate speech and offensive language.
Changelog
0.2.3 (Latest): Created new detection patterns for sensitive information, and created a new optional info_type parameter to control sensitive information detection and removal.
0.2.2: Refactored detect_profanity function to return more information about the found profanities. Also removed unnecessary printing in functions.
0.2.1: Updated project PyPI description.
0.2.0: Created a new function to automatically remove detected hate speech or offensive speech from text.
0.1.8 - 0.1.9: Updated docstrings.
0.1.7: Added AI models to ValX for hate speech detection.
0.1.1 - 0.1.6: Fixed errors in code, and created several functions for text cleaning.
0.1.0: Initial release.
Installation
You can install ValX from PyPI using pip. Make sure that you are using Python 3.6 or later before installing ValX:
pip install valx
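Because ValX requires Python 3.6 or later, it can help to confirm the interpreter version before installing. The snippet below is a minimal sketch of such a check. `meets_minimum_version` is a hypothetical helper written for this example and is not part of ValX.

```python
import sys

# Minimum version stated in the installation note above.
MINIMUM_PYTHON = (3, 6)


def meets_minimum_version(version=None, minimum=MINIMUM_PYTHON):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor) so version tuples of any length work.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum_version():
        print("This Python version is new enough to install ValX.")
    else:
        print("Python 3.6 or later is required to install ValX.")
```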
List of supported languages for profanity detection and removal
ValX's profanity detection and removal functions support the languages listed below. Each name is a valid value for the language parameter:
All
Arabic
Czech
Danish
German
English
Esperanto
Persian
Finnish
Filipino
French
French (CA)
Hindi
Hungarian
Italian
Japanese
Kabyle
Korean
Dutch
Norwegian
Polish
Portuguese
Russian
Swedish
Thai
Klingon
Turkish
Chinese
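If the language value comes from user input or a configuration file, it may be worth checking it against the list above before calling ValX. The following is a minimal sketch under that assumption. `SUPPORTED_LANGUAGES` and `normalize_language` are hypothetical names introduced for this example and are not part of ValX, and the case-insensitive matching is a convenience of the sketch, not documented ValX behavior.

```python
# Language names copied from the list above.
SUPPORTED_LANGUAGES = [
    "All", "Arabic", "Czech", "Danish", "German", "English", "Esperanto",
    "Persian", "Finnish", "Filipino", "French", "French (CA)", "Hindi",
    "Hungarian", "Italian", "Japanese", "Kabyle", "Korean", "Dutch",
    "Norwegian", "Polish", "Portuguese", "Russian", "Swedish", "Thai",
    "Klingon", "Turkish", "Chinese",
]


def normalize_language(name):
    """Return the listed spelling of a supported language, or raise ValueError."""
    # Map lower-cased names to the spelling used in the list.
    lookup = {language.lower(): language for language in SUPPORTED_LANGUAGES}
    key = name.strip().lower()
    if key not in lookup:
        raise ValueError(f"Unsupported language: {name!r}")
    return lookup[key]


if __name__ == "__main__":
    # The returned value could then be passed as the language argument.
    print(normalize_language("french (ca)"))
```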
Example Usage
Profanity Detection
from valx import detect_profanity

sample_text = [
    "This is a sample text containing some profanity like bad word 1, bad word 2, and bad word 3.",
    "This line doesn't contain any profanity.",
    "But this one has another, just in another language: bad word 4."
]

# Detect profanity
results = detect_profanity(sample_text, language='English')
print("Profanity Evaluation Results", results)
Profanity Removal
from valx import remove_profanity

sample_text = [
    "This is a sample text containing some profanity like bad word 1, bad word 2, and bad word 3.",
    "This line doesn't contain any profanity.",
    "But this one has another, just in another language: bad word 4."
]

# Remove profanity
removed = remove_profanity(sample_text, "text_cleaned.txt", language="English")
PII Detection
from valx import detect_sensitive_information

sample_text = [
    "Please contact john.doe@example.com or call 555-123-4567 for more information.",
    "We will need your credit card number to complete the transaction: 1234-5678-9012-3456.",
    "My social security number is 123-45-6789 and my ID number is AB123456.",
    "Our office address is 123 Main St, Anytown, USA. Please visit us!",
    "Your IP address is 192.168.1.1. Please don't share it with anyone."
]

# Detect sensitive information
detected_information = detect_sensitive_information(sample_text)
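To give a rough idea of how pattern-based detection with an optional type filter can work, here is a self-contained sketch using Python's `re` module. It is not ValX's implementation. The patterns, the type names ("email", "phone", "ssn", "credit_card", "ipv4"), and the `find_sensitive` function are all assumptions made for illustration, and the valid values for ValX's own info_type parameter may differ.

```python
import re

# Illustrative patterns only; these are NOT ValX's actual detection patterns.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}-){3}\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_sensitive(lines, info_type=None):
    """Return (line_number, type, matched_text) tuples.

    If info_type is given, only that pattern is applied.
    """
    if info_type is not None and info_type not in PATTERNS:
        raise ValueError(f"Unknown info_type: {info_type!r}")
    selected = PATTERNS if info_type is None else {info_type: PATTERNS[info_type]}
    findings = []
    # Line numbers are 1-based to match how people count lines.
    for line_number, line in enumerate(lines, start=1):
        for kind, pattern in selected.items():
            for match in pattern.finditer(line):
                findings.append((line_number, kind, match.group()))
    return findings


sample_text = [
    "Please contact john.doe@example.com or call 555-123-4567 for more information.",
    "We will need your credit card number to complete the transaction: 1234-5678-9012-3456.",
    "My social security number is 123-45-6789 and my ID number is AB123456.",
]

if __name__ == "__main__":
    for finding in find_sensitive(sample_text):
        print(finding)
```

Restricting the search to a single type, as in `find_sensitive(sample_text, info_type="ssn")`, skips every other pattern, which is useful when only one category of data matters.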
PII Removal
from valx import remove_sensitive_information

sample_text = [
    "Please contact john.doe@example.com or call 555-123-4567 for more information.",
    "We will need your credit card number to complete the transaction: 1234-5678-9012-3456.",
    "My social security number is 123-45-6789 and my ID number is AB123456.",
    "Our office address is 123 Main St, Anytown, USA. Please visit us!",
    "Your IP address is 192.168.1.1. Please don't share it with anyone."
]

# Remove sensitive information
cleaned_information = remove_sensitive_information(sample_text)
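Removal can be thought of as substituting each detected match with a placeholder. The sketch below shows that idea with `re.sub`. It is an illustration only and does not reflect how ValX removes data or what it writes in place of a match. The `redact` function, the two patterns, and the "[REDACTED]" placeholder are hypothetical.

```python
import re

# Illustrative patterns only; these are NOT ValX's actual patterns.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact(text, info_type=None, placeholder="[REDACTED]"):
    """Replace matches of one pattern (or all patterns) with a placeholder."""
    if info_type is not None and info_type not in PATTERNS:
        raise ValueError(f"Unknown info_type: {info_type!r}")
    kinds = list(PATTERNS) if info_type is None else [info_type]
    for kind in kinds:
        text = PATTERNS[kind].sub(placeholder, text)
    return text


if __name__ == "__main__":
    line = "Please contact john.doe@example.com or call 555-123-4567 for more information."
    print(redact(line))
    print(redact(line, info_type="phone"))
```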
Hate Speech Detection
from valx import detect_hate_speech

# Detect hate speech or offensive language
outcome_of_detection = detect_hate_speech("You are stupid.")
Hate Speech Removal
from valx import remove_hate_speech

sample_text = [
    "This is a sample text containing some profanity like bad word 1, bad word 2, and bad word 3.",
    "This line doesn't contain any profanity.",
    "But this one has another, just in another language: bad word 4."
]

# Remove hate speech or offensive language
cleaned_text = remove_hate_speech(sample_text)