Python String - Remove special characters



To remove special characters from a string in Python, you can use the re.sub() function from the built-in re (regular expression) module to replace every unwanted character with an empty string.

Steps to remove special characters from a string

Follow these steps to remove all the special characters from a given string using the re module.

  1. Import the re module.
  2. Take the original string in a variable, say original_string.
  3. Define a pattern that matches every character you want to remove. Here, that is any character other than uppercase letters A-Z, lowercase letters a-z, digits 0-9, and the space character. The pattern is r'[^A-Za-z0-9 ]'. The ^ at the start of the character class negates it, so the pattern matches any single character that is not listed.
  4. Call the re.sub() function and pass the pattern, an empty string, and the original string as arguments. The sub() function replaces every match of the pattern with the empty string and returns the resulting string. The original string is not modified.
  5. Store the returned string in a variable, say cleaned_string, and use it for further processing.
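
To check which characters the pattern from step 3 selects before removing anything, re.findall() can list them. This is only a minimal sketch: the sample string and the names sample and matched are illustrative assumptions, not part of the steps above.

```python
import re

# Illustrative sample string (an assumption, chosen only for this check)
sample = "Price: $45.99 (approx.)"

# Same pattern as in step 3: anything that is not A-Z, a-z, 0-9, or a space
pattern = r'[^A-Za-z0-9 ]'

# findall() returns every character the pattern matches,
# that is, every character re.sub() would remove
matched = re.findall(pattern, sample)

print(matched)
```

Every character in the printed list is one that re.sub() would replace with the empty string in step 4.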

Example

In the following program, we take a string in original_string and remove all the special characters from it using the steps mentioned above.

Python Program

import re

# Given string
original_string = "Hello, @World! 123"

# Pattern
pattern = r'[^A-Za-z0-9 ]'

# Substitute special characters with empty string
cleaned_string = re.sub(pattern, "", original_string)

print(cleaned_string)

Output

Hello World 123
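
If the same cleanup is needed in several places, the program above could be wrapped in a small function. The following is a sketch under that assumption: the names SPECIAL_CHARS and remove_special_characters are hypothetical, and the pattern is compiled once with re.compile() so it can be reused.

```python
import re

# Compile the pattern once so it can be reused across calls
# (SPECIAL_CHARS is a hypothetical name chosen for this sketch)
SPECIAL_CHARS = re.compile(r'[^A-Za-z0-9 ]')

def remove_special_characters(text):
    """Return text with every character other than A-Z, a-z, 0-9, and space removed."""
    return SPECIAL_CHARS.sub("", text)

print(remove_special_characters("Hello, @World! 123"))
print(remove_special_characters("user_name@example.com"))
print(remove_special_characters("Caf\u00e9 #5"))
```

Note that this pattern allows only ASCII letters, digits, and spaces, so the underscore in the second string and the accented letter in the third are removed along with the punctuation. If such characters should be kept, add them to the character class.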