Remove control characters from PHP string

Tags: string, regex, php

How can I remove control characters like STX from a PHP string? I played around with

preg_replace("/[^a-zA-Z0-9 .\-_;!:?äÄöÖüÜß<>='\"]/","",$pString) 

but found that it removed way too much. Is there a way to remove only control characters?

asked Sep 30 '09 12:09 by KB22



2 Answers

If by control characters you mean the first 32 ASCII characters plus \x7F (which includes the carriage return, line feed, tab, etc.!), then this will work:

preg_replace('/[\x00-\x1F\x7F]/', '', $input); 

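(Note the single quotes: inside double quotes PHP itself turns \x00 into a literal NUL byte before the pattern reaches the regex engine, and, as far as I know, PHP versions before 8.2 fail on a pattern that contains a NUL byte. It is a regex error rather than a true parse error.)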
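To see the pattern above at work on an STX byte, here is a minimal sketch. The helper name stripControlChars and the sample string are made up for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper; it only wraps the pattern from the answer above.
function stripControlChars(string $input): string
{
    // \x00-\x1F covers the C0 control characters, \x7F is DEL.
    // Single quotes keep the \x escapes intact for the regex engine.
    return preg_replace('/[\x00-\x1F\x7F]/', '', $input);
}

// "\x02" is STX and "\x03" is ETX. Double quotes are fine for the
// subject string; only the pattern needs the single quotes.
$raw = "\x02Hello,\tWorld\x03\r\n";

var_dump(stripControlChars($raw)); // string(11) "Hello,World"
```

Note that the tab and the trailing CR/LF are gone as well, since they are control characters too.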

The line feed and carriage return (often written \n and \r) can be spared from removal like so:

preg_replace('/[\x00-\x09\x0B\x0C\x0E-\x1F\x7F]/', '', $input); 
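A quick sketch of what that second pattern keeps and what it drops; the sample input is invented. One thing that is easy to miss: \x09 (tab) is still inside the removed range, so tabs disappear while \n and \r survive.

```php
<?php
// Invented sample: CRLF line break, an STX byte, a tab and a DEL byte.
$input = "line one\r\nline\x02 two\ttabbed\x7F";

// Same pattern as above: skips \x0A (line feed) and \x0D (carriage return).
$clean = preg_replace('/[\x00-\x09\x0B\x0C\x0E-\x1F\x7F]/', '', $input);

var_dump($clean === "line one\r\nline twotabbed"); // bool(true)

// Assumed variant, not from the answer: start the gap at \x09 to keep tabs too.
$keepTabs = preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $input);

var_dump($keepTabs === "line one\r\nline two\ttabbed"); // bool(true)
```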

I must say that I think Bobby's answer is better, in the sense that [:cntrl:] better conveys what the code does than [\x00-\x1F\x7F].

WARNING: ereg_replace is deprecated as of PHP 5.3.0 and removed as of PHP 7.0.0. Use preg_replace instead:

preg_replace('/[[:cntrl:]]/', '', $input); 
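If you want to convince yourself that [:cntrl:] and the explicit range pick the same bytes, a small check along these lines should do. It is a sketch that assumes the default C locale and no /u modifier, and it only looks at the ASCII range 0 to 127.

```php
<?php
// Compare the POSIX class against the explicit range, byte by byte.
$mismatches = [];

for ($byte = 0; $byte <= 0x7F; $byte++) {
    $char  = chr($byte);
    $posix = preg_match('/[[:cntrl:]]/', $char) === 1;
    $range = preg_match('/[\x00-\x1F\x7F]/', $char) === 1;

    if ($posix !== $range) {
        $mismatches[] = $byte; // record any byte where the two disagree
    }
}

// An empty list means both patterns agree on every ASCII byte.
var_dump($mismatches);
```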

answered Sep 28 '22 11:09 by Stephan202


For Unicode input, this removes all control, unassigned, private-use, format and surrogate code points (except those that are also whitespace, such as tab and newline). I use it to strip all non-printable characters from my input.

<?php $clean = preg_replace('/[^\PC\s]/u', '', $input); 

For more info on \p{C}, see http://www.regular-expressions.info/unicode.html#category
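A small sketch of the Unicode pattern on a made-up string, assuming PHP 7 or later for the \u{...} escapes. The input mixes a zero width space (category Cf), an STX (Cc) and a private-use code point (Co) with ordinary text.

```php
<?php
// Invented sample: "café", U+200B ZERO WIDTH SPACE, a space, U+0002 STX,
// "ok", U+E000 (private use) and a trailing newline.
$input = "caf\u{00E9}\u{200B} \u{0002}ok\u{E000}\n";

// [^\PC\s] reads as: not (a non-"Other" character or whitespace),
// i.e. anything in \p{C} that is not whitespace.
$clean = preg_replace('/[^\PC\s]/u', '', $input);

// The accented letter, the space and the newline survive.
var_dump($clean === "caf\u{00E9} ok\n"); // bool(true)
```

Keep in mind that with the /u modifier preg_replace returns NULL when the input is not valid UTF-8, so it is worth checking the result before using it.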

answered Sep 28 '22 10:09 by Scott Jungwirth