JavaScript: Replacing Special Characters

Removing punctuation in JavaScript is relatively easy, but removing accents while keeping the underlying letters is a bit more challenging. Below are some minimal functions that cover both cases.

How to remove accents in JavaScript

To simply remove accents and cedilla from a string and return the same string without the accents, we can use ES6's String.prototype.normalize method, followed by a String.prototype.replace:

const str = 'ÁÉÍÓÚáéíóúâêîôûàèìòùÇç';
const parsed = str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
console.log(parsed); // AEIOUaeiouaeiouaeiouCc

Explanation

The normalize method was introduced in ES6 (ECMAScript 2015). It converts a string to one of the Unicode normalization forms. Here we pass NFD (canonical decomposition), which splits each accented letter into a base letter followed by one or more combining marks.

To get a better idea of how this conversion to Unicode works, see below:

// 'Á' as a single precomposed character is 1 UTF-16 code unit
'Á'.length; // 1

// After NFD it is 2 code units: \u0041 (A) + \u0301 (combining acute accent)
'Á'.normalize('NFD').length; // 2

// Printed, the two code units still render as a single character
console.log('\u0041\u0301'); // Á

The replace call then removes every character in the range \u0300-\u036F, the Unicode "Combining Diacritical Marks" block, which is where NFD places the detached accents. Ranges written with \uXXXX escapes work in regular expressions without any extra flag.
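As a minimal sketch, the same chain can be wrapped in a reusable function. The name removeAccents is only illustrative. The last two calls show a limitation worth knowing: letters such as Ø and ß have no canonical decomposition, so NFD leaves them as they are and this approach does not convert them.

```javascript
// Hypothetical helper name; the body is the same normalize + replace chain used above.
const removeAccents = (str) =>
  str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

console.log(removeAccents('Crème brûlée')); // Creme brulee
console.log(removeAccents('São Paulo'));    // Sao Paulo

// Letters without a canonical decomposition pass through unchanged.
console.log(removeAccents('Øresund'));      // Øresund
console.log(removeAccents('Straße'));       // Straße
```

If those letters matter for your data, they would need an explicit mapping of your own in addition to this function.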

Removing all special characters in JavaScript

To remove accents along with other special characters such as /?!(), use the same approach as above, but remove everything that is not a letter or a digit.

const str = 'ÁÉÍÓÚáéíóúâêîôûàèìòùÇç/.,~!@#$%&_-12345';
const parsed = str.normalize('NFD').replace(/([\u0300-\u036f]|[^0-9a-zA-Z])/g, '');
console.log(parsed); // AEIOUaeiouaeiouaeiouCc12345

Explanation

To understand what happens in the code above, I suggest reading the previous section, where I talk about Unicode and the normalize method.

The only addition here is an alternation between two character classes, ([ class 1 ]|[ class 2 ]). The second class, [^0-9a-zA-Z], means: anything that is not (^) 0-9, a-z or A-Z is also removed. The surrounding parentheses only group the alternatives; the pattern matches the same characters without them.

If you don't want to remove spaces (or other whitespace such as tabs and line breaks), just add \s to the second class:

str.normalize('NFD').replace(/([\u0300-\u036f]|[^0-9a-zA-Z\s])/g, '')
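The two variants can be combined into one function with a switch for whitespace. This is only a sketch: the name stripSpecialChars and the keepSpaces option are made up for illustration and do not come from any library.

```javascript
// Hypothetical helper; `keepSpaces` is an illustrative option name.
const stripSpecialChars = (str, { keepSpaces = false } = {}) => {
  // Same two patterns as above, without the optional capturing parentheses.
  const pattern = keepSpaces
    ? /[\u0300-\u036f]|[^0-9a-zA-Z\s]/g
    : /[\u0300-\u036f]|[^0-9a-zA-Z]/g;
  return str.normalize('NFD').replace(pattern, '');
};

console.log(stripSpecialChars('Olá, mundo! 123'));                       // Olamundo123
console.log(stripSpecialChars('Olá, mundo! 123', { keepSpaces: true })); // Ola mundo 123
```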

Replacing special characters

Another common use case is to strip the accents and then replace special characters with some other character, e.g. "Any phrase" -> "Any-phrase".

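There is a handy regular expression for replacing characters that are not plain letters, digits, underscores or hyphens. Used on its own, though, it treats accented letters as special characters too, so the whole letter is lost, not just the accent.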

'Here\'s à sentence'.replace(/[^\w\-]+/g, '-'); // Here-s-sentence

If we want to remove only the accents and then replace the other special characters, we need to strip the accents first, as in the first example:

'Here\'s à sentence'.normalize('NFD').replace(/[\u0300-\u036f]/g, '').replace(/[^\w\-]+/g, '-'); // Here-s-a-sentence

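But you may also need to clean up unnecessary hyphens. With the expression above, "This is a sentence!!!" turns into "This-is-a-sentence-" with a trailing hyphen, and an input that already contains a hyphen, such as "Rock - Roll", turns into "Rock---Roll".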

Here's a complete function that removes accents, replaces special characters with hyphens, and removes the extra hyphens:

const replaceSpecialChars = (str) => {
	return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '') // Remove accents
		.replace(/\W+/g, '-') // Replace each run of spaces and other non-word characters with a hyphen
		.replace(/--+/g, '-')	// Collapse multiple hyphens into one
		.replace(/^-+|-+$/g, ''); // Remove hyphens from the beginning and end of the string
}

console.log(replaceSpecialChars('This is a sentence!!!')); // This-is-a-sentence

If you want to use this same function to "slugify" a URL, just add toLowerCase() at the end and it's done!
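As a sketch of that variant, here are the same steps with toLowerCase() appended. The name slugify is just a conventional label chosen for this example. Note that \w includes the underscore, so underscores are kept as they are.

```javascript
// Illustrative name; same steps as replaceSpecialChars, plus lowercasing.
const slugify = (str) =>
  str
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '') // remove accents
    .replace(/\W+/g, '-')            // runs of non-word characters become one hyphen
    .replace(/--+/g, '-')            // collapse repeated hyphens
    .replace(/^-+|-+$/g, '')         // trim hyphens at both ends
    .toLowerCase();

console.log(slugify('Olá, Mundo! Çédille & Co.')); // ola-mundo-cedille-co
console.log(slugify('snake_case Name'));           // snake_case-name
```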

I think this covers the most common cases when working with accents and special characters in JavaScript. I know it is an extra challenge for speakers of many languages that JavaScript has no single built-in method for dealing with these characters.


Ricardo Metring
