Introduction
I have long searched for a simple code snippet that would let me turn a string into its Capitalized form, known to editors as "proper case". Upon finding nothing, I tried parsing the string one character at a time and switching on the findings; I also tried splitting the string (using spaces as delimiters), altering the first character of each element and gluing the array back into a string (using spaces, Holmes). I was able to locate the characters that needed replacing in an array of either the characters themselves or their positions (indices) within the string, but then some complex code was required to put the whole thing back together.
These attempts worked just fine, but I was still aiming at a simple replace method solution. I finally got it after discovering that JScript 5.5 will take a function as the replaceText argument.
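For comparison, the split-and-join attempt mentioned above might look roughly like the sketch below. This is a reconstruction for illustration only, not the code I originally wrote; the function name properCaseBySplitting is made up, and it assumes single blanks between words.

```javascript
// Hypothetical reconstruction of the split / alter / join approach.
function properCaseBySplitting(s) {
    // Lower-case everything first, then break the string at each blank.
    var words = s.toLowerCase().split(" ");
    for (var i = 0; i < words.length; i++) {
        if (words[i].length > 0) {
            // Upper-case the first character and keep the rest as is.
            words[i] = words[i].charAt(0).toUpperCase() + words[i].substring(1);
        }
    }
    // Glue the pieces back together with blanks.
    return words.join(" ");
}
```

It does the job, but it only knows about blanks and takes a loop where a single replace call would do.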
Implementation
First, let us refresh our knowledge of regular expressions. We must locate each character that follows a space and remember it. Fortunately, the regular expression object captures up to nine "submatches" in a set of properties, $1 through $9. The regular expression syntax for capturing any single character after a space follows:
/\s(.)/g
I prefer the white-space specifier \s to an actual blank in the code, both to improve readability and to include form feeds, tabs, etc. in the match.
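To see what that expression picks up, here is a minimal sketch that collects every captured character with an exec loop. The helper name charsAfterWhitespace and the sample string are assumptions made for this illustration.

```javascript
// Hypothetical helper: returns each character that directly follows white space.
function charsAfterWhitespace(s) {
    var re = /\s(.)/g;
    var found = [];
    var m;
    // With the g flag, exec resumes after the previous match on every call.
    while ((m = re.exec(s)) !== null) {
        found.push(m[1]); // m[1] is the captured character
    }
    return found;
}

// A blank, a tab and a form feed all count as white space.
var sample = "one two\tthree\ffour";
var result = charsAfterWhitespace(sample); // ["t", "t", "f"]
```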
The first character of the string must be capitalized too, so we'll use the beginning-of-input specifier and capture whatever character comes right after it:
/^(.)/
Our lookup expression then combines into:
/^(.)|\s(.)/g
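The sketch below lists what the combined expression matches in a sample name. The helper listMatches is hypothetical; the point is that the whole match includes the leading white space, and that the first submatch is only filled in when the first alternative is the one that matched.

```javascript
// Hypothetical helper: lists the whole match and both submatches for each hit.
function listMatches(s) {
    var re = /^(.)|\s(.)/g;
    var out = [];
    var m;
    while ((m = re.exec(s)) !== null) {
        out.push({
            whole: m[0],   // full matched text, e.g. " j"
            first: m[1],   // set only by the ^(.) alternative
            second: m[2]   // set only by the \s(.) alternative
        });
    }
    return out;
}

var hits = listMatches("hector j. rivas");
// hits[0].whole is "h", hits[1].whole is " j", hits[2].whole is " r"
```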
Now, the use of the $n properties in the replacement text has limitations I am not going to elaborate on; in this particular case, the beauty of using a function rests on three facts:
- The replace method calls the function once for every match;
- The function receives the matched text as its first argument (named $1 in the code below, although it holds the whole match, leading white space included, rather than the first submatch), and
- Whatever the function returns replaces that match in the result!
Because of this, the argument always holds the current match, and upper-casing it leaves the white space untouched while capitalizing the character after it. All that is left to do is to turn the entire string into lower case and each match to upper case using our replace function:
function toProperCase(s)
{
    // The callback's first argument is the whole match, e.g. "h" or " j".
    return s.toLowerCase().replace(/^(.)|\s(.)/g,
        function($1) { return $1.toUpperCase(); });
}
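Here is a short usage sketch, together with a tracing helper that records what replace hands to the callback. The function is restated so the block runs on its own; traceMatches is a hypothetical name, and the argument order (match, submatches, offset) is the one used by current JavaScript engines.

```javascript
function toProperCase(s) {
    return s.toLowerCase().replace(/^(.)|\s(.)/g,
        function (match) { return match.toUpperCase(); });
}

// Hypothetical helper: records the matched text and its position for each call.
function traceMatches(s) {
    var seen = [];
    s.replace(/^(.)|\s(.)/g, function (match, p1, p2, offset) {
        seen.push([match, offset]);
        return match; // leave the string unchanged
    });
    return seen;
}

var upper = toProperCase("HECTOR J. RIVAS"); // "Hector J. Rivas"
var lower = toProperCase("hector j. rivas"); // "Hector J. Rivas"
var trace = traceMatches("ab cd");           // [["a", 0], [" c", 2]]
```

The trace shows that the second match is " c", blank included, which is why upper-casing the whole match is enough.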
Alternatively, if you prefer, in prototype flavor:
String.prototype.toProperCase = function()
{
    return this.toLowerCase().replace(/^(.)|\s(.)/g,
        function($1) { return $1.toUpperCase(); });
};
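With the prototype flavor in place, the method can be called on any string value. The sketch below restates the method behind a guard so that an existing toProperCase is not overwritten; the guard and the variable names are my own additions, and whether extending a built-in prototype is acceptable depends on your project.

```javascript
// Only install the method if nothing else has defined it already.
if (typeof String.prototype.toProperCase !== "function") {
    String.prototype.toProperCase = function () {
        return this.toLowerCase().replace(/^(.)|\s(.)/g,
            function (match) { return match.toUpperCase(); });
    };
}

// Works on literals and on variables alike.
var fromLiteral = "hector j. rivas".toProperCase();
var customer = "HECTOR J. RIVAS";
var fromVariable = customer.toProperCase();
```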
Bear in mind that this is no thesaurus; its primary use is for proper names, e.g. turning HECTOR J. RIVAS or hector j. rivas into Hector J. Rivas. You will need a second-pass function of your own devising to skip articles or other words in phrases, say to turn the opening sentence of this paragraph into: Bear in Mind that this is no Thesaurus.
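Such a second pass could be sketched as follows. Everything here is an assumption made for illustration: the name toTitleCase, the word list, and the rule that the first word always stays capitalized. A real list would depend on your own style rules.

```javascript
function toProperCase(s) {
    return s.toLowerCase().replace(/^(.)|\s(.)/g,
        function (match) { return match.toUpperCase(); });
}

// Hypothetical list of words to leave in lower case; extend as needed.
var smallWords = {
    "a": true, "an": true, "the": true, "and": true, "or": true, "of": true,
    "in": true, "is": true, "no": true, "that": true, "this": true
};

function toTitleCase(s) {
    // Second pass: only words preceded by white space are candidates,
    // so the first word of the string is never lower-cased.
    return toProperCase(s).replace(/\s(\S+)/g, function (match, word) {
        var key = word.toLowerCase();
        var isSmall = Object.prototype.hasOwnProperty.call(smallWords, key);
        return isSmall ? match.toLowerCase() : match;
    });
}

var title = toTitleCase("bear in mind that this is no thesaurus");
// title is "Bear in Mind that this is no Thesaurus"
```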
I sure would like to know more about the internals of the replace function, the regular expression object and the function as replace text, since I mostly use this in database apps; so far, though, it has shown better speed and efficiency than any other method I have tried.
Hope you like it.