Introduction
Regular expressions are hard. Reading about them is confusing and
boring at best. Until you need the functionality they provide, it’s hard
to understand why they exist. You need to apply them to something you
do every day.
These lessons take you through the assembly of a Skidmarklet: a JavaScript bookmarklet
that uses regular expression matching and replacement to skidmark
the crappy parts of any web page. Think about it this way: regular
expressions (or regex) are hard, everybody poops, and we can all relate
to that.
Click the button below to see the finished product in action or download the example files here.
Background
Recent forays into fatherhood have revitalized my once rampant infatuation
with poop. So let’s make my poop obsession your regular expression!
I developed the poop game when I was 17, working at Blockbuster Video.
Don’t worry, it happened in the store, not the toilet. For fun between
the postal monotony of shelving cassette tapes, I swapped poop for each
word in a movie title. Using The Green Mile as an example, there were
three potential poop game results:
- Poop Green Mile
- The Poop Mile
- The Green Poop
One word at a time I’d just laugh at all the possible combinations, and one
word at a time customers would distance themselves from my creepy
giggles as this all happened in my head. Here poop made a crappy job a
little less of a turd. Regular expressions have the power to smear joy
and delight if only we can understand them. So let’s get to know them
better by playing the poop game on the internet.
Lesson 1
In this lesson we replace a list of words with another list of words. The code
iterates through an array of regular expressions, each using a special
character called a word boundary to make sure we only match whole words.
That way it replaces only the word "go" and not the letters "go" in the
middle of the word "engorge."
1. Encapsulate
We have to make sure we are only leaving a skidmark and not clogging up the toilet.
In JavaScript terms, it’s best to avoid collisions with the page’s own code, and as a
general guideline this means encapsulating the main entry point within an
anonymous function call. All the main functionality is squeezed out of a
main method named “skidmark.”
(function() {
  !function skidmark(){
  }();
})();
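As a quick check of why the wrapper matters, here is a minimal sketch. The names `leftBehind` and `result` are made up for illustration and are not part of the Skidmarklet; the point is only that a `var` declared inside the anonymous function never reaches the page around it.

```javascript
// Anything declared with var inside the anonymous function stays private to it.
var result = (function () {
  var leftBehind = "skidmark"; // hypothetical local, invisible outside
  return leftBehind.toUpperCase();
})();

console.log(result);            // "SKIDMARK"
console.log(typeof leftBehind); // "undefined": nothing leaked into the outer scope
```

Only the returned value escapes, so a bookmarklet built this way is unlikely to overwrite a variable the host page depends on.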
2. Define some variables:
var POOP = "Poop";
var PATTERNS_TO_GO = [/\bgo\b/g,/\bgoing\b/g,/\bwent\b/g];
var REPLACEMENTS_TO_POOP = [POOP, "Pooping", "Pooped"];
var P_TAGS = ["h2", "h3", "h4", "h5", "h6", "p"];
POOP is a string for the actual term “Poop”, defined as a pseudo-constant
because even though Poop is definitely a constant of life, JavaScript has
no rules.
PATTERNS_TO_GO is an array of regular expression patterns, each similarly
structured, such as ‘/\bgo\b/g’. We first find a word
boundary, ‘\b’, then the characters ‘go’, then another word boundary,
‘\b’. The global flag, ‘g’, ensures that we get every instance of the
match in a string.
REPLACEMENTS_TO_POOP contains the corresponding
term for each element of the PATTERNS_TO_GO array. This is the
“replacement” for each “match” that the regular expressions will need.
P_TAGS contains a list of tag names: the non-level-1 headings plus paragraphs. Like a dog, these are the elements we want to mark.
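To see what the word boundary and the global flag each contribute, here is a small sketch on a plain string. The sample sentence and the variable names are invented for illustration.

```javascript
// Assumption: this sample sentence is made up for the demonstration.
var sample = "We go when we engorge, and gophers go too.";

// With word boundaries, only the standalone word "go" is replaced.
var bounded = sample.replace(/\bgo\b/g, "poop");
// Without them, every "go" letter pair is replaced, even mid-word.
var unbounded = sample.replace(/go/g, "poop");
// Without the g flag, replacement stops after the first match.
var firstOnly = sample.replace(/\bgo\b/, "poop");

console.log(bounded);   // "We poop when we engorge, and gophers poop too."
console.log(unbounded); // "We poop when we enpooprge, and poopphers poop too."
console.log(firstOnly); // "We poop when we engorge, and gophers go too."
```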
3. Simplify DOM Selection
The method pickOutUnderwearByTag will return DOM Elements in which to leave our mark. Just like potty training we learn to go on our own, without jQuery.
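function pickOutUnderwearByTag(tags) {
  var underwearSelectors = tags;
  var underwearEls = [];
  // A counted loop visits only the array entries; for...in can also pick up
  // extra enumerable properties that a page has added to arrays.
  for (var i = 0; i < underwearSelectors.length; i++) {
    // Copy the collection returned by getElementsByTagName into a real array.
    var els = Array.prototype.slice.call(document.getElementsByTagName(underwearSelectors[i]));
    underwearEls = underwearEls.concat(els);
  }
  return underwearEls;
}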
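The `Array.prototype.slice.call` step can look like magic, so here is a sketch of what it does, using a plain object as a stand-in for the collection that `getElementsByTagName` returns. The stand-in object and variable names are assumptions made so the example runs without a browser.

```javascript
// Hypothetical stand-in for a DOM collection: indexed entries plus a length,
// but none of the Array methods.
var arrayLike = { 0: "h2", 1: "p", length: 2 };

console.log(typeof arrayLike.concat); // "undefined"

// Concatenating the array-like directly adds it as one single item.
var lumped = [].concat(arrayLike);
console.log(lumped.length); // 1

// slice, borrowed from Array.prototype, copies the entries into a real array.
var realArray = Array.prototype.slice.call(arrayLike);

console.log(Array.isArray(realArray)); // true
console.log(realArray.concat(["h3"])); // [ 'h2', 'p', 'h3' ]
```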
4. Put it all together:
poopIfYouHaveToGo loops through each entry in the PATTERNS_TO_GO array to find a match
somewhere in the underwear elements. Each match is replaced with the
corresponding entry from REPLACEMENTS_TO_POOP.
function poopIfYouHaveToGo(){
  var sourceEls = pickOutUnderwearByTag(P_TAGS);
  for(var i = 0; i < sourceEls.length; i++){
    var sourceEl = sourceEls[i];
    var searchStr = sourceEl.innerHTML;
    for(var j = 0; j < PATTERNS_TO_GO.length; j++){
      var toGo = PATTERNS_TO_GO[j];
      var toPoop = REPLACEMENTS_TO_POOP[j];
      // The capitalized form is used only when the element's HTML starts
      // with an uppercase letter; otherwise the replacement is lowercased.
      searchStr = searchStr.replace(
        toGo,
        (searchStr.match(/^[A-Z]/)) ?
          toPoop :
          toPoop.toLowerCase()
      );
    }
    sourceEl.innerHTML = searchStr;
  }
}
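The heart of this method is the pairing of two arrays by index. The sketch below runs the same idea on a plain string so it can be tried outside a browser; `goToPoop`, `PATTERNS`, and `REPLACEMENTS` are hypothetical names, and the capitalization check is left out to keep the pairing visible.

```javascript
// String-only sketch of the same loop, with all-lowercase replacements.
var PATTERNS = [/\bgo\b/g, /\bgoing\b/g, /\bwent\b/g];
var REPLACEMENTS = ["poop", "pooping", "pooped"];

function goToPoop(str) {
  // Index j pairs each pattern with the replacement in the same slot.
  for (var j = 0; j < PATTERNS.length; j++) {
    str = str.replace(PATTERNS[j], REPLACEMENTS[j]);
  }
  return str;
}

console.log(goToPoop("I went out, I am going home, then I go to bed."));
// "I pooped out, I am pooping home, then I poop to bed."
```

Because every pattern is wrapped in word boundaries, the order of the entries does not matter here: `/\bgo\b/g` cannot bite the front off "going".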
5. Run it
Our first skidmark task is ready to go.
!function skidmark(){
poopIfYouHaveToGo();
}();
Example 1 - And that’s how we leave skidmarks on the page.
Lesson 2
In lesson 2, we expand on replacing several words within the text.
Here, instead of using a list of regular expressions to find and replace
individual words, we use the regex “or” operator, “|”.
1. Define some variables
Okay, it’s only one variable: a regex named POOPY_TERMS that matches any of three words (loaf, duty and business).
var POOPY_TERMS = /\b(loaf|duty|business)\b/g;
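Here is a quick sketch of the alternation at work on a plain string. The sample sentence is invented; the pattern is the one defined above.

```javascript
var POOPY_TERMS = /\b(loaf|duty|business)\b/g;

// Assumption: a made-up sentence containing two of the terms and one near miss.
var line = "Do your duty, mind your business, and don't be a loafer.";
var stained = line.replace(POOPY_TERMS, "Poop");

console.log(stained);
// "Do your Poop, mind your Poop, and don't be a loafer."
```

The word boundaries sit outside the parentheses, so they apply to whichever alternative matches. That is why "loafer" survives.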
2. Define the method
The method poopWhereYouSeeIt runs over the same underwear elements, replacing any near-turd POOPY_TERMS match very literally with POOP.
function poopWhereYouSeeIt() {
  var sourceEls = pickOutUnderwearByTag(P_TAGS);
  for (var i = 0; i < sourceEls.length; i++) {
    sourceEls[i].innerHTML = sourceEls[i].innerHTML.replace(
      POOPY_TERMS,
      POOP
    );
  }
}
3. Run it
I call them like I see ’em, and so too skidmark must poopWhereYouSeeIt.
!function skidmark(){
poopWhereYouSeeIt();
}();
Example 2 - And that’s how we leave skidmarks on the page.
Lesson 3
Through our shared human experience, craps have found a name. The corn poop,
the butt pee and the rabbit dropping are a few that come to mind. Poop, like
the poop game, is all about titles, so let’s concentrate our regex
sphincter on squeezing out some work in the title areas of a web page.
To name this page poo we are doing something a bit more, well,
dangerous.
Before, we knew which words we were replacing. Now we’ll
realize the essence of the poop game by replacing one randomly picked
word in the page title, <title>, and the main headline element,
<h1>. In the previous lessons, if the page did not contain any of
our pre-picked words, it came away without any poop on it.
Now, however, the page can’t escape and we know it will wind up smeared
somehow.
1. Define Some Variables
var TEST_CASE = /^[A-Z]/;
var POOP_BOUNDARY = /\b(\S+)\b/g;
var NAME_TAGS = ["title", "h1"];
TEST_CASE is a simple expression that tests whether a string begins with an uppercase letter, A through Z.
POOP_BOUNDARY matches what is considered a "word." In this instance, one or more
consecutive non-whitespace characters between word boundaries.
NAME_TAGS is an array of the tags we are matching against. The spot in your
underwear where you might write your name, so as not to lose it.
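To get a feel for what these two patterns catch, here is a sketch on a made-up title. The sample string is an assumption; the patterns are the ones defined above.

```javascript
var TEST_CASE = /^[A-Z]/;
var POOP_BOUNDARY = /\b(\S+)\b/g;

// Trailing punctuation falls outside the word boundary, so it is dropped,
// while an apostrophe inside a word is kept.
var words = "The Green Mile, it's long!".match(POOP_BOUNDARY);
console.log(words); // [ 'The', 'Green', 'Mile', "it's", 'long' ]

console.log(TEST_CASE.test("The"));   // true
console.log(TEST_CASE.test("green")); // false
```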
2. Separate Re-usable Utilities
If you’ve read Clean Code I don’t care about your opinion. It is mine that
code is more readable and re-usable if each method has a single
purpose. The following methods encapsulate the individual tasks that
together accomplish our goal.
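function insertPoopHere(str) {
  var word = randomWord(str);
  // Nothing to replace when the string contains no words.
  return word ? poopInCase(str, word) : str;
}
function poopInCase(str, word) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return str.replace(
    new RegExp('\\b(' + escaped + ')+\\b'),
    // Keep the capitalization of the word being replaced.
    word.match(TEST_CASE) ? POOP : POOP.toLowerCase()
  );
}
function randomWord(str){
  // match returns null when there are no words at all.
  var arr = str.match(POOP_BOUNDARY) || [];
  return arr[Math.floor((Math.random()*arr.length))];
}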
insertPoopHere - Picks a random word and replaces it with "poop".
poopInCase - Replaces a word with "poop", matching the capitalization of the original word.
randomWord - Selects a random element from the array of words found in a string.
3. Define the method
function nameYourPoop() {
  var sourceEls = pickOutUnderwearByTag(NAME_TAGS);
  for (var i = 0; i < sourceEls.length; i++) {
    sourceEls[i].innerHTML = insertPoopHere(sourceEls[i].innerHTML);
  }
}
nameYourPoop grabs all the words from within the underwear name tags, picking a
random one to replace with “poop”. We use two regexes to do this:
POOP_BOUNDARY to grab all the string matches that qualify as words, and a
second one built on the fly to find the chosen word, with POOP as the
substitution. Once we have an array of all the words
individually, we pick a random one and inject that word into a new
JavaScript RegExp object. Here again, the regex pattern sandwiches the
word itself between word boundaries so that, if the word is “with”, we
won’t also change the middle of the word “wherewithal”.
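One thing to watch when injecting page text into a RegExp object: the word may contain characters that mean something to the regex engine. The sketch below shows one common way to guard against that. `escapeRegExp` is a hypothetical helper name, and the title and word are invented.

```javascript
// Hypothetical helper: backslash-escape every character that has a special
// meaning in a regular expression so it is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var title = "What the f(x is this";
var word = "f(x"; // one of the "words" POOP_BOUNDARY would find in that title

// new RegExp("\\b(" + word + ")\\b") would throw here: the "(" inside the
// word opens a group that is never closed.
var safe = new RegExp("\\b(" + escapeRegExp(word) + ")\\b");
var renamed = title.replace(safe, "poop");

console.log(renamed); // "What the poop is this"
```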
4. Run it
!function skidmark(){
nameYourPoop();
}();
Example 3 - And that’s how we leave skidmarks on the page.
Lesson 4
Lesson 3 gave us the insertPoopHere
method, applied narrowly to just the page title elements. In
this final lesson we will take that and make it a blowout: we are going
to replace a word in every sentence of every paragraph throughout the
page. In order to do so we need to first identify a sentence and then
stain it.
1. Define a variable
var POOP_SENTENCES = /(\S.+?[.!?])(?=\s+|$)/g;
POOP_SENTENCES is a regex pattern to match the structure of a sentence. In our
language, which is called, I believe, English, sentences are
predictable: they usually start with a non-whitespace character, “\S”,
continue with a bunch of other characters and words, and end in a period,
exclamation point or question mark followed by whitespace or the end of the text.
The lookahead, ‘(?=\s+|$)’, checks for that trailing whitespace without making it part of the match.
Funny enough, at least a couple of those punctuation marks are operators in regular
expressions, but inside a character set they are used literally. That is
why [.!?] does not require a “\” escape character for each punctuation
mark.
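Here is a sketch of the pattern splitting a short, made-up passage. It also shows why the "followed by whitespace" check matters: the period inside 3.14 does not end the sentence.

```javascript
var POOP_SENTENCES = /(\S.+?[.!?])(?=\s+|$)/g;

// Assumption: an invented passage with a decimal point in the first sentence.
var passage = "Pi is 3.14 exactly. Really? Yes!";
var sentences = passage.match(POOP_SENTENCES);

console.log(sentences); // [ 'Pi is 3.14 exactly.', 'Really?', 'Yes!' ]

// Inside a character set the dot is literal; outside it matches any character.
console.log(/[.]/.test("a")); // false
console.log(/./.test("a"));   // true
```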
2. Define the method
function poopAndP(){
  var sourceEls = pickOutUnderwearByTag(P_TAGS);
  for (var i = 0; i < sourceEls.length; i++) {
    var text = sourceEls[i].innerHTML;
    var sentences = text.match(POOP_SENTENCES);
    if (!sentences)
      continue;
    for (var j = 0; j < sentences.length; j++) {
      var sentence = sentences[j];
      var poopySentence = insertPoopHere(sentence);
      text = text.replace(sentence, poopySentence);
    }
    // Write the stained text back once every sentence has been handled.
    sourceEls[i].innerHTML = text;
  }
}
Once we have each sentence singled out, we can then pick a random word, just
like we did in Lesson 3, and replace it with a poop. poopAndP does just this within the elements of our existing P_TAGS underwear.
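For experimenting outside a browser, here is a string-only sketch of the same idea. It is a variation rather than the tutorial's method: it passes a function to `replace`, so each sentence is rewritten in place, and it takes the word picker as a parameter so the result is repeatable. `stainSentences`, `pickIndex`, `SENTENCES`, and `WORDS` are hypothetical names.

```javascript
var SENTENCES = /(\S.+?[.!?])(?=\s+|$)/g;
var WORDS = /\b(\S+)\b/g;

// pickIndex(count) is a hypothetical hook that returns which word to replace;
// the tutorial's own code picks one at random with Math.random.
function stainSentences(text, pickIndex) {
  return text.replace(SENTENCES, function (sentence) {
    var words = sentence.match(WORDS);
    if (!words) return sentence; // no words, nothing to stain
    var target = pickIndex(words.length);
    var seen = -1;
    // Walk the words again and swap only the one at the chosen position.
    return sentence.replace(WORDS, function (word) {
      seen++;
      if (seen !== target) return word;
      return /^[A-Z]/.test(word) ? "Poop" : "poop";
    });
  });
}

var firstWord = function () { return 0; };
console.log(stainSentences("The dog barked. Cats sleep a lot!", firstWord));
// "Poop dog barked. Poop sleep a lot!"
```

Swapping in a picker such as `function (count) { return Math.floor(Math.random() * count); }` gives back the random behavior of the poop game.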
3. Run it
!function skidmark(){
poopIfYouHaveToGo();
poopWhereYouSeeIt();
nameYourPoop();
poopAndP();
}();
Example 4 - And that’s how we leave skidmarks on the page.
Conclusion
As with any successful pooping endeavor, let's finish things off by
lighting a match and possibly igniting a conversation. The goal is to
have a bit of fun with the tutorial and leave a mark on you, so say what you think! Leave a comment or flush it back down the intertubes, it's your call. Hopefully this was a
somewhat educational diversion from the possibly dull results a typical tutorial
search turns up.