Created by Zed A. Shaw Updated 2024-10-08 04:45:56

02: Basic Matching

In this exercise you'll create a simple little text file and use grep (ugrep on Windows) to search for lines in the file. I'll also walk through exactly how a regex works, but keep in mind you'll have to play with them for a while before they really make sense. Just keep experimenting and testing your ideas and you'll get it.

The Setup

To do this exercise you'll need to create a little text file with some text to search. Let's use a little poem:

I have one million bees
None of these have knees
But if I squint
They won't sprint
Even if I also say please

Thank you. Thank you. I'll be off in my studio writing my Nobel Prize acceptance speech now.

Save this file in your Documents folder so you can access it easily and name it ex02.txt.

You'll need to create this text file with a simple text editor like Geany but not with a word processor like Microsoft Word. I repeat, do not use Microsoft Word.

Finding Your Text File

If you did as instructed and saved the text file into Documents then you can do this to get your Terminal to the same location:

  1. Start Terminal.
  2. Type cd ~/Documents. The ~ is called a "tilde" character and is above the ` (backtick) character on my keyboard but might be somewhere else on yours.
  3. This command will change directory to Home slash Documents.
  4. Once you are there you should be able to type cat ex02.txt and see the poem printed to your screen.
  5. If you can't, then you did something wrong and need to find the file.

If you can't do cat ex02.txt then probably the best thing to do is create the file again but be very sure that you are saving it in Documents. Another way is to do this:

  1. Use your mouse to find the file on your computer. Basically, how would you find the file so you can double click it and open it?
  2. Start Terminal like normal.
  3. Type cd (that's cd then space).
  4. Grab the file with your mouse and drag it into your Terminal window then let go.
  5. When you do, it will print out the real location of the file into your terminal. To cd to this location, just delete the ex02.txt on the end (use your arrow keys to get to the end) and hit ENTER to submit the cd command you typed in step 3.

Now, if this happened you need to figure out why. One major thing that programming teaches is paying attention to what you do. If you saved the file in a weird place, take the time to go back and find out why you did that, then try not to do it again.

Before You Continue

You should now be set up to actually play with this file. Confirm you have this:

  1. A text file named ex02.txt in Documents.
  2. A Terminal open for you to type commands.
  3. Your poem printed to the screen with cat ex02.txt.

If you don't have this then go back and try again.
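
If saving from the editor keeps going wrong, one possible fallback is to write the file from the Terminal itself. This is only a sketch and it assumes a bash or zsh shell (macOS or Linux); the <<'EOF' syntax used here does not work in PowerShell, and the lesson itself only asks you to use a text editor.

```shell
# Assumption: you are already in the folder where ex02.txt should live
# (for example after running: cd ~/Documents).
# Everything between the two EOF markers is written into ex02.txt.
cat > ex02.txt <<'EOF'
I have one million bees
None of these have knees
But if I squint
They won't sprint
Even if I also say please
EOF

# Print the file to confirm the poem is there.
cat ex02.txt
```

If the last command prints the five lines of the poem, you have everything on the checklist.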

Your First Regex

A regular expression is constructed with a mixture of the text you want to find, and patterns that add additional steps in the search. At the most basic level you can search for an exact word in the poem like this:

grep bees ex02.txt

If you type this into your Terminal you should see this:

$ grep bees ex02.txt
I have one million bees

If you're on Windows then you would see this (because you typed ugrep):

PS C:\Users\lcthw\Documents> ugrep bees .\ex02.txt
I have one million bees

WARNING: From now on I'm only going to show the grep version of these commands and assume you know to type ugrep.

You can type any sequence of characters that are also in the ex02.txt poem, but you can use additional operators to match patterns. The simplest pattern is the "anything" operator:

$ grep ..ees ex02.txt
I have one million bees
None of these have knees

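You can see that I typed ..ees, which means, "Match any two characters (..) then ees." That is why it matches both bees and knees. In the first line the two dots match the space and the b, and in the second line they match kn.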
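
The dots do not have to come first in the pattern. As a small sketch, the pattern s..int (chosen only for illustration, it is not from the lesson) says, "Match s, then any two characters, then int." The setup lines use bash/zsh syntax and only recreate the poem so the example runs by itself; skip them if you already have ex02.txt, and type ugrep instead of grep on Windows.

```shell
# Setup: recreate the poem (skip this if ex02.txt already exists).
cat > ex02.txt <<'EOF'
I have one million bees
None of these have knees
But if I squint
They won't sprint
Even if I also say please
EOF

# s, then any two characters, then int
grep s..int ex02.txt
```

This should print the squint line and the sprint line, because qu and pr each fill the two "anything" slots.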

Experiment #1

Take the time now to use this to match as many other lines as possible.

  1. What single regex, containing both . and at least one other character, matches the most lines?
  2. How would you match a space character? Try putting " (double-quote) around your regex.

Matching Blank Space

If you want to match a space then you put " (double-quote) around the regex, but you can also use the \ character to specify different blank space characters.

The \ character is like saying, "Treat the next character as a command." It can also mean, "Treat the next character literally." We'll experiment with that in the next exercise.
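
Here is a short sketch of both ideas. It assumes a grep that understands \s as "a blank space character," which GNU grep and ugrep do; other grep versions may not. The setup lines use bash/zsh syntax and only recreate the poem so the example runs by itself.

```shell
# Setup: recreate the poem (skip this if ex02.txt already exists).
cat > ex02.txt <<'EOF'
I have one million bees
None of these have knees
But if I squint
They won't sprint
Even if I also say please
EOF

# The quotes keep the space inside the regex instead of
# letting the shell split it into two separate arguments.
grep "one million" ex02.txt

# \s stands for a blank space character (assumes GNU grep or ugrep).
grep "if\sI" ex02.txt
```

The first command should print only the bees line. The second should print the two lines that contain "if I", because \s matches the space between the words.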

Experiment #2

  1. How would you write a regex to match any character after a space?
  2. What if someone isn't using actual space characters, but instead using tabs?
