(SQL examples for Beginners)
In this end-to-end example, you will learn how to use regular expressions in MySQL.
MySQL Regular Expressions (REGEXP) with Syntax & Examples
What are regular expressions?
Regular expressions help search for data matching complex criteria. We looked at wildcards in the previous tutorial. If you have worked with wildcards before, you may be asking why you should learn regular expressions when wildcards give similar results. The reason is that regular expressions let us search for data matching far more complex criteria than wildcards can express.
The basic syntax for a regular expression is as follows:
SELECT statements... WHERE fieldname REGEXP 'pattern';
- “SELECT statements...” is the standard SELECT statement.
- “WHERE fieldname” names the column against which the regular expression is matched.
- “REGEXP 'pattern'”: REGEXP is the regular expression operator and 'pattern' is the pattern to be matched. RLIKE is a synonym for REGEXP and gives the same results. To avoid confusing it with the LIKE operator, it is better to use REGEXP.
Let’s now look at a practical example:
SELECT * FROM `movies` WHERE `title` REGEXP 'code';
The above query searches for all movie titles that contain the string “code”. It does not matter whether “code” is at the beginning, middle or end of the title; as long as the title contains it, the row is returned.
Suppose we want to search for movies whose titles start with a, b, c or d, followed by any number of other characters. How would we go about that? We can use a regular expression together with metacharacters to achieve the desired result.
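As a minimal sketch of the substring behaviour described above, the queries below compare REGEXP, its synonym RLIKE, and the equivalent LIKE wildcard pattern. The table name `demo_movies` is hypothetical and exists only for this illustration; the titles are taken from this tutorial. The sketch assumes the default case-insensitive collation.

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TEMPORARY TABLE demo_movies (title VARCHAR(100));

INSERT INTO demo_movies (title) VALUES
  ('Code Name Black'),
  ('Da Vinci Code'),
  ('Angels and Demons');

-- Assuming a case-insensitive collation, each query should return the
-- two titles that contain "code", wherever it appears in the title.
SELECT title FROM demo_movies WHERE title REGEXP 'code';
SELECT title FROM demo_movies WHERE title RLIKE 'code';
SELECT title FROM demo_movies WHERE title LIKE '%code%';
```

For a plain substring search like this one, LIKE with wildcards is enough; REGEXP becomes useful once the pattern needs anchors, charlists or repetition.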
SELECT * FROM `movies` WHERE `title` REGEXP '^[abcd]';
Executing the above script in MySQL workbench against the myflixdb gives us the following results.
|4||Code Name Black||Edgar Jimz||2010||NULL|
|5||Daddy’s Little Girls||NULL||2007||8|
|6||Angels and Demons||NULL||2007||6|
Let’s now take a close look at our regular expression responsible for the above result.
'^[abcd]': the caret (^) means that the pattern match is applied at the beginning of the title, and the charlist [abcd] means that only movie titles starting with a, b, c or d are returned in our result set.
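The effect of the caret can be checked without any table, because REGEXP also works on string literals and returns 1 for a match and 0 otherwise. The sketch below assumes the default case-insensitive collation, which is why an uppercase first letter matches the lowercase charlist. The column aliases are illustrative only.

```sql
SELECT
  'Code Name Black'    REGEXP '^[abcd]' AS anchored_match,    -- 1: title starts with "C"
  'The Great Dictator' REGEXP '^[abcd]' AS anchored_no_match, -- 0: title starts with "T"
  'The Great Dictator' REGEXP '[abcd]'  AS unanchored_match;  -- 1: "a", "c" and "d" occur later in the title
```

The last column shows why the anchor matters: without the caret, the charlist may match anywhere in the title.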
Let’s modify our above script and use the NOT charlist and see what results we will get after executing our query.
SELECT * FROM `movies` WHERE `title` REGEXP '^[^abcd]';
Executing the above script in MySQL workbench against the myflixdb gives us the following results.
|1||Pirates of the Caribean 4||Rob Marshall||2011||1|
|2||Forgetting Sarah Marshal||Nicholas Stoller||2008||2|
|9||Honey mooners||John Schultz||2005||8|
|17||The Great Dictator||Chalie Chaplie||1920||7|
|19||movie 3||John Brown||1920||8|
Let’s now take a close look at our regular expression responsible for the above results.
'^[^abcd]': the first caret (^) means that the pattern match is applied at the beginning of the title, while the caret inside the brackets negates the charlist, so [^abcd] matches any single character other than a, b, c or d. Movie titles starting with any of the enclosed characters are therefore excluded from the result set.
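A negated charlist is close to, but not the same as, negating the whole match with NOT REGEXP. The following sketch, which uses an inline derived table with illustrative aliases and assumes the default case-insensitive collation, shows the two side by side. They differ for an empty title, because [^abcd] still needs one character to match.

```sql
SELECT
  t.title,
  t.title REGEXP '^[^abcd]'    AS negated_charlist,
  t.title NOT REGEXP '^[abcd]' AS negated_operator
FROM (
  SELECT 'movie 3' AS title            -- starts with "m": both columns are 1
  UNION ALL SELECT 'Angels and Demons' -- starts with "A": both columns are 0
  UNION ALL SELECT ''                  -- empty string: 0 for the charlist, 1 for NOT REGEXP
) AS t;
```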
Regular expression metacharacters
What we looked at in the above examples is the simplest form of a regular expression. Let’s now look at more advanced regular expression pattern matches. Suppose we want to search, using a regular expression, for only those movie titles that start with the pattern “code”. How would we go about it? The answer is metacharacters. They allow us to fine-tune our pattern search results.
|Metacharacter||Description||Example|
|*||The asterisk (*) matches zero (0) or more instances of the character or group preceding it.||SELECT * FROM `movies` WHERE `title` REGEXP 'da*'; gives all movies whose title contains “d” followed by zero or more “a” characters. For example, Da Vinci Code, Daddy’s Little Girls.|
|+||The plus (+) matches one or more instances of the character or group preceding it.||SELECT * FROM `movies` WHERE `title` REGEXP 'mon+'; gives all movies whose title contains “mo” followed by one or more “n” characters. For example, Angels and Demons.|
|?||The question mark (?) matches zero (0) or one instance of the character or group preceding it.||SELECT * FROM `categories` WHERE `category_name` REGEXP 'com?'; gives all categories whose name contains “co”, optionally followed by “m”. For example, comedy, romantic comedy.|
|.||The dot (.) matches any single character.||SELECT * FROM `movies` WHERE `year_released` REGEXP '200.'; gives all movies released in years containing “200” followed by any single character. For example, 2005, 2007, 2008.|
|[abc]||The charlist [abc] matches any one of the enclosed characters.||SELECT * FROM `movies` WHERE `title` REGEXP '[vwxyz]'; gives all movies whose title contains any single character in “vwxyz”. For example, X-Men, Da Vinci Code.|
|[^abc]||The charlist [^abc] matches any one character other than the ones enclosed.||SELECT * FROM `movies` WHERE `title` REGEXP '^[^vwxyz]'; gives all movies whose title starts with a character other than the ones in “vwxyz”.|
|[A-Z]||[A-Z] matches any uppercase letter.||SELECT * FROM `members` WHERE `postal_address` REGEXP '[A-Z]'; gives all members whose postal address contains any character from A to Z. For example, Janet Jones with membership number 1.|
|[a-z]||[a-z] matches any lowercase letter.||SELECT * FROM `members` WHERE `postal_address` REGEXP '[a-z]'; gives all members whose postal address contains any character from a to z. For example, Janet Jones with membership number 1.|
|[0-9]||[0-9] matches any digit from 0 through 9.||SELECT * FROM `members` WHERE `contact_number` REGEXP '[0-9]'; gives all members whose contact number contains at least one digit. For example, Robert Phil.|
|^||The caret (^) anchors the match at the beginning of the string.||SELECT * FROM `movies` WHERE `title` REGEXP '^[cd]'; gives all movies whose title starts with any of the characters in “cd”. For example, Code Name Black, Daddy’s Little Girls and Da Vinci Code.|
||||The vertical bar (|) separates alternatives.||SELECT * FROM `movies` WHERE `title` REGEXP '^[cd]|^[u]'; gives all movies whose title starts with any of the characters in “cd” or with “u”. For example, Code Name Black, Daddy’s Little Girls, Da Vinci Code and Underworld – Awakening.|
|[[:<:]]||[[:<:]] matches the beginning of a word.||SELECT * FROM `movies` WHERE `title` REGEXP '[[:<:]]for'; gives all movies whose title contains a word starting with “for”. For example, Forgetting Sarah Marshal.|
|[[:>:]]||[[:>:]] matches the end of a word.||SELECT * FROM `movies` WHERE `title` REGEXP 'ack[[:>:]]'; gives all movies whose title contains a word ending with “ack”. For example, Code Name Black.|
|[:class:]||[:class:] matches a character class when placed inside a bracket expression: [[:alpha:]] matches letters, [[:space:]] white space, [[:punct:]] punctuation and [[:upper:]] uppercase letters.||SELECT * FROM `movies` WHERE `title` REGEXP '[[:alpha:]]'; gives all movies whose title contains at least one letter. For example, Forgetting Sarah Marshal, X-Men. A title made up only of digits or punctuation would be omitted.|
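The quantifiers in the table are easy to misread, because each one applies only to the single character (or group) just before it. The sketch below tests a few of the table's patterns against string literals and also answers the earlier question about titles that start with “code”. It assumes the default case-insensitive collation. The column aliases are illustrative, and the `$` end-of-string anchor in the last column is a standard metacharacter that the table above does not list.

```sql
SELECT
  'Underworld'        REGEXP 'da*'   AS star_zero_a,      -- 1: "d" followed by zero "a" characters
  'Da Vinci Code'     REGEXP 'da*'   AS star_one_a,       -- 1: "Da"
  'movie 3'           REGEXP 'mon+'  AS plus_no_n,        -- 0: "mo" is not followed by "n"
  'Angels and Demons' REGEXP 'mon+'  AS plus_one_n,       -- 1: "mon" in "Demons"
  'Code Name Black'   REGEXP 'com?'  AS optional_m,       -- 1: "Co" matches even without an "m"
  'Code Name Black'   REGEXP '^code' AS starts_with_code, -- 1
  'Da Vinci Code'     REGEXP '^code' AS code_not_first;   -- 0: "Code" is not at the beginning

SELECT
  'X-Men' REGEXP '[[:alpha:]]'    AS has_a_letter, -- 1
  '2010'  REGEXP '[[:alpha:]]'    AS digits_only,  -- 0: no letter anywhere
  'X-Men' REGEXP '^[[:alpha:]]+$' AS letters_only; -- 0: the hyphen is not a letter
```

The word-boundary markers [[:<:]] and [[:>:]] depend on the server version. They belong to the regular expression library used by older MySQL releases, while MySQL 8.0.4 and later use the ICU library, where `\\b` plays that role instead.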
The backslash (\) is used as an escape character. If we want to use it as part of the pattern in a regular expression, we should use double backslashes (\\).
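A short sketch of the escaping rule, using the dot as the metacharacter to be matched literally. It assumes the default SQL mode, in which MySQL string literals treat the backslash as an escape character (that is, NO_BACKSLASH_ESCAPES is not enabled). The column aliases are illustrative only.

```sql
SELECT
  'movie 3' REGEXP 'movie.3'   AS dot_as_wildcard, -- 1: the dot matches the space
  'movie 3' REGEXP 'movie\\.3' AS escaped_dot,     -- 0: a literal "." is required
  'movie.3' REGEXP 'movie\\.3' AS literal_dot,     -- 1
  'movie.3' REGEXP 'movie[.]3' AS bracketed_dot;   -- 1: a dot inside brackets is literal
```

Two backslashes are needed because the string parser consumes one of them before the pattern reaches the regular expression engine. Placing the character inside a charlist, as in the last column, avoids backslashes altogether.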
Summary
- Regular expressions provide powerful and flexible pattern matching that can help us implement powerful search utilities for our database systems.
- REGEXP is the operator used when performing regular expression pattern matches. RLIKE is a synonym.
- Regular expressions support a number of metacharacters which allow for more flexibility and control when performing pattern matches.
- The backslash is used as an escape character in regular expressions. It is only treated as part of the pattern if double backslashes are used.
- By default, regular expression matches are not case sensitive.
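The case-sensitivity point above can be checked, and overridden, with the REGEXP_LIKE() function. This is a sketch that assumes MySQL 8.0 or later, where REGEXP_LIKE() accepts an optional match-type argument ('c' for case-sensitive, 'i' for case-insensitive), and a default case-insensitive collation. The column aliases are illustrative only.

```sql
SELECT
  'Code Name Black' REGEXP 'code'                AS default_match,    -- 1 under a case-insensitive collation
  REGEXP_LIKE('Code Name Black', 'code', 'c')    AS case_sensitive,   -- 0: "Code" differs from "code"
  REGEXP_LIKE('Code Name Black', 'code', 'i')    AS case_insensitive; -- 1
```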