Overview
The REGEXP_REPLACE() function replaces all occurrences of a regular expression pattern in a string with a specified replacement string.
Syntax
The syntax for REGEXP_REPLACE() function is as follows:
REGEXP_REPLACE(source_string, pattern, replacement, [flags])
Parameters
source_string: string that we want to perform the replacement on
pattern: POSIX regular expression pattern to match
replacement: replacement string
flags: (optional) string that changes the matching behavior of REGEXP_REPLACE() function
The flags parameter is an optional string that controls how the function operates. Here is a list of flags supported in Oxla:
g: global replacement. This flag ensures that all occurrences of the pattern are replaced
i: use this flag for case-insensitive matching
c: REGEXP_REPLACE() function is case sensitive by default, using the c flag has the same effect as using no flags
Examples
Basic function usage
In this example, we will focus on using REGEXP_REPLACE() function with a basic POSIX regular expression pattern:
SELECT REGEXP_REPLACE('The OXLA supports various data types', 'T[^ ]*', 'We') AS "Replaced_String";
By executing the query above, we will get the following output:
Replaced_String
-----------------------------------------
We OXLA supports various data types
The pattern used was “T[^ ]*”, which matches any substring that starts with a ‘T’ character, followed by any number of non-space characters. The function replaces the matched substring with the specified replacement string “We”.
Replacing special characters
This example demonstrates how to replace a non-alphanumeric character in a string with a tilde (~):
SELECT REGEXP_REPLACE('Hello World!', '[^A-Za-z0-9 ]', '~') AS "Replaced_String";
In the above query, the second parameter is a regular expression “[^A-Za-z0-9 ]” that matches any characters that are not uppercase / lowercase letters, digits or spaces. The output for executing the query above will be as follows:
Replaced String
-------------------
Hello World~
Flags usage
Replacing certain substrings with a single flag defined
This example will focus on using the REGEXP_REPLACE() function with a defined flag and replacing certain substrings in a string. For the needs of this section, we will create a sample quotes table:
CREATE TABLE quotes (quotes_text text);
INSERT INTO quotes (quotes_text)
VALUES ('Work hard and stay hungry. Lazy people get nowhere in life.'),
('An excuse is a way for a LAZY person to feel better.'),
('The word LUCKY is how a lazy person describes someone who works hard.');
SELECT quotes_text FROM quotes;
By executing the code above, we will get the following output:
quotes_text
-----------------------------------------------------------------------
Work hard and stay hungry. Lazy people get nowhere in life.
An excuse is a way for a LAZY person to feel better.
The word LUCKY is how a lazy person describes someone who works hard.
(3 rows)
Now, we will use the REGEXP_REPLACE() function with the i flag specifiec to replace all occurrences of the word lazy with active regardless of the case sensitivity:
SELECT quotes_text, REGEXP_REPLACE(quotes_text, 'lazy', 'active', 'i') AS "New quotes" FROM quotes;
In this case, all occurrences of the word lazy have been replaced with active:
quotes_text | New quotes
-----------------------------------------------------------------------+-------------------------------------------------------------------------
Work hard and stay hungry. Lazy people get nowhere in life. | Work hard and stay hungry. active people get nowhere in life.
An excuse is a way for a LAZY person to feel better. | An excuse is a way for a active person to feel better.
The word LUCKY is how a lazy person describes someone who works hard. | The word LUCKY is how a active person describes someone who works hard.
(3 rows)
Specifying one or more flags
Without specifying the g flag, REGEXP_REPLACE() function replaces only the first occurrence of a substring:
SELECT REGEXP_REPLACE('ab12c', '[0-9]', 'X');
regexp_replace
----------------
abX2c
In this case, as you can see only the first digit (1) was replaced with X. By adding the g flag, all occurrences are replaced with X:
SELECT REGEXP_REPLACE('ab12c', '[0-9]', 'X', 'g');
regexp_replace
----------------
abXXc
If you use multiple flags, the last one takes precedence. If you use the ci flags, the regex will be case-insensitive, while using the ic flags it will be case-sensitive
Restrictions
- The function returns
NULL if there are no input rows or NULL values
- If the regular expression pattern isn’t found in the string, the
REGEXP_REPLACE() function returns the original string
i and c flags shouldn’t be used with each other