Overview

The REGEXP_MATCH() function matches a POSIX regular expression pattern to a string. It returns an array of TEXT[] type with substring(s) of matched groups within the first match.

Syntax

The syntax for REGEXP_MATCH() function is as follows:

REGEXP_MATCH(source_string, pattern, [flags])

Parameters

  • source_string: string on which you want to perform the matching
  • pattern: POSIX regular expression pattern to match
  • flags: (optional) string with flags that change the matching behavior of REGEXP_MATCH() function

The flags parameter is an optional string that controls how the function operates. Here is a list of flags that are supported by Oxla:

  • i: use this flag for case-insensitive matching
  • c: REGEXP_MATCH() function is case sensitive by default, using the c flag has the same effect as having no flags at all
If you use multiple flags, the last one takes precedence. If you use the ci flags, the regex will be case-insensitive, while using the ic flags it will be case-sensitive

Examples

Basic ssage

The following example demonstrates how to find the first occurrence of an email address in the input string:

SELECT REGEXP_MATCH('Contact us at [email protected]', '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}');
   regexp_match   
------------------
 {hello@oxla.com}
(1 row)

Matching multiple groups

The REGEXP_MATCH() function can capture multiple groups within a match, which allow you to extract key parts from a string in a structured way. The example below extracts the protocol, domain and path from a given URL:

SELECT REGEXP_MATCH('https://www.example.com/products/item123', '(https?)://([\w.-]+)/(.+)');
               regexp_match               
------------------------------------------
 {https,www.example.com,products/item123}
(1 row)

Case-insensitive matching

This example shows how to match a pattern regardless of case-sensitivity:

SELECT REGEXP_MATCH('[email protected]', '@([a-z0-9.-]+)$', 'i');
 regexp_match  
---------------
 {Example.COM}
(1 row)

Matching with patterns stored in a table

In this example we will show you how to take the source string and regex pattern directly from the table. For the needs of this section, let’s create two sample tables:

CREATE TABLE users (
    email TEXT NOT NULL
);

CREATE TABLE patterns (
    id INT,
    regex_pattern TEXT NOT NULL
);

Once that is done, let’s insert values into those tables:

INSERT INTO users (email) VALUES 
    ('[email protected]'),
    ('[email protected]'),
    ('invalid-email@wrong');

INSERT INTO patterns (id, regex_pattern) VALUES 
    (0, '^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$');

Now, we can validate if user emails in users table are valid. If the regex doesn’t match, a NULL value is returned.

SELECT users.email, 
       patterns.regex_pattern,
       REGEXP_MATCH(users.email, patterns.regex_pattern, 'i') AS is_valid
FROM users
JOIN patterns ON patterns.id = 0;
        email        |              regex_pattern              |   is_valid    
---------------------+-----------------------------------------+--------------------
 user@example.com    | ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$ | {user@example.com}
 admin@test.org      | ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$ | {admin@test.org}
 invalid-email@wrong | ^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$ | 
(3 rows)

Restrictions

  • The function returns NULL if it cannot match the regular expression pattern
  • i and c flags shouldn’t be used with each other