Oracle Regular Expression
Regular expressions allow you to perform powerful context
searches in variable-length strings. They provide a a powerful set of pattern
matching capabilities by combining the following
Character Classes
|
are groups of possible characters at a point in the search
|
Collation Classes
|
are sets of characters and are treated like a range
|
Metacharacters
|
are operators that specify search algorithms
|
Metasequences
|
are operators created by two metacharacters or literals
|
Literals
|
are characters, character sets and words
|
Character Classes
Character classes are groups or ranges of possible
characters, they are traditionally delimited by square brackets [], You use a
dash "-" to specifiy a range i.e [1-9]. You can use the posix
character class inside the brackets [:alpha:] see below for more examples
[:alnum:]
|
All alphanumeric characters
|
[:alpha:]
|
All alphabetic characters
|
[:cntrl:]
|
All non-printable control characters
|
[:digit:]
|
All numeric digits
|
[:graph:]
|
all [:digit:], [:lower:], [:punct:] and [:upper:] portable
character class
|
[:lower:]
|
All lower alphabetic characters
|
[:print:]
|
All printable characters
|
[:punct:]
|
All punctuation characters
|
[:space:]
|
All nonprinting space characters
|
[:upper:]
|
All upper alphabetic characters
|
[:xdigit:]
|
All hexidecimal characters
|
Now for an example
Character class example:
DECLARE
Character class example:
DECLARE
counter NUMBER := 1;
source_string VARCHAR2(12) := 'A1';
pattern1 VARCHAR2(12) := '[[:alpha:]]'; -- use the posix character class
pattern2 VARCHAR2(12) := '[[:alnum:]]';
BEGIN
-- Compare using standard character class ranges.
FOR i IN 1..LENGTH(source_string)
LOOP
LOOP
IF REGEXP_INSTR(SUBSTR(source_string,counter,i),pattern1) = i
THEN
THEN
dbms_output.put(REGEXP_SUBSTR(SUBSTR(source_string,counter,i),pattern1));
ELSE
dbms_output.put_line(REGEXP_SUBSTR( SUBSTR(source_string,counter,i),pattern2));
END IF;
counter := counter + 1;
END LOOP;
END;
/
This class is new and is designed to allow you to collate
languages that require a collating element. You define a collation class by
using [..]
Collation Classes
Collation Classes
Collation class example
|
[a-[.ch.]]
Note: allows you to find element that is between an a or a
ch
|
Metacharacters
A metacharacter provides some mechanics for performing
pattern matching, lots of programming and scripting languages use the same
syntax.
Metacharacter
|
Name
|
Type
|
Description
|
()
|
parentheses
|
Delimiter
|
act as a constraint on the scope of comparsion
|
{m}
|
exact
|
Interval
|
matches exactly m occurences of the preceding
subexpression or character
|
{m, }
|
at least
|
Interval
|
matches at least m occurences of the preceding
subexpression or character
|
{m,n}
|
between
|
Interval
|
matches exactly m occurences but no more than n of the
preceding subexpression or character
|
|
|
OR
|
Logical
|
acts as a logical OR operator
|
.
|
dot
|
Matching
|
matches any one character
|
^
|
caret
|
Matching
|
matches the beginning of a line
|
$
|
dollar
|
Matching
|
matches the end of a line
|
[^]
|
caret
|
Negation
|
negates when used inside square brackets (like the NOT
operator)
|
-
|
dash
|
Range
|
specifies a range when inside square brackets
|
?
|
question mark
|
Repetition
|
makes the proceding character optional
|
*
|
asterisk
|
Repetition
|
matches any instance of zero to many characters
|
+
|
plus
|
Repetition
|
matches at least once or many times the preceding
character.
|
The best way to learn these is to google regular expression
on the web, as i said before many programming languages and scripting languages
use metacharacters.
Metasequences
Those of you who have used Perl before will under the
metasequences, they are a character combined with a backslash to represent a
pattern or characters.
Metasequence
|
Name
|
Type
|
Description
|
\n
|
backreference
|
Posix
|
matches the nth preceding subexpression
|
\d
|
digit
|
Perl
|
matches any digit
|
\D
|
nondigit
|
Perl
|
matches any non-digit
|
\w
|
word character
|
Perl
|
matches any word characters
|
\W
|
nonword character
|
Perl
|
matches any non-word characters
|
\s
|
whitespace character
|
Perl
|
matches any whitespace character
|
\S
|
nonwhitespace character
|
Perl
|
matches any non-whitespace character
|
\A
|
beginning of a string
|
Perl
|
matches a beginning of a string
|
\Z
|
end of a sting
|
Perl
|
matches the end of a string
|
\z
|
end of a string
|
Perl
|
matches the end of a string
|
Literals
Literals values are simply string literals, they may consist
of one or many characters.
Oracle 11g Regular Expression
Oracle 11g has implemented a number of regular expression
functions, so far there are 5 in total
Regular Expression Functions
|
|||
REGEXP_COUNT
|
lets you count the number of times a specific pattern is
found in a string
|
||
REGEXP_INSTR
|
lets you find a position index value in a string
|
||
REGEXP_LIKE
|
lets you find a regular expression match inside a string,
it's like the LIKE operator
|
||
REGEXP_REPLACE
|
lets you find and replace a substring inside of a string
|
||
REGEXP_SUBSTR
|
lets you find a substring inside a string
|
||
Match Type Flags
|
|||
i
|
sets the search to case-insensitive matching, overriding
the nls_sort parameter
|
||
c
|
sets the search to case-sensitive matching, overriding the
nls_sort parameter
|
||
n
|
Enables the dot (.) to truly match any character,
including the newline
|
||
m
|
Enables a search to recognise multiple lines inside a
string
|
||
x
|
sets the search to ignore any whitespace characters
|
||
Examples
|
|||
REGEXP_COUNT
|
DECLARE
mystring varchar2(24) := 'abcdefABCDEFabcdefABCDEF'; mycount number; BEGIN -- match a patterm - should result in 2 select regexp_count(mystring, 'abcdef') into mycount from dual; dbms_output.put_line('Total Matches: ' || mycount);
-- match a single pattern starting from
position 7 - should result in 1
select regexp_count(mystring, 'abcdef',7) into mycount from dual; dbms_output.put_line('Total Matches: ' || mycount);
-- match a pattern but ignore case -
should result in 4
select regexp_count(mystring, 'abcdef',1,'i') into mycount from dual; dbms_output.put_line('Total Matches: ' || mycount); END; / |
||
REGEXP_INSTR
|
create table employees (emp_id number
,emp_name varchar2(30) ); insert into employees values (1, 'Paul Valle');
select regexp_instr(emp_name,'Valle',1,1,0,'i') as Begin
regexp_instr(emp_name,'Valle',1,1,1,'i') - 1 as End from employees |
||
REGEXP_LIKE
|
create table employees ( emp_id number
,emp_name varchar2(30) ); insert into employees values (1, 'Paul Valle');
select emp_id
from employees where regexp_like(emp_name, 'Valle'); |
||
REGEXP_REPLACE
|
create table employees (emp_id number
,emp_name varchar2(30) ); insert into employees values (1, 'Paul Valle');
select regexp_replace(emp_name, 'Paul', 'Lorraine',1)
from employees; |
||
REGEXP_SUBSTR
|
create table employees ( emp_id number
,emp_name varchar2(30) ); insert into employees values (1, 'Paul Valle');
select regexp_substr(emp_name, 'V[:alpha:]+e',1)
from employees; -- using a character class |
||