Java regex by example: strings

Last updated July 30, 2018

In this tutorial we use Java 8 to walk through several example programs that work with strings and regular expressions (Java regex).

Find a word in a string

To find a word or any other sequence of characters in a string, use the word itself as the regex pattern.

For example, here’s the Java regex pattern to find all occurrences of the word “world”:

"world"

Java code example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex
{
    public static void main(String[] args)
    {

        String str1 = "Hi Will intheworld!";
        String str2 = "Hello world!";

        Pattern pattern = Pattern.compile("world");

        matchPattern(str1, pattern);
        matchPattern(str2, pattern);
    }

    private static void matchPattern(String str1, Pattern pattern) {
        Matcher m = pattern.matcher(str1);
        while(m.find()) {
            System.out.print(m.group());
        }
        System.out.println();
    }
}

Output:

world
world

This pattern matches every occurrence of “world”, wherever it appears. It also matches when “world” is part of a longer word, as in “intheworld”.

Match whole word only in a string

To match a specific whole word we need the “word boundary” metacharacter \b, written as \\b in a Java string literal. With a boundary on each side, the pattern matches the whole word only and not the same sequence of characters inside a longer word.

The pattern will be:

"\\b(?:world)\\b"

Example strings:

String str1 = "Hi Will intheworld!";
String str2 = "Hello world!";

Only the second string contains “world” as a whole word, so it is the only one that produces a match:

world
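
The comparison above can be reproduced with a small program. This is only a sketch: the class name WholeWordExample and the helper findAll are made up for illustration and are not part of the original example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this sketch only.
class WholeWordExample {
    // Collects every match of the pattern in the input, in order.
    static List<String> findAll(String input, Pattern pattern) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        Pattern wholeWord = Pattern.compile("\\b(?:world)\\b");

        // "world" is embedded in "intheworld", so there is no boundary before "w".
        System.out.println(findAll("Hi Will intheworld!", wholeWord)); // []
        // Here "world" is preceded by a space and followed by "!".
        System.out.println(findAll("Hello world!", wholeWord));        // [world]
    }
}
```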

Match phrase in a string

We use the same “word boundary” metacharacter to match an exact phrase in a string.

Pattern:

"\\bhello world\\b"

This pattern finds the phrase “hello world”, but not “helloworld”. The match is case-sensitive, so “Hello world” is not found either.
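
A short sketch of the phrase match follows. The class PhraseExample and the helper containsPhrase are hypothetical names, and the Pattern.CASE_INSENSITIVE variant at the end is an addition for illustration, in case the phrase may start with a capital letter.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, named for this sketch only.
class PhraseExample {
    // Returns true if the pattern is found anywhere in the input.
    static boolean containsPhrase(String input, Pattern pattern) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        Pattern phrase = Pattern.compile("\\bhello world\\b");

        System.out.println(containsPhrase("say hello world now", phrase)); // true
        System.out.println(containsPhrase("say helloworld now", phrase));  // false, no space
        System.out.println(containsPhrase("Hello world!", phrase));        // false, capital "H"

        // Optional flag (not in the original pattern) to ignore case.
        Pattern ignoreCase =
                Pattern.compile("\\bhello world\\b", Pattern.CASE_INSENSITIVE);
        System.out.println(containsPhrase("Hello world!", ignoreCase));    // true
    }
}
```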

Get string between parentheses

To extract the text between a pair of parentheses, the pattern will be:

"\\(([^)]*)\\)"

Explanation:

  • \\( – opening parenthesis
  • \\) – closing parenthesis
  • (...) – start and end of the match group
  • [^)]* – match any character except “)” zero or more times (use “+” instead of “*” if you want to match one or more times)

Java code example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex
{
    public static void main(String[] args)
    {
        String str1 = "I'm the (First string) to be found. ()?";
        String str2 = "I'm the (SECOND string) to be found. Right? (haha?._,?)";

        // \\( and \\) match literal parentheses; the inner group captures the text between them
        Pattern pattern = Pattern.compile("\\(([^)]*)\\)");

        matchPattern(str1, pattern);
        matchPattern(str2, pattern);
    }

    private static void matchPattern(String str1, Pattern pattern) {
        Matcher m = pattern.matcher(str1);
        while(m.find()) {
            System.out.println(m.group(1));
        }
    }
}

Because the pattern uses “*”, this version of the program matches empty parentheses as well and prints an empty line for them.

Output:

First string

SECOND string
haha?._,?

Get string between square brackets

The pattern for square brackets is much like the pattern for parentheses: an opening bracket, one or more characters other than the closing bracket, and a closing bracket:

"\\[([^]]+)\\]"

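As a sketch of how the bracket pattern might be used, the program below collects the text inside each pair of square brackets. The class SquareBracketExample and the helper extract are hypothetical names. The sketch also escapes the closing bracket inside the character class ([^\\]]), which is an assumption made here to keep the pattern unambiguous.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this sketch only.
class SquareBracketExample {
    // One or more characters other than "]" between "[" and "]".
    private static final Pattern BRACKETS = Pattern.compile("\\[([^\\]]+)\\]");

    // Returns the content of every non-empty [...] pair in the input.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = BRACKETS.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The empty pair "[]" is skipped because "+" requires at least one character.
        System.out.println(extract("Items [first] and [second], but not []"));
        // prints [first, second]
    }
}
```
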
Get string between curly braces

The same approach works for curly braces:

"\\{([^}]+)\\}"

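A minimal sketch of the curly-brace pattern in use, for example to pull placeholder names out of a template string. The class CurlyBraceExample, the helper extract and the sample template are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this sketch only.
class CurlyBraceExample {
    // One or more characters other than "}" between "{" and "}".
    private static final Pattern BRACES = Pattern.compile("\\{([^}]+)\\}");

    // Returns the content of every non-empty {...} pair in the input.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = BRACES.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("Hello {name}, your id is {id}"));
        // prints [name, id]
    }
}
```
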
Extract string between two characters

To extract the text between two characters, for example “a” and “b”, the pattern will be:

"a(.*)b"

Explanation:

  • a ... b – between characters “a” and “b” (use any other character)
  • (...) – start and end of the match group
  • .* – match any character (.) zero or more times (*); the match is greedy, so it runs from the first “a” to the last “b” in the string

Java regex example code to find everything between two characters:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Regex
{
    public static void main(String[] args)
    {

        String str1 = "I'm a (string) to be found. []?";
        String str2 = "I'm the hahahb string to be found.";

        Pattern pattern = Pattern.compile("a(.*)b");

        matchPattern(str1, pattern);
        matchPattern(str2, pattern);
    }

    private static void matchPattern(String str1, Pattern pattern) {
        Matcher m = pattern.matcher(str1);
        while(m.find()) {
            System.out.println(m.group(1));
        }
    }
}

Output:

(string) to
hahb string to
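
The output above runs up to the last “b” because “.*” is greedy. If the shortest text between “a” and the next “b” is wanted instead, a reluctant quantifier (“.*?”) is one option. The sketch below compares the two; the class GreedyVsReluctantExample, the helper groups and the sample input "a1b a2b" are hypothetical and not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this sketch only.
class GreedyVsReluctantExample {
    // Returns capture group 1 of every match of the regex in the input.
    static List<String> groups(String input, String regex) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "a1b a2b";

        // Greedy: one match, from the first "a" to the last "b".
        System.out.println(groups(input, "a(.*)b"));  // [1b a2]
        // Reluctant: stops at the nearest "b", so there are two matches.
        System.out.println(groups(input, "a(.*?)b")); // [1, 2]
    }
}
```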

Exclude special characters in a string

To exclude the characters “<”, “>”, “$” and “%” from a string, list them in a negated character class, which starts with a caret (^) right after the opening square bracket:

"([^<>%$])"

Example strings:

String str1 = "Hi <there> %s%";
String str2 = "I'm the $$$.";

The output will be:

Hi there s
I'm the .
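
A possible full program for this case is sketched below. The class ExcludeCharsExample and both helper methods are hypothetical names. The first helper joins the characters matched by the negated class, and the second is an alternative, not taken from the original, that removes the unwanted characters with String.replaceAll.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, named for this sketch only.
class ExcludeCharsExample {
    // Matches any single character that is not "<", ">", "%" or "$".
    private static final Pattern ALLOWED = Pattern.compile("([^<>%$])");

    // Builds the result from every character the negated class matches.
    static String keepAllowed(String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = ALLOWED.matcher(input);
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    // Alternative: delete the unwanted characters directly.
    static String stripWithReplace(String input) {
        return input.replaceAll("[<>%$]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("Hi <there> %s%"));    // Hi there s
        System.out.println(keepAllowed("I'm the $$$."));      // I'm the .
        System.out.println(stripWithReplace("I'm the $$$.")); // I'm the .
    }
}
```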
