
Introduction To Regular Expressions [ Regex ]


What Is A Regular Expression

A regular expression, or regex for short, is a sequence of characters that defines a search pattern.
Let me simplify this for you.
Imagine you're writing a huge assignment or report and you realize that you misspelled a word, and that the word was used a couple of hundred times throughout your document. Any sane person living in the 21st century would use:
Find and Replace -> "misspelled word" -> "correct word".

Have you ever wondered how the computer checks for the word?
How the words are magically found and replaced?
It just searches the entire document for "word-to-be-searched" and replaces each occurrence with the new string.
Now imagine a different scenario: you are asked to redact phone numbers from a letter. You don't know any of the numbers in advance, and the letter contains plenty of other numbers that are not phone numbers and should not be redacted. What would you do then?
This is where regular expressions come into play. A regular expression is basically a phrase, a sequence of characters, that describes the whole set of strings a given grammar allows. To simplify it further:
Let G be the alphabet we use, such that G = {a, b, c}
This means that we are only going to use a, b, and c as the letters of our language instead of all 26 letters of the English alphabet.

Now the letters can come in any order and any number of times, say abc, bac, aaa, bbb, acc, and so on. Suppose that we have to find all the three-letter words that start with a, have any of the 3 letters in the middle, and end with c. Our possible options are abc, aac, and acc.
[Figure: State Transition Diagram of the regex state machine]
Looking at the state transition diagram, A, B, and C denote the states, and a, b, and c denote the letters passed in. We always begin at the start state. Here the automaton moves on if and only if the first character is 'a'; otherwise we skip the word altogether.

From state A, the next state, B, is reached when any of the three letters is encountered. From B we move to state C, the final state (denoted by a double circle), if and only if 'c' is encountered. Otherwise we move to a dead state, which means that the automaton halts and reports that the word does not belong to the accepted set (aac, abc, and acc).
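The walk through the states can also be written out by hand. The following is only a sketch of that idea: the separate Start state placed before A, the enum, and the names AbcAutomaton and Accepts are assumptions made for illustration and are not taken from the diagram itself.

```csharp
using System;

public static class AbcAutomaton
{
    // Hypothetical state names: Start -> A -> B -> C (accepting), plus Dead.
    private enum State { Start, A, B, C, Dead }

    // Returns true only for the words aac, abc, and acc.
    public static bool Accepts(string word)
    {
        State state = State.Start;
        foreach (char ch in word)
        {
            switch (state)
            {
                case State.Start:
                    // The first character must be 'a'.
                    state = ch == 'a' ? State.A : State.Dead;
                    break;
                case State.A:
                    // Any of the three letters moves us on to B.
                    state = (ch == 'a' || ch == 'b' || ch == 'c') ? State.B : State.Dead;
                    break;
                case State.B:
                    // Only 'c' reaches the final state C.
                    state = ch == 'c' ? State.C : State.Dead;
                    break;
                default:
                    // Extra input after C, or anything in Dead, is rejected.
                    state = State.Dead;
                    break;
            }
        }
        return state == State.C;
    }

    public static void Main()
    {
        foreach (string word in new[] { "abc", "aac", "acc", "abb", "bac", "abcc" })
        {
            Console.WriteLine($"{word}: {Accepts(word)}");
        }
    }
}
```

A regex engine does this bookkeeping for you, which is why the pattern can be written as one short line of text instead.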
You don't always need to draw out the transition diagram to use a regular expression. The whole diagram can be represented in text as:
Regular Expression: a(a+b+c)c
Where the symbols represent the following:
+ :- OR operator
* :- 0 or more occurrences
So if we need to find a sequence where the above word repeats itself over and over, we just use the regular expression (a (a + b + c) c)*
Now that you have some understanding of what regular expressions are and how they work, let's move on to the actual coding side and the implementation of regex.
Implementation In C#

To access regular expression functionality, you have to add a using directive for the System.Text.RegularExpressions namespace.
Some of the symbols are interpreted differently when you use them in programming languages like C#. For example, '+' is no longer the OR operator: it means one or more occurrences, and OR is written as '|'.
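As a rough illustration of that change in notation, here is how the formal expression a(a+b+c)c from earlier might be written for .NET. The class and constant names are made up for this sketch, and the ^ and $ anchors are an added assumption so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormalToDotNet
{
    // Formal a(a+b+c)c: in .NET, OR is written '|' (a character class [abc] also works).
    public const string SingleWord = "^a(a|b|c)c$";

    // Formal (a(a+b+c)c)*: '*' means zero or more repetitions in .NET.
    public const string RepeatedWord = "^(a[abc]c)*$";

    public static void Main()
    {
        Console.WriteLine(Regex.IsMatch("abc", SingleWord));      // True
        Console.WriteLine(Regex.IsMatch("abb", SingleWord));      // False
        Console.WriteLine(Regex.IsMatch("abcaac", RepeatedWord)); // True
        Console.WriteLine(Regex.IsMatch("", RepeatedWord));       // True (zero repetitions)
        Console.WriteLine(Regex.IsMatch("a+c", SingleWord));      // False ('+' is not a, b, or c)
    }
}
```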

List Of Symbols And Their Use In C# Regex

  • ‘\b’: Matches a position rather than a character: the boundary at the beginning or end of a word.
  • ‘\d’: Any digit (0-9; by default .NET also accepts digits from other Unicode scripts).
  • {n}: Placed after a symbol or group; the preceding element must occur exactly n times.
  • ‘+’: One or more occurrences.
  • ‘\w’: Any word character: a letter, a digit, or an underscore.
  • ‘.’: Any character except a new line.
  • ‘\s’: Any whitespace character.
  • ‘^’: Beginning of a string.
  • ‘$’: End of a string.
  • ‘*’: Zero or more occurrences.
  • {n,m}: Repeat the symbol at least n times but not more than m times.
  • {n,}: Repeat the symbol at least n times with no upper limit.
  • ‘\W’: Any character that is not a word character.
  • ‘\S’: Any character that is not whitespace.
  • ‘\D’: Any character that is not a digit.
  • ‘\B’: A position that is not at the beginning or end of a word.
  • [^x]: Any character that is not x.
  • [^aeiou]: Any character that is not a lowercase vowel (this includes consonants, but also digits, spaces, and punctuation).
  • ‘*?’: Zero or more occurrences, but as few as possible.
  • ‘+?’: One or more occurrences, but as few as possible.
  • ‘??’: 0 or 1 occurrences, but as few as possible.
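To see a few of these symbols working together, here is a small sketch that returns to the phone number redaction scenario and also contrasts ‘*’ with ‘*?’. The phone format (three digits, a dash, four digits) and the names SymbolDemo, RedactPhones, and FirstQuoted are assumptions chosen for illustration; real phone numbers come in many more shapes.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SymbolDemo
{
    // Assumed format for illustration only: 3 digits, '-', 4 digits (e.g. 555-0142).
    // \b keeps the pattern from matching inside a longer run of digits.
    private const string PhonePattern = @"\b\d{3}-\d{4}\b";

    public static string RedactPhones(string text)
    {
        return Regex.Replace(text, PhonePattern, "[REDACTED]");
    }

    // Returns the first quoted piece of text, matched greedily (.*) or lazily (.*?).
    public static string FirstQuoted(string text, bool lazy)
    {
        string pattern = lazy ? "\".*?\"" : "\".*\"";
        return Regex.Match(text, pattern).Value;
    }

    public static void Main()
    {
        // Other numbers such as 2024 and 12 do not fit the pattern and are left alone.
        Console.WriteLine(RedactPhones("Call 555-0142 before 2024, room 12."));

        string sample = "say \"hi\" and \"bye\"";
        Console.WriteLine(FirstQuoted(sample, lazy: false)); // "hi" and "bye"
        Console.WriteLine(FirstQuoted(sample, lazy: true));  // "hi"
    }
}
```

The greedy version runs to the last quote it can find, while the lazy version stops at the first one, which is usually what you want when pulling out short delimited pieces.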
The main function that we will be using from the Regex class is:
Regex.Matches(string input, string pattern) :- returns a MatchCollection object.
We will go over the MatchCollection class in the future.
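Until then, a brief sketch of what comes back from Regex.Matches may help. Each item in the collection is a Match, which exposes the matched text (Value) and where it starts in the input (Index). The helper name Describe and the sample sentence are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchCollectionDemo
{
    // Collects a short description of every match of 'pattern' in 'text'.
    public static List<string> Describe(string text, string pattern)
    {
        var lines = new List<string>();
        MatchCollection matches = Regex.Matches(text, pattern);
        foreach (Match m in matches)
        {
            lines.Add($"'{m.Value}' at index {m.Index}");
        }
        return lines;
    }

    public static void Main()
    {
        // \d+ finds each run of one or more digits.
        foreach (string line in Describe("Room 12, floor 3, desk 145", @"\d+"))
        {
            Console.WriteLine(line);
        }
    }
}
```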

Example Program:

using System;
using System.Text.RegularExpressions;

public class BitshiftProgrammer
{
    private static void CheckForCapitalsAtSentenceStart(string text)
    {
        // Match a literal '.', then a space, then one capital letter between A and Z,
        // then any run of word characters until a non-word character appears.
        MatchCollection mc = Regex.Matches(text, @"\. [A-Z]\w*");
        foreach (Match m in mc)
        {
            Console.WriteLine(m);
        }
    }

    public static void Main(string[] args)
    {
        Console.WriteLine("Checking for capital rule non-violating words");
        CheckForCapitalsAtSentenceStart("This is first sentence. Second sentence is better. third sentence needs some work. Fourth has become better");
    }
}
Output :
Checking for capital rule non-violating words
. Second
. Fourth
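Note that the first word, "This", is not reported, because the pattern only looks for a capital letter that follows a period and a space.
We will go over many more C# examples in the future, including much more complicated ones.
Well, I hope you learnt something of value.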
