The process of shader optimization can seem like trial and error... in fact, that's how it is most of the time.
Most shader optimizations boil down to educated guesses. Each time a shader is compiled, the GPU driver for that specific hardware converts your code into actual machine code, so the generated machine code differs from GPU to GPU. The driver may also perform optimizations of its own on top of yours that are not available on another GPU, which makes it difficult to have a standard way of writing optimal shader code.

So the best way to know for sure is to actually test it on the hardware you are targeting.
With that said, here are some universal ways of getting your shader to perform better.😅
1. ## Do Calculations On Vertex Shader

The most common use case for this is lighting. An example is Gouraud lighting, where lighting calculations are done per vertex at the cost of some quality. A mesh usually has far fewer vertices than the fragments it covers on screen, so work moved to the vertex shader typically runs fewer times. Some calculations, such as UV coordinate modifications and world space/local space calculations, can be done per vertex instead of per fragment and still retain the desired quality.

A good example is rotating the UV coordinates of the skybox texture in the vertex shader: the modified UV coordinates are passed to the fragment shader, which then renders the rotated skybox.
Even fog can be approximated per vertex and still get good results.
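
As a rough sketch of the UV rotation case, the rotation can be computed once per vertex and interpolated, leaving the fragment shader with only a texture lookup. The names `_ModelViewProjection`, `_Rotation` and `_MainTex` are hypothetical uniforms assumed to be set by the application, and the code assumes a Cg/HLSL-style shader sampling a plain 2D texture:

```hlsl
// Hypothetical uniforms, assumed to be supplied by the application.
float4x4 _ModelViewProjection; // object space -> clip space
float _Rotation;               // rotation angle in radians
sampler2D _MainTex;

struct VertexInput
{
    float4 position : POSITION;
    float2 uv       : TEXCOORD0;
};

struct VertexOutput
{
    float4 position : SV_POSITION;
    float2 uv       : TEXCOORD0;
};

VertexOutput vert(VertexInput v)
{
    VertexOutput o;
    o.position = mul(_ModelViewProjection, v.position);

    // Rotate the UVs around the texture centre, once per vertex.
    float s, c;
    sincos(_Rotation, s, c);
    float2 centered = v.uv - 0.5;
    o.uv = float2(c * centered.x - s * centered.y,
                  s * centered.x + c * centered.y) + 0.5;
    return o;
}

float4 frag(VertexOutput i) : SV_Target
{
    // No trigonometry here: the rotated UVs arrive already interpolated.
    return tex2D(_MainTex, i.uv);
}
```

Because a rotation is a linear operation, interpolating the rotated UVs should give the same result as rotating per fragment, so nothing is lost in this particular case.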
2. ## Store Complex Functions In Texture

A good example of this is when you need noise to achieve an effect. Most of the time, sampling from a noise texture is more performant than calculating it per fragment/vertex.
Another example is ambient occlusion. Baked-in ambient occlusion basically comes for free, as it can be considered part of the albedo/diffuse texture itself.

Ambient occlusion solutions like SSAO should only be used when the game visuals require better visual interaction of objects with the surrounding geometry.
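
A minimal sketch of the noise case might look like the following. `_NoiseTex` and `_NoiseScale` are hypothetical names, and the noise texture is assumed to be a tileable greyscale image generated offline and set to repeat:

```hlsl
sampler2D _MainTex;
sampler2D _NoiseTex;  // hypothetical: pre-generated, tileable greyscale noise
float _NoiseScale;    // hypothetical: how many times the noise repeats across the surface

float4 frag(float2 uv : TEXCOORD0) : SV_Target
{
    float4 color = tex2D(_MainTex, uv);

    // One texture fetch stands in for evaluating a procedural
    // noise function for every fragment.
    float noise = tex2D(_NoiseTex, uv * _NoiseScale).r;

    color.rgb *= noise;
    return color;
}
```

Whether the fetch actually beats the math depends on the hardware and on how bandwidth-bound the shader already is, so this is worth measuring on the target device.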
3. ## Avoid Non-Constant Math

Never recalculate something every time the shader is called; either use uniforms or declare constants within the shader itself.
There are 3 main ways of reducing non-constant code.
1. Declare a variable with a constant value.
Example of doing it correctly:

float x = 22.0/7.0;
float y = 3.5 * sin(3.14);
Example of doing it wrong:

float x = 22.0;
x = x/7.0;
float y = sin(3.14);
y *= 3.5;
2.  Use #defines to declare universal constants like PI and its variants like TWO_PI, HALF_PI etc.
These can be defined anywhere in the CG program, but it is best to declare all #defines at the top.
Example:

#define PI 3.14159
#define HALF_PI 1.57079
#define TWO_PI 6.28318
3.  Use regular const statements. These perform the same function as #defines, but const statements can only be declared within the vertex/fragment shader.
Example:

const float PI = 3.14159;
4. ## Use Lowest Precision Possible (applies mainly to mobile)

On PCs and on most consoles, all calculations are carried out at the highest precision regardless of the type declared, but on mobile the lower precision types are honored as well.
There are 3 levels of precision: float, half & fixed.
• float is a 32-bit value, usually used for positions, UV coordinates and large values in general.
• half is generally a 16-bit value with a range of -60,000 to +60,000 and about 3 decimal digits of precision. It is useful for accessing image data in the HDR range, UV coordinates and general floating point operations that don't exceed its range.
• fixed has a value range of -2.0 to +2.0 and is mostly used when dealing with colors that are not in the HDR range.
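
To illustrate how the three types might be mixed in one fragment shader, here is a small sketch. It assumes a Unity-style Cg shader where `half` and `fixed` are available, and `_Tint` and `_Exposure` are hypothetical uniforms:

```hlsl
sampler2D _MainTex;
fixed4 _Tint;     // hypothetical: LDR color, fits comfortably in fixed
half _Exposure;   // hypothetical: HDR multiplier, may push values past fixed's range

struct v2f
{
    float4 pos : SV_POSITION; // positions stay at full precision
    float2 uv  : TEXCOORD0;
};

fixed4 frag(v2f i) : SV_Target
{
    // Plain LDR color math: fixed is enough.
    fixed4 albedo = tex2D(_MainTex, i.uv) * _Tint;

    // The exposed color can exceed 2.0, so it is held in half rather than fixed.
    half3 lit = albedo.rgb * _Exposure;

    // Clamp back to 0..1 before returning an LDR color.
    return fixed4(saturate(lit), albedo.a);
}
```

On hardware that treats every type as full precision this compiles to the same thing as an all-float version, so the benefit only shows up on devices that honor the lower precision types.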
5. ## Multiply Scalar Values Before Vector Values

Multiplying two scalars is a single, fast operation, but when a scalar is multiplied with a vector, each component of the vector has to be multiplied by the scalar.
So multiply all the scalars first, then multiply the result with the vectors involved, thereby reducing the number of instructions that need to be executed. With two scalars and a float3, for instance, (a * b) * v costs 1 + 3 = 4 multiplications, while a * (b * v) costs 3 + 3 = 6.
Example of doing it correctly:

float height = 2.45;
float width = 1.22;
float3 world = float3(1, 3, 5);
float3 pos = (height * width) * world;
Example of doing it wrong:

float height = 2.45;
float width = 1.22;
float3 world = float3(1, 3, 5);
float3 pos = height * (width * world);
6. ## Avoid Branching Based On Dynamically Set Values

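Branching code is usually the main culprit when it comes to slow shader performance.
However, branching only causes a slowdown if the condition being checked is local and can change with each vertex/fragment. This is due to how GPUs perform calculations: neighbouring vertices/fragments are processed together in lockstep, so when they disagree on a condition the group generally has to execute both sides of the branch.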

Uniforms and constants, however, do not cause slowdowns because they do not change based on vertex/fragment.
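
The difference between the two kinds of condition can be sketched as follows. `_UseTint` and `_Tint` are hypothetical uniforms, and whether the compiler emits a real branch for either `if` is up to the driver, so treat this as an illustration rather than a guaranteed outcome:

```hlsl
sampler2D _MainTex;
float4 _Tint;     // hypothetical uniform
float _UseTint;   // hypothetical uniform: 0 or 1, set once per draw call

float4 frag(float2 uv : TEXCOORD0) : SV_Target
{
    float4 color = tex2D(_MainTex, uv);

    // Uniform condition: every fragment in the draw call takes the same path.
    if (_UseTint > 0.5)
    {
        color *= _Tint;
    }

    // Per-fragment condition: it depends on the sampled texel, so
    // neighbouring fragments can end up on different paths.
    if (color.r > 0.5)
    {
        color.rgb = sqrt(color.rgb);
    }

    return color;
}
```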
There are various techniques that can be used to prevent branching even when the values are dynamic, which we will go over in Part 2 (Will Be Released Soon).