- Part 1 – A Mega List of Known Algorithms – Sorting
- Part 2 – Mega List of Known Algorithms – Searching

# Introduction

Due to the sheer increase in the volume of information, we have embarked on the journey towards understanding what it all means. But before understanding anything, you need to find where the bits of information are. So we need searching algorithms which can efficiently extract information from our databases. There are many searching algorithms but we will be discussing some of them here.

## Symbol Tables

A table which store key-value pairs is essentially a symbol table. We can put key-value pairs in the table, and later on, we can access it using the specific key, and symbol table will return us the value associated with it. So here there two important procedures, one is the insertion, and the other is searching. So a symbol table data structure should support two operations i.e. insert and search. One such example of symbol table structures is in programming language compile time symbol table construction. This table can be used at runtime by the multiple functions to access certain routines at certain places in the memory.

Symbol tables can be used to develop further algorithms. You can find some of the applications of symbol table in web-search, indexing, accounting, compilers, file management etc. Following is a partially filled example of symbol table in C.

1 2 3 4 5 6 7 8 9 10 11 |
typedef struct symbol { char* id; char* dataType; int offset; struct symbol* next; } symbol, *symbol_ptr; /**Following is the API to use**/ void insert(symbol** headref, char* id, char* dataType); symbol search(char* id); /*... other functions ...*/ |

In the design of the symbol table make sure that there are no duplicate keys, or any new insertion which has a duplicate key should remove the old value in the table with the new value. The keys that you insert should never be NULL. Also do not add values in the table which are nul. nul values can be set for the process of deleting a key-value pair only. Or to say in a new way is that you make a value nul for the purpose of deleting it in future.

A symbol table API should also provide the facility of iteration, key equality, and deletion.

## Binary Search Trees

Binary Search Trees are one of the implementations of symbol tables. It provides a mechanism to do ordered search. In binary search, a node points to two more nodes apart from the design of singly linked list. Every binary tree has a root node, which has no parents. A root has two children and subsequently, each child has two children each, and so on. A node that is the far end of the tree is called a leaf. Every node before leaf has two nodes i.e. one to the right of it, and one to the left of it.

Each node has a key. And this key will be larger than the all the keys of the nodes that are in the left subtree of this node. Also, this key should be less than the keys that are the nodes which are in the right subtree of this node.

Here is the pseudo code that implements the BST’s algorithm. In a nutshell “For all nodes **a** and **b**, if b belongs to the left subtree of **a**, then the key at **b** is less than the key at **a**. And if **b** belongs to the right subtree of **a**, then the key at **b** is greater than the key at **a**“. So in pseudo code terms it would be

1 2 3 4 5 6 7 |
/*Code showing in-order traversal only. Here we are only searching for a node*/ b ← a while b != nil do if key[b] = k then return b else if key[b] < k then b ← right[b] else b ← left[b] return (“CAN'T FIND”) |

## Balanced Search Trees

Balanced Search Trees have a good worst-case performance (2lgN), primarily they maintain a good balance in the Binary search tree. There are many types of balanced search trees, let’s discuss two of them. 2-3 search tree and red-black search trees are two of the instances of in balanced search tree.

### 2-3 search tree

In 2-3 search tree, we allow nodes to have more than one keys. It has 3 primary properties

- Either the whole 2-3 search tree is empty or
- A 2-node with one key, and two sublinks. On the left of the 2-node is a 2-3 search tree with keys less than the key of 2-node parent. And the 2-3 search tree on the right has the keys greater than the key of the 2-node parent
- A 3-node with two keys, and three sublinks. On the left of the 3-node is a 2-3 search tree with keys less than the keys at 3-node parent. On the middle of the 3-node is a 2-3 search tree with keys between the values of the keys contained in the parent 3- node. And on the right sublink, it has 2-3 search tree with keys greater than its own keys
- A 2-3 tree would be called a perfectly balanced tree if all the null links have the same distance from the root

### red-black search tree

A red-black search tree is similar to 2-3 search tree but it has better insertion mechanism. I has following properties

- Each node in the RB Search Tree is either red or black
- The root node is always black
- Every end-node or leaf node (NULL node) is always black
- If the node becomes red, then all of its children are black
- Each path from node to all of its descendant leaf nodes has the same number of black nodes in it

The color of the node is stored inside its data structure apart from its keys. The tree makes it balanced by ensuring the last point in the list above.

The generic data structure of red-black tree is given below:

1 2 3 4 5 |
struct rbtree { int key; enum {red, black} color; rbtree *left, *right, *parent; } |

I need to cover more searching algorithms in this post e.g. Hash Tables, but I will update soon.