Algorithms and Data Structures

Algorithms and data structures ported to TypeScript from Introduction to Algorithms by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, as well as other sources classic computer science papers.

Binary Search Tree

A Binary Search Tree (BST) is a data structure in which each node has at most two children and maintains the property that for every node, the values in its left subtree are less than the node’s value, and the values in its right subtree are greater.

Breadth-First Search

Breadth-First Search is an algorithm for searching graph data structures. It starts at a specified source node and explores all its immediate neighbors before moving on to the next level of neighbors.

Bubblesort

Bubble Sort is a simple comparison-based sorting algorithm that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. This process is repeated until the list is sorted. Although easy to implement, its worst-case and average-case time complexity of \(O(n^2)\) makes it inefficient on large datasets.

Chained Hash Table

A Chained Hash Table is a data structure that resolves collisions using linked lists. Each slot in the hash table array points to a chain (linked list) that stores all elements hashing to the same index. When inserting, searching, or deleting an element, the algorithm hashes the key to determine the appropriate slot and traverses the corresponding chain to perform the operation.

Cosine Similarity

Cosine similarity is a measure of similarity between two non-zero vectors in a multi-dimensional space, defined as the cosine of the angle between them. It is calculated as the dot product of the vectors divided by the product of their magnitudes. This metric is often used in text analysis, clustering, and information retrieval to determine the similarity of documents or data points, independent of their magnitude.

Counting Sort

Counting Sort is a non-comparison-based sorting algorithm that sorts integers by counting the occurrences of each distinct value in a range and using this information to place elements in their correct positions. It works in 𝑂(𝑛+𝑘) time, where 𝑛 is the number of elements and 𝑘 is the range of the input values. Counting Sort is efficient for small ranges of integers but unsuitable for sorting non-integer or large-range data due to its space complexity of 𝑂(𝑘). It is a stable sorting algorithm when implemented correctly.

Finite Automata

A finite automata algorithm is a step-by-step method used to check if a sequence of symbols (like letters or numbers) matches a specific pattern. It works by moving through a series of states based on the input, starting from an initial state and possibly ending in a state that signals a match.

Heapsort

A sorting technique that uses a binary heap, first building a max-heap and then repeatedly extracting the maximum element, ensuring efficient sorting.

Huffman Encoding

The Huffman Encoding data structure uses a Min Heap to sort characters by frequency. It then assigns codes based on frequency; the most frequent characters will have the shortest codes.

Insertion Sort

A sorting technique that builds the sorted array one element at a time, inserting each new element into its proper position among already-sorted elements.

Intersection

The calculateIntersectionWeight function computes the intersection weight between two arrays of strings by determining the ratio of the size of their intersection to the length of the smaller array. It relies on the getIntersection function to find the common elements between the two arrays, ensuring an efficient and straightforward implementation. The getIntersection function uses sets to identify and return a list of unique shared elements, removing duplicates and simplifying comparison. Together, these functions handle edge cases, such as empty arrays, by returning 0 when appropriate to avoid division by zero, making them robust and easy to use for set-based comparisons.

Iterative Depth-First Search

An iterative depth-first search algorithm using a stack, exploring as far along each branch as possible before backtracking, useful for pathfinding and cycle detection.

Linear Search

Linear Search is a straightforward algorithm for finding a specific element in a list. It sequentially checks each element until the target is found or the list ends, making it simple and effective for unsorted data but inefficient for large datasets.

Linked List

A data structure where each element contains a value and a reference to the next node, supporting efficient insertion and deletion operations.

Longest Common Prefix

Longest Common Subsequence

Merge Sort

A divide-and-conquer sorting technique that splits the array into halves, recursively sorts each half, and then merges the sorted halves.

Minimum Heap

The Minimum Heap is a specialized binary tree that is based on the Heap data structure. It is also known as a Priority Queue.

Myers Diff

Computes the differences between two sequences efficiently, commonly used in version control systems to show changes between file revisions.

N-Gram

The n-gram data structure is a simple container. It is used as a probabilistic model typically used in Natural Language Processing (NLP) to predict sequences of elements such as words or characters. It represents a sequence of \(n\) items from a given dataset, often applied for tasks like language modeling, auto-completion, and text prediction.

Naive String Matcher

Negative Array

Queue

A linear collection following the First-In-First-Out (FIFO) principle, supporting operations like enqueue and dequeue, useful in task scheduling.

Quicksort

Quick sort is a divide-and-conquer sorting algorithm that selects a “pivot” element from an array, partitions the remaining elements into two sub-arrays (elements less than the pivot and elements greater than the pivot), and then recursively sorts the sub-arrays.

Rabin-Karp Matcher

Recursive Depth-First Search

A recursive depth-first search algorithm that explores branches deeply before backtracking, managing traversal state with recursion.

Reverse

The reverse algorithm takes an array of values and returns the values in reverse order. It works by iterating over the array and swapping the first and last values, then the second and second-to-last values, and so on until the middle of the array is reached.

Selection Sort

Stack

Suffix Array

The suffix array algorithm constructs a sorted array of all suffixes of a given string, enabling efficient substring searches, pattern matching, and text analysis. It uses sorting and optional auxiliary structures like the LCP array for additional optimizations.

Ternary Search Trie

A Ternary Search Trie (TST) is a data structure used to store and retrieve strings efficiently, combining features of binary search trees and tries. It organizes data in a tree where each node contains three children (low, equal, high) corresponding to characters less than, equal to, or greater than the node’s character, enabling balanced search operations while conserving memory compared to standard tries.

Textrank

TF-IDF

Trie Symbol Table

The Trie Symbol Table is a data structure designed to efficiently store and retrieve key-value pairs where the keys are strings. It organizes data into a tree-like structure, where each node represents a character, and paths from the root to nodes represent strings.

Wagner-Fischer Edit Distance

The Wagner-Fischer algorithm is a dynamic programming method for calculating the Levenshtein distance (edit distance) between two strings. It determines the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into another by filling a matrix where each cell represents the edit distance between prefixes of the two strings. The algorithm builds the solution incrementally, ensuring optimal substructure and overlapping subproblems for efficient computation.