- Activation Function
- Confusion Matrix
- Convolutional Neural Networks
- Forward Propagation
- Generative Adversarial Network
- Gradient Descent
- Linear Regression
- Logistic Regression
- Machine Learning Algorithms
- Multilayer Perceptron
- Naive Bayes
- Neural Networking and Deep Learning
- RuleFit
- Stack Ensemble
- Word2Vec
- XGBoost

- Attention Mechanism
- BERT
- Binary Classification
- Classify Token ([CLS])
- Conversational Response Generation
- GLUE (General Language Understanding Evaluation)
- GPT (Generative Pre-Trained Transformers)
- Language Modeling
- Layer Normalization
- Mask Token ([MASK])
- Probability Distribution
- Probing Classifiers
- SQuAD (Stanford Question Answering Dataset)
- Self-attention
- Separate token ([SEP])
- Sequence-to-sequence Language Generation
- Sequential Text Spans
- Text Classification
- Text Generation
- Transformer Architecture
- WordPiece

- AUC-ROC
- Analytical Review
- Autoencoders
- Bias-Variance Tradeoff
- Decision Optimization
- Explanatory Variables
- Exponential Smoothing
- Level of Granularity
- Long Short-Term Memory
- Loss Function
- Model Management
- Precision and Recall
- Predictive Learning
- ROC Curve
- Recommendation system
- Stochastic Gradient Descent
- Target Leakage
- Target Variable
- Underwriting

Machine learning (ML) models process input data to detect patterns and generate predictions. These models are trained on datasets by algorithms: the dataset fed into a model is the starting point, and the algorithm processes, weighs, and segments that data. Training produces a mathematical representation of the relationships between the objects in an ML system.

There are three types of ML systems: supervised machine learning, unsupervised machine learning, and reinforcement learning. Each handles the relationship between data and outcomes differently, and each represents a distinct type of machine learning.

ML models are designed to teach machines to operate and optimize themselves, while learning and improving from past decisions. AI models are designed to replicate human intelligence through algorithms. While all ML models are AI models, not every AI model is an ML model.

Supervised machine learning provides algorithms with input data and optimizes them to meet specific end results. Supervised learning provides context for discovering a solution and can then produce detailed models. The ways of deriving that context can be grouped into two problem types solved through supervised learning: classification and regression.

Classification is the task of attaching classes to specific clusters within a dataset. A common use of classification is auto-labeling emails as spam, promotions, or junk. Classification problems are often solved with several frequently used supervised algorithms and models. Examples of commonly used classification models in supervised learning are:

**Decision tree**- Decision trees are graphical representations of the available alternatives to solve a given problem and determine the most effective course of action.

**Support vector machine (SVM)**- SVMs are supervised learning models with associated learning algorithms that analyze and categorize data. SVM models assign vectors into different categories.

**K-nearest neighbors (KNN)**- KNN is an algorithm that classifies or predicts an unknown data point based on its nearest vectors, or neighbors; the symbol “K” denotes the number of neighbors considered. This algorithm is effective in solving both classification and regression problem statements.
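The voting logic behind KNN can be sketched in a few lines of plain Python; the dataset, labels, and choice of Euclidean distance below are illustrative assumptions, not a specific implementation from any library.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Compute the Euclidean distance from the query to every training point.
    distances = [
        (math.dist(p, query), label)
        for p, label in zip(train_points, train_labels)
    ]
    # Take the k closest neighbors and vote on the label.
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy dataset: two well-separated groups labeled "a" and "b".
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.5, 1.5)))  # → a
print(knn_predict(points, labels, (8.5, 8.5)))  # → b
```

Increasing `k` smooths the decision boundary at the cost of blurring small clusters, which is why K is treated as a tunable parameter.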

Regression analysis involves finding a specific numerical value from various data points. Predicting upcoming weather, for example, requires calculations from specific properties such as air temperature, humidity, and atmospheric pressure. Each of these properties is quantified as a data point and fed into algorithms that produce predictive models of future weather patterns. Several algorithms can be used to produce these models. Examples of common algorithms include, but are not limited to:

**Linear regression**- Linear regression is a process that identifies the relationship between a dependent variable and one (simple linear) or more (multiple linear) independent variables, using that relationship to optimize predictions.

**Simple linear regression**- Simple linear regression uses a single input and output variable to create predictions based on trained datasets. These predictions are derived through a straight line between the input and output.

**Multiple linear regression**- While similar to simple linear regression, multiple linear regression has multiple input variables.

**Multivariate linear regression**- Multivariate linear regression differentiates itself from simple linear regression through the use of multiple output variables.
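The simple (single-input) case above has a closed-form least-squares solution, sketched here in plain Python; the data points are made up to follow a known line so the fit can be checked.

```python
def fit_simple_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Data generated from y = 2x + 1, so the fit should recover those values.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
slope, intercept = fit_simple_linear(xs, ys)
print(slope, intercept)  # → 2.0 1.0
```

Multiple and multivariate regression generalize the same idea to matrices of inputs and outputs, where the closed form becomes the normal equations.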

**Unsupervised machine learning** algorithms identify patterns in a dataset without a known outcome. Because no end result is specified, unsupervised machine learning often surfaces previously unknown information. The approach is well suited to unlabeled and unstructured data, since unsupervised learning can find patterns within it. Common unsupervised learning techniques include clustering, anomaly detection, and dimensionality reduction.

**Clustering** is an unsupervised machine learning task that organizes similar data points into defined clusters. Analyzing those clusters gives insight into the structure of the data.

Also known as outlier detection, anomaly detection identifies irregularities among processed data. This is valuable in detecting cloud security vulnerabilities across various industries.
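One simple way to flag irregularities is a z-score rule: mark any point that lies more than a few standard deviations from the mean. This is only one of many anomaly-detection techniques; the threshold and readings below are illustrative.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag values whose z-score (distance from the mean, measured in
    standard deviations) exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Mostly regular readings with one clear irregularity (toy data).
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
print(zscore_outliers(readings, threshold=2.0))  # → [100]
```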

Dimensionality refers to the number of unique features, or attributes, in a given dataset. Dimensionality reduction, a form of data simplification, is the unsupervised learning process of reducing those features to a smaller, more informative set. This process is most effective when producing straightforward ML models.
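As a rough sketch, one very simple form of dimensionality reduction is dropping near-constant features. Techniques such as PCA instead combine features into new axes, but the data-simplification idea is the same; the variance threshold and data below are illustrative assumptions.

```python
import statistics

def drop_low_variance(rows, min_variance=0.1):
    """Keep only the feature columns whose variance exceeds min_variance —
    a naive feature-selection form of dimensionality reduction."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    keep = [i for i, col in enumerate(columns)
            if statistics.pvariance(col) > min_variance]
    return [[row[i] for i in keep] for row in rows]

# The third column is constant, so it carries no information.
data = [[1.0, 5.0, 0.5],
        [2.0, 3.0, 0.5],
        [3.0, 8.0, 0.5]]
print(drop_low_variance(data))  # → [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0]]
```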

Reinforcement learning (RL) trains a model through trial and error as it interacts continuously with its environment. RL does not rely on training data; instead, it learns to identify and maximize positive results over time. RL differs from both supervised and unsupervised learning in how it optimizes and what it is used for.

Though supervised and reinforcement learning both incorporate data categorization, RL leverages a system of actions and consequences to optimize results. Conversely, while a primary function of unsupervised learning is discovering patterns among unlabeled data points, the primary function of RL is to create models that maximize the total cumulative reward over time.
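The action-and-consequence loop can be sketched with a classic toy problem, the multi-armed bandit. The epsilon-greedy strategy, reward values, and noise level below are illustrative assumptions; note there is no training dataset, only trial and error.

```python
import random

def bandit(true_rewards, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy multi-armed bandit: learn which action pays best
    purely from trial and error."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_rewards)  # running average reward per action
    counts = [0] * len(true_rewards)
    for _ in range(steps):
        # Explore a random action with probability epsilon, else exploit
        # the action that currently looks best.
        if rng.random() < epsilon:
            action = rng.randrange(len(true_rewards))
        else:
            action = max(range(len(true_rewards)), key=lambda a: estimates[a])
        # Noisy reward: the consequence of the chosen action.
        reward = true_rewards[action] + rng.gauss(0, 0.1)
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Action 2 has the highest expected reward; the agent should discover this.
estimates = bandit([0.2, 0.5, 0.9])
print(max(range(3), key=lambda a: estimates[a]))  # → 2
```

The epsilon parameter balances exploration (trying actions that might pay better) against exploitation (cashing in on what the agent already believes), which is the core tradeoff RL systems manage.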

ML models are trained through data consumption: model training is the process of feeding an algorithm sufficient data to learn from. Training runs a primary dataset through a given algorithm and compares the processed outcomes against the known sample results; the relationship between these two outputs is then used to iterate on and modify the ML model.
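That iterate-and-modify loop can be sketched with gradient descent on a toy linear model; the learning rate, epoch count, and data are illustrative choices, not values prescribed by any particular framework.

```python
def train_linear(xs, ys, lr=0.01, epochs=5000):
    """Fit y ≈ w*x + b by repeatedly comparing predictions against the
    known sample results and nudging the parameters (gradient descent)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Run the data through the current model.
        preds = [w * x + b for x in xs]
        # Compare predictions with known outcomes: gradients of mean squared error.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Modify the model based on that comparison.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Data from y = 3x + 2; training should approach w=3, b=2.
xs = [0, 1, 2, 3, 4]
ys = [2, 5, 8, 11, 14]
w, b = train_linear(xs, ys)
print(round(w, 2), round(b, 2))  # → 3.0 2.0
```

Each pass through the loop is one iteration of the compare-and-modify cycle described above; real training pipelines add batching, validation splits, and early stopping around the same core idea.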