# Logistic Regression

Logistic regression is a statistical method for modeling a dataset in which one or more independent variables determine an outcome. The outcome is a dichotomous variable, meaning it has only two possible values. It is used extensively for binary classification problems such as spam detection (spam or not spam), loan default (default or not), and disease diagnosis (positive or negative). Logistic regression predicts the probability that a given input belongs to a certain category.

## Sigmoid / Logistic Function:

The core of logistic regression is the sigmoid function, which maps any real-valued number into a value between 0 and 1, making it suitable for probability estimation. The sigmoid function is defined as $$\sigma(z) = \frac{1}{1 + e^{-z}}$$, where $$z$$ is the input to the function, often $$z = w^T x + b$$, with $$w$$ being the weights, $$x$$ the input features, and $$b$$ the bias.
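
As a minimal sketch of the definition above, the following plain-Python version computes $$z = w^T x + b$$ and passes it through the sigmoid. The helper names `sigmoid` and `linear_score`, and the weights, features, and bias, are hypothetical values chosen only for illustration. The two-branch form is one common way to avoid overflow in `math.exp` for large negative inputs; it is algebraically the same function.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, e^(-z) can overflow, so use the equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_score(w, x, b):
    """z = w^T x + b for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Hypothetical weights, features, and bias chosen only for illustration.
w, x, b = [0.5, -0.25], [2.0, 4.0], 0.1
z = linear_score(w, x, b)   # 0.5*2 - 0.25*4 + 0.1, about 0.1
p = sigmoid(z)              # a probability slightly above 0.5
```

Because `sigmoid(0.0)` is exactly 0.5, a positive score maps to a probability above 0.5 and a negative score to one below it.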

## Cost / Loss Function:

### MLE in Binary Classification

Maximum Likelihood Estimation (MLE) is a central concept in statistical modeling, including binary classification tasks. Binary classification involves predicting whether an instance belongs to one of two classes (e.g., spam or not spam, diseased or healthy) based on certain input features.

In binary classification, you often model the probability of the positive class ($$y=1$$) as a function of input features ($$X$$) using a logistic function, leading to logistic regression. The probability that a given instance belongs to the positive class can be expressed as:

$$P(Y=1 \mid X; \theta) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n)}}$$

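Here, $$\theta$$ represents the model parameters ($$\beta_0, \beta_1, ..., \beta_n$$), and $$X_1, ..., X_n$$ are the input features. In the notation of the sigmoid section, $$\beta_0$$ plays the role of the bias $$b$$ and $$\beta_1, ..., \beta_n$$ are the weights $$w$$.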
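
The formula can be evaluated directly for a single instance. This is only a sketch: the function name `positive_class_probability` and the parameter values are hypothetical, and the plain `1 / (1 + e^{-z})` form is used for readability, so it assumes moderate values of the score.

```python
import math

def positive_class_probability(beta0, betas, features):
    """P(Y=1 | X; theta) for one instance, with theta = (beta0, betas)."""
    z = beta0 + sum(b * x for b, x in zip(betas, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters and one hypothetical instance with two features.
beta0, betas = -1.0, [0.8, 0.2]
p_pos = positive_class_probability(beta0, betas, [1.0, 1.0])  # score is -1 + 0.8 + 0.2
p_neg = 1.0 - p_pos                                            # P(Y=0 | X; theta)
```

The probability of the negative class is simply the complement, which is why a single output is enough for binary classification.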

The likelihood function $$L(\theta)$$ in the context of binary classification is the product of the probabilities of each observed label, given the input features and the model parameters. For a dataset with $$m$$ instances, where $$y_i$$ is the label of the $$i$$-th instance, and $$p_i$$ is the predicted probability of the $$i$$-th instance being in the positive class, the likelihood is:

$$L(\theta) = \prod_{i=1}^{m} p_i^{y_i} (1-p_i)^{1-y_i}$$

This product is maximized when the model parameters ($$\theta$$) are such that the predicted probabilities ($$p_i$$) are close to 1 for actual positive instances and close to 0 for actual negative instances.
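
To make this concrete, the sketch below evaluates the likelihood for three hypothetical labels under two hypothetical sets of predicted probabilities. The function name `likelihood` and all values are assumptions for illustration. For 0/1 labels, each factor $$p_i^{y_i}(1-p_i)^{1-y_i}$$ reduces to $$p_i$$ when $$y_i = 1$$ and to $$1 - p_i$$ when $$y_i = 0$$.

```python
def likelihood(labels, probs):
    """L(theta) = prod_i p_i^y_i * (1 - p_i)^(1 - y_i) for 0/1 labels."""
    result = 1.0
    for y, p in zip(labels, probs):
        # The factor is p for a positive label and (1 - p) for a negative one.
        result *= p if y == 1 else (1.0 - p)
    return result

labels = [1, 0, 1]                 # hypothetical observed labels
confident = [0.9, 0.1, 0.8]        # probabilities that agree with the labels
unsure = [0.5, 0.5, 0.5]           # uninformative probabilities

good = likelihood(labels, confident)   # 0.9 * 0.9 * 0.8
flat = likelihood(labels, unsure)      # 0.5 ** 3
```

The probabilities that agree with the labels give the larger likelihood, which is the behavior the maximization exploits.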

### Log-Likelihood:

To simplify calculations and improve numerical stability, we work with the log-likelihood, which converts the product into a sum:

$$\ell(\theta) = \sum_{i=1}^{m} \left[ y_i \log(p_i) + (1-y_i) \log(1-p_i) \right]$$

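The goal is to find the parameters ($$\theta$$) that maximize this log-likelihood. Equivalently, we can minimize the negative log-likelihood. Averaged over the $$m$$ instances, $$-\frac{1}{m}\ell(\theta)$$ is the binary cross-entropy loss, which is what `nn.BCELoss` computes by default in the PyTorch example later in this section.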

### Threshold Decision:

The probability outcome from the sigmoid function is converted into a binary outcome via a threshold decision rule, usually 0.5 (if the sigmoid output is greater than or equal to 0.5, the outcome is classified as 1, otherwise as 0).
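
The decision rule is a one-line comparison. In this sketch, `classify` is a hypothetical helper and the probabilities are made-up sigmoid outputs; the second call shows how raising the threshold makes the classifier more conservative about predicting class 1.

```python
def classify(probability, threshold=0.5):
    """Return 1 if probability >= threshold, otherwise 0."""
    return 1 if probability >= threshold else 0

probs = [0.12, 0.50, 0.73, 0.49]   # hypothetical sigmoid outputs

default_labels = [classify(p) for p in probs]                 # [0, 1, 1, 0]
strict_labels = [classify(p, threshold=0.7) for p in probs]   # [0, 0, 1, 0]
```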

### Performance Metrics:

The following metrics and tools are commonly used to evaluate a binary classifier:

- Accuracy
- Precision
- Recall
- F1 score
- ROC curve
- Confusion matrix
- AUC (Area Under the Curve)
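
Several of these metrics follow directly from the four cells of the confusion matrix. The sketch below computes accuracy, precision, recall, and F1 score from hypothetical true and predicted labels. The function names are illustrative, and returning 0.0 when a denominator is zero is an assumed convention, not a universal one. The ROC curve and AUC are not covered here because they require predicted scores evaluated across many thresholds rather than a single set of hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)   # (3, 3, 1, 1)
metrics = binary_metrics(y_true, y_pred)    # every metric is 0.75 for this data
```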

### Logistic Regression in PyTorch:

Here’s a simple example of how to implement logistic regression in PyTorch. PyTorch is a deep learning framework whose automatic differentiation computes the gradients of the loss with respect to the weights and bias, so we do not have to derive them by hand.

#### Step 1: Import Libraries

    import torch
    import torch.nn as nn
    import torch.optim as optim


#### Step 2: Create Dataset

For simplicity, let’s assume a binary classification task with some synthetic data.

    # Features [sample size, number of features]
    X = torch.tensor([[1, 2], [4, 5], [7, 8], [9, 10]], dtype=torch.float32)
    # Labels [sample size, 1]
    y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)


#### Step 3: Define the Model

    class LogisticRegressionModel(nn.Module):
        def __init__(self, input_size, num_classes):
            super().__init__()
            # Linear layer computes z = w^T x + b
            self.linear = nn.Linear(input_size, num_classes)

        def forward(self, x):
            # Sigmoid turns the linear score into a probability in (0, 1)
            out = torch.sigmoid(self.linear(x))
            return out


#### Step 4: Instantiate Model, Loss, and Optimizer

    input_size = 2
    num_classes = 1
    model = LogisticRegressionModel(input_size, num_classes)

    criterion = nn.BCELoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)


#### Step 5: Train the Model

    num_epochs = 100
    for epoch in range(num_epochs):
        # Forward pass
        outputs = model(X)
        loss = criterion(outputs, y)

        # Backward and optimize