Linear Models
Linear models form the foundation of supervised learning by assuming linear relationships between input features and target variables. In financial services, they power credit scoring and risk assessment. In retail, they enable demand forecasting and price optimization.
Mathematical Foundation
Linear Regression
Models the relationship between features and continuous targets:

$\hat{y} = w^T x + b = \sum_{j=1}^{d} w_j x_j + b$

Symbol Definitions:
- $\hat{y}$ = Predicted output (continuous value)
- $w$ = Weight vector (learned parameters)
- $x$ = Input feature vector
- $b$ = Bias term (intercept)
- $w_j$ = Weight for feature $j$
- $x_j$ = Value of feature $j$
- $d$ = Number of features
Matrix Form:

$\hat{y} = Xw + b\mathbf{1}$

Symbol Definitions:
- $y$ = Target vector (n×1)
- $X$ = Feature matrix (n×d)
- $\mathbf{1}$ = Vector of ones (n×1)
- $n$ = Number of training samples
Loss Function (Mean Squared Error):

$L(w) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{1}{n} \|y - Xw\|^2$

Closed-Form Solution (Normal Equation):

$w^* = (X^T X)^{-1} X^T y$

Symbol Definitions:
- $w^*$ = Optimal weight vector
- $(X^T X)^{-1}$ = Inverse of feature covariance matrix
- $X^T y$ = Feature-target correlation vector
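The normal equation can be verified directly with NumPy. A minimal sketch on synthetic, noise-free data (the feature values and true weights here are illustrative, with the bias absorbed as an extra column of ones):

```python
import numpy as np

# Synthetic data generated from known weights (illustrative values)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w, true_b = np.array([3.0, -2.0]), 1.0
y = X @ true_w + true_b

# Append a column of ones so the bias is estimated as an extra weight
X1 = np.column_stack([X, np.ones(len(X))])

# Normal equation: w* = (X^T X)^(-1) X^T y
# (np.linalg.solve is preferred over forming the explicit inverse)
w_star = np.linalg.solve(X1.T @ X1, X1.T @ y)
print(w_star)  # ≈ [3.0, -2.0, 1.0]
```

Because the toy targets contain no noise, the recovered weights match the generating weights to floating-point precision.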
Logistic Regression
Binary Classification
Models probability using the logistic function:

$P(y = 1 \mid x) = \sigma(w^T x + b) = \frac{1}{1 + e^{-(w^T x + b)}}$

Symbol Definitions:
- $P(y = 1 \mid x)$ = Probability of positive class given features
- $\sigma(\cdot)$ = Sigmoid/logistic function
- $e$ = Euler's number (≈ 2.718)

Log-Odds (Logit):

$\log \frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)} = w^T x + b$

Symbol Definitions:
- $\frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)}$ = Odds ratio (probability of success vs. failure)
- $\log$ = Natural logarithm
Loss Function (Cross-Entropy):

$L(w) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i) \right]$

where $\hat{p}_i = \sigma(w^T x_i + b)$ is the predicted probability for sample $i$.
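Cross-entropy has no closed-form minimizer, so logistic regression is fit iteratively. A minimal gradient-descent sketch on toy data (the learning rate, iteration count, and data-generating rule are illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(y, p, eps=1e-12):
    # Average negative log-likelihood; eps guards against log(0)
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Toy linearly separable problem: label depends on x0 - x1
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

w, b, lr = np.zeros(2), 0.0, 0.5
for _ in range(500):
    p = sigmoid(X @ w + b)
    grad_w = X.T @ (p - y) / len(y)  # gradient of cross-entropy w.r.t. w
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

acc = np.mean((sigmoid(X @ w + b) >= 0.5) == y)
print(f"training accuracy: {acc:.2f}")
```

The gradient `X.T @ (p - y) / n` follows from differentiating the cross-entropy loss through the sigmoid; the same expression underlies more sophisticated solvers (Newton's method, L-BFGS).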
Financial Services Example: Credit Scoring Model
Business Context: A bank develops a linear model to assess credit default risk for loan applications, balancing accuracy with regulatory interpretability requirements.
Features:
- $x_1$ = Annual income (normalized)
- $x_2$ = Credit history length (years)
- $x_3$ = Debt-to-income ratio
- $x_4$ = Number of existing credit accounts
- $x_5$ = Previous default indicator (0/1)

Logistic Regression Model:

$P(\text{default} \mid x) = \sigma(w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + b)$
Model Interpretation:
- Income coefficient (−0.8): Higher income reduces default probability
- Credit history (−0.5): Longer history indicates stability
- Debt ratio (+1.2): Higher debt increases risk significantly
- Previous default (+2.4): Strongest predictor of future default
Decision Rule:

Flag as high risk if $P(\text{default} \mid x) \geq 0.5$; otherwise approve. The 0.5 threshold (a common default) can be tuned to trade precision against recall.
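Scoring an applicant is a single dot product through the sigmoid. A sketch using coefficients signed so that risk-increasing factors are positive; the account-count weight (0.3), the intercept (−2.0), and the applicant's feature values are illustrative assumptions, not fitted values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Coefficient signs match the interpretation of each factor's effect on
# default risk; w4 (0.3) and the intercept (-2.0) are assumed for illustration.
w = np.array([-0.8, -0.5, 1.2, 0.3, 2.4])
b = -2.0

# Hypothetical applicant:
# [income (norm.), history length, debt ratio, accounts, prior default]
applicant = np.array([1.2, 0.8, 0.35, 3.0, 0.0])

p_default = sigmoid(w @ applicant + b)
decision = "reject" if p_default >= 0.5 else "approve"
print(f"P(default) = {p_default:.3f} -> {decision}")
```

Because the log-odds are linear in the features, each coefficient can be reported to a regulator as the change in log-odds per unit change in its feature.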
Business Performance:
- Accuracy: 87.3% correct predictions
- Precision: 0.92 (approved loans that don't default)
- Recall: 0.81 (actual defaults correctly identified)
- ROC AUC: 0.89 (excellent discrimination)
- Business Impact: $12M reduction in annual defaults
Regulatory Compliance: Model coefficients provide clear explanations for loan decisions, meeting Fair Lending requirements.
Retail Example: Dynamic Pricing Optimization
Business Context: An e-commerce retailer uses linear regression to optimize product prices based on demand factors, competitor pricing, and inventory levels.
Demand Prediction Model:

$\hat{D} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4$

Feature Engineering:
- $x_1 = \log(p)$ = Log-transformed price (captures elasticity)
- $x_2$ = Seasonal index (1 = normal, >1 = high season)
- $x_3$ = Average competitor price difference
- $x_4$ = Stock level normalized by target inventory
Price Elasticity:

$\epsilon = \frac{\partial D}{\partial p} \cdot \frac{p}{D}$

Optimal Pricing: Maximize revenue subject to inventory constraints:

$\max_{p} \; R(p) = p \cdot D(p) \quad \text{s.t.} \quad D(p) \leq \text{inventory}$

First-Order Condition:

$\frac{dR}{dp} = D(p) + p \frac{dD}{dp} = 0$

Optimal Price:

Dividing the first-order condition by $D(p)$ gives $1 + \epsilon = 0$, so the revenue-maximizing price $p^*$ satisfies $\epsilon(p^*) = -1$ (unit elasticity).
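The first-order condition can be checked numerically. A sketch assuming an illustrative log-linear demand curve $D(p) = a + b \log p$ (the coefficients $a$, $b$ are made-up values, and the inventory constraint is omitted for clarity):

```python
import numpy as np

# Illustrative log-linear demand D(p) = a + b*log(p); a, b are assumed values
a, b = 100.0, -20.0
demand = lambda p: a + b * np.log(p)
revenue = lambda p: p * demand(p)

# FOC: D(p) + p*D'(p) = a + b*log(p) + b = 0  =>  log(p*) = -(a + b)/b
p_closed = np.exp(-(a + b) / b)

# Cross-check the closed form with a dense grid search over candidate prices
grid = np.linspace(1.0, 200.0, 100_000)
p_grid = grid[np.argmax(revenue(grid))]

print(p_closed, p_grid)  # both ≈ 54.6
```

The grid-search maximizer agrees with the closed-form solution, confirming the derivative calculation for this demand specification.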
Multi-Product Portfolio:

$\max_{p} \sum_{i} (p_i - c_i) \, D_i(p)$

Symbol Definitions:
- $p$ = Price vector for all products
- $c_i$ = Marginal cost for product $i$
- $D_i(p)$ = Demand function for product $i$ given all prices
Business Results:
- Revenue Increase: 18.3% vs. fixed pricing
- Margin Improvement: 4.2 percentage points
- Inventory Turnover: 25% faster clearance of slow-moving items
- Competitive Position: Maintained market share while improving profitability
Regularization Techniques
Ridge Regression (L2 Regularization)
Prevents overfitting by penalizing large weights:

$L(w) = \frac{1}{n} \|y - Xw\|^2 + \lambda \|w\|_2^2$

Closed-Form Solution:

$w^* = (X^T X + \lambda I)^{-1} X^T y$

Symbol Definitions:
- $\lambda$ = Regularization parameter (controls penalty strength)
- $I$ = Identity matrix
- $\|w\|_2^2$ = L2 norm squared (sum of squared weights)
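The ridge closed form differs from the normal equation only by the $\lambda I$ term on the diagonal. A sketch comparing ridge and plain least squares on synthetic data (dimensions, noise level, and $\lambda$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 10))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=50)

lam = 1.0
d = X.shape[1]
# Ridge closed form: w* = (X^T X + lam*I)^(-1) X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
# Plain least squares for comparison (lam = 0)
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# The penalty shrinks the weight norm relative to unregularized least squares
print(np.linalg.norm(w_ridge), np.linalg.norm(w_ols))
```

Adding $\lambda I$ also guarantees the matrix being solved is invertible, which is why ridge remains stable when features are highly correlated or when $d$ approaches $n$.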
Lasso Regression (L1 Regularization)
Performs feature selection by driving some weights to zero:

$L(w) = \frac{1}{n} \|y - Xw\|^2 + \lambda \|w\|_1$

Symbol Definitions:
- $\|w\|_1 = \sum_{j} |w_j|$ = L1 norm (sum of absolute weights)
- $|w_j|$ = Absolute value of weight $j$
Solution Method (Coordinate Descent):

$w_j \leftarrow \text{sign}(\tilde{w}_j) \, \max(|\tilde{w}_j| - \lambda, \, 0)$

Symbol Definitions:
- $\tilde{w}_j$ = Simple least squares coefficient for feature $j$ (holding all other weights fixed)
- $\text{sign}(\cdot)$ = Sign function (-1, 0, or +1)
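The soft-thresholding update above can be implemented in a few lines. A minimal coordinate-descent sketch (data shape, sparsity pattern, and $\lambda$ are illustrative; a production solver would also check convergence rather than run a fixed iteration count):

```python
import numpy as np

def soft_threshold(rho, lam):
    # sign(rho) * max(|rho| - lam, 0)
    return np.sign(rho) * np.maximum(np.abs(rho) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via coordinate descent on (1/2n)||y - Xw||^2 + lam*||w||_1."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual with feature j's contribution added back
            r_j = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r_j / n      # least squares coefficient numerator
            z = X[:, j] @ X[:, j] / n    # column scale
            w[j] = soft_threshold(rho, lam) / z
    return w

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
# Only the first two features matter; lasso should zero out the rest
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

w = lasso_cd(X, y, lam=0.2)
print(np.round(w, 2))
```

Note the selection effect: the irrelevant coefficients are driven to exactly zero, while the active ones are shrunk toward zero by roughly $\lambda$.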
Elastic Net
Combines L1 and L2 regularization:

$L(w) = \frac{1}{n} \|y - Xw\|^2 + \lambda_1 \|w\|_1 + \lambda_2 \|w\|_2^2$

Symbol Definitions:
- $\lambda_1$ = L1 regularization parameter
- $\lambda_2$ = L2 regularization parameter
Financial Services Example: Portfolio Risk Modeling
Business Context: An investment firm uses regularized linear regression to model portfolio risk factors and optimize asset allocation.
Factor Model:

$R_p = \alpha + \sum_{k=1}^{K} \beta_k F_k + \epsilon$

Symbol Definitions:
- $R_p$ = Portfolio return
- $\alpha$ = Alpha (excess return)
- $\beta_k$ = Factor loading for factor $k$
- $F_k$ = Factor return (market, size, value, etc.)
- $\epsilon$ = Idiosyncratic error
Risk Factors:
- $F_1$ = Market factor (S&P 500 return)
- $F_2$ = Size factor (small-cap premium)
- $F_3$ = Value factor (value-growth spread)
- $F_4$ = Momentum factor (price momentum)
- $F_5$ = Quality factor (earnings quality)
Ridge Regression for Stability:

$\hat{\beta} = (F^T F + \lambda I)^{-1} F^T R_p$

where $F$ is the matrix of historical factor returns; the penalty stabilizes the estimated loadings when factors are correlated.
Portfolio Optimization:

$\min_{w} \; w^T \Sigma w \quad \text{s.t.} \quad w^T \mu = \mu_{\text{target}}, \;\; \textstyle\sum_i w_i = 1$

Symbol Definitions:
- $w$ = Asset allocation weights
- $\Sigma$ = Covariance matrix (from factor model)
- $\mu$ = Expected return vector
- $\mu_{\text{target}}$ = Target return
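Dropping the target-return constraint yields the global minimum-variance portfolio, which has the closed form $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^T \Sigma^{-1} \mathbf{1})$. A sketch of that special case with an assumed three-asset covariance matrix (the numbers are illustrative, not market estimates):

```python
import numpy as np

# Toy covariance matrix for three assets (assumed values, for illustration)
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
ones = np.ones(3)

# Global minimum-variance portfolio: w = Sigma^(-1) 1 / (1^T Sigma^(-1) 1)
inv_ones = np.linalg.solve(Sigma, ones)
w = inv_ones / (ones @ inv_ones)

variance = w @ Sigma @ w
print(np.round(w, 3), variance)  # weights sum to 1
```

Adding the target-return constraint back turns this into a two-constraint quadratic program, typically solved with a QP solver or the corresponding two-multiplier closed form.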
Business Results:
- Risk Prediction Accuracy: R² = 0.84 for portfolio variance
- Factor Exposure Control: Within ±0.1 of target allocations
- Sharpe Ratio: 1.23 (vs. 0.87 benchmark)
- Maximum Drawdown: 12.4% (vs. 18.7% benchmark)
Model Selection and Validation
Cross-Validation for Regularization
K-Fold CV Error:

$\lambda^* = \arg\min_{\lambda} \frac{1}{K} \sum_{k=1}^{K} \frac{1}{|V_k|} \sum_{i \in V_k} \left( y_i - \hat{f}_{\lambda}^{(-k)}(x_i) \right)^2$

Symbol Definitions:
- $\lambda^*$ = Optimal regularization parameter
- $\hat{f}_{\lambda}^{(-k)}$ = Model trained without fold $k$
- $V_k$ = Validation fold $k$
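Selecting $\lambda$ by cross-validation amounts to refitting on each training split and averaging the held-out errors. A sketch using ridge regression over a small $\lambda$ grid (the grid, data shape, and fold count are illustrative choices):

```python
import numpy as np

def ridge_fit(X, y, lam):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def cv_error(X, y, lam, k=5):
    """Average held-out MSE over k folds."""
    n = len(y)
    folds = np.array_split(np.arange(n), k)
    errs = []
    for val in folds:
        train = np.setdiff1d(np.arange(n), val)
        w = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((y[val] - X[val] @ w) ** 2))
    return np.mean(errs)

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 20))
y = X[:, 0] + rng.normal(scale=0.5, size=100)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cv_error(X, y, lam) for lam in lambdas}
best = min(scores, key=scores.get)
print(best, scores[best])
```

In practice the grid is usually log-spaced and the final model is refit on all the data with the selected $\lambda^*$.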
Information Criteria
Akaike Information Criterion (AIC):

$\text{AIC} = 2k - 2\ln(\hat{L})$

Bayesian Information Criterion (BIC):

$\text{BIC} = k \ln(n) - 2\ln(\hat{L})$

Symbol Definitions:
- $k$ = Number of parameters
- $\hat{L}$ = Likelihood of the model
- $n$ = Sample size
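For least squares with Gaussian errors, $-2\ln(\hat{L})$ reduces (up to an additive constant) to $n \ln(\text{RSS}/n)$, which makes both criteria easy to compute. A sketch comparing a correctly specified small model against a larger one (data and model sizes are illustrative):

```python
import numpy as np

def aic_bic_gaussian(y, y_hat, k):
    """AIC/BIC for least squares with Gaussian errors (variance profiled out)."""
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    minus2_loglik = n * np.log(rss / n)  # up to an additive constant
    return 2 * k + minus2_loglik, k * np.log(n) + minus2_loglik

rng = np.random.default_rng(5)
X = rng.normal(size=(80, 6))
# Only feature 0 actually drives the target
y = 2 * X[:, 0] + rng.normal(scale=0.3, size=80)

results = {}
for name, cols in [("small", [0]), ("full", list(range(6)))]:
    Xc = X[:, cols]
    w = np.linalg.solve(Xc.T @ Xc, Xc.T @ y)
    results[name] = aic_bic_gaussian(y, Xc @ w, k=len(cols))

print(results)  # BIC's harsher k*ln(n) penalty should favor the small model
```

Because BIC's penalty grows with $\ln(n)$ while AIC's stays at 2 per parameter, BIC tends to select sparser models on larger samples.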
Feature Engineering for Linear Models
Polynomial Features
Create non-linear relationships from a single feature:

$\phi(x) = [1, \, x, \, x^2, \, \ldots, \, x^p]$
Interaction Terms
Capture feature interactions by multiplying feature pairs:

$x_{ij} = x_i \cdot x_j$
Standardization
Normalize features so the regularization penalty treats them comparably:

$x' = \frac{x - \mu}{\sigma}$

Symbol Definitions:
- $\mu$ = Feature mean
- $\sigma$ = Feature standard deviation
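Both transformations above are a few lines of NumPy. A minimal sketch (the sample values are illustrative; libraries such as scikit-learn provide equivalent, more robust transformers):

```python
import numpy as np

def standardize(X):
    # (x - mean) / std, computed per feature column
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def polynomial_features(x, degree):
    # [1, x, x^2, ..., x^degree] for a single feature vector x
    return np.column_stack([x ** p for p in range(degree + 1)])

x = np.array([1.0, 2.0, 3.0, 4.0])
Phi = polynomial_features(x, degree=2)
print(Phi)

# Standardized features have mean ~0 and std ~1 per column
Xs, mu, sigma = standardize(np.column_stack([x, x ** 2]))
print(Xs.mean(axis=0), Xs.std(axis=0))
```

When standardizing, the training-set `mu` and `sigma` must be saved and reused to transform validation and production inputs, otherwise the model sees inconsistently scaled features.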
Linear models provide interpretable, efficient solutions for many supervised learning problems. They offer clear insight into feature importance and model behavior while remaining computationally cheap and, in regulated domains such as financial services and retail, straightforward to audit and explain.