Feature Engineering
Selecting, transforming, and constructing input variables from raw data to improve the predictive accuracy of machine learning models in insurance.
FAQs
- Can we use credit-based variables as features in all states?
- No. Several states restrict or prohibit the use of credit information in personal lines rating. Even where it is permitted, you must document the actuarial justification and ensure the variable does not act as a proxy for a protected class in violation of state unfair discrimination statutes.
- How do we handle missing values during feature engineering?
- Common approaches include mean or median imputation, creating an explicit missing-indicator binary variable, or using model architectures that handle missing values natively. The chosen approach must be documented and applied consistently between training and production scoring.
- How does feature engineering interact with rate filings?
- In prior-approval states, you may need to disclose the features and transformations used in a filed rating algorithm. Actuarial judgment must support each variable's relationship to loss exposure, so the engineering rationale must be preserved in documentation available to the filing actuary.
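The missing-value answer above can be illustrated with a minimal sketch. It assumes median imputation combined with a missing-indicator flag, and it fits the fill value once on training data so the same value is reused at scoring time. The class name `MedianImputer`, the vehicle-age feature, and the sample values are all hypothetical and chosen only for illustration; a production pipeline would more likely rely on an established library.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class MedianImputer:
    """Hypothetical helper: median imputation plus a missing-indicator flag."""

    fill_value: float

    @classmethod
    def fit(cls, training_values: Sequence[Optional[float]]) -> "MedianImputer":
        # Learn the fill value from observed training values only.
        observed = [v for v in training_values if v is not None]
        if not observed:
            raise ValueError("cannot fit on a column with no observed values")
        return cls(fill_value=median(observed))

    def transform(
        self, values: Sequence[Optional[float]]
    ) -> List[Tuple[float, int]]:
        # Return (imputed_value, was_missing) pairs; the flag lets a model
        # learn whether missingness itself carries signal.
        return [(self.fill_value, 1) if v is None else (v, 0) for v in values]


# Illustrative vehicle-age values (made up); None marks a missing entry.
training_vehicle_age = [3.0, None, 7.0, 12.0, None, 5.0]
imputer = MedianImputer.fit(training_vehicle_age)

# Production scoring reuses the fitted imputer instead of refitting, so the
# fill value stays identical between training and scoring.
scoring_vehicle_age = [None, 9.0]
scored = imputer.transform(scoring_vehicle_age)
```

Persisting the fitted fill value with the model artifact, rather than recomputing it on each scoring batch, is one way to keep the treatment consistent and easy to document.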
Related Terms
Model Governance
Policies, controls, and oversight processes managing the full lifecycle of predictive and AI models from development through retirement.
Gradient Boosting Insurance
An ensemble machine learning technique that builds decision trees sequentially, widely used in insurance pricing, fraud detection, and churn prediction.
Algorithmic Bias
Systematic unfair discrimination in AI or ML models that disadvantages protected classes, a critical compliance concern as insurers adopt predictive models.
Telematics Data
Driving behavior data from in-vehicle devices or apps (speed, braking, mileage) used to price auto insurance based on actual usage and risk.
