Can we use credit-based variables as features in all states?

No. Several states restrict or prohibit the use of credit information in personal lines rating. Even where permitted, you must document the actuarial justification and ensure the variable does not function as an illegal proxy under state unfair discrimination statutes.

How do we handle missing values during feature engineering?

Common approaches include mean or median imputation, creating an explicit missing-indicator binary variable, or using model architectures that handle missing values natively. The chosen approach must be documented and applied consistently between training and production scoring.
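One way to keep training and production scoring consistent is to learn the imputation values once and reuse them. The following is a minimal sketch using pandas, not a prescribed implementation; the column name building_age and the helpers fit_imputer and apply_imputer are hypothetical names chosen for illustration.

```python
import pandas as pd


def fit_imputer(train: pd.DataFrame, columns: list[str]) -> dict[str, float]:
    """Learn one median per column from the training data only."""
    return {col: float(train[col].median()) for col in columns}


def apply_imputer(df: pd.DataFrame, medians: dict[str, float]) -> pd.DataFrame:
    """Add a 0/1 missing flag per column, then fill gaps with the stored median."""
    out = df.copy()
    for col, median in medians.items():
        out[f"{col}_missing"] = out[col].isna().astype(int)
        out[col] = out[col].fillna(median)
    return out


# Hypothetical toy data: one training set and one production scoring batch.
train = pd.DataFrame({"building_age": [10.0, None, 30.0, 50.0]})
scoring = pd.DataFrame({"building_age": [None, 12.0]})

medians = fit_imputer(train, ["building_age"])   # learned once, stored with the model
train_ready = apply_imputer(train, medians)
scoring_ready = apply_imputer(scoring, medians)  # same medians reused at scoring time
```

Storing the learned medians alongside the model, rather than recomputing them on each scoring batch, is what keeps the two environments aligned.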

How does feature engineering interact with rate filings?

In prior-approval states, you may need to disclose the features and transformations used in a filed rating algorithm. Actuarial judgment must support each variable's relationship to loss exposure, so the engineering rationale must be preserved in documentation available to the filing actuary.

Feature Engineering

Selecting, transforming, and constructing input variables from raw data to improve the predictive accuracy of machine learning models in insurance.

technical | Published 2026/06/07 | Last verified 2026/06/07

How it works / Why it matters

Raw insurance data rarely arrives in model-ready form. Policy transaction records contain dates that must be converted to elapsed-time variables. Vehicle identification numbers encode make, model, and safety ratings that must be decoded and joined from reference tables. Claims histories must be aggregated into frequency and severity metrics at the insured level. Each of these transformations is a feature engineering decision.
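Two of the transformations described above, converting dates to elapsed time and aggregating claims to the insured level, might look like the following pandas sketch. All table and column names (policies, claims, insured_id, inception_date, exposure_years, paid_amount) and the as_of valuation date are hypothetical, and the data is invented for illustration.

```python
import pandas as pd

# Hypothetical policy and claim extracts.
policies = pd.DataFrame({
    "insured_id": [1, 2, 3],
    "inception_date": pd.to_datetime(["2020-01-01", "2023-07-01", "2022-01-01"]),
    "exposure_years": [4.0, 0.5, 2.0],
})
claims = pd.DataFrame({
    "insured_id": [1, 1, 2],
    "paid_amount": [1000.0, 3000.0, 500.0],
})
as_of = pd.Timestamp("2024-01-01")  # assumed valuation date

# Elapsed-time feature: days since policy inception.
policies["tenure_days"] = (as_of - policies["inception_date"]).dt.days

# Aggregate claims to one row per insured.
claim_summary = (
    claims.groupby("insured_id")["paid_amount"]
    .agg(claim_count="count", total_paid="sum")
    .reset_index()
)

# Left join so insureds with no claims are kept, then treat missing as zero.
features = policies.merge(claim_summary, on="insured_id", how="left")
features[["claim_count", "total_paid"]] = features[["claim_count", "total_paid"]].fillna(0)

# Frequency: claims per exposure year. Severity: average paid per claim,
# left as NaN when the insured has no claims.
features["claim_frequency"] = features["claim_count"] / features["exposure_years"]
features["claim_severity"] = features["total_paid"] / features["claim_count"].where(
    features["claim_count"] > 0
)
```

Leaving severity undefined for claim-free insureds, rather than setting it to zero, is a design choice in this sketch; it avoids implying that those insureds had zero-cost claims.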

Common techniques include:

Binning and discretization: Converting a continuous variable such as building age into ordinal buckets that stabilize predictions at thin data points.

Interaction terms: Multiplying or combining two variables — for example, driver age multiplied by vehicle horsepower — to capture joint effects that neither variable captures alone.

Target encoding: Replacing a high-cardinality categorical variable such as zip code with the historical loss ratio for that geography, with shrinkage toward the mean for sparse categories.

Lag and rolling features: For telematics or IoT risk data, computing rolling averages of braking events or speed violations over the prior 30 and 90 days.

Text-derived features: Extracting numeric signals from unstructured fields using NLP on submissions, such as occupancy class keywords from submission emails.
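Several of the techniques listed above can be sketched in a few lines of pandas. This is an illustrative sketch on invented data, not a production recipe: the column names, the bin edges, and the shrinkage constant k are all assumptions, and the target encoding uses a simple unweighted mean of loss ratios for brevity.

```python
import pandas as pd

df = pd.DataFrame({
    "building_age": [3, 18, 42, 95],
    "driver_age": [22, 35, 50, 67],
    "horsepower": [300, 150, 200, 120],
    "zip_code": ["10001", "10001", "10001", "94105"],
    "loss_ratio": [0.9, 0.6, 0.6, 0.2],
})

# Binning: continuous building age into ordinal bands (bin edges are assumed).
df["building_age_band"] = pd.cut(
    df["building_age"],
    bins=[0, 10, 25, 50, 200],
    labels=["0-10", "11-25", "26-50", "51+"],
)

# Interaction term: driver age multiplied by vehicle horsepower.
df["age_x_hp"] = df["driver_age"] * df["horsepower"]

# Target encoding with shrinkage toward the overall mean:
#   encoded = (n * zip_mean + k * global_mean) / (n + k)
# so sparse zip codes (small n) are pulled closer to the global mean.
# In practice the statistics should come from training data only
# (for example out-of-fold) to avoid leaking the target.
k = 2.0  # hypothetical shrinkage strength
global_mean = df["loss_ratio"].mean()
zip_stats = df.groupby("zip_code")["loss_ratio"].agg(n="count", zip_mean="mean")
zip_stats["encoded"] = (
    zip_stats["n"] * zip_stats["zip_mean"] + k * global_mean
) / (zip_stats["n"] + k)
df["zip_encoded"] = df["zip_code"].map(zip_stats["encoded"])

# Rolling features: average daily hard-braking events over trailing
# 30-day and 90-day windows. Each window here includes the current day.
events = pd.DataFrame(
    {"hard_brakes": [2, 0, 4, 1, 3]},
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)
for window in ("30D", "90D"):
    events[f"brakes_{window}_avg"] = events["hard_brakes"].rolling(window).mean()
```

With only five days of history, both windows reduce to a running average; on real telematics feeds the 30-day and 90-day values would diverge as older observations drop out of the shorter window.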

Regulators in several states have scrutinized features that serve as proxies for protected characteristics such as race or national origin. This makes feature selection a compliance exercise as well as a statistical one, linking feature engineering to algorithmic bias review within the model governance process.

In practice

A personal auto insurer building a renewal pricing model might start with 400 raw variables from policy, claims, and telematics data feeds. After exploratory analysis, the team may engineer 60 derived features, drop 300 redundant or unstable variables, and subject the remaining set to a bias audit before training. The final feature set is documented in the model card required by the governance framework.

Platforms such as Akur8 provide built-in feature selection and transformation tooling designed specifically for insurance pricing, reducing the manual effort required and preserving the documentation needed for regulatory filings. Verisk data products supply pre-engineered external features — credit attributes, prior carrier history, catastrophe scores — that carriers incorporate into their own pipelines.

Related Terms

Model Governance

Gradient Boosting Insurance

Algorithmic Bias

Telematics Data

Related Items

Akur8

Verisk

Planck
