sklearn_pmml_model.linear_model#

The sklearn_pmml_model.linear_model module implements generalized linear models.

Package Contents#

Classes#

PMMLLinearRegression

Ordinary least squares Linear Regression.

PMMLLogisticRegression

Logistic Regression (aka logit, MaxEnt) classifier.

PMMLRidge

Linear least squares with l2 regularization.

PMMLRidgeClassifier

Classifier using Ridge regression.

PMMLLasso

Linear Model trained with L1 prior as regularizer (aka the Lasso).

PMMLElasticNet

Linear regression with combined L1 and L2 priors as regularizer.

class sklearn_pmml_model.linear_model.PMMLLinearRegression(pmml)#

Bases: sklearn_pmml_model.base.OneHotEncodingMixin, sklearn_pmml_model.base.PMMLBaseRegressor, sklearn.linear_model.LinearRegression

Ordinary least squares Linear Regression.

The PMML model consists of a <RegressionModel> element containing at least one <RegressionTable> element. Each table element contains a <NumericPredictor> element for every numerical field and one <CategoricalPredictor> element per value of a categorical field, describing the coefficients.

Parameters:
pmml : str, object

Filename or file object containing PMML data.

Notes

Specification: http://dmg.org/pmml/v4-3/Regression.html

fit(x, y)#

Not supported: PMML models are already fitted.

_more_tags()#
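
To make the element structure above concrete, the following standard-library sketch (deliberately not using sklearn_pmml_model itself) reads coefficients from a hand-written <RegressionModel> fragment and evaluates the regression equation; all field names and values are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical <RegressionModel> fragment with one numeric predictor and
# one categorical predictor, as described above.
PMML = """
<RegressionModel functionName="regression">
  <RegressionTable intercept="1.5">
    <NumericPredictor name="age" coefficient="0.25"/>
    <CategoricalPredictor name="sex" value="male" coefficient="-0.5"/>
    <CategoricalPredictor name="sex" value="female" coefficient="0.5"/>
  </RegressionTable>
</RegressionModel>
"""

table = ET.fromstring(PMML).find('RegressionTable')
prediction = float(table.get('intercept'))

sample = {'age': 40.0, 'sex': 'female'}
for p in table.findall('NumericPredictor'):
    # Numeric fields contribute coefficient * value.
    prediction += float(p.get('coefficient')) * sample[p.get('name')]
for p in table.findall('CategoricalPredictor'):
    # Categorical fields contribute the coefficient of the matching value.
    if sample[p.get('name')] == p.get('value'):
        prediction += float(p.get('coefficient'))

print(prediction)  # 1.5 + 0.25 * 40 + 0.5 = 12.0
```

The class itself performs this mapping when it is constructed from the PMML file, which is why fit is not needed.
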
class sklearn_pmml_model.linear_model.PMMLLogisticRegression(pmml)#

Bases: sklearn_pmml_model.base.OneHotEncodingMixin, sklearn_pmml_model.base.PMMLBaseClassifier, sklearn.linear_model.LogisticRegression

Logistic Regression (aka logit, MaxEnt) classifier.

The PMML model consists of a <RegressionModel> element containing at least one <RegressionTable> element. Each table element contains a <NumericPredictor> element for every numerical field and one <CategoricalPredictor> element per value of a categorical field, describing the coefficients.

Parameters:
pmml : str, object

Filename or file object containing PMML data.

Notes

Specification: http://dmg.org/pmml/v4-3/Regression.html

fit(x, y)#

Not supported: PMML models are already fitted.

_more_tags()#
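
In the classification case, the PMML specification pairs each <RegressionTable> with a target category and converts the per-class scores to probabilities with a normalization method such as softmax. A minimal sketch of that normalization step, with invented scores:

```python
import math

def softmax(scores):
    """Normalize per-class regression scores into probabilities."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores produced by two <RegressionTable> elements.
probs = softmax([0.0, 1.0])
print(probs)  # probabilities summing to 1, larger for the higher score
```
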
class sklearn_pmml_model.linear_model.PMMLRidge(alpha=1.0, *, fit_intercept=True, copy_X=True, max_iter=None, tol=0.0001, solver='auto', positive=False, random_state=None)#

Bases: sklearn_pmml_model.linear_model.base.PMMLGeneralizedLinearRegressor, sklearn.linear_model.Ridge

Linear least squares with l2 regularization.

Minimizes the objective function:

||y - Xw||^2_2 + alpha * ||w||^2_2

This model solves a regression model where the loss function is the linear least squares function and regularization is given by the l2-norm. Also known as Ridge Regression or Tikhonov regularization. This estimator has built-in support for multi-variate regression (i.e., when y is a 2d-array of shape (n_samples, n_targets)).
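
To make the objective concrete: with no intercept term, the minimizer has the closed form w = (XᵀX + αI)⁻¹Xᵀy. A minimal numpy sketch, illustrative only and not the solver scikit-learn selects by default:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true  # noiseless targets for illustration

alpha = 0.1
# Closed-form ridge solution (intercept-free): (X^T X + alpha I)^{-1} X^T y
w = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)
print(w)  # close to w_true, shrunk slightly toward zero by the penalty
```

In practice the solver parameter of the signature above chooses an iterative or factorization-based method instead of this explicit inverse.
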

Parameters:
pmml : str, object

Filename or file object containing PMML data.

Notes

Specification: http://dmg.org/pmml/v4-3/GeneralRegression.html

fit(x, y)#

Fit Ridge regression model.

Parameters:
X : {ndarray, sparse matrix} of shape (n_samples, n_features)

Training data.

y : ndarray of shape (n_samples,) or (n_samples, n_targets)

Target values.

sample_weight : float or ndarray of shape (n_samples,), default=None

Individual weights for each sample. If given a float, every sample will have the same weight.

Returns:
self : object

Fitted estimator.

_more_tags()#
class sklearn_pmml_model.linear_model.PMMLRidgeClassifier(pmml)#

Bases: sklearn_pmml_model.linear_model.base.PMMLGeneralizedLinearClassifier, sklearn.linear_model.RidgeClassifier

Classifier using Ridge regression.

This classifier first converts the target values into {-1, 1} and then treats the problem as a regression task (multi-output regression in the multiclass case).

Parameters:
pmml : str, object

Filename or file object containing PMML data.

Notes

Specification: http://dmg.org/pmml/v4-3/GeneralRegression.html

fit(x, y)#

Fit Ridge classifier model.

Parameters:
X : {ndarray, sparse matrix} of shape (n_samples, n_features)

Training data.

y : ndarray of shape (n_samples,)

Target values.

sample_weight : float or ndarray of shape (n_samples,), default=None

Individual weights for each sample. If given a float, every sample will have the same weight.

New in version 0.17: sample_weight support for RidgeClassifier.

Returns:
self : object

Instance of the estimator.

_more_tags()#
class sklearn_pmml_model.linear_model.PMMLLasso(pmml)#

Bases: sklearn_pmml_model.linear_model.base.PMMLGeneralizedLinearRegressor, sklearn.linear_model.Lasso

Linear Model trained with L1 prior as regularizer (aka the Lasso).

The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Technically the Lasso model is optimizing the same objective function as the Elastic Net with l1_ratio=1.0 (no L2 penalty).
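
The fit method below uses coordinate descent, whose per-coefficient update applies the soft-thresholding operator induced by the L1 term. A minimal illustrative sketch, independent of scikit-learn's implementation:

```python
def soft_threshold(rho, lam):
    """Soft-thresholding: the L1 proximal step used in coordinate descent."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

print(soft_threshold(3.0, 1.0))   # 2.0: shrunk toward zero by lam
print(soft_threshold(-0.5, 1.0))  # 0.0: small coefficients are zeroed out
```

The zeroing behavior is what makes Lasso produce sparse coefficient vectors.
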

Parameters:
pmml : str, object

Filename or file object containing PMML data.

Notes

Specification: http://dmg.org/pmml/v4-3/GeneralRegression.html

fit(x, y)#

Fit model with coordinate descent.

Parameters:
X : {ndarray, sparse matrix} of shape (n_samples, n_features)

Data.

y : ndarray of shape (n_samples,) or (n_samples, n_targets)

Target. Will be cast to X’s dtype if necessary.

sample_weight : float or array-like of shape (n_samples,), default=None

Sample weights. Internally, the sample_weight vector will be rescaled to sum to n_samples.

New in version 0.23.

check_input : bool, default=True

Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.

Returns:
self : object

Fitted estimator.

Notes

Coordinate descent is an algorithm that considers each column of data at a time, hence it will automatically convert the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.
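
Assuming numpy, data can be allocated in that layout directly, or converted explicitly:

```python
import numpy as np

# Allocate Fortran-contiguous (column-major) storage from the start.
X = np.zeros((1000, 10), order='F')
print(X.flags['F_CONTIGUOUS'])  # True

# An existing C-ordered array can be converted explicitly instead.
X_f = np.asfortranarray(np.ones((5, 3)))
print(X_f.flags['F_CONTIGUOUS'])  # True
```
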

_more_tags()#
class sklearn_pmml_model.linear_model.PMMLElasticNet(pmml)#

Bases: sklearn_pmml_model.linear_model.base.PMMLGeneralizedLinearRegressor, sklearn.linear_model.ElasticNet

Linear regression with combined L1 and L2 priors as regularizer.

Minimizes the objective function:

1 / (2 * n_samples) * ||y - Xw||^2_2
+ alpha * l1_ratio * ||w||_1
+ 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2

If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:

a * ||w||_1 + 0.5 * b * ||w||_2^2

where:

alpha = a + b and l1_ratio = a / (a + b)

The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. Specifically, l1_ratio = 1 is the lasso penalty. Currently, l1_ratio <= 0.01 is not reliable, unless you supply your own sequence of alpha.
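
The conversion above can be checked with a few lines of arithmetic, using arbitrary example strengths a and b:

```python
# Separate L1 strength a and L2 strength b, as in a*||w||_1 + 0.5*b*||w||_2^2
a, b = 0.5, 1.5

alpha = a + b            # overall penalty strength
l1_ratio = a / (a + b)   # fraction of the penalty that is L1

print(alpha, l1_ratio)   # 2.0 0.25

# Recovering a and b from (alpha, l1_ratio):
assert alpha * l1_ratio == a
assert alpha * (1 - l1_ratio) == b
```
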

Parameters:
pmml : str, object

Filename or file object containing PMML data.

Notes

Specification: http://dmg.org/pmml/v4-3/GeneralRegression.html

fit(x, y)#

Fit model with coordinate descent.

Parameters:
X : {ndarray, sparse matrix} of shape (n_samples, n_features)

Data.

y : ndarray of shape (n_samples,) or (n_samples, n_targets)

Target. Will be cast to X’s dtype if necessary.

sample_weight : float or array-like of shape (n_samples,), default=None

Sample weights. Internally, the sample_weight vector will be rescaled to sum to n_samples.

New in version 0.23.

check_input : bool, default=True

Allows bypassing several input checks. Don't use this parameter unless you know what you are doing.

Returns:
self : object

Fitted estimator.

Notes

Coordinate descent is an algorithm that considers each column of data at a time, hence it will automatically convert the X input to a Fortran-contiguous numpy array if necessary.

To avoid memory re-allocation it is advised to allocate the initial data in memory directly using that format.

_more_tags()#