chariots.sklearn
The sklearn module provides support for the scikit-learn framework.
This module provides two main classes (SKSupervisedOp, SKUnsupervisedOp) that must be subclassed before they can be used. To do so, set the model_class class attribute and, if needed, the model_parameters class attribute, which should be a VersionedFieldDict defining the parameters your model is initialized with. As with other machine learning ops, you can override the training_update_version class attribute to define which version number changes when the operation is retrained:
>>> class PCAOp(SKUnsupervisedOp):
...     training_update_version = VersionType.MAJOR
...     model_parameters = VersionedFieldDict(VersionType.MAJOR, {"n_components": 2})
...     model_class = VersionedField(PCA, VersionType.MAJOR)
Once your op class is defined, you can use it like any other MLOp, choosing an MLMode to define the behavior of your operation (fit and/or predict):
>>> train_pca = Pipeline([
...     Node(IrisXDataSet(), output_nodes=["x"]),
...     Node(PCAOp(MLMode.FIT), input_nodes=["x"]),
... ], 'train_pca')
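The subclassing pattern described above (a base class configured through class attributes) can be pictured with a small standalone sketch. It does not use chariots or scikit-learn and is not how the library is implemented: `ToyBaseOp`, `ToyScaler`, and `ScalerOp` are hypothetical names, and plain values stand in for VersionedField and VersionedFieldDict.

```python
from typing import Any, Dict, Optional


class ToyScaler:
    """Hypothetical stand-in for a scikit-learn estimator class."""

    def __init__(self, factor: float = 1.0):
        self.factor = factor


class ToyBaseOp:
    """Hypothetical base class configured through class attributes."""

    # Subclasses must set the class of the underlying model.
    model_class: Optional[type] = None
    # Subclasses may set keyword arguments used to build the model.
    model_parameters: Dict[str, Any] = {}

    def __init__(self):
        if self.model_class is None:
            raise ValueError("subclasses must set `model_class`")
        # Build the underlying model from the class-level configuration.
        self.model = self.model_class(**self.model_parameters)


class ScalerOp(ToyBaseOp):
    # Only configuration lives in the subclass; no methods are overridden.
    model_class = ToyScaler
    model_parameters = {"factor": 2.0}


op = ScalerOp()
print(type(op.model).__name__, op.model.factor)  # ToyScaler 2.0
```

Keeping the configuration at class level means every instance of the op builds its model the same way, which is one plausible reason for declaring it there.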
class chariots.sklearn.SKSupervisedOp(mode: chariots._ml_mode.MLMode, op_callbacks: Optional[List[chariots.callbacks._op_callback.OpCallBack]] = None)
Bases: chariots.sklearn._base_sk_op.BaseSKOp
Op base class for creating supervised models with the scikit-learn framework. When using MLMode.FIT or MLMode.FIT_PREDICT, you need to link this op to both an X and a y upstream node:
>>> train_logistics = Pipeline([
...     Node(IrisFullDataSet(), output_nodes=["x", "y"]),
...     Node(PCAOp(MLMode.PREDICT), input_nodes=["x"], output_nodes="x_transformed"),
...     Node(LogisticOp(MLMode.FIT), input_nodes=["x_transformed", "y"])
... ], 'train_logistics')
If you are using the op in MLMode.PREDICT mode, you only need to link it to an X upstream node:
>>> pred = Pipeline([
...     Node(IrisFullDataSet(), input_nodes=['__pipeline_input__'], output_nodes=["x"]),
...     Node(PCAOp(MLMode.PREDICT), input_nodes=["x"], output_nodes="x_transformed"),
...     Node(LogisticOp(MLMode.PREDICT), input_nodes=["x_transformed"], output_nodes=['__pipeline_output__'])
... ], 'pred')
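The difference in inputs between the modes (X and y when fitting, X alone when predicting) can be illustrated with a toy dispatcher. This is a sketch under stated assumptions, not the chariots implementation: `ToyMode`, `ToyMajorityClassifier`, and `run_supervised` are hypothetical names, and treating FIT_PREDICT as "fit, then predict on the same X" is an assumption made only for this example.

```python
from enum import Enum


class ToyMode(Enum):
    """Hypothetical stand-in for MLMode."""
    FIT = "fit"
    PREDICT = "predict"
    FIT_PREDICT = "fit_predict"


class ToyMajorityClassifier:
    """Toy model: predicts the most frequent label seen during fit."""

    def __init__(self):
        self.label = None

    def fit(self, X, y):
        labels = list(y)
        self.label = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label for _ in X]


def run_supervised(model, mode, X, y=None):
    """Call fit and/or predict on the model depending on the mode."""
    if mode in (ToyMode.FIT, ToyMode.FIT_PREDICT):
        if y is None:
            # Fitting a supervised model needs both inputs.
            raise ValueError("fitting requires both X and y")
        model.fit(X, y)
    if mode in (ToyMode.PREDICT, ToyMode.FIT_PREDICT):
        # Predicting only needs X.
        return model.predict(X)
    return None


model = ToyMajorityClassifier()
run_supervised(model, ToyMode.FIT, [[0.0], [1.0], [2.0]], ["a", "b", "a"])
predictions = run_supervised(model, ToyMode.PREDICT, [[3.0], [4.0]])
print(predictions)  # ['a', 'a']
```

In the pipelines above, the same distinction shows up as the number of input_nodes wired into the supervised op: two for fitting, one for predicting.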
fit(X, y)
Method used by the operation to fit the underlying model.
DO NOT TRY TO OVERRIDE THIS METHOD.
- Parameters
X – the input that the underlying supervised model will fit on (the type must be compatible with scikit-learn, such as NumPy arrays or pandas data frames)
y – the output (target) that the underlying supervised model will fit on (the type must be compatible with scikit-learn, such as NumPy arrays or pandas data frames)
class chariots.sklearn.SKUnsupervisedOp(mode: chariots._ml_mode.MLMode, op_callbacks: Optional[List[chariots.callbacks._op_callback.OpCallBack]] = None)
Bases: chariots.sklearn._base_sk_op.BaseSKOp
Base class for creating unsupervised models with the scikit-learn framework. Whatever the mode, you need to link this op to a single upstream node:
>>> train_logistics = Pipeline([
...     Node(IrisFullDataSet(), output_nodes=["x", "y"]),
...     Node(PCAOp(MLMode.PREDICT), input_nodes=["x"], output_nodes="x_transformed"),
...     Node(LogisticOp(MLMode.FIT), input_nodes=["x_transformed", "y"])
... ], 'train_logistics')
>>> pred = Pipeline([
...     Node(IrisFullDataSet(), input_nodes=['__pipeline_input__'], output_nodes=["x"]),
...     Node(PCAOp(MLMode.PREDICT), input_nodes=["x"], output_nodes="x_transformed"),
...     Node(LogisticOp(MLMode.PREDICT), input_nodes=["x_transformed"], output_nodes=['__pipeline_output__'])
... ], 'pred')