Clustering Model Generator

Description

Clustering identifies groups of closely related data points and places points with similar properties in the same group.

Properties

Location

  • Algorithm Type – Select the clustering algorithm for model creation. The value can be “KMeans”. The k-means algorithm searches for a predetermined number of clusters within an unlabelled multidimensional dataset.

  • CSV File Path – Specify the path of the CSV file that contains the dataset required for model creation.

  • Missing Values Handler – Datasets may contain missing values, which cause problems for algorithms. The missing values handler identifies the missing values in each column of the specified input data and replaces them before the model is built. The value can be “Mean”, “Median”, or “Mode”.

  • Model Name – Specify the name to give the generated model.

  • Number of clusters – Specify the number of clusters (N) into which the data is grouped.

  • Selected Columns – Specify the columns of the dataset that are used to train the model.
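
To make the Missing Values Handler and KMeans options more concrete, here is a minimal Python sketch of the two steps they describe: filling missing entries in a column with its mean, median, or mode, and then grouping the rows into a fixed number of clusters. It is an illustration only, not the activity's actual implementation. The names fill_missing and kmeans, the iterations and seed parameters, and the sample values are all hypothetical.

```python
import random
import statistics


def fill_missing(column, strategy="Mean"):
    """Replace None entries in one column with its mean, median, or mode."""
    present = [v for v in column if v is not None]
    pick = {
        "Mean": statistics.mean,
        "Median": statistics.median,
        "Mode": statistics.mode,
    }[strategy]
    fill = pick(present)
    return [fill if v is None else v for v in column]


def kmeans(points, k, iterations=100, seed=0):
    """Basic k-means (Lloyd's algorithm); returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct rows chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments stopped changing, so the result is stable.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(statistics.fmean(dim) for dim in zip(*members))
    return centroids, labels


# Hypothetical two-column dataset with one missing value per column.
col_x = fill_missing([1.0, 1.2, None, 8.0, 8.2], "Median")
col_y = fill_missing([1.1, None, 0.9, 8.1, 7.9], "Mean")
rows = list(zip(col_x, col_y))
centroids, labels = kmeans(rows, k=2)
print(labels)
```

Each entry in labels is the cluster index (0 to N-1) assigned to the corresponding row. Because the starting centroids are chosen at random, which cluster receives which index can differ between seeds.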

Misc

  • DisplayName – Add a display name to your activity.
  • Private – By default, the activity logs the values of your properties inside your workflow. If Private is selected, the values are not logged.

Output

  • Result – Name of the model generated by this activity.