ML.NET Tutorial — Get started in 10 minutes

    Last year we announced ML.NET, a cross-platform, open-source machine learning framework for .NET developers. Since then it has evolved considerably and gone through many releases. Today we are sharing a guide to creating your first ML.NET application in 10 minutes.



    *This tutorial is also available in Russian.

    **The steps below are shown for Windows, but they work the same way on macOS and Linux.

    Install the .NET SDK


    To start building .NET apps you just need to download and install the .NET SDK (Software Development Kit).



    Create your app


    Open a new command prompt and run the following commands:

    dotnet new console -o myApp
    cd myApp

    The dotnet new console command creates a new console application for you. The -o parameter creates a directory named myApp where your app is stored and populates it with the required files. The cd myApp command moves you into the newly created app directory.

    Install ML.NET package


    To use ML.NET, you need to install the Microsoft.ML package. In your command prompt, run the following command:

    dotnet add package Microsoft.ML --version 0.9.0

    Download the data set


    Your machine learning app will predict the type of iris flower (setosa, versicolor, or virginica) based on four features: petal length, petal width, sepal length, and sepal width.

    Open the UCI Machine Learning Repository: Iris Data Set, copy and paste the data into a text editor (e.g. Notepad), and save it as iris-data.txt in the myApp directory.

    When you paste the data it will look like the following. Each row represents a different sample of an iris flower. From left to right, the columns represent: sepal length, sepal width, petal length, petal width, and type of iris flower.

    5.1,3.5,1.4,0.2,Iris-setosa
    4.9,3.0,1.4,0.2,Iris-setosa
    4.7,3.2,1.3,0.2,Iris-setosa
    ...
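
    To make the column order concrete, here is a minimal sketch in plain C# that splits one row by hand. It is only an illustration: IrisRow and Parse are hypothetical names introduced here, and ML.NET does this loading for you in the tutorial code, so you do not need this helper in your app.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of ML.NET or the tutorial app.
public class IrisRow
{
    public float SepalLength;
    public float SepalWidth;
    public float PetalLength;
    public float PetalWidth;
    public string Label;

    public static IrisRow Parse(string line)
    {
        // Columns 0-3 are the measurements, column 4 is the type of iris flower.
        string[] parts = line.Split(',');
        if (parts.Length != 5)
            throw new FormatException("Expected 5 comma-separated values, got " + parts.Length);

        return new IrisRow
        {
            // InvariantCulture keeps '.' as the decimal separator on every machine.
            SepalLength = float.Parse(parts[0], CultureInfo.InvariantCulture),
            SepalWidth = float.Parse(parts[1], CultureInfo.InvariantCulture),
            PetalLength = float.Parse(parts[2], CultureInfo.InvariantCulture),
            PetalWidth = float.Parse(parts[3], CultureInfo.InvariantCulture),
            Label = parts[4].Trim()
        };
    }
}

public static class IrisRowDemo
{
    public static void Main()
    {
        IrisRow row = IrisRow.Parse("5.1,3.5,1.4,0.2,Iris-setosa");
        Console.WriteLine("Type: " + row.Label);
        Console.WriteLine("Sepal length: " + row.SepalLength.ToString(CultureInfo.InvariantCulture));
    }
}
```

    The field order in this sketch matches the column positions described above: four measurements first, the flower type last.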

    Using Visual Studio?


    If you're following along in Visual Studio, you'll need to configure iris-data.txt to be copied to the output directory.
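
    The reason is that a relative path such as iris-data.txt is resolved against the folder the app is started from, and Visual Studio typically starts the app from the build output folder rather than the project folder. The sketch below is a hypothetical helper (DataFileLocator is not part of ML.NET or of the tutorial code) that looks in both places and reports where it searched, which can help when diagnosing a "file not found" error.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of ML.NET or the tutorial app.
public static class DataFileLocator
{
    // Returns the first existing candidate path for fileName.
    public static string Resolve(string fileName)
    {
        string[] candidates =
        {
            // Folder the process was started from (the project folder for `dotnet run`).
            Path.Combine(Directory.GetCurrentDirectory(), fileName),
            // Folder that contains the compiled app (the build output folder).
            Path.Combine(AppContext.BaseDirectory, fileName)
        };

        foreach (string candidate in candidates)
        {
            if (File.Exists(candidate))
                return candidate;
        }

        throw new FileNotFoundException(
            "Could not find " + fileName + ". Checked: " + string.Join(", ", candidates));
    }

    public static void Main()
    {
        try
        {
            Console.WriteLine("Using data file: " + Resolve("iris-data.txt"));
        }
        catch (FileNotFoundException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```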



    Write some code


    Open Program.cs in any text editor and replace all of the code with the following:

    using Microsoft.ML;
    using Microsoft.ML.Data;
    using System;
    
    namespace myApp
    {
        class Program
        {
            // STEP 1: Define your data structures
            // IrisData is used to provide training data, and as
            // input for prediction operations
            // - First 4 properties are inputs/features used to predict the label
            // - Label is what you are predicting, and is only set when training
            public class IrisData
            {
                [LoadColumn(0)]
                public float SepalLength;
    
                [LoadColumn(1)]
                public float SepalWidth;
    
                [LoadColumn(2)]
                public float PetalLength;
    
                [LoadColumn(3)]
                public float PetalWidth;
    
                [LoadColumn(4)]
                public string Label;
            }
    
            // IrisPrediction is the result returned from prediction operations
            public class IrisPrediction
            {
                [ColumnName("PredictedLabel")]
                public string PredictedLabels;
            }
    
            static void Main(string[] args)
            {
                // STEP 2: Create a ML.NET environment  
                var mlContext = new MLContext();
    
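                // If working in Visual Studio, make sure the 'Copy to Output Directory'
                // property of iris-data.txt is set to 'Copy always'.
                // The data file has no header row, so hasHeader is false; with true,
                // the first sample would be skipped.
                var reader = mlContext.Data.CreateTextReader<IrisData>(separatorChar: ',', hasHeader: false);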
                IDataView trainingDataView = reader.Read("iris-data.txt");
    
                // STEP 3: Transform your data and add a learner
                // - MapValueToKey assigns numeric values to the text in the "Label" column,
                //   because only numbers can be processed during model training.
                // - Concatenate combines the four measurements into a single "Features" column.
                // - The trainer is the learning algorithm (it answers: what type of iris is this?).
                // - MapKeyToValue converts the predicted label back into the original text.
                var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
                    .Append(mlContext.Transforms.Concatenate("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"))
                    .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(labelColumn: "Label", featureColumn: "Features"))
                    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
    
                // STEP 4: Train your model based on the data set  
                var model = pipeline.Fit(trainingDataView);
    
                // STEP 5: Use your model to make a prediction
                // You can change these numbers to test different predictions
                var prediction = model.CreatePredictionEngine<IrisData, IrisPrediction>(mlContext).Predict(
                    new IrisData()
                    {
                        SepalLength = 3.3f,
                        SepalWidth = 1.6f,
                        PetalLength = 0.2f,
                        PetalWidth = 5.1f,
                    });
    
                Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
            }
        }
    }
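
    Step 3 converts the text labels to numbers before training and back to text after prediction. The sketch below shows that round trip in plain C#. It is an illustration of the idea only: LabelKeyMap is a hypothetical class, the key values are an arbitrary choice made for this sketch, and it does not show how ML.NET implements MapValueToKey or MapKeyToValue internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of a label <-> number mapping; not ML.NET code.
public class LabelKeyMap
{
    private readonly Dictionary<string, uint> _keyByLabel = new Dictionary<string, uint>();
    private readonly List<string> _labelByKey = new List<string>();

    // Text -> number: assign the next free key the first time a label is seen.
    public uint ToKey(string label)
    {
        uint key;
        if (!_keyByLabel.TryGetValue(label, out key))
        {
            _labelByKey.Add(label);
            key = (uint)_labelByKey.Count; // keys start at 1 in this sketch
            _keyByLabel[label] = key;
        }
        return key;
    }

    // Number -> text: look the original label back up.
    public string ToLabel(uint key)
    {
        if (key == 0 || key > _labelByKey.Count)
            throw new ArgumentOutOfRangeException(nameof(key));
        return _labelByKey[(int)key - 1];
    }
}

public static class LabelKeyMapDemo
{
    public static void Main()
    {
        var map = new LabelKeyMap();
        string[] labels = { "Iris-setosa", "Iris-versicolor", "Iris-setosa", "Iris-virginica" };

        foreach (string label in labels)
            Console.WriteLine(label + " -> " + map.ToKey(label));

        // A trainer works with the numeric keys; the result is mapped back to text.
        Console.WriteLine("2 -> " + map.ToLabel(2));
    }
}
```

    A repeated label always gets the same key, which is what lets the predicted number be translated back into a flower name.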

    Run your app


    In your command prompt, run the following command:

    dotnet run

    The last line of output is the predicted type of iris flower. You can change the values passed to the Predict function to see predictions based on different measurements.

    Congratulations, you've built your first machine learning model with ML.NET!

    Keep learning


    Now that you've got the basics, you can keep learning with our ML.NET tutorials.


    Comments 1

      5.1,3.5,1.4,0.2,Iris-setosa
      4.9,3.0,1.4,0.2,Iris-setosa
      4.7,3.2,1.3,0.2,Iris-setosa
      ...

      Iris data set appears to have no header (column names) in the source csv. However, you write in your example:


      var reader = mlContext.Data.CreateTextReader<IrisData>(separatorChar: ',', hasHeader: true);

      The documentation says that hasHeader controls whether the data set has a "header with feature names", but the example data set does not have feature names (or am I missing something?).


      If I am not mistaken, the code above might not fail, but the loaded data will likely have names taken from the first row (so, feature "5.1", feature "3.5" and so on), and the total number of observations will be reduced by 1.


      I've had so many weird errors reading tables in R because of the missing header = TRUE/FALSE parameters that I now try to be super-cautious about it.