The .NET Stacks #15: The final preview and ML.NET with Luis Quintanilla

We discuss the final .NET preview, talk about ML.NET with Luis Quintanilla, and more!

04 Sep 2020

Welcome to the start of another week—I hope you are well and safe. We had an insanely busy week in the .NET world, so let’s get to it.

It’s the final … preview

Are you ready for .NET 5? This week, Microsoft announced the release of .NET 5 Preview 8 (along with ASP.NET Core and EF Core 5 updates). This means that .NET 5 is more or less feature complete, with only bug fixes remaining. Up next, we have two go-live release candidates, and then the official release in November.

You’ll want to hit up the links (and GitHub) for all the gory details, but here’s a quick recap of what’s coming in .NET 5 from a high level:

We discussed this a few weeks back, but for the preview itself, the ASP.NET Core side is jam-packed with Blazor improvements and capabilities. There’s CSS isolation, lazy loading, UI focus, and more. On the EF Core 5 side, there’s table-per-type mapping, table-valued functions, and more. As mentioned last week, many-to-many is now in the daily builds.

Speed up runtime with C# source generators

While not directly linked to C# 9, I’m getting excited about C# source generators (I wrote a “first look” post back in May).

These generators are basically code that runs during compilation and can produce additional files that are compiled together with the rest of your code. It’s a compilation step that generates code for you based on your code.
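To make that concrete, here is a minimal sketch of what a generator can look like. It assumes the `ISourceGenerator` interface from the Microsoft.CodeAnalysis.CSharp package as it shipped with .NET 5 (some type names differed in the previews). The `HelloGenerator` class and the generated `Generated.Hello` type are hypothetical names used only for illustration.

```csharp
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

// Hypothetical generator: adds one extra source file to the compilation.
[Generator]
public class HelloGenerator : ISourceGenerator
{
    public void Initialize(GeneratorInitializationContext context)
    {
        // No syntax receivers or other setup are needed for this sketch.
    }

    public void Execute(GeneratorExecutionContext context)
    {
        // A generator can inspect the user's code through context.Compilation.
        // Here we only read the assembly name to show the idea.
        string assemblyName = context.Compilation.AssemblyName ?? "unknown";

        // The generated code is plain C# text, built at compile time.
        string source = $@"
namespace Generated
{{
    public static class Hello
    {{
        public static string Message => ""Hello from code generated for {assemblyName}"";
    }}
}}";

        // The added file is compiled together with the rest of the project.
        context.AddSource("HelloGenerated.cs", SourceText.From(source, Encoding.UTF8));
    }
}
```

A project that references this generator as an analyzer could then use `Generated.Hello.Message` as if the class had been written by hand.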

Here’s a big benefit that I wrote about:

If you’ve ever leaned on reflection in your projects, you might begin to see many use cases for these solutions. C# source generators provide a lot of the advantages that reflection currently offers with few, if any, drawbacks. Reflection is extremely powerful when you want to query properties and attributes you don’t know about at compile time. Of course, getting type information at runtime can incur a large performance cost, so offloading this to compilation is definitely a game-changer in C#.

This week, Microsoft announced some new C# source generator samples. You can also check out the design document and a source generators cookbook. If you’ve tried it, let me know your thoughts!

Dev Discussions: Luis Quintanilla

Machine learning is a fascinating world and, to many, a complicated one. As .NET developers, we definitely see the benefit of training models on our data. But between the learning curve and the need to use other languages like Python, which .NET devs might not be familiar with, ML often ends up in a developer’s “I should look into that sometime” queue.

That changed in 2018, when Microsoft launched ML.NET, a free, open source, cross-platform machine learning framework for .NET. With ML.NET, you can use your favorite languages like C# or F# to work with your custom machine learning models. The idea is to meet you where you are and make ML more accessible.

There’s no one better to talk to about this than Luis Quintanilla. Luis has been with ML.NET since the beginning and was eventually scooped up by Microsoft to work on the docs for ML.NET.

What made you focus on ML.NET over other development tech?

I could write an entire essay on why ML.NET, but all of the reasons can be summarized in a single word: .NET. Now, to expand on that, here are a few reasons why I enjoy ML.NET so much!

Languages

Though not unique to .NET, I like statically-typed languages. I’m sure many of the readers are able to build their applications and successfully run them without errors on the first try 😉. That, however, is usually not my experience. Therefore, I prefer catching as many errors as possible at compile time. Another reason I like types is they provide a way of documenting your code.

This is of extreme importance when working in data science and machine learning scenarios. Although ultimately the data used by machine learning algorithms to train models is encoded as numbers, knowing your data schema and checking it at compile time may help reduce the number of errors in your code as you transform your data.

Lately I’ve been doing more F# development and the more I use it, the more I like it. F# for me provides a nice balance between Python and C#. F# gives you the productivity and succinctness of a language like Python, while still having the compiler and many other neat features at your disposal.

Runtime

The .NET runtime is fast and performant. This is important in two scenarios: training machine learning models and deploying them. A good part of training machine learning models involves performing operations on vectors and matrices. .NET provides Single Instruction Multiple Data (SIMD) enabled types via the System.Numerics namespace. ML.NET leverages these types where possible to increase the throughput of the training operations, making training fast and efficient.
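As a rough illustration of what SIMD-enabled types look like in practice, here is a small sketch of a dot product written with `Vector<float>` from System.Numerics. This is not ML.NET's internal code; the `SimdSketch` class and its `Dot` method are hypothetical names chosen for the example.

```csharp
using System;
using System.Numerics;

public static class SimdSketch
{
    // Dot product that processes Vector<float>.Count elements per iteration.
    public static float Dot(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        int width = Vector<float>.Count; // number of floats per SIMD register
        var acc = Vector<float>.Zero;
        int i = 0;

        // Vectorized part: multiply and accumulate one full register at a time.
        for (; i <= a.Length - width; i += width)
            acc += new Vector<float>(a, i) * new Vector<float>(b, i);

        // Sum the lanes of the accumulator.
        float sum = Vector.Dot(acc, Vector<float>.One);

        // Scalar tail for elements that did not fill a whole register.
        for (; i < a.Length; i++)
            sum += a[i] * b[i];

        return sum;
    }

    public static void Main()
    {
        var a = new float[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        var b = new float[] { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };
        Console.WriteLine(Dot(a, b)); // prints 55
    }
}
```

The register width depends on the hardware, which is why the loop reads `Vector<float>.Count` at runtime and falls back to a scalar loop for the remainder.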

Tooling

.NET has world-class tooling across the board and you can’t go wrong with any of your choices. Visual Studio is an excellent IDE packed with tons of functionality to help developers be more productive. Alternatively, another great IDE for .NET is JetBrains Rider. If you’re looking for a more lightweight development environment, you can also use Visual Studio Code. When working with F#, you can use the Ionide extension, which makes F# development a pleasant experience.

Data science and machine learning workflows are very experimental. This means that you sometimes may want to have an interactive environment where you get near real-time feedback on the outputs generated by your code. You’d also like a way to visualize your data inline. Within the data science community, a popular interactive computing environment is Jupyter Notebooks. You can leverage this interactive environment in .NET through .NET Interactive, which provides, among many things, a kernel for you to run .NET code interactively.

Extensible

Although .NET is great, a large portion of the data science and machine learning space is predominantly made up of libraries and frameworks built in Python. That, however, does not limit ML.NET because it is extensible. ML.NET supports working with TensorFlow and Open Neural Network Exchange (ONNX) models. TensorFlow is a popular platform for building machine learning models. Using TensorFlow.NET, a set of C# bindings for TensorFlow, users can train and deploy TensorFlow models with ML.NET. ONNX is an open format built to represent machine learning models. This means that you can train a model in other popular tools and frameworks like Azure Custom Vision, PyTorch, and scikit-learn. Then, you can export or convert that model to ONNX and consume it using ML.NET.

Open Source & Community

ML.NET, like .NET, is open source. This allows for the community to collaborate and contribute to it. Users have various ways of contributing to ML.NET: raising issues, updating documentation, or submitting pull requests. They’re all valuable contributions that only help make the framework that much better for everyone to use.

Correct me if I’m wrong, but I believe a big mission of ML.NET is making machine learning accessible—that is, I shouldn’t have to be an expert in machine learning to do it in .NET. Even still: how much should I know before I get started?

That’s right! ML.NET provides many ways of interacting with it depending on what you’re most comfortable with. The easiest way to get started is by using the tooling. The tooling provides a low-code way of training and consuming ML.NET models. If you prefer a graphical user interface, you can try Model Builder, a Visual Studio extension that guides you through the steps involved in training a machine learning model. As long as you have a general sense of the problem you’re trying to solve (classify text, predict a number, categorize images) and you have a dataset, Model Builder takes care of the rest.

Alternatively, if you prefer working on the command line, you can use the ML.NET CLI, a .NET command line tool for training ML.NET models and generating consumption code. The idea is very much the same as Model Builder, except now you interact with the tooling via the command line. The CLI is also a great choice for Machine Learning Operations (MLOps) scenarios where model training and deployment are done as part of a continuous integration (CI) or continuous deployment (CD) pipeline.

For folks who want more control, prefer a code-first approach, or are more familiar with machine learning concepts, there are other ways of using ML.NET. One is with the ML.NET Automated ML (AutoML) API. The AutoML API is leveraged by the tooling to try to find the “best” model. The best model for your problem depends on many factors such as the quantity and distribution of your data and time to train. Therefore, it helps to try different algorithms with different parameters.

If you want full control over your machine learning pipeline, you can use the ML.NET API. The API provides you with direct access to data loaders, transformations, trainers, and prediction components that you can configure as needed to solve your problem.
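To give a feel for those pieces, here is a minimal sketch that wires a data loader, a transformation, a trainer, and a prediction engine together. It assumes the Microsoft.ML NuGet package is installed. The `HouseData` and `Prediction` classes and the tiny in-memory dataset are made up for illustration and are not from the interview.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical input schema: one feature (Size) and one label (Price).
public class HouseData
{
    public float Size { get; set; }
    public float Price { get; set; }
}

// Hypothetical output schema: regression trainers write to the "Score" column.
public class Prediction
{
    [ColumnName("Score")]
    public float Price { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Data loader: build an IDataView from in-memory objects.
        HouseData[] houses =
        {
            new HouseData { Size = 1.1F, Price = 1.2F },
            new HouseData { Size = 1.9F, Price = 2.3F },
            new HouseData { Size = 2.8F, Price = 3.0F },
            new HouseData { Size = 3.4F, Price = 3.7F }
        };
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(houses);

        // Transformation + trainer: gather features, then fit a regression model.
        var pipeline = mlContext.Transforms
            .Concatenate("Features", new[] { "Size" })
            .Append(mlContext.Regression.Trainers.Sdca(
                labelColumnName: "Price",
                maximumNumberOfIterations: 100));

        var model = pipeline.Fit(trainingData);

        // Prediction component: score a single new example.
        var engine = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(model);
        var prediction = engine.Predict(new HouseData { Size = 2.5F });

        Console.WriteLine($"Predicted price for size 2.5: {prediction.Price}");
    }
}
```

Each stage can be swapped independently, for example by loading from a file instead of memory or by choosing a different trainer, which is where the "full control" comes from.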

One of the nice things is that the ways of using ML.NET are not mutually exclusive. You can start off with the tooling to bootstrap the model training process and from there use the ML.NET API to make further refinements. In the end, it’s all about choice, and depending on your experience with machine learning and preferred workflow, there’s an option for you.

Check out the full interview with Luis at my site.

🌎 Last week in the .NET world
