Golang Machine Learning Libraries


If there’s one thing that everyone (yes, including non-CS people) has hyped incessantly over the past few years, it’s Machine Learning. Go is known for handling network requests quickly, but it would be really disappointing if it couldn’t also be used for something as practical as Machine Learning. This article gives you a brief introduction to the applications of Golang in ML: what is possible today, and what is still in the works.

Machine Learning

As a brief overview of ML, if you haven’t heard of it already, the most widely accepted definition comes from Tom Mitchell:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

– Machine Learning (1997), Tom Mitchell

ML is a method of teaching a computer to learn the way humans do: by trial, error, and updating its “understanding” of the solution.

The work behind Machine Learning is nothing new. It goes back to Andrey Markov, who analyzed the letter sequences of a poem as an application of what are now called Markov chains. Scientists went on to create better algorithms throughout the 1970s and 1980s, and it all culminated in 2016, when a program named AlphaGo beat professional players at the game of Go (not related to Golang).

Name             Score
AlphaGo Fan      5:0 against Fan Hui
AlphaGo Lee      4:1 against Lee Sedol
AlphaGo Master   60:0 against professional players online
AlphaGo Zero     100:0 against AlphaGo Lee
                 89:11 against AlphaGo Master
Scores of various AlphaGo versions against human players and earlier AlphaGo versions

So, Machine Learning works! And Python, one of the top three programming languages of the past decade, does it quite well. So the question is: can Go match up? I think the answer might surprise you.

Gorgonia and Goro

First, let’s talk about Gorgonia, the closest that Golang has come to Python’s Keras and TensorFlow. Keras is a Python deep learning library that makes building neural networks, the fundamental building blocks of deep learning and pattern recognition, very intuitive.

Gorgonia is a huge project that has accomplished some big feats:

  • Can perform automatic differentiation
  • Can perform symbolic differentiation
  • Can perform gradient descent optimizations
  • Can perform numerical stabilization
  • Provides a number of convenience functions to help create neural networks
  • Is fairly quick (comparable to Theano and TensorFlow’s speed)
  • Supports CUDA/GPGPU computation (OpenCL not yet supported)
  • Will support distributed computing

The backend can be a little difficult to understand, especially for beginners, because it uses a “Computation Graph”. Instead of running each mathematical operation as soon as the code reaches it, the program first builds a graph whose nodes are values and operations and whose edges record which results feed into which operations. The library can then walk this graph to evaluate the expression and, in the reverse direction, to compute gradients.
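To make the idea concrete, here is a minimal sketch of a computation graph with reverse-mode automatic differentiation, written with the standard library only. It is not Gorgonia's API: the names Node, Graph, Input, Add, Mul and Backward are hypothetical, and unlike Gorgonia (which builds the graph first and runs it later) this toy version computes values eagerly while the graph is being built.

```go
package main

import "fmt"

// Node is one vertex of the graph: an input value or the result of an
// operation applied to earlier nodes.
type Node struct {
	Value    float64
	Grad     float64
	backward func() // pushes this node's gradient to its inputs
}

// Graph records nodes in creation order. That order is a valid
// topological order, because a node can only use nodes that already exist.
type Graph struct {
	nodes []*Node
}

// Input adds a leaf node holding a plain value.
func (g *Graph) Input(v float64) *Node {
	n := &Node{Value: v}
	g.nodes = append(g.nodes, n)
	return n
}

// Add adds a node for a + b.
func (g *Graph) Add(a, b *Node) *Node {
	n := &Node{Value: a.Value + b.Value}
	n.backward = func() {
		a.Grad += n.Grad
		b.Grad += n.Grad
	}
	g.nodes = append(g.nodes, n)
	return n
}

// Mul adds a node for a * b.
func (g *Graph) Mul(a, b *Node) *Node {
	n := &Node{Value: a.Value * b.Value}
	n.backward = func() {
		a.Grad += b.Value * n.Grad
		b.Grad += a.Value * n.Grad
	}
	g.nodes = append(g.nodes, n)
	return n
}

// Backward walks the graph in reverse and accumulates d(out)/d(node)
// into every node's Grad field.
func (g *Graph) Backward(out *Node) {
	for _, n := range g.nodes {
		n.Grad = 0
	}
	out.Grad = 1
	for i := len(g.nodes) - 1; i >= 0; i-- {
		if g.nodes[i].backward != nil {
			g.nodes[i].backward()
		}
	}
}

func main() {
	g := &Graph{}
	x := g.Input(3)
	y := g.Input(4)
	z := g.Add(g.Mul(x, y), x) // z = x*y + x
	g.Backward(z)
	fmt.Println(z.Value, x.Grad, y.Grad) // prints: 15 5 3
}
```

Because every operation remembers its inputs, the gradient of z with respect to x (here y + 1 = 5) falls out of a single backward pass, which is the property that makes graph-based libraries convenient for training neural networks.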

The library is easy to pick up and use. Fetch the package with the following command:

go get gorgonia.org/gorgonia

Then import it in your Go source file:

import "gorgonia.org/gorgonia"

Now you’re good to go! The repository also has a handy “examples” subdirectory with helpers for popular datasets like MNIST.

inputs, targets, err := mnist.Load(typ, "./testdata", tensor.Float64)

This would be a good time to mention Goro, a separate project built on Gorgonia. In Goro, it is possible to build a Convolutional Neural Network (CNN) layer by layer with a sequential model:

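import (
    g "gorgonia.org/gorgonia" // provides NewRMSPropSolver, used below

    . "github.com/aunum/goro/pkg/v1/model"
    "github.com/aunum/goro/pkg/v1/layer"
)

// Define x_train and y_train (the training inputs and labels) and
// model_name (a string) before this point.

// Create an empty sequential model.
model, _ := NewSequential(model_name)

// Stack three convolution + pooling stages, then two fully connected layers.
model.AddLayers(
    layer.Conv2D{Input: 1, Output: 32, Width: 3, Height: 3},
    layer.MaxPooling2D{},
    layer.Conv2D{Input: 32, Output: 64, Width: 3, Height: 3},
    layer.MaxPooling2D{},
    layer.Conv2D{Input: 64, Output: 128, Width: 3, Height: 3},
    layer.MaxPooling2D{},
    layer.Flatten{},
    layer.FC{Input: 128 * 3 * 3, Output: 100},
    layer.FC{Input: 100, Output: 10, Activation: layer.Softmax},
)

// Pick an optimizer from Gorgonia.
optimizer := g.NewRMSPropSolver()

// xi and yi stand for the model's input and target placeholders; they are
// not defined in this snippet. m is presumably an alias for the Goro model
// package, which is also dot-imported above.
model.Compile(xi, yi,
    WithOptimizer(optimizer),
    WithLoss(m.CrossEntropy),
    WithBatchSize(100),
)

// Train the model.
model.Fit(x_train, y_train)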

It’s very intuitive, and anyone who has used Keras before should feel at home right away.

MLGo, GoML and GoLearn

Both mlgo and goml aim to extend what the current ML libraries can do by implementing additional algorithms.

While mlgo seems to have mainly focused on developing stronger algorithms for clustering, goml has implemented some handy regression models, including ordinary least squares (linear regression) and softmax regression, trained with batch gradient descent. The same ingredients, softmax outputs and gradient-based training, are also what neural networks such as CNNs use to approach a solution. goml has also implemented a few naive Bayes classification methods. These are still not comparable to what can be achieved in Python, but it’s a start!
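As an illustration of what a least-squares model trained with batch gradient descent does, here is a minimal standard-library sketch. It does not use goml's API; the function name fitLine, the learning rate and the epoch count are assumptions chosen for this toy dataset.

```go
package main

import "fmt"

// fitLine fits y ≈ w*x + b by batch gradient descent on the mean squared
// error. "Batch" means every update uses the gradient over all samples.
func fitLine(xs, ys []float64, lr float64, epochs int) (w, b float64) {
	n := float64(len(xs))
	for e := 0; e < epochs; e++ {
		var gradW, gradB float64
		for i := range xs {
			residual := w*xs[i] + b - ys[i]
			gradW += 2 * residual * xs[i] / n
			gradB += 2 * residual / n
		}
		// Step against the gradient.
		w -= lr * gradW
		b -= lr * gradB
	}
	return w, b
}

func main() {
	// Toy data generated from y = 2x + 1.
	xs := []float64{0, 1, 2, 3, 4}
	ys := []float64{1, 3, 5, 7, 9}

	w, b := fitLine(xs, ys, 0.05, 5000)
	fmt.Printf("w=%.3f b=%.3f\n", w, b) // w approaches 2, b approaches 1
}
```

On noise-free data like this the parameters converge to the exact line; on real data the same loop would settle on the least-squares fit, provided the learning rate is small enough for the updates to remain stable.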

mlgo is also working on several density-based and hierarchical clustering methods, which are essential in unsupervised learning.
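For readers who have not met density-based clustering, the following is a minimal sketch of DBSCAN, one well-known algorithm of that family, using only the standard library. It is not taken from mlgo; the names Point, neighbors and dbscan, as well as the sample points and parameters, are illustrative assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// Point is a 2-D sample.
type Point struct{ X, Y float64 }

const (
	unvisited = 0
	noise     = -1
)

// neighbors returns the indices of all points within eps of points[i],
// including i itself.
func neighbors(points []Point, i int, eps float64) []int {
	var out []int
	for j, q := range points {
		if math.Hypot(points[i].X-q.X, points[i].Y-q.Y) <= eps {
			out = append(out, j)
		}
	}
	return out
}

// dbscan labels every point with a cluster id (1, 2, ...) or noise (-1).
// A point is a core point if it has at least minPts neighbours within eps.
func dbscan(points []Point, eps float64, minPts int) []int {
	labels := make([]int, len(points)) // zero value means unvisited
	cluster := 0
	for i := range points {
		if labels[i] != unvisited {
			continue
		}
		seeds := neighbors(points, i, eps)
		if len(seeds) < minPts {
			labels[i] = noise // may later become a border point
			continue
		}
		cluster++
		labels[i] = cluster
		// seeds grows while we iterate, so the cluster keeps expanding.
		for k := 0; k < len(seeds); k++ {
			j := seeds[k]
			if labels[j] == noise {
				labels[j] = cluster // border point reachable from a core point
			}
			if labels[j] != unvisited {
				continue
			}
			labels[j] = cluster
			if nb := neighbors(points, j, eps); len(nb) >= minPts {
				seeds = append(seeds, nb...)
			}
		}
	}
	return labels
}

func main() {
	points := []Point{
		{0, 0}, {0, 1}, {1, 0}, {1, 1}, // dense group
		{10, 10}, {10, 11}, {11, 10}, // second dense group
		{50, 50}, // isolated point
	}
	fmt.Println(dbscan(points, 1.5, 3)) // prints: [1 1 1 1 2 2 2 -1]
}
```

Unlike k-means, this approach does not need the number of clusters up front and it can flag outliers as noise, which is part of why density-based methods matter for unsupervised learning.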

GoLearn, on the other hand, takes a simpler approach: it provides ready-made estimators that you instantiate directly in a .go file and then train and query through Fit and Predict methods. They are more restrictive in the parameters they expose, but perhaps easier for a beginner to understand. To show how easy it is to work with GoLearn, look at the code below.

package main

import (
	"fmt"

	"github.com/sjwhitworth/golearn/base"
	"github.com/sjwhitworth/golearn/evaluation"
	"github.com/sjwhitworth/golearn/knn"
)

func main() {
	// Load in a dataset. The second argument says whether the file has a
	// header row; when it is true, the header attributes are stored.
	// Think of instances as a Data Frame structure in R or Pandas.
	// You can also create instances from scratch.
	rawData, err := base.ParseCSVToInstances("datasets/iris.csv", false)
	if err != nil {
		panic(err)
	}

	// Print a pleasant summary of your data.
	fmt.Println(rawData)

	// Initialise a new KNN classifier: Euclidean distance, linear search, k = 2
	cls := knn.NewKnnClassifier("euclidean", "linear", 2)

	// Do a 50/50 training-test split
	trainData, testData := base.InstancesTrainTestSplit(rawData, 0.50)
	cls.Fit(trainData)

	// Calculate the Euclidean distance to the training points and return
	// the most popular label among the nearest neighbours
	predictions, err := cls.Predict(testData)
	if err != nil {
		panic(err)
	}

	// Prints precision/recall metrics
	confusionMat, err := evaluation.GetConfusionMatrix(testData, predictions)
	if err != nil {
		panic(fmt.Sprintf("Unable to get confusion matrix: %s", err.Error()))
	}
	fmt.Println(evaluation.GetSummary(confusionMat))
}
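To show what the classifier above is doing under the hood, here is a minimal standard-library sketch of k-nearest-neighbours with Euclidean distance and majority voting. It is not GoLearn's implementation; the names Sample, euclidean and predict are hypothetical, and the measurements are made-up values in the style of the iris dataset.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Sample is one labelled training example.
type Sample struct {
	Features []float64
	Label    string
}

// euclidean returns the straight-line distance between two feature vectors
// of equal length.
func euclidean(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// predict returns the most common label among the k training samples
// nearest to query. On a tie, the label that reached the top count first
// while scanning from nearest to farthest wins.
func predict(train []Sample, query []float64, k int) string {
	sorted := make([]Sample, len(train))
	copy(sorted, train) // leave the caller's slice untouched
	sort.Slice(sorted, func(i, j int) bool {
		return euclidean(sorted[i].Features, query) < euclidean(sorted[j].Features, query)
	})
	if k > len(sorted) {
		k = len(sorted)
	}
	votes := map[string]int{}
	best, bestCount := "", 0
	for _, s := range sorted[:k] {
		votes[s.Label]++
		if votes[s.Label] > bestCount {
			best, bestCount = s.Label, votes[s.Label]
		}
	}
	return best
}

func main() {
	// Made-up (petal length, petal width) pairs for illustration only.
	train := []Sample{
		{[]float64{1.4, 0.2}, "setosa"},
		{[]float64{1.3, 0.2}, "setosa"},
		{[]float64{1.5, 0.2}, "setosa"},
		{[]float64{4.7, 1.4}, "versicolor"},
		{[]float64{4.5, 1.5}, "versicolor"},
		{[]float64{4.9, 1.5}, "versicolor"},
	}
	fmt.Println(predict(train, []float64{1.6, 0.3}, 3)) // prints: setosa
	fmt.Println(predict(train, []float64{4.6, 1.4}, 3)) // prints: versicolor
}
```

There is no training step beyond storing the samples, which is why Fit is cheap for this kind of model and all the work happens at prediction time.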

There are many other interesting libraries by developers around the world who are building up a community for Go and making it user-friendly while staying true to the vision of scale that Golang was built for. These libraries are open source, so you can join their communities and become a contributor.

So Go on!

A great discussion on the future of ML in Golang: https://www.reddit.com/r/golang/comments/anbjin/is_there_a_future_in_machine_learning_in_go/
