Now Anyone Can Develop Machine Learning Models Without Coding

Yes, you read that right! If you’re a data scientist, this is a must-read.

Automatic Best Model Selection and training.

I’m writing this post for both technical and non-technical people. So the post will be in plain English [technical terms will be in square brackets], and each technical section will be marked before it starts.

Before I get into how you can bring your idea to life and experiment for free, let's look at the prerequisites.

Prerequisites:

A laptop/computer or even a phone with internet

No need for heavy computers or graphics cards.

No need for programming knowledge or a deep understanding of machine learning [if you have it, that’s an added advantage]

Enough talking, let's get into it.

If you have not guessed it already, it’s the new AutoAI feature offered in IBM Watson Studio on IBM Cloud.

We will cover the how-to, the technical aspects, and a review, not in separate sections but all in one.

What Exactly is AutoAI?

AutoAI is an IBM Cloud tool that searches for the best machine learning (ML) model it can find for a particular dataset. We choose the column to be predicted, whether it holds numbers (int) or text (char).

After data preparation, the first thing is choosing an ML algorithm to train with. And trust me, AutoAI is very good at this. It trains a variety of models on our dataset and gives us the best-performing one.

Most importantly, it tries a wide range of algorithms to get the best prediction. I bet you haven’t worked with all of them, even if you're a senior ML developer with 10 years of experience.

Also, we can get the ML code for all the trained models generated in Python as a notebook.

Example code generated by IBM Watson

[Technical]

It also automatically calculates error measures such as mean squared error, mean absolute error, and more, for all models.
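
For readers who want to see what these error measures compute, here is a minimal sketch in plain Python. The function names and the sample numbers are made up for illustration; this is not the code AutoAI generates.

```python
import math

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the same units as the target."""
    return math.sqrt(mean_squared_error(actual, predicted))

# Made-up values, for illustration only.
actual = [100, 200, 300]
predicted = [110, 190, 330]

print(mean_squared_error(actual, predicted))
print(mean_absolute_error(actual, predicted))
print(root_mean_squared_error(actual, predicted))
```

Lower is better for all three. Squared error punishes the one big miss (300 vs 330) much harder than absolute error does.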

[Technical close]

So you would think that training on all kinds of algorithms would take a few hours, or even a day or two. If so, I’m sorry, you’re wrong. I have run 3 experiments on AutoAI, and each took less than 30 minutes, even with large datasets.

Let's fire up Watson Studio

To play around, you can create a free-tier account on IBM Cloud if you don’t already have one, and upgrade later based on your needs.

After you create an account, go to Catalog, then Watson Studio.

Then you will be taken to this page. Select the nearest location and click Create.

And on the next page just click on “Get Started”

Then you will land on the page below. Fill in your details.

Then you land on a beautiful black-themed Watson Studio.

Create an empty “New project” and give it a name and a description of your choice. Also, don’t forget to add a storage service. All the steps are simple, so I will not be walking through each click. That would be so lame.

Then after all that you will be on this page.

And now the time has come to add data to the cloud. We can just drag and drop the data or browse the PC (it can be an XLSX, a CSV, or any other accepted format).

One more thing: when we upload a dataset, Watson automatically assigns a type to each column. Yeah, pretty cool, right?

OK, now you have loaded a 10 GB file, but you have not cleaned it. Don’t blame yourself, and don’t delete the file. Relax, we can clean it in Watson Studio.

Just click on Data and select Refine. There you can convert types (string to date, or any other), and rename or remove columns.

Before we dive in, let me tell you about my AutoAI experiments. I have done 3 different types:

  1. Better ways to overcome violence against women (women, not Karens) - Dataset
  2. YouTube Video Prediction - Dataset
  3. Mosquito prevention - Collected from many sources, plus some web scraping

For demo purposes, I will mostly use the YouTube video prediction experiment, with the other two here and there, so you can get the hang of it from the screenshots.

Here it has identified trending_date as a string, because it wasn’t in the standard date format that IBM specifies. Just see the option I have selected on the left.
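
If you would rather do the same string-to-date conversion in code, a minimal Python sketch might look like this. The year.day.month layout is only an assumption about how trending_date is stored, and the function name is made up; check your own file and adjust the format string.

```python
from datetime import datetime

# Assumed layout: two-digit year, day, month, separated by dots (e.g. "17.14.11").
# This is a guess for illustration; change it to match your data.
ASSUMED_FORMAT = "%y.%d.%m"

def parse_trending_date(text, fmt=ASSUMED_FORMAT):
    """Convert a date stored as text into a real date object."""
    return datetime.strptime(text, fmt).date()

print(parse_trending_date("17.14.11"))  # 2017-11-14
```

A value that does not match the format raises a ValueError, which is a handy way to spot dirty rows.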

Also, we can do some visualization. Here it shows that music category videos have more views than any other category. NO OFFENSE INTENDED. We love PewDiePie💖

And it doesn’t stop with old-school bar charts. There are 33 more to have fun with, and seriously, I have never seen them ALL in one place before.

Here are some more operations we can perform, and there are many more.

OK, so now that the data is loaded, let's get our AutoAI in place. Click on “Add to project” and then add an AutoAI experiment.

As usual, give your experiment a name and a description, associate an ML service if required, then load in the dataset. For detailed video instructions, you can watch the official IBM AutoAI tutorial.

After all that, we have to select a variable to predict. In my case, it’s views.

Since views is an integer column, this calls for regression models, and AutoAI picks that automatically. Then we run the experiment.

This is where the magic happens

After we click RUN, we get a Relationship map and a Progress map.

The Relationship Map
The Progress Map

As the progress map says, it starts by reading the dataset, then splits it into train and test sets (you can change the split if you want to), preprocesses it, and selects the models to test.
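
The train/test split step can be sketched in a few lines of Python. The 10% test fraction and the fixed seed below are illustrative assumptions, not AutoAI's actual settings.

```python
import random

def train_test_split(rows, test_fraction=0.1, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))  # stand-in for 100 dataset rows
train, test = train_test_split(rows)
print(len(train), len(test))  # 90 10
```

The model only ever learns from the training rows; the held-out rows are what the error measures are later computed on.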

Also, it does feature engineering [Feature engineering is the process of using domain knowledge to extract features from raw data. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.]
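
To make the idea concrete, here is a small hand-written feature engineering sketch in Python. The column names (likes, dislikes, title) and the derived features are hypothetical examples, not the transformations AutoAI applies.

```python
import math

def add_features(row):
    """Return a copy of the row with a few derived columns added."""
    enriched = dict(row)
    total_votes = row["likes"] + row["dislikes"]
    # Ratio feature: share of votes that are likes (0.0 when there are no votes).
    enriched["like_ratio"] = row["likes"] / total_votes if total_votes else 0.0
    # Log transform: compresses very large counts into a smaller range.
    enriched["log_likes"] = math.log1p(row["likes"])
    # Text-derived feature: number of characters in the title.
    enriched["title_length"] = len(row["title"])
    return enriched

row = {"likes": 30, "dislikes": 10, "title": "My first video"}
print(add_features(row))
```

The raw columns stay untouched; the model simply gets extra columns that may carry the signal in a more useful form.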

It also does hyperparameter tuning/optimization [selecting the best parameters for training the pre-selected ML models]. More here…
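
Hyperparameter tuning can be pictured as "try several settings, keep the one with the lowest validation error". The sketch below does that with a plain grid search over k for a tiny nearest-neighbour regressor. The model, the data, and the candidate values are all made up for illustration; AutoAI's own optimizer is more sophisticated than this stand-in.

```python
def knn_predict(train, x, k):
    """Predict y for x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def validation_mse(train, valid, k):
    """Mean squared error of the k-NN predictions on the validation points."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in valid) / len(valid)

def grid_search(train, valid, candidate_ks):
    """Score every candidate k and return the one with the lowest validation error."""
    scores = {k: validation_mse(train, valid, k) for k in candidate_ks}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Made-up toy data: y is roughly 2 * x.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1), (6, 12.0)]
valid = [(2.5, 5.0), (4.5, 9.0)]

best_k, scores = grid_search(train, valid, candidate_ks=[1, 2, 3])
print(best_k, scores)  # best_k is 2 for this toy data
```

Here k is the hyperparameter: it is not learned from the data, so it has to be chosen by trying values and comparing scores.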

Let me show my case. It chose the Extra Trees regressor and the Random Forest regressor, and it completed in 19 minutes.

And the best model came out to be an Extra Trees regressor with HPO and FE, that is, hyperparameter optimization and feature engineering.

Here is the pipeline comparison (model comparison) of all the models generated.

pipeline comparison/model comparison

Awesome, isn’t it? It chose the models, chose the best parameters, and also calculated error measures such as root mean squared error, mean absolute error, and many more.

AutoAI has more to blow you away

OK, so the model is now trained. What next?

We have two options here: deploy the model on the cloud, or download the source code of the model as a notebook. Yes, the source code. I know you were not expecting that😁. Or we can do both.

Deploying / Saving a model

Let's see what we get with a notebook. After clicking Create, we get this page.

Publishing the notebook

As you can see, we can publish the notebook on GitHub, Gist, or the Catalog, or do all three. If you do all three, I can clearly say that you are crazier than the Joker.

Now the model deployment.

Model Deployment.

Just create a model deployment; we then have to promote it to a deployment space to use it. We can use it either as a web service or as a batch service.

We can use the web service through curl, Java, JavaScript, Python, and Scala. To use any of these we need an API key, so create one.
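
As a rough idea of what the Python route looks like, here is a sketch that builds (but does not send) a scoring request using only the standard library. The payload layout ("input_data" with "fields" and "values") and the bearer-token header are assumptions based on IBM's scoring API as commonly documented, and the URL, token, and column names are placeholders; compare everything against the code snippets shown on your own deployment page.

```python
import json
import urllib.request

def build_scoring_request(scoring_url, bearer_token, fields, rows):
    """Build (but do not send) a POST request for a deployed model."""
    # Assumed payload shape: column names in "fields", one list per row in "values".
    payload = {"input_data": [{"fields": fields, "values": rows}]}
    return urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {bearer_token}",
        },
        method="POST",
    )

# Placeholders: copy the real endpoint from your deployment page and
# exchange your API key for a token before calling the service.
request = build_scoring_request(
    scoring_url="https://example.invalid/your-deployment/predictions",
    bearer_token="YOUR_TOKEN",
    fields=["category_id", "likes"],  # hypothetical column names
    rows=[[10, 1500]],
)
print(request.get_method(), request.full_url)
# To actually call the service: urllib.request.urlopen(request)
```

Nothing is sent over the network until urlopen is called, so the sketch is safe to run as is.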

Accessing web services

So this is everything I promised at the beginning. Everything I did here was free of cost. Of course, you have to pay for heavy usage.

So why not explore it yourself?

Bonus

Some info on my other model, “Better ways to overcome violence against women”, which was trained in 18 minutes.

I don't want to bore you with words, so see for yourself.

Experiment Summary
Pipelines generated

If you have made it this far, I have two things for you: a thank you and a cup of coffee☕

Data scientist and a Cloud architect
