Microsoft Azure learning for starters – Linear regression with a good Coefficient of determination

In this blog, we take a detailed look at Microsoft Azure for a cloud starter, especially in the area of machine learning. Azure is the cloud platform from Microsoft, and we found it awesome. Azure gives new users a one-month trial with a credit of $200 / INR 13,300 to try the cloud platform, and our review is based on our one month of use of the trial.
What you need before you sign up for the one-month trial
1. A Microsoft ID
2. A working credit card
How to sign up?
Go to https://azure.microsoft.com/en-in/free/ and click Start free. To log in after this, use the URL portal.azure.com and sign in with your login.
What is Azure Machine learning studio?
You have multiple options to start using Azure for machine learning. You may:
1. Create your own virtual machine on Azure and install packages via SSH or Windows remote access
2. Use a marketplace image, which has a data science module built into some OS versions
3. Explore via Azure Machine Learning Studio, which is the part we talk about in this blog
Azure Machine Learning Studio, as per the Microsoft documentation, is:

“Microsoft Azure Machine Learning Studio is a collaborative, drag-and-drop tool you can use to build, test, and deploy predictive analytics solutions on your data. Machine Learning Studio publishes models as web services that can easily be consumed by custom apps or BI tools such as Excel. Machine Learning Studio is where data science, predictive analytics, cloud resources, and your data meet.”
How to access Azure Machine learning studio?
ML Studio provides you with an interactive visual workspace to easily build, test and experiment with an analysis model.
You create an ML workspace, which is like a folder or repository for your work. Do this by logging in to Azure. Click All resources and click the Add (+) icon. Enter Machine Learning Studio Workspace as shown below.

Provide a name which you can identify easily. We named ours MieRobot-ML-Workspace1.
On the next screen you will see three options. Select the option "Launch Machine Learning Studio".
You can also click the workspace you have created and click the OPEN IN STUDIO option.
If you are asked to sign in again, please do so.
This is what ML Studio looks like once a workspace has been created.

Inside ML Studio you will see the tabs below, which I have tried to explain:
Projects: This is the tab to create new projects. We will not use this tab in this blog as we are just starting. We will use the Experiments tab.
Experiments: This is the place where you can create, save and run your machine learning work. We will start with an experiment.
Web services: List of models that you have published as a web service.
Datasets: Dataset collections provided or uploaded by you.
Trained models: Models which you have trained and saved.
An experiment is an analysis which you perform for your machine learning. Let us now start an experiment.
What is our first experiment in Azure Machine learning?
That’s simple. Like all machine learning starters, we will use a linear regression to predict a value using an automotive dataset.
1. Click on the Saved Datasets drop-down, select the Automotive dataset and drag it to the right onto the experiment canvas. This is shown below.

2. Double-click on the experiment name. You can see this field at the top, which reads "Experiment created on <date>". The name we chose for our experiment was MieRobot_Linear Regression_Auto.
3. Right-click on the dataset and click dataset -> Visualise. You will see something like the figure below. If you scroll to the right you will see the target value Y, which is price. If you are new to linear regression, read this blog on linear regression at: https://www.mierobot.com/single-post/Easy-Linear-Regression

We also note that the dataset has 205 rows and 26 columns. Also note that the data has a labelled column header, which is good for us.
4. Now on your left, under Statistical Functions, select the Summarize Data option. This gives us a quick view of various commonly used statistical values and features. Right-click on results dataset -> Visualise and you will see something like the figure below. Please note that the row and column counts are different here, because the summary shows one row per feature instead of the data rows you see when you visualise the dataset itself.

The key fields to note are the column mean for Price, which gives an idea of the prediction, and the number of missing values. You can check the missing values and decide whether to replace them with mean or median values for a better predictive model. We skip that for now, as there are also string values, and mean will throw an error if used directly on them. So to be quick we can delete the rows or use Probabilistic PCA.
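The summary values and the missing-value choices described above can be sketched in plain Python. This is only an illustration on a made-up four-row table: the column names, the numbers and the helper functions are hypothetical and are not taken from the Automotive dataset or from Studio. It shows mean replacement and row removal, not the Probabilistic PCA mode.

```python
from statistics import mean, median

# Hypothetical toy rows; None marks a missing value.
rows = [
    {"make": "audi", "horsepower": 102.0, "price": 13950.0},
    {"make": "bmw", "horsepower": None, "price": 16430.0},
    {"make": "mazda", "horsepower": 68.0, "price": None},
    {"make": "volvo", "horsepower": 114.0, "price": 12940.0},
]

def summarize(rows, column):
    """Return count, missing count, mean and median for one numeric column."""
    present = [r[column] for r in rows if r[column] is not None]
    return {
        "count": len(present),
        "missing": len(rows) - len(present),
        "mean": mean(present),
        "median": median(present),
    }

def fill_with_mean(rows, column):
    """Return copies of the rows with missing values replaced by the column mean."""
    col_mean = mean(r[column] for r in rows if r[column] is not None)
    return [
        {**r, column: col_mean if r[column] is None else r[column]}
        for r in rows
    ]

def drop_missing(rows):
    """Return only the rows that have no missing value in any column."""
    return [r for r in rows if all(v is not None for v in r.values())]

price_summary = summarize(rows, "price")
cleaned = fill_with_mean(fill_with_mean(rows, "horsepower"), "price")
complete_only = drop_missing(rows)
```

Note that `summarize(rows, "make")` would fail, because a mean cannot be taken over text values. That is the same problem with string columns mentioned above.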
5. To handle the missing values, select the Clean Missing Data module. Click it and set the cleaning mode to Probabilistic PCA as shown in the figure. You can also use the option to remove the rows instead. This is shown below.

6. (Optional step) If you run the descriptive summary again and check the missing-value columns, the missing count for every field will be 0, as shown below.

7. We now need to focus on the columns we need. So get the Select Columns in Dataset module. Then click Launch column selector on the right. We choose only the columns we need by using the Include option as shown below.

8. You can now see that our columns have reduced from 26 to 8 as shown below. Voila, that's great.

9. We then add the Linear Regression module and use the settings shown below.

10. We add the Split module, and your experiment should by now look like this.

11. To configure the Split module, we set the splitting mode to Split Rows. Do not change any other parameters unless you know what you are doing.
12. We are almost done, and now pull the Train Model module from the left. In the column selector we need to pick the column to be predicted, which is Price here. This is shown below.

13. The next step is to score and evaluate the model. Drag the Score Model and Evaluate Model modules from the left and complete the input and output connections.
14. The canvas is now complete and should look like this.
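For readers who think in code, the split, train and score part of the canvas is roughly equivalent to the plain-Python sketch below. It is a simplification under stated assumptions: the data are synthetic, only one feature is used (the experiment above uses several columns), the 70/30 split fraction and the seeds are illustrative choices and not Studio's settings, and all function names are hypothetical.

```python
import random

def split_rows(rows, fraction, seed):
    """Shuffle the rows and split them into two parts (like Split Rows)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]

def train_linear(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def score(model, rows):
    """Return (actual, predicted) pairs for each row (like Score Model)."""
    slope, intercept = model
    return [(y, slope * x + intercept) for x, y in rows]

# Synthetic (horsepower, price) pairs: price rises with horsepower, plus noise.
rng = random.Random(0)
data = [(hp, 100.0 * hp + 2000.0 + rng.gauss(0, 500)) for hp in range(50, 150)]

train, test = split_rows(data, fraction=0.7, seed=42)
model = train_linear(train)   # Train Model: learn from the first part
scored = score(model, test)   # Score Model: predict on the held-out part
```

The point of the split is that the model is scored on rows it never saw during training, which is what the connections between the Split, Train Model and Score Model modules express on the canvas.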

Let us now test the model.
a) Right-click on Score Model and then Scored dataset -> Visualise. You can see the predicted values of Price against the actual values in the last 2 columns.

b) Now do the same for Evaluate Model: right-click on it and click Visualise.
The evaluation result will look like the figure below.

The model we have created is not a great one, but it is a good one, with a coefficient of determination of 0.86. It does the job as a first quick and easy test of Azure Machine Learning. The key to better predictive models would be further data cleansing and fine-tuning of the linear regression parameters.
Please comment: what coefficient of determination did you obtain, and what else did you do with this model?
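For reference, the coefficient of determination compares the model's squared errors with the squared errors you would get by always predicting the mean. Here is a minimal sketch, using made-up numbers rather than the scored dataset from this experiment (the function name and the values are hypothetical):

```python
def coefficient_of_determination(actual, predicted):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted prices.
actual = [10000.0, 12000.0, 14000.0, 16000.0]
predicted = [11000.0, 12000.0, 13000.0, 16000.0]
r2 = coefficient_of_determination(actual, predicted)
```

A value of 1 means the predictions match the actual values exactly, and a value of 0 means the model does no better than predicting the mean price for every car.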

About the Author: Anirban describes himself as a combination of a coach, manager, leader and technologist. Anirban also runs the famous robotics site MieRobot.com, which was voted one of the top 40 robotics blog sites on this planet by Feed Spot. Anirban loves working with youth, and after his numerous corporate assignments with interns and freshers he can give you a run on being patient and cool. Anirban’s technical stack includes Microsoft Azure Machine Learning, Unix, C++, ROS, Python, Microcontroller Programming, Neural Networks, Tensorflow and web services. Anirban is a keen social media engineer and product UX designer. He trains young professionals and students in machine learning, and his offerings are at https://www.dneur.com/machine-learning
