Solved: How to reduce RMSE (Root Mean Squared Error) value f...


How to reduce the RMSE (Root Mean Squared Error) value for linear regression in machine learning?

Manus (Super Collaborator)

Created ‎10-17-2016 09:39 AM

Hi Guys,

I am new to machine learning. I have a dataset of clinical trials that contains both textual and numerical data (I have converted all the textual features to numeric ones using DictVectorizer in Python).

I have attached the dataset CSV file and the Jupyter Python notebook. Please check them.

If you want a description of the dataset, please see the link below. I used the public data from the clinicaltrials.gov website.

https://clinicaltrials.gov/ct2/about-studies/glossary

Problem statement: The dataset contains an "ENROLLMENT" column (the number of participants required for a clinical study), and I want my algorithm to predict "ENROLLMENT" based on the training data.

Before opening the attachments, please rename ct_gov_results from .txt to .csv and temporary_notebook from .txt to .ipynb.

Issue: I am getting an RMSE of around 3000, which is not a good value. As far as I know, its value must be between 0 and 1.

I don't understand how to reduce it so that my algorithm works well on my data.

Please respond; your reply will be very valuable to me.

Thanks in advance.

Attachments: ct-gov-results.txt, temporary-notebook.txt

Accepted solution

mrizvi (Super Collaborator)

Created ‎11-08-2016 10:34 PM

@Manoj Dhake, it depends on the dependent variable. RMSE is expressed in the same unit as the dependent variable, so there is no requirement that it fall between 0 and 1. If your target ranges from 0 to 100000, an RMSE of 3000 is small (about 3% of the range), but if the target ranges from 0 to 1, it is huge. Try different sets of input variables and compare the resulting RMSE values. On the same data, the smaller the RMSE, the better the model.
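To make the point about units concrete, here is a minimal sketch in plain Python. The enrollment numbers are made up for illustration and are not taken from the attached dataset, and `range_normalized_rmse` is a hypothetical helper name, not a library function. It shows that the same predictions give an RMSE of 3000 on a 0 to 100000 target and roughly 0.03 once the target is rescaled to 0 to 1.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def range_normalized_rmse(y_true, y_pred):
    """RMSE divided by the observed target range (max - min)."""
    target_range = max(y_true) - min(y_true)
    if target_range == 0:
        raise ValueError("target has zero range")
    return rmse(y_true, y_pred) / target_range

# Hypothetical enrollment counts and predictions, for illustration only.
enrollment = [0, 20000, 50000, 80000, 100000]
predicted = [3000, 17000, 53000, 77000, 103000]

# Every prediction is off by exactly 3000 participants.
print(rmse(enrollment, predicted))                   # 3000.0
print(range_normalized_rmse(enrollment, predicted))  # 0.03

# Rescale the same target and predictions to the 0-to-1 range.
scaled_true = [v / 100000 for v in enrollment]
scaled_pred = [v / 100000 for v in predicted]
print(rmse(scaled_true, scaled_pred))                # approximately 0.03
```

Dividing by the target range is only one possible normalization; dividing by the mean or standard deviation of the target is also common, and the choice should be stated whenever a normalized value is reported.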

Also, compare the RMSE on the training data with the RMSE on the testing data. If they are similar, the model generalizes well. If the RMSE for the testing data is much higher than that for the training data, you have likely overfit the data.
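The train/test comparison could be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: the data is synthetic, there is a single input feature, and `fit_line` is a hypothetical helper that fits ordinary least squares by hand so the example needs only the standard library.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Synthetic data, for illustration only: y = 50*x + 200 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 100) for _ in range(200)]
ys = [50 * x + 200 + rng.gauss(0, 300) for x in xs]

# Hold out the last 25% of rows as test data (rows are already in random order).
split = int(0.75 * len(xs))
x_train, x_test = xs[:split], xs[split:]
y_train, y_test = ys[:split], ys[split:]

# Fit on the training rows only, then score both splits with the same model.
slope, intercept = fit_line(x_train, y_train)
train_rmse = rmse(y_train, [slope * x + intercept for x in x_train])
test_rmse = rmse(y_test, [slope * x + intercept for x in x_test])

print(f"train RMSE: {train_rmse:.1f}")
print(f"test RMSE:  {test_rmse:.1f}")
print(f"test/train ratio: {test_rmse / train_rmse:.2f}")
```

Because the synthetic data really is linear, the two values should come out close to each other and close to the noise level. A test RMSE far above the training RMSE on real data would be the overfitting signal described above.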

Brajesh (New Member)

Created ‎02-24-2021 06:50 AM

"If your data has a range of 0 to 100000 then RMSE value of 3000 is small, but if the range goes from 0 to 1." What does a range going from 0 to 1 mean?
