Which should be preferred among Gini impurity and Entropy?
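Gini impurity and entropy (the impurity measure behind information gain) behave very similarly as split criteria, and in practice people often use them interchangeably. Their formulas are given below, where $p_j$ is the fraction of samples in $E$ that belong to class $j$ and $c$ is the number of classes: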
- Gini: $\mathrm{Gini}(E) = 1 - \sum_{j=1}^{c} p_j^2$
- Entropy: $H(E) = -\sum_{j=1}^{c} p_j \log p_j$
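To make the two formulas concrete, here is a minimal Python sketch that evaluates both on a few class distributions. The function names `gini` and `entropy` are illustrative, not taken from any library, and the base-2 logarithm is an assumption, since the formula above does not fix a base.

```python
import math


def gini(probs):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in probs)


def entropy(probs):
    """Entropy with a base-2 logarithm (the base is an assumption here).

    Uses p * log2(1 / p), which equals -p * log2(p).
    Classes with p == 0 are skipped, following the convention 0 * log(0) = 0.
    """
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


# Compare the two measures on an evenly mixed, a skewed, and a pure node.
for probs in ([0.5, 0.5], [0.9, 0.1], [1.0, 0.0]):
    print(probs, "gini =", round(gini(probs), 3), "entropy =", round(entropy(probs), 3))
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why they tend to rank candidate splits similarly.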
Given a choice, I would use Gini impurity, as it doesn't require computing logarithms, which are more expensive to evaluate than the squares that Gini needs. A closed form of its solution can also be found.