Which should be preferred among Gini impurity and Entropy?
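Gini impurity and entropy (the measure behind information gain) behave in much the same way, and people do use them interchangeably as splitting criteria. Below are the formulae of both, where p_j is the proportion of samples in class j and c is the number of classes: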
- Gini: $Gini(E) = 1 - \sum_{j=1}^{c} p_j^2$
- Entropy: $H(E) = -\sum_{j=1}^{c} p_j \log p_j$
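The two formulae above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample class distributions are made up for the example, and a base-2 logarithm is assumed because the formula above does not fix a base.

```python
import math

def gini_impurity(probs):
    """Gini impurity: 1 - sum_j p_j^2."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Entropy: -sum_j p_j * log2(p_j), with 0 * log 0 treated as 0.

    Base 2 is an assumption; the formula in the text leaves the base open.
    """
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical class distributions for a node with two classes.
for probs in [(0.5, 0.5), (0.9, 0.1), (1.0, 0.0)]:
    print(probs, round(gini_impurity(probs), 4), round(entropy(probs), 4))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why they tend to rank candidate splits similarly.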
Given a choice, I would use Gini impurity, as it doesn't require computing logarithms, which cost more than the multiplications and additions Gini needs. Its closed-form solution can also be found.