4 Matching Annotations
 Jul 2019

livebook.datascienceheroes.com

However, the gain ratio is the most important metric here; it ranges from 0 to 1, with higher values being better.

en: entropy, measured in bits
mi: mutual information
ig: information gain
gr: gain ratio
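The four metrics above can be sketched in Python. This is a minimal illustration using the standard definitions (information gain for discrete variables equals mutual information, and gain ratio divides it by the feature's own entropy, which bounds it in [0, 1]); the function names are illustrative, not the library's API:

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy in bits of a discrete sequence (en)."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits (mi; equals ig here)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def gain_ratio(feature, target):
    """Information gain normalised by the feature's entropy (gr),
    bounded in [0, 1]."""
    h = entropy(feature)
    return mutual_information(feature, target) / h if h > 0 else 0.0

feature = ["a", "a", "b", "b", "c", "c"]
target  = [ 0,   0,   1,   1,   0,   1 ]
print(round(gain_ratio(feature, target), 3))  # 0.421
```

The normalisation by the feature's entropy is what keeps the gain ratio comparable across features with different numbers of categories, unlike raw information gain.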


rdrr.io

Feature predictive power is calculated for all features in a dataset along with the outcome feature. It works for binary classification, multiclass classification, and regression problems, and can also be used when exploring a feature of interest to determine correlations of independent features with the outcome feature. When the outcome feature is continuous in nature (a regression problem), correlation calculations are performed. When the outcome feature is categorical in nature (a classification problem), the Kolmogorov-Smirnov distance is used to measure predictive power. For multiclass classification outcomes, a one-vs-all approach is taken and the results are averaged to arrive at a mean KS distance. The predictive power is sensitive to how the data has been prepared and will differ if the data preparation changes.
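The one-vs-all KS approach described above can be sketched as follows. This is an assumption-laden illustration, not the package's implementation: it takes a numeric feature, computes the two-sample KS statistic (the largest gap between empirical CDFs) for each class against the rest, and averages the results:

```python
import bisect

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap
    between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_xs, x):
        # Fraction of values <= x.
        return bisect.bisect_right(sorted_xs, x) / len(sorted_xs)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in sorted(set(a + b)))

def predictive_power(feature_values, class_labels):
    """Mean one-vs-rest KS distance of a numeric feature against a
    (possibly multiclass) outcome. Names are illustrative."""
    scores = []
    for c in sorted(set(class_labels)):
        in_class = [v for v, y in zip(feature_values, class_labels) if y == c]
        rest     = [v for v, y in zip(feature_values, class_labels) if y != c]
        scores.append(ks_distance(in_class, rest))
    return sum(scores) / len(scores)

# A feature that separates the classes perfectly scores 1.0.
print(predictive_power([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))  # 1.0
```

A KS distance near 0 means the feature's distribution is nearly identical across classes (no predictive power); near 1 means the class-conditional distributions barely overlap.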


www.scholarpedia.org

Mutual information is one of many quantities that measure how much one random variable tells us about another. It is a dimensionless quantity, generally expressed in units of bits, and can be thought of as the reduction in uncertainty about one random variable given knowledge of another. High mutual information indicates a large reduction in uncertainty; low mutual information indicates a small reduction; and zero mutual information between two random variables means the variables are independent.
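The two extremes described above can be checked numerically. A small sketch using the identity I(X;Y) = H(X) + H(Y) - H(X,Y): independent variables give 0 bits, while a variable paired with an exact copy of itself gives back its full entropy:

```python
from collections import Counter
from math import log2

def entropy(xs):
    """Shannon entropy in bits of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Independent: every (x, y) combination is equally likely.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0 bits
# Fully dependent: y is a copy of x, so knowing y removes
# all uncertainty about x.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0 bit
```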
