What everyone needs to know about interpretability in machine learning

1. Machine learning systems make predictions based on a set of input features (i.e. a bunch of numbers).

2. Machine learning discovers correlations in data (but does not understand causality).

3. Some models are special, but interpretability is not the norm.

4. The real question is: how was the system created?

Key takeaways:

  • Some models are easy for humans to interpret, but this is the exception more than the rule; in general, we should not assume that all models can be represented in a way that is easy for humans to understand, at least not without some loss of fidelity.
  • In many ways, why we ended up with a particular model is a more important question than how that model works; the answer always comes down to the training data that was used, how that data was represented, and the modeling decisions that were made.
  • When applying machine learning in social domains, it is especially important to think about the training data being used and to ask whether it may be limited or biased in some way; it is much easier to rule out a feature as irrelevant than it is to know whether some critical feature is missing.
  • Finally, remember that the vast majority of supervised machine learning models work by discovering correlations in the data; without further evidence, these correlations should not be interpreted as implying any kind of causal connection between inputs and outputs.
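
To make the last takeaway concrete, here is a minimal sketch of how a strong correlation can appear with no causal link at all. It is a toy simulation, not something taken from a real system: the hidden variable z, the noise level of 0.3, and the function names pearson and simulate are all assumptions made for illustration. In the simulation, z drives both the input x and the output y, while y never depends on x.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0, intervene=False):
    """Draw n (x, y) pairs from a toy world with a hidden confounder z.

    y depends only on z, never on x. When intervene is True, x is set
    independently of z, which mimics an experiment that manipulates x.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # hidden confounder (hypothetical)
        if intervene:
            x = rng.gauss(0, 1)  # x no longer carries information about z
        else:
            x = z + rng.gauss(0, 0.3)  # x is a noisy copy of z
        y = z + rng.gauss(0, 0.3)  # y is caused by z alone
        xs.append(x)
        ys.append(y)
    return xs, ys


# In observational data, x looks like an excellent predictor of y.
observed = pearson(*simulate())

# Once x is set by hand, the relationship disappears.
intervened = pearson(*simulate(intervene=True))

print(f"observed correlation:   {observed:.2f}")
print(f"after intervening on x: {intervened:.2f}")
```

A model trained on the observational data would rely heavily on x and would predict well, yet changing x in the real world would do nothing to y. The training data alone cannot distinguish this situation from one in which x truly causes y.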




Dallas Card

