Overall, incorporating counterfactuals improved the models' F1 scores, driven largely by gains in precision. This suggests that counterfactuals improved performance without requiring a significant trade-off between precision and recall.
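Because F1 is the harmonic mean of precision and recall, a gain in precision raises F1 whenever recall stays roughly constant. The sketch below illustrates that relationship. The function name `f1_score` and all numeric values are assumptions chosen for illustration; they are not results from the experiment.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for illustration only; NOT taken from the paper's results.
# Recall is held fixed while precision increases.
baseline = f1_score(precision=0.60, recall=0.80)
with_counterfactuals = f1_score(precision=0.75, recall=0.80)

print(f"baseline F1:             {baseline:.3f}")
print(f"with counterfactuals F1: {with_counterfactuals:.3f}")
```

With recall unchanged, the higher-precision setting yields the higher F1, which matches the pattern described above: an F1 improvement that comes from precision without a corresponding loss in recall.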
statements that draw general conclusions about humans, computers, and/or human-computer interaction based on the results of the specific experiment done in the paper.