Statistical models are cool.
That is not particularly appropriate academic language, but I feel that saying “multivariate regression and other hyper-planar statistical models are the subject of significant interest among industrial and academic practitioners” does not fully capture how cool stats can be.
For the field of advanced statistics (and Machine Learning, AI, and neural networks) this is a blessing and a curse. As researchers, we like working on the mechanical aspects of stats, sometimes to the exclusion of thinking about how useful or appropriate our models might actually be.
This may have led to the current situation where, even as Machine Learning becomes more common in consumer applications and its implementation is supported by a wide variety of technologies, there are still significant challenges to transparency and adoption. We, the excited engineers, are solving the cool tech problems before the messy design and psychology ones.
I advocate for a mindset change: we need to stop thinking of statistical models as a tool we use to create our applications and start thinking of them as a fundamental part of the application itself. To use an analogy, a statistical model is not the hammer we use to build the house but a nail that is part of the house. This might feel like a distinction without a difference but, as we build applications in the future, AI is going to become part of the design conversations, not just the engineering conversations.
The next stage in the adoption of Machine Learning in systems will be about how we best design to create meaningful, useful, and fair systems. We will need to start having conversations where AI and ML are part of the design fabric and the design language of the application. This will include very different and difficult conversations in co-design workshops, where we let users guide how, where, and when we use statistical models.
So, as Healthcare HCI researchers, where do we start? There is significant research in the field of intelligent user interfaces, but for health products we may need to look at a more fundamental level of design. We should design to ensure that the inclusion and format of statistical decisions are appropriate and are not doing harm or disadvantaging people. This is broadly referred to in most of the literature as Fairness.
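To make Fairness slightly more concrete: one of the simplest formal notions is demographic parity, which asks whether a model produces positive outcomes at the same rate across groups. Here is a minimal sketch in Python; the helper name and the data are entirely made up for illustration, and real fairness audits involve many more definitions and trade-offs (see the readings below).

```python
def demographic_parity_difference(predictions, groups):
    """Absolute difference in positive-outcome rates between the
    two groups present in `groups`. predictions are 0/1 labels."""
    rates = {}
    for g in set(groups):
        members = [p for p, gr in zip(predictions, groups) if gr == g]
        rates[g] = sum(members) / len(members)
    a, b = rates.values()
    return abs(a - b)

# Hypothetical model outputs (1 = approved) for two groups, "A" and "B".
preds = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(demographic_parity_difference(preds, groups))  # 0.75 - 0.25 = 0.5
```

A gap of 0.5 here would mean group A is approved at a rate 50 percentage points higher than group B; whether that gap is acceptable, and whether demographic parity is even the right criterion, is exactly the kind of question that belongs in the design conversation rather than the engineering one.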
There are a few readings I can recommend –
Madaio, M. A., Stark, L., Wortman Vaughan, J., & Wallach, H. (2020). Co-Designing Checklists to Understand Organizational Challenges and Opportunities around Fairness in AI. Retrieved from https://doi.org/10.1145/3313831.3376445
– Read this paper if you want an incredibly clear, industrially relevant, step-by-step guide to the inclusive design of AI.
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2019). A Survey on Bias and Fairness in Machine Learning. Retrieved from http://arxiv.org/abs/1908.09635
– Read this paper if you want a breadth of understanding of why and how issues arise when a statistical model is deployed without thought for the ecosystem of the application.
Mitchell, S., Potash, E., Barocas, S., D’Amour, A., & Lum, K. (2018). Prediction-Based Decisions and Fairness: A Catalogue of Choices, Assumptions, and Definitions. 1–22. Retrieved from http://arxiv.org/abs/1811.07867
– Read this if you want a breakdown of the implicit and explicit decisions that can go into the creation of a statistical model.