Abstract
1. Introduction
2. The future shape of the human-artificial intelligence nexus and its environmental costs
3. The environmental costs of machine learning research
4. The environmental costs of gratuitous generalisation capabilities
5. Conclusion
Acknowledgment
References
Abstract
Environmental costs and energy constraints have become emerging issues for the future development of Machine Learning (ML) and Artificial Intelligence (AI). So far, the discussion of the environmental impacts of ML/AI has lacked a perspective reaching beyond quantitative measurements of the energy-related research costs. Building on the foundations laid down by Schwartz et al. (2019) in the GreenAI initiative, our argument considers two interlinked phenomena: the gratuitous generalisation capability and a future in which ML/AI performs the majority of quantifiable inductive inferences. Gratuitous generalisation capability refers to a discrepancy between the cognitive demands of a task and the performance (accuracy) of the ML/AI model used to accomplish it. If the latter exceeds the former because the model was optimised to achieve the best possible accuracy, the model becomes inefficient and its operation harmful to the environment. The future dominated by non-anthropic induction describes a use of ML/AI so all-pervasive that most inductive inferences come to be furnished by ML/AI generalisations. The paper argues that the present debate deserves to be expanded so as to connect the environmental costs of research and of ineffective ML/AI uses (the issue of gratuitous generalisation capability) with a (near) future marked by an all-pervasive Human-Artificial Intelligence Nexus.
1. Introduction
Conceived as a scholarly discipline, ML seeks to develop ‘tools-for-optimal-action’. Given a task and evidence that can facilitate its mastery, a ‘tool-for-optimal-action’ earns its rank by being able to generalise: it supports inferences that reach beyond the evidence (training data), i.e. it can run inferences on new samples, provided that these come from the same, or a sufficiently similar, probability distribution as the evidence. The discipline has a twofold epistemic aim. First, its theoretical purview, established by statistical learning theory, involves formal assumptions about the learning that leads to generalisations (Kawaguchi, Kaelbling, & Bengio, 2019; Vapnik, 1995). Second, from the empirical viewpoint, the discipline seeks to improve the accuracy of the inferences furnished by the acquired generalisations. At the moment, cross-fertilisation between the two sub-goals appears rather recalcitrant, creating the following asymmetry: although the field invests heavily in theoretical research (cf. Arjovsky, Bottou, Gulrajani, & Lopez-Paz, 2019; Bartlett, Foster, & Telgarsky, 2017; Kawaguchi et al., 2019; Neyshabur, Bhojanapalli, McAllester, & Srebro, 2017; Zhang, Bengio, Hardt, Recht, & Vinyals, 2017), it remains dominated by the second sub-goal. As strong empirical results outpaced mature theoretical understanding, task-specific, accuracy-tracking leaderboards became the go-to measure for assessing the field’s epistemic progress.
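To make the empirical sub-goal concrete, consider the following minimal sketch (ours, not drawn from any cited work; the data and model choices are purely illustrative). A classifier is fitted to synthetic ‘evidence’ and then scored on held-out samples drawn from the same distribution; the resulting single accuracy number is exactly the kind of quantity that leaderboards track.

# Minimal sketch of the empirical sub-goal: a model counts as a
# 'tool-for-optimal-action' insofar as it generalises, i.e. scores well
# on samples it never saw during training, drawn from the same
# distribution as the evidence. (Illustrative code, not from the paper.)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic 'evidence': features X and labels y from one fixed distribution.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# The held-out test set stands in for 'new samples' coming from the same
# probability distribution as the training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Leaderboard-style single objective: accuracy on unseen samples.
print(f"held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")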
The emphasis put on a single objective – accuracy – inspires the naïve idea that leaderboards are like ladders: the higher the rung, the closer we are to alleviating our cognitive burden by employing almost perfect ‘tools-for-optimal-action’ to carry out all sorts of tasks. Yet if we construe the epistemic aim of ML as understanding generalisations, incentivising leaderboard climbing creates trouble for the discipline itself and, quite strikingly, for the environment as well. The issue concerns the cost of the computational resources that enable climbing to ever higher positions on the leaderboards. Where theoretical understanding lags behind empirical results, a new state of the art (SOTA) usually emerges from trial-and-error experimentation that often produces quite arbitrary heuristics. Faced with this theoretical lacuna, practitioners confront the temptation of post-hoc speculations that assume the role usually played by theoretical explanations (cf. Lipton & Steinhardt, 2019). When such speculations occur alongside (accidental) misattributions of the sources of empirical gains, e.g. reporting improvements as stemming from neural architecture changes when, in reality, they stem from hyperparameter tuning (ibid.), the following might ensue: instead of advancing the epistemic aim of understanding generalisations, leaderboards merely encourage post-hoc hypotheses fitted to the results of quite arbitrary heuristics. The incredibly fast pace of leaderboard climbing (natural language processing (NLP) is among the best current examples; cf. Strubell, Ganesh, & McCallum, 2019 for an estimate of SOTA NLP’s environmental costs) makes such bad practices a siren song that could considerably hamper the discipline’s twofold epistemic aim.
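As a hedged illustration of the misattribution worry (a sketch under our own assumptions, not a protocol taken from Lipton and Steinhardt), the snippet below gives a baseline and a ‘new’ architecture the same hyperparameter tuning budget before comparing them. If the tuned baseline matches the tuned two-layer model on held-out data, the reported gain would stem from tuning rather than from the architecture change.

# Illustrative control for misattributed gains: tune both architectures
# with an identical hyperparameter budget, then compare held-out scores.
# (Hypothetical example; models and grids chosen only for demonstration.)
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Identical tuning grid (and hence identical budget) for both architectures.
grid = {"alpha": [1e-4, 1e-3, 1e-2], "learning_rate_init": [1e-3, 1e-2]}

for name, hidden in [("baseline (1 hidden layer)", (64,)),
                     ("'new' architecture (2 hidden layers)", (64, 64))]:
    search = GridSearchCV(
        MLPClassifier(hidden_layer_sizes=hidden, max_iter=500, random_state=0),
        grid, cv=3,
    )
    search.fit(X_tr, y_tr)
    # Only if the gap survives equal tuning may it be credited to architecture.
    print(f"{name}: tuned held-out accuracy = {search.score(X_te, y_te):.3f}")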