Data mining the archives | Opinion – Chemistry World

History, including the history of science, has a narrative tradition. Even if the historian's research has involved a dive into archival material, such as demographic statistics or political budgets, to find quantitative support for a thesis, the stories it tells are best expressed in words, not graphs. Typically, any mathematics it requires would hardly tax an able school student.

But there are some aspects of history that only a sophisticated analysis of quantitative data can reveal. That was made clear in a 2019 study by researchers in Leipzig, Germany,1 who used the Reaxys database of chemical compounds to analyse the growth in the number of substances documented in scientific journals between 1800 and 2015. They found that this number has grown exponentially, with an annual rate of 4.4% on average.
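To give a feel for what a 4.4% average annual rate implies, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that the rate is constant from year to year, which the study does not claim (the figure quoted is an average, and the annual numbers fluctuate). The function names are hypothetical and not taken from the study.

```python
import math

# Average annual growth rate quoted in the text (4.4%).
ANNUAL_RATE = 0.044


def doubling_time(rate):
    """Years needed for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + rate)


def growth_factor(rate, years):
    """Overall multiplication factor after `years` of constant annual growth."""
    return (1 + rate) ** years


if __name__ == "__main__":
    # Doubling time implied by the average rate, under the constant-rate assumption.
    print(f"Doubling time: {doubling_time(ANNUAL_RATE):.1f} years")
    # Idealised cumulative factor over the period covered by the study (1800 to 2015).
    print(f"Growth factor over 215 years: {growth_factor(ANNUAL_RATE, 2015 - 1800):.0f}")
```

Under this idealisation the annual output of new substances doubles roughly every 16 years. The cumulative factor is only indicative, since real growth was interrupted by events such as the world wars.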

And by inspecting the products made, the researchers identified three regimes, which they call proto-organic (before 1861), organic (1861 to 1980) and organometallic (from 1981). Each of these periods is characterised by a change, specifically a progressive decrease, in the variability or volatility of the annual figures.

There's more that can be gleaned from those data, but the key points are twofold. First, while the conclusions might seem retrospectively consistent with what one might expect, only precise quantification, not anecdotal inspection of the literature, could reveal them. It is almost as if all the advances in both theory (the emergence of structural theory and of the quantum description of the chemical bond, say) and in techniques don't matter so much in the end to what chemists make, or at least to their productivity in making. (Perhaps unsurprisingly, the two world wars mattered more to that, albeit transiently.)

Second, chemistry might be uniquely favoured among the sciences for this sort of quantitative study. It is hard to imagine any comparable index to gauge the progress of physics or biology. The expansion of known chemical space is arguably a crude measure of what it is that chemists do and know, but it surely counts for something. And as Guillermo Restrepo, one of the 2019 study's authors and an organiser of a recent meeting at the Max Planck Institute for Mathematics in the Sciences in Leipzig on quantitative approaches to the history of chemistry, says, the existence of such a measure speaks to the unusual ontological stability of chemistry: since John Dalton's atomic theory at the start of the 19th century, it has been consistently predicated on the idea that chemical compounds are combinations of atomic elemental constituents.

Still, there are other ways to mine historical evidence for quantitative insights into the history of science, often now aided by AI techniques. Matteo Valleriani of the Max Planck Institute for the History of Science in Berlin, Germany, and his colleagues have used such methods to compare the texts of printed Renaissance books that used parts of the treatise on astronomy by the 13th-century scholar Johannes de Sacrobosco. The study elucidated how relationships between publishers, and the sheer mechanics of the printing process (where old plates might be reused for convenience), influenced the spread and the nature of scientific knowledge in this period.

And by using computer-assisted linguistic analysis of texts in the Philosophical Transactions of the Royal Society in the 18th and 19th centuries, Stefania Degaetano-Ortlieb of Saarland University in Germany and colleagues have identified the impact of Antoine Lavoisier's new chemical terminology from around the 1790s. This amounts to more than seeing new words appear in the lexicon: the statistics of word frequencies and placings disclose the evolving norms and expectations of the scientific community. At the other end of the historical trajectory, an analysis of the recent chemical literature by Marisol Bermúdez-Montaña of Tecnológico de Monterrey in Mexico reveals the dramatic hegemony of China in the study of rare-earth chemistry since around 2003.

All this work depends on the accessibility of archival data, and it was a common refrain at the meeting that this can't be taken for granted. As historian of science Jeffrey Johnson of Villanova University in Pennsylvania, US, pointed out at the meeting, there is a 'private chemical space' explored by companies who keep their results (including negative findings) proprietary. And researchers studying the history of Russian and Soviet chemistry have, for obvious geopolitical reasons, had to shift their efforts elsewhere, and for who knows how long?

But even seemingly minor changes to archives might matter to historians: Robin Hendry of Durham University in the UK mentioned how the university library's understandable decision to throw out paper copies of old journals that are available online obliterates tell-tale clues for historians of which pages were well-thumbed. The recent cyberattacks on the British Library remind us of the vulnerability of digitised records. We can't take it for granted that the digital age will have the longevity or the information content of the paper age.
