CfA: Big Data and the History and Philosophy of Science

Keynote Speakers: Pieter Francois (University of Oxford), Rachel Spicer (London School of Economics), Charles Pence (UC Louvain) 

Philosophers and historians of science have long been wary of using individual case studies to evaluate philosophical claims about science. The possibility of cherry-picking examples, or of shoehorning preconceived assumptions about scientific practice into carefully selected ones, has raised serious concerns about whether general claims about the process of scientific change can be fruitfully tested. The aim of the conference is to bring together an interdisciplinary array of scholars from philosophy, history, computer science, AI and deep learning, information science, and the social sciences to discuss the problems and prospects of using various big data approaches in the history and philosophy of science.

With the rise of the digital humanities and the development of a variety of complementary computer-aided techniques (e.g. distant reading, topic modelling, corpus analytics), big data approaches have become more common in several subdisciplines of history and the humanities. In particular, they have been used prominently in two recent projects that will be represented and discussed by our first two keynote speakers: the Database of Religious History and Seshat: Global History Databank. The success and potential demonstrated by these projects suggest that these methods could benefit the history of science as well. While numerous groups are working on digital humanities/HPS projects with new AI-based tools (e.g. Gingras and Guay 2011), outstanding issues must still be addressed before publicly accessible, centralized databases can be developed that provide an up-to-date synthesis of scholarly research for specialists and non-specialists alike.

Such databases raise a range of issues. In particular, many questions concerning the identification, reliable extraction, and pattern analysis of historical data need to be addressed. A few specific examples include:

·  What are the challenges of constructing historical databases? How can we build and justify their ontologies? How are key historical variables selected? 

·  Can deep learning or other AI techniques extract useful data from primary historical texts? Should these tools be used only on primary texts, or on secondary texts as well?

·  Are there limits to what big data approaches can teach us about the history of science? If so, what are they?

·  Can there be a unified vocabulary to identify and define data points across diverse historical episodes? What is the relation between the local vocabularies of actors' categories and those of historians? How can both be captured while avoiding anachronisms?

·  How is the imprecision, incompleteness, and uncertainty of historical data best represented? Is there a substantial difference between inferred and non-inferred historical data? How can differences in historical interpretation best be conceptualized? 

·  Can historical data be used to derive and justify claims about various historical trends and patterns? How can computational techniques detect patterns and test hypotheses concerning, e.g., the co-evolution of theories, methods, values, and practices, or the composition of scientific communities and their dynamics? 

Please submit a 500-word abstract via Google Form by January 15, 2023. Notifications of acceptance will be sent by March 2023. Please note that the conference aims to be held both in person and online (for those participants who cannot make it to Toronto). However, it remains possible that the event will be hosted fully online.

Conference Website 

Organizing Committee 

·  Jamie Shaw (Leibniz Universität Hannover) 

·  Hakob Barseghyan (University of Toronto) 

·  Benjamin Goldberg (University of South Florida) 

·  Gregory Rupik (University of Toronto)