Dawn Nafus

This chapter contributes to debates about the social life of methods by describing the co-evolution of a research practice and infrastructure. It offers a straightforward description of an ethnographic practice in which I collect sensor data, such as heart rate or ambient temperature, with participants, rework it into interpretable form, and sit down with them to co-produce its meaning. The chapter reflects on how that method evolved with the available technical infrastructure, and argues that taking the social life of methods seriously might also mean intervening directly in the tools that sustain cultures of big data.

Transversal collaboration

An ethnography in/of computational social science

Mette My Madsen, Anders Blok and Morten Axel Pedersen

This chapter chronicles and reflects on the experiences of working ethnographically within, alongside and in collaboration with a large-scale interdisciplinary experiment in computational social science. It does so by recounting, from the ethnographer’s point of view, a number of ‘collaborative moments’ at the awkward intersection of computational data science and ethnographic fieldwork, as partners in the same research project. Here, the anthropologist finds herself in a position at right angles to both the population under study and the other scientists studying them; a chronic condition of oscillating between practising ethnography in a (partly) computational social science framework and doing an ethnography of the very scientific data practices and infrastructures involved. We consider this in/of oscillation not as a point of disciplinary comparison but rather as involving ‘transversal’ collaborations that instantiate forms of non-coherent, intermittent and yet productively mutual co-shaping among partially connected knowledge practices and practitioners. Such a rethinking is crucial, we argue, for understanding new social data ‘complementarities’ and their epistemological, ethical and political ramifications.

The other ninety per cent

Thinking with data science, creating data studies – an interview with Joseph Dumit

Joseph Dumit

This chapter is an interview about the first data studies course, a new way of teaching critical thinking and social science approaches to data. The overall goal is to teach a critical-practical skill: how to assess data problems, consider their critical implications for stakeholders, explore the data provided (using Excel or R), perform basic analysis, liaise with data scientists for advanced analysis, and present the workflow and preliminary assessment of results to different audiences. Topics covered include: deep skills like being willing to fail; the importance of teaching with raw, real-world data; data archaeology and ethnography; exploratory data analysis; cleaning data with stakeholders in mind; and continuing to clarify the question being asked.

Adrian Mackenzie

The chapter concerns an attempt to bring an ethnographic sensibility to the data generated by contemporary software developers. It focuses on numbers as processes and counting as a form, and explores how re-counting might be useful in attempts to reconstruct platforms and their associative realities. Since launching in late 2007, the code repository Github (Github.com) has become tremendously popular amongst programmers. Github’s growth attests to some substantial transformations in the way coders, coding and code associate with each other. On Github, coding practices have been re-formatted in ways that emulate the traits and tendencies of contemporary social media platforms. ‘Sharing’, ‘liking’, ‘watching’ and recirculation abound. Not only does Github host a wide variety of commercial, industry, government, scientific, educational and civil society software (and non-software) projects, but highly dispersed and diverse human and non-human actors congregate there. Github in early 2016 claimed to host 29 million code repositories and 6 million coders. The chapter describes some ways in which such large numbers might be re-counted. It explores how coders render accounts of what happens on Github through analysis of big data generated by other coders. It outlines some preliminary attempts to map the ripples of associative imitation that animate the platform’s growth and capitalisation. The growth of Github as intersectional assemblage, the reshaping of coding practices in imitation of social media and the susceptibility of large-scale public data about coding to analysis by coders alter the scope and focus of ethnographic study.

Ethnography for a data-saturated world

Hannah Knox and Dawn Nafus

The introduction to this collection sets out the key debates to which the chapters speak. It situates both digital data analysis and ethnography as methods which have their own ‘social lives’ and uses this approach to explore the work that methods do within particular projects of description and transformation.

‘If everything is information’

Archives and collecting on the frontiers of data-driven science

Antonia Walford

This chapter investigates the intensification of data practices that has occurred over the last decades in the environmental sciences. Moving away from a critical focus on the commodification of the environment, the chapter examines how a recent international databasing initiative in Global Earth Observations can be understood through the critical analytic of the archive. However, a focus only on the archival logics of such infrastructural data practices risks losing sight of other important elements of emergent data-driven scientific landscapes. One such element is data collection. Drawing on ethnographic fieldwork conducted with a large-scale Earth Systems project in the Brazilian Amazon, in comparison with a historical analysis of British colonial collections in the eighteenth and nineteenth centuries, the chapter argues that paying attention to data collection as a process of both appropriation and transformation is crucial for understanding the relations that constitute contemporary scientific knowledge production.

Edited by: Hannah Knox and Dawn Nafus

Data is not just the stuff of social scientific method; it is the stuff of everyday life. The presence of digital data in an ever widening range of human relationships profoundly unsettles notions of expertise for both ethnographers and data scientists alike. This collection situates digital data in broader knowledge-production practices. It asks about the kinds of social worlds that data scientists are creating as the profession coalesces, and looks at the contemporary possibilities available to both ethnographers and their participants for knowing, formatting and intervening in the world. It shows what digital data is doing to the empirical methods that sustain claims to expertise, with a particular focus on implications for ethnography.

The contributors offer empirically grounded accounts of the cultures, infrastructures and epistemologies of data production, analysis and use. They examine the professionalisation of data science in a variety of national and transnational contexts. They look closely at specific data practices like archiving of environmental data, or claims-making about how software is produced. They also offer a glimpse into the new methodological and pedagogical possibilities for teaching and doing ethnography in a data-saturated world.

Kaiton Williams

This chapter examines how a combination of approaches from anthropology and data science disciplines has supported my exploration of lives lived at similar intersections. It describes work I have done at two research sites. One, through self-tracking and the quantified self, is focused internally. The other, with a community of startup developers in Jamaica, is focused on struggles to realise the potential of the global knowledge economy from its margins.

While differing in their geographies and scales, both spaces allow for an interrogation of the potential of combining data science and ethnography: its new methods, modes of inquiry and modes of expression. For both myself and those I work with, data acts as a conduit across borders of nation, history and flesh, promising new existential and epistemological models, and a means of effecting personal and national transformation. Its analytical lines offer the ability to connect and communicate, to modulate ideas of difference, and to help construct new identities. I discuss the uneven realisation of this potential, and how the attempts at its operationalisation reveal productive complications and reformulations at the convergence of engineering and ethnography.

Alison Powell

Data walking is a strategy for research creation and public engagement that breaks down hierarchies of knowledge and creates discussions about data based in a shared experience of observing and moving through space. This chapter describes the genesis of a particular approach to producing knowledge about data, in relation to ‘matters of concern’ encountered in particular local places.

Data scientists

A new faction of the transnational field of statistics

Francisca Grommé, Evelyn Ruppert and Baki Cakici

Big data and related methods, typically the purview of data science and data scientists, introduce new possibilities for the generation of official statistics and knowledge of the state. The chapter considers what this means for the future position and authority of national statisticians. Drawing on a collaborative ethnography of European national and international statistical institutes, we examine this as a politics of method in which national statisticians position themselves in relation to data scientists to establish their legitimate authority. We suggest that both professional groups are being relationally reconfigured not only through debate but also through transnational material-semiotic practices such as experiments, demonstrations and job descriptions. Through the proposed figure of the ‘iStatistician’, we suggest that these practices serve to differentiate national statisticians from data scientists by reinforcing established values and norms for the legitimate production of official statistics.