This book offers a practical introduction to digital history with a focus on working with text. It will benefit anyone who is considering carrying out research in history that has a digital or data element and will also be of interest to researchers in related fields within digital humanities, such as literary or classical studies. It offers advice on scoping a project and evaluating existing digital history resources, a detailed introduction to working with large text resources, and guidance on managing digital data and approaching data visualisation. After placing digital history in its historiographical context and discussing the importance of understanding the history of the subject, this guide covers the life-cycle of a digital project from conception to digital outputs. It assumes no prior knowledge of digital techniques and shows you how much you can do without writing any code. It will give you the skills to use common formats such as plain text and XML with confidence. A key message of the book is that data preparation is a central part of most digital history projects, but that work becomes much easier and faster with a few essential tools.
expression support, and nearly all do. If you do not know where to start, we would suggest trying Sublime Text (www.sublimetext.com), which is free for evaluation purposes.
While we are tooling up, there are two more free pieces of software you will need to follow along with what we cover in this chapter: a commandline interface (CLI), and the version-control tool Git. See the appropriate text box for your operating system for how to get the CLI and Git.
Mac or Linux. Good news: you already have the commandline. On a Mac it is called ‘Terminal’; it is somewhat
and 5 go into detail on how to work with digitised text automatically and at scale. We show how you can use the commandline, which gives you access to hundreds of small programs written by other people, and with which you can accomplish an enormous amount without writing a line of code. We make no apology for talking at length about the commandline: it is the Swiss army knife of computing, beloved of most programmers. Learning even a little bit about how to use the commandline can transform the way you work. Plain text is covered in Chapter 4 and structured
continuous text. XML, then, is what we will focus on in this chapter. There are two other common formats which we touched on in Chapter 2, CSV and JSON. Working with JSON is beyond the scope of this book, but CSV lends itself well to the commandline techniques we have already covered. We use CSV (and the similar TSV) in Chapter 7. In our commandline recipes list in Appendix 2 we show how you can extract a particular column from a CSV file using the command cut.
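As a rough illustration of the kind of cut recipe described above, here is a minimal sketch. The file name people.csv and its contents are hypothetical, invented only for this example, and the approach assumes a simple CSV in which no field contains a comma inside quotation marks.

```shell
# Create a small, hypothetical CSV file to experiment with.
printf 'name,born,place\nAda,1815,London\nMary,1797,London\n' > people.csv

# -d sets the delimiter (here a comma); -f picks the field (column) number.
# The result is stored in a variable and then printed.
second_column=$(cut -d, -f2 people.csv)
echo "$second_column"
```

For a TSV file the -d option can be left out, because cut treats the tab character as its default delimiter.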
XML is easier to work with than to create from scratch, so we will begin by using XML versions of our
metadata such as a manifest of your files. You might remember the ls command from Chapter 4: it has flags which can give more verbose information or recursively list all the files in all subfolders. As always with the commandline, you can write your results to a file using > to redirect the output.
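A manifest of this kind might be produced along the following lines. This is only a sketch: the folder name project, the files inside it and the manifest file names are all hypothetical, made up so that the example can be run anywhere.

```shell
# Build a small, hypothetical project folder so there is something to list.
mkdir -p project/images project/texts
touch project/images/map.png project/texts/letter.txt

# -R lists the contents of all subfolders recursively;
# > writes the listing to a file instead of the screen.
ls -R project > manifest.txt

# Adding -l gives the more verbose "long" listing (permissions, size, date).
ls -lR project > manifest_long.txt

# Show the plain manifest.
cat manifest.txt
```

Writing the manifest outside the folder being listed keeps the manifest file itself out of the listing.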
With images, you may want to add a description or tags which describe the subject of the picture in ways that you might want to find later, such as data, provenance or image type. We said in Chapter 2 that there are tools for managing images, such as Tropy for research
will mean that interesting or rogue results will leap out of the data. To do this you need a tool that is fast, and we would recommend gnuplot, which is free and cross-platform, and works from the commandline; because it has been around for a long time, it also works in concert with lots of other software. 6 Gnuplot is much used by scientists but it is effective for anyone with lots of data to handle; it can tile numerous plots or charts in a gallery for quick viewing. You can do the same thing with programming languages such as R and Python if you prefer. Nathan
such. 4 At last, in line
11, our refrain verb comes back for its curtain call in a role that
lines 9 and 10 have rehearsed in lower-case cameo: all Swinburne’s
footwork comes down to the blunt aplomb of a main verb and the
simplicity of the declarative mood. The flashy exhortation of
apostrophic command (line 1) and the intricacies of
trompe-l’oeil perceptual relativity (line 5) give final
Afghan War Diary, a collection of nearly 100,000 classified documents relating to the Afghan War that was released by Wikileaks in July 2010, Graham Harwood describes this evolution:
A low-ranking larynx fixed by GPS markers that connect by radio wave. The report cascades up a command hierarchy until the SigAct [‘Significant Activity’] is [authenticated as] true. Then de-constructed into data-atoms, logical machines compress the contingency of the moment as another higher-ranking larynx calls across another radio wave. 11
After unearthing a commandline
More than a game
reflect our stature as ‘successful’ mayors. When any aspect of the
game falls into decline, we are faced with complaints and advice
whether we solicit them or not. Even the cheat codes for SimCity
communicate the extent to which the player is being asked to accept the values and judgements of a particular ideological system.
In order to access a cheat that will enable purchases to be made
with no cost (in Simoleons, at least), the player must access the
commandline and type in the phrase ‘i am weak’.
That god games present us with the simultaneous
Weak empire to weak nation-state around Nagorno-Karabakh
developed into the new and
mighty normative transcript of the national cause, publicly administered by
the Karabakh Committee.
On the official level a taboo had been broken in Stepanakert by formally absolutely legal means: the irrelevant and powerless institution of a local Soviet
had, like a zombie, come to life and dared to place a nationalist demand,
bypassing (and eventually even ‘bringing to heel’)13 parallel local CPSU and
executive control (the RAIKOM and IZPOLKOM) against the traditional top-down command line. There had been territorial administrative changes