Sport History in the Digital Age: Muhammad Ali, Digitised Newspapers and Distant Reading

This post is about a number of different things. It’s about Muhammad Ali, cultural memory, names, the press, the civil rights era and racial discourse. More than anything, however, the words that follow are about exploring new ways of doing historical work in the digital age.

Michael Ezra, author of Muhammad Ali: The Making of an Icon and chair of the American Multicultural Studies Department at Sonoma State University, calls Ali the most written-about person in history. Not the most written-about boxer in history, or the most written-about sportsperson in history, or even the most written-about American; but the most written-about person in history. Michael Ezra is a smart guy, and his book is one of the most innovative and consummately researched accounts of Ali yet written, but you have to question the validity of such a claim. How could the weight of literature written about a boxer, albeit a hugely famous boxer, eclipse that of Jesus, Hitler, Stalin, the Prophet Muhammad or Napoleon? It seems a ridiculous notion. Yet, as incredible (and probably false) as this assertion may be, there is a brief moment when even the most skeptical of sports historians might pause and wonder: ‘well, could he be…?’

What can be said with certainty, however, is that the collection of Ali histories is immense. He is the subject of countless books, movies, newspaper articles, television shows, songs, theatre productions, museum exhibitions and art installations – each seeking to understand him in new ways. So, what makes a PhD student from Australia think he can write something innovative about the man who may be (but probably isn’t) the most written-about person in history?

Quite simply, I think I’ve found a fresh perspective – a digital one to be precise. Doing history via digital methods poses unique challenges and confronting these challenges forces us to think creatively about our methods and approaches to historical work. This fresh perspective on Ali is born from having to confront some of the epistemic and philosophical challenges posed by the digital age. I don’t propose that I may be able to uncover these new understandings of Ali because I am smarter or harder working than the scholars that have come before me – but rather because I have been forced to think creatively about doing history via digital methods.

We now teach, write and research with access to what Digital Humanities scholar David M. Berry calls the ‘infinite archive’ – where source material is digital, abundant and (generally) readily available. The infinite archive is not a single website or database; you can’t ‘visit’ it. Rather, it is an idea: a conceptualisation of the incredible abundance of source material we now have access to. Whilst access to more sources instinctively feels like a good thing, the ‘infinite archive’ is often as problematic as it is exciting. Actually doing historical work with such an embarrassment of riches is complicated by a number of factors.

I will not attempt here even to briefly address all of the issues facing historians in the digital era. Other scholars, the late Roy Rosenzweig in particular, have discussed these concerns more comprehensively than I could hope to in a short blog post. I will, however, share some of my experiences of working with digitised newspaper archives in the hope of opening up a dialogue about how to ‘do history’ when working with a glut of material.

My work currently revolves around 12 digitised North American newspapers that I have accessed via ProQuest Historical Newspapers. Most of you are undoubtedly familiar with ProQuest Historical Newspapers – a subscription-based online archive, with a number of different collections available for purchase. Among these is their Black Newspapers collection – a digital archive of nine important African American newspapers from the preceding two centuries. As you can imagine, access to these papers was a real boost for my research. The Black Newspapers collection, along with archives of the New York Times, Los Angeles Times and the Washington Post, gave me an opportunity to examine the construction of Ali’s identity across multiple cultural, geographic, economic and political contexts.

It did come at a cost though. I won’t go into detail here, but my library’s purchase of a subscription to these collections only came about through a stroke of astronomically good fortune and at significant financial outlay. I say this not to gloat, but to allude to another problematic aspect of doing history with digital sources: financial and technological disparities within the academy. Were I a freelance researcher, or at a university with fewer resources, this project would never have come to be.

Issues of institutional inequality aside, my work with ProQuest Historical Newspapers also forced me to grapple with one of the digital age’s most challenging paradoxes: the idea that lots of source material might not always be such a great thing for historians. If we assume that each of these papers is a weekly publication (some are dailies) and each contains, on average, thirty pages (many have far more) – even narrowing my research to the years between 1964 and 1975 left me with 205,920 pages to read. It is simply not feasible to close-read, or even skim-read, eleven years’ worth of content from 12 newspapers in the hope of finding articles about Muhammad Ali. Such a task might not be accomplished in an entire career, let alone the three years it (hopefully) takes to complete a PhD.
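The 205,920 figure can be reproduced with a quick back-of-envelope calculation. The sketch below uses the assumptions stated above (12 papers, weekly issues, thirty pages each); the eleven-year span is the value that yields exactly 205,920, and the two-minutes-per-page skim rate is a purely hypothetical number added to give a sense of scale.

```python
# Back-of-envelope estimate of the reading load, using the post's stated assumptions.
NEWSPAPERS = 12        # titles accessed via ProQuest Historical Newspapers
ISSUES_PER_YEAR = 52   # assumes every paper is a weekly (some are dailies)
PAGES_PER_ISSUE = 30   # assumed average (many issues have far more)
YEARS = 11             # the span that reproduces the 205,920 figure


def estimated_pages(newspapers, issues_per_year, pages_per_issue, years):
    """Total pages implied by the four assumptions above."""
    return newspapers * issues_per_year * pages_per_issue * years


def reading_hours(pages, minutes_per_page):
    """Hours needed to get through `pages` at a given (hypothetical) pace."""
    return pages * minutes_per_page / 60


total = estimated_pages(NEWSPAPERS, ISSUES_PER_YEAR, PAGES_PER_ISSUE, YEARS)
print(f"{total:,} pages")  # 205,920 pages

# Hypothetical skim rate of two minutes per page, for scale only.
print(f"{reading_hours(total, 2):,.0f} hours")  # 6,864 hours
```

Because the weekly and thirty-page assumptions are both conservative, the real page count would be higher still.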

In this situation, traditional methods such as close-reading are simply too time-consuming to be useful. So rather than scale back my research or use only one or two of the publications I was so lucky to be working with, I’ve spent the last year or so adapting and developing ways to analyse all of them. After all, it seems a shame not to read all the newspapers when they’re only a few mouse clicks away. I’m very pleased to say, not to mention a little relieved, that these efforts have been quite fruitful.

Before I continue, I think it’s important to note that historians being forced to make methodological choices about which sources to work with and which ones to discard is not a situation unique to the digital age. Historians have always been challenged by large amounts of material – ours is a field that prides itself on understanding detail and nuance, understandings that often rely upon painstaking and time-consuming research methods. However, although the epistemic issues that characterise working with large bodies of material may not be new, they have certainly been exacerbated by digital technologies. The good news is that where digital technologies can create problems for us, they can also provide solutions.

ProQuest Historical Newspapers has a built-in search tool that scans the text of newspaper articles for words or phrases. Through this, I was able to find all articles containing the terms “Muhammad Ali” or “Cassius Clay” within a twelve-newspaper, eleven-year range. The astounding thing is that the whole process takes less than a minute. However impressive this may be, it’s only half the battle. This search returned 20,688 articles containing either “Muhammad Ali” or “Cassius Clay”, still far too many for me to read and analyse via close-reading.

It was at this point that I began to look closely at the work of Professor Franco Moretti, one of the great iconoclasts of twentieth-century literary scholarship. In his seminal Conjectures on World Literature essay (2000), Moretti detailed a process of using quantitative methods to analyse large amounts of literary material. He called this approach ‘distant reading’, and advocated its use as a research tool that allows us to see a body of texts in a broad, topographical way. In doing this, we can ‘look down’ upon a body of work and pick out the trends and concepts that interest us. Moretti himself admits that there are compromises inherent to the process. He notes that ‘distant reading’ reconstitutes texts in an abstracted way, and although graphs, maps and trees can be fantastic tools for viewing texts in a panoramic fashion, they lack the richness and complexity of traditional close-readings. Moretti argues, however, that “we always pay a price for theoretical knowledge: reality is infinitely rich; concepts are abstract, are poor. But it’s precisely this poverty that makes it easy to handle them, and therefore to know. This is why less is actually more.”

So I began to ‘distant read’ my group of twelve papers, creating a graphical comparison of how many times the terms “Cassius Clay” and “Muhammad Ali” appeared in the text of articles between 1964 and 1975. Choosing which words to base my distant reading upon is obviously a huge methodological decision. I chose to base my distant reading upon a comparison of how the American press used “Cassius Clay” and “Muhammad Ali” because key pieces of literature have implied that the press used Ali’s two names in certain ways depending upon how they felt about him with relation to race, religion and the Vietnam War.
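The counting step behind a comparison like this can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the workflow actually used to build the graph: the `count_articles_by_year` function, the `(year, text)` record shape and the toy articles are all hypothetical, and the sketch counts articles that contain each name rather than individual occurrences of it.

```python
from collections import Counter

TERMS = ("Cassius Clay", "Muhammad Ali")


def count_articles_by_year(articles, terms=TERMS):
    """Return {term: Counter({year: n})}, where n is the number of
    articles from that year whose text contains the term (case-insensitive)."""
    counts = {term: Counter() for term in terms}
    for year, text in articles:
        lowered = text.lower()
        for term in terms:
            if term.lower() in lowered:
                counts[term][year] += 1
    return counts


# Invented toy records of the form (year, article text).
# These are NOT real newspaper articles or ProQuest data.
toy_articles = [
    (1964, "Cassius Clay stops Liston in Miami."),
    (1964, "The champion now calls himself Muhammad Ali."),
    (1967, "Cassius Clay, also known as Muhammad Ali, refuses induction."),
    (1971, "Muhammad Ali returns to the ring."),
]

counts = count_articles_by_year(toy_articles)
for term in TERMS:
    print(term, sorted(counts[term].items()))
```

An article that uses both names is counted once under each, which matters when reading the two lines of a graph against each other. Plotting each term’s yearly totals as a line would then give a chart of the kind described here.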

This process produced the graph you can see below. I know it’s just a few lines and some numbers, but it’s hard to convey how exciting it was seeing this thing come together. Firstly because it meant I no longer had to spend hours punching numbers into Microsoft Excel spreadsheets, but also because there are some really clear trends that emerge from this graph.


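[Figure: DR Graph, a distant reading of how often “Cassius Clay” and “Muhammad Ali” appear in the text of articles across the twelve newspapers, 1964 to 1975]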

This distant reading clearly indicates that there are significant, transitional events in 1964, 1967 and 1971 that affected how the press used Ali’s two names. For most historians, or indeed anyone with a basic knowledge of Ali’s career, these dates should sound familiar. Cassius Clay claimed his first heavyweight title in 1964 after defeating Sonny Liston in Miami. He changed his name to Muhammad Ali shortly thereafter. In 1967, known by this stage as much for his views on religion and race as for his boxing prowess, Ali was convicted of draft evasion after refusing to fight in Vietnam. Finally, in 1971, shortly after his return to boxing, his draft-evasion conviction was overturned. Thus, it is not surprising that these dates correspond with spikes on the graph above. What I find interesting, though, is what is implied by the correlation between the trends on this graph and key events in Ali’s career.

My ‘distant reading’ suggests that the press’ use of the names “Muhammad Ali” and “Cassius Clay” is discursively linked to key events in 1964, 1967 and 1971. In itself, establishing this connection is not particularly groundbreaking. Ali’s two names have been used by a number of authors as an analogy for the progression of Ali’s identity: from Cassius Clay the brash young heavyweight – to Muhammad Ali the geopolitical figure. A few of Ali’s more skillful biographers, particularly David Remnick and Michael Ezra, have even hinted that his two names may have had their own agency in helping to construct Ali’s cultural identity. However, suggested links between Ali’s two names and his cultural identity were just that: suggestions. What this research does, that traditionally researched accounts could not, is to substantiate a previously implicit narrative of Ali’s cultural identity. It also indicates the presence of an Ali story that potentially subverts previous understandings of the man: that within the American press he did not become “Muhammad Ali” until 1971 – a full seven years after his name change.
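One way to make a claim like “he did not become ‘Muhammad Ali’ until 1971” checkable is to look for the first year from which one name outnumbers the other and stays ahead. The sketch below is only an illustration of that idea: `first_year_preferred` is a hypothetical helper, and the yearly counts are placeholder numbers invented for the example. They are not the figures behind the graph.

```python
def first_year_preferred(counts_old, counts_new):
    """Return the earliest year from which `counts_new` exceeds `counts_old`
    in that year and in every later year, or None if it never stays ahead."""
    years = sorted(set(counts_old) | set(counts_new))
    crossover = None
    for year in years:
        if counts_new.get(year, 0) > counts_old.get(year, 0):
            if crossover is None:
                crossover = year   # start of a run in which the new name leads
        else:
            crossover = None       # lead lost (or tied), so reset
    return crossover


# Placeholder counts, invented purely for illustration (NOT real results).
clay = {1967: 6, 1968: 5, 1969: 5, 1970: 4, 1971: 3, 1972: 2}
ali = {1967: 3, 1968: 6, 1969: 4, 1970: 4, 1971: 6, 1972: 7}

print(first_year_preferred(clay, ali))  # 1971
```

Requiring the lead to persist means a single-year blip (1968 in the placeholder data) is not mistaken for a lasting shift; whether that is the right definition of a “transition” is itself a methodological choice.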

Ultimately though, this graph raises far more questions than it answers…and that’s a really good thing. A ‘distant reading’ such as this is not a magic bullet or an automatic history-machine. We can’t just plug an entire body of source material into a computer and expect it to spit out a rich, contextualised and rigorous historical analysis. That’s our job. What distant reading is really good at, though, is suggesting which questions to ask, and also where we might find the answers. For me, this graph prompted the development of three research questions:

1) Why did the American press refuse to use “Muhammad Ali” between 1964 and 1967, preferring instead to call him “Cassius Clay”?

2) Why did the American press appear to re-evaluate the discourse regarding Ali’s ‘dual identity’ (Muhammad Ali & Cassius Clay) between 1967 and 1971?

3) Why did the American press embrace “Muhammad Ali” from 1971 onwards?

These are questions that can only be answered by good old-fashioned, close reading – getting down and dirty with the newspaper articles. In my next post, I hope to deliver some answers I’ve found to the questions posed above. In the meantime though, I’ll leave you with this: distant reading, when coupled with traditional close-reading and analysis, is a valuable and viable method of organising large quantities of historical material and can help us to develop meaningful and targeted research questions.

For further reading on ‘Distant Reading’ see:

– Nicholson, Bob. “The Digital Turn: Exploring the Methodological Possibilities of Digital Newspaper Archives.” Media History 19, no. 1 (2013): 59–73.

– Moretti, Franco. Distant Reading. London: Verso, 2013.

For further reading on Ali, particularly his relationship with the media, see:

– Ezra, Michael. Muhammad Ali: The Making of an Icon. Philadelphia: Temple University Press, 2009.

– Remnick, David. King of the World: Muhammad Ali and the Rise of an American Hero. London: Picador, 2000.

Steve Townsend

PhD Student at the University of Queensland (Australia)

Human Movement Studies


5 thoughts on “Sport History in the Digital Age: Muhammad Ali, Digitised Newspapers and Distant Reading”

  1. Reblogged this on and commented:
    Steve Townsend at Sport in American History has written an article where he describes using “distant reading” to analyze large amounts of digital data on Muhammad Ali. According to Townsend, distant reading allows researchers “to see a body of texts in a broad, topographical way. In doing this we can ‘look down’ upon a body of work and pick out the trends and concepts that interest us.”

    Townsend used distant reading to examine when newspapers used the names “Muhammad Ali” and “Cassius Clay” over an eleven-year period. This is an intriguing experiment because Townsend points out that these names are tied to two different identities. While Cassius Clay’s identity was that of a “brash young boxer,” Muhammad Ali could more accurately be described as an international “geopolitical figure.” Townsend’s distant reading leads to some interesting conclusions.

