How many biographies of classicists does Wikipedia have? December 2019 update

About this time last year, I prepared a few statistics illustrating Wikipedia’s gender gap in classics and how it has changed, largely due to the efforts of the Women’s Classical Committee. Over 2019, they have continued their excellent work documenting female classicists (broadly construed) through Wikipedia. This year alone they have created or improved 186 biographies. They have transformed their area on Wikipedia and inspired other groups to get involved. They even had a session at the Leeds International Medieval Congress (a Late Antiquity link of course).

A year is a long time online, so it pays to revisit the statistics and see how far we have travelled in 12 months. This comes with the caveat that the number of biographies is only one aspect of Wikipedia’s gender imbalance. By design, Wikipedia emulates the real world – all information needs to be drawn from reliable sources, so people who are more likely to have coverage are more likely to have Wikipedia articles. Biographies are one way in which the imbalance is manifested on Wikipedia, and there are other aspects which are harder to quantify, especially on a large scale. As of writing, the article on the history of archaeology mentions 51 men and 0 women. Alice White is the one who spotted that particular imbalance. Amongst the sources used as references, there are 18 men and four women. An improvement on 51-0 but still poor. The gender imbalance can also be found in how pages are written, so a Wikipedia biography of a woman may mention her family and husband before her career.

The numbers

Last year I tested two approaches to work out how many biographies of classicists there are on Wikipedia: counting articles in categories and querying Wikidata. The latter has the benefit of including more information from other sources, so you can draw on information in the German Wikipedia for example. As such I’ll be focusing on the latter.

As a reminder, Wikidata is an open source database, and for the purpose of this blog post it is especially useful because it contains information on things in Wikipedia and structures it in a way which can be queried. One of the interesting things is that a person doesn’t need a Wikipedia page to have a Wikidata entry. At some point in the not-too-distant future, everyone who has written a piece in the Journal of Roman Studies could have a Wikidata entry containing information about them and their works. That’s part of a larger project called WikiCite which aims to build an open source bibliography. For us, that means it’s handy because we can look beyond Wikipedia.

The figures that follow are based on information from 2 December 2019 unless stated otherwise. Our starting numbers for entries on classicists in Wikidata are:

Total Male Female No data
6,747 5,866 865 16

Therefore, in Wikidata 12.9% of people with the occupation of classicist (or a subclass of classicist) and a recorded gender are women. This shifts when we look at the number of articles on the English Wikipedia. Let’s also throw in the German and French Wikipedias since they also have a large number of articles.

Language Wikipedia Total number of articles Male Female
English 2,359 1,975 (83.7%) 384 (16.3%)
German 3,295 2,973 (90.2%) 322 (9.8%)
French 977 885 (90.6%) 92 (9.4%)
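
The percentages above can be reproduced from the raw counts. The following is a minimal sketch rather than the method actually used for this post: the function name is made up for illustration, and it assumes the 12.9% Wikidata figure leaves out the 16 entries with no gender data, since that is the denominator which reproduces it.

```python
def female_share(male, female):
    """Percentage of women among entries with a recorded gender."""
    return round(100 * female / (male + female), 1)


# (male, female) counts copied from the tables above.
counts = {
    "Wikidata": (5866, 865),
    "English": (1975, 384),
    "German": (2973, 322),
    "French": (885, 92),
}

shares = {name: female_share(m, f) for name, (m, f) in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}% women")
```

Dividing by the full Wikidata total of 6,747 instead would give 12.8%, so the choice of denominator is worth stating whenever these numbers are compared year on year.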

For context, there are 1,674,919 biographies on the English Wikipedia and 18.14% are about women. Classics is still languishing behind the average for English Wikipedia, but let’s refresh our memory of last year’s numbers.

Language Wikipedia Total number of articles Male Female
English (14 Dec 2018) 2,088 1,820 (87.2%) 268 (12.8%)
German (3 Dec 2018) 2,851 2,587 (90.7%) 264 (9.3%)
French (3 Dec 2018) 791 725 (91.7%) 66 (8.3%)

Though I didn’t include French and German in last year’s stats, they’re retrievable through the Denelezh gender gap analysis tool. A shift of three-and-a-half percentage points in English represents a huge amount of work. While the WCC are proactively creating biographies and linking to them throughout Wikipedia, other Wikipedia editors are creating new articles which, by the nature of how Wikipedia has grown, will tend to be about men. This really stands out when comparing changes over the last year in the English Wikipedia with French and German.

Language Wikipedia Articles created between 14 Dec 2018 and 2 Dec 2019 Male Female
English 271 155 (57.2%) 116 (42.8%)
German 444 386 (86.9%) 58 (13.1%)
French 186 160 (86.0%) 26 (14.0%)

Though more articles overall were created in the German Wikipedia, only half as many biographies of women were created as in the English Wikipedia (58 against 116). In proportional terms, that means only 13% of new biographies of classicists on the German Wikipedia were about women. That is above the German baseline for 3 December 2018 (9.3%), but it moved the overall proportion by only half a percentage point, to 9.8%.
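
The "articles created" figures are simply this year's counts minus last year's. As a rough sketch (the variable and function names are mine, not from any tool), the subtraction looks like this:

```python
# (male, female) counts per language, copied from the two snapshot tables.
snapshot_2018 = {"English": (1820, 268), "German": (2587, 264), "French": (725, 66)}
snapshot_2019 = {"English": (1975, 384), "German": (2973, 322), "French": (885, 92)}


def new_articles(before, after):
    """Return (new male, new female, % of new articles about women)."""
    male = after[0] - before[0]
    female = after[1] - before[1]
    return male, female, round(100 * female / (male + female), 1)


created = {
    language: new_articles(snapshot_2018[language], snapshot_2019[language])
    for language in snapshot_2018
}
for language, (male, female, share) in created.items():
    print(f"{language}: {male} male, {female} female ({share}% women)")
```

Note that this measures net change between two snapshots, so it also absorbs deleted articles and existing articles that were newly tagged as classicists in Wikidata.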

Archaeology

As I’m an archaeologist (and because someone asked me to) last year I also threw out some numbers about classical archaeologists. That particular profession isn’t in the Denelezh tool, so we’re left with running a Wikidata query and making a note of the results.

Language Wikipedia Total articles Male Female
English (14 Dec 2018) 193 156 (80.8%) 37 (19.2%)
English (6 Dec 2019) 292 207 (70.9%) 85 (29.1%)

So for the 99 new articles created over the past year, there’s been a roughly fifty-fifty split between male and female biographies in this field (51 male, 48 female). Part of this will represent better information in Wikidata as well as a growing number of articles. For example, Wikidata seems to have a reasonably good idea of how many archaeologists there are in its database, but specialisms aren’t as well covered. In part that’s because there are two ways to note a specialism: noting it as a profession or as a field of work. The result is occasionally fragmented data. For instance, the English Wikipedia has four(!) articles on people whose profession is medieval archaeology (1 male, 3 female) and 34 on people whose field of work is medieval archaeology (22 male, 12 female).

The bottom line

Back in 2016, when the WCC started working on Wikipedia articles, I estimated that 7% of biographies of classicists on the English Wikipedia were about women. Without the WCC’s intervention, it is likely that trend would have continued albeit with the proportion increasing roughly in line with the rest of Wikipedia.

How many biographies of classicists does Wikipedia have?

Ideally, working out how many biographies of classicists there are on Wikipedia would be easy, but as the encyclopaedia is constantly growing it’s not a straightforward question. As well as biographies being added regularly, there is the issue that if you want to ask questions about quantity you need metadata.

There are two possible approaches that I can think of: using Wikipedia’s category system and using Wikidata. This post is to provide a snapshot of how Wikipedia’s biographies of classicists stood in December 2018, because Wikipedia’s tools don’t currently record historic data of this kind to map trends and changes over time.

Categories

The English Wikipedia’s category for classical scholars is a logical place to start. It is intended to cover historians, philologists, archaeologists, antiquarians, and anything else which might fit under the broad umbrella of ‘classicist’. These categories are populated manually, so it relies on people recognising that a person is a classicist and then adding the category.

But Wikipedia’s network of categories means that it isn’t so simple. Under ‘Classical scholars’ sits ‘Classical scholars by discipline’, and under that is ‘Latinists’. Quite sensible, but that includes ‘Translators from Latin’, which is where things start getting hazier. Alfred the Great translated from Latin, but surely doesn’t fit into even a wide understanding of what a classical scholar is. Restricting the search to the first two tiers down from ‘Classical scholars’ gives a total of 1,336 biographies on the English Wikipedia as of 14 December 2018.[1]
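
To make the "two tiers down" restriction concrete, here is a small sketch of a depth-limited walk through a category tree. It is only an illustration of the idea, not how PetScan works internally: the function name is hypothetical, the tree is a toy version of the categories mentioned above, and the scholar names are placeholders.

```python
from collections import deque


def articles_within_depth(root, subcategories, members, max_depth):
    """Collect articles in `root` and its subcategories down to `max_depth` tiers."""
    seen = {root}
    queue = deque([(root, 0)])
    found = set()
    while queue:
        category, depth = queue.popleft()
        found.update(members.get(category, ()))
        if depth < max_depth:
            for child in subcategories.get(category, ()):
                if child not in seen:  # categories can loop back on themselves
                    seen.add(child)
                    queue.append((child, depth + 1))
    return found


# Toy data: a simplified slice of the category tree with placeholder articles.
subcategories = {
    "Classical scholars": ["Classical scholars by discipline"],
    "Classical scholars by discipline": ["Latinists"],
    "Latinists": ["Translators from Latin"],
}
members = {
    "Classical scholars": {"Placeholder scholar A"},
    "Latinists": {"Placeholder scholar B"},
    "Translators from Latin": {"Alfred the Great"},
}

two_tiers = articles_within_depth("Classical scholars", subcategories, members, 2)
three_tiers = articles_within_depth("Classical scholars", subcategories, members, 3)
print(sorted(two_tiers))
print(sorted(three_tiers))
```

With a limit of two tiers the walk stops at ‘Latinists’, so Alfred the Great stays out; one tier deeper and he is counted as a classical scholar.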

There is also a category for women classical scholars which, debates aside about whether there should be such a category without a corresponding one for men, makes working out a total much easier. As of 14 December, it stands at 214 biographies, 16.0% of the overall total.

Using Wikidata

Wikidata is a database linked to Wikipedia which distils articles into machine-readable facts. It holds a lot of other material too, but to take an example, the entry on Hella Eckardt states that she is an archaeologist, a university teacher, specialises in Roman archaeology, and works at the University of Reading.[2]

Like Wikipedia, the database isn’t complete but is growing all the time, and in some respects it relies on manual intervention. If a person is marked as being a classical scholar (or a subclass) they can be picked up using a query service. This would pick up all classicists, but with a bit of tinkering it’s possible to refine this just to those with articles. On Wikidata, classical scholars include classical archaeologists, papyrologists, classical philologists, Hellenists, and historians of classical antiquity. Oddly, there isn’t a separate field for Romanists, but hopefully they have been classified as classical scholars. Classical philologists include Latinists, but on Wikidata that is not taken to include people who translated Latin – so the likes of Alfred the Great are excluded.

Wikidata has 5,896 entries on classicists (11.2% about women), so we need to work out how many have articles on the English Wikipedia. With the help of Nav Evans and Jason Evans, we’ve been able to work out that as of 14 December there are 2,088 articles about classicists on the English Wikipedia, and 268 (12.8%) are about women. Looking specifically at classical archaeologists, there are 193 biographies on the English Wikipedia, of which 37 (19.2%) are about women. It may be the case that classical archaeologists are not as well mapped as classical scholars, and that people may have been classified into the broader of the two.
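
For anyone wanting to repeat this, the sketch below shows roughly what such a query could look like. It is not the query actually used here. It relies on the Wikidata properties for occupation (P106), subclass of (P279) and sex or gender (P21); the function names are mine, and the item ID for ‘classical scholar’ is deliberately left as a placeholder that would need looking up on Wikidata. The response at the bottom is hand-built from the figures above, not a live result.

```python
GENDER_LABELS = {"Q6581097": "male", "Q6581072": "female"}


def build_query(occupation_qid, wiki_url="https://en.wikipedia.org/"):
    """SPARQL counting people with an occupation (or a subclass of it)
    who have an article on the given Wikipedia, grouped by gender."""
    return f"""
SELECT ?gender (COUNT(DISTINCT ?person) AS ?count) WHERE {{
  ?person wdt:P106/wdt:P279* wd:{occupation_qid} .
  ?article schema:about ?person ;
           schema:isPartOf <{wiki_url}> .
  OPTIONAL {{ ?person wdt:P21 ?gender . }}
}}
GROUP BY ?gender
"""


def tally(response):
    """Turn a SPARQL JSON response into {'male': n, 'female': n, ...}."""
    counts = {}
    for row in response["results"]["bindings"]:
        qid = row.get("gender", {}).get("value", "").rsplit("/", 1)[-1]
        label = GENDER_LABELS.get(qid, "other or no data")
        counts[label] = counts.get(label, 0) + int(row["count"]["value"])
    return counts


# "Q_PLACEHOLDER" is NOT a real item ID; substitute the one for 'classical scholar'.
query = build_query("Q_PLACEHOLDER")

# Hand-built response in the SPARQL JSON results format, using the
# 14 December 2018 figures quoted above (assumed, not fetched).
sample = {"results": {"bindings": [
    {"gender": {"value": "http://www.wikidata.org/entity/Q6581097"},
     "count": {"value": "1820"}},
    {"gender": {"value": "http://www.wikidata.org/entity/Q6581072"},
     "count": {"value": "268"}},
]}}
print(tally(sample))
```

The query text can be pasted into the Wikidata Query Service; swapping the `wiki_url` argument for another language edition's address gives the equivalent count for that Wikipedia.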

Conclusion

The two methods produced different results, which reflects how the two categorisation systems have grown. As Wikidata contains information about the whole of Wikimedia, and does not count Alfred the Great as a classicist, it is likely to be the more comprehensive of the two. As the Women’s Classical Committee have gone through creating articles, corresponding Wikidata items with information about them have typically also been created.

In 2016, I estimated that 7% of biographies about classicists were about women, using the category method. In the two years since, the Women’s Classical Committee have written or improved more than 200 articles. Considering that the English Wikipedia only has 268 biographies of female classicists this represents a transformative effort.

As Wikimedia’s tools do not provide snapshot data, this is intended to provide a record of the state of Wikimedia’s content on classicists in December 2018 and document the methods for checking progress.

Notes

[1] The category search used the tool PetScan and can be repeated using this link.

[2] As an aside, there’s also bibliographic data, and there’s a tool which uses that data to build visualisations. You can see how cool it is, and there are even visualisations for journals, but that’s not the point of this piece.