‘Data is’ versus ‘data are’
Publishing statistics is a tricky business. ONS is constantly marrying the absolute requirement for statistical accuracy with the need to communicate our key messages to as wide an audience as possible.
With such a diverse target readership there are often differences of opinion within ONS on how best to communicate our statistics.
This was evident in the first session of ONS’s Style Council – convened not only to settle differences of opinion on how best to communicate with our readers but also to set a consistent standard.
The style issue up for debate was whether we continue to use the phrase ‘data are’ or change it to ‘data is’.
The grammatically correct approach is ‘data are’, because the word ‘data’ is the plural of the Latin word ‘datum’.
As a language pedant, who corrects errant apostrophes when I see them and prefers ‘fewer’ to ‘less’, I would normally uphold the rule of grammatical accuracy, but there were powerful arguments on both sides of the debate.
There is a body of opinion outside ONS that opts for the ‘data is’ route.
The Oxford Dictionaries blog hives the use of the word ‘data’ off in two directions: in scientific and technical writing it’s plural (data are); in general usage, ‘data’ can take a singular form.
You might argue that use of ‘data’ as a plural is fine in specialised, scientific fields. However, is that the field ONS is in? Perhaps in terms of the gathering and analysis of data, that is true. But what about the dissemination of this data? There’s the challenge.
Which raises the question: what is ONS? Is it an academic institution, a government agency or a publisher? All three, I’d say.
The Wall Street Journal decided to use the singular form of ‘data’ in 2012, because “most style guides and dictionaries have come to accept the use of the noun ‘data’ with either singular or plural verbs.”
The Oxford English Dictionary refers to ‘data’ as a “mass noun, similar to a word like information, which cannot normally have a plural and which takes a singular verb. Sentences such as ‘data was’ (as well as data were) collected over a number of years are now widely accepted in standard English.”
When the Guardian posted a piece on the topic in 2012, a user pointed out that although using the singular of ‘data’ was bad Latin, we’re not speaking Latin any more.
Guardian style guide guru David Marsh cites the example of the word ‘agenda’, a Latin plural now used almost universally as a singular. He described using its singular form ‘agendum’ as “hypercorrect, old-fashioned and pompous.”
The beauty of the English language is that it constantly and consistently evolves, and publishers are faced with a choice: maintain the status quo (not the band) or change to a usage which has become more common.
Our Style Council opted to continue referring to data as plural, the main rationale being that people were less likely to be offended.
It’s a valid point, which could be considered the lesser of two evils.
In the Style Council’s next meeting in the New Year, we hope to address that hotly contested point concerning whether numbers under 10 should be written out in full. I anticipate much debate.
7 comments on “‘Data is’ versus ‘data are’”
Write numbers as numbers, not words (except 1/one, which does need to be handled with care). Don’t go backwards after going forward!
No reference to a front page article in the RSS newsletter several years before which concluded it was correct to use ‘data is’. It was a good analysis of why the word had changed meaning and why the singular form is now correct.
At that time my part of ONS agreed, and our editorial board, which had reviewed the then-current ONS style guide in detail, proposed that ONS change to the singular form.
History … when forgotten leads to strange effects.
I almost certainly agree, Ian – from Tunbridge Wells
Would you say:
a) “We are still waiting for one more datum to come in, then the data set will be complete.” or
b) “We are still waiting for one more piece of data to come in, then the data set will be complete.” (or data item, or similar)
If a), and you genuinely use “datum” in the singular, then it’s fine to use “data” as plural.
If b) then you should be consistent about using “data” as a singular mass noun.
The rule for “fewer vs less” is quite simple: use ‘fewer’ if it’s countable, e.g. fewer dogs means less barking.
For me, the most important part of communicating with your readership is that you use language which is relevant and meaningful.
Data is the plural of datum in Latin. If I happen to have a meeting with the Emperor of Rome I’ll remember that. But otherwise I’m not speaking Latin, and a language has no obligation to follow the grammatical rules of languages that it borrows words from. For example, we rarely use the Latin rules for pluralisation even when using words derived from Latin.
Treating data as a plural because of the grammar of a language not spoken here in 1,500 years is stuffy and old-fashioned. Is that the image the ONS wants to present?