The Bayala Databases and the language abbreviations used in them
- Jeremy Steele

- Jul 4, 2024
Updated: Jan 31
What are the Bayala databases?
The Bayala databases are a set of relational databases containing words and sentences in Aboriginal languages from across the country. In the databases, words appear both in full and broken up into a stem and one or more suffixes (and sometimes prefixes). The databases began simply as lists of Aboriginal words and their corresponding English meanings. Since then they have become increasingly complex, and now provide much more. To give a single instance of their power, you can find virtually all the words for 'kangaroo' (or anything else) recorded across the entire country, limited only by the amount of information that has been added to the databases. That is to say, not every last recorded Aboriginal word for 'kangaroo' has yet been added, although a large number have been.
Purpose
The purpose of the databases is to make it possible to find out as much as possible about the indigenous languages covered. Take for instance the word nabawinya, from a record made by William Dawes:
Australian | respelt | English | EngJSM | source |
"[P. Nābaou-ínia Windayin Tamunadyeminga]" | nabawinya | "[I will look at you through the window (because) you refused me (bread)]" | see will I thee | Dawes (b) [b:32:6.1] [BB] [NSW] |
Dawes wrote it as “Nābaou-ínia”, which on respelling in its simplest form (without double letters or hyphens) becomes nabawinya. The databases enable words such as this to be displayed so as to reveal their constituent parts, here stem: na; future tense marker: ba; and two pronouns, wi: ‘I’ and nya: ‘thee’. All four parts of this word can then be searched for separately, and compared with other instances in other words across all of the languages covered. The databases reveal that in south-east Queensland na still means ‘see’, but in Perth na is an exclamation of surprise: ‘Oh! Ah!’
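The breakdown of nabawinya into searchable parts can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class names, field names and the search helper are hypothetical and do not describe how the Bayala databases actually store their data. Only the word itself, its four parts and the language abbreviation BB come from the record above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    form: str   # the respelt fragment, e.g. "na"
    role: str   # grammatical role as described in the post
    gloss: str  # simplified English meaning


@dataclass(frozen=True)
class Word:
    respelt: str
    language: str  # language abbreviation from the source column
    parts: tuple


# The Dawes record discussed above; the field layout is an assumption.
nabawinya = Word(
    respelt="nabawinya",
    language="BB",
    parts=(
        Part("na", "stem", "see"),
        Part("ba", "future tense marker", "will"),
        Part("wi", "pronoun", "I"),
        Part("nya", "pronoun", "thee"),
    ),
)


def words_containing(words, form):
    """Return every word that has a part with exactly this form."""
    return [w for w in words if any(p.form == form for p in w.parts)]


for part in nabawinya.parts:
    print(f"{part.form}: {part.role} ({part.gloss})")
```

With many words loaded, a call such as `words_containing(all_words, "ba")` would list every word in which a part respelt as ba occurs, which is the kind of cross-language comparison described above.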
Languages included
The National Indigenous Languages Report (2020) found that, of the roughly 250 Aboriginal languages that scholars estimate were spoken in Australia in January 1788, only thirteen were being fully used and passed on to the coming generation of children. These languages include the following (although there could be others, especially if dialects are taken into account, and should further linguistic study reveal more): Alyawarr, Anindilyakwa, Anmatyerre, Arrernte, Burrara, Kala Lagaw Ya, Murrinh-Patha, Nyangumarta, Tiwi, Warlpiri, Western Desert Language, Wik Mungkan, Yolngu Matha. The Bayala databases contain none of these. Issues of ownership of languages, and copyright, make it simpler and more prudent to leave such fully living languages aside. If the traditional owners of a language were to request that their language be included in the Bayala databases, however, that would be another matter.
The vocabulary lists included in the Bayala databases are instead for the most part historical. The lists or records are generally thought or assumed to be out of copyright. There is an advantage in focussing on older records: after 1788, the almost immediate and overwhelming presence of English rapidly affected Aboriginal languages, particularly as the new European culture brought with it artefacts and ideas that had never been encountered before, and for which there were no Aboriginal Australian words. If you are sitting in a room as you read this, for instance, probably every single thing you see falls into this category: chair, wall, carpet, light, window and so on. And there are new ideas too: Tuesday, fourteen, vermillion and purple, England, how do you do, please, and so on again. The Bayala databases might accordingly be seen as having captured the classical forms of the languages included in them.
How the databases work
There are tens, possibly hundreds, of thousands of entries, or ‘records’, in the various Bayala databases. There are a large number of columns in the databases, but not all columns are used for every entry. Colours were applied to the columns to more easily distinguish one column from another.

The following is a typical summary table from the Bayala databases presenting a few of the words for ‘kangaroo’ across the country. Summary tables such as this appear in virtually all posts on this blog.
Australian | respelt | English | EngJSM | source |
"uggerra" | agara | "Kangaroo" | kangaroo | Curr 1 #39 Belt [1:424.1:1] [Arnda] [NT] [1886] |
"Argooci" | argugi | "Kangaroo" | kangaroo | SofM 19040322 [24 ClSecQld] [:26.2:31] [Ktj] [Qld] [1904] |
"badjeerie" | badyiri | "Kangaroo" | kangaroo | Curr 1 #9 Weedookarry [1:294.1:1] [Nymal] [WA] [1886] |
"Pun-darr" | banda | "Kangaroo" | kangaroo | Barlow, Harriott [:2:20.3] [Ngri] [Qld] [c.1865] |
"baricka" | bari-ga | "kangaroo" | kangaroo | Plomley ar [A2188:300:18],[Wst],[Tas],[c.1837] |
"bārrel" | baril | "kangaroo" | kangaroo | KAOL Ridley [61 Dippil] [:65.2:3] [GGbi] [Qld] [1875] |
"Péekoora" | bigura | "Kangaroo" | kangaroo | Walcott Nichol Bay WA [:249:12] [Jbra] [WA] [1863] |
"ply.hat.te.ner" | bla-ya-dina | "kangaroo" | kangaroo : | Plomley gar [:300:6] [OyB] |
"pray.en.ner" | bra-na | "bush kangaroo" | kangaroo | Plomley gar [:300:11] [NE] [Tas] [c.1835] |
"dray" | dra | "boomer" | kangaroo | Plomley gar [:298:14],[T-S],[Tas],[c.1835] |
"yshuckuru" | dyaguru | "Kangaroo" | kangaroo | Curr 2 #44 Jacobs [2:14.1:1] [Wnkru] [SA] [1886] |
"Chookaroo" | dyugaru | "Kangaroo" | kangaroo | Curr 2 #56 Jacobs [2:108.1:1] [Dhiri] [SA] [1886] |
"Groó man" | garu-man | "Kangaroo" | kangaroo | Mitchell, T.L.: 6: Moreton Bay [:378:29] [Ygra] [Qld] [1839] |
"Koorbiili" | gurbili | "A Kangaroo" | kangaroo | SofM 19030130 [181: ColSec WA] [:184.3:40] [Nwla] [WA] [1904] |
"kurloo" | gurlu | "Kangaroo" | kangaroo | Curr 2 #72 Lake Dix [2:176.1:1] [Admna] [SA] [1886] |
"larth.gar" | la-D-ga | "kangaroo" | kangaroo | Plomley gar [:299:17] [Wst] [Tas] [c.1835] |
"lalliga" | la-li-ga | "Kangaroo" | kangaroo | Ro/JJ [A610.jj:13:33.1] [SE] [Tas] [] |
"lila" | lila | "kangaroo" | kangaroo : | Plomley lh [:299:3] [OyB] |
"Loi-tyo" | luwidyu | "Kangaroo" | kangaroo | KING PP (Vol II appendix): Caledon Bay Gulf of Carpentaria [:634:24] [Dyyi] [NT] [1828] |
"mungaroo" | mangaru | "Kangaroo" | kangaroo | Curr 1 #10 Dyaburara [1:300.1:17] [Jbra] |
"(murri)" | mari | "kangaroo" | kangaroo | KAOL Ridley [75 Turrubul] [:82.4:3] [Trbl] [Qld] [1875] |
"now.wit.yer" | nawidya | "kangaroo" | kangaroo : | Plomley [:297:35] [OyB] |
"yowerda" | yawada | "Kangaroo" | kangaroo | Curr 1 #12 Majanna [1:308.1:1] [Mlkna] [WA] [1886] |
"Pa-ta-go-rong" | bada-garang | "Leaping Quadruped- large specie" | kangaroo [eat]-plenty | Anon (c) [c:24:12] [BB] [NSW] [1790-91] |
"Patagorang" | bada-garang | "Kangaroo" | kangaroo [eat]-plenty | KING PP (Vol II appendix): Port Jackson [II:635:6.2] [Syd] [NSW] [c 1820] |
"Bou-rou" | buru | "Kangaroo" | kangaroo | KING PP (Vol II appendix): Port Jackson [II:635:6.6] [Syd] [NSW] [c 1820] |
The two grey columns (columns 1 & 3) are the original records for the Aboriginal word and its English translation, and are given in double inverted commas.
The orange column is a standardised respelling of the Aboriginal word. In the Bayala databases a single respelling convention is used for all words across the country. There are, for example, no double letters in these respellings. Hyphens are also no longer used, so the examples above where they occur (e.g. rows 6, 9 & 10) come from records created some years ago. Likewise colons in the yellow column are no longer used.
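The two rules just mentioned (no hyphens, no double letters) lend themselves to a simple check on the respelt column. The following Python sketch is an assumption about how older entries might be flagged and tidied; the function names are hypothetical, and it does not attempt the respelling of original spellings itself, which involves much more than these two rules.

```python
import re

# Two identical letters in a row, e.g. "aa" (case-insensitive).
DOUBLE_LETTER = re.compile(r"([a-z])\1", re.IGNORECASE)


def convention_issues(respelt):
    """List the ways a respelt form departs from the two rules above."""
    issues = []
    if "-" in respelt:
        issues.append("hyphen")
    if DOUBLE_LETTER.search(respelt):
        issues.append("double letter")
    return issues


def strip_hyphens(respelt):
    """Bring an older hyphenated respelling into the current style."""
    return respelt.replace("-", "")


# Respelt forms taken from rows 4, 6 and 9 of the table above.
for form in ["badyiri", "bari-ga", "bla-ya-dina"]:
    print(form, convention_issues(form), strip_hyphens(form))
```

Keeping the check separate from the fix means older records can be listed for review before anything is changed.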
The yellow column gives a standardised, simplified English rendering of the word. So the original English translations “bush kangaroo” and “boomer” in rows 10 and 11 become simply ‘kangaroo’. What does its heading ‘EngJSM’ mean? Well, ‘Eng’ is obviously ‘English’. ‘JS’ is the initials of the compiler, Jeremy Steele. And ‘M’ stands for the ‘main’ English column, because there is also a secondary light yellow English column, not shown in the table, called EngJSAdj. This is used for an entry such as “black cockatoo”, where ‘cockatoo’ is entered into the EngJSM column, and ‘black’ into EngJSAdj.
It is these two features—the standardised respelling of Aboriginal words and the standardised simplifications of their original English translations—that give the Bayala databases an unparalleled ability to uncover word and meaning matches.
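How a standardised English column makes such matches easy to find can be sketched with Python's built-in sqlite3 module. The table name, column names and helper functions here are hypothetical, since the actual Bayala schema and software are not described in this post; the sample rows are copied from the tables above.

```python
import sqlite3

# Hypothetical layout: columns mirror the summary-table headings only.
SAMPLE_ROWS = [
    ("pray.en.ner", "bra-na", "bush kangaroo", "kangaroo"),
    ("dray", "dra", "boomer", "kangaroo"),
    ("Bou-rou", "buru", "Kangaroo", "kangaroo"),
    ("[P. Nābaou-ínia Windayin Tamunadyeminga]", "nabawinya",
     "[I will look at you through the window (because) you refused me (bread)]",
     "see will I thee"),
]


def build_sample_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records "
        "(australian TEXT, respelt TEXT, english TEXT, eng_jsm TEXT)"
    )
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def find_by_meaning(conn, meaning):
    """Return (respelt, original English) for an exact EngJSM match."""
    cursor = conn.execute(
        "SELECT respelt, english FROM records "
        "WHERE eng_jsm = ? ORDER BY respelt",
        (meaning,),
    )
    return cursor.fetchall()


conn = build_sample_db()
for respelt, english in find_by_meaning(conn, "kangaroo"):
    print(respelt, "<-", english)
```

Because the original translations differ (“bush kangaroo”, “boomer”, “Kangaroo”), a search on the original English column would miss some of these rows, whereas a search on the standardised column finds all three.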
It is, however, the pink source column that is the particular focus of this post. It packs in quite a bit of information, and because of this some abbreviating is necessary. The components of the pink column are the following, taking Rows 7, 8 and 25 (indicated in a lighter shade of pink) as examples:
KAOL Ridley [61 Dippil] [:65.2:3] [GGbi] [Qld] [1875]
Walcott Nichol Bay WA [:249:12] [Jbra] [WA] [1863]
Anon (c) [c:24:12] [BB] [NSW] [1790-91]
It is easier to explain the constituent components back to front.
● The last component is the date of the record.
● Next to last is the state.
● Third last is the language abbreviation, which is the reason for writing this post: see further below.
● The fourth last shows where to find the record. There are usually two but sometimes three elements here, called notebook, page and line. All three appear in the third, Anon, example above: notebook c; page 24, line 12. The two records above that (KAOL and Walcott) just have page and line numbers.
● The first component in the pink source column is the source identifier. The source identifiers sometimes appear obscure because they too are abbreviated. The Bayala databases from which they are drawn have an additional field called ‘source details’ giving explanatory information, sometimes in considerable depth. The following example explains the last of the examples above (Anon):
Anonymous: Notebook (c). Vocabulary of the language of N.S. Wales in the neighbourhood of Sydney. (Native & English, but not alphabetical). Marsden Collection 41645c, held in the Library of the School of Oriental and African Studies, London.
It consists of some 44 pages, mostly in copperplate hands, with occasional annotations in rough hand. It may have been written by the governors, Arthur Phillip, John Hunter and Philip Gidley King. Keith Smith has dubbed it the ‘Governors’ Vocabulary’. It is loosely thematic as it progresses, and was probably compiled over time, and by a process of analysing and writing up rough notes.
JS LIST LOCATION: Brown mid-sized ringbinder ‘DAWES WORDLISTS’ on JS upper study bookshelves.
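The back-to-front reading of the source column described above can be sketched in Python. This is a minimal illustration, not the databases' own method: the names `Source` and `parse_source` are hypothetical, and the sketch assumes the full form in which the last four bracketed groups are location, language, state and date.

```python
import re
from typing import NamedTuple


class Source(NamedTuple):
    identifier: str
    notebook: str  # empty when the record gives only page and line
    page: str
    line: str
    language: str
    state: str
    date: str


BRACKET = re.compile(r"\[([^\]]*)\]")


def parse_source(text):
    """Split a full-form source string, reading the brackets from the end."""
    groups = list(BRACKET.finditer(text))
    if len(groups) < 4:
        raise ValueError(f"expected at least four bracketed parts: {text!r}")
    # Assumed order of the last four groups: location, language, state, date.
    location, language, state, date = (
        g.group(1).strip() for g in groups[-4:]
    )
    # Everything before the location bracket is the source identifier.
    identifier = text[: groups[-4].start()].strip()
    pieces = location.split(":")
    if len(pieces) != 3:
        raise ValueError(f"expected notebook:page:line, got {location!r}")
    notebook, page, line = pieces
    return Source(identifier, notebook, page, line, language, state, date)


print(parse_source("Anon (c) [c:24:12] [BB] [NSW] [1790-91]"))
print(parse_source("KAOL Ridley [61 Dippil] [:65.2:3] [GGbi] [Qld] [1875]"))
```

Records that lack a state or date, as a few rows in the kangaroo table do, would be misread or rejected by this sketch and would need extra handling.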
Language abbreviations
All of the language abbreviations used in the Bayala databases can be found in a table available on the website.
A very small portion of this table appears below as a sample. The columns headed Language, Dialect and Subdialect are largely taken from the following work, with the author's permission:
Dixon, R. M. W. (2002). Australian Languages. Cambridge, U.K., Cambridge University Press.
The 'JS Name' column is the language or dialect respelt according to the spelling convention used in the Bayala databases.
Abbreviation | JS Name | Language | Dialect | Subdialect | State |
Anwn | Aniwan | Aniwan | | | NSW |
Awa | Awabagal | Awabakal | | | NSW |
BB | Biyal Biyal | Iora | Iora | | NSW |
BBa | Baraba Baraba | Baraba-Baraba | | | NSW |
Bbya | Baranbindya | Barranbinja | | | NSW |
birn | Biriin | | | | NSW |
Bjlg | Bandyalang | Bandjalang | | | NSW |
GGbi | Gabi Gabi | Gabi-Gabi | (or Dippil) | | Qld |
Jbra | Dyaburara | Ngarluma (or Kymurra) | [[Jaburrara]] | | WA |
From the sample it can be seen that the three example source records discussed earlier (rows 7, 8 and 25 of the kangaroo table) come from the following languages:
GGbi - Gabi-Gabi
Jbra - Jaburrara, and
BB - Biyal Biyal, the classical Sydney language.
All of the languages referred to in this post and others can readily be identified at any time by simply looking up the language abbreviation in the table on the website.
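Looking up an abbreviation amounts to a simple key-based lookup. The Python sketch below holds only a few rows from the sample table; the dictionary layout and the function name are assumptions made for illustration, and the full table on the website remains the authoritative list.

```python
# A few rows from the sample table: abbreviation -> (JS Name, state).
SAMPLE_ABBREVIATIONS = {
    "Anwn": ("Aniwan", "NSW"),
    "Awa": ("Awabagal", "NSW"),
    "BB": ("Biyal Biyal", "NSW"),
    "GGbi": ("Gabi Gabi", "Qld"),
    "Jbra": ("Dyaburara", "WA"),
}


def describe_language(abbreviation):
    """Expand a language abbreviation, or say that it is not in the sample."""
    entry = SAMPLE_ABBREVIATIONS.get(abbreviation)
    if entry is None:
        return f"{abbreviation}: not in this sample; see the full table"
    js_name, state = entry
    return f"{abbreviation}: {js_name} ({state})"


for abbreviation in ["GGbi", "Jbra", "BB"]:
    print(describe_language(abbreviation))
```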
Jeremy Steele
4 July 2024


