What's the best way to pronounce heximal numbers?
One could just read them out like they're decimal: "21" being pronounced "twenty one", etc. But that leads to some confusion when mixing heximal and decimal numbers in conversation, and more generally just confuses people who are currently used to only using decimal numbers. It also makes phrases like "base ten" confusing/ambiguous - which base is that 10 in???
Instead I'll propose two simple possibilities, both of which I like for different reasons. One sticks close to decimal while remaining distinguishable; the other is much more foreign but has fun and interesting numerical and syntactic qualities.
How To Pronounce It Like Decimal
(Several aspects of this are inspired by jan Misali's method, but I don't like all the details of their proposal.)
All the individual digits are pronounced normally, as in decimal English: zero, one, two, three, four, five.
All the "teens" (two-digit numbers with a 1 in the higher digit) are also pronounced like in decimal English: 10 is "six", 11 is "seven", 12 is "eight", 13 is "nine", 14 is "ten", and 15 is "eleven".
Higher numbers are pronounced akin to how decimal does it, just with a different suffix. Rather than appending -ty, append -sy: 20 is "twosy", 21 is "twosy one", 32 is "threesy two", etc with "foursy" and "fivesy". (Note: English phonotactics dictates that all of these S's become voiced, pronouncing as "z".)
After two digits, use "quat" to indicate the third digit: 123 is "one quat twosy three".
Always group quats into two digits: 1234 is "eight quat threesy four".
Groups of four digits then form a "gran" (short for "grand quat"): 10000 is "one gran", 23451 is "two gran, threesy four quat, fivesy one".
And then finally, groups of six digits form a "kilo". This, finally, is the base of the greater number system that we'll extend from here; we reuse plain numbers, -sy numbers, quats, and grans within the larger kilo group. So "200 000000" is "two quat kilo"; "12345 000000" is "approximately one gran kilo", etc.
Every group of six past that gets named using the standard SI prefixes: 20₆ digits is "mega", 30₆ is "giga", 40₆ is "tera", etc. I presume we'd have some Latin cognates to continue to draw on for very large numbers, like in decimal, which similarly are never actually used, so I'm not going to care about them.
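Here's a minimal Python sketch of the rules so far. The function names (`say_pair`, `say_heximal`) are made up for illustration, and the sketch assumes a few things the rules above don't pin down: all-zero pairs and groups are skipped silently, words are joined with plain spaces rather than commas, and it stops at "tera".

```python
DIGITS = ["zero", "one", "two", "three", "four", "five"]
TEENS = ["six", "seven", "eight", "nine", "ten", "eleven"]  # heximal 10..15
PAIR_UNITS = ["gran", "quat", ""]  # units for the three pairs of a six-digit group
GROUP_NAMES = ["", "kilo", "mega", "giga", "tera"]  # assumption: stop at tera


def say_pair(pair: str) -> str:
    """Pronounce a two-digit heximal string; returns "" for "00"."""
    tens, ones = int(pair[0]), int(pair[1])
    if tens == 0:
        return DIGITS[ones] if ones else ""
    if tens == 1:
        return TEENS[ones]
    word = DIGITS[tens] + "sy"  # twosy, threesy, foursy, fivesy
    return f"{word} {DIGITS[ones]}" if ones else word


def say_heximal(number: str) -> str:
    """Pronounce a heximal numeral given as a string of digits 0-5."""
    digits = number.replace(" ", "").replace(",", "")
    if not digits or any(d not in "012345" for d in digits):
        raise ValueError(f"not a heximal numeral: {number!r}")
    digits = digits.lstrip("0")
    if not digits:
        return "zero"
    # Left-pad to a multiple of six digits, then split into six-digit groups.
    digits = "0" * (-len(digits) % 6) + digits
    groups = [digits[i:i + 6] for i in range(0, len(digits), 6)]
    if len(groups) > len(GROUP_NAMES):
        raise ValueError("number too large for this sketch")
    words = []
    for index, group in enumerate(groups):
        group_words = []
        for j, unit in enumerate(PAIR_UNITS):
            pair_word = say_pair(group[2 * j:2 * j + 2])
            if pair_word:
                group_words.append(pair_word)
                if unit:
                    group_words.append(unit)
        if group_words:
            words.extend(group_words)
            group_name = GROUP_NAMES[len(groups) - 1 - index]
            if group_name:
                words.append(group_name)
    return " ".join(words)


print(say_heximal("1234"))        # eight quat threesy four
print(say_heximal("200 000000"))  # two quat kilo
```

Treating the number as a string of digits, rather than converting to an integer first, keeps the code close to how you'd actually read the number aloud: pair by pair, left to right.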
Keeping the single digits the same as in decimal is obvious; all bases do that already.
Pronouncing 10₆ as "six" isn't too unusual, from what I understand of other base-pronunciation schemes. Having a unique word for your 10 value helps in talking about it, so you don't constantly have to mutter "base six" after every mention of "ten".
For the rest of the 1X values, continuing to use the decimal names makes them simple and easy to remember. Also, the 1X's don't yet have the ability to do the unifying suffix pattern that the higher values do (which I'll get to in a sec), so they've gotta be weird somehow anyway; this is why decimal has the unique "X-teen" variant (shortened from "X and ten"). As a bonus, decimal has unique non-systematic words for all the values we need; it doesn't start systematizing them (which would feel weird carrying over into heximal) until thirteen.
For 2X and after, I use a naming scheme distinguishable from but obviously inspired by decimal. 20⏨ in decimal was originally "two tens"; English shortened the "tens" to a -ty suffix (and lightly modified "two" and "three" to sound better with the suffix). If we follow the same pattern in heximal, 20₆ is "two sixes", shortening to "twosy".
An accidental nice feature of this over decimal is that it lacks the "thirteen/thirty"/etc confusion that decimal has, where a final "n" is hard to hear in some circumstances. In heximal, "nine" (13₆) and "threesy" (30₆) are totally different-sounding.
For higher values, "quat" was chosen because (a) it sounds good and unique, not easily confusable with any other numbers, and (b) it's a shortening of "quarter gross", because 100₆ is indeed one quarter of 144⏨.
After that we come to a decision point. Do we start the googlology (naming of large numbers) with 4-digit groupings, because 6⁴ (1296⏨) ≈ 10³ (1000⏨)? Or do we go for the rounder-in-heximal 6-digit grouping, so your first grouping is at 10¹⁰ (in heximal), the next at 10²⁰, etc?
For now, I lean toward the rounder-in-heximal solution. It's a bit larger (six heximal digits counts up to about 50,000⏨), but that also means that it's a large enough number you won't actually hit it in normal usage. And 50k isn't unusably large, anyway; overall I think it's still quite usable, and the benefit of extremely round googlology is quite nice.
So I needed a name for four digits. I chose "quat" for two digits because 6² is a "quarter gross", referencing dozenal; 6⁴ is three quarters of a "grand gross" (12³), so I continued the analogy with a "grand quat", shortened to "gran".
Then we finally get to the googlology point, where we start naming much larger groups and just repeatedly reuse the preceding digit groupings. Using decimal's thousand/million/billion etc is a bit misleading (the values are pretty different) and would feel weirdly out-of-place (I'm using different names for everything else). Plus I've never liked their naming, at least in the short scale the US and most of the world now uses, where "million" (referring to "one") is 10^(3×2), "billion" (referring to "two") is 10^(3×3), etc. (In the long scale, at least, a million is 10^(6×1), a billion is 10^(6×2), etc, which makes sense.)
So I figured, let's just use the metric system's naming. It's both familiar and slightly foreign, and means that when you actually use the heximal metric system, it all just works automatically.
The only downside of this is that it leaves six digits in a row without a separator, which is slightly longer than I feel comfortable with. Putting one between each pair of digits seems a little excessive, but between each triple of digits breaks up the grouping in an unnatural way. I may still decide to use a lesser separator, like spaces between each pair, I dunno.
Pronouncing Digit Pairs Specially Instead
(This system was suggested to me by @berenryan, apparently cribbed from their old notes talking to fellow conlang nerds.)
The previous method has the nice benefit of being reasonably familiar; it requires a minimum of teaching (just teaching what "quat", "gran", and "kilo"/etc mean) and you can even skip that and just read out digit strings.
However, it just inherits the English naming system with minimal tweaks. If you're willing to get a little weird and deal with the fact that nobody will understand your numbers (but have the satisfaction that in a theoretical world where everyone learned it as kids, it would be better than the first method), there are some much better possibilities.
First, the digits. 0 is "pa" (/pɑ/ like in "papa"), 1 is "be" (/be/ like "bay"), 2 is "ti" (/ti/ like "tea"), 3 is "do" (/do/ like "d'oh!"), 4 is "ku" (/ku/ like "coup"), and 5 is "gr" (/gɹ/ like "grrrr" but not drawn out). Note that all of the vowels are Spanish-style, not English-style.
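When speaking numbers larger than a single digit, group them into pairs and pronounce each pair as follows (each row is one tens digit, with the ones digit running from 0 to 5):
0_: pama, befa, tiva, dona, kusa, gaza
1_: peŋe, bime, tofe, duve, kane, gese
2_: pizi, boŋi, tumi, dafi, kevi, gini
3_: poso, buzo, taŋo, demo, kifo, govo
4_: punu, basu, tezu, diŋu, komu, gufu
5_: pavr, benr, tisr, dozr, kuŋr, gamr
(ŋ is pronounced "ng")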
So the basic patterns:
- the first letter of each digit-pair word uses the consonant from the ones digit's name: anything with a 0 ones digit starts with a "p" because 0 is "pa", etc.
- the last letter of each digit-pair word uses the vowel from the tens digit's name; all the 1_ values end in an "e" because 1 is "be", etc.
- the second and third letters cycle through five vowels (a, e, i, o, u) and seven consonants (m, f, v, n, s, z, ŋ), respectively, each advancing one step as the pair's value counts up from 00.
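These three patterns are enough to generate every pair name mechanically. Below is a small Python sketch; `pair_name` and the constant names are my own labels. It assumes both cycles start at the first listed letter for 00 ("a" and "m") and advance one step per value, and that the 5_ row ends in "r" (from "gr"), which matches every example name used in this post.

```python
ONES_CONSONANTS = "pbtdkg"    # first letter: consonant of pa, be, ti, do, ku, gr
TENS_VOWELS = "aeiour"        # last letter: "vowel" of the tens digit (5 gives "r")
CYCLE_VOWELS = "aeiou"        # second letter: cycles with the pair's value (mod 5)
CYCLE_CONSONANTS = "mfvnszŋ"  # third letter: cycles with the pair's value (mod 7)


def pair_name(tens: int, ones: int) -> str:
    """Name of the heximal digit pair with the given tens and ones digits."""
    value = 6 * tens + ones
    return (
        ONES_CONSONANTS[ones]
        + CYCLE_VOWELS[value % 5]
        + CYCLE_CONSONANTS[value % 7]
        + TENS_VOWELS[tens]
    )


# Print all 36 names, one row per tens digit.
for tens in range(6):
    print(f"{tens}_:", " ".join(pair_name(tens, ones) for ones in range(6)))
```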
Larger numbers work the same as in the previous system; use "quat" after two digits, "gran" after four, etc.
So, 1,2345 is "befa gran dafi quat gufu".
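As a rough sketch, here's how reading a whole number aloud this way could look in Python. `say_in_pairs` is a hypothetical helper, and it makes two assumptions the post doesn't settle: it only handles numbers of up to six digits (no "kilo" and beyond), and it skips all-zero pairs unless the whole number is zero.

```python
ONES_CONSONANTS = "pbtdkg"    # first letter, from the ones digit
TENS_VOWELS = "aeiour"        # last letter, from the tens digit
CYCLE_VOWELS = "aeiou"        # second letter, pair value mod 5
CYCLE_CONSONANTS = "mfvnszŋ"  # third letter, pair value mod 7
PAIR_UNITS = ["gran", "quat", ""]  # units for the three pairs, high to low


def pair_name(tens: int, ones: int) -> str:
    value = 6 * tens + ones
    return (ONES_CONSONANTS[ones] + CYCLE_VOWELS[value % 5]
            + CYCLE_CONSONANTS[value % 7] + TENS_VOWELS[tens])


def say_in_pairs(number: str) -> str:
    """Pronounce a heximal numeral of up to six digits, pair by pair."""
    digits = number.replace(",", "").replace(" ", "")
    if not digits or any(d not in "012345" for d in digits):
        raise ValueError(f"not a heximal numeral: {number!r}")
    digits = digits.lstrip("0")
    if len(digits) > 6:
        raise ValueError("this sketch only covers up to six digits")
    digits = digits.rjust(6, "0")
    words = []
    for i, unit in enumerate(PAIR_UNITS):
        tens, ones = int(digits[2 * i]), int(digits[2 * i + 1])
        if tens == 0 and ones == 0:
            continue  # assumption: silent zero pairs
        words.append(pair_name(tens, ones))
        if unit:
            words.append(unit)
    return " ".join(words) if words else pair_name(0, 0)


print(say_in_pairs("1,2345"))  # befa gran dafi quat gufu
```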
This is actually a pretty brilliant setup for several reasons.
First, the digit names are obviously chosen to just use the Nth first/last letter; they also (aside from gr/gaza) are the first syllable of the 0_ numbers, since the second and last letters cycle thru the same vowels for the first five values.
Second, the sounds are well-chosen to be regular but distinguishable. The first consonants are all plosives, alternating between unvoiced and voiced; the second consonants are a collection of nasals and fricatives. Choosing consonants this way gives each word a sharp, distinguishable start that won't blend into other words, and then softens into a blending sound in the middle that carries stress well.
Third, the words all have significant redundancy, which helps with hearing. It's common for invented systems like this to be minimally redundant; it would have been easy to just produce two-letter words by, say, combining the first letter of the ten's digit's name and the last letter of the one's digit's name. They'd all be unique, so that technically works, right? In real conversation tho, where people might mumble, or the room might be noisy, or a radio might be crackling, distinguishing between some of those sounds can be quite difficult; this is why military jargon pronounces 5 and 9 as "fiver" and "niner", because "five" and "nine" sound remarkably similar over a low-bandwidth radio and the addition of the "er" sound forces us to emphasize the "v" versus "n" and makes them more distinct.
In here, the first and fourth letters cover 6×6 unique combinations, for the 36 possible values, but the second and third letters cover 5×7, or 35, possible values, meaning they're also almost completely unique across the set of numbers. (Only 0₆ and 55₆, pama and gamr, share their center letters.) As well, the two sets tile their combinations in different ways, so if two sounds are kinda similar they won't show up together multiple times. If someone is shouting across the room and you're not sure if they said "pizi" or "bizi", well, only one of those is the name of a number. If you can correctly hear any two letters in the word, you can almost always tell which number it was; you've got to fuck up really bad to mishear it.
(Of the six possible pairs of letters you could hear, three of the pairs uniquely identify a number in every instance; one (the center two letters) has a single collision, between 0₆ (pama) and 55₆ (gamr); and two (the first and second letter, or the second and fourth letter) have six collisions, as each 0X₆/5X₆ or X0₆/X5₆ number collides, respectively.)
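The collision counts are easy to check by brute force. This is a small Python sketch that rebuilds the 36 names from the patterns described earlier (`pair_name` and `collisions` are names I've made up for it) and counts, for each choice of two letter positions, how many two-letter fragments are shared by more than one number.

```python
from collections import Counter
from itertools import combinations

ONES_CONSONANTS = "pbtdkg"    # first letter, from the ones digit
TENS_VOWELS = "aeiour"        # last letter, from the tens digit
CYCLE_VOWELS = "aeiou"        # second letter, pair value mod 5
CYCLE_CONSONANTS = "mfvnszŋ"  # third letter, pair value mod 7


def pair_name(tens: int, ones: int) -> str:
    value = 6 * tens + ones
    return (ONES_CONSONANTS[ones] + CYCLE_VOWELS[value % 5]
            + CYCLE_CONSONANTS[value % 7] + TENS_VOWELS[tens])


NAMES = [pair_name(t, o) for t in range(6) for o in range(6)]


def collisions(i: int, j: int) -> int:
    """How many (letter i, letter j) fragments belong to more than one name."""
    seen = Counter((name[i], name[j]) for name in NAMES)
    return sum(1 for count in seen.values() if count > 1)


for i, j in combinations(range(4), 2):
    print(f"letters {i + 1} and {j + 1}: {collisions(i, j)} ambiguous fragment(s)")
```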
Fourth, it puts divisibility right into the name of the number: numbers divisible by 2 start with "p", "t", or "k"; numbers divisible by 3 start with "p" or "d"; divisible by 5 has an "a" in the second position; divisible by 7 (11₆) has an m in the third position.
This scales above two-digit numbers, too. In decimal, you can tell if a number is divisible by 3 or 9 by adding all the digits; if the result is divisible by 3 or 9, so was the original number. In heximal this trick works for 5 (and summing whole pairs works just as well as summing digits, since 100₆ is also one more than a multiple of 5), and the naming system actually does half the work for you already - since the second letter tells the value of the pair mod 5, you can easily tell that, say, a four-digit number whose first pair has an "e" and second pair has a "u" (like demo kusa, 3304) is divisible by five (since they're +1 and +4 above a multiple of five, respectively). Any pair with an "a" can be skipped, since it's divisible by 5 already and won't affect the result.
Similarly, there's a trick for divisibility by seven (11₆) - add together digit pairs, and if the result is divisible by seven, the original number was. Again, the naming system does half our work for us, since the third letter tells the value of the pair mod 7, so we can just look at those letters and quickly tell that, say, 12331 (befa dafi buzo) is divisible by 7, since "f" means it's +1 above a multiple of seven and "z" means +5, so 1+1+5 = 7! (And it's divisible by five: e+a+u = 1+0+4 = 5!)
(These tricks work in any base, for the value one below and one above the base. So in base 10, they're for the values 9⏨ and 11⏨.)
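To make the letter-summing concrete, here's a short Python sketch. The `residues` helper is hypothetical; it reads only the second and third letters of each pair word, using the letter orders given earlier, and ignores unit words like "quat" and "gran". It works because 100₆ (36⏨) leaves a remainder of 1 when divided by either 5 or 7, so each pair contributes its own value to the total remainder no matter where it sits in the number.

```python
CYCLE_VOWELS = "aeiou"        # second letter -> pair value mod 5
CYCLE_CONSONANTS = "mfvnszŋ"  # third letter -> pair value mod 7
UNIT_WORDS = {"quat", "gran", "kilo", "mega", "giga", "tera"}


def residues(spoken: str) -> tuple[int, int]:
    """Return (number mod 5, number mod 7) from the spoken pair words alone."""
    words = [w for w in spoken.split() if w not in UNIT_WORDS]
    mod5 = sum(CYCLE_VOWELS.index(w[1]) for w in words) % 5
    mod7 = sum(CYCLE_CONSONANTS.index(w[2]) for w in words) % 7
    return mod5, mod7


# 12331 in heximal, checked against ordinary integer arithmetic.
n = int("12331", 6)
print(residues("befa dafi buzo"), (n % 5, n % 7))
```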
So we get super-easy divisibility by 2, 3, 5, and 7 all built into the naming system. This isn't just a party trick, either - it's a built-in mental math check. Multiplying, say, basu (41) by kevi (24), you should get 1504, or gese kusa. You know right away that the last pair of the result must start with a "k" (because b×k, or 1×4, is 4 or k), check. Then the whole number needs to be divisible by 5 (because a×e, or 0×1, is 0), and yup, e+u is 5. Finally, it needs to be 1 more than a multiple of 7 (because s×v, or 4×2, is 8, reducing to 1), and yup, s+s is 4+4, or 1 mod 7.
These phonic additions and multiplications would be drilled into you as a child learning arithmetic in elementary school, becoming utterly intuitive. And from them, you can tell that your answer is very likely to be correct, since you got the correct offset from multiples of 2, 3, 5, and 7.
(In fact, a given combination of remainders for 2, 3, 5, and 7 is unique within each span of 210⏨ numbers, so if you pass those checks, you know you're either exactly right, or you're off by some multiple of 210⏨ (550₆). And 550₆ is just short of 1000₆; in other words, you'd have to be off by nearly six quat to still get the checks to pass.)
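The whole multiplication check can be sketched in a few lines of Python. `signature` and `product_looks_right` are illustrative names, not part of the system; the sketch assumes the last spoken word is the units pair, and it ignores unit words like "quat".

```python
ONES_CONSONANTS = "pbtdkg"    # first letter -> ones digit (number mod 6)
CYCLE_VOWELS = "aeiou"        # second letter -> pair value mod 5
CYCLE_CONSONANTS = "mfvnszŋ"  # third letter -> pair value mod 7
UNIT_WORDS = {"quat", "gran", "kilo", "mega", "giga", "tera"}


def signature(spoken: str) -> tuple[int, int, int]:
    """(mod 6, mod 5, mod 7) of a number, read off its spoken pair words."""
    words = [w for w in spoken.split() if w not in UNIT_WORDS]
    mod6 = ONES_CONSONANTS.index(words[-1][0])  # assumes last word is the units pair
    mod5 = sum(CYCLE_VOWELS.index(w[1]) for w in words) % 5
    mod7 = sum(CYCLE_CONSONANTS.index(w[2]) for w in words) % 7
    return mod6, mod5, mod7


def product_looks_right(a: str, b: str, result: str) -> bool:
    """True if `result` has the remainders that a × b must have."""
    a6, a5, a7 = signature(a)
    b6, b5, b7 = signature(b)
    return signature(result) == ((a6 * b6) % 6, (a5 * b5) % 5, (a7 * b7) % 7)


print(product_looks_right("basu", "kevi", "gese kusa"))  # the worked example
print(product_looks_right("basu", "kevi", "gese gaza"))  # off by one
```

A `True` here only means the answer is right up to a multiple of 210⏨. For instance, kevi kuŋr (2454₆, which is 210⏨ too large) would pass the same check.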
Fifth, some studies have shown an inverse correlation between the number of syllables in the names for numbers and the amount of numbers a person can remember at once. That is, in languages where digits are shorter to say (like Chinese, where they're all single-syllable), people can remember longer numbers than in languages with longer digit names (like English, with "seven" being two-syllable and several of the other digits "full" syllables stuffed with sounds, and almost all of the names for the tens digits being two or three syllables). In this system, each digit is a single syllable; all two-digit numbers are two syllables; writing out a pronounced number requires exactly twice as many letters as writing it out in digits. That's about as small as you can get things!
Sixth, having a dedicated and very short system for pronouncing all the two-digit values is convenient for "senary compression", the act of grouping heximal digits into pairs and treating them as units. Since base 6 is a little on the small side, this can be useful when talking about or remembering long numbers (this is the same reason octal and hexadecimal are useful, as compression methods for reading/discussing long binary numbers).