The nuqta (or nukta) / ़ / is a diacritic attached to a few characters in Hindi/Devanagari. Unicode encodes the nuqta as a separate combining character, and also provides single characters that already include it. Because the nuqta exists as a separate character, it can be added to almost any letter, probably to allow its use even on letters like य and व. That is not required in Hindi, but it may be needed to represent sounds from other languages, including several unwritten languages for which the Government of India recommends the Devanagari script with necessary variations.
Below are the two ways in which the prominent valid nuqta-based letters can be written in Hindi. One way is to combine two characters, e.g. क + ़ = क़. The other is to use the single-code-point character that Unicode provides, क़ . The latter has an advantage (and is preferable) because it uses only one character where the former uses two. The list covers all the nuqta-based characters used in Hindi, each written both ways.
Character with separate nuqta | Character with embedded nuqta |
क + ़ = क़ | क़ |
ख + ़ = ख़ | ख़ |
ग + ़ = ग़ | ग़ |
ज + ़ = ज़ | ज़ |
फ + ़ = फ़ | फ़ |
ड + ़ = ड़ | ड़ |
ढ + ़ = ढ़ | ढ़ |
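To make the difference between the two columns concrete, here is a minimal Python sketch that lists the code points behind each form. The names `NUQTA`, `PAIRS` and `code_points` are made up for this illustration. The code point values are those of the Unicode Devanagari block, written as escapes so that the two forms cannot be confused when the code is copied.

```python
import unicodedata

NUQTA = "\u093C"  # DEVANAGARI SIGN NUKTA, the separate combining character

# Embedded (single-code-point) letter -> base letter, in the order of the table above.
PAIRS = {
    "\u0958": "\u0915",  # qa    <- ka
    "\u0959": "\u0916",  # khha  <- kha
    "\u095A": "\u0917",  # ghha  <- ga
    "\u095B": "\u091C",  # za    <- ja
    "\u095E": "\u092B",  # fa    <- pha
    "\u095C": "\u0921",  # dddha <- dda
    "\u095D": "\u0922",  # rha   <- ddha
}

def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]

for embedded, base in PAIRS.items():
    separate = base + NUQTA
    # Both forms render alike, but they are different strings of different length.
    print(unicodedata.name(embedded))
    print("  separate:", code_points(separate))
    print("  embedded:", code_points(embedded))
```

Comparing the two forms with `==` gives `False`, which is why a plain dictionary lookup can miss a word that looks correct on screen.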
How to find whether nuqta is separate or embedded?
Because both ways of writing a nuqta character look identical on screen, it is impossible to tell with the naked eye whether the nuqta is a separate character or an embedded one. The easiest way to find out is to copy the text and paste it into a plain-text editor, preferably Notepad++ or Notepad if you are on a Windows machine (MS Word, RTF editors and OpenOffice do not work for this).
After pasting, move the cursor through the text one position at a time and insert a space. If the cursor can sit inside a nuqta-based character, the nuqta is a separate character, and you will see it come apart from the letter once the space is inserted. Another way is to press Ctrl+F in any editor or application and search for the nuqta character, i.e. ़ . Every word that matches contains a separately inserted nuqta.
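The same check can also be scripted instead of done by hand. The sketch below is one possible way, not a standard tool: `find_nuqta` and the two constants are hypothetical names, and the set of embedded letters is limited to the seven from the table above.

```python
NUQTA = "\u093C"  # the separate combining nuqta
# The seven single-code-point nuqta letters from the table above.
EMBEDDED = set("\u0958\u0959\u095A\u095B\u095C\u095D\u095E")

def find_nuqta(text):
    """Return (index, kind) for every nuqta in text.

    kind is "separate" when the nuqta is its own code point and
    "embedded" when it is part of a single-code-point letter.
    """
    found = []
    for index, ch in enumerate(text):
        if ch == NUQTA:
            found.append((index, "separate"))
        elif ch in EMBEDDED:
            found.append((index, "embedded"))
    return found

# Two spellings of the same letter, written as escapes to keep them distinct.
print(find_nuqta("\u0915\u093C"))  # ka followed by a separate nuqta
print(find_nuqta("\u0958"))        # single-code-point qa
```

The index refers to the position in the string counted in code points, which is the same thing the cursor test reveals one step at a time.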
Rules for a Spell Checker to check spelling errors of Nuqta
A spell checker that works from a lexicon of valid Hindi words may not have both variants of a word. For example, a lexicon may have the word सिर्फ़ (containing the single-code-point character फ़) but not सिर्फ़ (containing the two characters फ + ़), or vice versa. In such a case, it is imperative that a rule matches the single-code-point nuqta character with its two-code-point equivalent.
Therefore, the rule can be formulated simply: the two forms of each letter shown in the table above must be treated as matching each other.
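One way to apply such a rule is to rewrite every word into a single form before comparing it with the lexicon. The sketch below is a minimal illustration under a few assumptions: the lexicon is a plain collection of strings, and the names `canonical`, `build_index` and `is_known` are hypothetical. It maps the single-code-point letters to the two-code-point form. That direction is a choice made for this sketch (mapping the other way would match words equally well), and it has the side benefit of agreeing with what Python's `unicodedata.normalize("NFC", ...)` returns for these letters.

```python
import unicodedata

NUQTA = "\u093C"
# Single-code-point letter -> base letter + nuqta, following the table above.
TO_TWO_CODE_POINTS = str.maketrans({
    "\u0958": "\u0915" + NUQTA,
    "\u0959": "\u0916" + NUQTA,
    "\u095A": "\u0917" + NUQTA,
    "\u095B": "\u091C" + NUQTA,
    "\u095E": "\u092B" + NUQTA,
    "\u095C": "\u0921" + NUQTA,
    "\u095D": "\u0922" + NUQTA,
})

def canonical(word):
    """Rewrite every embedded nuqta letter as base letter + separate nuqta."""
    return word.translate(TO_TWO_CODE_POINTS)

def build_index(lexicon):
    """Store each lexicon word in its canonical form."""
    return {canonical(word) for word in lexicon}

def is_known(word, index):
    """Look a word up regardless of which nuqta form it was typed in."""
    return canonical(word) in index

# The word from the example above, in both encodings (written as escapes).
sirf_single = "\u0938\u093F\u0930\u094D\u095E"        # uses single-code-point fa
sirf_double = "\u0938\u093F\u0930\u094D\u092B\u093C"  # uses pha + separate nuqta

index = build_index([sirf_single])
print(is_known(sirf_double, index))  # True
print(canonical(sirf_single) == unicodedata.normalize("NFC", sirf_single))  # True
```

With this in place the lexicon only needs one spelling of each word, and input in either form is accepted.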
Examples of Nukta based characters that can be used as test cases to check whether the spell-checker is working correctly: काग़ज़, ख़िलाफ़, ग़ज़ल, अल्फ़ाज़, इत्तेफ़ाक़, कर्ज़, फ़र्ज़, काग़ज़, ख़िलाफ़, ग़ज़ल, अल्फ़ाज़, इत्तेफ़ाक़, कर्ज़, फ़र्ज़