The nukta ( ़ ) is a diacritic attached to a few characters in Hindi/Devanagari. Unicode encodes the nukta as a separate combining character, and it also provides some letters with the nukta already embedded as single characters. Because the nukta is a separate character, it can be added to almost any letter, probably to allow its use even with letters like य and व. This is not required in Hindi, but it may be needed to represent sounds of languages other than Hindi, including several unwritten languages for which the Government of India recommends the Devanagari script with necessary variations.
The prominent valid nukta-based letters in Hindi can be written in two ways. One way is to combine two characters, e.g. क + ़ (U+0915 + U+093C). The other is to use the single code point character that Unicode provides, क़ (U+0958). The latter has an advantage (and is preferable) because it uses only one character, while the former uses two. Below is the list of all the nukta-based characters used in Hindi, written in both ways.
Character with separate nukta = Character with embedded nukta
क + ़ = क़
ख + ़ = ख़
ग + ़ = ग़
ज + ़ = ज़
फ + ़ = फ़
ड + ़ = ड़
ढ + ़ = ढ़
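The correspondence in the list above can also be expressed in code points. The following is a minimal sketch in Python, assuming only the standard `unicodedata` module; the helper names `separate_form` and `code_points` are made up for illustration. It relies on the canonical decomposition that Unicode records for each embedded letter.

```python
import unicodedata

NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA

# Single code point (embedded) nukta letters, in the order of the list above.
EMBEDDED = ["\u0958", "\u0959", "\u095a", "\u095b", "\u095e", "\u095c", "\u095d"]


def separate_form(ch: str) -> str:
    """Return the two-code-point spelling (base letter + nukta) of an embedded letter."""
    # unicodedata.decomposition() returns hex code points as text, e.g. "0915 093C".
    parts = unicodedata.decomposition(ch).split()
    return "".join(chr(int(p, 16)) for p in parts)


def code_points(text: str) -> str:
    """Format each character of text as U+XXXX."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


for ch in EMBEDDED:
    sep = separate_form(ch)
    print(f"{code_points(sep)} = {code_points(ch)}  ({unicodedata.name(ch)})")
```

Printing code points rather than the letters themselves avoids the problem that both forms look the same on screen.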
How to find whether nukta is separate or embedded?
Since the two ways of writing a nukta character look identical, it is impossible to tell with the bare eye whether a character has a separate nukta or an embedded one. The easiest way to find out is to copy the character and paste it into a plain text editor, preferably Notepad++ or Notepad if you are on a Windows machine (MS Word, RTF editors or OpenOffice do not work).
After pasting, move the cursor one character at a time and add a space. If the cursor can sit inside a nukta-based character, the nukta is a separate character, and you can see it come apart from the letter. Another way is to do a Ctrl+F in any editor or application and search for the nukta character, i.e. ़ . Every word that matches contains a separately inserted nukta.
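The same check can be done programmatically. The sketch below is one possible approach in Python, not a fixed recipe; `nukta_forms` and `code_points` are hypothetical helper names. It treats U+093C as the separate nukta and the block U+0958 to U+095F as the embedded letters.

```python
NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA, the separate combining character

# Letters that Unicode encodes with the nukta built in (U+0958 to U+095F).
EMBEDDED_LETTERS = {chr(cp) for cp in range(0x0958, 0x0960)}


def nukta_forms(text: str) -> tuple[bool, bool]:
    """Return (has_separate, has_embedded) for the given text."""
    has_separate = NUKTA in text
    has_embedded = any(ch in EMBEDDED_LETTERS for ch in text)
    return has_separate, has_embedded


def code_points(text: str) -> str:
    """Format each character of text as U+XXXX, which makes the difference visible."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # The same letter written in the two ways: base + nukta, then embedded.
    for sample in ("\u0915\u093c", "\u0958"):
        print(code_points(sample), nukta_forms(sample))
```

This is the equivalent of the Ctrl+F search: a word contains a separately inserted nukta exactly when `NUKTA in word` is true.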
Rules for a Spell Checker to check spelling errors of Nukta
A spell checker working from a lexicon of valid Hindi words may often not have both variants of a word. For example, a lexicon may have the word सिर्फ़ (containing the single code point character फ़, U+095E) but not सिर्फ़ (containing the two characters फ + ़, U+092B + U+093C). The reverse may also be the case. It is therefore imperative to have a rule that matches the single code point nukta character with its dual code point equivalent.
Such a rule can be formulated simply: the two forms of each character shown in the list above match each other.
Examples of Nukta based characters that can be used as test cases to check whether the spell-checker is working correctly: काग़ज़, ख़िलाफ़, ग़ज़ल, अल्फ़ाज़, इत्तेफ़ाक़, कर्ज़, फ़र्ज़, काग़ज़, ख़िलाफ़, ग़ज़ल, अल्फ़ाज़, इत्तेफ़ाक़, कर्ज़, फ़र्ज़
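The matching rule described above could be sketched as follows in Python. This is an illustration under assumptions, not the way any particular spell checker does it: the names `normalize_nukta` and `in_lexicon` are hypothetical, and the choice to convert everything to the two-character form before comparing is just one option (converting to the single-character form would work equally well, as long as the lexicon and the input are treated the same way).

```python
NUKTA = "\u093c"  # DEVANAGARI SIGN NUKTA

# Embedded (single code point) letter -> base letter + separate nukta.
EMBEDDED_TO_SEPARATE = {
    "\u0958": "\u0915" + NUKTA,  # qa
    "\u0959": "\u0916" + NUKTA,  # khha
    "\u095a": "\u0917" + NUKTA,  # ghha
    "\u095b": "\u091c" + NUKTA,  # za
    "\u095c": "\u0921" + NUKTA,  # dddha
    "\u095d": "\u0922" + NUKTA,  # rha
    "\u095e": "\u092b" + NUKTA,  # fa
}
_TABLE = str.maketrans(EMBEDDED_TO_SEPARATE)


def normalize_nukta(word: str) -> str:
    """Rewrite every embedded nukta letter as base letter + nukta."""
    return word.translate(_TABLE)


def in_lexicon(word: str, lexicon) -> bool:
    """Look a word up in the lexicon, ignoring which nukta form is used."""
    normalized = {normalize_nukta(entry) for entry in lexicon}
    return normalize_nukta(word) in normalized


# "sirf" written with the embedded letter and with the separate nukta.
SIRF_EMBEDDED = "\u0938\u093f\u0930\u094d\u095e"
SIRF_SEPARATE = "\u0938\u093f\u0930\u094d\u092b\u093c"

print(in_lexicon(SIRF_SEPARATE, [SIRF_EMBEDDED]))
print(in_lexicon(SIRF_EMBEDDED, [SIRF_SEPARATE]))
```

Python's `unicodedata.normalize("NFC", word)` should give the same two-character result for these letters, since Unicode excludes them from composition, but the explicit table keeps the rule visible and easy to extend.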
Pronunciation in English is always a bone of contention. A non-phonetic or semi-phonetic spelling convention makes the actual pronunciation even harder to work out. So here is a poem to test your English pronunciation skills. Check whether you can pronounce each of the words, paying special attention to the spelling of each. If you can do this, you have a good understanding of English! Try it out once. The poem, aptly titled "The Chaos", was written by Gerard Nolst Trenité in 1922.
Dearest creature in creation,
Study English pronunciation.
I will teach you in my verse
Sounds like corpse, corps, horse, and worse.
I will keep you, Suzy, busy,
Make your head with heat grow dizzy.
Tear in eye, your dress will tear.
So shall I! Oh hear my prayer.
Just compare heart, beard, and heard,
Dies and diet, lord and word,
Sword and sward, retain and Britain.
(Mind the latter, how it’s written.)
Now I surely will not plague you
With such words as plaque and ague.
But be careful how you speak:
Say break and steak, but bleak and streak;
Cloven, oven, how and low,
Script, receipt, show, poem, and toe.
Hear me say, devoid of trickery,
Daughter, laughter, and Terpsichore,
Typhoid, measles, topsails, aisles,
Exiles, similes, and reviles;
Scholar, vicar, and cigar,
Solar, mica, war and far;
One, anemone, Balmoral,
Kitchen, lichen, laundry, laurel;
Gertrude, German, wind and mind,
Scene, Melpomene, mankind.
Billet does not rhyme with ballet,
Bouquet, wallet, mallet, chalet.
Blood and flood are not like food,
Nor is mould like should and would.
Viscous, viscount, load and broad,
Toward, to forward, to reward.
And your pronunciation’s OK
When you correctly say croquet,
Rounded, wounded, grieve and sieve,
Friend and fiend, alive and live.
Ivy, privy, famous; clamour
And enamour rhyme with hammer.
River, rival, tomb, bomb, comb,
Doll and roll and some and home.
Stranger does not rhyme with anger,
Neither does devour with clangour.
Souls but foul, haunt but aunt,
Font, front, wont, want, grand, and grant,
Shoes, goes, does. Now first say finger,
And then singer, ginger, linger,
Real, zeal, mauve, gauze, gouge and gauge,
Marriage, foliage, mirage, and age.
Query does not rhyme with very,
Nor does fury sound like bury.
Dost, lost, post and doth, cloth, loth.
Job, nob, bosom, transom, oath.
Though the differences seem little,
We say actual but victual.
Refer does not rhyme with deafer.
Feoffer does, and zephyr, heifer.
Mint, pint, senate and sedate;
Dull, bull, and George ate late.
Scenic, Arabic, Pacific,
Science, conscience, scientific.
Liberty, library, heave and heaven,
Rachel, ache, moustache, eleven.
We say hallowed, but allowed,
People, leopard, towed, but vowed.
Mark the differences, moreover,
Between mover, cover, clover;
Leeches, breeches, wise, precise,
Chalice, but police and lice;
Camel, constable, unstable,
Principle, disciple, label.
Petal, panel, and canal,
Wait, surprise, plait, promise, pal.
Worm and storm, chaise, chaos, chair,
Senator, spectator, mayor.
Tour, but our and succour, four.
Gas, alas, and Arkansas.
Sea, idea, Korea, area,
Psalm, Maria, but malaria.
Youth, south, southern, cleanse and clean.
Doctrine, turpentine, marine.
Compare alien with Italian,
Dandelion and battalion.
Sally with ally, yea, ye,
Eye, I, ay, aye, whey, and key.
Say aver, but ever, fever,
Neither, leisure, skein, deceiver.
Heron, granary, canary.
Crevice and device and aerie.
Face, but preface, not efface.
Phlegm, phlegmatic, ass, glass, bass.
Large, but target, gin, give, verging,
Ought, out, joust and scour, scourging.
Ear, but earn and wear and tear
Do not rhyme with here but ere.
Seven is right, but so is even,
Hyphen, roughen, nephew Stephen,
Monkey, donkey, Turk and jerk,
Ask, grasp, wasp, and cork and work.
Pronunciation (think of Psyche!)
Is a paling stout and spikey?
Won’t it make you lose your wits,
Writing groats and saying grits?
It’s a dark abyss or tunnel:
Strewn with stones, stowed, solace, gunwale,
Islington and Isle of Wight,
Housewife, verdict and indict.
Finally, which rhymes with enough,
Though, through, plough, or dough, or cough?
Hiccough has the sound of cup.
My advice is to give up!!!
The materials provided here are part of the 3-day tutorial program delivered to students of Tezpur University, Assam, preparing for UGC-NET.
In this tech-talk session, I want to talk about an application that has been a great help in bridging the language gap and the digital divide across the globe.
Chances are that you have already heard of Adaptxt. The name Adaptxt is a blend of two words, “Adapt” and “Text”. The idea is that the tool offered under the name Adaptxt adapts to the text, and it really lives up to its name.
Adaptxt is a keyboard-cum-dictionary application made particularly for mobile devices and tablet computers. As you are already aware, almost all of these devices come with only one language input tool, i.e. English. That is, if you want to write something on these devices in a language other than English, chances are that you will not find any way to type it, because your language is not supported by the device manufacturer. Device manufacturers such as Nokia, Samsung, Micromax and several others, working with different software platforms, take it for granted that people's basic need is to be able to type in English, and therefore provide support for just that language. Developing support for other languages requires much more than localization of the software platform. It requires a lot of feedback from native language experts (linguists) as well as other resources. Therefore, features like being able to type in Hindi, Bengali or Tamil are not available by default on mobiles or tablets that come with the Android, Symbian or Windows Mobile platforms. If you want to type in Hindi or one of numerous other languages, you will have to resort to external software and install it separately on your own.
For a long time after the advent of electronic communication for the common people, Hindi and other non-English languages were written in the Roman script itself. So much so that Hindi written in Roman rather than Devanagari, the official script of the language, has earned a character of its own and has contributed much to what is now being termed “Hinglish”.
No definition of Hinglish can be given for sure. No surveys are available to show who uses it, how much, and in what contexts. Hindi/Urdu/Hindustani written in Roman can comfortably be termed Hinglish. Going by this definition, we can safely say that a lot of such text is available online, and much of it has already passed through electronic communication channels (SMSes, mails etc.).
Users of Hinglish:
There is a lot of interest in 'Hinglish', particularly from the industry. Even though a purist view would reject the whole idea of Hinglish as an aberration, the term is going to stick. Hence the need to study this behaviour of the people for academic purposes as well.
Hinglish is used by both native and non-native speakers of Hindi/Urdu/Hindustani. Hindi/Urdu/Hindustani has traditionally been written in the Devanagari script or the Urdu (Nastaliq) script. But its Hinglish avatar is now written in Roman, and this has only increased the number of users of the language. Now the whole of South Asia can be said to write it, because Hindi/Urdu/Hindustani is the lingua franca of the region.
Fully dedicated to language and related work, LangLex is a startup venture working towards enriching linguistic resources in Indian languages through community efforts. As many Indian languages remain undocumented and under-represented on the web, this is an effort towards filling that gap.
LangLex intends to be a medium for developing language resources for people studying Indian languages and linguistics.
As a start, we have uploaded some language resources on this site that we think are useful for people studying languages. A major goal of this site is to run an online tool through which we can collect linguistic resources in different languages. These resources can be in any form, from just a few words to collections of sentences in languages that are less studied. Any person belonging to a lesser-known language group can submit words and sentences in a given format, which can later be used by linguists studying that language. Thus, we will have a community for each language, connected remotely.