Natural language handling
I am a (very amateur) natural-language enthusiast, and have started
to write some software related to this interest. However, I'm not a
computational linguist -- this is just a hobby as far as the natural
language side goes.
This has become MuLVoc on SourceForge.
- Show localization for source files, e.g. translations of
comments and perhaps names. Intended for (typically open-source)
developers in countries where English is not widely known;
particularly where large organizations could have translators
translate the comments on open-source code that their
programmers then work on.
- Switches input methods according to which column of a spreadsheet
point (the cursor) is in. This requires a special row of the
spreadsheet to set up its input methods. Useful for entering data
for use with mulvo. Builds on csv-mode.el by Francis J. Wright.
- Convert pinyin data coming in from files.
- List the words occurring in the symbols defined in the
currently running emacs. Meant to give an idea of what to
translate if preparing for emacs work using localized-source.el.
- Look up language codes, using the data from http://www.ethnologue.com/codes/LanguageCodes.tab on Ethnologue.
- Read a Swadesh list file from Wiktionary, and convert it
into mulvo format data.
Here is an extract from Wiktionary's
description of the Swadesh list:
The below list of words was devised by the linguist Morris Swadesh.
He used it as a means of determining the closeness of any pair of
languages. It is a useful list of the most common words, which are
essential to most languages and may be used in learning basic
communication in other languages and even multiple languages at once
since, for basic communication, vocabulary is generally more useful
than a knowledge of the target language syntax. Sometimes it is even
possible to learn basic communication with no knowledge of the target
I think I'll have to produce a parts-of-speech table for the
Swadesh list for myself, and will put it in my languages web
pages when I've done it.
- Add parts of speech to a Swadesh list.
- Read a phrase list, such as the one in the Wikipedia article
on the Basque language, and produce a CSV file from it, for
mulvo to use.
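- Support for the Irish (Gaelic) language. Modifies
capitalization to understand urú (eclipsis), where the prefixed
letter stays lower case and the original initial is capitalized,
as in "i mBaile".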
- Convert escape sequences to the characters they represent.
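
To make the Irish capitalization item above a bit more concrete, here is a minimal sketch of eclipsis-aware capitalization. It is written in Python rather than Emacs Lisp and is not the code from the package itself; the names ECLIPSED_CLUSTERS and capitalize_irish are made up for this illustration. It assumes the standard spelling of eclipsed consonants (mb, gc, nd, bhf, ng, bp, dt) and does no dictionary check, so a loanword that merely happens to begin with one of those clusters would be treated as eclipsed too.

```python
# Eclipsed initial clusters in standard Irish spelling, mapped to the number
# of leading letters that belong to the eclipsing prefix and stay lower case.
# (Hypothetical table for illustration; not taken from the elisp package.)
ECLIPSED_CLUSTERS = {
    "bhf": 2,  # bhFear
    "mb": 1,   # mBaile
    "gc": 1,   # gCorcaigh
    "nd": 1,   # nDoire
    "ng": 1,   # nGaillimh
    "bp": 1,   # bPáras
    "dt": 1,   # dTír
}


def capitalize_irish(word):
    """Capitalize a single word, leaving an eclipsis prefix in lower case.

    Like str.capitalize, everything after the capitalized letter is
    lower-cased. Words without an eclipsed cluster get an ordinary
    initial capital.
    """
    lower = word.lower()
    for cluster, keep in ECLIPSED_CLUSTERS.items():
        if lower.startswith(cluster):
            # Keep the eclipsing letters, capitalize the original initial.
            return lower[:keep] + lower[keep].upper() + lower[keep + 1:]
    return lower[:1].upper() + lower[1:]


if __name__ == "__main__":
    for w in ["mbaile", "bhfear", "gcorcaigh", "baile"]:
        print(w, "->", capitalize_irish(w))
```

Vowel-initial words eclipsed with "n-" (as in "n-athair") are left out of this sketch; they would need a separate rule.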
Last modified: Thu Sep 6 16:05:46 IST 2007