- All words must belong to a language
- All words must be spelled as they are used in that language
- Only data that is at least loosely structured can be parsed, and only parsed data can become a candidate for inclusion in the database
If this means that some data cannot be converted, I find that disturbing. However, I do not see how it could be done any other way.
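To make the consequence concrete, here is a minimal sketch of how such a filter might behave. Everything in it is an assumption for illustration: the `language: word` line format, the `KNOWN_LANGUAGES` set, and the names `Entry`, `parse_entry` and `split_candidates` are hypothetical and do not describe any actual import tool. The spelling rule is not checked here, since that would need a per-language word list.

```python
from typing import NamedTuple, Optional

# Hypothetical set of language codes the database knows about.
KNOWN_LANGUAGES = {"en", "nl", "de"}


class Entry(NamedTuple):
    language: str
    word: str


def parse_entry(line: str) -> Optional[Entry]:
    """Return an Entry if the line is structured enough to parse, else None."""
    # Assumed input format: "<language code>: <word>"
    language, sep, word = line.partition(":")
    language, word = language.strip(), word.strip()
    if not sep or not word:
        return None  # not structured enough to parse
    if language not in KNOWN_LANGUAGES:
        return None  # the word must belong to a known language
    return Entry(language, word)


def split_candidates(lines):
    """Separate parsable candidates from lines that cannot be converted."""
    candidates, rejected = [], []
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            rejected.append(line)
        else:
            candidates.append(entry)
    return candidates, rejected
```

Keeping the rejected lines in a separate list, rather than dropping them silently, would at least make it visible how much data cannot be converted.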