Bots have three major limitations when it comes to speech and recognition: vocabulary, sentence analysis, and response. In designing a bot, I would suggest first allowing for a learning process derived from input, much like how humans learn by observing speech firsthand and associating actions and events with that speech. This "learning" process would require a database of known words and the attributes of each word (noun, verb, etc.). The bot would then use its existing words to derive new words for its database by finding key learning phrases in the input. For example, if someone were to answer "How are you?" with "I am lonely", the bot would identify the phrase "I am" and categorize the lowercase word that follows, "lonely", as an adjective. This is possible because simple grammatical statements like "I am" are limited in number in the English language.
The bot, however, does not need to have these phrases hardcoded. It can derive new grammatical statements and put them into working groups for learning purposes by using its database of known words. For example, if the bot were to consistently find a certain phrase preceding adjectives, such as "I be", it could identify that phrase and record it among the phrases used to learn adjectives. So if users consistently answered "How are you?" with "I be ___", the bot could eventually identify "I be" as a replacement for "I am" and even use it in its own speech (though this leaves a hole where grammatical correctness is concerned). The bot could even start conversations by asking "How you be?", using the connection it has made between "am" and "be". This helps combat a natural problem of language: it is always changing within circles of its speakers, dialects develop, and the natural presentation of the language shifts over time. The major issue here is that the bot requires a database.
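To make the learning idea above a little more concrete, here is a rough Python sketch. It is only one possible reading of the design, not a finished implementation: the class name LearningBot, the seed words, and the promote_after threshold are all assumptions made for illustration, and a sentence is treated simply as "phrase + final word".

```python
from collections import Counter


class LearningBot:
    """Toy sketch: learns adjectives from phrases like "I am ___"."""

    def __init__(self, promote_after=3):
        # Known words and their attribute (assumed seed data).
        self.lexicon = {"happy": "adjective", "sad": "adjective", "tired": "adjective"}
        # Phrases known to introduce a given kind of word.
        self.learning_phrases = {"adjective": {("i", "am")}}
        # How often an unknown phrase has preceded a known kind of word.
        self.candidates = Counter()
        # Hypothetical threshold for "consistently" seeing a phrase.
        self.promote_after = promote_after

    def observe(self, sentence):
        tokens = sentence.strip(" .!?").split()
        if len(tokens) < 2:
            return
        prefix = tuple(word.lower() for word in tokens[:-1])
        last = tokens[-1]
        # Only lowercase words are considered, as in the "lonely" example.
        if not last.islower():
            return

        # Case 1: a known learning phrase teaches the bot a new word.
        for kind, phrases in self.learning_phrases.items():
            if prefix in phrases:
                self.lexicon.setdefault(last, kind)
                return

        # Case 2: an unknown phrase precedes a known word, so count it and
        # promote it to a learning phrase once it has been seen often enough.
        kind = self.lexicon.get(last)
        if kind is not None:
            self.candidates[(prefix, kind)] += 1
            if self.candidates[(prefix, kind)] >= self.promote_after:
                self.learning_phrases.setdefault(kind, set()).add(prefix)


bot = LearningBot()
bot.observe("I am lonely")        # "lonely" is learned via "I am"
for reply in ("I be happy", "I be sad", "I be tired"):
    bot.observe(reply)            # "I be" keeps preceding known adjectives
bot.observe("I be hungry")        # "I be" now teaches new adjectives too
```

Counting a phrase a few times before trusting it is just one way to express "consistently"; a real bot would probably also want to weigh how many different users produced the phrase.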
Though relying on a database is traditionally considered substandard practice in bot development, running the bot in an environment like AOL Instant Messenger would make it perfectly feasible. Since the server that hosts the bot can stay active in a fixed location, its database can be constantly updated and protected. This makes for a bot that is almost perfectly tailored to that environment, with the benefit that it can learn by imitating actual human speech, which is what made the chatterbot Eliza so convincing. Compared to Eliza, though, this design should address some of her shortcomings. The trouble with Eliza is that she is tied specifically to psychotherapy. Her keywords are limited to things like "mother", "family", "childhood", etc. When communicating with Eliza, it is important not to stray beyond that area of speech, or the user will be endlessly bombarded with "tell me more" (which seems to be Eliza's default response). With the classic bot, there is none of the open-ended analysis that makes human counterparts so intriguing. The magic of a self-modifying bot is that there can be.
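As for the database the bot would keep on its server, one simple way to sketch it is with Python's built-in sqlite3 module. This is only an assumption about how the storage might look: the file name bot_lexicon.db, the table layout, and the helper names open_lexicon, remember, and lookup are all made up for illustration.

```python
import sqlite3


def open_lexicon(path="bot_lexicon.db"):
    """Open (or create) the bot's word database at a hypothetical path."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lexicon ("
        "word TEXT PRIMARY KEY, part_of_speech TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def remember(conn, word, part_of_speech):
    """Store a newly learned word; an existing entry is left untouched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO lexicon (word, part_of_speech) VALUES (?, ?)",
            (word.lower(), part_of_speech),
        )


def lookup(conn, word):
    """Return the stored attribute for a word, or None if it is unknown."""
    row = conn.execute(
        "SELECT part_of_speech FROM lexicon WHERE word = ?",
        (word.lower(),),
    ).fetchone()
    return row[0] if row else None
```

A bot process could call open_lexicon() once at startup, remember() whenever it learns a word from a conversation, and lookup() while analyzing a sentence, so that what it learns survives restarts of the server.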
October 6th, 2005