Abstract:
|
To better understand the errors that occur in personal names stored in information system databases, we analyzed a real corpus of errors. The analysis is carried out in terms of edit operations and edit distance. We present results on the distribution of edit operations depending on the edit distance, the letters involved, their position, the influence of length on error types, and the confusions between names. |