
Browsing by Author "Elleni Sisay"

    Ethio-Semitic Proto-Language Reconstruction with In-Context Learning and LSTM Encode-Decode Model
    (Addis Ababa University, 2024-12) Elleni Sisay; Fitsum Assamnew
    As languages evolve, words acquire new meanings and lose old ones, making the reconstruction of earlier forms a critical area of study. Proto-Ethio-Semitic languages in particular remain underexplored despite their cultural and historical significance. This research investigates Historical Language Reconstruction (HLR) for Proto-Ethio-Semitic languages at the word level, focusing on two core objectives: cognate identification and proto-word reconstruction. A three-way dictionary was used to compile a dataset of 14,100 semantically related words from Amharic, Ge’ez, and Tigrinya. Linguists manually identified a gold-standard dataset of 74 cognate pairs from Swadesh-list concepts translated into the three languages of interest and reconstructed their proto-forms, while automated methods (SCA and LexStat) extracted an additional 1,847 cognates from the dataset, significantly increasing its scale. Building on these results, synthetic proto-forms were generated using in-context learning with GPT-4o, chosen on the strength of its 85% reconstruction accuracy when evaluated against the gold-standard data. Furthermore, an LSTM-based encoder-decoder model was trained on the generated data to predict proto-forms from cognates, achieving a prediction accuracy of 91% and an average edit distance of 0.21. This work establishes a foundation for reconstructing ancestral languages within the Afro-Semitic family by integrating linguistic expertise, automated cognate extraction tools, and state-of-the-art large language models. The findings underscore the potential of interdisciplinary approaches in preserving and understanding linguistic heritage, with implications for future studies in historical linguistics and language preservation.
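
    The two figures reported for the LSTM model, prediction accuracy and average edit distance, could be computed along the lines of the sketch below. This is only an illustration: the abstract does not say whether the edit distance is raw or length-normalized, nor how accuracy is defined, so the sketch assumes plain Levenshtein distance and exact-match accuracy. The function names `levenshtein` and `evaluate` and the sample strings are hypothetical and are not taken from the thesis.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: minimum number of single-character
    insertions, deletions, and substitutions turning `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def evaluate(predictions: Sequence[str],
             references: Sequence[str]) -> Tuple[float, float]:
    """Return (exact-match accuracy, mean edit distance).

    Assumption: accuracy counts a prediction as correct only when it is
    identical to the reference form; the thesis may define it differently.
    """
    if not predictions or len(predictions) != len(references):
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(predictions, references))
    exact = sum(p == r for p, r in pairs)
    total_distance = sum(levenshtein(p, r) for p, r in pairs)
    return exact / len(pairs), total_distance / len(pairs)


if __name__ == "__main__":
    # Placeholder strings only; these are not real reconstructed proto-forms.
    accuracy, mean_distance = evaluate(["abc", "abd"], ["abc", "abc"])
    print(accuracy, mean_distance)
```

    Dividing each distance by the length of the longer string would give a normalized variant bounded between 0 and 1, which is another common way to report this metric.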

Addis Ababa University © 2023