Knowledge Graph Construction Based on Ontology from Source Code: The Case of Python

dc.contributor.advisor: Getahun, Fekade (PhD)
dc.contributor.author: Tadiwos, Amanuel
dc.date.accessioned: 2021-04-12T09:19:46Z
dc.date.accessioned: 2023-11-29T04:06:26Z
dc.date.available: 2021-04-12T09:19:46Z
dc.date.available: 2023-11-29T04:06:26Z
dc.date.issued: 2021-03-29
dc.description.abstract: Technology companies and online communities have grown rapidly, resulting in the release of millions of software products. Source code holds important information about software and its business logic, so a semantically well-linked and organized system for managing code data has become a crucial issue in software engineering. This study presents an automatic method for constructing a knowledge graph from Python source code based on a domain ontology. The method allows software engineers in areas such as online communities, open-source development, knowledge management, expert systems, and the semantic web to understand and process code semantically. A supervised Bi-LSTM (bidirectional Long Short-Term Memory) network with a CRF (Conditional Random Field) layer on top was used to extract candidate terms as concepts/entities. The models were defined manually and trained automatically and simultaneously on a labeled corpus. Placing a CRF on top of the Bi-LSTM optimizes the classification of terms in a given source code. In addition to the default CRF features, some features to be extracted from source code were defined, which helped the model learn constraints for classification. A Bi-LSTM model was then adopted to extract relations (taxonomic and non-taxonomic). Relations among concepts were extracted at both the term level and the code level, and the results were merged using max pooling. Experiments on the SNIPS-NLU library (a Python library for natural language processing) show the relevance and feasibility of the proposed approach. Evaluation was done in two ways: against a gold-standard ontology developed by an expert, and by expert evaluation. The experiments show that the approach achieved an average F-measure of 77.04 and an average relevance of 81.275 based on expert evaluation. These results suggest that recurrent neural networks are efficient and promising for entity and relation extraction from Python and other related programming languages. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/26069
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Knowledge Graph [en_US]
dc.subject: Knowledge Graph Construction [en_US]
dc.subject: Ontology [en_US]
dc.subject: Ontology Learning [en_US]
dc.subject: Semantic Web [en_US]
dc.subject: Knowledge-Base [en_US]
dc.title: Knowledge Graph Construction Based on Ontology from Source Code: The Case of Python [en_US]
dc.type: Thesis [en_US]
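
The abstract says that relations extracted at the term level and at the code level were merged using max pooling, but gives no further detail. The following is a minimal sketch of what such a merge could look like, assuming each level produces one score per relation label for a given concept pair. The label names, the score values, and the function names are hypothetical and are not taken from the thesis.

```python
from typing import List

# Hypothetical relation labels, for illustration only.
RELATION_LABELS: List[str] = ["subclass_of", "calls", "no_relation"]


def merge_by_max_pooling(term_scores: List[float],
                         code_scores: List[float]) -> List[float]:
    """Return the element-wise maximum of two equally long score vectors."""
    if len(term_scores) != len(code_scores):
        raise ValueError("score vectors must have the same length")
    return [max(t, c) for t, c in zip(term_scores, code_scores)]


def predict_relation(term_scores: List[float],
                     code_scores: List[float]) -> str:
    """Pick the label with the highest merged score."""
    merged = merge_by_max_pooling(term_scores, code_scores)
    best_index = max(range(len(merged)), key=merged.__getitem__)
    return RELATION_LABELS[best_index]


# Assumed scores for one concept pair (made-up numbers).
term_level = [0.2, 0.7, 0.1]
code_level = [0.6, 0.3, 0.1]

print(merge_by_max_pooling(term_level, code_level))  # [0.6, 0.7, 0.1]
print(predict_relation(term_level, code_level))      # calls
```

In this sketch each label keeps the stronger of its two scores, so a relation that is only evident at one level is not averaged away by the other.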

Files

Original bundle
Name: Amanuel Tadiwos 2021.pdf
Size: 1.8 MB
Format: Adobe Portable Document Format