NSF SHF: Medium: Collaborative Research: Semantically-Enhanced Software Traceability for Supporting Human-Centric Tasks
Project Description (NSF CCF-1901059)
Achieving accurate, complete, trustworthy, and usable traceability across software-intensive systems can be extremely beneficial, as the underlying network of traceability links can be used to answer diverse questions about the software system and its development process. However, in practice, many software projects suffer from inadequate or inaccurate traceability due to the cost and difficulty of manually creating and maintaining trace links. This research will develop a holistic, interactive tracing environment, which incorporates diverse algorithmic solutions for dynamically generating trace links, visualizing the results, and guiding the user through the interactive process of using the results to support diverse software engineering tasks. Automating the creation and maintenance of accurate traceability links offers significant potential for industrial impact. For example, traceability is required by certifying bodies in safety-critical domains and can help in the construction and delivery of high-quality, competitive, timely products. The cross-disciplinary nature of the team will introduce new opportunities for software engineers, data scientists, and human-computer interaction experts to collaborate in addressing open software engineering challenges and will provide research opportunities for diverse and underrepresented students.
The research will explore challenging problems at the intersection of software engineering, semantic text mining, and visualization. It will directly address one of the prominent causes of trace-link inaccuracy: the inability of current algorithms to reason over the deep semantics of underlying software artifacts such as requirements, design, and code. First, the researchers will investigate semantically enhanced traceability algorithms that generate trace links, even in the absence of shared textual representations. Second, the work will develop a holistic tracing solution that dynamically configures a trace engine to leverage diverse tracing techniques such as semantic traceability, trace link evolution, and other existing techniques. Finally, given a diverse set of trace links with different degrees of accuracy, the research team will design, develop, and publicly release a novel, interactive, visual interface that enables users to understand the provenance and trustworthiness of each link while providing a clear rationale for trace query results.
Faculty
Research Assistants
Publications
- Biomedical Knowledge Graphs Construction from Conditional Statements
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
- Enhancing Taxonomy Completion with Concept Generation via Fusing Relational Representations
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2021.
- Technical Question Answering across Tasks and Domains
Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT), 2021.
- Traceability Transformed: Generating More Accurate Links with Pre-Trained BERT Models
International Conference on Software Engineering (ICSE), 2021. (ACM SIGSOFT Distinguished Paper Award)
- Use of Internal Knowledge: Biomedical Literature Search Liberated From External Resources
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2020.
- A Technical Question Answering System with Transfer Learning
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
- Tri-Train: Automatic Pre-fine Tuning between Pre-training and Fine-tune Training for SciNER
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
- Towards Semantically Guided Traceability
IEEE International Requirements Engineering Conference (RE), 2020.
- Crossing Variational Autoencoders for Answer Retrieval
Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
- CTGA: Graph-based Biomedical Literature Search
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2019.
- Multi-input Multi-output Sequence Labeling for Joint Extraction of Fact and Condition Tuples from Scientific Text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
- The Role of 'Condition': A Novel Scientific Knowledge Graph Representation and Construction Model
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2019.