Computer & Information Science Department Theses and Dissertations

Permanent URI for this collection

https://hdl.handle.net/1805/2053

For more information about the Computer & Information Science graduate programs visit: http://www.science.iupui.edu.

Browse

Now showing 1 - 10 of 124

Identifying High Acute Care Users Among Bipolar and Schizophrenia Patients
(2023-12) Li, Shuo; Ben-Miled, Zina; Fang, Shiaofen; Zheng, Jiang Yu
The electronic health record (EHR) documents the patient’s medical history, with information such as demographics, diagnostic history, procedures, laboratory tests, and observations made by healthcare providers. This source of information can help support preventive health care and management. The present thesis explores the potential for EHR-driven models to predict acute care utilization (ACU) which is defined as visits to an emergency department (ED) or inpatient hospitalization (IH). ACU care is often associated with significant costs compared to outpatient visits. Identifying patients at risk can improve the quality of care for patients and can reduce the need for these services making healthcare organizations more cost-effective. This is important for vulnerable patients including those suffering from schizophrenia and bipolar disorders. This study compares the ability of the MedBERT architecture, the MedBERT+ architecture and standard machine learning models to identify at risk patients. MedBERT is a deep learning language model which was trained on diagnosis codes to predict the patient’s at risk for certain disease conditions. MedBERT+, the architecture introduced in this study is also trained on diagnosis codes. However, it adds socio-demographic embeddings and targets a different outcome, namely ACU. MedBERT+ outperformed the original architecture, MedBERT, as well as XGB achieving an AUC of 0.71 for both bipolar and schizophrenia patients when predicting ED visits and an AUC of 0.72 for bipolar patients when predicting IH visits. For schizophrenia patients, the IH predictive model had an AUC of 0.66 requiring further improvements. One potential direction for future improvement is the encoding of the demographic variables. Preliminary results indicate that an appropriate encoding of the age of the patient increased the AUC of Bipolar ED models to up to 0.78.
Deep Learning Based Methods for Automatic Extraction of Syntactic Patterns and their Application for Knowledge Discovery
(2023-12-28) Kabir, Md. Ahsanul; Hasan, Mohammad Al; Mukhopadhyay, Snehasis; Tuceryan, Mihran; Fang, Shiaofen
Semantic pairs, which consist of related entities or concepts, serve as the foundation for comprehending the meaning of language in both written and spoken forms. These pairs enable to grasp the nuances of relationships between words, phrases, or ideas, forming the basis for more advanced language tasks like entity recognition, sentiment analysis, machine translation, and question answering. They allow to infer causality, identify hierarchies, and connect ideas within a text, ultimately enhancing the depth and accuracy of automated language processing. Nevertheless, the task of extracting semantic pairs from sentences poses a significant challenge, necessitating the relevance of syntactic dependency patterns (SDPs). Thankfully, semantic relationships exhibit adherence to distinct SDPs when connecting pairs of entities. Recognizing this fact underscores the critical importance of extracting these SDPs, particularly for specific semantic relationships like hyponym-hypernym, meronym-holonym, and cause-effect associations. The automated extraction of such SDPs carries substantial advantages for various downstream applications, including entity extraction, ontology development, and question answering. Unfortunately, this pivotal facet of pattern extraction has remained relatively overlooked by researchers in the domains of natural language processing (NLP) and information retrieval. To address this gap, I introduce an attention-based supervised deep learning model, ASPER. ASPER is designed to extract SDPs that denote semantic relationships between entities within a given sentential context. I rigorously evaluate the performance of ASPER across three distinct semantic relations: hyponym-hypernym, cause-effect, and meronym-holonym, utilizing six datasets. My experimental findings demonstrate ASPER's ability to automatically identify an array of SDPs that mirror the presence of these semantic relationships within sentences, outperforming existing pattern extraction methods by a substantial margin. Second, I want to use the SDPs to extract semantic pairs from sentences. I choose to extract cause-effect entities from medical literature. This task is instrumental in compiling various causality relationships, such as those between diseases and symptoms, medications and side effects, and genes and diseases. Existing solutions excel in sentences where cause and effect phrases are straightforward, such as named entities, single-word nouns, or short noun phrases. However, in the complex landscape of medical literature, cause and effect expressions often extend over several words, stumping existing methods, resulting in incomplete extractions that provide low-quality, non-informative, and at times, conflicting information. To overcome this challenge, I introduce an innovative unsupervised method for extracting cause and effect phrases, PatternCausality tailored explicitly for medical literature. PatternCausality employs a set of cause-effect dependency patterns as templates to identify the key terms within cause and effect phrases. It then utilizes a novel phrase extraction technique to produce comprehensive and meaningful cause and effect expressions from sentences. Experiments conducted on a dataset constructed from PubMed articles reveal that PatternCausality significantly outperforms existing methods, achieving a remarkable order of magnitude improvement in the F-score metric over the best-performing alternatives. I also develop various PatternCausality variants that utilize diverse phrase extraction methods, all of which surpass existing approaches. PatternCausality and its variants exhibit notable performance improvements in extracting cause and effect entities in a domain-neutral benchmark dataset, wherein cause and effect entities are confined to single-word nouns or noun phrases of one to two words. Nevertheless, PatternCausality operates within an unsupervised framework and relies heavily on SDPs, motivating me to explore the development of a supervised approach. Although SDPs play a pivotal role in semantic relation extraction, pattern-based methodologies remain unsupervised, and the multitude of potential patterns within a language can be overwhelming. Furthermore, patterns do not consistently capture the broader context of a sentence, leading to the extraction of false-positive semantic pairs. As an illustration, consider the hyponym-hypernym pattern the w of u which can correctly extract semantic pairs for a sentence like the village of Aasu but fails to do so for the phrase the moment of impact. The root cause of this limitation lies in the pattern's inability to capture the nuanced meaning of words and phrases in a sentence and their contextual significance. These observations have spurred my exploration of a third model, DepBERT which constitutes a dependency-aware supervised transformer model. DepBERT's primary contribution lies in introducing the underlying dependency structure of sentences to a language model with the aim of enhancing token classification performance. To achieve this, I must first reframe the task of semantic pair extraction as a token classification problem. The DepBERT model can harness both the tree-like structure of dependency patterns and the masked language architecture of transformers, marking a significant milestone, as most large language models (LLMs) predominantly focus on semantics and word co-occurrence while neglecting the crucial role of dependency architecture. In summary, my overarching contributions in this thesis are threefold. First, I validate the significance of the dependency architecture within various components of sentences and publish SDPs that incorporate these dependency relationships. Subsequently, I employ these SDPs in a practical medical domain to extract vital cause-effect pairs from sentences. Finally, my third contribution distinguishes this thesis by integrating dependency relations into a deep learning model, enhancing the understanding of language and the extraction of valuable semantic associations.
Lane-based Weaving Area Traffic Analysis Using Field Camera Video Data
(2023-12) Lin, Wei; Tuceryan, Mihran; Chien, Stanley; Raje, Rajeev; Christopher, Lauren
Vehicle weaving describes the lane-changing actions of vehicles, which is a critical aspect of traffic management and road design. This study focused on the weaving behavior of vehicles occurring between ramp merge and diverge areas. Weaving in these areas causes congestion and increases the risk of accidents, especially during heavy traffic. Redesigning such areas for enhanced safety requires a comprehensive analysis of the traffic conditions. Obtaining the weaving pattern is a challenge in the traffic industry. To address this challenge, we leveraged AI and image processing technology to develop algorithms for quantitative analysis of weaving using surveillance videos at the consecutive ramp merge and diverge areas. This approach can also determine the weaving patterns of passenger cars and trucks respectively. The experimental results captured the lane-based weaving behavior of around 30% of vehicles in the favorable areas. The captured weaving data is used as weaving data samples to derive an overall analysis of a weaving location. Remarkably, our approach can reduce the manual processing time for weaving analysis by more than 90%, making this highly practical for use.
Trustworthy AI: Ensuring Explainability & Acceptance
(2023-12) Kaur, Davinder; Durresi, Arjan; Tuceryan, Mihran; Dundar, Murat; Hu, Qin
In the dynamic realm of Artificial Intelligence (AI), this study explores the multifaceted landscape of Trustworthy AI with a dedicated focus on achieving both explainability and acceptance. The research addresses the evolving dynamics of AI, emphasizing the essential role of human involvement in shaping its trajectory. A primary contribution of this work is the introduction of a novel "Trustworthy Explainability Acceptance Metric", tailored for the evaluation of AI-based systems by field experts. Grounded in a versatile distance acceptance approach, this metric provides a reliable measure of acceptance value. Practical applications of this metric are illustrated, particularly in a critical domain like medical diagnostics. Another significant contribution is the proposal of a trust-based security framework for 5G social networks. This framework enhances security and reliability by incorporating community insights and leveraging trust mechanisms, presenting a valuable advancement in social network security. The study also introduces an artificial conscience-control module model, innovating with the concept of "Artificial Feeling." This model is designed to enhance AI system adaptability based on user preferences, ensuring controllability, safety, reliability, and trustworthiness in AI decision-making. This innovation contributes to fostering increased societal acceptance of AI technologies. Additionally, the research conducts a comprehensive survey of foundational requirements for establishing trustworthiness in AI. Emphasizing fairness, accountability, privacy, acceptance, and verification/validation, this survey lays the groundwork for understanding and addressing ethical considerations in AI applications. The study concludes with an exploration of quantum alternatives, offering fresh perspectives on algorithmic approaches in trustworthy AI systems. This exploration broadens the horizons of AI research, pushing the boundaries of traditional algorithms. In summary, this work significantly contributes to the discourse on Trustworthy AI, ensuring both explainability and acceptance in the intricate interplay between humans and AI systems. Through its diverse contributions, the research offers valuable insights and practical frameworks for the responsible and ethical deployment of AI in various applications.
Rewiring Police Officer Training Networks to Reduce Forecasted Use of Force
(2023-08) Pandey, Ritika; Mohler, George; Hill, James; Hasan, Mohammad Al; Mukhopadhyay, Snehasis
Police use of force has become a topic of significant concern, particularly given the disparate impact on communities of color. Research has shown that police officer involved shootings, misconduct and excessive use of force complaints exhibit network effects, where officers are at greater risk of being involved in these incidents when they socialize with officers who have a history of use of force and misconduct. Given that use of force and misconduct behavior appear to be transmissible across police networks, we are attempting to address if police networks can be altered to reduce use of force and misconduct events in a limited scope. In this work, we analyze a novel dataset from the Indianapolis Metropolitan Police Department on officer field training, subsequent use of force, and the role of network effects from field training officers. We construct a network survival model for analyzing time-to-event of use of force incidents involving new police trainees. The model includes network effects of the diffusion of risk from field training officers (FTOs) to trainees. We then introduce a network rewiring algorithm to maximize the expected time to use of force events upon completion of field training. We study several versions of the algorithm, including constraints that encourage demographic diversity of FTOs. The results show that FTO use of force history is the best predictor of trainee's time to use of force in the survival model and rewiring the network can increase the expected time (in days) of a recruit's first use of force incident by 8%. We then discuss the potential benefits and challenges associated with implementing such an algorithm in practice.
Improving the Robustness of Artificial Neural Networks via Bayesian Approaches
(2023-08) Zhuang, Jun; Al Hasan, Mohammad; Mukhopadhyay, Snehasis; Mohler, George; Tuceryan, Mihran
Artificial neural networks (ANNs) have achieved extraordinary performance in various domains in recent years. However, some studies reveal that ANNs may be vulnerable in three aspects: label scarcity, perturbations, and open-set emerging classes. Noisy labeling and self-supervised learning approaches address the label scarcity issues, but most of the work couldn't handle the perturbations. Adversarial training methods, topological denoising methods, and mechanism designing methods aim to mitigate the negative effects caused by perturbations. However, adversarial training methods can barely train a robust model under the circumstance of extensive label scarcity; topological denoising methods are not efficient on dynamic data structures; and mechanism designing methods often depend on heuristic explorations. Detection-based methods devote to identifying novel or anomaly instances for further downstream tasks. Nonetheless, such instances may belong to open-set new emerging classes. To embrace the aforementioned challenges, we address the robustness issues of ANNs from two aspects. First, we propose a series of Bayesian label transition models to improve the robustness of Graph Neural Networks (GNNs) in the presence of label scarcity and perturbations in the graph domain. Second, we propose a new non-exhaustive learning model, named NE-GM-GAN, to handle both open-set problems and class-imbalance issues in network intrusion datasets. Extensive experiments with several datasets demonstrate that our proposed models can effectively improve the robustness of ANNs.
Enabling Real Time Instrumentation Using Reservoir Sampling and Binpacking
(2023-05) Meruga, Sai Pavan Kumar; Hill, James H.; Durresi, Arjan; Zheng, Jiang Yu
This thesis investigates the overhead added by reservoir sampling algorithm at different levels of granularity in real-time instrumentation of a distributed software systems. Firstly, this thesis not only discusses the inconsistencies found in the implementation of the reservoir sampling pintool in paper [ 1 ] but also provides the correct implementation. Secondly, this thesis provides the design and implementation of pintools for different level of granularities i.e., thread level, image level and routine level. Additionally, we provide quantitative comparison of performance for different sampling techniques (including reservoir sampling) at different levels of granularity. Based on the insights obtained from the empirical results, to enable real time instrumentation, we need to scale and manage the resources in the best way possible. To scale the reservoir sampling algorithm on a real time software system we integrate the traditional bin packing approach with the instrumentation in such a way that there is a decrease in the memory usage and improve the performance. The results of this research show that percentage difference between overhead added by Reservoir and Constant Sampling at a Image level granularity is 1.74%, at a Routine level granularity is 0.3% percent, at a Thread level granularity is 0.035%. Additionally, when we use bin packing technique along with reservoir sampling it normalizes the memory usage/performance runtime for Reservoir Sampling across multiple threads and different system visibility levels.
Automatic Extraction of Computer Science Concept Phrases Using a Hybrid Machine Learning Paradigm
(2023-05) Jahin, S M Abrar; Al Hasan, Mohammad; Fang, Shiaofen; Mukhopadhyay, Snehasis
With the proliferation of computer science in recent years in modern society, the number of computer science-related employment is expanding quickly. Software engineer has been chosen as the best job for 2023 based on pay, stress level, opportunity for professional growth, and balance between work and personal life. This was decided by a rankings of different news, journals, and publications. Computer science occupations are anticipated to be in high demand not just in 2023, but also for the foreseeable future. It's not surprising that the number of computer science students at universities is growing and will continue to grow. The enormous increase in student enrolment in many subdisciplines of computers has presented some distinct issues. If computer science is to be incorporated into the K-12 curriculum, it is vital that K-12 educators are competent. But one of the biggest problems with this plan is that there aren't enough trained computer science professors. Numerous new fields and applications, for instance, are being introduced to computer science. In addition, it is difficult for schools to recruit skilled computer science instructors for a variety of reasons including low salary issue. Utilizing the K-12 teachers who are already in the schools, have a love for teaching, and consider teaching as a vocation is therefore the most effective strategy to improve or fix this issue. So, if we want teachers to quickly grasp computer science topics, we need to give them an easy way to learn about computer science. To simplify and expedite the study of computer science, we must acquaint school-treachers with the terminology associated with computer science concepts so they can know which things they need to learn according to their profile. If we want to make it easier for schoolteachers to comprehend computer science concepts, it would be ideal if we could provide them with a tree of words and phrases from which they could determine where the phrases originated and which phrases are connected to them so that they can be effectively learned. To find a good concept word or phrase, we must first identify concepts and then establish their connections or linkages. As computer science is a fast developing field, its nomenclature is also expanding at a frenetic rate. Therefore, adding all concepts and terms to the knowledge graph would be a challenging endeavor. Cre- ating a system that automatically adds all computer science domain terms to the knowledge graph would be a straightforward solution to the issue. We have identified knowledge graph use cases for the schoolteacher training program, which motivates the development of a knowledge graph. We have analyzed the knowledge graph's use case and the knowledge graph's ideal characteristics. We have designed a webbased system for adding, editing, and removing words from a knowledge graph. In addition, a term or phrase can be represented with its children list, parent list, and synonym list for enhanced comprehension. We' ve developed an automated system for extracting words and phrases that can extract computer science idea phrases from any supplied text, therefore enriching the knowledge graph. Therefore, we have designed the knowledge graph for use in teacher education so that school-teachers can educate K-12 students computer science topicses effectively.
Registration and Localization of Unknown Moving Objects in Markerless Monocular SLAM
(2023-05) Troutman, Blake; Tuceryan, Mihran; Fang, Shiaofen; Tsechpenakis, Gavriil; Hu, Qin
Simultaneous localization and mapping (SLAM) is a general device localization technique that uses realtime sensor measurements to develop a virtualization of the sensor's environment while also using this growing virtualization to determine the position and orientation of the sensor. This is useful for augmented reality (AR), in which a user looks through a head-mounted display (HMD) or viewfinder to see virtual components integrated into the real world. Visual SLAM (i.e., SLAM in which the sensor is an optical camera) is used in AR to determine the exact device/headset movement so that the virtual components can be accurately redrawn to the screen, matching the perceived motion of the world around the user as the user moves the device/headset. However, many potential AR applications may need access to more than device localization data in order to be useful; they may need to leverage environment data as well. Additionally, most SLAM solutions make the naive assumption that the environment surrounding the system is completely static (non-moving). Given these circumstances, it is clear that AR may benefit substantially from utilizing a SLAM solution that detects objects that move in the scene and ultimately provides localization data for each of these objects. This problem is known as the dynamic SLAM problem. Current attempts to address the dynamic SLAM problem often use machine learning to develop models that identify the parts of the camera image that belong to one of many classes of potentially-moving objects. The limitation with these approaches is that it is impractical to train models to identify every possible object that moves; additionally, some potentially-moving objects may be static in the scene, which these approaches often do not account for. Some other attempts to address the dynamic SLAM problem also localize the moving objects they detect, but these systems almost always rely on depth sensors or stereo camera configurations, which have significant limitations in real-world use cases. This dissertation presents a novel approach for registering and localizing unknown moving objects in the context of markerless, monocular, keyframe-based SLAM with no required prior information about object structure, appearance, or existence. This work also details a novel deep learning solution for determining SLAM map initialization suitability in structure-from-motion-based initialization approaches. This dissertation goes on to validate these approaches by implementing them in a markerless, monocular SLAM system called LUMO-SLAM, which is built from the ground up to demonstrate this approach to unknown moving object registration and localization. Results are collected for the LUMO-SLAM system, which address the accuracy of its camera localization estimates, the accuracy of its moving object localization estimates, and the consistency with which it registers moving objects in the scene. These results show that this solution to the dynamic SLAM problem, though it does not act as a practical solution for all use cases, has an ability to accurately register and localize unknown moving objects in such a way that makes it useful for some applications of AR without thwarting the system's ability to also perform accurate camera localization.
Mutual Learning Algorithms in Machine Learning
(2023-05) Chowdhury, Sabrina Tarin; Mukhopadhyay, Snehasis; Fang, Shiaofen; Tuceryan, Mihran
Mutual learning algorithm is a machine learning algorithm where multiple machine learning algorithms learns from different sources and then share their knowledge among themselves so that all the agents can improve their classification and prediction accuracies simultaneously. Mutual learning algorithm can be an efficient mechanism for improving the machine learning and neural network efficiency in a multi-agent system. Usually, in knowledge distillation algorithms, a big network plays the role of a static teacher and passes the data to smaller networks, known as student networks, to improve the efficiency of the latter. In this thesis, it is showed that two small networks can dynamically and interchangeably play the changing roles of teacher and student to share their knowledge and hence, the efficiency of both the networks improve simultaneously. This type of dynamic learning mechanism can be very useful in mobile environment where there is resource constraint for training with big dataset. Data exchange in multi agent, teacher-student network system can lead to efficient learning. The concept and the proposed mutual learning algorithm are demonstrated using convolutional neural networks (CNNs) and Support Vector Machine (SVM) to recognize the pattern recognition problem using MNIST hand-writing dataset. The concept of machine learning is applied in the field of natural language processing (NLP) too. Machines with basic understanding of human language are getting increasingly popular in day-to-day life. Therefore, NLP-enabled machines with memory efficient training can potentially become an indispensable part of our life in near future. A classic problem in the field of NLP is news classification problem where news articles from newspapers are classified by news categories by machine learning algorithms. In this thesis, we show news classification implemented using Naïve Bayes and support vector machine (SVM) algorithm. Then we show two small networks can dynamically play the changing roles of teacher and student to share their knowledge on news classification and hence, the efficiency of both the networks improves simultaneously. The mutual learning algorithm is applied between homogenous algorithms first, i.e., between two Naive Bayes algorithms and two SVM algorithms. Then the mutual learning is demonstrated between heterogenous agents, i.e., between one Naïve Bayes and one SVM agent and the relative efficiency increase between the agents is discussed before and after mutual learning.

Browse

Recent Submissions