Munkhtsetseg Namsrai, Institute of Language and Literature, Mongolian Academy of Sciences, Ulaanbaatar, Mongolia
This study examines the terminology management tool ProTerm for its suitability for managing Mongolian terminology. Although there are plenty of terminology management tools, most of them do not support Asian languages, including Mongolian; they support only well-resourced languages such as English, German, French and Spanish. Since the manual collection of terms is laborious, costly, and time-consuming, we need an appropriate terminology management tool to ease this heavy workload. In this regard, I tested whether ProTerm can help us manage terms in Mongolian. The results of the experiment show that ProTerm can be a time-saving and efficient tool for our terminology work, since it extracted more acceptable terms than expected and allows us to create a termbase with our desired terminology entries.
termbank, terminology management system, terminology extraction, Mongolian term, terminology resources, noise, silence
Mohamed Amine Menacer and Kamel Smaïli, Université de Lorraine, CNRS, LORIA, F-54000 Nancy, France
The Arabic language has many varieties, including its standard form, Modern Standard Arabic (MSA), and its spoken forms, namely the dialects. These dialects are representative examples of under-resourced languages, for which automatic speech recognition is still considered an open problem. To address this issue, we recorded several hours of spoken Algerian dialect and used them to train a baseline model. This model was then boosted by taking advantage of other languages that influence this dialect, integrating their data into one large corpus and investigating three approaches: multilingual training, multitask learning and transfer learning. The best performance was achieved using a limited and balanced amount of acoustic data from each additional language, as compared to the data size of the studied dialect. This approach led to an improvement of 3.8% in terms of word error rate in comparison to the baseline system trained only on the dialect data.
Automatic speech recognition, Algerian dialect, MSA, multilingual training, multitask learning, transfer learning
Rupali Batta, Notre Dame High School, San Jose, United States
Question Answering (QA) has attracted much attention due to its widespread use in search engines, semantic parsing, and knowledge representation. QA is the process of generating a suitable answer to a question posed by a user, given a reference text. A crucial shortcoming of existing QA systems is the requirement for a large quantity of reference data. This research aims to facilitate efficient machine understanding of natural language by introducing an automated Question Answering system that has a deeper understanding of the reference text. The tool takes as input a corpus of English sentences and their semantic annotations, or Abstract Meaning Representations (AMRs). It then processes the AMR trees and chooses the best generated question using a language model. The output file contains a QA pair for each unique word in each sentence of the corpus. This is a novel application of AMRs, which have previously been used to manually create QA pairs on a small scale, but have not been used computationally en masse. By converting complete corpora into QA pairs, this research can be a useful tool for Natural Language Processing datasets and language understanding. The evaluation metrics suggest that this method of QA pair generation should support significant future work.
Natural Language Processing, Language Understanding and Computational Semantics, Neural Networks, Question Answering, Language Modelling, Abstract Meaning Representations.
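As a toy illustration of the pipeline described above (not the author's implementation), the sketch below takes an AMR graph encoded as (source, role, target) triples and emits a cloze-style QA pair for each concept that surfaces in the sentence; the language-model step that picks the best question is omitted, and all names are hypothetical.

```python
def amr_concepts(triples):
    """Return {variable: concept} from (source, role, target) triples."""
    return {s: t for s, role, t in triples if role == ":instance"}

def generate_qa_pairs(sentence, triples):
    """Mask each concept word that appears in the sentence to form a
    cloze question; the answer is the masked word."""
    pairs = []
    words = sentence.split()
    for concept in amr_concepts(triples).values():
        base = concept.split("-")[0]  # drop sense suffix, e.g. want-01 -> want
        if base in words:
            question = " ".join("what" if w == base else w for w in words) + "?"
            pairs.append((question, base))
    return pairs

# AMR triples for "the boy wants to go"
triples = [
    ("w", ":instance", "want-01"),
    ("b", ":instance", "boy"),
    ("g", ":instance", "go-01"),
    ("w", ":ARG0", "b"),
    ("w", ":ARG1", "g"),
    ("g", ":ARG0", "b"),
]
qa_pairs = generate_qa_pairs("the boy wants to go", triples)
```

A real system would generate several candidate phrasings per concept and keep the one a language model scores as most fluent.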
Zhihao Zheng, Yao Zhang, Vinay Gurram, Jose Salazar Useche, Isabella Roth, Yi Hu, Department of Computer Science, Northern Kentucky University, Highland Heights, Kentucky USA 41099
At present, development and innovation in any business or engineering field are inseparable from the computer and network infrastructure that supports the core business. The world has entered an era of rapid development of information technology. Every year, more individuals and companies start using cloud storage and other cloud services for computing and information storage. Therefore, the security of sensitive information in the cloud becomes a very important challenge that needs to be addressed. Cloud authentication is a special form of authentication for today's enterprise IT infrastructure. Cloud applications communicate with an LDAP server, which could be an on-premises directory server or an identity management service running in the cloud. Due to the complex nature of cloud authentication, an effective and fast authentication scheme is required for successful cloud applications. In this study, we designed several cloud authentication schemes to integrate an on-premises or cloud-based directory service with a cloud application. We also discussed the pros and cons of the different approaches to illustrate best practices on this topic.
Cloud Application Authentication, Identity Management in Cloud, IAM
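One common pattern behind such schemes is issuing a signed token after the directory service has authenticated the user, so the cloud application can verify later requests without contacting the LDAP server each time. The sketch below is an illustrative, simplified stand-in (HMAC over a base64 payload; `SECRET`, `issue_token` and `verify_token` are hypothetical names, and a real deployment would use a standard such as JWT with managed keys):

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # illustrative only; never hard-code keys in practice

def issue_token(user, ttl=3600):
    """Sign a small claims payload after the directory service says OK."""
    claims = {"sub": user, "exp": time.time() + ttl}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def verify_token(token):
    """Return the claims if the signature matches and the token is fresh."""
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims if claims["exp"] > time.time() else None

token = issue_token("alice")
claims = verify_token(token)
```

The design choice is the usual trade-off: stateless tokens cut per-request directory traffic at the cost of slower revocation.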
Olu Amusan, Dominic Carrillo, Luke Hillard, Department of Computer Science and Engineering, University of North Texas
Current research in the automotive industry has been pushing object detection to new heights by adding real-time object detection. The accuracy of most classification models is not adequate when detecting vehicles across multiple different scenarios, especially in real time. In more direct terms, what we aim to improve is the loss of objects between frames. We propose to train our model on diverse images of vehicles in different scenarios, using a novel approach that creates composite frames in the training data by overlaying the previous one and two frames over the original frame. Experimental results will demonstrate how our classification models improve in detection with this novel training-data approach. The impact we hope to achieve is safer autonomous driving through improved training data.
autonomous vehicles, YOLOv4, real-time detection, object detection
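The composite-frame idea can be sketched as a weighted overlay of the current frame with the previous two; the weights below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def composite_frame(f_t, f_t1, f_t2, weights=(0.6, 0.25, 0.15)):
    """Blend the current frame f_t with the previous frames f_t1 and f_t2.
    The blend weights are illustrative; the current frame dominates so that
    boxes stay anchored while motion cues from earlier frames are preserved."""
    w0, w1, w2 = weights
    blended = (w0 * f_t.astype(np.float32)
               + w1 * f_t1.astype(np.float32)
               + w2 * f_t2.astype(np.float32))
    return np.clip(blended, 0, 255).astype(np.uint8)

# Three dummy 4x4 RGB frames with uniform intensities 100, 150, 200
frames = [np.full((4, 4, 3), v, dtype=np.uint8) for v in (100, 150, 200)]
out = composite_frame(frames[2], frames[1], frames[0])
```

Each composite keeps the original frame's labels, so the detector sees where an object just was as well as where it is.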
Kaustuv Kunal, Littilabs.com, India
Serverless architectures are cost-effective, fast, reliable and require little maintenance. Building such systems for a big data setup is a challenging task, especially for start-ups. This paper proposes a baseline serverless, large-scale, end-to-end batch log processing architecture for data analytics and modelling, followed by a case study. The four-layer FaaS architecture is effective, low-maintenance and inexpensive, and can be set up on any public cloud. Apart from the typical serverless advantages, it also aids data management and user profiling.
Big Data Processing, Cloud Computing, Serverless Architecture, Batch Processing, Public Cloud
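A minimal sketch of one processing layer in such an architecture, assuming a Lambda-style `handler(event, context)` entry point and Apache-style log lines (both assumptions; in a real setup the lines would arrive from object storage rather than inline in the event):

```python
import json
import re

# Simplified Apache common-log pattern: ip, user, timestamp, request, status
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<ts>[^\]]+)\] '
    r'"(?P<req>[^"]*)" (?P<status>\d{3})'
)

def handler(event, context=None):
    """FaaS entry point: parse raw log lines and aggregate per-user hit
    counts, the kind of profiling aggregate a later layer would consume."""
    counts = {}
    for line in event["lines"]:
        m = LOG_PATTERN.match(line)
        if m:
            counts[m.group("user")] = counts.get(m.group("user"), 0) + 1
    return {"statusCode": 200, "body": json.dumps(counts)}

event = {"lines": [
    '1.2.3.4 - alice [10/Oct/2021:13:55:36 +0000] "GET /a HTTP/1.1" 200',
    '1.2.3.4 - alice [10/Oct/2021:13:55:37 +0000] "GET /b HTTP/1.1" 404',
    '5.6.7.8 - bob [10/Oct/2021:13:55:38 +0000] "GET /a HTTP/1.1" 200',
]}
result = handler(event)
```

In a four-layer design, a function like this sits between ingestion and the analytics/modelling layers, each layer triggering the next through cloud events.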
Khalid Amen, Mohamed Zohdy and Mohammed Mahmoud, Oakland University, USA
With the increase in heart disease rates at advanced ages, we need to put a high-quality algorithm in place to predict the presence of heart disease at an early stage and thus prevent it. Machine learning approaches have previously been used to predict whether patients have heart disease. The purpose of this work is to compare two more algorithms (NB, KNN) with our previous work in predicting the five stages of heart disease: no disease, stage 1, stage 2, stage 3, and advanced (severe) heart disease. We found that the LR algorithm performs better than the other two algorithms. The experimental results show that LR performs best with an accuracy of 82%, followed by NB with an accuracy of 79%, when all three classifiers are compared and evaluated based on accuracy, precision, recall and F-measure.
Machine Learning (ML), Logistic Regression (LR), Naïve Bayes (NB), K-Nearest Neighbors (KNN).
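A minimal sketch of such a three-classifier comparison, using scikit-learn on synthetic five-class data (the paper's clinical dataset, features and exact settings are not reproduced here):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# 5 classes stand in for the five stages (no disease .. severe); 13 features
# is a common size for public heart-disease data, assumed here for illustration
X, y = make_classification(n_samples=600, n_features=13, n_informative=8,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
scores = {name: accuracy_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
```

On real data one would also report precision, recall and F-measure, as the paper does, e.g. via `sklearn.metrics.classification_report`.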
Alberto F. de Oliveira Jr1*†, Marcelo Querino Lima Afonso3†, Manuel Lemos1, Noel Lopes2, 1 - Universidade da Beira Interior, R. Marquês de Ávila e Bolama, 6201-001, Covilhã/Portugal, 2 – Instituto Politécnico da Guarda, Av.ª Dr. Francisco Sá Carneiro, n.º 50, 6300-559, Guarda/Portugal, 3 – Universidade Federal de Minas Gerais, Av. Pres. Antônio Carlos, 6627 – Pampulha, 31270-901, Belo Horizonte – Minas Gerais/Brazil
Together, Oxytocin and Vasopressin make up the neurohypophysial hormones, a family of structurally and functionally related peptide hormones. However, the biological function of these proteins may vary depending on their taxonomic classification. In our study, using a broad range of bioinformatics and machine learning techniques, we describe the role of sets of coevolved amino acids in determining the taxonomic classes of neurohypophysial hormone sequences. Moreover, certain taxonomic classes can still be classified from the presence of specific amino acids from these coevolved sets, shedding more light on how molecular evolution can describe structure and function.
oxytocin, vasopressin, evolution, coevolution of amino acids, coevolved sets, machine learning, molecular phylogeny, neurohypophysial hormones.
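Coevolving positions in an alignment are often scored by the mutual information between two columns; a minimal sketch of that score (illustrative only, not the study's full pipeline):

```python
import math
from collections import Counter

def column_mi(col_a, col_b):
    """Mutual information (in bits) between two alignment columns; high MI
    suggests the residues at the two positions vary together, i.e. coevolve."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum((c / n) * math.log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
               for (x, y), c in pab.items())

# Toy columns over four sequences: the first pair covaries perfectly,
# the second pair varies independently
mi_coupled = column_mi("CCDD", "YYFF")
mi_indep = column_mi("CCDD", "YFYF")
```

Pairs with high MI across the family can then be grouped into coevolved sets and fed to a classifier, in the spirit of the approach above.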
Sabrina Luftensteiner and Michael Zwick, Software Competence Center Hagenberg, Hagenberg, Austria
Recently, the amount of data available from industrial processes has been increasing heavily. This trend is driven by machines being equipped with ever more sensors, which continuously produce data for further analysis and processing. This paper proposes a framework for improving offline learning models through the use of such online data, focusing on minimizing the catastrophic forgetting that arises in online learning scenarios. The framework incorporates several state-of-the-art deep learning and machine learning methods and enables simple comparisons between the proposed methods. The methods range from memory-based approaches to methods for loss calculation and optimizers in deep learning. The proposed framework is specifically tailored to regression problems in the industrial field. It can cope with single-task as well as multi-task models and is easily expandable. Furthermore, it offers various configuration possibilities for adapting to a given problem.
Online Learning, Catastrophic Forgetting, Regression, Domain Adaptation.
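One memory-based ingredient of such a framework can be sketched as a reservoir-sampling replay buffer that mixes stored offline samples into each online batch, so gradient updates keep revisiting the old distribution (class and method names are hypothetical):

```python
import random

class ReplayBuffer:
    """Fixed-capacity memory kept as a uniform reservoir over everything
    seen so far; replaying it alongside fresh data limits forgetting."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        """Reservoir sampling: each of the `seen` samples ends up stored
        with equal probability capacity / seen."""
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = sample

    def mixed_batch(self, online_batch, k):
        """Return the online batch padded with k replayed samples."""
        return online_batch + self.rng.sample(self.data, min(k, len(self.data)))

buf = ReplayBuffer(capacity=32)
for i in range(1000):
    buf.add(i)
batch = buf.mixed_batch(online_batch=["a", "b"], k=8)
```

In a regression setting the stored samples would be (features, target) pairs; other methods in the framework (loss regularization, optimizer choices) would plug in at the training-step level instead.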
Tingyan Deng1 and Keivan Stassun2, 1Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee, 2Department of Physics & Astronomy
Autistic Spectrum Disorder (ASD) is a developmental disability that can affect communication and behavior, causing significant social, communication, and behavioral challenges. From a rare childhood disorder, ASD has evolved into a disorder that is found, according to the National Institutes of Health, in 1% to 2% of the population in high-income countries. An early and accurate diagnosis can not only help doctors find the disease early, leading to more timely treatment for the patient, but can also save significant healthcare costs. With the rapid growth of ASD cases, many open-source ASD-related datasets have been created for scientists and doctors to investigate the disease. The Autistic Spectrum Disorder Screening Data for Adult dataset is well known and contains 20 features that can be used for further analysis of the potential causes and prediction of ASD. In this paper, we developed an autism classification algorithm based on a logistic regression model. The model can predict ASD with an average F1 score of 0.97, which demonstrates the superiority and feasibility of the proposed model. In addition, data visualization techniques were used to display several feature-distribution images so that readers can better understand the data and the related feature engineering.
ASD, Logistic Regression, Classification, Machine Learning, Neurodiversity.
Tingyan Deng and Tyler Derr, Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee
Social network analysis emerged as an important research topic in sociology decades ago, and it has now attracted researchers from various fields of study. In contemporary society, a significant amount of research has been conducted using social network analysis techniques to design e-commerce recommender systems. Since its invention, e-commerce took just a few decades to transform how people shop. During the pandemic, e-commerce became an essential component of many people's lives, as social gathering was prohibited. The form of e-commerce is also going through a transformation. The traditional Business-to-Customer (B2C) style of e-commerce (e.g. Sephora, Costco), where companies sell their services or products directly to customers, used to occupy the market, but nowadays Customer-to-Customer (C2C) businesses (e.g. Alibaba Taobao, Facebook Marketplace), where sellers and buyers can interact, are gaining popularity. At the core of e-commerce is the recommender system of each business, which filters the information overload in e-commerce. Traditional recommender systems developed for B2C are not suitable for C2C because of the inclusion of multiple sellers. The challenge for C2C e-commerce now is to recommend an item together with the associated seller to the buyer. Since in C2C models the business recommends not only an item but also a seller, the recommender system is more complicated and carries much more information than traditional B2C models. In addition, there is a lack of research on C2C e-commerce models; therefore, we put forward a new recommender system for C2C e-commerce. In this paper, we consider users and their transactions as a network and build a recommender system based on graph attention networks (GATs), using the Bonanza dataset, one of the first two real-world C2C marketplace datasets, provided by Dr. Tyler Derr, to demonstrate the effectiveness of our model.
Customer-to-Customer E-Commerce, Recommender System, Graph Attention Networks, Graph Neural Network, Machine Learning.
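A single-head graph attention layer, the building block of GATs, can be sketched in NumPy as follows (a didactic O(n²) version over a toy buyer–seller–item graph, not the paper's model):

```python
import numpy as np

def gat_layer(H, adj, W, a):
    """Single-head graph attention layer in the style of Velickovic et al.:
    e_ij = LeakyReLU(a^T [W h_i || W h_j]), softmax over each node's
    neighbours, then a weighted sum of transformed neighbour features."""
    Z = H @ W                              # (n, d') transformed features
    n = Z.shape[0]
    e = np.full((n, n), -np.inf)           # -inf masks non-edges in softmax
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                s = a @ np.concatenate([Z[i], Z[j]])
                e[i, j] = np.where(s > 0, s, 0.2 * s)  # LeakyReLU, slope 0.2
    alpha = np.exp(e - e.max(axis=1, keepdims=True))   # row-wise softmax
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha @ Z

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))    # 4 nodes, e.g. buyer, seller, two items
adj = np.array([[1, 1, 1, 0],  # adjacency with self-loops
                [1, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1]])
W = rng.normal(size=(3, 2))    # feature transform
a = rng.normal(size=4)         # attention vector, length 2 * d'
out = gat_layer(H, adj, W, a)
```

In a C2C recommender, the learned attention weights let an item's representation depend on which seller offers it, which is exactly the extra coupling B2C models lack.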
Jing Zhu, Aidong Deng, Shuo Xue, Xue Ding, Shun Zhang, School of Energy and Environment, Southeast University, Nanjing, China
Rolling bearings are important mechanical components, and their condition monitoring and fault diagnosis play an important role in the stable operation of machinery. This paper presents an intelligent diagnosis method for rolling bearings based on VMD-CWT feature extraction and MobileNet. VMD is used to extract the signal features, and then the continuous wavelet transform is used to extract time-frequency features. After image enhancement, the MobileNet network is trained. To accelerate convergence, this paper adds transfer learning to the network training process, migrating the pretrained weights of the first several layers to the corresponding network. Experimental results on bearing fault datasets show that the proposed method can effectively identify the fault types of rolling bearings, and that the VMD-CWT method effectively improves the accuracy of the MobileNet network while reducing the number of neural network parameters.
MobileNet, VMD, CWT, Rolling bearing.
Sabila Al Jannat, Tanjina Hoque, Nafisa Alam Supti and Prof. Dr. Ashraful Alam, Department of Computer Engineering and Science, BRAC University, Dhaka, Bangladesh
Accurate detection of white matter lesions in 3D Magnetic Resonance Images (MRIs) of patients with Multiple Sclerosis (MS) is essential for the diagnosis and treatment evaluation of MS. Detecting MS early and estimating its progression are strenuous but necessary for optimal treatment of the disease. In this study, we propose efficient Multiple Sclerosis detection techniques to improve the performance of a supervised machine learning algorithm and classify the progression of the disease. Detection of MS lesions becomes more intricate due to unbalanced data with a very small number of lesion pixels. Our pipeline is evaluated on MS patients' data from the Laboratory of Imaging Technologies. Fluid-attenuated inversion recovery (FLAIR) series are incorporated to build a faster system while maintaining readability and accuracy. Our approach is based on convolutional neural networks (CNNs). We trained the model using transfer learning and used softmax as the activation function to classify the progression of the disease. Our results show that MRIs can be used effectively to predict disease progression. Manual detection of lesions by clinical experts is complicated and time-consuming, as a large amount of MRI data must be analysed. We analyse the accuracy of the proposed model on the dataset; our approach achieves an accuracy of up to 98.24%.
Magnetic Resonance Imaging(MRI), Machine Learning, Multiple Sclerosis(MS), 3D Magnetic Resonance Imaging, White Matter Lesion Detection, Deep Learning, Convolutional Neural Network(CNN), Fluid-attenuated inversion recovery(FLAIR), Data Augmentation, Image Processing.
Monika Agrawal1 and Dr M Nageswara Rao2, 1Research Scholar, Department of Computer Science & Engineering, K L University, Vijayawada, Andhra Pradesh, India, 2Department of Computer Science & Engineering, K L University, Vijayawada, Andhra Pradesh, India
Sentiment analysis includes methods and techniques that allow businesses to understand and analyse customer reviews, feedback and opinions on a particular product or service. It uses Natural Language Processing (NLP) tools to analyse the feelings, emotions, attitudes, opinions and thoughts behind the words. Sentiments such as positive, negative and neutral are associated with a particular product. Sentiment analysis is applicable in multiple domains, such as customer feedback on a particular product, movie reviews, and social and political comments. This survey focuses on different aspect-based word embedding models and aspect-based sentiment classification techniques, where the goal is to extract key features from sentences and classify sentiment on entities at the document level. Aspect-Based Sentiment Analysis (ABSA) is a technique that considers not only the entire sentence but also analyses key terms explicitly to predict the polarity as a whole. An ABSA model accepts aspect categories and their corresponding aspect terms to generate the sentiment corresponding to each aspect in the text corpus. This article provides a comprehensive survey of different word embedding models under the CNN framework for aspect extraction and of different machine learning techniques applicable to sentiment classification.
Aspect sentiment, filtering, classification, polarity.
Sareh Aghaei and Anna Fensel, Semantic Technology Institute (STI) Innsbruck, Department of Computer Science, University of Innsbruck, Innsbruck, Austria
Finding similar entities among knowledge graphs is an essential research problem for knowledge integration and knowledge graph connection. This paper aims at finding semantically similar entities between two knowledge graphs, which can help end users and search agents access pertinent information across knowledge graphs more effectively and easily. Given a query entity in one knowledge graph (the first KG), the proposed approach tries to find the most similar entity in another knowledge graph (the second KG). The main idea is to leverage graph embedding, clustering, regression and sentence embedding. In this approach, RDF2Vec is employed to generate vector representations of all entities of the second knowledge graph, and the vectors are then clustered based on cosine similarity using the K-medoids algorithm. An artificial neural network with a multilayer perceptron topology is then used as the regression model to predict the corresponding vector in the second knowledge graph for a given vector from the first knowledge graph. After determining the cluster of the predicted vector, the entities of the detected cluster are ranked using the Sentence-BERT method, and finally the entity with the highest rank is chosen as the most similar one. To evaluate the proposed approach, extensive experiments have been conducted on real-world knowledge graphs. The experimental results demonstrate the effectiveness of the proposed approach.
Knowledge Graph, Similar Entity, Graph Embedding, Clustering, Regression, Sentence Embedding.
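The last two stages of this pipeline, locating the cluster of the predicted vector and ranking that cluster's entities, can be sketched as below, with plain cosine similarity standing in for the Sentence-BERT ranking and all toy vectors, labels and medoids assumed for illustration:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def most_similar_entity(pred_vec, entity_vecs, labels, medoid_ids):
    """Pick the cluster whose medoid is closest to the predicted vector,
    then return the index of the best-ranked entity inside that cluster."""
    medoid = max(medoid_ids, key=lambda m: cosine(pred_vec, entity_vecs[m]))
    cluster = [i for i, lab in enumerate(labels) if lab == labels[medoid]]
    return max(cluster, key=lambda i: cosine(pred_vec, entity_vecs[i]))

entity_vecs = np.array([[1.0, 0.0], [0.9, 0.1],   # toy second-KG embeddings
                        [0.0, 1.0], [0.1, 0.9]])
labels = [0, 0, 1, 1]               # cluster assignment from K-medoids
medoid_ids = [0, 2]                 # one medoid per cluster
predicted = np.array([0.2, 0.95])   # output of the regression model
best = most_similar_entity(predicted, entity_vecs, labels, medoid_ids)
```

Restricting the ranking to one cluster is the efficiency argument of the approach: only a small candidate set, not the whole second KG, is scored per query.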
Nasim Sadat Mosavi and Manuel Filipe Santos, Algoritmi Research Centre, University of Minho, Guimaraes, Portugal
Research is ongoing all over the world to identify the barriers to, and find effective solutions for, accelerating the adoption of Precision Medicine (PM) in the healthcare industry. Yet there has not been a valid and practical model to tackle the several challenges that have slowed the widespread uptake of this clinical practice. This study aims to highlight the major limitations of, and considerations for, implementing Precision Medicine. Two theories, Diffusion of Innovation and Socio-Technical theory, are employed to discuss the success indicators of PM adoption. Through this theoretical assessment, two key theoretical gaps are identified and the related findings are discussed.
Precision Medicine, Adoption, Artificial Intelligence, Healthcare Big Data, Open data exchange, Genomes, Biological indicators, Standards, Internet of Things.
Ramagiri Venkata Ramana Chary, Information Technology, BV Raju Institute of Technology, Narsapur, Medak Dist., Telangana, India
Nowadays, people freely express their opinions in multiple ways through microblogs such as Twitter and Facebook on the Web. These channels are rich social media repositories that provide an emerging channel for web users to express their sentiments in a multimodal way, composed of images, video, short text, and emoticons. Such multimodal social media has significant applications ranging from event monitoring and social network analytics to commercial recommendations. Existing unimodal sentiment analysis relies purely on the textual modality or on emoticons alone. In the present work, we predict the sentiment of multimodal microblogs by including all types of data; specifically, we use two types of data, texts and images. To analyse the images, we use the Haar feature classifier, which uses the integral image to calculate the value of a feature; for the text analysis, we use an artificial neural network trained on a large dataset of different texts and their emotions. A few sample images and texts were used to test this work, and we were able to predict the sentiment correctly.
Sentiment prediction, Unimodal, Multimodal, Weakly supervised learning.
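The "rectangle integral" behind Haar features is the summed-area table, which turns any rectangle sum into an O(1) lookup; a minimal sketch:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[y, x] = img[:y, :x].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, h, w):
    """Sum over any rectangle in O(1); a Haar feature value is the
    difference between such sums for adjacent light and dark rectangles."""
    return int(ii[top + h, left + w] - ii[top, left + w]
               - ii[top + h, left] + ii[top, left])

img = np.arange(16).reshape(4, 4)   # toy 4x4 grayscale image
ii = integral_image(img)
```

Because every feature evaluation costs four lookups regardless of rectangle size, a cascade of thousands of Haar features remains fast enough for real-time image analysis.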
Vagif Gasimov and Shahla Aliyeva, Department of Computer Systems and Networks, Azerbaijan Technical University, Baku, Azerbaijan
The article is devoted to the study of blockchain technology and related key technologies. It provides information on the essence of blockchain technology and considers its use in IoT, Smart Home and Smart City environments, as well as its application to create a secure IoT, Smart Home and Smart City infrastructure.
Blockchain, IoT, Smart City, IoT Security.
Biao Chen and TengFei Li, School of Software Engineering, East China Normal University, Shanghai, China
The autonomous driving scenario involves abundant spatio-temporal data and dynamic stochastic behaviours, which makes modeling and verifying the safety of the scenario challenging. In this paper, we propose a Scenario Modeling Language (SCML) for autonomous driving. SCML can not only express the stochastic dynamic behaviours of autonomous driving, but also abstract the primary objects and state transitions to model the autonomous driving scenario. First, we provide the syntax and semantics of SCML. Then, we construct a metamodel of SCML and propose mapping rules to transform the SCML model into a network of stochastic hybrid automata (NSHA) model. Based on the NSHA model, UPPAAL-SMC is used to verify the autonomous driving scenario. Finally, we use a forward collision warning system to illustrate that the proposed approach can effectively model and verify the driving scenario.
Autonomous Driving Scenario, SCML, NSHA, UPPAAL-SMC, Formal Verification.
Mworia Daniel, Nderu Lawrence and Kimwele Michael, Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Kenya
There are many calls from software engineering scholars to treat non-functional requirements as first-class citizens in the software development process. In Software Product Line Engineering, the emphasis is on the explicit definition of functional requirements using feature models, while non-functional requirements are considered implicit. In this paper we present an integrated requirements specification template for common quality attributes alongside functional requirements at software product line variation points. This approach, applied at the analytical description phase, increases the visibility of quality requirements, obliging developers to consider them in subsequent phases. The approach weaves quality requirements into the associated functional requirements through a higher-level feature abstraction method. This work therefore promotes the achievement of system quality by elevating non-functional requirement specification. The approach is illustrated with an exemplar case study of mobile phone family data storage requirements.
Software Product Line Engineering, Functional and Non-functional requirements, Quality attributes, feature variability, integration and requirements specification.