Category Archives: Machine Learning
Applications of Semi-supervised Learning part3(Machine Learning … – Medium
Author : Tao Wang, Yuanbin Chen, Xinlin Zhang, Yuanbo Zhou, Junlin Lan, Bizhe Bai, Tao Tan, Min Du, Qinquan Gao, Tong Tong
Abstract : Supervised learning algorithms based on Convolutional Neural Networks have become the benchmark for medical image segmentation tasks, but their effectiveness heavily relies on a large amount of labeled data. However, annotating medical image datasets is a laborious and time-consuming process. Inspired by semi-supervised algorithms that use both labeled and unlabeled data for training, we propose the PLGDF framework, which builds upon the mean teacher network for segmenting medical images with less annotation. We propose a novel pseudo-label utilization scheme, which combines labeled and unlabeled data to augment the dataset effectively. Additionally, we enforce the consistency between different scales in the decoder module of the segmentation network and propose a loss function suitable for evaluating the consistency. Moreover, we incorporate a sharpening operation on the predicted results, further enhancing the accuracy of the segmentation. Extensive experiments on three publicly available datasets demonstrate that the PLGDF framework can largely improve performance by incorporating the unlabeled data. Meanwhile, our framework yields superior performance compared to six state-of-the-art semi-supervised learning methods. The codes of this study are available at https://github.com/ortonwang/PLGDF.
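The abstract mentions a sharpening operation on the predicted results but does not spell out its form. A common recipe in semi-supervised segmentation is temperature sharpening of the soft predictions, sketched below; the function name and temperature value are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sharpen(probs: np.ndarray, T: float = 0.5) -> np.ndarray:
    """Temperature sharpening of per-pixel class probabilities.

    probs: array of shape (..., num_classes) with rows summing to 1.
    T < 1 pushes each distribution toward its argmax, yielding more
    confident pseudo-labels for the unlabeled images.
    """
    p = probs ** (1.0 / T)
    return p / p.sum(axis=-1, keepdims=True)

# A soft two-class prediction becomes more peaked after sharpening.
print(sharpen(np.array([0.6, 0.4])))  # -> approximately [0.69, 0.31]
```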
2. SSASS: Semi-Supervised Approach for Stenosis Segmentation (arXiv)
Author : In Kyu Lee, Junsup Shin, Yong-Hee Lee, Jonghoe Ku, Hyun-Woo Kim
Abstract : Coronary artery stenosis is a critical health risk, and its precise identification in Coronary Angiography (CAG) can significantly aid medical practitioners in accurately evaluating the severity of a patient's condition. The complexity of coronary artery structures combined with the inherent noise in X-ray images poses a considerable challenge to this task. To tackle these obstacles, we introduce a semi-supervised approach for cardiovascular stenosis segmentation. Our strategy begins with data augmentation, specifically tailored to replicate the structural characteristics of coronary arteries. We then apply a pseudo-label-based semi-supervised learning technique that leverages the data generated through our augmentation process. Impressively, our approach demonstrated an exceptional performance in the Automatic Region-based Coronary Artery Disease diagnostics using x-ray angiography imagEs (ARCADE) Stenosis Detection Algorithm challenge by utilizing a single model instead of relying on an ensemble of multiple models. This success emphasizes our method's capability and efficiency in providing an automated solution for accurately assessing stenosis severity from medical imaging data.
See the rest here:
Applications of Semi-supervised Learning part3(Machine Learning ... - Medium
Applications of Semi-supervised Learning part2(Machine Learning … – Medium
Author : Yue Fan, Anna Kukleva, Dengxin Dai, Bernt Schiele
Abstract : Semi-supervised learning (SSL) methods effectively leverage unlabeled data to improve model generalization. However, SSL models often underperform in open-set scenarios, where unlabeled data contain outliers from novel categories that do not appear in the labeled set. In this paper, we study the challenging and realistic open-set SSL setting, where the goal is to both correctly classify inliers and to detect outliers. Intuitively, the inlier classifier should be trained on inlier data only. However, we find that inlier classification performance can be largely improved by incorporating high-confidence pseudo-labeled data, regardless of whether they are inliers or outliers. Also, we propose to utilize non-linear transformations to separate the features used for inlier classification and outlier detection in the multi-task learning framework, preventing adverse effects between them. Additionally, we introduce pseudo-negative mining, which further boosts outlier detection performance. The three ingredients lead to what we call Simple but Strong Baseline (SSB) for open-set SSL. In experiments, SSB greatly improves both inlier classification and outlier detection performance, outperforming existing methods by a large margin. Our code will be released at https://github.com/YUE-FAN/SSB.
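To illustrate the abstract's idea of separating, via non-linear transformations, the features used for inlier classification from those used for outlier detection, here is a minimal PyTorch sketch of two projection heads on a shared backbone feature. The layer sizes and names are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SSBHeads(nn.Module):
    """Two non-linear projections on shared backbone features: one feeds
    the inlier classifier, the other the outlier detector, so the two
    tasks do not interfere through a single raw feature space."""

    def __init__(self, d_feat: int, n_classes: int, d_proj: int = 128):
        super().__init__()
        self.proj_cls = nn.Sequential(nn.Linear(d_feat, d_proj), nn.ReLU())
        self.proj_ood = nn.Sequential(nn.Linear(d_feat, d_proj), nn.ReLU())
        self.classifier = nn.Linear(d_proj, n_classes)  # inlier classes
        self.detector = nn.Linear(d_proj, 1)            # inlier-vs-outlier score

    def forward(self, feats: torch.Tensor):
        logits = self.classifier(self.proj_cls(feats))
        ood_score = self.detector(self.proj_ood(feats)).squeeze(-1)
        return logits, ood_score
```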
2. MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries (arXiv)
Author : Shuai Wang, Tengjin Weng, Jingyi Wang, Yang Shen, Zhidong Zhao, Yixiu Liu, Pengfei Jiao, Zhiming Cheng, Yaqi Wang
Abstract : Medical image segmentation annotations exhibit variations among experts due to the ambiguous boundaries of segmented objects and backgrounds in medical images. Although using multiple annotations for each image in the fully-supervised setting has been extensively studied for training deep models, obtaining a large amount of multi-annotated data is challenging due to the substantial time and manpower costs required for segmentation annotations, resulting in most images lacking any annotations. To address this, we propose Multi-annotated Semi-supervised Ensemble Networks (MSE-Nets) for learning segmentation from limited multi-annotated and abundant unannotated data. Specifically, we introduce the Network Pairwise Consistency Enhancement (NPCE) module and Multi-Network Pseudo Supervised (MNPS) module to enhance MSE-Nets for the segmentation task by considering two major factors: (1) to optimize the utilization of all accessible multi-annotated data, the NPCE separates (dis)agreement annotations of multi-annotated data at the pixel level and handles agreement and disagreement annotations in different ways, (2) to mitigate the introduction of imprecise pseudo-labels, the MNPS extends the training data by leveraging consistent pseudo-labels from unannotated data. Finally, we improve confidence calibration by averaging the predictions of base networks. Experiments on the ISIC dataset show that we reduced the demand for multi-annotated data by 97.75% and narrowed the gap with the best fully-supervised baseline to just a Jaccard index of 4%. Furthermore, compared to other semi-supervised methods that rely only on a single annotation or a combined fusion approach, the comprehensive experimental results on the ISIC and RIGA datasets demonstrate the superior performance of our proposed method in medical image segmentation with ambiguous boundaries.
Originally posted here:
Applications of Semi-supervised Learning part2(Machine Learning ... - Medium
Applications of Semi-supervised Learning part1(Machine Learning … – Medium
Author : Hao Dong, Gaëtan Frusque, Yue Zhao, Eleni Chatzi, Olga Fink
Abstract : Anomaly detection (AD) is essential in identifying rare and often critical events in complex systems, finding applications in fields such as network intrusion detection, financial fraud detection, and fault detection in infrastructure and industrial systems. While AD is typically treated as an unsupervised learning task due to the high cost of label annotation, it is more practical to assume access to a small set of labeled anomaly samples from domain experts, as is the case for semi-supervised anomaly detection. Semi-supervised and supervised approaches can leverage such labeled data, resulting in improved performance. In this paper, rather than proposing a new semi-supervised or supervised approach for AD, we introduce a novel algorithm for generating additional pseudo-anomalies on the basis of the limited labeled anomalies and a large volume of unlabeled data. This serves as an augmentation to facilitate the detection of new anomalies. Our proposed algorithm, named Nearest Neighbor Gaussian Mixup (NNG-Mix), efficiently integrates information from both labeled and unlabeled data to generate pseudo-anomalies. We compare the performance of this novel algorithm with commonly applied augmentation techniques, such as Mixup and Cutout. We evaluate NNG-Mix by training various existing semi-supervised and supervised anomaly detection algorithms on the original training data along with the generated pseudo-anomalies. Through extensive experiments on 57 benchmark datasets in ADBench, reflecting different data types, we demonstrate that NNG-Mix outperforms other data augmentation methods. It yields significant performance improvements compared to the baselines trained exclusively on the original training data. Notably, NNG-Mix yields up to 16.4%, 8.8%, and 8.0% improvements on Classical, CV, and NLP datasets in ADBench. Our source code will be available at https://github.com/donghao51/NNG-Mix
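As a rough illustration of the NNG-Mix idea (mixing labeled anomalies with their nearest unlabeled neighbors under Gaussian noise), here is a hedged Python sketch; the exact mixing coefficients, neighbor selection, and noise schedule in the paper may differ.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nng_mix(anomalies, unlabeled, k=5, alpha=0.3, sigma=0.05, n_new=100, seed=0):
    """Generate pseudo-anomalies by mixing each labeled anomaly with one
    of its k nearest unlabeled neighbors and adding Gaussian noise.

    anomalies: (n_anom, d) labeled anomaly samples.
    unlabeled: (n_unlab, d) unlabeled pool.
    Returns (n_new, d) synthetic pseudo-anomalies.
    """
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k).fit(unlabeled)
    _, idx = nn.kneighbors(anomalies)          # (n_anom, k) neighbor indices
    out = []
    for _ in range(n_new):
        i = rng.integers(len(anomalies))       # pick a labeled anomaly
        j = idx[i, rng.integers(k)]            # one of its unlabeled neighbors
        lam = rng.beta(alpha, alpha)           # mixup coefficient
        x = lam * anomalies[i] + (1 - lam) * unlabeled[j]
        out.append(x + rng.normal(0.0, sigma, size=x.shape))
    return np.array(out)
```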
2. Segment Together: A Versatile Paradigm for Semi-Supervised Medical Image Segmentation (arXiv)
Author : Qingjie Zeng, Yutong Xie, Zilin Lu, Mengkang Lu, Yicheng Wu, Yong Xia
Abstract : Annotation scarcity has become a major obstacle for training powerful deep-learning models for medical image segmentation, restricting their deployment in clinical scenarios. To address it, semi-supervised learning by exploiting abundant unlabeled data is highly desirable to boost the model training. However, most existing works still focus on limited medical tasks and underestimate the potential of learning across diverse tasks and multiple datasets. Therefore, in this paper, we introduce a Versatile Semi-supervised framework (VerSemi) to point out a new perspective that integrates various tasks into a unified model with a broad label space, to exploit more unlabeled data for semi-supervised medical image segmentation. Specifically, we introduce a dynamic task-prompted design to segment various targets from different datasets. Next, this unified model is used to identify the foreground regions from all labeled data, to capture cross-dataset semantics. Particularly, we create a synthetic task with a cutmix strategy to augment foreground targets within the expanded label space. To effectively utilize unlabeled data, we introduce a consistency constraint. This involves aligning aggregated predictions from various tasks with those from the synthetic task, further guiding the model in accurately segmenting foreground regions during training. We evaluated our VerSemi model on four public benchmarking datasets. Extensive experiments demonstrated that VerSemi can consistently outperform the second-best method by a large margin (e.g., an average 2.69% Dice gain on four datasets), setting new SOTA performance for semi-supervised medical image segmentation. The code will be released.
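The cutmix strategy mentioned in the abstract is a known augmentation pattern; a generic sketch for segmentation pairs is shown below. VerSemi's exact variant (which targets foreground regions across tasks) is not specified in this excerpt, so the rectangle-sampling scheme here is an assumption.

```python
import numpy as np

def cutmix_segmentation(img_a, mask_a, img_b, mask_b, rng=None):
    """Paste a random rectangle from image/mask B into image/mask A.

    img_*/mask_*: arrays whose last two axes are (H, W).
    Returns the mixed image and its correspondingly mixed mask.
    """
    rng = rng or np.random.default_rng()
    h, w = img_a.shape[-2:]
    ch, cw = rng.integers(h // 4, h // 2), rng.integers(w // 4, w // 2)
    y, x = rng.integers(0, h - ch), rng.integers(0, w - cw)
    img, mask = img_a.copy(), mask_a.copy()
    img[..., y:y+ch, x:x+cw] = img_b[..., y:y+ch, x:x+cw]
    mask[..., y:y+ch, x:x+cw] = mask_b[..., y:y+ch, x:x+cw]
    return img, mask
```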
View original post here:
Applications of Semi-supervised Learning part1(Machine Learning ... - Medium
‘Your United States was normal’: has translation tech really made … – The Conversation
Every day, millions of people start the day by posting a greeting on social media. None of them expect to be arrested for their friendly morning ritual.
But that's exactly what happened to a Palestinian construction worker in 2017, when the Arabic caption on his Facebook selfie, meaning "good morning", was auto-translated as "attack them".
A human Arabic speaker would have immediately recognized the phrase as an informal way to say good morning. Not so AI. Machines are notoriously bad at dealing with variation, a key characteristic of all human languages.
With recent advances in automated translation, the belief is taking hold that humans, particularly English speakers, no longer need to learn other languages. Why bother with the effort when Google Translate and a host of other apps can do it for us?
In fact, some Anglophone universities are making precisely this argument to dismantle their language programs.
Unfortunately, language technologies are nowhere near being able to replace human language skills and will not be able to do so in the foreseeable future because machine language learning and human language learning differ in fundamental ways.
For machine translation, algorithms are trained on large amounts of texts to find the probabilities of different patterns of words. These texts can be both monolingual and bilingual.
Bilingual training data comes in the form of human-translated parallel texts. These are almost always based on the standard version of the training language, excluding dialects and slang phrases, as in the example above.
Diversity is a characteristic of all human languages, but diversity is a problem for machines. For instance, deadly means causing death in most varieties of English, and that is what appears in the training data.
The Australian use of deadly to mean "excellent" (from Aboriginal English) puts a spanner in the works. If you input "Deadly Awards" into any translation app, what you'll get in your target language is the equivalent of "death-causing awards".
The internal linguistic diversity of English, as of any other language, is accompanied by great diversity across languages. Each language does things differently.
Tense, number or gender, for example, need to be grammatically encoded in some languages but not in others. Translating the simple English statement "I am a student" into German requires the inclusion of a grammatical gender marking and so will end up as either "Ich bin Student" ("I am a male student") or "Ich bin Studentin" ("I am a female student").
Furthermore, some languages are spoken by many people, have powerful nation states behind them, and are well resourced. Others are not.
Well resourced in the context of machine learning means that large digital corpora of training data are available.
The lists of language options offered by automated translation tools, such as the list of 133 languages in which Google Translate is currently available, erase all these differences and suggest that each option is the same.
Nothing could be further from the truth. English is in a class of its own, with over 90% of the training data behind large language models being in English.
The remainder comes from a few dozen languages, in which data of varying sizes are available. The majority of the world's 6,000+ languages are simply missing in action. Apps for some of these are now being created from models pre-trained on English, which further serves to cement the dominance of English.
One consequence of inequalities in the training data is that translations into English usually sound quite good because the app can draw both on bilingual and monolingual training data. This doesn't mean they are accurate: one recent study found about half of all questions in Vietnamese were incorrectly auto-translated as statements.
Machine-translated text into languages other than English is even more problematic and routinely riddled with mistakes. For instance, COVID-19 testing information auto-translated into German included invented words, grammatical errors, and inconsistencies.
Machine translation is not as good as most people think, but it is useful for getting the gist of websites or asking for directions in a tourist destination with the help of an app.
However, that is not where it ends. Translation apps are increasingly used in high-stakes contexts, such as hospitals, where staff may attempt to bypass human interpreters for quick communication with patients who have limited proficiency in English.
This causes big problems when, for instance, a patient's discharge instructions state the equivalent of "Your United States was normal", an error resulting from the abbreviation "US" being used for "ultrasound" in medical contexts.
Therefore, there is consensus that translation apps are suitable only in risk-free or low-risk situations. Unfortunately, sometimes even a caption on a selfie can turn into a high-risk situation.
Only humans can identify what constitutes a low- or high-risk situation and whether the use of machine translation may be appropriate. To make informed decisions, humans need to understand both how languages work and how machine learning works.
It could be argued that all the errors described here can be ironed out with more training data. There are two problems with this line of reasoning. First, AI already has more training data than any human will ever be able to ingest, yet it still makes mistakes that no human would make, despite humans investing far less in their language learning.
Second, and more perniciously, training machines to do our language learning for us is incredibly costly. There are the well-known environmental costs of AI, of course. But there is also the cost of dismantling language teaching programs.
If we let go of language programs because we can outsource simple multilingual tasks to machines, we will never train humans to achieve advanced language proficiency. Even from the perspective of pure strategic national interest, the skills to communicate across language barriers in more risky contexts of economics, diplomacy or healthcare are essential.
Languages are diverse, fuzzy, variable, relational and deeply social. Algorithms are the opposite. By buying into the hype that machines can do our language work for us we dehumanise what it means to use languages to communicate, to make meaning, to create relationships and to build communities.
The author would like to thank Ava Vahedi, a Master of mathematics student at UNSW, for her help in writing this article.
Here is the original post:
'Your United States was normal': has translation tech really made ... - The Conversation
Research on Cache Optimization part3(Machine Learning) | by … – Medium
Author : Xiangyu Gao, Yaping Sun, Hao Chen, Xiaodong Xu, Shuguang Cui
Abstract : Mobile edge computing (MEC) networks bring computing and storage capabilities closer to edge devices, which reduces latency and improves network performance. However, to further reduce transmission and computation costs while satisfying user-perceived quality of experience, a joint optimization in computing, pushing, and caching is needed. In this paper, we formulate the joint-design problem in MEC networks as an infinite-horizon discounted-cost Markov decision process and solve it using a deep reinforcement learning (DRL)-based framework that enables the dynamic orchestration of computing, pushing, and caching. Through the deep networks embedded in the DRL structure, our framework can implicitly predict user future requests and push or cache the appropriate content to effectively enhance system performance. One issue we encountered when considering three functions collectively is the curse of dimensionality for the action space. To address it, we relaxed the discrete action space into a continuous space and then adopted soft actor-critic learning to solve the optimization problem, followed by utilizing a vector quantization method to obtain the desired discrete action. Additionally, an action correction method was proposed to compress the action space further and accelerate the convergence. Our simulations under the setting of a general single-user, single-server MEC network with dynamic transmission link quality demonstrate that the proposed framework effectively decreases transmission bandwidth and computing cost by proactively pushing data on future demand to users and jointly optimizing the three functions. We also conduct extensive parameter tuning analysis, which shows that our approach outperforms the baselines under various parameter settings.
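The abstract's step of relaxing the discrete action space, running soft actor-critic in continuous space, and then vector-quantizing the result can be illustrated with a tiny nearest-codeword lookup. The codebook below is a made-up example, not the paper's action embedding.

```python
import numpy as np

def quantize_action(a_cont: np.ndarray, codebook: np.ndarray) -> int:
    """Map a continuous actor output to the nearest discrete action.

    codebook: (num_actions, dim) embeddings of the discrete actions.
    Returns the index of the closest embedding in Euclidean distance,
    a minimal stand-in for the paper's vector-quantization step.
    """
    d = np.linalg.norm(codebook - a_cont, axis=1)
    return int(np.argmin(d))

# Example: four discrete push/cache actions embedded in 2-D.
codebook = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
print(quantize_action(np.array([0.2, 0.9]), codebook))  # -> 1
```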
2. Matrix Factorization for Cache Optimization in Content Delivery Networks (CDN) (arXiv)
Author : Adolf Kamuzora, Wadie Skaf, Ermiyas Birihanu, Jiyan Mahmud, Péter Kiss, Tamás Jursonovics, Peter Pogrzeba, Imre Lendák, Tomáš Horváth
Abstract : Content delivery networks (CDNs) are key components of high throughput, low latency services on the internet. CDN cache servers have limited storage and bandwidth and implement state-of-the-art cache admission and eviction algorithms to select the most popular and relevant content for the customers served. The aim of this study was to utilize state-of-the-art recommender system techniques for predicting ratings for cache content in CDN. Matrix factorization was used in predicting content popularity, which is valuable information in content eviction and content admission algorithms run on CDN edge servers. A custom implemented matrix factorization class and MyMediaLite were utilized. The input CDN logs were received from a European telecommunication service provider. We built a matrix factorization model with that data and utilized grid search to tune its hyper-parameters. Experimental results indicate that the proposed approaches are promising, and we showed that a low root mean square error value can be achieved on the real-life CDN log data.
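For readers unfamiliar with matrix factorization for rating prediction, a plain SGD version is sketched below. The authors used a custom class and MyMediaLite with grid search; this generic recipe only illustrates the underlying technique, and all names and hyper-parameters are assumptions.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=16, lr=0.01, reg=0.05, epochs=20, seed=0):
    """Plain SGD matrix factorization on (user, item, rating) triples.

    ratings: iterable of (u, i, r) with integer indices and float ratings.
    Returns factor matrices P (users) and Q (items); the predicted
    "popularity rating" of item i for user segment u is P[u] @ Q[i].
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(0, 0.1, (n_users, k))   # user (client-segment) factors
    Q = rng.normal(0, 0.1, (n_items, k))   # item (content) factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q
```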
Read more:
Research on Cache Optimization part3(Machine Learning) | by ... - Medium
Latest research on Novelty Detection part2(Machine Learning 2023) – Medium
Author : Stefan Smeu, Elena Burceanu, Emanuela Haller, Andrei Liviu Nicolicioiu
Abstract : Novelty detection aims at finding samples that differ in some form from the distribution of seen samples. But not all changes are created equal. Data can suffer a multitude of distribution shifts, and we might want to detect only some types of relevant changes. Similar to works in out-of-distribution generalization, we propose to use the formalization of separating into semantic or content changes, that are relevant to our task, and style changes, that are irrelevant. Within this formalization, we define the robust novelty detection as the task of finding semantic changes while being robust to style distributional shifts. Leveraging pretrained, large-scale model representations, we introduce Stylist, a novel method that focuses on dropping environment-biased features. First, we compute a per-feature score based on the feature distribution distances between environments. Next, we show that our selection manages to remove features responsible for spurious correlations and improve novelty detection performance. For evaluation, we adapt domain generalization datasets to our task and analyze the methods' behaviors. We additionally built a large synthetic dataset where we have control over the degree of spurious correlations. We prove that our selection mechanism improves novelty detection algorithms across multiple datasets, containing both stylistic and content shifts.
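The per-feature environment-distance scoring described in the abstract can be sketched as follows; the choice of Wasserstein distance, the two-environment simplification, and the keep ratio are assumptions for illustration, not necessarily Stylist's exact design.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def environment_bias_scores(feats_env_a, feats_env_b):
    """Score each feature by how much its marginal distribution shifts
    between two environments; higher score = more environment-biased.

    feats_env_*: (n_samples, n_features) pretrained-model features.
    """
    return np.array([
        wasserstein_distance(feats_env_a[:, j], feats_env_b[:, j])
        for j in range(feats_env_a.shape[1])
    ])

def keep_least_biased(feats, scores, keep_ratio=0.6):
    """Drop the most environment-biased features before novelty scoring."""
    kept = np.argsort(scores)[: int(keep_ratio * len(scores))]
    return feats[:, kept]
```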
2. Environment-biased Feature Ranking for Novelty Detection Robustness (arXiv)
See original here:
Latest research on Novelty Detection part2(Machine Learning 2023) - Medium
Predicting water quality through daily concentration of dissolved … – Nature.com
Once again, this paper offers four novel models for DO prediction. The models are composed of an MLP neural network as the core and the TLBO, SCA, WCA, and EFO as the training algorithms. All models are developed and implemented in the MATLAB 2017 environment.
Proper training of the MLP is dependent on the strategy employed by the algorithm appointed for this task (as described in previous sections for the TLBO, SCA, WCA, and EFO). In this section, this characteristic is discussed in the format of the hybridization results of the MLP.
An MLPNN is considered the basis of the hybrid models. As per Section "The MLPNN", this model has three layers. The input layer receives the data and has 3 neurons, one for each of WT, pH, and SC. The output layer has one neuron for releasing the final prediction (i.e., DO). However, the hidden layer can have various numbers of neurons. In this study, a trial-and-error effort was carried out to determine the most proper number. Ten models were tested with 1, 2, ..., and 10 neurons in the hidden layer, and it was observed that 6 gives the best performance. Hence, the final model is structured as 3-6-1. With the same logic, the activation functions of the output and hidden neurons are selected as Purelin (y = x) and Tansig (described in Section "Formula presentation"), respectively83.
Next, the training dataset was exposed to the selected MLPNN network. The relationship between the DO and water conditions is established by means of weights and biases within the MLPNN (Fig.4). In this study, the role of tuning these weights and biases is assigned to the named metaheuristic algorithms. For this purpose, the MLPNN configuration is first transformed into mathematical equations with adjustable weights and biases (the equations will be shown in Section "Formula presentation"). Training the MLPNN using metaheuristic algorithms is an iterative effort. Hereupon, the RMSE between the modeled and measured DOs is introduced as the objective function of the TLBO, SCA, WCA, and EFO. This function is used to monitor the optimization behavior of the algorithms. Since RMSE is an error indicator, the algorithms aim to minimize it over time to improve the quality of the weights and biases. Designating the appropriate number of iterations is another important step. By analyzing the convergence behavior of the algorithms, as well as referring to previous similar studies, 1000 iterations were determined for the TLBO, SCA, and WCA, while the EFO was implemented with 30,000 iterations. The final solution is used to construct the optimized MLPNN. Figure5 illustrates the optimization flowchart.
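The study implements this in MATLAB 2017. Purely to illustrate how a metaheuristic "sees" the problem, the Python sketch below encodes the 3-6-1 network as a flat vector of 31 parameters (24 weights plus 7 biases, matching the counts reported later in this excerpt) and exposes the RMSE objective that the TLBO, SCA, WCA, and EFO would minimize. The parameter ordering is an assumption; note that Tansig(x) = 2/(1+e^(-2x)) - 1 is mathematically identical to tanh(x).

```python
import numpy as np

def unpack(theta):
    """Split a flat 31-vector into the 3-6-1 network's parameters."""
    W1 = theta[:18].reshape(6, 3)   # hidden weights (6 neurons x 3 inputs)
    b1 = theta[18:24]               # hidden biases
    w2 = theta[24:30]               # output weights
    b2 = theta[30]                  # output bias
    return W1, b1, w2, b2

def predict(theta, X):
    """Forward pass: Tansig hidden layer, Purelin (identity) output.

    X: (n_samples, 3) array of [WT, pH, SC] rows.
    """
    W1, b1, w2, b2 = unpack(theta)
    H = np.tanh(X @ W1.T + b1)      # Tansig(x) == tanh(x)
    return H @ w2 + b2

def rmse_objective(theta, X, y):
    """The cost each metaheuristic (TLBO/SCA/WCA/EFO) tries to minimize."""
    return np.sqrt(np.mean((predict(theta, X) - y) ** 2))
```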
Optimization flowchart of the models.
Furthermore, each algorithm was implemented with nine swarm sizes (NSWs) to achieve the best model configuration. These tested NSWs were 10, 25, 50, 75, 100, 200, 300, 400, and 500 for the TLBO, SCA, and WCA, and 25, 30, 50, 75, 100, 200, 300, 400, and 500 for the EFO84. Collecting the obtained objective functions (i.e., the RMSEs) led to a convergence curve for each tested NSW. Figure6 depicts the convergence curves of the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN.
Optimization curves of the (a) TLBO-MLPNN, (b) SCA-MLPNN, (c) WCA-MLPNN, and (d) EFO-MLPNN.
As is seen, each algorithm has a different method for training the MLPNN. According to the above charts, the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN, with respective NSWs of 500, 400, 400, and 50, attained the lowest RMSEs. This means that, for each model, the MLPNNs trained with these configurations acquired more promising weights and biases than the eight other NSWs. Table2 collects the final parameters of each model.
The RMSE of the recognized elite models (i.e., the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN with the NSWs of 500, 400, 400, and 50) was 1.3231, 1.4269, 1.3043, and 1.3210, respectively. These values, plus the MAEs of 0.9800, 1.1113, 0.9624, and 0.9783, and the NSEs of 0.7730, 0.7359, 0.7794, and 0.7737, indicate that the MLP has been suitably trained by the proposed algorithms. In order to graphically assess the quality of the results, Fig.7a,c,e, and g are generated to show the agreement between the modeled and measured DOs. The calculated RPs (i.e., 0.8792, 0.8637, 0.8828, and 0.8796) demonstrate a large degree of agreement for all used models. Moreover, the outcome of \(DO_{i_{\mathrm{expected}}} - DO_{i_{\mathrm{predicted}}}\) is referred to as the error for every sample, and the frequency of these values is illustrated in Fig.7b,d,f, and h. These charts show larger frequencies for the error values close to 0, meaning that accurately predicted DOs outnumber those with considerable errors.
The scatterplot and histogram of the errors plotted for the training data of (a and b) TLBO-MLPNN, (c and d) SCA-MLPNN, (e and f) WCA-MLPNN, and (g and h) EFO-MLPNN.
Evaluating the testing accuracies revealed the high competency of all used models in predicting the DO for new values of WT, pH, and SC. In other words, the models could successfully generalize the DO pattern captured by exploring the data belonging to 2014-2018 to the data of the fifth year. For example, Fig.8 shows the modeled and measured DOs for two different periods including (a) October 01, 2018 to December 01, 2018 and (b) January 01, 2019 to March 01, 2019. It can be seen that, for the first period, the upward DO patterns have been well-followed by all four models. Also, the models have shown high sensitivity to the fluctuations in the DO pattern for the second period.
The real and predicted DO patterns for (a) October 01, 2018 to December 01, 2018 and (b) January 01, 2019 to March 01, 2019.
Figure9a,c,e, and g show the errors obtained for the testing data. The RMSE and MAE of the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN were 1.2980 and 0.9728, 1.4493 and 1.2078, 1.3096 and 0.9915, and 1.2903 and 1.0002, respectively. These values, along with the NSEs of 0.7668, 0.7092, 0.7626, and 0.7695, imply that the models have predicted unseen DOs with a tolerable level of error. Moreover, Fig.9b,d,f, and h present the corresponding scatterplots illustrating the correlation between the modeled and measured DOs in the testing phase. Based on the Rp values of 0.8785, 0.8587, 0.8762, and 0.8815, a very satisfying correlation can be seen for all used models.
The error line and scatterplot plotted for the testing data of (a and b) TLBO-MLPNN, (c and d) SCA-MLPNN, (e and f) WCA-MLPNN, and (g and h) EFO-MLPNN.
To compare the efficiency of the employed models, the most accurate model is first determined by comparing the obtained accuracy indicators, then, a comparison between the optimization time is carried out. Table3 collects all calculated accuracy criteria in this study.
In terms of all accuracy criteria (i.e., RMSE, MAE, RP, and NSE), the WCA-MLPNN emerged as the most reliable model in the training phase. In other words, the WCA presented the highest-quality training of the MLP, followed by the EFO, TLBO, and SCA. However, the results of the testing data need more discussion. In this phase, while the EFO-MLPNN achieved the smallest RMSE (1.2903), the largest RP (0.8815), and the largest NSE (0.7695) at the same time, the smallest MAE (0.9728) was obtained for the TLBO-MLPNN. As for the SCA-based ensemble, it was shown that this model yields the poorest predictions in both phases.
Additionally, Figs.10 and 11 are also produced to compare the accuracy of the models in the form of a boxplot and a Taylor diagram, respectively. The results of these two figures are consistent with the above comparison. They indicate the high accordance between the models' outputs and target DOs, and they also reflect the higher accuracy of the WCA-MLPNN, EFO-MLPNN, and TLBO-MLPNN compared to the SCA-MLPNN.
Boxplots of the models for comparison.
Taylor diagram of the models for comparison.
In comparison with some previous literature, it can be said that our models have attained a higher accuracy of DO prediction. For instance, in the study by Yang et al.85, three metaheuristic algorithms, namely the multi-verse optimizer (MVO), shuffled complex evolution (SCE), and black hole algorithm (BHA), were combined with an MLPNN, and the models were applied to the same case study (Klamath River Station). The best training performance was achieved by the MLP-MVO (with respective RMSE, MAE, and RP of 1.3148, 0.9687, and 0.8808), while the best testing performance was achieved by the MLP-SCE (with respective RMSE, MAE, and RP of 1.3085, 1.0122, and 0.8775). As per Table3, it can be inferred that the WCA-MLPNN suggested in this study provides better training results. Also, as far as the testing results are concerned, both the WCA-MLPNN and TLBO-MLPNN outperformed all models tested by Yang et al.85. In another study by Kisi et al.42, an ensemble model called BMA was suggested for the same case study, and it achieved training and testing RMSEs of 1.334 and 1.321, respectively (see Table 5 of the cited paper). These error values are higher than the RMSEs of the TLBO-MLPNN, WCA-MLPNN, and EFO-MLPNN in this study. Consequently, these models outperform the benchmark conventional models that were tested by Kisi et al.42 (i.e., ELM, CART, ANN, MLR, and ANFIS). With the same logic, the superiority of the suggested hybrid models over some conventional models employed in the previous studies49,65 for different stations on the Klamath River can be inferred. Altogether, these comparisons indicate that this study has achieved considerable improvements in the field of DO prediction.
Table4 reports the times elapsed for optimizing the MLP by each algorithm. According to this table, the EFO-MLPNN, despite requiring a greater number of iterations (i.e., 30,000 for the EFO vs. 1000 for the TLBO, SCA, and WCA), accomplishes the optimization in a considerably shorter time. In this relation, the times for the TLBO, SCA, and WCA range in [181.3, 12,649.6] s, [88.7, 6095.2] s, and [83.2, 4804.0] s, while those of the EFO were bounded between 277.2 and 296.0 s. Another difference between the EFO and the other proposed algorithms is related to the two initial NSWs. Since an NSW of 10 was not a viable value for implementing the EFO, the two values of 25 and 30 were considered instead.
Based on the above discussion, the TLBO, WCA, and EFO showed higher capability compared to the SCA. Examining the times of the selected configurations of the TLBO-MLPNN, SCA-MLPNN, WCA-MLPNN, and EFO-MLPNN (i.e., 12,649.6, 5295.7, 4733.0, and 292.6 s for the NSWs of 500, 400, 400, and 50, respectively) shows that the WCA needs around 37% of the TLBO's time to train the MLP. The EFO, however, provides the fastest training.
Apart from comparisons, the successful prediction carried out by all four hybrid models represents the compatibility of the MLPNN model with metaheuristic optimization for creating predictive ensembles. The used optimizer algorithms could nicely optimize the relationship between the DO and water conditions (i.e., WT, pH, and SC) in the Klamath River Station. The basic model was a 3-6-1 MLPNN containing 24 weights and 7 biases (Fig.4). Therefore, each algorithm provided a solution composed of 31 variables in each iteration. Considering the number of tested NSWs and iterations for each algorithm (i.e., 30,000 iterations of the EFO and 1000 iterations of the WCA, SCA, and TLBO, all with nine NSWs), it can be said that the outstanding solution (belonging to the EFO algorithm) has been selected from among a large number of candidates (= 1 × 30,000 × 9 + 3 × 1000 × 9).
However, concerning the limitations of this work in terms of data and methodology, potential ideas can be raised for future studies. First, it is suggested to update the applied models with the most recent hydrological data, as well as the records of other water quality stations, in order to enhance the generalizability of the models. Moreover, further metaheuristic algorithms can be tested in combination with different basic models such as ANFIS and SVM to conduct comparative studies.
The higher efficiency of the WCA and EFO (in terms of both time and accuracy) was derived in the previous section. Hereupon, the MLPNNs constructed by the optimal responses of these two algorithms are mathematically presented in this section to give two formulas for predicting the DO. Referring to Fig.4, the calculations of the output neuron in the WCA-MLPNN and EFO-MLPNN are expressed by Eqs. (5) and (6), respectively.
$$\begin{aligned} DO_{WCA\text{-}MLPNN} ={} & 0.395328 \times O_{HN1} + 0.193182 \times O_{HN2} - 0.419852 \times O_{HN3} + 0.108298 \times O_{HN4} \\ & + 0.686191 \times O_{HN5} + 0.801148 \times O_{HN6} + 0.340617 \end{aligned} \tag{5}$$
$$\begin{aligned} DO_{EFO\text{-}MLPNN} ={} & 0.033882 \times O'_{HN1} - 0.737699 \times O'_{HN2} - 0.028107 \times O'_{HN3} - 0.700302 \times O'_{HN4} \\ & + 0.955481 \times O'_{HN5} - 0.757153 \times O'_{HN6} + 0.935491 \end{aligned} \tag{6}$$
In the above relationships, \(O_{HNi}\) and \(O'_{HNi}\) represent the outcome of the ith hidden neuron in the WCA-MLPNN and EFO-MLPNN, respectively. Given \(\mathrm{Tansig}(x) = \frac{2}{1+e^{-2x}} - 1\) as the activation function of the hidden neurons, \(O_{HNi}\) and \(O'_{HNi}\) are calculated by the equations below. As is seen, these two parameters are calculated from the inputs of the study, i.e., WT, pH, and SC.
$$\begin{bmatrix} O_{HN1} \\ O_{HN2} \\ O_{HN3} \\ O_{HN4} \\ O_{HN5} \\ O_{HN6} \end{bmatrix} = \mathrm{Tansig}\left( \begin{bmatrix} -1.818573 & 1.750088 & -0.319002 \\ 0.974577 & 0.397608 & -2.316006 \\ -1.722125 & -1.012571 & 1.575044 \\ 0.000789 & -2.532009 & -0.246384 \\ -1.288887 & -1.724770 & 1.354887 \\ 0.735724 & -2.250890 & 0.929506 \end{bmatrix} \begin{bmatrix} WT \\ pH \\ SC \end{bmatrix} + \begin{bmatrix} 2.543969 \\ -1.526381 \\ 0.508794 \\ 0.508794 \\ -1.526381 \\ 2.543969 \end{bmatrix} \right) \tag{7}$$
$$\begin{bmatrix} O'_{HN1} \\ O'_{HN2} \\ O'_{HN3} \\ O'_{HN4} \\ O'_{HN5} \\ O'_{HN6} \end{bmatrix} = \mathrm{Tansig}\left( \begin{bmatrix} 1.323143 & -2.172674 & -0.023590 \\ 1.002364 & 0.785601 & 2.202243 \\ 1.705369 & -1.245099 & -1.418881 \\ -0.033210 & -1.681758 & 1.908498 \\ 1.023548 & -0.887137 & -2.153396 \\ 0.325776 & -1.818692 & -1.748715 \end{bmatrix} \begin{bmatrix} WT \\ pH \\ SC \end{bmatrix} + \begin{bmatrix} -2.543969 \\ -1.526381 \\ -0.508794 \\ -0.508794 \\ 1.526381 \\ 2.543969 \end{bmatrix} \right) \tag{8}$$
More explicitly, the integration of Eqs. (5) and (7) results in the WCA-MLPNN formula, while the integration of Eqs. (6) and (8) results in the EFO-MLPNN formula. Given the excellent accuracy of these two models and their superiority over some previous models in the literature, either of these two formulas can be used for practical estimations of the DO, especially for solving the water quality issue within the Klamath River.
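For readers who want to apply the WCA-MLPNN formula directly, the Python sketch below transcribes the published weights from Eqs. (5) and (7). It assumes the inputs are preprocessed/scaled the same way as the study's training data (the scaling is not given in this excerpt), and it uses tanh as an exact equivalent of Tansig.

```python
import numpy as np

# Hidden-layer parameters of the WCA-MLPNN, transcribed from Eq. (7).
W_HID = np.array([
    [-1.818573,  1.750088, -0.319002],
    [ 0.974577,  0.397608, -2.316006],
    [-1.722125, -1.012571,  1.575044],
    [ 0.000789, -2.532009, -0.246384],
    [-1.288887, -1.724770,  1.354887],
    [ 0.735724, -2.250890,  0.929506],
])
B_HID = np.array([2.543969, -1.526381, 0.508794,
                  0.508794, -1.526381, 2.543969])
# Output-layer parameters transcribed from Eq. (5).
W_OUT = np.array([0.395328, 0.193182, -0.419852,
                  0.108298, 0.686191, 0.801148])
B_OUT = 0.340617

def do_wca_mlpnn(wt, ph, sc):
    """Evaluate Eqs. (5) and (7): DO predicted by the WCA-MLPNN.

    Inputs must be scaled as in the study's training data; that
    preprocessing is not specified in this excerpt.
    """
    x = np.array([wt, ph, sc])
    hidden = np.tanh(W_HID @ x + B_HID)   # Tansig == tanh
    return W_OUT @ hidden + B_OUT
```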
Continue reading here:
Predicting water quality through daily concentration of dissolved ... - Nature.com
Grant backs research on teaching networks to make better decisions – Rice News
Picture a swarm of drones capturing photos and video as they survey an area: What would enable them to process the data collected in the most rapid and effective manner possible?
Rice University's Santiago Segarra and Ashutosh Sabharwal have won a grant from the Army Research Office, a directorate of the U.S. Army Combat Capabilities Development Command Army Research Laboratory, to develop a machine learning framework that improves military communication networks' decision-making processes. The research could also help inform applications such as self-driving vehicles and cyber intrusion detection.
"Distributed decision-making is crucial in military networks," said Sabharwal, who is a co-investigator on the grant. "In high-stakes, fast-paced environments, relying solely on a centralized decision-making process can result in delays, bottlenecks and vulnerabilities. Spreading decision and execution responsibilities across the network enables a rapid response to changing situations and adaptability to unforeseen circumstances."
The main challenge for effective distributed network control is that the individual units that make up a network (nodes) have to find the best way to aggregate local information and distill it into actionable knowledge. In the drone example, to perform a machine learning task like object recognition on visual data collected in real time, the individual nodes (or, in our example, drones) have to follow designated protocols that specify where the information is to be processed.
"This can be done either in the drone, with its limited battery and computational capacity, or can be offloaded to headquarters through wireless connections, with the associated communication latency," Segarra said.
The optimal decision depends on multiple factors, such as the size and sensitivity of the data, the complexity of the task and the congestion level of the communication network. Rigid decision-making protocols that pre-specify how information is to be aggregated can delay or impede the network's ability to react. Sabharwal and Segarra aim to develop a novel distributed machine learning architecture that would enable nodes to combine local data in the most effective manner.
"Our goal is for the swarm of drones to make jointly optimal offloading decisions in a distributed manner, that is, in the absence of a central agent that tells every drone what to do," Segarra said.
To achieve this, the researchers will develop a deep learning framework where two graph neural networks interact in an actor-critic setting: The actor neural network makes offloading decisions while the critic assesses their quality. By training both neural networks in an iterative fashion, the goal is to obtain a versatile actor whose decisions translate into rapid, adaptive action across a broad range of scenarios.
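The announcement describes two graph neural networks in an actor-critic loop but gives no architecture details. The PyTorch sketch below shows the general pattern under simple assumptions (mean-neighbor aggregation, binary process-locally-vs-offload decisions); it is an illustrative sketch, not the Rice team's design.

```python
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    """One round of neighbor averaging followed by a linear map, a
    minimal stand-in for a graph neural network layer."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, x, adj):             # x: (n_nodes, d_in), adj: (n, n)
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        return torch.relu(self.lin(adj @ x / deg))

class Actor(nn.Module):
    """Per-node offloading probability (process locally vs. offload)."""
    def __init__(self, d):
        super().__init__()
        self.g1, self.g2 = GraphLayer(d, 32), GraphLayer(32, 32)
        self.head = nn.Linear(32, 1)

    def forward(self, x, adj):
        h = self.g2(self.g1(x, adj), adj)
        return torch.sigmoid(self.head(h)).squeeze(-1)   # (n_nodes,)

class Critic(nn.Module):
    """Scores the joint state-action pair; trained to estimate return,
    providing the learning signal that shapes the actor's decisions."""
    def __init__(self, d):
        super().__init__()
        self.g1 = GraphLayer(d + 1, 32)
        self.head = nn.Linear(32, 1)

    def forward(self, x, adj, actions):
        h = self.g1(torch.cat([x, actions.unsqueeze(-1)], -1), adj)
        return self.head(h.mean(0))        # pooled graph-level value
```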
Segarra is an assistant professor of electrical and computer engineering and statistics. Sabharwal is Rice's Ernest Dell Butcher Professor of Engineering and chair of the Department of Electrical and Computer Engineering.
Project title: Distributed Machine Learning for Tactical Networks
Award number: W911NF-24-2-0008
Image: Ashutosh Sabharwal (left) and Santiago Segarra. Credit: Photo courtesy of Rice University. (https://news-network.rice.edu/news/files/2023/11/000_AROgrant.jpg)
George R. Brown School of Engineering: https://engineering.rice.edu/
Department of Electrical and Computer Engineering: https://eceweb.rice.edu/
Ashutosh Sabharwal website: http://ashu.rice.edu/
Santiago Segarra website: http://segarra.rice.edu/
National Security Research Accelerator: https://runsra.rice.edu/
Wireless Open-Access Research Platform: http://warpproject.org/trac
Reconfigurable Eco-system for Next-generation End-to-end Wireless: https://renew-wireless.org/
Scalable Health Labs: http://sh.rice.edu/
See Below the Skin: http://www.seebelowtheskin.org/
Saving Lives Through Transformative Health Technologies: https://pathsup.org/
Located on a 300-acre forested campus in Houston, Rice University is consistently ranked among the nation's top 20 universities by U.S. News & World Report. Rice has highly respected schools of architecture, business, continuing studies, engineering, humanities, music, natural sciences and social sciences and is home to the Baker Institute for Public Policy. With 4,574 undergraduates and 3,982 graduate students, Rice's undergraduate student-to-faculty ratio is just under 6-to-1. Its residential college system builds close-knit communities and lifelong friendships, just one reason why Rice is ranked No. 1 for lots of race/class interaction, No. 2 for best-run colleges and No. 12 for quality of life by the Princeton Review. Rice is also rated as a best value among private universities by Kiplinger's Personal Finance.
Continue reading here:
Grant backs research on teaching networks to make better decisions - Rice News
Biologit: Machine learning and AI to monitor medical literature – SiliconRepublic.com
Founded in 2021 by Nicole Baker and Bruno Ohana, Dublin-based Biologit has been helping companies automate the monitoring of scientific literature.
"Whether it is human medicines or medical devices, monitoring health products for safety is a very important part of keeping patients safe," says Nicole Baker, an immunologist by background who started her own company to help life sciences firms automate safety monitoring.
Adverse events are one of the top causes of hospitalisation and can lead to serious health issues. Hence, regulators around the world pay very close attention to the surveillance of health products.
Baker, who co-founded Biologit with tech expert Bruno Ohana, specialises in the field of pharmacovigilance, which involves reviewing the vast number of medical extracts published each year, as well as forums on social media, to identify any red flags regarding adverse effects of drugs on the market.
This, combined with her experience working in the biotech and pharma industries, helped Baker create Biologit two years ago, leveraging AI to help keep patients safe by simplifying the detection of adverse events from drug development to post-market.
"At Biologit, we specialise in cutting-edge active safety surveillance solutions across the life sciences spectrum. We do that by combining expert domain knowledge with the latest technology to build solutions that help keep patients safe," she explains.
The idea is to help companies of all sizes and at any stage of clinical development automate the comprehensive and time-consuming task of monitoring scientific literature.
With its roots in Trinity College Dublin, where the technology was first incubated, Biologit was part of Enterprise Ireland's New Frontiers Entrepreneur Development Programme, in which Baker participated in 2019. She then went on to participate in Big Ideas 2020.
The company's first product, Biologit MLM-AI, was developed by iterating with early adopters to build a solution that is fit for the needs of the industry.
"From the user experience to the AI, everything was built from the ground up with domain experts. This has given us insights on how to apply AI to solve the problems that mattered most to our users while maintaining high levels of compliance in a regulated industry," Baker explains.
"Because we tackled the challenges of the entire safety surveillance workflow, our platform has the unique ability to deliver high levels of automation and productivity gains."
According to her, MLM-AI includes a rich and comprehensive scientific literature database which contains more than 45m citations and is growing every day.
"Our users can benefit from the Biologit database out of the box to run their searches, reducing friction and costs."
After launching MLM-AI last year, Baker and Ohana worked on onboarding its first customers. This year, the focus has been more on customer acquisition and growth, Baker says, as the team has increased its presence on industry forums and other online channels.
"We continue to onboard new customers and we're building a user base that is quite global," Baker says, adding that she has no interest in raising investment at the moment. "We've put a lot of energy into hiring too and have been very fortunate to build a stellar team to help us deliver for our customers and grow Biologit into the future."
Earlier this year, the Dublin-headquartered start-up announced plans to at least double its team in 2023 following a successful €2m funding round led by Enterprise Ireland. At the time, it had 14 employees based across Ireland, India, Poland, France, Spain and the Philippines.
While for Baker the main challenge in running Biologit is finding the time to do the most important tasks, Ohana said there have been a few exciting challenges for the team as a whole.
"There were lots of fun challenges since we started, and they have evolved with the different stages of the company: can we build the technology, can we find good market fit and the right business model, will the platform scale as we grow," said Ohana, an expert in machine learning.
"It is really nice to get to think about all that, and a great learning experience. Our customers expect a very high standard of security, compliance, and that we continue to innovate for them; we're now putting the structure in place so that we can do those things as we scale."
Continue reading here:
Biologit: Machine learning and AI to monitor medical literature - SiliconRepublic.com
Machine Learning and Artificial Intelligence Tools: The Benefits … – Quality Magazine
The rest is here:
Machine Learning and Artificial Intelligence Tools: The Benefits ... - Quality Magazine