Category Archives: Machine Learning

Updates on Multitask learning part 1 (Machine Learning) | by … – Medium

Author : Juan Lu, Mohammed Bennamoun, Jonathon Stewart, Jason K. Eshraghian, Yanbin Liu, Benjamin Chow, Frank M. Sanfilippo, Girish Dwivedi

Abstract : Diagnostic investigation has an important role in risk stratification and clinical decision making of patients with suspected and documented Coronary Artery Disease (CAD). However, the majority of existing tools are primarily focused on the selection of gatekeeper tests, whereas only a handful of systems contain information regarding downstream testing or treatment. We propose a multi-task deep learning model to support risk stratification and downstream test selection for patients undergoing Coronary Computed Tomography Angiography (CCTA). The analysis included 14,021 patients who underwent CCTA between 2006 and 2017. Our novel multitask deep learning framework extends the state-of-the-art Perceiver model to deal with real-world CCTA report data. Our model achieved an Area Under the receiver operating characteristic Curve (AUC) of 0.76 in CAD risk stratification, and 0.72 AUC in predicting downstream tests. Our proposed deep learning model can accurately estimate the likelihood of CAD and provide recommended downstream tests based on prior CCTA data. In clinical practice, the utilization of such an approach could bring a paradigm shift in risk stratification and downstream management. Despite significant progress using deep learning models for tabular data, they do not outperform gradient boosting decision trees, and further research is required in this area. However, neural networks appear to benefit more readily from multi-task learning than tree-based models. This could offset the shortcomings of using a single-task learning approach when working with tabular data.
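As a rough, hypothetical illustration of the shared-encoder/multi-head pattern the abstract describes (the actual model extends the Perceiver; the MLP encoder, layer sizes, and feature dimensions below are invented placeholders):

```python
# Sketch only: a generic two-head multi-task network of the kind described in
# the abstract (one head for CAD risk, one for downstream-test selection).
# The real model extends the Perceiver; this stand-in uses a plain MLP encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNet(nn.Module):
    def __init__(self, n_features: int, n_tests: int):
        super().__init__()
        self.encoder = nn.Sequential(          # shared representation
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU())
        self.risk_head = nn.Linear(64, 1)       # CAD risk (binary logit)
        self.test_head = nn.Linear(64, n_tests) # downstream test (multi-class)

    def forward(self, x):
        h = self.encoder(x)
        return self.risk_head(h), self.test_head(h)

# Joint training step: the two task losses share gradients through the encoder.
model = MultiTaskNet(n_features=32, n_tests=4)   # placeholder dimensions
x = torch.randn(8, 32)
risk_logit, test_logits = model(x)
loss = (F.binary_cross_entropy_with_logits(risk_logit.squeeze(1),
                                           torch.randint(0, 2, (8,)).float())
        + F.cross_entropy(test_logits, torch.randint(0, 4, (8,))))
loss.backward()
```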

2. Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs (arXiv)

Author : Jiani Liu, Qinghua Tao, Ce Zhu, Yipeng Liu, Xiaolin Huang, Johan A. K. Suykens

Abstract : Multitask learning (MTL) leverages task-relatedness to enhance performance. With the emergence of multimodal data, tasks can now be referenced by multiple indices. In this paper, we employ high-order tensors, with each mode corresponding to a task index, to naturally represent tasks referenced by multiple indices and preserve their structural relations. Based on this representation, we propose a general framework of low-rank MTL methods with tensorized support vector machines (SVMs) and least squares support vector machines (LSSVMs), where the CP factorization is deployed over the coefficient tensor. Our approach allows us to model the task relation through a linear combination of shared factors weighted by task-specific factors, and it generalizes to both classification and regression problems. Through the alternating optimization scheme and the Lagrangian function, each subproblem is transformed into a convex problem, formulated as a quadratic program or linear system in the dual form. In contrast to previous MTL frameworks, our decision function in the dual induces a weighted kernel function with a task-coupling term characterized by the similarities of the task-specific factors, better revealing the explicit relations across tasks in MTL. Experimental results validate the effectiveness and superiority of our proposed methods compared to existing state-of-the-art approaches in MTL. The code of the implementation will be available at https://github.com/liujiani0216/TSVM-MTL
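A small numerical sketch of the CP-factorized coefficient tensor idea follows; the shapes, rank, and data are arbitrary placeholders, not the paper's settings:

```python
# Sketch: a linear multi-task model whose coefficient tensor W is CP-factorized,
# W = sum_r a_r (outer) b_r (outer) c_r, so the task indexed by (i, j) uses the
# weight vector W[:, i, j] built from shared and task-specific factors.
import numpy as np

d, I, J, R = 10, 4, 3, 2           # feature dim, two task indices, CP rank
rng = np.random.default_rng(0)
A = rng.standard_normal((d, R))    # shared feature-mode factors
B = rng.standard_normal((I, R))    # task-specific factors, mode 1
C = rng.standard_normal((J, R))    # task-specific factors, mode 2

W = np.einsum("dr,ir,jr->dij", A, B, C)  # CP reconstruction of the tensor

x = rng.standard_normal(d)
i, j = 1, 2                        # a task referenced by two indices
score = x @ W[:, i, j]             # linear decision value for task (i, j)
print(score)
```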

See the original post:
Updates on Multitask learning part1(Machine Learning) | by ... - Medium

PhD Candidate in Machine Learning in Neurology job with … – Times Higher Education

About the job

Do you want to participate in a groundbreaking interdisciplinary research project combining neurology, advanced computational science, and technology? We have a vacant three-year position for a PhD candidate.

The Department of Neuromedicine and Movement Science, in collaboration with the Department of Computer Science, has recently launched a large-scale research project on the application of machine learning in headache research entitled Machine Intelligence in Headache (MI-HEAD). The overall goal of the project is to develop and apply artificial intelligence and machine learning methods and frameworks to improve the medical treatment of individuals with primary headaches.

In the project, an extensive database consisting of available Norwegian health register data combined with clinical data is used to develop models that predict the effect of migraine medications at the individual level. Such prediction may optimize the administration of correct treatment to individuals with migraine and significantly reduce the negative impacts of headache. We will also carry out a large-scale randomized controlled clinical trial to evaluate the effect of using machine learning to optimize treatment for individuals with migraine.

MI-HEAD is organized under the newly established Norwegian Centre for Headache Research (NorHEAD), a nationwide Centre for Clinical Treatment Research funded by the Research Council of Norway. NorHEAD is hosted by the Department of Neuromedicine and Movement Science at NTNU, and collaborates with academic institutions, hospitals, and industry across the nation. MI-HEAD also works in close collaboration with the world-leading High-Dimensional Neurology group at UCL Queen Square Institute of Neurology, London, UK.

The Department of Neuromedicine and Movement Science (INB) conducts research and education covering a wide range of areas related to the nervous system, sense organs, the head, and motion control and movement.

Sustainability is an important part of our social mission. As an employee at INB, you are encouraged to get involved in the development of a sustainable future. Together with your colleagues, you will contribute to the department achieving its sustainability goals.

This position is a unique opportunity to contribute to a highly advanced and meaningful research area as part of a large-scale interdisciplinary and international initiative.

For a position as a PhD Candidate, the goal is completed doctoral education, culminating in a doctoral degree.

Duties of the position

In line with the aims of MI-HEAD, the position will be responsible for:

Required selection criteria

The appointment is to be made in accordance with the Regulations concerning the degrees of Philosophiae Doctor (PhD) and Philosophiae Doctor (PhD) in artistic research at NTNU and the National guidelines for appointment as PhD, post doctor and research assistant.

Preferred selection criteria

Personal characteristics

Emphasis will be placed on personal and interpersonal qualities.

We offer

Salary and conditions

As a PhD candidate (code 1017) you are normally paid a gross salary of NOK 532,200 per annum before tax, depending on qualifications and seniority. From the salary, 2% is deducted as a contribution to the Norwegian Public Service Pension Fund.

The period of employment is 3 years.

Appointment to a PhD position requires that you are admitted to the PhD program in Medicine and Health Sciences or Medical Technology within three months of employment, and that you participate in an organized PhD program during the employment period.

The engagement is to be made in accordance with the regulations in force concerning State Employees and Civil Servants, and the acts relating to Control of the Export of Strategic Goods, Services and Technology. Candidates who, by assessment of the application and attachments, are seen to conflict with the criteria in the latter act will be prohibited from recruitment to NTNU. After the appointment, you must expect that there may be changes in the area of work.

The position is subject to external funding by the Norwegian Research Council.

It is a prerequisite that you can be present at and accessible to the institution on a daily basis.

About the application

The application and supporting documentation to be used as the basis for the assessment must be in English.

Publications and other scientific work must be attached to the application. Please note that your application will be considered based solely on information submitted by the application deadline. You must therefore ensure that your application clearly demonstrates how your skills and experience fulfil the criteria specified above.

The application must include:

If all, or parts, of your education has been taken abroad, we also ask you to attach documentation of the scope and quality of your entire education, both bachelor's and master's education, in addition to other higher education. A description of the documentation required can be found here. If you already have a statement from NOKUT, please attach this as well.

We will take joint work into account. If it is difficult to identify your efforts in the joint work, you must enclose a short description of your participation.

In the evaluation of which candidate is best qualified, emphasis will be placed on education, experience, and personal and interpersonal qualities. Motivation, ambitions, and potential will also count in the assessment of the candidates.

NTNU is committed to following evaluation criteria for research quality according to The San Francisco Declaration on Research Assessment (DORA). This means that we pay special attention to the quality and professional breadth of these works. We also consider experience from research management and participation in research projects. We place great emphasis on your scientific work from the last five years.

General information

Working at NTNU

NTNU believes that inclusion and diversity are a strength. We want our faculty and staff to reflect Norway's culturally diverse population, and we continuously seek to hire the best minds. This enables NTNU to increase productivity and innovation, improve decision-making processes, raise employee satisfaction, compete academically with global top-ranking institutions, and carry out our social responsibilities within education and research. NTNU emphasizes accessibility and encourages qualified candidates to apply regardless of gender identity, ability status, periods of unemployment or ethnic and cultural background.

The city of Trondheim is a modern European city with a rich cultural scene. Trondheim is the innovation capital of Norway, with a population of 200,000. The Norwegian welfare state, including healthcare, schools, kindergartens and overall equality, is probably the best of its kind in the world. Professional subsidized day-care for children is easily available. Furthermore, Trondheim offers great opportunities for education (including international schools) and possibilities to enjoy nature, culture and family life, and has low crime rates and clean air.

As an employee at NTNU, you must at all times adhere to the changes that developments in the field entail and to the organizational changes that are adopted.

A public list of applicants with name, age, job title and municipality of residence is prepared after the application deadline. If you wish to be exempted from the public applicant list, this must be justified. Assessment will be made in accordance with current legislation. You will be notified if the reservation is not accepted.

If you have any questions about the position, please contact Anker Stubberud, telephone +47 45 22 91 74, email anker.stubberud@ntnu.no. If you have any questions about the recruitment process, please contact HR Adviser Bente Kristin Ørbogen Andersen, email bente.k.a.andersen@ntnu.no.

If you think this looks interesting and in line with your qualifications, please submit your application electronically via jobbnorge.no with your CV, diplomas and certificates attached. Applications submitted elsewhere will not be considered. Upon request, you must be able to obtain certified copies of your documentation.

Application deadline: 1 October 2023

NTNU - knowledge for a better world

The Norwegian University of Science and Technology (NTNU) creates knowledge for a better world and solutions that can change everyday life.

The Department of Neuromedicine and Movement Science (INB) is an ambitious department in strong development. Great diversity in medicine and health disciplines is a hallmark of the department. We give priority to social relevance and quality in both education and research.

Several of our groups of researchers and teaching staff are at the forefront of their fields in Norway and internationally. Our ambition: Through interdisciplinary collaboration, INB creates forward-looking education and research that lead the way in improving health and function. Read more about the Department here: http://www.ntnu.edu/inb

Deadline: 1 October 2023
Employer: NTNU - Norwegian University of Science and Technology
Municipality: Trondheim
Scope: Fulltime
Duration: Temporary
Place of service: Edvard Griegs gt., 7030 Trondheim

Read more here:
PhD Candidate in Machine Learning in Neurology job with ... - Times Higher Education

Revolutionizing Drug Development with Machine Learning to … – Cryptopolitan

Description

In a groundbreaking development that could transform the landscape of drug discovery and development, researchers at Pohang University of Science and Technology (POSTECH) have harnessed the power of machine learning to predict a drug's chances of approval before clinical trials even begin. Their findings, recently published in the esteemed journal EBioMedicine, offer a promising solution to one of the pharmaceutical industry's most pressing challenges: the high rate of drug candidates that fail during clinical trials despite showing promise in preclinical testing.

The pursuit of new pharmaceuticals is not merely a scientific endeavor but a vital mission that affects the health and well-being of humanity at large. The development of innovative drugs is instrumental in advancing medical treatments, preventing diseases, and ultimately improving the quality of life for individuals around the globe. However, the arduous journey from laboratory discovery to market availability is fraught with obstacles and uncertainties.

One of the most significant hurdles in drug development is the staggering economic losses incurred when a drug candidate fails during clinical trials. These trials involve diverse population groups and are designed to assess the safety and efficacy of a drug in real-world scenarios. Even when a drug has shown exceptional promise in preclinical stages, the transition to clinical trials can reveal unexpected challenges, leading to setbacks that cost pharmaceutical companies billions of dollars.

To address this critical issue, it is imperative to understand why certain drugs, despite passing rigorous preclinical testing, falter during clinical trials. Moreover, there is a pressing need to develop methods that can predict a drug's chances of approval before embarking on these costly and time-consuming trials.

Enter Professor Sanguk Kim and PhD candidate Minhyuk Park, leading a research team at POSTECH's Department of Life Sciences. Leveraging the power of machine learning, they have achieved remarkable success in predicting potential drug outcomes and side effects before clinical trials commence.

The crux of their groundbreaking research lies in addressing a fundamental discrepancy in drug effects observed between cell lines and animals, commonly used in preclinical testing, and their ultimate impact on humans. This discrepancy arises from variations in how drug target genes function and are expressed in cells as opposed to humans. Neglecting this critical difference can lead to severe and unanticipated side effects when drugs are administered to actual patients, deviating significantly from the promising results seen in laboratory settings.

The researchers at POSTECH tackled this challenge head-on by focusing on the disparities in drug effects between cells and humans. Their approach involved a comprehensive analysis of the CRISPR-Cas9 knockout and loss-of-function mutation rate-based gene perturbation effects in cells and humans, respectively. By evaluating this discrepancy, they aimed to predict the likelihood of a drug's approval, drawing from a dataset that included 1,404 approved drugs and 1,070 unapproved drugs.
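As a very rough sketch of the task's shape (none of this is the POSTECH code; the feature matrix and the classifier choice are invented placeholders), approval prediction of this kind reduces to binary classification over per-drug features:

```python
# Hypothetical sketch: binary approval prediction from per-drug features, with
# class sizes taken from the article; the features themselves are random
# stand-ins for quantities such as a cells/humans perturbation discrepancy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_approved, n_unapproved = 1404, 1070
X = rng.standard_normal((n_approved + n_unapproved, 3))  # placeholder features
y = np.r_[np.ones(n_approved), np.zeros(n_unapproved)]   # 1 = approved

clf = LogisticRegression()
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
print(f"cross-validated AUC: {auc:.3f}")  # ~0.5 here, since features are noise
```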

To further validate the risk associated with drug targets exhibiting the cells/humans discrepancy, the researchers delved into the targets of drugs that had previously failed in clinical trials or been withdrawn from the market due to safety concerns. This meticulous analysis provided crucial insights into the factors contributing to drug failures and enabled the research team to refine their predictive models.

What sets this research apart from conventional approaches is its integration of both chemical and genetic strategies. While traditional methods primarily rely on a drugs chemical properties to predict its success, the POSTECH team recognized the significance of genetic differences between preclinical models and humans. By harmonizing these two facets, they achieved a level of accuracy previously unattainable in drug safety and success predictions.

The implications of this research are nothing short of revolutionary. Machine learning's ability to predict a drug's chances of approval with a high degree of accuracy has the potential to reshape the pharmaceutical industry. By providing pharmaceutical companies with a tool to make more informed decisions about which drug candidates to advance to clinical trials, this technology has the potential to reduce the risk of costly failures and accelerate the development of safe and effective drugs.

As with any transformative technology, the use of machine learning in drug development raises important ethical considerations. Ensuring the privacy and security of patient data used in these predictive models is paramount. Additionally, regulatory agencies will need to adapt to accommodate the use of these innovative approaches in the drug approval process, striking a balance between innovation and safety.

The work conducted by Professor Sanguk Kim, Minhyuk Park, and their team at POSTECH represents a significant step forward in drug development. Their integration of machine learning, genetic insights, and chemical properties promises to revolutionize the way pharmaceuticals are discovered and developed, ultimately benefiting not only the industry but also the health and well-being of individuals worldwide. The journey from laboratory discovery to clinical approval may soon become a more efficient and predictable path, ushering in a new era of medical innovation.

Here is the original post:
Revolutionizing Drug Development with Machine Learning to ... - Cryptopolitan

Prediction of lung papillary adenocarcinoma-specific survival using … – Nature.com

The accurate prediction of survival in patients with LPADC is essential for patient counseling, follow-up, and treatment planning. Previous studies have revealed multiple prognostic factors that affect the survival time of patients with pulmonary papillary carcinoma, including patient age, grade classification, lymph node status, tumor size, distant metastases, and surgical treatment9, 11. Machine learning is increasingly utilized in research for the prediction of survival of patients with cancer25,26,27, with relatively favorable results. Although CoxPH is the classical method utilized for the analysis of survival data, it assumes a linear relationship between the covariates and the log-hazard. As a result of the continuous advances achieved in recent years, machine learning is widely applied to the medical field28,29,30. In this study, we used ensemble machine learning models to accurately predict cancer-specific survival (CSS) in patients with LPADC, and obtained satisfactory results.
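For readers who want to reproduce this kind of comparison in outline, a minimal sketch using scikit-survival follows; this is not the authors' code, and the bundled WHAS500 dataset and hyperparameters are stand-ins for the study cohort:

```python
# Sketch: fit a CoxPH model and two of the ensemble survival models named above
# (RSF, GBS) and compare them by concordance index on a held-out split.
from sklearn.model_selection import train_test_split
from sksurv.datasets import load_whas500
from sksurv.ensemble import GradientBoostingSurvivalAnalysis, RandomSurvivalForest
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.metrics import concordance_index_censored

X, y = load_whas500()
X = X.astype(float)                       # keep features purely numeric
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
event_field, time_field = y.dtype.names   # structured array: (event, time)

models = {
    "CoxPH": CoxPHSurvivalAnalysis(),
    "RSF": RandomSurvivalForest(n_estimators=200, random_state=0),
    "GBS": GradientBoostingSurvivalAnalysis(random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    risk = model.predict(X_te)            # higher score = higher predicted risk
    cindex = concordance_index_censored(
        y_te[event_field], y_te[time_field], risk)[0]
    print(f"{name}: c-index = {cindex:.3f}")
```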

Consistent with the findings reported by You et al., the four models developed in this study confirmed that surgery is an important prognostic factor for patients with lung adenocarcinoma3. Similarly, distant metastases have an important impact on the prognosis of patients with LPADC. In conjunction with previous analyses, the findings demonstrate that patients who developed distant metastases had poorer survival rates than other patients26, 27. A higher N-stage also plays a crucial role in the model, indicating poor prognosis28. Other characteristics (e.g., tumor size, grade, sex, chemotherapy, primary site, etc.) have different degrees of importance in various models11, 23, 27. These results suggest that the selection of appropriate treatment modalities (e.g., surgery, radiotherapy, and chemotherapy) may be more important for predicting CSS in patients with LPADC than TNM staging alone.

Interestingly, the ensemble models (i.e., GBS, EST, and RSF) did not demonstrate a markedly better ability to predict CSS in LPADC in the validation cohort compared with the CoxPH model. This indicates that the machine learning approach may only offer advantages when traditional models are limited. There are several possible explanations for the comparable predictive performance observed between the ensemble and CoxPH models in this study. Firstly, the number of predictors used to construct the model was not sufficiently large, so the advantages of machine learning in analyzing large samples and multivariate data were not fully realized. Secondly, the SEER database collects variables derived from clinical experience, and many of these variables are linearly correlated with outcomes; the data may therefore be better suited to the application of parametric (CoxPH) models. The GBS, EST, and RSF models developed in this study matched the predictive efficacy of the CoxPH model under broader conditions. The web calculator constructed for the study is based on the training dataset, and care should be taken when applying the EST model, which may be overconfident; hence, this algorithm is not recommended for survival prediction. In this study, the CoxPH model had poorer long-term predictive power than the ensemble models. Therefore, use of the RSF model is recommended for the prediction of LPADC CSS beyond 10 years.

This study had several limitations. Firstly, in the SEER database, there was a lack of data regarding established predictors of survival in patients with LPADC (e.g., chemotherapy regimens and biological markers). Secondly, due to the retrospective nature of this study and data processing, samples with missing information were excluded; this may have led to considerable bias. Thirdly, the work related to the measurement of prediction model errors in the study is not yet complete. Finally, the results of this study were not externally validated; although we randomly split the study sample during the development of the models, the generalizability and reliability of this approach should be further validated with external datasets. The prognostic value of this approach should be improved in the future by adding more predictors, increasing external validation, and conducting prospective studies.

In conclusion, three ensemble models and a CoxPH model were developed and evaluated for the prediction of CSS in patients with LPADC. Overall, all four models showed excellent discriminative and calibration capabilities; in particular, the RSF and GBS models showed excellent consistency for long-term forecasting. The integrated web-based calculator makes it easy to calculate the CSS of patients with LPADC, providing clinicians with a user-friendly risk stratification tool.

Continue reading here:
Prediction of lung papillary adenocarcinoma-specific survival using ... - Nature.com

Seismologists use deep learning to forecast earthquakes – University of California

For more than 30 years, the models that researchers and government agencies use to forecast earthquake aftershocks have remained largely unchanged. While these older models work well with limited data, they struggle with the huge seismology datasets that are now available.

To address this limitation, a team of researchers at the University of California, Santa Cruz, and the Technical University of Munich created a new model that uses deep learning to forecast aftershocks: the Recurrent Earthquake foreCAST (RECAST). In a paper published in Geophysical Research Letters, the scientists show how the deep learning model is more flexible and scalable than the earthquake forecasting models currently used.

The new model outperformed the current model, known as the Epidemic Type Aftershock Sequence (ETAS) model, for earthquake catalogs of about 10,000 events and greater.
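For context, ETAS is a self-exciting point-process model; a textbook form of its conditional intensity (a standard formulation, not taken from the paper) is:

```latex
\lambda(t \mid \mathcal{H}_t) = \mu + \sum_{i \,:\, t_i < t} \frac{K \, e^{\alpha (m_i - M_0)}}{\left(t - t_i + c\right)^{p}}
```

Here \(\mu\) is the background seismicity rate, the sum runs over past events of magnitude \(m_i\) at times \(t_i\), and \(K\), \(\alpha\), \(c\), \(p\) are fitted parameters governing magnitude-dependent productivity and Omori-law temporal decay. RECAST, by contrast, learns this dependence on event history directly from data with a recurrent deep learning model.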

The ETAS model approach was designed for the observations that we had in the '80s and '90s, when we were trying to build reliable forecasts based on very few observations, said Kelian Dascher-Cousineau, the lead author of the paper, who recently completed his Ph.D. at UC Santa Cruz. It's a very different landscape today. Now, with more sensitive equipment and larger data storage capabilities, earthquake catalogs are much larger and more detailed.

We've started to have million-earthquake catalogs, and the old model simply couldn't handle that amount of data, said Emily Brodsky, a professor of earth and planetary sciences at UC Santa Cruz and co-author on the paper. In fact, one of the main challenges of the study was not designing the new RECAST model itself but getting the older ETAS model to work on huge data sets in order to compare the two.

The ETAS model is kind of brittle, and it has a lot of very subtle and finicky ways in which it can fail, said Dascher-Cousineau. So, we spent a lot of time making sure we weren't messing up our benchmark compared to actual model development.

To continue applying deep learning models to aftershock forecasting, Dascher-Cousineau says the field needs a better system for benchmarking. In order to demonstrate the capabilities of the RECAST model, the group first used an ETAS model to simulate an earthquake catalog. After working with the synthetic data, the researchers tested the RECAST model using real data from the Southern California earthquake catalog.

They found that the RECAST model, which can essentially learn how to learn, performed slightly better than the ETAS model at forecasting aftershocks, particularly as the amount of data increased. The computational effort and time were also significantly better for larger catalogs.

This is not the first time scientists have tried using machine learning to forecast earthquakes, but until recently, the technology was not quite ready, said Dascher-Cousineau. New advances in machine learning make the RECAST model more accurate and easily adaptable to different earthquake catalogs.

The models flexibility could open up new possibilities for earthquake forecasting. With the ability to adapt to large amounts of new data, models that use deep learning could potentially incorporate information from multiple regions at once to make better forecasts about poorly studied areas.

We might be able to train on New Zealand, Japan, California and have a model that's actually quite good for forecasting somewhere where the data might not be as abundant, said Dascher-Cousineau.

Using deep-learning models will also eventually allow researchers to expand the type of data they use to forecast seismicity.

We're recording ground motion all the time, said Brodsky. So the next level is to actually use all of that information, not worry about whether we're calling it an earthquake or not an earthquake, but to use everything.

In the meantime, the researchers hope the model sparks discussions about the possibilities of the new technology.

It has all of this potential associated with it, said Dascher-Cousineau. Because it is designed that way.

Follow this link:
Seismologists use deep learning to forecast earthquakes - University of California

Stay ahead of the game: The promise of AI for supply chain … – Washington State Hospital Association

Everybody is talking about artificial intelligence and machine learning lately, but is the hype real? Finding supplies and keeping them stocked and easily accessible can be daunting, but when minutes count, it becomes even more crucial. There are AI and machine learning tools designed to ease the workload. Join us for a webinar from 11 a.m. to 12 p.m. on Thursday, Oct. 12 to learn about this growing technology and a practical example of how AI and machine learning have impacted access to lifesaving equipment and supplies at a Washington hospital, enabling more patient-facing time. Register here.

This program is the first in a series of hospital supply chain webinars and roundtables where you will hear from hospital supply chain leaders and peers as they share practical tips and tools on creating supply chain efficiencies, cost savings and innovations. Feel free to share this invitation with your internal hospital networks. Unable to attend? Register anyway and we will send you the slides and webinar recording.

Learning objectives:

The hospital supply chain tips and tools programs are provided for WSHA hospital members in collaboration with the Western States Healthcare Materials Management Association (WSHMMA), the regional chapter of AHRMM.

For more information about this and future supply chain programs, please contact Cynthia Hay, cynthiah@wsha.org, (206) 216-2526. (Cynthia Hay)

Read the rest here:
Stay ahead of the game: The promise of AI for supply chain ... - Washington State Hospital Association

Predicting Stone-Free Status of Percutaneous Nephrolithotomy … – Dove Medical Press

Introduction

Urolithiasis (or nephrolithiasis) is a relatively common disease affecting 1–13% of the global population, and is more common in Jordan, affecting 5.95% of the Jordanian population.1,2 It has a predilection for obese Caucasian men and carries significant morbidity, and its prevalence has been on the rise over the last four decades. Several procedures are currently in use for the management of kidney stones, including extracorporeal shockwave lithotripsy (ESWL), ureteroscopic lithotripsy (URSL), and percutaneous nephrolithotomy (PCNL). While each has its own indications, PCNL remains the gold standard for large renal stones measuring greater than 2 cm, staghorn stones, and partial staghorn stones.3

PCNL is not free of complications and its efficacy can be variable; therefore, a few pre-operative nomograms are in place to help predict success rates, namely stone-free status, and possible complications; such nomograms also help to systemize the reporting and interpretation of the surgery's outcomes.4 Examples of such nomograms are the S.T.O.N.E score, S-ReSC score, Guy's Stone Score (GSS), and the CROES nephrolithometry score. The GSS comprises 4 grades that rate the complexity of the future PCNL based on renal anatomy and stone location; the score is based on all stones detected and not only those amenable to PCNL; higher grades in GSS correlate with a lower chance of stone-free status (SFS).5 On the other hand, the S.T.O.N.E score is obtained from pre-operative radiological characteristics that include stone size, the topography of the stone, obstruction with respect to the degree of hydronephrosis, the number of stones, and the Hounsfield Unit (HU) value of the stone.6 Studies have shown that these stone scoring systems (SSS) have similar accuracy in predicting the SFS of PCNL patients, although GSS shows a slight superiority in complication prediction.7

Recently, machine learning (ML) has been trialed as a possible alternative to traditional SSS in predicting the sequelae of PCNL, with five studies in the literature documenting the endeavor.8–12 All five studies described ML as promising, as it showed high sensitivity and accuracy along with efficiency in comparison to SSS. Each study used different ML methods, from artificial neural network (ANN) systems9 to support vector machine (SVM) models. However, none of the aforementioned studies were externally validated, and they all had smaller sample sizes compared to our study. In our study, we utilized the following three ML methods to predict the SFS of 320 PCNL patients: Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost). Using our results, we compared ML's performance to two stone scoring systems: Guy's Stone Score and the S.T.O.N.E score. We then externally validated our model, becoming the first study in the literature to do so, as well as the first study in Jordan that aims to create a machine learning model (MLM) for predicting the SFS of PCNL.

We conducted a retrospective, observational, single-center cohort study at King Abdullah University Hospital (KAUH), the main tertiary hospital in North Jordan. The Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement, which provides guidelines for reporting and developing predictive models, was followed in this study.13 The Research Committee of the Faculty of Medicine and the Institutional Review Board at Jordan University of Science and Technology (JUST) approved the study, and the Institutional Review Board provided the ethical approval (840-2022). The ethics committee approved a waiver of consent from the patients because the study did not include any therapeutic intervention and the outcomes planned are routinely registered in patients with nephrolithiasis. All patients diagnosed with nephrolithiasis, confirmed by computed tomography (CT) scans, who had undergone Percutaneous Nephrolithotomy (PCNL) between January 2017 and September 2022 at KAUH were included. A standard diagnostic and preoperative evaluation was performed on all patients, which included a complete blood count, full coagulation profile, urine culture, kidney function test, and second-generation prophylactic antibiotics.

The study included a total of three surgeons (A, B, and C) who performed PCNL procedures on patients with renal calculi. The assignment process involved a random allocation method, which allowed for an unbiased distribution of surgeons across the various groups. This approach aimed to explore potential patterns or trends in PCNL outcomes without intentional surgeon classification. The study aimed to analyze the outcomes within this random assignment to gain insights that could contribute to the broader understanding of PCNL efficacy and surgeon impact on treatment success. All procedures were performed under general anesthesia. The patients were placed in a prone position, and a small skin incision was made at the planned nephrostomy tract; under fluoroscopy guidance, a guide wire was inserted down to the urinary bladder. Dilatation was then performed up to 11 Fr and, using a double-lumen catheter, a safety guidewire was inserted. A balloon dilator (NephroMax) was used to achieve maximum dilation at a pressure of 12 atm, and the working sheath was then inserted. A rigid 26 Fr nephroscope was used in all patients, and stone fragmentation was performed using different methods depending on the preference of the treating urologist; ultrasonic lithotripsy was the most common method in this regard. A nephrostomy was placed in almost all cases. If necessary, the nephrostomy tube was left in the renal pelvis for decompression and/or easy access. Plain radiography of the kidneys, ureters, and bladder (KUB) was obtained from postoperative day 1 according to the state of the patient.

The nephrostomy tubes were removed on postoperative day 1 or 2 when the radiological images showed signs of SFS. SFS means either the absence of stones or clinically insignificant residual fragments (diameter less than 4 mm) in the kidney after the procedure. Various methods were used to determine whether stone-free status had been achieved, including imaging studies such as X-rays or CT scans, as well as direct inspection of the kidney using the nephroscope. Stone-free status is typically assessed immediately following the procedure, but in some cases, a follow-up evaluation may be required to confirm that no residual stones remain.

A set of input variables, both preoperative and postoperative, was collected from the hospital records at KAUH for all patients. The preoperative variables were age, gender, hypertension, diabetes, hyperlipidemia, preoperative hemoglobin, renal insufficiency, recent urinary tract infections, previous surgeries on the target kidney, stone burden, stone location, and hydronephrosis. Postoperative variables included fever, septicemia, need for transfusion, length of hospital stay, ancillary procedures, and stone-free status. SFS was defined as either no residual stone fragments, as assessed on a CT scan or X-ray and by direct inspection of the kidney using the nephroscope, or clinically insignificant residual fragments <4 mm. The outcome was entered as a binary number: 1 (residual stone, ie, Yes) or 0 (clinically insignificant residual fragments or no residual stone fragments).

Three ML models were employed in this study: the Random Forest Classifier (RFC), Support Vector Classifier (SVC), and Extreme Gradient Boosting (XGBoost). These algorithms were selected for their effectiveness in handling complex, multidimensional datasets and their capacity to model nonlinear relationships. The RFC model is a decision tree-based machine learning model. Each node of a decision tree divides the data into two groups using a cutoff value within one of the features. By creating an ensemble of randomized decision trees, each of which overfits the data, and aggregating their results to achieve improved classification, the RFC technique mitigates the impact of the overfitting problem.14 SVC is a powerful supervised machine learning technique that aims to find the optimal hyperplane to separate data into different classes; it is well suited for both classification and regression tasks. XGBoost was also used; it is built on a decision tree-based gradient boosting method.15 In this approach, prediction trees are built sequentially, with each subsequent tree designed to reduce the errors of its predecessors.
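A minimal sketch of this model lineup might look as follows; this is not the study's code, and the synthetic data, sizes, and hyperparameters are placeholders:

```python
# Sketch: the three classifier families described above, fit on synthetic
# binary-outcome data shaped like the study (320 patients, 26 features).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from xgboost import XGBClassifier

X, y = make_classification(n_samples=320, n_features=26, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)      # 7:3 split, as in the study

models = {
    "RFC": RandomForestClassifier(n_estimators=500, random_state=42),
    "SVC": SVC(probability=True, random_state=42),  # enables predict_proba
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name} accuracy: {model.score(X_test, y_test):.3f}")
```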

After that, the machine learning models were trained on a dataset with a binary classification output predicting the target Stone-Free Status (SFS), using 26 features covering demographic, clinical, renal, preoperative, and postoperative surgical variables. The dataset was then split randomly 7:3 into a training set (n = 224) and a testing set (n = 96). Each feature's contribution to predicting SFS was calculated using the permutation importance method, in which a larger decrease in mean accuracy represents a higher importance to the model's predictions. Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) scores were calculated to evaluate the discriminatory power of the different models. The roc_curve function from the sklearn.metrics module was employed to compute the False Positive Rate (FPR) and True Positive Rate (TPR) for each model. The predicted probabilities of the positive class were obtained using the predict_proba method of each model. The AUC scores were calculated using the roc_auc_score function. A custom plotting function, plot_roc_curve, was defined to visualize the ROC curves of multiple models. The models were also evaluated using the mean bootstrap estimate with a 95% confidence interval, 10-fold cross-validation, and a classification report for precision, recall, and F1-score.
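The ROC/AUC step described above could be sketched as follows, reusing the `models` dictionary and test split from the previous snippet; again, this is an illustration rather than the authors' script:

```python
# Sketch: compute FPR/TPR with roc_curve, AUC with roc_auc_score, and plot all
# models' ROC curves with a custom helper, as the text describes.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve

def plot_roc_curve(models, X_test, y_test):
    for name, model in models.items():
        proba = model.predict_proba(X_test)[:, 1]   # positive-class probability
        fpr, tpr, _ = roc_curve(y_test, proba)
        auc = roc_auc_score(y_test, proba)
        plt.plot(fpr, tpr, label=f"{name} (AUC = {auc:.3f})")
    plt.plot([0, 1], [0, 1], "k--", label="chance")  # diagonal reference line
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.legend()
    plt.show()

plot_roc_curve(models, X_test, y_test)
```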

All three models were externally validated using data extracted from a previous, similar study by Zhao et al with compatible variables, which included 224 patients.8 The algorithms generated predictions for the instances in the validation dataset, and these predictions were compared to the actual outcomes to assess each model's accuracy, mean bootstrap estimate, and AUC. The results obtained from this evaluation provide an estimate of the model's generalizability to unseen data, thus helping to validate its effectiveness and applicability in real-world scenarios. All ML implementations were processed using the scikit-learn 0.18 package in Python.

All data analyses were performed using the IBM Statistical Package for the Social Sciences (SPSS) software for Windows, version 26.0. Descriptive measures included means ± standard deviations for continuous data if the normality assumption was not violated, according to the Shapiro–Wilk test, and medians with first and third quartiles (Q1–Q3) if the assumption was violated. Categorical data were presented as frequencies and percentages (%). Continuous data were compared using the Student t test for normally distributed variables and the Mann–Whitney U-test if not normally distributed. Categorical data were compared using the χ2 test, or the Fisher's exact test if any cell had an expected count of less than 5. Variables included in the model were chosen based on a separate bivariate analysis, including all variables yielding a P value of <0.1. Nagelkerke R2 was used as a measure of the goodness-of-fit. The variables in the model were checked for multicollinearity using the variance inflation factor. Statistical significance was considered at a 2-sided P value of <0.05.

A total of 320 patients (222 males, 69.4%) were enrolled. The mean age was 46.03 ± 14.7 years, and the median (IQR) stone burden was 208.1 (231) mm2. Table 1 shows the preoperative variables, including individual variables and renal and stone data. The patients comprised 92 non-stone-free cases and 228 stone-free cases. The distribution of GSS categories differed significantly between the non-stone-free and stone-free groups (p < 0.0001). The stone-free group had higher proportions of patients in the GSS I, GSS II, and GSS III categories compared to the non-stone-free group, which had a higher proportion of patients in the GSS IV category. The S.T.O.N.E score is another scoring system used to evaluate stone characteristics. Similar to GSS, the distribution of S.T.O.N.E score categories varied between the non-stone-free and stone-free groups. Higher S.T.O.N.E scores (9 and above) had a higher percentage in the non-stone-free group compared to the stone-free group. The non-stone-free group had a higher median stone burden of 319.6 mm2, compared to 182.9 mm2 in the stone-free group. The stone burden within the non-stone-free group was nearly twice as large as that within the stone-free group, and this difference was statistically significant (p < 0.0001). Stone location accounted for statistically significant differences, with the upper calyx, middle calyx, and lower calyx showing a higher percentage in the non-stone-free group compared with the stone-free group. Preoperative UTI had a higher percentage of 37% in the stone-free group compared with 22.4% in the non-stone-free group. No statistically significant differences were observed between the non-stone-free and stone-free groups for variables such as diabetes, hypertension, hyperlipidemia, unilateral kidney, renal insufficiency, anemia, and previous surgery on the target kidney (p > 0.05). Table 2 shows the postoperative data for these patients. The overall SFS was 71.3% (228/320). Table 3 presents the analysis of the impact of surgeon expertise on SFS in PCNL, comparing SFS across the three surgeon groups (A, B, and C). Among patients operated on by surgeon A, 65 (73%) were stone-free, indicating successful complete stone clearance; in contrast, 24 patients (27%) in this group were classified as non-stone-free, indicating the presence of residual stones >4 mm post-PCNL. For surgeon B, 75 patients (70%) were stone-free, while 32 (30%) were classified as non-stone-free. Surgeon C had 88 patients (71%) classified as stone-free, and 36 patients (29%) classified as non-stone-free. Figure 1 shows the stone-free rate in each subgroup of the GSS grades and the S.T.O.N.E score systems.

Table 1 Preoperative Factors Including Individual Variables and Renal Stone Factors

Table 2 Postoperative Outcome Variable (n = 320)

Table 3 Analyzing the Impact of Surgeon Expertise on SFS in PCNL

Figure 1 The stone-free rate in each subgroup of GSS grades and the S.T.O.N.E score systems.

The RFC model performed on the testing set with a mean bootstrap estimate of 0.75 (95% CI: [0.65–0.85]), 10-fold cross-validation of 0.744, an accuracy of 0.74, and an AUC of 0.761, while the XGBoost model predicted on the testing set with a mean bootstrap estimate of 0.74 (95% CI: [0.63–0.85]), 10-fold cross-validation of 0.759, an accuracy of 0.72, and an AUC of 0.769. The SVM model performed with a mean bootstrap estimate of 0.70 (95% CI: [0.60–0.79]), 10-fold cross-validation of 0.725, an accuracy of 0.74, and an AUC of 0.751. On the other hand, Guy's Score and the S.T.O.N.E Score had an AUC of 0.666 and 0.71, respectively. The RFC model performed on the external validation set with a mean bootstrap estimate of 0.87 (95% CI: [0.81–0.92]), an accuracy of 0.70, and an AUC of 0.795, while the XGBoost model predicted on the external validation set with a mean bootstrap estimate of 0.84 (95% CI: [0.78–0.91]), an accuracy of 0.74, and an AUC of 0.84. The SVM model performed on the external validation set with a mean bootstrap estimate of 0.86 (95% CI: [0.80–0.91]), an accuracy of 0.79, and an AUC of 0.858. ROC curves of all MLMs are displayed in Figure 2. The most contributing features in predicting SFS in the RFC model are displayed in Figure 3. The highest contributing factor was stone burden, followed by the length of stay and age.
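The bootstrap point estimates and 95% CIs reported above can be approximated in outline as below; this is a sketch under the same placeholder setup as the earlier snippets, not the study's code:

```python
# Sketch: bootstrap the test-set accuracy of a fitted model by resampling the
# test split with replacement and taking percentile confidence bounds.
import numpy as np

def bootstrap_accuracy(model, X_test, y_test, n_boot=1000, seed=42):
    rng = np.random.default_rng(seed)
    n = len(y_test)
    scores = [model.score(X_test[idx], y_test[idx])
              for idx in (rng.integers(0, n, n) for _ in range(n_boot))]
    lo, hi = np.percentile(scores, [2.5, 97.5])   # 95% percentile interval
    return float(np.mean(scores)), (float(lo), float(hi))

mean_acc, ci = bootstrap_accuracy(models["RFC"], X_test, y_test)
print(f"mean bootstrap estimate {mean_acc:.2f}, 95% CI [{ci[0]:.2f}-{ci[1]:.2f}]")
```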

Figure 2 The ROC curves of the three MLMs including the externally validated set and the GSS and S.T.O.N.E score system.

Figure 3 Results of feature importance analysis in the RFC model for predicting the SFS of PCNL patients.

Nephrolithiasis is a common kidney disease with a prevalence of 1%–5% in Asia and 7%–13% in America, and a male predominance (ratio 1.5–2.5:1).13–15 Urinary lithiasis imposes diagnostic, prognostic, and financial burdens, especially when a patient needs multiple imaging and surgical procedures.16 Therefore, we aimed to develop an MLM that can predict the postoperative outcome, namely SFS, in patients who underwent PCNL. Our models predicted the SFS with high accuracy and certainty using pre- and post-operative variables, marking the stone burden as the highest contributing predictor of SFS.

Among the factors considered in our predictive models, the length of hospital stay and age stand out as noteworthy contributors to the predictive capacity of the model. These findings underscore the dynamic nature of predicting SFS, revealing that beyond preoperative variables, factors associated with the postoperative trajectory can substantially influence outcome predictions. The inclusion of the length of stay, a postoperative parameter, lends valuable insights into the nuances of stone clearance efficacy and the recovery process. Our models demonstrate that patients who required a longer hospital stay were more likely to exhibit distinct stone burdens and procedural complexities, aligning with the clinical intuition that these patients may require additional care to achieve optimal outcomes. The predictive power of the length of hospital stay offers clinicians an early indicator of potential stone-related challenges, allowing for more targeted interventions and follow-up strategies.

Upon comparison of the stone burden between the non-stone-free and stone-free groups, the non-stone-free group showed nearly double the stone burden, a statistically significant difference. This disparity between the groups could be due to the multifaceted challenges that larger stone burdens pose for PCNL procedures. Large burdens often indicate complex or multiple stone formations that hinder full access to the collecting system, thereby preventing complete fragmentation and lowering the rate of stone-free status.17

Upon closer examination, we observed a counterintuitive trend between age and SFS. Interestingly, the mean age of non-stone-free patients was lower than that of stone-free patients. It is important to note that the non-stone-free group consisted of 92 patients, whereas the stone-free group comprised 228 patients. This disparity in sample sizes could potentially influence the observed relationship, prompting us to consider the role of sample distribution in drawing conclusive insights. This observation prompts further investigation into the complex relationship between age, stone characteristics, and treatment outcomes. While age emerged as a contributing factor to our predictive models, the inverse relationship between age and SFS in our study warrants a deeper exploration of the underlying mechanisms. The relationship between age and SFS could potentially be attributed to a variety of factors. One possible explanation could be the differential distribution of stone types among different age groups. Age-related variations in stone composition, density, or structure might influence fragmentation behavior and clearance rates, subsequently affecting SFS outcomes. Moreover, physiological differences related to bone density, urinary dynamics, and kidney function across various age cohorts could contribute to the observed trend.

When compared to the conventional scoring systems, our model showed superior performance to Guy's Stone Score and the S.T.O.N.E score. The Guy's Stone Score was first developed by Thomas et al and consists of four grades based on the location and number of stones.18 This score has been validated on many prospective PCNL procedures and was significantly correlated with the stone-free rate, whereas stone burden and patients' demographic and clinical factors did not show any correlation.19 An external validation reported an AUC of 0.69 for Guy's stone score.20

A study by Zhao et al also assessed the predictive effect of demographic and pre- and post-operative renal variables on SFS using ML, with similar performance to our model. The stone burden was also observed to be the highest contributing feature in their logistic regression model, in addition to stone location.8 However, their model did not show superior performance to the S.T.O.N.E score, but only to Guy's score (RFC: 0.80, Guy's score: 0.84, S.T.O.N.E score: 0.78). Our data was externally validated using this study, ensuring our results generalize to the population. However, the calculation of stone burden is not consistent across studies, and there is no single descriptive formula for it; this introduces heterogeneity and inconsistency in predicting SFS. In our study, the stone burden was calculated by the following formula: (maximum length × maximum width × 0.25), which was also used by Smith et al.21 This, in turn, raises the need for a clearly defined model that considers interindividual and operative variables in addition to ethnic and racial factors. Previous studies have externally evaluated stone scoring systems in Eastern and Western societies; here we present the first study cohort evaluating stone scores in a Middle Eastern society. Srivastava et al also evaluated the inter-observer variability between surgeons and radiologists for Guy's and S.T.O.N.E scores and studied the agreement using Fleiss coefficients. The overall S.T.O.N.E score showed good agreement between surgeons and radiologists (Fleiss κ = 0.79); the same applies to Guy's score for all grades, with moderate to very good agreement (Fleiss κ: Grade 1 = 0.91; Grade 2 = 0.53; Grade 3 = 0.61; Grade 4 = 0.84).22
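Written out, and assuming the maximum length and width are measured in millimeters, the stone burden formula used here is:

```latex
\text{Stone burden (mm}^2\text{)} = 0.25 \times L_{\max} \times W_{\max}
```

where \(L_{\max}\) and \(W_{\max}\) are the maximum stone length and width on pre-operative imaging.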

We were the only publication in our field to externally validate our data, ensuring the accuracy and reliability of our findings. This process involved an independent third-party review to verify the methodology and results; by taking this extra step, we were able to lend further credibility to our conclusions. Overall, the use of statistical methods and Python programming in external validation helps to ensure the robustness and generalizability of the data and model. A strength of the study is that it considered additional variables, such as age and pre-operative UTI, which are not included in the GSS and S.T.O.N.E score.

The ML techniques employed in this study were able to predict the rate of successful stone removal with higher accuracy than the established GSS and S.T.O.N.E score systems; moreover, this study considered other variables that are not included in the aforementioned scoring systems. All three MLMs were externally validated and showed high accuracy rates. The accuracy of the system for predicting the stone-free rate was found to be between 70% and 79%, with an AUC between 0.751 and 0.858, compared to the AUCs of GSS and S.T.O.N.E, which were 0.67 and 0.71, respectively. All ML methods found that the factors with the greatest impact on stone-free status were the initial stone burden, length of stay, and patient age.

In accordance with the ethical guidelines and standards outlined in the Declaration of Helsinki, we hereby confirm that our study fully complies with these principles. The Research Committee of the Faculty of Medicine and the Institutional Review Board at Jordan University of Science and Technology (JUST) approved the study, and the Institutional Review Board provided the ethical approval (840-2022). The ethics committee approved a waiver of consent from the patients because the study did not include any therapeutic intervention and the outcomes planned are routinely registered in patients with nephrolithiasis.

The authors declare that they have followed all ethical and scientific standards when conducting their research and writing the manuscript and that all authors have approved the final version of the manuscript for submission.

We would like to thank Editage for English language editing.

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

This research received no external funding.

The authors of this article have carefully considered any potential conflicts of interest and have found none to report. They have no relevant financial or non-financial interests that could impact the articles content, and they have no affiliations or involvement with any organizations with a financial or proprietary interest in the material discussed. The authors declare that they have no competing interests related to the manuscripts subject matter and certify that they have no ties to any entity that could present a conflict.

1. Sorokin I, Mamoulakis C, Miyazawa K, Rodgers A, Talati J, Lotan Y. Epidemiology of stone disease across the world. World J Urol. 2017;35(9):1301–1320. doi:10.1007/s00345-017-2008-6

2. Abboud IA. Prevalence of Urolithiasis in Adults due to Environmental Influences: a Case Study from Northern and Central Jordan. Jordan J Earth Environ Sci. 2018;9(1):29–38.

3. Ganpule AP, Vijayakumar M, Malpani A, Desai MR. Percutaneous nephrolithotomy (PCNL): a critical review. Int J Surg. 2016;36(Pt D):660–664. doi:10.1016/j.ijsu.2016.11.028

4. Kumar U, Tomar V, Yadav SS, et al. STONE score versus Guy's Stone Score - prospective comparative evaluation for success rate and complications in percutaneous nephrolithotomy. Urol Ann. 2018;10(1):76–81. doi:10.4103/UA.UA_119_17

5. Wu WJ, Okeke Z. Current clinical scoring systems of percutaneous nephrolithotomy outcomes. Nat Rev Urol. 2017;14(8):459–469. doi:10.1038/nrurol.2017.71

6. Zhernovoi I, Shchukin D, Jundi M, Grabs D, Maranzano J, Nayouf A. Comparison of four transdiaphragmatic approaches to remove cavoatrial tumor thrombi: a pilot study. Cent European J Urol. 2022;75(2):145–152. doi:10.5173/ceju.2022.0277.R1

7. Jiang K, Sun F, Zhu J, et al. Evaluation of three stone-scoring systems for predicting SFR and complications after percutaneous nephrolithotomy: a systematic review and meta-analysis. BMC Urol. 2019;19:1. doi:10.1186/s12894-019-0488-y

8. Zhao H, Li W, Li J, Li L, Wang H, Guo J. Predicting the Stone-Free Status of Percutaneous Nephrolithotomy With the Machine Learning System: Comparative Analysis With Guy's Stone Score and the S.T.O.N.E Score System. Front Mol Biosci. 2022;9. doi:10.3389/fmolb.2022.880291

9. Aminsharifi A, Irani D, Pooyesh S, et al. Artificial Neural Network System to Predict the Postoperative Outcome of Percutaneous Nephrolithotomy. J Endourol. 2017;31(5):461–467. doi:10.1089/end.2016.0791

10. Shabaniyan T, Parsaei H, Aminsharifi A, et al. An artificial intelligence-based clinical decision support system for large kidney stone treatment. Australas Phys Eng Sci Med. 2019;42(3):771–779. doi:10.1007/s13246-019-00780-3

11. Aminsharifi A, Irani D, Tayebi S, Jafari Kafash T, Shabanian T, Parsaei H. Predicting the Postoperative Outcome of Percutaneous Nephrolithotomy with Machine Learning System: Software Validation and Comparative Analysis with Guy's Stone Score and the CROES Nomogram. J Endourol. 2020;34(6):692–699. doi:10.1089/end.2019.0475

12. Hameed BMZ, Shah M, Naik N, Singh Khanuja H, Paul R, Somani BK. Application of Artificial Intelligence-Based Classifiers to Predict the Outcome Measures and Stone-Free Status Following Percutaneous Nephrolithotomy for Staghorn Calculi: Cross-Validation of Data and Estimation of Accuracy. J Endourol. 2021;35(9):1307–1313. doi:10.1089/end.2020.1136

13. Collins GS, Reitsma JB, Altman DG, Moons K. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med. 2015;13(1):1. doi:10.1186/s12916-014-0241-z

14. TIBCO Software. What is a Random Forest? Available from: https://www.tibco.com/reference-center/what-is-a-random-forest. Accessed August 19, 2023.

15. Pushkar Mandot. How exactly XGBoost Works? 2019. Available from: https://medium.com/@pushkarmandot/how-exactly-xgboost-works-a320d9b8aeef. Accessed August 19, 2023.

16. Ziemba JB, Matlaga BR. Epidemiology and economics of nephrolithiasis. Investig Clin Urol. 2017;58(5):299–306. doi:10.4111/icu.2017.58.5.299

17. Rais-Bahrami S, Friedlander JI, Duty BD, Okeke Z, Smith AD. Difficulties with access in percutaneous renal surgery. Ther Adv Urol. 2011;3(2):59–68. doi:10.1177/1756287211400661

18. Thomas K, Smith NC, Hegarty N, Glass JM. The Guy's Stone Score: Grading the Complexity of Percutaneous Nephrolithotomy Procedures. Urology. 2011;78(2):277–281. doi:10.1016/j.urology.2010.12.026

19. Noureldin YA, Elkoushy MA, Andonian S. External validation of the S.T.O.N.E. nephrolithometry scoring system. J Canadian Urol Assoc. 2015;9(6):190–195. doi:10.5489/cuaj.2652

20. Ingimarsson JP, Dagrosa LM, Hyams ES, Pais VM. External validation of a preoperative renal stone grading system: reproducibility and inter-rater concordance of the Guy's stone score using preoperative computed tomography and rigorous postoperative stone-free criteria. Urology. 2014;83(1):45–49. doi:10.1016/j.urology.2013.09.008

21. Smith A, Averch TD, Shahrour K, et al. A nephrolithometric nomogram to predict treatment success of percutaneous nephrolithotomy. J Urol. 2013;190(1):149–156. doi:10.1016/j.juro.2013.01.047

22. Srivastava A, Yadav P, Madhavan K, et al. Inter-observer variability amongst surgeons and radiologists in assessment of Guy's Stone Score and S.T.O.N.E. nephrolithometry score: a prospective evaluation. Arab J Urol. 2020;18(2):118–123. doi:10.1080/2090598X.2019.1703278

Original post:
Predicting Stone-Free Status of Percutaneous Nephrolithotomy ... - Dove Medical Press

What is Image Annotation, and Why is it Important in Machine … – Ground Report

In a world where visual data is abundant and images carry stories, information, and insights, image annotation is the key to enabling machines to understand the language of pixels. In machine learning, and particularly in computer vision, image annotation, the laborious work of categorizing and contextualizing the visual features within images, has an outsized impact.

Artificial visual intelligence depends on the accurate interpretation of images, which machines need in order to detect, understand, and respond to the visual world. Image annotation acts as the conduit that makes this interpretation possible. More than just a technical step, this fundamental procedure closes the gap between human perception and machine cognition. Let's understand how.

Image annotation is the process of classifying or labeling an image for machine learning and deep learning. Image annotation services classify images using annotation tools, text labels, or both, presenting the data in a form your model can learn to recognize. During annotation, metadata is also added to the dataset.

Image annotation is also often called tagging or transcribing. Today, videos can be annotated just as easily as still images. Image annotation is typically done to train your ML models to identify and recognize what an image contains.

Once your machine learning model is deployed, you want it to recognize features in images that are not annotated, so that the model can decide what to do or take action as a result.

A large amount of data is used to train, test, and validate a machine learning model. Image annotation is primarily done so that models learn to recognize boundaries and objects; once these are recognized, ML models segment the images to extract their complete meaning.

Simple image annotation is the process of labeling an image with terms describing its object. For example, a cat's image can be annotated as a domestic house cat. This is also known as tagging or image classification.

This type of annotation is used for identifying, counting, and tracking multiple objects in an image. Here the ML model is trained on multiple datasets to distinguish between objects in an image.

For instance, you may have an image of products in your warehouse, and you want the ML model to identify those products or machinery and label them accordingly. For this, you will need data entry services, where all the data is stored for training your ML model. Your ML model uses this data to learn the various names and classify objects accordingly.
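To make this concrete, here is a small, hypothetical sketch of what one annotated warehouse image might look like; the field names loosely follow the COCO object-detection convention, and the file name, categories, and box coordinates are illustrative assumptions rather than output from any tool discussed here.

```python
import json

# Hypothetical annotation record for one warehouse image; the structure
# loosely follows the COCO convention (all names and values are assumptions).
annotation = {
    "image": {"id": 1, "file_name": "warehouse_001.jpg", "width": 1280, "height": 720},
    "annotations": [
        # each object gets a category label and a bounding box
        {"id": 101, "image_id": 1, "category_id": 3, "bbox": [412, 230, 180, 95]},
        {"id": 102, "image_id": 1, "category_id": 7, "bbox": [640, 410, 220, 130]},
    ],
    "categories": [{"id": 3, "name": "forklift"}, {"id": 7, "name": "pallet"}],
}

print(json.dumps(annotation, indent=2))  # bbox format: [x, y, width, height]
```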

You can annotate standard and multi-frame images and videos for your machine-learning model. Below are the two types of data used in image annotation:

The first is 2-D data: images, videos, and data from cameras or other technical devices such as single-lens reflex (SLR) cameras or optical microscopes.

The second is 3-D data: images and videos from cameras and other technical instruments such as ion, electron, or scanning probe microscopes.

Here are some of the reasons why image annotation is needed for ML models:

If you want your ML model to be effective in areas such as robotics, drones, and autonomous vehicles, it must identify the desired objects. Identifying them helps the ML model make decisions and take the necessary actions.

Image annotation helps the ML models to categorize and recognize the different objects in an image. Without image annotation, it can be difficult for the computer to identify and label many objects in a single image.

Hence, deep learning, a branch of ML, relies on annotated images. Annotation is used to identify these objects and make it easier for the computer to understand, locate, and categorize them. It is especially necessary when an image contains both living and nonliving objects.

Image annotation is needed to efficiently connect ML models with raw visual data. It gives companies the understanding their models require to predict accurately and make decisions, and it is an essential part of computer vision development because it determines how well your ML model will perform and improve.


See the original post:
What is Image Annotation, and Why is it Important in Machine ... - Ground Report

Detection of diabetic patients in people with normal fasting glucose … – BMC Medicine

Data collection and processing

The physical examination data were derived from three hospitals: the First Affiliated Hospital of Wannan Medical College, Beijing Luhe Hospital of Capital Medical University, and Daqing Oilfield General Hospital. The three datasets were named D1, D2, and D3, respectively. The first step was data cleaning, in which samples with missing or abnormal values were excluded. According to the WHO criteria for diagnosing prediabetes and diabetes, we screened the samples with normal fasting glucose (< 6.1 mmol/L) and classified them into two groups by HbA1c level with a threshold of 6.5%: diabetes patients (HbA1c ≥ 6.5%) and normal/healthy samples. After preprocessing, 61,059, 369, and 3247 samples were retained in D1, D2, and D3, which contained 603, 3, and 21 subjects with diabetes (the positive samples), respectively. We then split D1 into training, validation, and test sets in a 6:1:3 ratio using random stratified sampling. D2 and D3 were used as newly recruited independent test sets.

All datasets contained 27 physical examination characteristics, including sex, age, height, body mass index (BMI), fasting blood glucose (FBG), white blood cell count (WBC), neutrophil (NEU), absolute neutrophil count (ANC), lymphocyte (LYM), absolute lymphocyte count (ALC), monocyte (MONO), absolute monocyte count (AMC), eosinophil (EOS), absolute eosinophil count (AEC), basophil (BASO), absolute basophil count (ABC), hemoglobin (HGB), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), red cell distribution width (RDW), platelets (PLT), mean platelet volume (MPV), platelet distribution width (PDW), plateletcrit (PCT), red blood cell count (RBC), and mean corpuscular hemoglobin concentration (MCHC).

Given the severe class imbalance of all datasets, the SMOTE (synthetic minority over-sampling technique) method was employed on the training set to oversample the positive samples. SMOTE generates new samples for the minority class by interpolating between k-nearest neighbors [22], making the number of positive samples equal to that of the negative samples in the training set. The process was implemented with the imblearn package in Python. Finally, we applied Z-score normalization to all datasets, with the mean and standard deviation calculated from the training set.
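A minimal sketch of this preprocessing, assuming SMOTE's default of 5 nearest neighbors and illustrative variable names, might look as follows with the imblearn and scikit-learn packages:

```python
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import StandardScaler

def preprocess(X_train, y_train, X_val, X_test):
    # SMOTE synthesizes minority-class samples by interpolating between
    # k nearest neighbors, equalizing the classes in the training set.
    X_res, y_res = SMOTE(k_neighbors=5, random_state=0).fit_resample(X_train, y_train)

    # Z-score normalization with mean/std taken from the training data,
    # then reused for the validation and test sets, as described above.
    scaler = StandardScaler().fit(X_train)
    return (scaler.transform(X_res), y_res,
            scaler.transform(X_val), scaler.transform(X_test))
```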

With the physical examination data, we present a computational framework for identifying diabetic patients with NFG, as shown in Fig. 1. First, we preprocessed the three datasets D1, D2, and D3 as introduced above; D1 was divided into training, validation, and test sets in a 6:1:3 ratio, while D2 and D3 served as independent test sets for evaluating the final model. In view of the class imbalance, we applied an oversampling method to the training set. Then, several widely used machine learning methods, including logistic regression (LR), random forest (RF), support vector machine (SVM), and deep neural network (DNN), were used to construct the predictor. Next, we applied feature selection to the best-performing of the four predictors to improve the feasibility of the tool and assessed its performance on the independent test sets. Finally, feature importance analysis was used to screen variables relevant to the incidence of diabetes. We also devised a framework for identifying the risk factors of diabetes at the individual level and developed an online tool to support its use in clinical practice.

Overview of the DRING approach

To build the predictive model, four machine learning methods were initially employed: LR, RF, SVM, and DNN. LR is a variation of linear regression prominently used in classification tasks [23]; it finds the best fit describing the linear relationship between the response variable and the input features and then converts the output to a probability with a sigmoid function. RF is composed of numerous decision trees, each practically a collection of if-then conditions [24]. A decision tree recursively splits the data into subsets based on the best feature and criterion until a stopping criterion is met. In RF, each decision tree is trained independently on a random subset of samples and features, which reduces the risk of overfitting; the final decision is voted on by all trees, improving overall accuracy and robustness. SVM, one of the most popular machine learning methods, classifies samples by finding a hyperplane in feature space that maximizes the margin between points from different classes [25]. It can handle non-linearly separable data by using kernels such as the linear, polynomial, and radial basis functions, which map the original feature space into a high-dimensional space. The LR, RF, and SVM models were constructed with the scikit-learn package in Python 3.8, using default parameters during training. A DNN [26] contains an input layer, hidden layers, and an output layer, with many neurons in each layer and connections between neurons of adjacent layers. In a DNN, each connection is generally a linear transformation followed by an activation function. Here, we used the ReLU function to activate the linear neurons and a softmax function to output the prediction result. In addition, we used dropout and L2 regularization in the hidden layers to prevent overfitting, and residual blocks were added to the DNN to simplify training. The DNN was implemented with the PyTorch package. In this study, the DNN achieved its best performance with 6 layers and an initial learning rate of 0.0018. Loss on the training and validation sets is depicted in Additional file 1: Fig. S1. We chose the model with the best performance on the validation set for further optimization.
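The exact architecture is not fully specified in the text; the PyTorch sketch below only illustrates the named ingredients (residual blocks, ReLU, dropout, L2 regularization via weight decay, and the reported learning rate of 0.0018), with layer widths, dropout rate, and weight-decay strength as assumptions:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Fully connected residual block with ReLU and dropout."""
    def __init__(self, dim, p_drop=0.2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(dim, dim),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))  # skip connection eases training

class DiabetesDNN(nn.Module):
    def __init__(self, n_features=27, hidden=128, n_blocks=2, n_classes=2):
        super().__init__()
        self.inp = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.blocks = nn.Sequential(*[ResidualBlock(hidden) for _ in range(n_blocks)])
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # Returns logits; softmax is applied at inference time, while
        # nn.CrossEntropyLoss applies log-softmax internally during training.
        return self.out(self.blocks(self.inp(x)))

model = DiabetesDNN()
# weight_decay implements the L2 regularization mentioned in the text
optimizer = torch.optim.Adam(model.parameters(), lr=0.0018, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()
```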

Machine learning models for classification are commonly evaluated with multiple well-established metrics, for example sensitivity, accuracy, and the area under the receiver operating characteristic curve (AUC). Given the seriously unbalanced classes of the validation and test sets, we used sensitivity, specificity, balanced accuracy, AUC, and the area under the precision-recall curve (PR-AUC) to evaluate the models, calculated with the following formulas.

$$\mathrm{Sensitivity}=\mathrm{TPR}=\frac{TP}{TP+FN}$$

(1)

$$\mathrm{Specificity}=\mathrm{TNR}=\frac{TN}{TN+FP}$$

(2)

$$\mathrm{Balanced\ accuracy}=\frac{TPR+TNR}{2}$$

(3)

TP (true positive) is the number of correctly classified diabetes patients. FP (false positive) denotes the number of normal subjects who were predicted as diabetic. TN (true negative) represents the number of correctly classified healthy subjects. FN (false negative) is the number of diabetes patients who were classified as healthy. All of the above metrics range from 0 to 1.
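Formulas (1)-(3) translate directly into code; the sketch below derives them from a scikit-learn confusion matrix (the toy labels at the end are illustrative):

```python
from sklearn.metrics import confusion_matrix

def imbalance_metrics(y_true, y_pred):
    # For binary labels {0, 1}, ravel() yields tn, fp, fn, tp in this order
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)                         # TPR, formula (1)
    specificity = tn / (tn + fp)                         # TNR, formula (2)
    balanced_accuracy = (sensitivity + specificity) / 2  # formula (3)
    return sensitivity, specificity, balanced_accuracy

print(imbalance_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```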

Although the predictive model based on 27 features performed well, redundant or noisy features might still affect its decision making. To maximize the effective information of the features and simplify the model, we used manual curation and max-relevance min-redundancy (mRMR) [27] to extract the key features for the final model. For manual curation, we first selected the features with a significant difference between the positive and negative samples; to enhance the stability of the predictive model, we then removed features causing severe collinearity. As a result, 13 features were retained. For consistency, the size of the feature subset was also set to 13 when performing the mRMR analysis. In addition, feature selection was executed on the training set to reduce the risk of overfitting. Analysis of feature importance can interpret the prediction model and discover the features most relevant to diabetes; here, the importance of each feature was measured by its corresponding weight coefficient in the LR model.
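The authors' exact mRMR implementation is not given; the greedy sketch below only illustrates the principle, scoring each candidate feature by its mutual-information relevance to the label minus its mean redundancy with already-selected features, and it assumes discretized (binned) features for the redundancy term:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

def mrmr_select(X, y, n_select=13):
    """Greedy mRMR over a (n_samples, n_features) array of discretized features."""
    relevance = mutual_info_classif(X, y, discrete_features=True, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_select and remaining:
        best, best_score = None, -np.inf
        for j in remaining:
            # redundancy: mean mutual information with already-selected features
            redundancy = (np.mean([mutual_info_score(X[:, j], X[:, s]) for s in selected])
                          if selected else 0.0)
            if relevance[j] - redundancy > best_score:
                best, best_score = j, relevance[j] - redundancy
        selected.append(best)
        remaining.remove(best)
    return selected
```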

We developed an online tool, DRING (http://www.cuilab.cn/dring), based on the predictive models with the 13 features selected by manual curation and mRMR, where the former is the preferred option. The backend of the website was implemented in Python 2.7, and the interactive pages were built with HTML5, Bootstrap 4, and JavaScript.

Feature importance analysis can help to explain the model; however, it cannot explore the risk factors for incident diabetes at the individual level. To find the potential risk factors for a specific individual, we drew on the permutation feature importance (PFI) algorithm [24, 28], which is designed to quantify the importance of each variable in a dataset. Here, we adapted PFI to assess the contribution of each feature for an individual. Specifically, it involves the following four steps: (1) given a feature vector, create a series of random permutations for one of the features based on the input dataset; (2) calculate the prediction result for each new feature vector; (3) define the contribution of the permuted feature as in formula 4:

$$P=\left|P_{r}-\frac{1}{k}\sum_{i=1}^{k}P_{i}\right|$$

(4)

\(P_{r}\) is the risk score for diabetes calculated with the initial feature vector, here the predicted probability of diabetes; \(P_{i}\) is the prediction result of the ith permutation; and \(k\) is the number of permutations. (4) Perform the above steps iteratively for each feature. Here, we set k to 100,000. A feature with a higher value contributes more to the risk of diabetes.
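A sketch of this individual-level adaptation, assuming a scikit-learn-style model with a predict_proba method and a reference dataset from which permuted values are drawn, could look like this:

```python
import numpy as np

def individual_contributions(model, x, X_ref, k=100_000, seed=0):
    """Contribution of each feature of one individual's vector x (formula 4).

    model: fitted classifier exposing predict_proba; X_ref: dataset whose
    column values are drawn to permute each feature. Names are illustrative.
    """
    rng = np.random.default_rng(seed)
    p_r = model.predict_proba(x.reshape(1, -1))[0, 1]   # baseline risk score
    contributions = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        x_perm = np.tile(x, (k, 1))
        x_perm[:, j] = rng.choice(X_ref[:, j], size=k)  # permute feature j
        p_i = model.predict_proba(x_perm)[:, 1]
        contributions[j] = abs(p_r - p_i.mean())        # formula (4)
    return contributions
```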

More here:
Detection of diabetic patients in people with normal fasting glucose ... - BMC Medicine

Addressing gaps in data on drinking water quality through data … – Nature.com

Input data

The analytical framework begins with input data and continues to data preparation, modeling, and application (Fig. 5). The study uses the Ethiopia Socioeconomic Survey (ESS), a collaboration between the Central Statistics Agency and the World Bank under the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA) project. ESS began in 2011/12, and the first wave (ESS1) covered rural and small-town areas. The survey was expanded to include medium and large towns in 2013/14 (ESS2). The 2013/14 sample households were visited again in 2015/16 (ESS3), during which the water quality module was implemented. The survey was fielded again in 2018/19 (ESS4) with a refreshed sample. This study is primarily based on the 2016 survey (ESS3) and the associated water quality survey [18, 28]. In this study, ESS2 is the Earlier Survey, ESS3 is the Reference Survey, and ESS4 is the Latest Survey. ESS1 was not used because it did not cover medium and large towns. See the Data Availability section for further information on these data sources, including metadata.

Methodological workflow from input data to model application.

ESS is a multi-topic household survey with a range of individual- and household-level socioeconomic and demographic information. This includes basic individual-level demographic information on household structure, education, health, and labor market outcomes, as well as household-level information such as household assets, consumption expenditure, dwelling characteristics, and access to electricity, water, and sanitation facilities. ESS data also come with a range of geospatial variables constructed by mapping each household's location to other data available for the area. These include, among other things, rainfall, temperature, greenness, wetness, altitude, population density, and the household's proximity to the nearest major road, urban center, and market center. In addition, the 2015/16 survey (ESS3), the main focus of this study, implemented a water quality module that included microbial and chemical tests to measure water quality. The microbial test covered the presence of E. coli, WHO's preferred indicator of fecal contamination [5].

The response variable in this study is the presence of E. coli contamination at the point of collection. Contaminated drinking water refers to the detection of E. coli in water samples collected from the household's drinking water source.

The objective of this study was to develop a predictive model for drinking water contamination from minimal socioeconomic information. Therefore, only features that are often included in household surveys are considered. For example, the 2015/16 water quality module has some information on the chemical and physical characteristics of the water. These variables were not included in the training dataset because they are not usually available in other surveys. Therefore, the data preparation for this study considered only selected variables.

Data preparation activities included pre-processing, data splitting, and dimension reduction. The pre-processing step involved constructing some variables from existing variables, variable transformation, and treating missing values by imputation or dropping them from the analysis. Constructed variables included wealth index and open defecation in the area. The wealth index was constructed from selected assets using principal component analysis. Open defecation in the area is an enumeration area (EA) level variable and indicates the proportion of households in the EA who do not have a toilet facility. Variables that were transformed include the water source type. For example, we combined boreholes, protected springs and wells into a single category given the comparatively low number of respondents and in order to harmonize responses across the three waves of the survey. Similarly, unprotected springs and wells were combined. Consequently, the water source type list included in the model selection analysis had fewer categories than in the raw data.
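The study's workflow is in R; as a language-agnostic illustration, the Python sketch below shows the common construction of a PCA-based wealth index as the first principal component of standardized asset indicators (the asset matrix here is simulated, and the authors' exact asset list is not reproduced):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def wealth_index(assets):
    """First principal component score of standardized asset indicators."""
    z = StandardScaler().fit_transform(assets)
    return PCA(n_components=1).fit_transform(z).ravel()

# Illustrative use with simulated 0/1 ownership indicators for 8 assets
scores = wealth_index(np.random.binomial(1, 0.4, size=(200, 8)))
```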

To assess how the classifiers generalize to unseen data, the pre-processed data was split into training and test datasets, stratified by the distribution of the response variable: 80% of the data was assigned to the training dataset and the remaining 20% to the test dataset. The training dataset was used to train the classifiers and estimate the hyperparameters, and the test dataset was used to evaluate the classifiers and obtain an independent assessment of how well they predicted the positive class (contaminated drinking water source). To reduce the dimension of the processed data, the Boruta feature selection algorithm was used. The final list of features used in the analysis is presented in Supplementary Table 1.
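The study implements this in R; the following Python sketch illustrates the same two steps, an 80/20 split stratified on the response and Boruta feature selection, using scikit-learn and the boruta package as stand-ins (parameter choices are assumptions):

```python
import numpy as np
from boruta import BorutaPy
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def split_and_select(X, y, seed=0):
    X, y = np.asarray(X), np.asarray(y)
    # 80/20 split, stratified by the distribution of the response variable
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed)
    # Boruta wraps a random forest to confirm or reject each feature
    rf = RandomForestClassifier(n_jobs=-1, class_weight="balanced", max_depth=5)
    selector = BorutaPy(rf, n_estimators="auto", random_state=seed)
    selector.fit(X_tr, y_tr)
    return X_tr[:, selector.support_], X_te[:, selector.support_], y_tr, y_te
```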

We examined several commonly used classification algorithms, including GLM, GLMNET, KNN, SVM, and two decision-tree-based classifiers, RF and XGBoost. To obtain the optimal values of the classifiers' hyperparameters that maximize the area under the ROC curve, we tuned the non-linear classifiers using a regular grid search.
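As an illustration of a regular grid search that maximizes ROC AUC, here is a scikit-learn sketch using the SVM's cost and polynomial-degree hyperparameters mentioned below; the grid values are assumptions, and the study itself carried this out in R:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Regular grid over the SVM's cost (C) and polynomial degree; ROC AUC is
# the selection criterion, matching the tuning objective described above.
param_grid = {"C": [0.1, 1.0, 10.0], "degree": [2, 3, 4]}
search = GridSearchCV(SVC(kernel="poly"), param_grid, scoring="roc_auc", cv=5)
# search.fit(X_train, y_train); search.best_params_ holds the tuned values
```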

The GLM uses a parametric model allowing different link functions for the response variable. For classification, the response values are categorical; in this study we have a binary classification problem, contaminated versus non-contaminated, so logistic regression is used as the reference model. The glm R package was used in this study [30].

The GLMNET classifier fits GLMs via penalized maximum likelihood. The lasso and elastic net are popular penalized (regularized) linear regression models that add penalties to the loss function during training; this promotes simpler models with better accuracy and removes features that are highly correlated. We used the glmnet R package for the GLMNET classifier and tuned two hyperparameters: penalty (the regularization parameter) and mixture (the relative amount of the penalties).

KNN is one of the most widely used non-parametric classifiers. It defines similarity as close proximity: it classifies a new case or data point based on its distance to the majority of its k nearest neighbors in the training set. We used the kknn package in R and tuned two hyperparameters: neighbors (the number of nearest neighbors) and weight_func (the distance weighting function).

SVM is another classification method that uses distances to the nearest training data points. It classifies data points using hyperplanes with the maximum margin between classes in a high-dimensional feature space [31], and it works for cases that are not linearly separable. In this study, we used a non-linear kernel via the kernlab package in R and tuned two hyperparameters: cost and degree (the polynomial degree).

RF is an ensemble method that builds multiple decision trees by sampling the original dataset multiple times with replacement [32]. Each decision tree is therefore trained on a subset of the original dataset, and the trees aim to separate the classes as much as possible. RF combines the trees at the end by taking the majority vote. Although a large number of trees slows the process, more trees in the forest help improve overall accuracy and prevent overfitting. We used the ranger package in R, which also provides feature importance, and tuned the following hyperparameters: mtry (the number of randomly selected predictors) and min_n (the minimal node size), with trees set to 1000.

XGBoost is another ensemble machine learning method; it uses the gradient of a loss function that measures performance [33]. Unlike ensemble methods that train models in isolation from one another, XGBoost (boosting) trains models sequentially, each new model correcting the errors made by the previous ones, until no further improvement is possible. XGBoost is generally fast to execute and gives good accuracy. In this study, we used XGBClassifier from the xgboost package in R. The xgboost package has few tunable parameters, and we tuned two of them: trees (the number of trees) and tree_depth (the tree depth).
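The study's tuning was done in R; the Python sketch below simply mirrors the two tuned hyperparameters named here, with the number of trees and tree depth mapped to n_estimators and max_depth, and all numeric values as placeholders rather than the tuned results:

```python
from xgboost import XGBClassifier

# "trees" and "tree_depth" in the text correspond to n_estimators and
# max_depth here; the values below are illustrative assumptions.
clf = XGBClassifier(n_estimators=500, max_depth=4, learning_rate=0.1,
                    eval_metric="auc")
# clf.fit(X_train, y_train); proba = clf.predict_proba(X_test)[:, 1]
```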

The classification algorithms are evaluated using metrics calculated from the four outcomes of the confusion matrix: (i) true positive (TP), correctly predicted as contaminated; (ii) true negative (TN), correctly predicted as not contaminated; (iii) false positive (FP), wrongly predicted as contaminated; and (iv) false negative (FN), wrongly predicted as not contaminated. Because our data are class-imbalanced, we used a combination of metrics to evaluate the models: accuracy, sensitivity (also known as recall or the true positive rate, TPR), specificity (the true negative rate, TNR), the F1 score, and the area under the curve (AUC) of the receiver operating characteristic (ROC). The positive cases are more important than the negative cases, and the goal is to ensure the best-performing model maximizes the TPR. Finally, given the imbalanced classes, we implemented resampling techniques [17], including upsampling the minority class and downsampling the majority class (see Supplementary Tables 3 and 4). However, there were no significant improvements in the prediction results. The AUC for the RF model is 0.90 (95% CI 0.88, 0.93) with both the upsampling and downsampling techniques. Similarly, the AUC for the XGBoost model is 0.90 (95% CI 0.87, 0.92) for upsampling and 0.89 (95% CI 0.86, 0.92) for downsampling. These are similar to the main results reported in Table 2.

The analyses were conducted with the R programming language.

See the rest here:
Addressing gaps in data on drinking water quality through data ... - Nature.com