Category Archives: Machine Learning
Smarter AI: Choosing the Best Path to Optimal Deep Learning – SciTechDaily
Researchers have improved deep learning by selecting the most efficient overall path to the output, leading to a more effective AI without added layers.
Deep Learning (DL) performs classification tasks using a series of layers. To effectively execute these tasks, local decisions are performed progressively along the layers. But can we perform an all-encompassing decision by choosing the most influential path to the output rather than performing these decisions locally?
In an article published today (August 31) in the journal Scientific Reports, researchers from Bar-Ilan University in Israel answer this question with a resounding yes. Pre-existing deep architectures have been improved by updating the most influential paths to the output.
Like climbing a mountain via the shortest possible path, improving classification tasks can be achieved by training the most influential path to the output, and not just by learning with deeper networks. Credit: Prof. Ido Kanter, Bar-Ilan University
"One can think of it as two children who wish to climb a mountain with many twists and turns. One of them chooses the fastest local route at every intersection, while the other uses binoculars to see the entire path ahead and picks the shortest and most significant route, just like Google Maps or Waze. The first child might get a head start, but the second will end up winning," said Prof. Ido Kanter, of Bar-Ilan's Department of Physics and Gonda (Goldschmied) Multidisciplinary Brain Research Center, who led the research.
"This discovery can pave the way for better enhanced AI learning, by choosing the most significant route to the top," added Yarden Tzach, a PhD student and one of the key contributors to this work.
This exploration of a deeper comprehension of AI systems by Prof. Kanter and his experimental research team, led by Dr. Roni Vardi, aims to bridge between the biological world and machine learning, thereby creating an improved, advanced AI system. To date they have discovered evidence for efficient dendritic adaptation using neuronal cultures, as well as how to implement those findings in machine learning, showing how shallow networks can compete with deep ones, and finding the mechanism underlying successful deep learning.
Enhancing existing architectures using global decisions can pave the way for improved AI, which can improve its classification tasks without the need for additional layers.
Reference: "Enhancing the accuracies by performing pooling decisions adjacent to the output layer," 31 August 2023, Scientific Reports. DOI: 10.1038/s41598-023-40566-y
Read more:
Smarter AI: Choosing the Best Path to Optimal Deep Learning - SciTechDaily
Electronic health records and stratified psychiatry: bridge to … – Nature.com
Development of an ML prediction model involves a multi-step process [11]. Briefly, labeled data are partitioned into training and test subsets. The data subsets undergo preprocessing to minimize the impact of dataset anomalies (e.g., missing values, outliers, redundant features) on the algorithm's learning process. The algorithm is applied to the training data, learning the relationship between the features and the predictive target. Performance is typically evaluated via cross-validation to estimate the model's performance on new observations (internal validation). However, this only approximates a model's ability to generalize to unseen data. Prediction models must demonstrate the ability to generalize to independent datasets (external validation) [12]. Ideally, external validation should occur in a separate study by a different analytic team [13]. Clinical validation involves assessing a model's generalization to real-world data as well as potential clinical utility and impact. Randomized cluster trials, for instance, evaluate groups of patients randomly assigned to receive care based on a model's prediction versus care-as-usual.
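To make the workflow above concrete, here is a minimal scikit-learn sketch of the split, internal cross-validation, and held-out evaluation steps. The dataset, model, and metric are illustrative assumptions rather than anything from the cited studies, and the held-out set only approximates generalization; external validation would still require an independent dataset and team.

```python
# Minimal sketch of model development and internal validation.
# Dataset, model, and metric are illustrative assumptions, not from the cited studies.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Partition labeled data into training and held-out test subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Internal validation: cross-validation on the training data only.
cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
print("Cross-validated AUC:", round(cv_auc.mean(), 3))

# The held-out test set approximates generalization to new observations;
# true external validation requires an independent dataset.
model.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print("Held-out test AUC:", round(test_auc, 3))
```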
Few examples exist of predictive ML models advancing to clinical validation in psychiatry, indicative of a sizeable translational gap. Delgadillo et al. compared the efficacy and cost of stratified care versus stepped care for a psychological intervention for depression (n=951 patients) in a cluster randomized trial [14]. The investigators had previously developed an ML prediction model to classify patients as standard or complex cases using self-reported measures and sociodemographic information extracted from clinical records (n=1512 patients) [15]. In the prospective trial, complex cases were matched to high-intensity treatment and standard cases to low-intensity treatment. Stratified care was associated with a 7% increase in the probability of improvement in depressive symptoms at a modest ~$140 increase in cost per patient [14].
What is driving this translational gap? Much of it may relate to challenges in generalizing models beyond their initial training data. There are no silver bullets in the development of ML prediction models, and there are many potential pitfalls. The most common are overfitting and over-optimism due to insufficient training data, excess model complexity, improper (or absent) cross-validation, and/or data leakage [16,17,18].
Most published ML studies in psychiatry suffer from these methodological flaws [3,4,5]. Tornero-Costa et al. reviewed 153 ML applications in mental health and found only one study to be at low risk of bias by the Prediction model Risk Of Bias ASsessment Tool (PROBAST) criteria [3]. Approximately 37.3% of studies used a sample size of 150 or less to train models. Details on preprocessing were completely absent in 36.6% of studies, and 47.7% lacked a description of data missingness. Only 13.7% of studies attempted external validation. Flaws in the analysis domain (e.g., attempts to control overfitting and optimism) contributed significantly to bias risk in most applications (90.8%). Furthermore, in 82.3% of the studies, the data and the developed model were not publicly accessible. Two other systematic reviews also found overall high risk of bias (>90%) among ML prediction studies, including poor reporting of preprocessing steps as well as low rates of internal and external validation [4, 5]. Meehan et al. additionally reported that only 22.7% of studies (of those meeting statistical standards) appropriately embedded feature selection within cross-validation to avoid data leakage [5].
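The point about embedding feature selection within cross-validation is easy to get wrong in practice. The sketch below shows the leakage-free pattern using generic scikit-learn components (an assumed, simplified setup, not the pipelines of the reviewed studies): the selector is refit inside every fold, so held-out data never influence which features are chosen.

```python
# Feature selection nested inside cross-validation via a Pipeline,
# so each held-out fold plays no part in choosing the features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=500, n_informative=10, random_state=0)

leak_free = Pipeline([
    ("select", SelectKBest(f_classif, k=20)),        # refit on each training fold only
    ("clf", LogisticRegression(max_iter=2000)),
])

scores = cross_val_score(leak_free, X, y, cv=5, scoring="roc_auc")
print("Leakage-free CV AUC:", round(scores.mean(), 3))
```

The common mistake is to run the selector on the full dataset first and then cross-validate on the reduced matrix; with many noisy features, that inflates the estimated performance.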
The precise degree to which published ML prediction models overestimate their ability to generalize is difficult to estimate. In the area of prognosis prediction, Rosen et al. assessed 22 published prediction models of transition to psychosis in individuals at clinical high risk [19]. The models were externally validated on data from a multisite, naturalistic study. Only two models demonstrated good (AUC >= 0.7) performance, and nine models failed to achieve better than chance (AUC = 0.5) prediction. None of the models outperformed the clinician raters (AUC = 0.75) [19].
The model development process is vulnerable to human inductive biases, which can inflate model performance estimates due to unintentional errors or deliberate gaming for publication [17, 20]. Performance scores have become inappropriately prioritized in peer review due to the erroneous assumption that higher is always better. Most studies employ a single algorithm without justifying its selection, or compare multiple algorithms' performance on the same dataset and then select the best-performing one (a multiple testing issue) [17, 21]. Software packages like PyCaret (Python) offer the ability to screen the performance of a dozen or more algorithms on a dataset in a single step. This analytic flexibility creates risk, because even random data can be tuned to significance solely through manipulation of hyperparameters [17].
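To illustrate the multiple-testing concern, the sketch below nests the "pick the best algorithm and hyperparameters" step inside an outer cross-validation loop, so the reported score is not the same score that was used to choose the winner. The candidate models and grids are arbitrary placeholders.

```python
# Nested cross-validation: model and hyperparameter selection in the inner loop,
# honest performance estimation in the outer loop. Candidates are arbitrary examples.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=30, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression(max_iter=2000))])

# The inner search ranges over both an algorithm switch and its hyperparameters.
param_grid = [
    {"clf": [LogisticRegression(max_iter=2000)], "clf__C": [0.01, 0.1, 1.0]},
    {"clf": [RandomForestClassifier(random_state=0)], "clf__n_estimators": [100, 300]},
]
inner = GridSearchCV(pipe, param_grid, cv=3, scoring="roc_auc")

# The outer loop evaluates the entire selection procedure, not just the chosen model.
outer_scores = cross_val_score(inner, X, y, cv=5, scoring="roc_auc")
print("Nested CV AUC:", round(outer_scores.mean(), 3))
```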
Methodological shortcomings offer only a partial explanation for the observed translational gap. As the saying goes, garbage in, garbage out. Low-quality, small, or biased training data can generate unreliable models with poor generalization to new observations or, worse, make unfair predictions that adversely impact patients. Ideal ML training data are large, representative of the population of interest, complete (low missingness), balanced, and possess accurate and consistent feature and predictive target labels or values (low noise). Per the systematic reviews above, these data quality criteria have often been neglected [3,4,5].
EHR data share many of the same quality issues impacting data collected explicitly for research, as well as some unique challenges that have deterred its use for ML in the past [22,23,24]. EHR data are highly heterogeneous, encompassing both structured and unstructured elements. Structured data is collected through predefined fields (e.g., demographics, diagnoses, lab results, medications, sensor readings). Unstructured data is effectively everything else, including imaging and text. Extracting meaningful features from unstructured EHR data is non-trivial and often requires supervised and unsupervised ML techniques.
The quality of EHR data can vary by physician and clinical site. Quality challenges with EHR data that can adversely impact ML models for stratified psychiatry include:
EHR populations are non-random samples, which may create differences between the training data population and the target population [25]. Patients with more severe symptoms or treatment resistance may be frequently referred. Factors other than need for treatment (e.g., insurance status, referral, specialty clinics) can lead to systematic overrepresentation or underrepresentation of certain groups or disorders in the data. Marginalized populations, such as racial and ethnic minorities, for example, face barriers to accessing care and may be absent in the data [26]. When an algorithm trains on data that is not diverse, the certainty of the model's predictions is questionable for unrepresented groups (high epistemic uncertainty) [27]. This may lead to unfair predictions (algorithmic bias) [28].
Missing data are common in EHRs. The impacts of missing data on model performance can be severe, especially when the data are missing not at random, or missing at random but with a high proportion of missing values [29]. Furthermore, the frequency of records can vary substantially by patient: one individual may have multiple records in a period, while others may have none [30]. Does absence of a diagnosis indicate true lack of a disorder, or simply reflect that the patient received care elsewhere during a given interval? Structured self-reported patient outcome measures (e.g., psychometric measures) are often missing or incomplete [31].
Feature and target labels or values provide the ground truth for learning. Inaccuracies and missingness generate noise, which can hinder effective learning. The lineage of a given data element is important in considering its reliability and validity. For example, a patient's diagnoses may be extracted from clinical notes, encounter/billing data, or problem lists (often not dated or updated) [32]. In some cases, the evaluating practitioner enters the encounter-associated diagnostic codes; in other instances, these are abstracted by a medical billing agent, creating uncertainty.
Imaging and sensor-based data may be collected using different acquisition parameters and equipment, leading to variability in measurements across EHRs and over time [33]. Data may be collected using different coding systems (e.g., DSM, ICD), the criteria for which also change over time. These issues can hinder external validation as well as contribute to data drift with the potential for deterioration in model performance [34].
When data are imbalanced, ML classification models may be more likely to predict the majority class, resulting in a high accuracy but low sensitivity or specificity for the minority class [35]. The consequences of data imbalance can be severe, particularly when the minority class is the most clinically relevant (e.g., patients with suicidal ideation who go on to attempt, adverse drug reactions).
Patient records represent a sequence of events over time [36]. Diagnostic clarification may create conflicts (e.g., depression later revealed to be bipolar disorder), depending on the forward and lookback windows used to create a dataset. Failure to appropriately account for the longitudinal nature of a patient's clinical course can contribute to data leakage. Temporal data leakage occurs when future information is inadvertently used to make predictions for past events (e.g., including a future co-morbidity when predicting response to past treatment). Feature leakage occurs when variables expose information about the prediction target.
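Two of the pitfalls listed above, class imbalance and temporal leakage, can be guarded against explicitly during dataset construction. The sketch below uses a made-up, EHR-style table with hypothetical column names: features are built only from events dated before each patient's index date, and the classifier reweights the rare outcome rather than defaulting to the majority class.

```python
# Hedged sketch on a made-up EHR-style table; all column names and values are hypothetical.
# (1) Temporal guard: features come only from events dated before each patient's index date.
# (2) Imbalance guard: reweight classes so a rare outcome is not simply ignored.
import pandas as pd
from sklearn.linear_model import LogisticRegression

events = pd.DataFrame({
    "patient_id": [1, 1, 2, 2, 3],
    "event_date": pd.to_datetime(
        ["2020-01-05", "2021-06-01", "2019-11-20", "2019-12-15", "2020-07-01"]),
    "diagnosis_code": ["F32.1", "F31.9", "F32.1", "E11.9", "F32.1"],
})
cohort = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "index_date": pd.to_datetime(["2020-06-01", "2020-01-01", "2020-12-01"]),
    "responded": [1, 0, 1],   # would be heavily imbalanced in real data
})

# Keep only events strictly before the index date, so future information cannot leak in.
merged = events.merge(cohort, on="patient_id")
history = merged[merged["event_date"] < merged["index_date"]]

# Crude illustrative features: counts of prior diagnosis codes per patient.
X = pd.crosstab(history["patient_id"], history["diagnosis_code"])
X = X.reindex(cohort["patient_id"], fill_value=0)
y = cohort.set_index("patient_id")["responded"]

# class_weight="balanced" keeps the minority outcome from being ignored.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X, y)
```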
Empirical evidence indicates that preprocessing techniques can just as easily mitigate as exacerbate underlying data quality and bias issues. For example, missing data may be handled by complete case analysis (i.e., removal of observations with missing features) or imputation [37]. If data are not missing completely at random, deletion may eliminate key individuals [29]. Fernando et al. found that records containing missing data tended to be fairer than complete records and that their removal could contribute to algorithmic bias [38]. In the case of imputation, if the estimated values do not accurately represent the true underlying data, replacing missing values may inject error (e.g., imputing scores for psychometric scale items absent due to skip logic) and impact feature selection [39].
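As a small illustration of that trade-off, the sketch below contrasts complete-case deletion with median imputation on a synthetic feature matrix; which option is defensible depends on the missingness mechanism, which no code can determine on its own.

```python
# Complete-case analysis vs. simple imputation on a toy feature matrix.
# Values are synthetic; the right choice depends on why the data are missing.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([
    [25.0, 110.0, np.nan],
    [31.0, np.nan, 0.7],
    [47.0, 140.0, 0.9],
    [52.0, 150.0, 1.1],
])

# Complete-case analysis: drop any row with a missing value.
complete_cases = X[~np.isnan(X).any(axis=1)]
print("Rows kept by complete-case analysis:", len(complete_cases))  # 2 of 4

# Median imputation: keep all rows, fill gaps with per-column medians.
imputer = SimpleImputer(strategy="median")
X_imputed = imputer.fit_transform(X)
print(X_imputed)

# Neither option is safe for items that are missing by design (e.g., skip logic);
# those may need to be encoded explicitly rather than imputed.
```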
EHR data often require the creation of proxy features and outcomes to capture concepts (e.g., continuous prescription refills as an indicator of treatment effectiveness) or to reduce feature and label noise [40, 41]. No standards currently exist to guide such decisions or their reporting, creating high risk for bias. For example, if attempting to determine cannabis use when a patient was treated with a given antidepressant, one could check for a DSM/ICD diagnosis in their encounters or problem list, mine clinical notes to see whether use was endorsed/denied, or examine urine toxicology for positive/negative results. Each choice carries a different degree of uncertainty. Absence of evidence does not indicate evidence of absence [42], although studies often make that assumption.
See the original post here:
Electronic health records and stratified psychiatry: bridge to ... - Nature.com
Working with Undirected graphs in Machine Learning part2 – Medium
Author : Shyan Akmal, Virginia Vassilevska Williams, Ryan Williams, Zixuan Xu
Abstract : The k-Detour problem is a basic path-finding problem: given a graph G on n vertices, with specified nodes s and t, and a positive integer k, the goal is to determine if G has an s-t path of length exactly dist(s,t)+k, where dist(s,t) is the length of a shortest path from s to t. The k-Detour problem is NP-hard when k is part of the input, so researchers have sought efficient parameterized algorithms for this task, running in f(k)·poly(n) time, for f as slow-growing as possible. We present faster algorithms for k-Detour in undirected graphs, running in 1.853^k·poly(n) randomized and 4.082^k·poly(n) deterministic time. The previous fastest algorithms for this problem took 2.746^k·poly(n) randomized and 6.523^k·poly(n) deterministic time [Bezáková-Curticapean-Dell-Fomin, ICALP 2017]. Our algorithms use the fact that detecting a path of a given length in an undirected graph is easier if we are promised that the path belongs to what we call a bipartitioned subgraph, where the nodes are split into two parts and the path must satisfy constraints on those parts. Previously, this idea was used to obtain the fastest known algorithm for finding paths of length k in undirected graphs [Björklund-Husfeldt-Kaski-Koivisto, JCSS 2017]. Our work has direct implications for the k-Longest Detour problem: in this problem, we are given the same input as in k-Detour, but are now tasked with determining if G has an s-t path of length at least dist(s,t)+k. Our results for k-Detour imply that we can solve k-Longest Detour in 3.432^k·poly(n) randomized and 16.661^k·poly(n) deterministic time. The previous fastest algorithms for this problem took 7.539^k·poly(n) randomized and 42.549^k·poly(n) deterministic time [Fomin et al., STACS 2022].
2. Learning Spanning Forests Optimally using CUT Queries in Weighted Undirected Graphs (arXiv)
Author : Hang Liao, Deeparnab Chakrabarty
Abstract : In this paper we describe a randomized algorithm which returns a maximal spanning forest of an unknown weighted undirected graph making O(n) CUT queries in expectation. For weighted graphs, this is optimal due to a result in [Auza and Lee, 2021] which shows an Ω(n) lower bound for zero-error randomized algorithms. To our knowledge, it is the only regime of this problem where we have upper and lower bounds tight up to constants. These questions have been extensively studied in the past few years, especially due to the problem's connections to symmetric submodular function minimization. We also describe a simple polynomial time deterministic algorithm that makes O(n log n log log n) queries on undirected unweighted graphs and returns a maximal spanning forest, thereby (slightly) improving upon the state-of-the-art.
See the rest here:
Working with Undirected graphs in Machine Learning part2 - Medium
Generative AI at an inflection point: What’s next for real-world … – VentureBeat
Generative AI is gaining wider adoption, particularly in business.
Most recently, for instance, Walmart announced that it is rolling out a gen AI app to 50,000 non-store employees. As reported by Axios, the app combines data from Walmart with third-party large language models (LLMs) and can help employees with a range of tasks, from speeding up the drafting process, to serving as a creative partner, to summarizing large documents and more.
Deployments such as this are helping to drive demand for the graphics processing units (GPUs) needed to train powerful deep learning models. GPUs are specialized computing processors that execute programming instructions in parallel, instead of sequentially as traditional central processing units (CPUs) do.
According to the Wall Street Journal, training these models can cost companies billions of dollars, thanks to the large volumes of data they need to ingest and analyze. This includes all deep learning and foundational LLMs, from GPT-4 to LaMDA, which power the ChatGPT and Bard chatbot applications, respectively.
The gen AI trend is providing powerful momentum for Nvidia, the dominant supplier of these GPUs: The company announced eye-popping earnings for its most recent quarter. At least for Nvidia, it is a time of exuberance, as it seems nearly everyone is trying to get ahold of its GPUs.
Erin Griffith wrote in the New York Times that start-ups and investors are taking extraordinary measures to obtain these chips: "More than money, engineering talent, hype or even profits, tech companies this year are desperate for GPUs."
In his Stratechery newsletter this week, Ben Thompson refers to this as "Nvidia on the Mountaintop." Adding to the momentum, Google and Nvidia announced a partnership whereby Google's cloud customers will have greater access to technology powered by Nvidia's GPUs. All of this points to the current scarcity of these chips in the face of surging demand.
Does this current demand mark the peak moment for gen AI, or might it instead point to the beginning of the next wave of its development?
Nvidia CEO Jensen Huang said on the company's most recent earnings call that this demand marks the dawn of accelerated computing. He added that it would be wise for companies to divert capital investment from general-purpose computing and focus it on generative AI and accelerated computing.
General purpose computing is a reference to CPUs that have been designed for a broad range of tasks, from spreadsheets to relational databases to ERP. Nvidia is arguing that CPUs are now legacy infrastructure, and that developers should instead optimize their code for GPUs to perform tasks more efficiently than traditional CPUs.
GPUs can execute many calculations simultaneously, making them perfectly suited for tasks like machine learning (ML), where millions of calculations are performed in parallel. GPUs are also particularly adept at certain types of mathematical calculations such as linear algebra and matrix manipulation tasks that are fundamental to deep learning and gen AI.
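A small, hedged illustration of that parallelism: the same dense matrix multiplication timed on a CPU and, when one is available, on a GPU. PyTorch is used here purely as a convenient example of a GPU-aware library; the article does not name a framework.

```python
# The same dense linear-algebra workload on CPU and (if present) GPU.
# PyTorch is an illustrative choice; any GPU-aware library would do.
import time
import torch

def time_matmul(device: str, n: int = 2048) -> float:
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()          # ensure accurate timing on the GPU
    start = time.perf_counter()
    _ = a @ b                             # thousands of dot products computed in parallel
    if device == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start

print("CPU seconds:", time_matmul("cpu"))
if torch.cuda.is_available():
    print("GPU seconds:", time_matmul("cuda"))
else:
    print("No CUDA device available; GPU comparison skipped.")
```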
However, other classes of software (including most existing business applications), are optimized to run on CPUs and would see little benefit from the parallel instruction execution of GPUs.
Thompson appears to hold a similar view: "My interpretation of Huang's outlook is that all of these GPUs will be used for a lot of the same activities that are currently run on CPUs; that is certainly a bullish view for Nvidia, because it means the capacity overhang that may come from pursuing generative AI will be backfilled by current cloud computing workloads."
He continued: "That noted, I'm skeptical: Humans and companies are lazy, and not only are CPU-based applications easier to develop, they are also mostly already built. I have a hard time seeing what companies are going to go through the time and effort to port things that already run on CPUs to GPUs."
Matt Asay of InfoWorld reminds us that we have seen this before. When machine learning first arrived, data scientists applied it to everything, even when there were far simpler tools. As data scientist Noah Lorang once argued, "There is a very small subset of business problems that are best solved by machine learning; most of them just need good data and an understanding of what it means."
The point is, accelerated computing and GPUs are not the answer for every software need.
Nvidia had a great quarter, boosted by the current gold-rush to develop gen AI applications. The company is naturally ebullient as a result. However, as we have seen from the recent Gartner emerging technology hype cycle, gen AI is having a moment and is at the peak of inflated expectations.
According to Singularity University and XPRIZE founder Peter Diamandis, these expectations are about seeing future potential with few of the downsides. At that moment, hype starts to build an unfounded excitement and inflated expectations.
To this very point, we could soon reach the limits of the current gen AI boom. As venture capitalists Paul Kedrosky and Eric Norlin of SK Ventures wrote on their firm's Substack: "Our view is that we are at the tail end of the first wave of large language model-based AI. That wave started in 2017, with the release of the [Google] transformers paper ('Attention is All You Need'), and ends somewhere in the next year or two with the kinds of limits people are running up against."
Those limitations include the tendency toward hallucinations, inadequate training data in narrow fields, sunsetted training corpora from years ago, and myriad other reasons. They add: "Contrary to hyperbole, we are already at the tail end of the current wave of AI."
To be clear, Kedrosky and Norlin are not arguing that gen AI is at a dead end. Instead, they believe there need to be substantial technological improvements to achieve anything better than "so-so automation" and limited productivity growth. The next wave, they argue, will include new models, more open source, and notably ubiquitous and cheap GPUs, which, if correct, may not bode well for Nvidia but would benefit those needing the technology.
As Fortune noted, Amazon has made clear its intentions to directly challenge Nvidias dominant position in chip manufacturing. They are not alone, as numerous startups are also vying for market share as are chip stalwarts including AMD. Challenging a dominant incumbent is exceedingly difficult. In this case, at least, broadening sources for these chips and reducing prices of a scarce technology will be key to developing and disseminating the next wave of gen AI innovation.
The future for gen AI appears bright, despite hitting a peak of expectations and the existing limitations of the current generation of models and applications. The reasons behind this promise are likely several, but perhaps foremost is a generational shortage of workers across the economy that will continue to drive the need for greater automation.
Although AI and automation have historically been viewed as separate, this point of view is changing with the advent of gen AI. The technology is increasingly becoming a driver for automation and the resulting productivity. Workflow company Zapier co-founder Mike Knoop referred to this phenomenon on a recent Eye on AI podcast when he said: "AI and automation are mode collapsing into the same thing."
Certainly, McKinsey believes this. In a recent report, they stated that "generative AI is poised to unleash the next wave of productivity." They are hardly alone. For example, Goldman Sachs stated that gen AI could raise global GDP by 7%.
Whether or not we are at the zenith of the current gen AI wave, it is clearly an area that will continue to evolve and catalyze debates across business. While the challenges are significant, so are the opportunities, especially in a world hungry for innovation and efficiency. The race for GPU domination is but a snapshot in this unfolding narrative, a prologue to the future chapters of AI and computing.
Gary Grossman is senior VP of the technology practice at Edelman and global lead of the Edelman AI Center of Excellence.
Continued here:
Generative AI at an inflection point: What's next for real-world ... - VentureBeat
Best use cases of t-SNE 2023 part2(Machine Learning) – Medium
Author : Okan Düzyel
Abstract : The quality of GAN-generated images on the MNIST dataset was explored in this paper by comparing them to the original images using t-distributed stochastic neighbor embedding (t-SNE) visualization. A GAN was trained with the dataset to generate images, and after generating all synthetic images, the corresponding labels were saved. The dimensionality of the generated images and the original MNIST dataset was reduced using t-SNE, and the resulting embeddings were plotted. The quality of the GAN-generated images was examined by comparing the t-SNE plots of the generated images and the original MNIST images. It was found that the GAN-generated images were similar to the original images but had some differences in the distribution of the features. It is believed that this study provides a useful evaluation method for assessing the quality of GAN-generated images and can help to improve their generation in the future.
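A minimal sketch of the kind of comparison the abstract describes, using scikit-learn's small digits dataset as a stand-in for MNIST and random noise as a placeholder for GAN output (no trained generator is included here):

```python
# Embed real and placeholder "generated" digits with t-SNE and compare their layout.
# `generated_images` stands in for a trained GAN's output; here it is random noise.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

real_images = load_digits().data[:500]                                # 8x8 digits, flattened
generated_images = np.random.rand(500, real_images.shape[1]) * 16.0   # placeholder samples

X = np.vstack([real_images, generated_images])
source = np.array(["real"] * len(real_images) + ["generated"] * len(generated_images))

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Overlapping point clouds suggest similar feature distributions; well-separated
# clouds suggest systematic differences between real and generated samples.
for label in ("real", "generated"):
    pts = embedding[source == label]
    print(label, "centroid:", pts.mean(axis=0).round(2))
```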
2. Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors (arXiv)
Author : Edith Heiter, Bo Kang, Ruth Seurinck, Jefrey Lijffijt
Abstract : Conditional t-SNE (ct-SNE) is a recent extension to t-SNE that allows removal of known cluster information from the embedding, to obtain a visualization revealing structure beyond label information. This is useful, for example, when one wants to factor out unwanted differences between a set of classes. We show that ct-SNE fails in many realistic settings, namely if the data is well clustered over the labels in the original high-dimensional space. We introduce a revised method by conditioning the high-dimensional similarities instead of the low-dimensional similarities and storing within- and across-label nearest neighbors separately. This also enables the use of recently proposed speedups for t-SNE, improving the scalability. From experiments on synthetic data, we find that our proposed method resolves the considered problems and improves the embedding quality. On real data containing batch effects, the expected improvement is not always there. We argue revised ct-SNE is preferable overall, given its improved scalability. The results also highlight new open questions, such as how to handle distance variations between clusters
Continued here:
Best use cases of t-SNE 2023 part2(Machine Learning) - Medium
Using AI technologies for effective document processing … – Data Science Central
Ever-growing volumes of unstructured data stored in countless document formats significantly complicate data processing and timely access to relevant information for organizations. Without proper optimization of data management workflows, it's difficult to talk about business growth and scaling. That is why progressive companies opt for intelligent document processing powered by artificial intelligence.
Despite the fact that digitalization has been a top priority for businesses in recent years, companies still spend millions of dollars on manual document processing. According to statistics, about 80% of the data generated by organizations is unstructured. Moreover, this extends to various document formats, including spreadsheets, PDFs, images, etc., which require different approaches to processing this data.
Manual data processing approaches are not only subject to errors, but they can also lead to the loss of important documents, problems with version control, and various legal and regulatory risks. Incorporating AI technologies into the data processing workflow can help reduce these challenges. AI app development allows for the automation of the classification and extraction of unstructured and semi-structured data with a high level of accuracy.
There are several options for implementing artificial intelligence for document processing that meet different business goals, made possible by AI's ability to find hidden patterns beyond the reach of the human eye.
Traditional Optical Character Recognition (OCR) systems that are usually used for automated data extraction are template-based and require extensive supervision. While this is an acceptable option for highly structured documents like spreadsheets, problems arise when it comes to files with high variability like invoices, receipts, etc. The implementation of machine learning algorithms allows you to significantly expand the capabilities of OCR and provide more flexibility.
Any OCR algorithm includes three basic steps: image processing, text detection, and text recognition. The introduction of machine learning for the last two steps allows you to significantly improve the output. The end result of processing a file using machine learning OCR is converting the document into structured data for easy processing in your database. Since the accuracy of results with traditional OCR depends a lot on the quality of the original document, ML models could also help with solving this issue.
For instance, ML could help to increase the quality of images by applying denoising algorithms or binarization of the images, along with other approaches best suited to resolving the problem of low-quality images.
With machine learning, you can teach the model to associate various shapes with a specific symbol for greater accuracy. Such OCR systems can effectively process more complex data, for example, if you are dealing with blueprints and engineering drawings recognition. Also, machine learning can provide a more complete analysis, because it can analyze not only a certain part of the document but also the entire context.
Integration and customization of ready-made software such as OpenCV and Tesseract OCR allow you to create a solution that will meet all your specific needs. ML-based OCR systems help companies avoid mistakes that result in the loss of important data points and greatly facilitate the process of data management. They also significantly save human resources, because machine learning requires less human intervention over time. It is still good practice, however, to have humans validate the data recognized by the AI from time to time, in order to highlight problem spots in recognition and retrain models on new, updated data.
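The preprocess-then-recognize flow described above can be sketched with OpenCV and Tesseract. The file path is hypothetical, and a production pipeline would tune the denoising and thresholding steps to its own documents.

```python
# Basic OCR flow: load image, denoise, binarize, then recognize text.
# 'scanned_invoice.png' is a hypothetical input path.
import cv2
import pytesseract

image = cv2.imread("scanned_invoice.png", cv2.IMREAD_GRAYSCALE)

# Denoise, then binarize with Otsu's threshold to clean up a low-quality scan.
denoised = cv2.fastNlMeansDenoising(image, h=30)
_, binary = cv2.threshold(denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Tesseract performs text detection and recognition on the cleaned image.
text = pytesseract.image_to_string(binary)
print(text)
```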
Before going to data extraction, we need to understand the kind of data we are working with. That's where natural language processing (NLP) comes to the rescue. Unlike simple rule-based software that can extract information based on strictly defined keywords or tags, NLP is more flexible and can interpret information based on intent and meaning, and thus properly handle changes and variations in documents.
One of the basic tasks of NLP is Named Entity Recognition, i.e., identifying named entity mentions within unstructured data and classifying them into predefined categories (names, locations, amounts, etc.). Statistical NER systems usually require a large amount of manually tagged training data, but semi-supervised approaches can reduce this effort. For example, sometimes it's sufficient to use out-of-the-box NLP packages that include pre-trained machine learning models and don't require additional data for training. If this is not enough for acceptable results and the business uses specific naming, it will be necessary to label additional entities and retrain the NLP model on the updated dataset.
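A brief example of the out-of-the-box approach mentioned above, using spaCy's small pretrained English model as an assumed, generic choice; business-specific entities would still require additional labeled data and retraining.

```python
# Named Entity Recognition with a pretrained, general-purpose pipeline.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Invoice #4821 was issued to Acme Corp in Berlin on 3 March 2023 for $12,400.")

for ent in doc.ents:
    print(ent.text, "->", ent.label_)   # e.g., ORG, GPE, DATE, MONEY

# If the pretrained labels miss domain-specific entities (e.g., internal product codes),
# the model would need additional annotated examples and retraining.
```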
Text Classification helps to categorize text according to its content. For example, it can be used to classify and assign a set of pre-defined tags or categories to medical reports or insurance claims depending on different criteria. Or you can use classification to prioritize customer requests for a customer support team by ranking them by urgency.
Sentiment Analysis is a way to use natural language processing (NLP) methods to identify and extract people's opinions, attitudes, and emotions from text. It is a common task in NLP. It allows you to determine customers' thoughts and emotions about your products and services from reviews, survey responses, and social media comments. To determine the opinion, the system is usually guided by keywords. For example, "like" or "love" signal a positive statement, and "do not," "not," or "hate" a negative one. However, it's also worth considering special types of language constructions, because sometimes "not" and "never" can have the opposite meaning (for example, "not bad"). Difficulties can also arise with slang. For example, the word "sick" can have both a negative and a positive connotation. Nowadays, it is entirely possible to handle these tasks with more advanced deep learning models that are able to understand context from the written text and identify the emotions with a minimum of mistakes.
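The keyword-versus-context distinction can be seen by running a small pretrained model over a few tricky sentences. The snippet below uses the Hugging Face transformers sentiment pipeline with its default model, which is a convenience assumption rather than anything prescribed here.

```python
# Context-aware sentiment scoring with a pretrained transformer model.
# Requires: pip install transformers (a default sentiment model is downloaded on first run).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

examples = [
    "The support team was great, thanks!",
    "The new update is not bad at all.",   # negation that a simple keyword lookup would misread
    "That drop was sick!",                 # slang that keyword rules often score as negative
]

for text, result in zip(examples, classifier(examples)):
    print(f"{text!r} -> {result['label']} ({result['score']:.2f})")
```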
The accuracy of document processing with NLP depends on many factors, including variation, style, and complexity of the language used, the quality of training data, document size (sometimes large documents are better because they provide more context), number of classes and types of entity, and many more. Each case is unique and requires a customized solution that can be provided by experienced machine learning consultants.
When deciding to integrate AI-powered document processing into your workflow, you'll face two options: complete automation and semi-automation with human supervision. The first case is possible if your business processes are logical and repetitive. If there's any chance of variability that can impact decision-making, it's better to opt for semi-automation, where the human has the final word.
The creation of an AI product like an intelligent document processing system consists of the following stages:
Different cases require different solutions, and the use of AI is not always justified. That is why it's important to clearly understand exactly what results you want to get from the automation of document processing and to consult with specialists about the means of achieving these goals.
Consulting with software developers will help you choose the best tools to implement your idea. It can be both the customization of ready-made platforms and the development of completely new solutions if the specifics of your project require it.
To train models, it is important to have accurate, relevant, and comprehensive data. You can have your own databases or find open-source datasets, as well as use web scraping tools. Then, if necessary, the data is cleaned and processed by removing errors, formatting, and handling missing values.
Once the development team has the data they need, they can build and train the models, as well as improve them. The critical point for business owners in this process is finding a reliable development partner who has the necessary expertise and is able to match business needs with technology capabilities. With real experts on your side, you will be able to implement intelligent document processing without additional complications and personally experience the benefits of using AI to optimize business processes.
Excerpt from:
Using AI technologies for effective document processing ... - Data Science Central
The Dawn of Intelligence – Embracing AI’s Rise and What It Means … – TechiExpert.com
In the rapidly evolving world of today, starkly different from its state two decades ago, the digital landscape reigns supreme. Now, brace yourself for a new epoch on the horizon: the era of heightened intelligence. This is not a mere fantasy borrowed from science fiction, but a reality that's knocking at our doors. Machines are rapidly ascending the ladder of intelligence, and humanity is learning to harness this newfound potential. While concerns about unchecked AI dominance and fears of human subjugation by machines persist, one certainty emerges: the age of machines collaborating with humans is dawning.
Personal insights do shed light on this evolution. A son deeply immersed in the realm of machine learning and a daughter, a fashion designer, seamlessly integrating AI into the realm of fashion trends, highlight the rapid transformation that's underway. These personal experiences echo what visionary business leaders have been foretelling: even before AI was the talk of the town, they spoke of AI as the next revolution. Today, that prophecy is steadily materializing.
So, what exactly does AI entail? It is the amalgamation of machine learning and deep learning techniques, mirroring human behavior and cognition. AI models, nurtured on extensive data, are becoming adept at generating insightful responses and informed decisions. Amidst concerns of AI displacing human roles and triggering job losses, a critical understanding must prevail: AI's essence lies in amplifying human capabilities rather than rendering them obsolete. By simplifying tasks, AI elevates efficiency and productivity.
Yet AI's role isn't to function in isolation; human intervention remains indispensable, at least for now. This symbiotic relationship necessitates training and will inevitably give rise to new roles, shining a spotlight on the need for reskilling and adaptability.
The advancement of AI is intricately tied to the scale of data and the complexity of AI models, creating a trajectory of evolution that's awe-inspiring. A prime example is GPT-3, a groundbreaking model that operates through a remarkable 175 billion parameters. This extensive parameter count serves as the foundation for the model's cognitive abilities, enabling it to grasp patterns and generate intelligent responses. Yet this is just a stepping stone on the path of AI's progress. As we look ahead, the landscape is swiftly transforming, with tech giants embarking on the creation of AI models that dwarf their predecessors. These massive models, comprising over 1.6 trillion parameters, signal a quantum leap in AI's capabilities. This exponential growth in parameters encapsulates the accelerated evolution AI is experiencing, underscoring the rapid pace at which technology is reshaping the boundaries of intelligence.
AI's influence spans a wide spectrum of applications, leaving an indelible mark on various aspects of modern life. From its presence in technologies we encounter daily, such as facial recognition and GPS-driven driver assistance, to the subtler yet equally impactful domains like personalized advertising and voice assistants, AI has seamlessly woven itself into our routines. However, the scope of AI's impact extends far beyond these familiar domains.
AI's reach is evident in realms as diverse as image and content manipulation, where it empowers creative processes and enables transformative alterations to media. It doesn't stop there: AI's capabilities are harnessed for complex mathematical problem-solving and the intricate task of predictive analysis, offering insights into drug candidates' potential efficacy. The transformative power of AI is perhaps most pronounced in fields like finance, where it assists auditors in studying accounts, aids investors in identifying stock opportunities, and even generates legal contracts with precision. Moreover, AI is an invaluable tool in the realm of risk management, where it uncovers potential insurance fraud, safeguarding against financial losses. The vast expanse of AI applications continues to expand, with its potential reaching across industries and sectors, promising solutions to challenges both known and unforeseen.
According to a comprehensive report by Next Move Strategy Consulting, the AI market could burgeon into a $2 trillion industry by 2030. This staggering projection illuminates the expansive opportunities for AI pioneers, tech juggernauts, and the IT services sector. India's IT industry, renowned for adapting to digital transformations, is poised to harness AI models constructed by industry giants, ushering in a substantial opportunity akin to digital transformation itself.
The potential of intelligent machines to dissect information offers an unprecedented opportunity that businesses dare not overlook. Those who embrace this shift astutely and expediently will undoubtedly seize a competitive edge. Visionary leaders are attuned to this transformation, as exemplified by Tata Sons Chairman N. Chandrasekaran's commitment to deploying AI in the rejuvenation of Air India.
AI is far more than just a technological stride forward; it stands poised as a transformative force ready to redefine the very contours of business landscapes. Grasping this potential in a timely manner is a pressing need for both business leaders and stakeholders. Investors and shareholders find themselves confronted with a crucial inquiry: Is their company primed to harness the capabilities of AI? With the dawn of the AI era, readiness transcends being a mere option; it evolves into an indispensable strategy that holds the key to not only survival but also triumph.
Excerpt from:
The Dawn of Intelligence - Embracing AI's Rise and What It Means ... - TechiExpert.com
The TRIPOD-P reporting guideline for improving the integrity and … – Nature.com
Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology & Musculoskeletal Sciences, University of Oxford, Oxford, UK
Paula Dhiman, Rebecca Whittle & Gary S. Collins
Department of Development and Regeneration, KU Leuven, Leuven, Belgium
Ben Van Calster
Department of Electrical Engineering and Computer Science, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Marzyeh Ghassemi
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
Xiaoxuan Liu
Department of Bioethics, Hospital for Sick Children, Toronto, Ontario, Canada
Melissa D. McCradden
Genetics & Genome Biology, Peter Gilgan Centre for Research and Learning, Division of Clinical and Public Health, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
Melissa D. McCradden
Julius Centre for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht University, Utrecht, The Netherlands
Karel G. M. Moons
Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
Richard D. Riley
See more here:
The TRIPOD-P reporting guideline for improving the integrity and ... - Nature.com
What does the future of learning look like? Faculty and students … – Lumina Foundation
My whole life I've been curious about exoskeletons. As a child, I dreamed about building one for humans. Well, while Sarcos beat me to the punch with its wearable robotics, it hasn't quashed my fascination with how humans and technology form a symbiotic relationship.
So, when presented with an opportunity to explore viewpoints on the integration of technology into learning, we wondered: Do faculty and students have the same perceptions of digital tools? If so, what type of educational exoskeleton (a technological framework built around student needs) might we envision existing in the future?
The result was a survey called Time for Class, recently published by Tyton Partners and supported in part by Lumina Foundation. Here are three takeaways of many, from a very rich report.
When asked about preferences for course materials format, students largely selected digital options, at 75%. Faculty, on the other hand, are still figuring it out, with 20% not expressing a preference. Just 46% prefer digital course materials, and 34% favor printed materials.
But the large 20% who say they're indifferent presents an analytical conundrum. Did they respond that way while transitioning from print to digital? If so, then perhaps there is only a difference of about 10 percentage points between the preferences of faculty and students. But if faculty respondents answered "no preference" because they didn't want to seem behind the times, then there could be a larger 29-percentage-point gap, which is substantial and problematic.
One thing is clear: Faculty need to express a point of view.
Faculty have a higher preference for face-to-face courses, and students have a higher preference for learning to occur with at least some online elements. But most faculty and students still value some in-person interaction.
Considering substantial advancements in access to computational power and machine learning, it is and will remain interesting to see how this desire for human interaction evolves. We could see more creative embraces of the flipped classroom, which eschews traditional lectures so that students can instead use the shared space to make sense of what they're learning. Still, I wonder whether the current culture wars over addressing issues of race and diversity, equity, and inclusion in classrooms will dampen the desire and openness to the co-construction of knowledge.
We are in the early days in the arc of artificial intelligence, and faculty and students are already intrigued. The majority of both, 50% and 54% respectively, believe AI has a positive effect on student learning, compared to 22% each of faculty and students who believe its impact is negative. While artificial intelligence can certainly complete the homework the dog ate, I wonder if it can answer a more detailed question in a Socratic seminar.
What the report, and the three select findings shared here, portend is that edtech tools are here to stay. Exactly how they literally, and perhaps figuratively, support the future of learning is yet to be seen. What I do know is that if I were a child now, the exoskeleton I envisioned would likely be able to do more than I ever imagined!
See the original post:
What does the future of learning look like? Faculty and students ... - Lumina Foundation
How sure is sure? Incorporating human error into machine learning – University of Cambridge news
Human error and uncertainty are concepts that many artificial intelligence systems fail to grasp, particularly in systems where a human provides feedback to a machine learning model. Many of these systems are programmed to assume that humans are always certain and correct, but real-world decision-making includes occasional mistakes and uncertainty.
Researchers from the University of Cambridge, along with The Alan Turing Institute, Princeton, and Google DeepMind, have been attempting to bridge the gap between human behaviour and machine learning, so that uncertainty can be more fully accounted for in AI applications where humans and machines are working together. This could help reduce risk and improve trust and reliability of these applications, especially where safety is critical, such as medical diagnosis.
The team adapted a well-known image classification dataset so that humans could provide feedback and indicate their level of uncertainty when labelling a particular image. The researchers found that training with uncertain labels can improve these systems' performance in handling uncertain feedback, although humans also cause the overall performance of these hybrid systems to drop. Their results will be reported at the AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES 2023) in Montréal.
Human-in-the-loop machine learning systems, a type of AI system that enables human feedback, are often framed as a promising way to reduce risks in settings where automated models cannot be relied upon to make decisions alone. But what if the humans are unsure?
"Uncertainty is central in how humans reason about the world, but many AI models fail to take this into account," said first author Katherine Collins from Cambridge's Department of Engineering. "A lot of developers are working to address model uncertainty, but less work has been done on addressing uncertainty from the person's point of view."
We are constantly making decisions based on the balance of probabilities, often without really thinking about it. Most of the time (for example, if we wave at someone who looks just like a friend but turns out to be a total stranger) there's no harm if we get things wrong. However, in certain applications, uncertainty comes with real safety risks.
"Many human-AI systems assume that humans are always certain of their decisions, which isn't how humans work; we all make mistakes," said Collins. "We wanted to look at what happens when people express uncertainty, which is especially important in safety-critical settings, like a clinician working with a medical AI system."
"We need better tools to recalibrate these models, so that the people working with them are empowered to say when they're uncertain," said co-author Matthew Barker, who recently completed his MEng degree at Gonville & Caius College, Cambridge. "Although machines can be trained with complete confidence, humans often can't provide this, and machine learning models struggle with that uncertainty."
For their study, the researchers used some of the benchmark machine learning datasets: one was for digit classification, another for classifying chest X-rays, and one for classifying images of birds. For the first two datasets, the researchers simulated uncertainty, but for the bird dataset, they had human participants indicate how certain they were of the images they were looking at: whether a bird was red or orange, for example. These annotated soft labels provided by the human participants allowed the researchers to determine how the final output was changed. However, they found that performance degraded rapidly when machines were replaced with humans.
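As a hedged sketch of how such soft labels can enter training (a generic pattern, not the paper's exact setup), the snippet below replaces hard one-hot targets with each annotation's probability distribution over classes and minimizes cross-entropy against those soft targets in PyTorch.

```python
# Training signal from uncertain human annotations: cross-entropy against
# soft label distributions instead of hard one-hot targets.
# The tiny model and made-up labels are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Three hypothetical classes, e.g., "red bird", "orange bird", "other":
# one confident annotation and one uncertain annotation.
soft_targets = torch.tensor([
    [0.95, 0.05, 0.00],   # "almost certainly red"
    [0.40, 0.50, 0.10],   # "probably orange, but could be red"
])
features = torch.randn(2, 16)

model = nn.Linear(16, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(100):
    logits = model(features)
    # Cross-entropy with soft targets: -sum_k p_k * log q_k, averaged over examples.
    loss = -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("Learned class probabilities:\n", F.softmax(model(features), dim=1).detach())
```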
"We know from decades of behavioural research that humans are almost never 100% certain, but it's a challenge to incorporate this into machine learning," said Barker. "We're trying to bridge the two fields so that machine learning can start to deal with human uncertainty where humans are part of the system."
The researchers say their results have identified several open challenges when incorporating humans into machine learning models. They are releasing their datasets so that further research can be carried out and uncertainty might be built into machine learning systems.
"As some of our colleagues so brilliantly put it, uncertainty is a form of transparency, and that's hugely important," said Collins. "We need to figure out when we can trust a model and when to trust a human, and why. In certain applications, we're looking at probability over possibilities. Especially with the rise of chatbots, for example, we need models that better incorporate the language of possibility, which may lead to a more natural, safe experience."
"In some ways, this work raised more questions than it answered," said Barker. "But even though humans may be miscalibrated in their uncertainty, we can improve the trustworthiness and reliability of these human-in-the-loop systems by accounting for human behaviour."
The research was supported in part by the Cambridge Trust, the Marshall Commission, the Leverhulme Trust, the Gates Cambridge Trust and the Engineering and Physical Sciences Research Council (EPSRC), part of UK Research and Innovation (UKRI).
Reference: Katherine M. Collins et al. "Human Uncertainty in Concept-Based AI Systems." Paper presented at the Sixth AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES 2023), August 8-10, 2023, Montréal, QC, Canada.
Read the original post:
How sure is sure? Incorporating human error into machine learning - University of Cambridge news