Category Archives: Machine Learning

Artificial Intelligence Creeps on to the African Battlefield – Brookings Institution

Even as the world's leading militaries race to adopt artificial intelligence in anticipation of future great power war, security forces in one of the world's most conflict-prone regions are opting for a more measured approach. In Africa, AI is gradually making its way into technologies such as advanced surveillance systems and combat drones, which are being deployed to fight organized crime, extremist groups, and violent insurgencies. Though the long-term potential for AI to impact military operations in Africa is undeniable, AI's impact on organized violence has so far been limited. These limits reflect both the novelty and constraints of existing AI-enabled technology.

Artificial intelligence and armed conflict in Africa

Artificial intelligence (AI), at its most basic, leverages computing power to simulate human behavior that requires intelligence. Artificial intelligence is not a military technology like a gun or a tank. It is rather, as the University of Pennsylvania's Michael Horowitz argues, a general-purpose technology with a multitude of applications, like the internal combustion engine, electricity, or the internet. And as AI applications proliferate into military uses, the technology threatens to change the nature of warfare. According to the ICRC, AI and machine-learning systems could have profound implications for the role of humans in armed conflict, especially in relation to the increasing autonomy of weapon systems and other unmanned systems; new forms of cyber and information warfare; and, more broadly, the nature of decision-making.

In at least two respects, AI is already affecting the dynamics of armed conflict and violence in Africa. First, AI-driven surveillance and smart policing platforms are being used to respond to attacks by violent extremist groups and organized criminal networks. Second, the development of AI-powered drones is beginning to influence combat operations and battlefield tactics.

AI is perhaps most widely used in Africa in areas with high levels of violence to increase the capabilities and coordination of law enforcement and domestic security services. For instance, fourteen African countries deploy AI-driven surveillance and smart-policing platforms, which typically rely on deep neural networks for image classification and a range of machine learning models for predictive analytics. In Nairobi, Chinese tech giant Huawei has helped build an advanced surveillance system, and in Johannesburg automated license plate readers have enabled authorities to track violent, organized criminals with suspected ties to the Islamic State. Although such systems have significant limitations (more on this below), they are proliferating across Africa.

AI-driven systems are also being deployed to fight organized crime. At Liwonde National Park in Malawi, park rangers use EarthRanger software, developed by the late Microsoft co-founder Paul Allen, to combat poaching using artificial intelligence and predictive analytics. The software detects patterns in poaching that the rangers might overlook, such as upticks in poaching during holidays and government paydays. A small, motion-activated poacher cam relies on an algorithm to distinguish between humans and animals and has contributed to at least one arrest. It's not difficult to imagine how such a system might be repurposed for counterinsurgency or armed conflict, with AI-enabled surveillance and monitoring systems deployed to detect and deter armed insurgents.

In addition to the growing use of AI within surveillance systems across Africa, AI has also been integrated into weapon systems. Most prominently, lethal autonomous weapons systems use real-time sensor data coupled with AI and machine learning algorithms to select and engage targets without further intervention by a human operator. Depending on how that definition is interpreted, the first use of a lethal autonomous weapon system in combat may have taken place on African soil in March 2020. That month, logistics units belonging to the armed forces of the Libyan warlord Khalifa Haftar came under attack by Turkish-made STM Kargu-2 drones as they fled Tripoli. According to a United Nations report, the Kargu-2 represented a lethal autonomous weapons system because it had been programmed to attack targets without requiring data connectivity between the operator and munition. Although other experts have instead classified the Kargu-2 as a loitering munition, its use in combat in northern Africa nonetheless points to a future where AI-enabled weapons are increasingly deployed in armed conflicts in the region.

Indeed, despite global calls for a ban on similar weapons, the proliferation of systems like the Kargu-2 is likely only beginning. Relatively low costs, tactical advantages, and the emergence of multiple suppliers have led to a booming market for low- and mid-tier combat drones, currently dominated by players including Israel, China, Turkey, and South Africa. Such drones, particularly Turkey's Bayraktar TB2, have been acquired and used by well over a dozen African countries.

While the current generation of drones by and large does not have publicly acknowledged AI-driven autonomous capabilities, the same cannot be said for the next generation, which is even less costly, more attritable, and uses AI-assisted swarming technology to make itself harder to defend against. In February, the South Africa-based Paramount Group announced the launch of its N-RAVEN UAV system, which it bills as a "family of autonomous, multi-mission aerial vehicles featuring next-generation swarm technologies." The N-RAVEN will be able to swarm in units of up to twenty and is designed for technology transfer and portable manufacture within partner countries. These features are likely to be attractive to African militaries.

AI's limits, downsides, and risks

Though AI may continue to play an increasing role in the organizational strategies, intelligence-gathering capabilities, and battlefield tactics of armed actors in Africa and elsewhere, it is important to put these contributions in a broader perspective. AI cannot address the fundamental drivers of armed conflict, particularly the complex insurgencies common in Africa. African states and militaries may overinvest in AI, neglecting its risks and externalities, as well as the ways in which AI-driven capabilities may be mitigated or exploited by armed non-state actors.

AI is unlikely to have a transformative impact on the outbreak, duration, or mitigation of armed conflict in Africa, whose incidence has doubled over the past decade. Despite claims by their makers, there is little hard evidence linking the deployment of AI-powered smart cities with decreases in violence, including in Nairobi, where crime incidents have remained virtually unchanged since 2014, when the city's AI-driven systems first went online. The same is true of poaching. During the COVID-19 pandemic, fewer tourists and struggling local economies have fueled significant increases, overwhelming any progress that has resulted from governments adopting cutting-edge technology.

This is because, in the first place, armed conflict is a human endeavor, with many factors that influence its outcomes. Even the staunchest defenders of AI-driven solutions, such as Huawei Southern Africa Public Affairs Director David Lane, admit that they cannot address the underlying causes of insecurity, such as unemployment or inequality: "Ultimately, preventing crime requires addressing these causes in a very local way." No AI algorithm can prevent poverty or political exclusion, disputes over land or natural resources, or political leaders from making chauvinistic appeals to group identity. Likewise, the central problems with Africa's militaries (endemic corruption, human rights abuses, loyalties to specific leaders and groups rather than to institutions and citizens, and a proclivity for ill-timed seizures of power) are not problems that artificial intelligence alone can solve.

In the second place, the aspects of armed conflict that AI seems most likely to disrupt, namely remote intelligence-gathering capabilities and air power, are technologies that enable armies to keep enemies at arm's length and win conventional, pitched battles. AI's utility in fighting insurgencies, in which non-state armed actors conduct guerrilla attacks and seek to blend in and draw support from the population, is more questionable. Winning against an insurgency requires a sustained on-the-ground presence to maintain order and govern contested territory. States cannot hope to prevail in such conflicts by relying on technology that effectively removes them from the fight.

Finally, the use of AI to fight modern armed conflict remains at a nascent stage. To date, the available evidence has documented how state actors are adopting AI to fight conflict, not how armed non-state actors are responding. Nevertheless, states will not be alone in seeking to leverage autonomous weapons. Former African service members speculate that it is only a matter of time before swarms or clusters of offensive drones are deployed by non-state actors in Africa, given their accessibility, low costs, and existing use in surveillance and smuggling. Rights activists have raised the alarm about the potential for small, cheap, swarming "slaughterbots" that use freely available AI and facial recognition systems to commit mass acts of terror. This particular scenario is controversial, but according to American University's Audrey Kurth Cronin, it is both technologically feasible and consistent with classic patterns of diffusion.

The AI armed conflict evolution

These downsides and risks suggest that the continued diffusion of AI is unlikely to result in the revolutionary changes to armed conflict suggested by some of its more ardent proponents and backers. Rather, modern AI is best viewed as continuing, and perhaps accelerating, long-standing technological trends that have enhanced sensing capabilities and digitized and automated the operations and tactics of armed actors everywhere.

For all its complexity, AI is first and foremost a digital technology, its impact dependent on and difficult to disentangle from a technical triad of data, algorithms, and computing power. The impact of AI-powered surveillance platforms, from the EarthRanger software used at Liwonde to Huawei-supplied smart policing platforms, isn't just a result of machine-learning algorithms that enable human-like reasoning capabilities; it also depends on the ability to collect, store, process, collate, and manage vast quantities of data. Likewise, as pointed out by analysts such as Kelsey Atherton, the Kargu-2 used in Libya can be classified as an autonomous loitering munition, like Israel's Harpy drone. The main difference between the Kargu-2 and the Harpy, which was first manufactured in 1989, is that the former uses AI-driven image recognition while the latter uses electro-optical sensors to detect and home in on enemy radar emissions.

The diffusion of AI across Africa, like the broader diffusion of digital technology, is likely to be diverse and uneven. Africa remains the world's least digitized region. Internet penetration rates are low and likely to remain so in many of the most conflict-prone countries. In Somalia, South Sudan, Ethiopia, the Democratic Republic of the Congo, and much of the Lake Chad Basin, internet penetration is below 20%. AI is unlikely to have much of an impact on conflict in regions where citizens leave little in the way of a digital footprint and non-state armed groups control territory beyond the easy reach of the state.

Taken together, these developments suggest that AI will drive a steady evolution in armed conflict in Africa and elsewhere rather than revolutionize it. Digitization and the widespread adoption of autonomous weapons platforms may extend the eyes and lengthen the fists of state armies. Non-state actors will adopt these technologies themselves and come up with clever ways to exploit or negate them. Artificial intelligence will be used in combination with equally influential but less flashy inventions, such as the AK-47, the nonstandard tactical vehicle, and the IED, to enable new tactics that exploit trends toward better sensing capabilities and increased mobility.

Incrementally and in concert with other emerging technologies, AI is transforming the tools and tactics of warfare. Nevertheless, experience from Africa suggests that humans will remain the main actors in the drama of modern armed conflict.

Nathaniel Allen is an assistant professor with the Africa Center for Strategic Studies at National Defense University and a Council on Foreign Relations term member. Marian Ify Okpali is a researcher on cyber policy and the executive assistant to the dean at the Africa Center for Strategic Studies at National Defense University. The opinions expressed in this article are those of the authors.

Microsoft provides financial support to the Brookings Institution, a nonprofit organization devoted to rigorous, independent, in-depth public policy research.


Founded by Ex-Uber Data Architect and Apache Hudi Creator, – GlobeNewswire

MENLO PARK, Calif., Feb. 02, 2022 (GLOBE NEWSWIRE) -- Today Onehouse, the first managed lakehouse company, emerged from stealth with its cloud-native managed service based on Apache Hudi that makes data lakes easier, faster and cheaper.

Data has become the driving force of innovation across nearly every industry in the world. Yet organizations still struggle to build and maintain data architectures that can economically scale with the fast-paced growth of their data. As the size of the data and the AI and machine learning (ML) workloads increase, costs rise exponentially and organizations start to outgrow their data warehouses. To scale any further, they turn to a data lake, where they face a whole new set of complex challenges like constantly tuning data layouts, large-scale concurrency control, fast data ingestion, data deletions and more.

Onehouse founder Vinoth Chandar faced these very challenges as he was building one of the largest data lakes in the world at Uber. A rapidly growing Uber needed the performance of a warehouse and the scale of a data lake, in near real time, to power AI/ML-driven features like predicting ETAs, recommending eats and ensuring ride safety. He created Apache Hudi to implement a path-breaking new architecture in which core warehouse and database functionality was added directly to the data lake, today known as the lakehouse. Apache Hudi brings a state-of-the-art data lakehouse to life with advanced indexes, streaming ingestion services and data clustering/optimization techniques.

Apache Hudi is now widely adopted across the industry, used by organizations from startups to large enterprises, including Amazon, Walmart, Disney+ Hotstar, GE Aviation, Robinhood and TikTok, to build exabyte-scale data lakes in near real time at vastly improved price/performance. The broad adoption of Hudi has battle-tested and proven the foundational benefits of this open source project. Thousands of organizations from across the world have contributed to Hudi, and the project has grown 7x in less than two years to nearly one million monthly downloads. At Uber, Hudi continues to ingest more than 500 billion records every day.

Zheng Shao and Mohammad Islam from Uber shared: "We started the Hudi project in 2016, and submitted it to the Apache Incubator Project in 2019. Apache Hudi is now a Top-Level Project, with the majority of our Big Data on HDFS in Hudi format. This has dramatically reduced the computing capacity needs at Uber," as detailed in the "Cost-Efficient Open Source Big Data Platform at Uber" blog post: https://eng.uber.com/cost-efficient-big-data-platform/.

Even with transformative technology like Apache Hudi, building a high-quality data lake requires months of investment and scarce talent; without it, there is a high risk that data is not fresh enough, or that the lake is unreliable or performs poorly.

Onehouse founder and CEO Vinoth Chandar said: "While a warehouse can just be used, a lakehouse still needs to be built. Having worked with many organizations on that journey for four years in the Apache Hudi community, we believe Onehouse will enable easy adoption of data lakes and future-proof the data architecture for machine learning/data science down the line."

Onehouse streamlines the adoption of the lakehouse architecture, by offering a fully-managed cloud-native service that quickly ingests, self-manages and auto-optimizes data. Instead of creating yet another vertically integrated data and query stack, it provides one interoperable and truly open data layer that accelerates workloads across all popular data lake query engines like Apache Spark, Trino, Presto and even cloud warehouses as external tables.

Leveraging unique capabilities of Apache Hudi, Onehouse opens the door for incremental data processing that is typically orders of magnitude faster than old-school batch processing. By combining breakthrough technology with a fully managed, easy-to-use service, organizations can build data lakes in minutes, not months, realize large cost savings and still own their data in open formats, not locked into any individual vendor.
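To make the incremental ingestion and interoperability claims concrete, here is a minimal, hypothetical PySpark sketch of writing to and reading back an Apache Hudi table using the standard Hudi Spark datasource options. The table name, schema, and storage path are invented, and a real session needs the Hudi Spark bundle and the additional Spark configuration described in the Hudi quickstart.

```python
# Minimal sketch: upsert a small batch into a Hudi table and read it back.
# Assumes a Spark session launched with the Hudi bundle on the classpath
# (e.g. spark-submit --packages org.apache.hudi:hudi-spark3-bundle_2.12:<version>)
# plus the serializer settings from the Hudi quickstart. Paths and schema are invented.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi-sketch").getOrCreate()

base_path = "/tmp/trips_hudi"  # hypothetical storage location
df = spark.createDataFrame(
    [("r1", "2022-02-02 10:00:00", "driver-A", 27.5),
     ("r2", "2022-02-02 10:05:00", "driver-B", 13.0)],
    ["ride_id", "ts", "driver", "fare"],
)

hudi_options = {
    "hoodie.table.name": "trips",
    "hoodie.datasource.write.recordkey.field": "ride_id",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.operation": "upsert",  # later batches merge into existing records
}

# The initial write creates the table; subsequent upserts would use mode("append").
df.write.format("hudi").options(**hudi_options).mode("overwrite").save(base_path)

# The same files can then be queried by Spark, Trino, Presto, or as external warehouse tables.
spark.read.format("hudi").load(base_path).show()
```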

Industry Analysts on Onehouse

$8 Million in Seed Funding

Onehouse raised $8 million in seed funding co-led by Greylock and Addition. Onehouse plans to use the money for its managed lakehouse product and to further the research and development on Apache Hudi.

Greylock Partner Jerry Chen said: "The data lake house is the future of data lakes, providing customers the ease of use of a data warehouse with the cost and scale advantages of a data lake. Apache Hudi is already the de facto starting point for modern data lakes, and today Onehouse makes data lakes easily accessible and usable by all customers."

Addition Investor Aaron Schildkrout said: "Onehouse is ushering in the next generation of data infrastructure, replacing expensive data ingestion and data warehousing solutions with a single lakehouse that's dramatically less costly, faster, more open and, now, also easier to use. Onehouse is going to make broadly accessible what has to date been a tightly held secret used by only the most advanced data teams."

Additional Resources

About Onehouse

Onehouse provides a cloud-native managed lakehouse service that makes data lakes easier, faster and cheaper. Onehouse blends the ease of use of a warehouse with the scale of a data lake into a fully managed product. Engineers can build data lakes in minutes, process data in seconds and own data in open source formats, not locked away to individual vendors. Onehouse is founded by a former Uber data architect and the creator of Apache Hudi, who pioneered the fundamental technology of the lakehouse. For more information, please visit https://onehouse.ai or follow @Onehousehq.

Media and Analyst Contact: Amber Rowland, amber@therowlandagency.com, +1-650-814-4560

A photo accompanying this announcement is available at https://www.globenewswire.com/NewsRoom/AttachmentNg/aedd9404-e43b-49fb-9091-a4b0e57e7f39


How to build healthcare predictive models using PyHealth? – Analytics India Magazine

Machine learning has been applied to many health-related tasks, such as the development of new medical treatments, the management of patient data and records, and the treatment of chronic diseases. To achieve success in these state-of-the-art applications, we must rely on the time-consuming process of model building and evaluation. To alleviate this load, Yue Zhao et al. have proposed PyHealth, a Python-based toolbox. As the name implies, this toolbox contains a variety of ML models and architectures for working with medical data. In this article, we will go through this toolbox to understand how it works and where it can be applied.

Let's first discuss the use cases of machine learning in the healthcare industry.

Machine learning is being used in a variety of healthcare settings, from case management of common chronic conditions to leveraging patient health data in conjunction with environmental factors such as pollution exposure and weather.

Machine learning technology can assist healthcare practitioners in developing accurate medication treatments tailored to individual features by crunching enormous amounts of data. The following are some examples of applications that can be addressed in this segment:

The ability to swiftly and properly diagnose diseases is one of the most critical aspects of a successful healthcare organization. In high-need areas like cancer diagnosis and therapy, where hundreds of drugs are now in clinical trials, scientists and computationalists are entering the mix. One method combines cognitive computing with genetic tumour sequencing, while another makes use of machine learning to provide diagnosis and treatment in a range of fields, including oncology.

Medical imaging, and its ability to provide a complete picture of an illness, is another important aspect of diagnosing an illness. Deep learning is becoming more accessible as data sources become more diverse, and because it can be used in the diagnostic process, it is becoming increasingly important. Although these machine learning applications are frequently correct, they have some limitations in that they cannot explain how they came to their conclusions.

ML has the potential to identify new medications with significant economic benefits for pharmaceutical companies, hospitals, and patients. Some of the world's largest technology companies, like IBM and Google, have developed ML systems to help patients find new treatment options. Precision medicine is a significant phrase in this area since it entails understanding mechanisms underlying complex disorders and developing alternative therapeutic pathways.

Because of the high-risk nature of surgeries, we will always need human assistance, but machine learning has proved extremely helpful in the robotic surgery sector. The da Vinci robot, which allows surgeons to operate robotic arms in order to do surgery with great detail and in confined areas, is one of the most popular breakthroughs in the profession.

These hands are generally more accurate and steady than human hands. There are additional instruments that employ computer vision and machine learning to determine the distances between various body parts so that surgery can be performed properly.

Health data is typically noisy, complicated, and heterogeneous, resulting in a diverse set of healthcare modelling issues. For instance, health risk prediction is based on sequential patient data, disease diagnosis on medical images, and risk detection on continuous physiological signals such as the electroencephalogram (EEG) or electrocardiogram (ECG), as well as on multimodal clinical notes (e.g., text and images). Despite their importance in healthcare research and clinical decision making, the complexity and variability of health data and tasks have made the development of a specialized ML system for benchmarking predictive health models long overdue.

PyHealth is made up of three modules: data preprocessing, predictive modelling, and evaluation. Both computer scientists and healthcare data scientists are PyHealth's target users. Using PyHealth, they can run complicated machine learning pipelines on healthcare datasets in less than 10 lines of code.

The data preprocessing module converts complicated healthcare datasets such as longitudinal electronic health records, medical images, continuous signals (e.g., electrocardiograms), and clinical notes into machine learning-friendly formats.

The predictive modelling module offers over 30 machine learning models, including known ensemble trees and deep neural network-based approaches, using a uniform yet flexible API geared for both researchers and practitioners.

The evaluation module includes a number of evaluation methodologies (for example, cross-validation and train-validation-test split) as well as prediction model metrics.

There are five distinct advantages to using PyHealth. For starters, it contains more than 30 cutting-edge predictive health algorithms, including both traditional techniques like XGBoost and more recent deep learning architectures like autoencoders, convolutional based, and adversarial based models.

Second, PyHealth has a broad scope and includes models for a variety of data types, including sequence, image, physiological signal, and unstructured text data. Third, for clarity and ease of use, PyHealth includes a unified API, detailed documentation, and interactive examples for all algorithms; complex deep learning models can be implemented in less than ten lines of code.

Fourth, most models in PyHealth are covered by cross-platform unit testing with continuous integration, code coverage, and code maintainability checks. Finally, for efficiency and scalability, parallelization is enabled in select modules (data preprocessing), along with fast GPU computation for deep learning models via PyTorch.

PyHealth is a Python 3 library built on NumPy, SciPy, scikit-learn, and PyTorch. It consists of three major modules. First, the data preprocessing module validates and converts user input into a format that learning models can understand.

Second, the predictive modelling module comprises a collection of models organized by input data type into sequences, images, EEG, and text, with a set of dedicated learning models implemented for each data type. Third, the evaluation module can automatically infer the task type, such as multi-classification, and conduct a comprehensive evaluation appropriate to that task type.

Most learning models share the same interface, inspired by the scikit-learn API and general deep learning design: fit learns the weights and saves the necessary statistics from the training and validation data; load_model selects the model with the best validation accuracy; and inference predicts on incoming test data.

For quick data and model exploration, the framework includes a library of helper and utility functions (check parameter, label check, and partition estimators). For example, label check can inspect the data labels and automatically infer the task type, such as binary classification or multi-classification.

PyHealth for model building

Below, we discuss how to leverage the API of this framework. First, we need to install the package using pip.

! pip install pyhealth

Next, we can load the data from the repository itself. For that, we need to clone the repository. Inside the cloned repository's datasets folder there is a variety of datasets: sequence-based, image-based, and so on. We are using the MIMIC dataset, which ships as a zip archive, so we need to unzip it. Below is the snippet to clone the repository and unzip the data.

The unzipped data is saved in the current working directory in a folder named mimic. Next, to use this dataset, we need to load the sequence data generator, which prepares the dataset for experimentation.
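The article's original snippets are not reproduced here, so the following is a sketch of that preparation step. The repository URL, folder layout, and generator names follow the PyHealth examples as we understand them and should be treated as assumptions; consult the project's documentation for the current API.

```python
# Sketch of preparing the bundled MIMIC demo data after cloning the PyHealth repo.
# Run once beforehand, outside Python:  git clone https://github.com/yzhao062/pyhealth.git
import os
import zipfile

zip_path = os.path.join("pyhealth", "datasets", "mimic", "mimic.zip")  # assumed location
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall("mimic")  # unzipped into ./mimic in the current working directory

# The sequence data generator turns the raw records into an experiment-ready dataset.
from pyhealth.data.expdata_generator import sequencedata as expdata_generator

expdata_id = "2020.0810.data.mortality.mimic"   # identifier style used in the repo examples
cur_dataset = expdata_generator(expdata_id, root_dir=".")
cur_dataset.get_exp_data(sel_task="mortality", data_root="./mimic")
cur_dataset.load_exp_data()
```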

Now that we have loaded the dataset, we can proceed with modelling as below.
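Below is a hedged sketch of that modelling step, following the fit / load_model / inference interface described earlier; the exact module path and constructor arguments are assumptions based on the repository's examples.

```python
# Sketch of fitting an LSTM sequence model and running inference with PyHealth.
# Module path and arguments are assumptions; adjust to the installed version.
from pyhealth.models.sequence.lstm import LSTM

clf = LSTM(expmodel_id="test.lstm.mimic.mortality", n_epoch=10, use_gpu=False)
clf.fit(cur_dataset.train, cur_dataset.valid)   # learns weights, tracks validation statistics
clf.load_model()                                # reloads the checkpoint with the best validation score
clf.inference(cur_dataset.test)                 # predicts on the held-out test split
results = clf.get_results()                     # dictionary holding ground truth and predictions

# The evaluation module infers the task type from the labels and computes the metrics.
from pyhealth.evaluation.evaluator import func as evaluate
print(evaluate(results["hat_y"], results["y"]))
```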

Here is the result of fitting the model.

In this article, we discussed how machine learning can be used in the healthcare industry by surveying its various applications. Because this domain is vast, with countless applications, we also walked through a Python-based toolbox designed for building predictive models using various deep learning techniques, such as LSTMs and GRUs for sequence data and CNNs for image-based data.


Silicon Labs brings AI and Machine Learning to the Edge with Matter-ready platform – Design Products & Applications

31 January 2022

This new co-optimised hardware and software platform will help bring AI/ML applications and wireless high performance to battery-powered edge devices. Matter-ready, the ultra-low-power BG24 and MG24 families support multiple wireless protocols and incorporate PSA Level 3 Secure Vault protection, ideal for diverse smart home, medical and industrial applications. The SoC and software solution for the Internet of Things (IoT) announced today includes:

Two new families of 2.4 GHz wireless SoCs, which feature the industry's first integrated AI/ML accelerators, support for Matter, Zigbee, OpenThread, Bluetooth Low Energy, Bluetooth mesh, proprietary and multi-protocol operation, the highest level of industry security certification, ultra-low power capabilities and the largest memory and flash capacity in the Silicon Labs portfolio.

A new software toolkit designed to allow developers to quickly build and deploy AI and machine learning algorithms using some of the most popular tool suites like TensorFlow.

"The BG24 and MG24 wireless SoCs represent an awesome combination of industry capabilities including broad wireless multiprotocol support, battery life, machine learning, and security for IoT Edge applications," said Matt Johnson, CEO of Silicon Labs.

First integrated AI/ML acceleration improves performance and energy efficiency

IoT product designers see the tremendous potential of AI and machine learning to bring even greater intelligence to edge applications like home security systems, wearable medical monitors, sensors monitoring commercial facilities and industrial equipment, and more. But today, those considering deploying AI or machine learning at the edge are faced with steep penalties in performance and energy use that may outweigh the benefits.

The BG24 and MG24 alleviate those penalties as the first ultra-low powered devices with dedicated AI/ML accelerators built in. This specialised hardware is designed to handle complex calculations quickly and efficiently, with internal testing showing up to a 4x improvement in performance along with up to a 6x improvement in energy efficiency. Because the ML calculations are happening on the local device rather than in the cloud, network latency is eliminated for faster decision-making and actions.

The BG24 and MG24 families also have the largest Flash and random-access memory (RAM) capacities in the Silicon Labs portfolio. This means that the device can evolve for multi-protocol support, Matter, and trained ML algorithms for large datasets. PSA Level 3-Certified Secure Vault™, the highest level of security certification for IoT devices, provides the security needed in products like door locks, medical equipment, and other sensitive deployments where hardening the device from external threats is paramount.

To learn more about the capabilities of the BG24 and MG24 SoCs and view a demo on how to get started, register for the instructional Tech Talk "Unboxing the new BG24 and MG24 SoCs" here: https://www.silabs.com/tech-talks.

AI/ML software and Matter-support help designers create for new innovative applications

In addition to natively supporting TensorFlow, Silicon Labs has partnered with some of the leading AI and ML tools providers, like SensiML and Edge Impulse, to ensure that developers have an end-to-end toolchain that simplifies the development of machine learning models optimised for embedded deployments of wireless applications. Using this new AI/ML toolchain with Silicon Labs's Simplicity Studio and the BG24 and MG24 families of SoCs, developers can create applications that draw information from various connected devices, all communicating with each other using Matter to then make intelligent machine learning-driven decisions.
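As a generic illustration of the kind of workflow such a toolchain targets, the sketch below trains a tiny TensorFlow classifier and converts it into a TensorFlow Lite flatbuffer of the sort that embedded AI/ML accelerators consume. The data, model shape, and file name are invented, and vendor-specific steps such as the Simplicity Studio integration are not shown.

```python
# Generic sketch: train a toy TensorFlow classifier, then convert it to a
# TensorFlow Lite flatbuffer suitable for microcontroller-class deployment.
# The feature data and model are invented for illustration only.
import numpy as np
import tensorflow as tf

# Toy training data: 32-dimensional feature vectors, two classes.
X = np.random.rand(512, 32).astype("float32")
y = (X.mean(axis=1) > 0.5).astype("int32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)

# Convert to TensorFlow Lite for embedded deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # enables size/latency optimizations
tflite_model = converter.convert()
with open("toy_classifier.tflite", "wb") as f:
    f.write(tflite_model)
```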

For example, in a commercial office building, many lights are controlled by motion detectors that monitor occupancy to determine if the lights should be on or off. However, when typing at a desk with motion limited to hands and fingers, workers may be left in the dark when motion sensors alone cannot recognise their presence. By connecting audio sensors with motion detectors through the Matter application layer, the additional audio data, such as the sound of typing, can be run through a machine-learning algorithm to allow the lighting system to make a more informed decision about whether the lights should be on or off.
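A toy sketch of that sensor-fusion idea: features from a motion detector and an audio channel are combined and fed to a small classifier that decides whether the space is occupied. The feature names and the tiny training set are invented for illustration.

```python
# Toy sketch of fusing motion-detector and audio features to infer occupancy.
# Feature values and the tiny training set are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [motion_events_per_minute, audio_energy, keyboard_like_score]
X_train = np.array([
    [0.0, 0.01, 0.0],   # empty room
    [0.1, 0.02, 0.0],   # empty room, background hum
    [0.0, 0.30, 0.9],   # someone typing, little gross motion
    [2.0, 0.50, 0.1],   # person walking and talking
])
y_train = np.array([0, 0, 1, 1])  # 0 = unoccupied, 1 = occupied

clf = LogisticRegression().fit(X_train, y_train)

# A worker typing at a desk: the motion signal alone would say "unoccupied",
# but the audio features push the combined decision toward "occupied".
current_reading = np.array([[0.0, 0.28, 0.85]])
lights_on = bool(clf.predict(current_reading)[0])
print("lights on" if lights_on else "lights off")
```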

ML computing at the edge enables other intelligent industrial and home applications, including sensor-data processing for anomaly detection, predictive maintenance, audio pattern recognition for improved glass-break detection, simple-command word recognition, and vision use cases like presence detection or people counting with low-resolution cameras.

Alpha program highlights variety of deployment options

More than 40 companies representing various industries and applications have already begun developing and testing this new platform solution in a closed Alpha program. These companies have been drawn to the BG24 and MG24 platforms by their ultra-low power, advanced features, including AI/ML capabilities and support for Matter. Global retailers are looking to improve the in-store shopping experience with more accurate asset tracking, real-time price updating, and other uses. Participants from the commercial building management sector are exploring how to make their building systems, including lighting and HVAC, more intelligent to lower owners' costs and reduce their environmental footprint. Finally, consumer and smart home solution providers are working to make it easier to connect various devices and expand the way they interact to bring innovative new features and services to consumers.

Silicon Labs' most capable family of SoCs

The single-die BG24 and MG24 SoCs combine a 78 MHz ARM Cortex-M33 processor, high-performance 2.4 GHz radio, industry-leading 20-bit ADC, an optimised combination of Flash (up to 1536 kB) and RAM (up to 256 kB), and an AI/ML hardware accelerator for processing machine learning algorithms while offloading the ARM Cortex-M33, so applications have more cycles to do other work. Supporting a broad range of 2.4 GHz wireless IoT protocols, these SoCs incorporate the highest security with the best RF performance/energy-efficiency ratio in the market.

Availability

EFR32BG24 and EFR32MG24 SoCs in 5 x 5mm QFN40 and 6 x 6mm QFN48 packages are shipping today to Alpha customers and will be available for mass deployment in April 2022. Multiple evaluation boards are available to designers developing applications. Modules based on the BG24 and MG24 SoCs will be available in the second half of 2022.

To learn more about the new BG24 family, go to: http://silabs.com/bg24.

To learn more about the new MG24 family, go to: http://silabs.com/mg24.

To learn more about how Silicon Labs supports AI and ML, go to: http://silabs.com/ai-ml.


AI's J-curve and upcoming productivity boom – TechTalks

This article is part of our series that explores the business of artificial intelligence

Digital technologies, and at their forefront artificial intelligence, are triggering fundamental shifts in society, politics, education, economy, and other fundamental aspects of life. These changes provide opportunities for unprecedented growth across different sectors of the economy. But at the same time, they entail challenges that organizations must overcome before they can tap into their full potential.

In a recent talk at an online conference organized by Stanford Human-Centered Artificial Intelligence (HAI), Stanford professor Erik Brynjolfsson discussed some of these opportunities and challenges.

Brynjolfsson, who directs Stanford's Digital Economy Lab, believes that in the coming decade, the use of artificial intelligence will be much more widespread than it is today. But its adoption will also face a period of lull, also known as the J-curve.

"There's a growing gap between what the technology is capable of and what it is already doing versus how we are responding to that," Brynjolfsson says. "And that's where a lot of our society's biggest challenges and problems and some of our biggest opportunities lie."

According to Brynjolfsson, the next decade will see significantly higher productivity thanks to a wave of powerful technologies, especially machine learning, that are finding their way into every computing device and application.

Advances in computer vision have been tremendous, especially in areas such as image recognition and medical imaging. Talking to phones, watches, and smart speakers has become commonplace thanks to advances in natural language processing and speech recognition. Product recommendation, ad placement, insurance underwriting, loan approval, and many other applications have benefited immensely from advances in machine learning.

In many areas, machine learning is reducing costs and accelerating production. For example, the application of large language models in programming can help software developers become much more productive and achieve more in less time.

In other areas, machine learning can help create applications that did not exist before. For example, generative deep learning models are creating new applications for arts, music, and other creative work. In areas such as online shopping, advances in machine learning can create major shifts in business models, such as moving from shopping-then-shipping to shipping-then-shopping.

The lockdowns and urgency caused by the COVID-19 pandemic accelerated the adoption of these technologies in different sectors, including remote work tools, robotic process automation, AI-powered drug research, and factory automation.

"The pandemic has been horrific in so many ways, but another thing it's done is it's accelerated the digitization of the economy, compressing in about 20 weeks what would have taken maybe 20 years of digitization," Brynjolfsson says. "We've all invested in technologies that are allowing us to adapt to a more digital world. We're not going to stay as remote as we are now, but we're not going all the way back either. And that increased digitization of business processes and skills compresses the timeframe for us to adopt these new ways of working and ultimately drive higher productivity."

The productivity potential of machine learning technologies has one big caveat.

"Historically, when these new technologies become available, they don't immediately translate into productivity growth. Often there's a period where productivity declines, where there's a lull," Brynjolfsson says. "And the reason there's this lull is that you need to reinvent your organizations, you need to develop new business processes."

Brynjolfsson calls this the Productivity J-Curve and has documented it in a paper published in the American Economic Journal: Macroeconomics. Basically, the great potential caused by new general-purpose technologies like the steam engine, electricity, and more recently machine learning requires fundamental changes in business processes and workflows, the co-invention of new products and business models, and investment in human capital.

These investments and changes often take several years, and during this period, they don't yield tangible results. During this phase, the companies are creating intangible assets, according to Brynjolfsson. For example, they might be training and reskilling their workforce to employ these new technologies. They might be redesigning their factories or instrumenting them with new sensor technologies to take advantage of machine learning models. They might need to revamp their data infrastructure and create data lakes on which they can train and run ML models.

These efforts might cost millions of dollars (or billions in the case of large corporations) and make no change in the company's output in the short term. At first glance, it seems that costs are increasing without any return on investment. When these changes reach their turning point, they result in a sudden increase in productivity.

"We're in this period right now where we're making a lot of that painful transition, restructuring work, and there's a lot of companies that are struggling with that," Brynjolfsson says. "But we're working through that, and these J-curves will lead to higher productivity; according to our research, we're near the bottom and turning up."

Unfortunately, adapting to AI and other new digital technologies does not run on a predictable path. Most firms aren't making the transition correctly or lack the creativity and understanding to make the transition. Various studies show that most applied machine learning projects fail.

"Only about the top 10-15 percent of firms are doing most of the investment in these intangibles. The other 85-90 percent of firms are lagging behind and are hardly making any of the restructuring needed," Brynjolfsson says. "This is not just the big tech firms. This is within every industry: manufacturing, retail, finance, resources. In each category, we're seeing the leading firms pulling away from the rest. There's a growing performance gap."

But while adopting new technologies is going to be difficult, it is happening at a much faster pace in comparison to previous cycles of technological advances because we are better prepared to make the transition.

"I think what is becoming clear is that it's going to happen a lot faster in part because we have a much more professional class of people trying to study what works and what doesn't work," Brynjolfsson says. "Some of them are in business schools and academia. A lot of them are in consulting companies. Some of them are journalists. And there are people who are describing which practices work and which don't."

Another element that can help immensely is the availability of machine learning and data science tools to process and study the huge amounts of data available on organizations, people, and the economy.

For example, Brynjolfsson and his colleagues are working on a big dataset of 200 million job postings, which include the full text of the job description along with other information. Using different machine learning models and natural language processing techniques, they can transform the job posts into numerical vectors that can then be used for various tasks.

"We think of all the jobs as this mathematical space. We can understand how they relate to each other," Brynjolfsson says.

For example, they can make simple inferences, such as how similar or different two or more job posts are based on their text descriptions. They can use other techniques, such as clustering and graph neural networks, to draw more important conclusions, such as which skills are most in demand, or how the characteristics of a job post would change if you modified the description to add AI skills such as Python or TensorFlow. Companies can use these models to find holes in their hiring strategies or to analyze the hiring decisions of their competitors and leading organizations.
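As a rough illustration of the approach, the sketch below embeds a few invented job postings as vectors and compares them; TF-IDF stands in for whatever representation the researchers actually use.

```python
# Minimal sketch of embedding job postings as vectors and comparing them.
# TF-IDF is a stand-in representation; the postings are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

postings = [
    "Data analyst: SQL, Excel, dashboards, reporting",
    "Machine learning engineer: Python, TensorFlow, model deployment",
    "ML researcher: Python, PyTorch, publications, experimentation",
]

vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform(postings)      # each posting becomes a sparse vector

# Pairwise similarity in this "space of jobs": nearby postings require adjacent skills.
print(cosine_similarity(vectors).round(2))

# Re-vectorizing an edited description shows how adding AI skills moves a posting
# closer to the machine learning roles.
edited = vectorizer.transform(["Data analyst: SQL, Excel, Python, TensorFlow"])
print(cosine_similarity(edited, vectors).round(2))
```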

"Those kinds of tools just didn't exist as recently as five years ago, and I think it's a revolution that is just as important as the microscope or some of the other revolutions in science," Brynjolfsson says. "We now have them for social sciences and business to have this kind of visibility. That's allowing us to make a transition a lot more rapidly than before."

However, Brynjolfsson warns that not many companies are using these kinds of tools. This is perhaps further testament to his previous point that companies have not yet figured out the right transition strategy and are relying on old methods to restructure and adapt themselves to the age of AI. And at the center of this strategy should be the correct use of human capital.

"You have hundreds of billions of dollars of human capital, of skills walking out the door, and then the company tries to hire back people with the skills that they need. What they don't realize is that the workers that they let go often had skills that were very adjacent to the ones they're hiring for," Brynjolfsson says.

With the help of machine learning, they will have better visibility and knowledge of their skill adjacencies, Brynjolfsson says. For example, a company might discover that instead of laying off a bunch of people and looking to hire new talent, perhaps all they need to do is a little bit of retraining and repurposing of their workforce.

"It's much more expensive to hire somebody fresh than it would have been for them to take some of those people who are already in the company and say, if we teach you Python or customer service skills or other skills, you can be doing this job that we're looking to hire people for," Brynjolfsson says. "My hope is that, in the coming decade, workers will be in a much better position to take full advantage of their capabilities and skills. And it will be good for the companies too to understand all the assets that they have in there, and machine learning can help a lot with understanding those relationships."


TUI adds machine learning to optimize its shared-transfer platform – PhocusWire

Global tourism company TUI Group is partnering with Boston-based Mobi Systems to improve the transportation services it provides to customers around the world.

TUI Group says it sold more than 31 million transfers in 2019, moving customers between airports, hotels and points of interest.

Starting this month in Mallorca and then rolling out worldwide, TUI is using a new platform for managing shared transportation, such as large and small buses, shuttles and cars, that is integrated with Mobi Systems' machine-learning technology.

The system uses TUI's customer booking data, such as flights, hotels and number of customers, along with data about flight delays, traffic, weather and vehicle inventory, to calculate the most efficient transfer plan, updating it in real time and automatically communicating the current route and timing to bus companies, drivers and travelers through the TUI app.
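The planning problem Mobi solves is far richer than this, but a toy sketch conveys the basic batching idea: group arriving bookings into shared shuttles so vehicles are filled without exceeding capacity. All bookings, hotels, and capacity figures are invented.

```python
# Toy sketch of batching airport arrivals into shared shuttles.
# Real systems also weigh traffic, delays, and fleet constraints; data here is invented.
bookings = [
    {"name": "A", "arrival": "10:05", "hotel": "Playa Sol", "pax": 2},
    {"name": "B", "arrival": "10:10", "hotel": "Playa Sol", "pax": 3},
    {"name": "C", "arrival": "10:20", "hotel": "Marina Bay", "pax": 4},
    {"name": "D", "arrival": "10:25", "hotel": "Playa Sol", "pax": 5},
]
SHUTTLE_CAPACITY = 8

# Greedy plan: fill shuttles in arrival order, starting a new vehicle when full.
shuttles, current, load = [], [], 0
for b in sorted(bookings, key=lambda b: b["arrival"]):
    if load + b["pax"] > SHUTTLE_CAPACITY:
        shuttles.append(current)
        current, load = [], 0
    current.append(b)
    load += b["pax"]
if current:
    shuttles.append(current)

for i, s in enumerate(shuttles, 1):
    stops = sorted({b["hotel"] for b in s})
    print(f"Shuttle {i}: {sum(b['pax'] for b in s)} pax, stops: {', '.join(stops)}")
```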


"One of the key areas that has always been a source of tension for our guests ... is the airport and the transfer," says Peter Ulwahn, chief digital officer of TUI Musement, the tours and activities division of TUI Group.

"We now have for the first time a technology that can showcase the time to the first hotel, the number of hotels they are stopping at, if their bus is delayed. What we were aiming for was an Uber-style information kind of service that our customers have been getting used to with all the ride-sharing services."

In addition to reducing stress for travelers, Ulwahn says Mobi's machine-learning technology automatically recalculates routes as needed, eliminating time-consuming manual processes and reducing operating costs and CO2 emissions through better vehicle optimization and routing.

"Integrating new technologies, such as machine learning, helps ensure we deliver the best customer experience through having a faster, more stable and more accurate platform," Ulwahn says.

"Our transfer scheduling is already automated, but with Mobi it will be faster: what previously took hours can be done in seconds, and it will continue to become even more efficient. The huge advantage of this system is that it can scale to schedule the millions of transfers we manage, while also enabling us to deliver a personalized customer experience."

The platform is being launched for airport transfers, but Ulwahn says it will eventually be used also for transportation for excursions, multi-day tours and cruise passengers.


Silicon Labs Brings AI and Machine Learning to the Edge with Matter-Ready Platform – inForney.com

AUSTIN, Texas, Jan. 24, 2022 /PRNewswire/ -- Silicon Labs, a leader in secure, intelligent wireless technology for a more connected world, today announced the BG24 and MG24 families of 2.4 GHz wireless SoCs for Bluetooth and Multiple-protocol operations, respectively, and a new software toolkit. This new co-optimized hardware and software platform will help bring AI/ML applications and wireless high performance to battery-powered edge devices. Matter-ready, the ultra-low-power BG24 and MG24 families support multiple wireless protocols and incorporate PSA Level 3 Secure Vault protection, ideal for diverse smart home, medical and industrial applications. The SoC and software solution for the Internet of Things (IoT) announced today includes:

"The BG24 and MG24 wireless SoCs represent an awesome combination of industry capabilities including broad wireless multiprotocol support, battery life, machine learning, and security for IoT Edge applications," said Matt Johnson, CEO of Silicon Labs.

First Integrated AI/ML Acceleration Improves Performance and Energy Efficiency

IoT product designers see the tremendous potential of AI and machine learning to bring even greater intelligence to edge applications like home security systems, wearable medical monitors, sensors monitoring commercial facilities and industrial equipment, and more. But today, those considering deploying AI or machine learning at the edge are faced with steep penalties in performance and energy use that may outweigh the benefits.

The BG24 and MG24 alleviate those penalties as the first ultra-low powered devices with dedicated AI/ML accelerators built-in. This specialized hardware is designed to handle complex calculations quickly and efficiently, with internal testing showing up to a 4x improvement in performance along with up to a 6x improvement in energy efficiency. Because the ML calculations are happening on the local device rather than in the cloud, network latency is eliminated for faster decision-making and actions.

The BG24 and MG24 families also have the largest Flash and random access memory (RAM) capacities in the Silicon Labs portfolio. This means that the device can evolve for multi-protocol support, Matter, and trained ML algorithms for large datasets. PSA Level 3-Certified Secure Vault™, the highest level of security certification for IoT devices, provides the security needed in products like door locks, medical equipment, and other sensitive deployments where hardening the device from external threats is paramount.

To learn more about the capabilities of the BG24 and MG24 SoCs and view a demo on how to get started, register for the instructional Tech Talk "Unboxing the new BG24 and MG24 SoCs" here: https://www.silabs.com/tech-talks.

AI/ML Software and Matter-Support Help Designers Create for New Innovative Applications

In addition to natively supporting TensorFlow, Silicon Labs has partnered with some of the leading AI and ML tools providers, like SensiML and Edge Impulse, to ensure that developers have an end-to-end toolchain that simplifies the development of machine learning models optimized for embedded deployments of wireless applications. Using this new AI/ML toolchain with Silicon Labs's Simplicity Studio and the BG24 and MG24 families of SoCs, developers can create applications that draw information from various connected devices, all communicating with each other using Matter to then make intelligent machine learning-driven decisions.

For example, in a commercial office building, many lights are controlled by motion detectors that monitor occupancy to determine if the lights should be on or off. However, when typing at a desk with motion limited to hands and fingers, workers may be left in the dark when motion sensors alone cannot recognize their presence. By connecting audio sensors with motion detectors through the Matter application layer, the additional audio data, such as the sound of typing, can be run through a machine-learning algorithm to allow the lighting system to make a more informed decision about whether the lights should be on or off.

ML computing at the edge enables other intelligent industrial and home applications, including sensor-data processing for anomaly detection, predictive maintenance, audio pattern recognition for improved glass-break detection, simple-command word recognition, and vision use cases like presence detection or people counting with low-resolution cameras.

Alpha Program Highlights Variety of Deployment Options

More than 40 companies representing various industries and applications have already begun developing and testing this new platform solution in a closed Alpha program. These companies have been drawn to the BG24 and MG24 platforms by their ultra-low power, advanced features, including AI/ML capabilities and support for Matter. Global retailers are looking to improve the in-store shopping experience with more accurate asset tracking, real-time price updating, and other uses. Participants from the commercial building management sector are exploring how to make their building systems, including lighting and HVAC, more intelligent to lower owners' costs and reduce their environmental footprint. Finally, consumer and smart home solution providers are working to make it easier to connect various devices and expand the way they interact to bring innovative new features and services to consumers.

Silicon Labs' Most Capable Family of SoCs

The single-die BG24 and MG24 SoCs combine a 78 MHz ARM Cortex-M33 processor, high-performance 2.4 GHz radio, industry-leading 20-bit ADC, an optimized combination of Flash (up to 1536 kB) and RAM (up to 256 kB), and an AI/ML hardware accelerator for processing machine learning algorithms while offloading the ARM Cortex-M33, so applications have more cycles to do other work. Supporting a broad range of 2.4 GHz wireless IoT protocols, these SoCs incorporate the highest security with the best RF performance/energy-efficiency ratio in the market.

Availability

EFR32BG24 and EFR32MG24 SoCs in 5 mm x 5 mm QFN40 and 6 mm x 6 mm QFN48 packages are shipping today to Alpha customers and will be available for mass deployment in April 2022. Multiple evaluation boards are available to designers developing applications. Modules based on the BG24 and MG24 SoCs will be available in the second half of 2022.

To learn more about the new BG24 family, go to: http://silabs.com/bg24.

To learn more about the new MG24 family, go to: http://silabs.com/mg24.

To learn more about how Silicon Labs supports AI and ML, go to: http://silabs.com/ai-ml.

About Silicon Labs

Silicon Labs (NASDAQ: SLAB) is a leader in secure, intelligent wireless technology for a more connected world. Our integrated hardware and software platform, intuitive development tools, unmatched ecosystem, and robust support make us an ideal long-term partner in building advanced industrial, commercial, home, and life applications. We make it easy for developers to solve complex wireless challenges throughout the product lifecycle and get to market quickly with innovative solutions that transform industries, grow economies, and improve lives. Silabs.com

View original content to download multimedia: https://www.prnewswire.com/news-releases/silicon-labs-brings-ai-and-machine-learning-to-the-edge-with-matter-ready-platform-301466032.html

SOURCE Silicon Labs


How digital twins expand the scope of deep learning applications – Analytics India Magazine

Ajinkya Bhave, Country Head (India) at Siemens Engineering Services, discussed the rising significance of simulated data in his talk at the MLDS conference, titled "Simulation-driven Machine Learning." He described how simulated data can be used to train machine learning models in situations where physical data is impossible to obtain. "At Siemens, the tool connects simulation models and data to train frameworks for ML to train the model at scale using the digital twin," he explained.

He outlined the challenge of the generation and labelling of real-world data and how industries can overcome the hurdles using a digital twin and simulation data. He referred to the Reduced Order Model (ROM), which simplifies a high-fidelity static or dynamical model, preserving essential behaviour and dominant effects to reduce solution time or storage capacity required for the more complex model.

ROM, simulation and digital twins

The reduced-order model helps organisations convert data to models, extend their scope and compute faster. A ROM can run your digital twin on embedded devices, in the cloud and on-site. "The basic idea is that the ROM is the catalyst of the digital twin, enabling more applications that weren't possible in the past," he explained.

There are multiple ways to create a ROM, depending on your application area, data, and the system being modelled. The model can be anywhere from data-driven, built with machine learning and deep learning, to hybrid, combining statistical models and physics, to a completely physics-based model. "You cannot create a model without domain knowledge that you encapsulate in it. But, equally important, the data matters. All the models require some amount of data," he said.

To create ROMs based on a neural network approach, the data can become either a stopping point or an advantage. At Siemens, the team either augments existing physical data, creates synthetic data, or cleans and labels existing data.
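As a minimal illustration of the data-driven end of that spectrum, the sketch below builds a reduced-order model with proper orthogonal decomposition (POD): a handful of dominant modes extracted from full-order simulation snapshots stand in for thousands of degrees of freedom. The snapshot matrix and sizes are invented; this is not Siemens' tooling.

```python
# Data-driven ROM sketch via proper orthogonal decomposition (POD).
import numpy as np

# Snapshots: each column is the full-order state (e.g. 10,000 DOFs) at one time step.
n_dof, n_snapshots = 10_000, 200
snapshots = np.random.rand(n_dof, n_snapshots)

# POD basis from the left singular vectors; keep the r dominant modes.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
r = 10
basis = U[:, :r]                       # (n_dof, r) projection basis

# Project a full-order state into the reduced space and reconstruct it.
x_full = snapshots[:, 0]
x_reduced = basis.T @ x_full           # r coefficients instead of 10,000 values
x_approx = basis @ x_reduced           # cheap reconstruction for the digital twin

rel_error = np.linalg.norm(x_full - x_approx) / np.linalg.norm(x_full)
print(f"kept {r} modes, relative reconstruction error: {rel_error:.3f}")
```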

Simulation plays a huge role in connecting machine learning to the digital twin model. Ajinkya explored this connection through real-world case studies.

Case study 1: Applying synthetic data to deploy the machine in real-world scenarios

Ajinkya walked the audience through a case study of a Siemens client that makes gearboxes for wind turbines. The turbines break down due to failures in the gearboxes and ball bearings, so the company turned to predictive monitoring to minimise downtime. While the customer had plenty of data, it did not have the distribution needed: most of the data described healthy operation, with only one-off fault anomalies. To balance the distribution, Siemens used 1D and 3D tools to model the gearbox and the ball bearings around the gears in the company's multiphysics tool for 1D modelling. The model and its parts were simulated as a nonlinear spring-mass-damper system, with some parameters based on real data and others tuned. Fault injections matching the failure modes the customer was looking for were then applied to the model, producing synthetic time series. Next, statistical noise was injected to bring the output closer to real measured signals. Siemens combined the noisy signals into time-series datasets and ran them through a neural network to identify faults.

"The idea was that we created synthetic training data, which was then used to train a neural network on a digital twin of the model. Then we tested that on the real faults which occur in the ball bearings of the gearboxes, with the physical data. The graph showed us the prediction was pretty accurate," he said. The model, trained on synthetic data from a well-tuned simulation model, produced training data good enough for the machine learning algorithm to predict those faults in real-world data in a real-world deployment.
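The general pattern, simulate a simple physical model, inject a fault, add statistical noise, and train a classifier on the resulting time series, can be sketched in a few lines. The spring-mass-damper parameters, fault definition, and classifier below are invented stand-ins, not the actual gearbox model.

```python
# Synthetic-data sketch: simulate a spring-mass-damper, inject a "fault" as a
# stiffness change, add noise, and train a classifier on the time series.
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def simulate(k, m=1.0, c=0.3, t_end=10.0, n=500):
    # x'' = -(c/m) x' - (k/m) x, started from a unit displacement.
    def rhs(t, y):
        return [y[1], -(c / m) * y[1] - (k / m) * y[0]]
    t = np.linspace(0.0, t_end, n)
    sol = solve_ivp(rhs, (0.0, t_end), [1.0, 0.0], t_eval=t)
    return sol.y[0]

rng = np.random.default_rng(0)
signals, labels = [], []
for _ in range(200):
    healthy = rng.random() < 0.5
    k = rng.normal(4.0, 0.1) if healthy else rng.normal(2.5, 0.1)  # fault = softer "spring"
    x = simulate(k) + rng.normal(0.0, 0.05, 500)                   # statistical noise injection
    signals.append(x)
    labels.append(0 if healthy else 1)

X_train, X_test, y_train, y_test = train_test_split(
    np.array(signals), np.array(labels), test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("held-out accuracy on synthetic faults:", clf.score(X_test, y_test))
```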

Case study 2: Model predictive control

MPC algorithms need accurate, high-fidelity plant models, but these are not always available. To that end, a virtual model of the plant is created using a black-box or grey-box modelling approach. The virtual model either represents the complete system as the plant, or a sub-system in the form of a virtual sensor for the parts of the plant that are not measurable. The neural-network-based sensor infers, from the physical measurements and a model of the subsystem, the states the controller needs, and these estimates are then given to the MPC. "You have augmented the physical plant along with the unobservable data using a simulation-based approach to help the controller do better than it would have with only the physical model of the plant," he said.

The ROM and synthetic data can additionally be applied to the neural-network model of the plant inside the MPC, for example in model-based reinforcement learning, autonomous driving and factory robots, giving the controller a fast, reduced-order model of the plant to optimise against.
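A minimal sketch of the virtual-sensor pattern is shown below: a small neural network estimates an unmeasured state of a toy linear plant from measurable signals, and its estimate feeds a simple receding-horizon controller. The plant, network, horizon, and cost are assumptions for illustration, not Siemens' MPC implementation.

```python
# Virtual sensor + receding-horizon control on a toy 2-state linear plant.
# State x1 is measured; x2 is not and is estimated by an MLP.
import numpy as np
from scipy.optimize import minimize
from sklearn.neural_network import MLPRegressor

A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [0.5]])

# 1) Train the virtual sensor offline on simulated (measurable -> unmeasured) pairs.
rng = np.random.default_rng(1)
X_feat, y_tgt = [], []
x = np.zeros(2)
for _ in range(3000):
    u = rng.uniform(-1.0, 1.0)
    x_new = A @ x + B.flatten() * u
    X_feat.append([x_new[0], x[0], u])   # current x1, previous x1, applied input
    y_tgt.append(x_new[1])               # target: the unmeasured state x2
    x = x_new
sensor = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=1)
sensor.fit(np.array(X_feat), np.array(y_tgt))

# 2) Receding-horizon controller using the estimated full state.
def mpc_step(x_est, setpoint=1.0, horizon=10):
    def cost(u_seq):
        xk, c = x_est.copy(), 0.0
        for u in u_seq:
            xk = A @ xk + B.flatten() * u
            c += (xk[0] - setpoint) ** 2 + 0.01 * u ** 2
        return c
    res = minimize(cost, np.zeros(horizon), bounds=[(-1.0, 1.0)] * horizon)
    return res.x[0]                      # apply only the first move

x_true = np.zeros(2)
prev_x1, last_u = 0.0, 0.0
for _ in range(30):
    x2_hat = sensor.predict([[x_true[0], prev_x1, last_u]])[0]   # virtual sensor
    u = mpc_step(np.array([x_true[0], x2_hat]))
    prev_x1 = x_true[0]
    x_true = A @ x_true + B.flatten() * u
    last_u = u
print("final measured output x1:", round(float(x_true[0]), 3))
```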

Case study 3: Predictive maintenance of pole-mounted transformers

The last case study was about pole-mounted transformers, which step the voltage from high-tension lines down to 230 V for the safe operation of household appliances. However, given India's diverse temperature conditions, such transformers are a fire risk. An identified cause is falling oil levels between the transformer coils, which causes the unit to overheat or spark. To monitor the oil levels of different transformers, the normalised twin concept is used. Siemens retrofitted the transformer infrastructure with a Siemens box containing four temperature sensors and a cloud-based router that sends the measurements periodically to the cloud.

This allowed Siemens to infer the oil level, specialise the normalised digital twin for that transformer model, and use the live twin to virtually estimate the oil levels. Although this is still an ongoing project, the digital twin built with simulated data has been parameterised and fine-tuned with real parameters from the field.
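One way to picture the virtual estimation step is a regression model that maps the retrofitted temperature readings to an oil-level estimate after being fitted against the twin. Everything below, the sensor layout, the temperature-to-oil-level relationship, and the data, is hypothetical and only illustrates the idea.

```python
# Hypothetical oil-level estimator: regression from four temperature readings.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
n = 2000
ambient = rng.uniform(15, 45, n)                       # deg C
oil_level = rng.uniform(0.3, 1.0, n)                   # fraction of nominal fill

# Invented twin relationship: lower oil -> poorer cooling -> hotter hot spots.
hot_spot = ambient + 30 * (1.2 - oil_level) + rng.normal(0, 1.0, n)
sensors = np.column_stack([
    hot_spot,
    hot_spot - rng.uniform(2, 5, n),                   # top-of-tank sensor
    hot_spot - rng.uniform(5, 10, n),                  # mid-tank sensor
    ambient + rng.normal(0, 0.5, n),                   # ambient reference sensor
])

model = GradientBoostingRegressor().fit(sensors, oil_level)

# "Live twin"-style query for a new set of field readings.
reading = np.array([[78.0, 74.5, 70.0, 38.0]])
print("estimated oil level (fraction of nominal):", round(model.predict(reading)[0], 2))
```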

Lastly, Ajinkya discussed a generative design case study focusing on CFD simulations. ML can be used to adaptively learn how likely a simulation run is to succeed, cutting a process that takes hours down to mere minutes.
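One way such a speed-up can work is to train a classifier on past runs that predicts whether a candidate design will produce a successful simulation, and to skip the unpromising candidates before spending solver time on them. The sketch below illustrates that idea with invented features and labels; it is not the actual Siemens workflow.

```python
# Predicting simulation-run success from design/solver parameters.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
# Past runs: candidate parameters (e.g. inlet angle, mesh density, relaxation factor).
X_past = rng.uniform(0.0, 1.0, size=(500, 3))
# Invented ground truth: runs with too coarse a mesh tended to diverge.
y_past = (X_past[:, 1] + 0.05 * rng.normal(size=500) > 0.3).astype(int)

success_model = LogisticRegression().fit(X_past, y_past)

# Score freshly generated candidates and only simulate the confident ones.
candidates = rng.uniform(0.0, 1.0, size=(10, 3))
p_success = success_model.predict_proba(candidates)[:, 1]
to_simulate = candidates[p_success > 0.8]
print(f"simulating {len(to_simulate)} of {len(candidates)} generated candidates")
```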

Here is the original post:
How digital twins expand the scope of deep learning applications - Analytics India Magazine

Getting a Read on Responsible AI – The UCSB Current

There is great promise and potential in artificial intelligence (AI), but if such technologies are built and trained by humans, are they capable of bias?

Absolutely, says William Wang, the Duncan and Suzanne Mellichamp Chair in Artificial Intelligence and Designs at UC Santa Barbara, who will give the virtual talk "What is Responsible AI" at 4 p.m. Tuesday, Jan. 25, as part of the UCSB Library's Pacific Views speaker series.

"The key challenge for building AI and machine learning systems is that when such a system is trained on datasets with limited samples from history, it may gain knowledge from protected variables (e.g., gender, race, income) and is prone to produce biased outputs," said Wang, who is also director of UC Santa Barbara's Center for Responsible Machine Learning.

"Sometimes these biases could lead to a rich-get-richer phenomenon after the AI systems are deployed," he added. "That's why, in addition to accuracy, it is important to conduct research into fair and responsible AI systems, including the definition of fairness and the measurement, detection and mitigation of biases in AI systems."
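One of the simplest fairness measurements of the kind Wang describes is the demographic parity difference: the gap in positive-outcome rates between groups defined by a protected variable. The sketch below computes it on synthetic data purely for illustration.

```python
# Demographic parity difference on synthetic decisions.
import numpy as np

rng = np.random.default_rng(0)
group = rng.integers(0, 2, 1000)                 # protected attribute (two groups)
# Invented model decisions that slightly favour group 1.
approved = (rng.random(1000) < np.where(group == 1, 0.55, 0.45)).astype(int)

rate_0 = approved[group == 0].mean()
rate_1 = approved[group == 1].mean()
print(f"positive rate, group 0: {rate_0:.2f}")
print(f"positive rate, group 1: {rate_1:.2f}")
print(f"demographic parity difference: {abs(rate_1 - rate_0):.2f}")  # 0 would be parity
```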

Wang's examination of the topic serves as the kickoff event for UCSB Reads 2022, the campus- and community-wide reading program run by UCSB Library. The new season is centered on Ted Chiang's Exhalation, a short story collection that addresses essential questions about human and computer interaction, including the use of artificial intelligence.

Copies of Exhalation will be distributed free to students (while supplies last) on Tuesday, Feb. 1 outside the Library's West Paseo entrance. Additional events announced so far include on-air readings from the book on KCSB, a faculty book discussion moderated by physicist and professor David Weld, and a sci-fi writing workshop. It all culminates May 10 with a free lecture by Ted Chiang in Campbell Hall.

First, though: William Wang, an associate professor of computer science and co-director of the Natural Language Processing Group.

"In this talk, my hope is to summarize the key advances of artificial intelligence technologies in the last decade, and share how AI can bring us an exciting future," he noted. "I will also describe the key challenges of AI: how we should consider the research and development of responsible AI systems, which not only optimize their accuracy performance, but also provide a human-centric view to consider fairness, bias, transparency and energy efficiency of AI systems."

"How do we build AI models that are transparent? How do we write AI system descriptions that meet disclosive transparency guidelines? How do we consider energy efficiency when building AI models?" he asked. "The future of AI is bright, but all of these are key aspects of responsible AI that we need to address."

See more here:
Getting a Read on Responsible AI | The UCSB Current - The UCSB Current

Machine learning reduced workload for the Cochrane COVID-19 Study Register: development and evaluation of the Cochrane COVID-19 Study Classifier -…

This article was originally published here

Syst Rev. 2022 Jan 22;11(1):15. doi: 10.1186/s13643-021-01880-6.

ABSTRACT

BACKGROUND: This study developed, calibrated and evaluated a machine learning (ML) classifier designed to reduce study identification workload in maintaining the Cochrane COVID-19 Study Register (CCSR), a continuously updated register of COVID-19 research studies.

METHODS: An ML classifier for retrieving COVID-19 research studies (the Cochrane COVID-19 Study Classifier) was developed using a data set of title-abstract records included in, or excluded from, the CCSR up to 18th October 2020, manually labelled by information and data curation specialists or the Cochrane Crowd. The classifier was then calibrated using a second data set of similar records included in, or excluded from, the CCSR between October 19 and December 2, 2020, aiming for 99% recall. Finally, the calibrated classifier was evaluated using a third data set of similar records included in, or excluded from, the CCSR between the 4th and 19th of January 2021.

RESULTS: The Cochrane COVID-19 Study Classifier was trained using 59,513 records (20,878 of which were included in the CCSR). A classification threshold was set using 16,123 calibration records (6005 of which were included in the CCSR) and the classifier had a precision of 0.52 in this data set at the target threshold recall >0.99. The final, calibrated COVID-19 classifier correctly retrieved 2285 (98.9%) of 2310 eligible records but missed 25 (1%), with a precision of 0.638 and a net screening workload reduction of 24.1% (1113 records correctly excluded).

CONCLUSIONS: The Cochrane COVID-19 Study Classifier reduces manual screening workload for identifying COVID-19 research studies, with a very low and acceptable risk of missing eligible studies. It is now deployed in the live study identification workflow for the Cochrane COVID-19 Study Register.

PMID:35065679 | DOI:10.1186/s13643-021-01880-6
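The calibration step described in the abstract, choosing the lowest score threshold that still achieves roughly 99% recall on a labelled calibration set and then reporting precision and the screening-workload reduction, can be sketched as follows. The scores and labels below are simulated, not the Cochrane data, and the exact procedure may differ from the authors'.

```python
# Threshold calibration for a high-recall screening classifier (simulated data).
import numpy as np

rng = np.random.default_rng(42)
n = 16_000
is_eligible = rng.random(n) < 0.37                       # roughly the CCSR inclusion rate
scores = np.clip(rng.normal(np.where(is_eligible, 0.75, 0.35), 0.15), 0, 1)

def calibrate(scores, labels, target_recall=0.99):
    # Walk candidate thresholds from high to low until recall reaches the target.
    for t in np.sort(np.unique(scores))[::-1]:
        predicted = scores >= t
        recall = (predicted & labels).sum() / labels.sum()
        if recall >= target_recall:
            return t
    return 0.0

t = calibrate(scores, is_eligible)
predicted = scores >= t
precision = (predicted & is_eligible).sum() / predicted.sum()
workload_reduction = (~predicted).sum() / len(scores)    # share of records screened out
print(f"threshold={t:.3f}  precision={precision:.2f}  workload reduction={workload_reduction:.1%}")
```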

Read the original here:
Machine learning reduced workload for the Cochrane COVID-19 Study Register: development and evaluation of the Cochrane COVID-19 Study Classifier -...