
Top online resources to learn Active Learning – Analytics India Magazine

A key requirement of machine learning is correctly labelled data, but labelling is a long and time-consuming process, and the problem grows with the extremely large datasets used in unsupervised or semi-supervised learning. The saviour here is active learning, whose strategies help developers prioritise the data and select the most useful samples to label, those with the highest training impact. Furthermore, it promises to reduce the number of labelled samples needed by choosing the right examples.

Various strategies can be used depending on the application and the needs of the model; a minimal sketch of one common strategy, uncertainty sampling, follows below. When it comes to learning active learning, the practice is generally taught only as a small part of bigger machine learning modules, which is why we have created a one-stop guide to mastering active learning online through resources ranging from video tutorials to blog posts and academic papers.
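To make the idea concrete, here is a minimal, hypothetical uncertainty-sampling sketch in Python, using scikit-learn and synthetic data; the dataset, seed size and query budget are placeholders, not drawn from any of the resources below:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: a small labelled seed set and a large unlabelled pool
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_labelled, y_labelled = X[:20], y[:20]
X_pool = X[20:]

# Train on the seed set, then score the pool by how unsure the model is
model = LogisticRegression(max_iter=1000).fit(X_labelled, y_labelled)
probs = model.predict_proba(X_pool)           # class probabilities for every pool sample
uncertainty = 1.0 - probs.max(axis=1)         # least-confidence score
query_idx = np.argsort(uncertainty)[-10:]     # the 10 samples worth sending to an annotator
print("Pool indices to label next:", query_idx)

In a full active learning loop, the newly labelled samples are added to the seed set and the model is retrained until the labelling budget runs out.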

YouTube

Computerphile is a popular YouTube channel that discusses computer science-related topics. Their tutorial on active learning is taught by Dr Michel Valstar, who holds a PhD in Computing and is currently a professor at the University of Nottingham. The tutorial is a foundational element for the basics of active learning, taught through diagrams and illustrations of the concepts.

ICML, the International Conference on Machine Learning, is one of the fastest-growing AI conferences discussing the latest academic papers. During their 2019 conference, Robert Nowak and Steve Hanneke taught the basics of active learning theory and the popular algorithms to apply (the video is now available online). The tutorial focuses on sound active learning algorithms and how they can be used to reduce the number of labels needed for training data. Robert Nowak holds the Nosbusch Professorship in Engineering at the University of Wisconsin-Madison. Steve Hanneke is a Research Assistant Professor at the Toyota Technological Institute at Chicago, specialising in AI and ML.

Applied AI is a great resource for learning AI/ML online through core concepts and real-life applications. The channel's videos have collectively crossed 12 million views, and it is popular for its thorough teaching of basic concepts. Its tutorial on active learning in ML breaks down the principles of the concept along with real-life examples and mathematical explanations.

PyData is an educational program of NumFOCUS, a US-based not-for-profit organisation that provides a forum for the international data science community to share ideas through conferences. Speaking at one of their events is Jan Freyberg, a machine learning software engineer at Google Health. In a detailed talk, Freyberg discusses active learning in the interactive Python environment, given the ease and comfort of working in that ecosystem.

Devansh is a Computer Science and Computational Math double major at the Rochester Institute of Technology. Through this YouTube tutorial, he comprehensively discusses the basics of active learning, how it works, and how it compares to semi-supervised learning (SSL) and GANs. He further explains the concept in detail, including its uses and active learning's acquisition function.

Ranji Raj, who holds a master's degree in data science, takes to YouTube to publish tutorials and classwork related to machine learning. His video on active learning gives an in-depth introduction to the subject while discussing important concepts through diagrams and demonstrations. Raj also has accompanying coursework on his GitHub page for data scientists interested in learning further.

Scaleway is a French cloud computing company that creates YouTube videos consisting of short machine learning tutorials and real-world applications. In their webinar on active learning, the company collaborated with Kairntech, an AI modelling and dataset creation platform, to discuss the various applications of active learning. The video discusses training datasets and how active learning can be applied for classification. It also goes over common issues and how to overcome them.

Blog tutorials

Ori Cohen is a PhD holder in CS, currently working as a senior director of data science at New Relic. His Towards Data Science blog post on active learning is an extensive tutorial that discusses the various scenarios possible while using active learning, the algorithms that can be used, the sample selection methods, and the code used for each.

A blog post on DataCamp, an online interactive learning platform, explains the A-Zs of active learning in depth at a moderate level of difficulty. The tutorial discusses the concept in detail with definitions, examples and visuals, and teaches how one can apply active learning to their own datasets through a particular example.

Written by a CS and EE student at IIT, India, this post is an in-depth tutorial on using active learning with Python. The tutorial is technical, explaining the concepts through code and step-by-step examples. In addition, the post discusses the various inputs, outputs, and Python code needed to apply active learning correctly.

Alexandre Abraham, a senior research scientist at Dataiku and a PhD holder in computer science, has written an extensive tutorial on active learning packages in a Medium blog post. The post analyses the available active learning packages through a feature comparison, the approaches they cover, and their coding aspects. There are three main packages and different methods that data scientists can leverage.

Papers

The paper in discussion is written by Kai Wei, an assistant professor at UCLA, Rishabh Iyer, an assistant professor at the University of Texas, and Jeff Bilmes, a professor at the University of Washington. Their paper studies the problem of selecting a subset of data to train a classifier and how individuals can apply the active learning framework to mitigate the issue.

Online courses

The DeepLearning.AI course on the ML data lifecycle has a fourth module, titled Advanced Labeling, Augmentation and Data Preprocessing, that focuses on semi-supervised learning, dataset labelling, and the role played by active learning within them. The instructor, Robert Crowe, works on TensorFlow at Google and has multiple degrees in AI, ML and data science.

Link:

Top online resources to learn Active Learning - Analytics India Magazine

Read More..

IT Sligo helping to close the data science skills gap with flexible, online learning – The Irish Times

After 20 years working in software engineering across a variety of industries and government departments, Darragh Sherwin, a development team lead at Overstock Ireland, and a current student at IT Sligo, noticed that he had a skills gap compared to his colleagues, following a move to a different department within Overstock.

"My team recently moved into the algorithms department of Overstock, which employs a lot of data scientists, machine learning (ML) scientists and ML engineers. The move highlighted a skills deficit that I had in Data Science. I had always had it in the back of my head to continue further education, and studying the part-time, online Masters in Data Science at IT Sligo felt like a natural alignment. I was moving into a department where data is crucial and there are moves across all industries to be more data-driven," says Sherwin.

Data Science is now the backbone of any industry and current trends indicate that it will accumulate even more importance in the coming years. If businesses want to succeed, it is critical that they bank on data science to make data-informed decisions based on insights and trends.

Due to the ever-growing importance of data, data scientists are now in high demand and IT Sligo are helping to close the current skills deficit in data science with their part-time, online Masters in Data Science.

The course includes a combination of statistical analysis, modelling, machine learning and data visualisation. Applicable to any industry, it combines techniques from mathematics, statistics, information theory, computer science and artificial intelligence. Automated driving, consumer buying habits, medical imaging, business intelligence, fraud/risk detection and speech recognition are just a few applications. Masters students will design data analytic techniques, interpret, and manage big data using software as well as machine learning, and probabilistic and statistical methods.

On why he chose the course from IT Sligo, Sherwin says: "Overstock has worked closely with IT Sligo; we hire graduates from IT Sligo and they are always of impressively high calibre. The course modules aligned with my understanding of the area and give a good foundation to students."

Relevant to engineers who require upskilling in data science or those who have already qualified with a Level 8 honours degree in Computer Science (or related disciplines), this masters is offered part-time and online with live lectures in the evening. You can study anywhere and in your own time.

"It is great to have the flexibility of studying online. I have a young child, so if I need to miss a lecture, I can come back and watch it later. There are great online resources for learning, and the college seems to have paid particular attention to ensuring online studying is very smooth with tools like Moodle and Microsoft Teams," says Sherwin. Choosing to study a part-time, online course allows students to upskill for their career while also working full-time.

A qualification in Data Science can lead to an exciting career in an array of industries including IT, financial services, retail, and manufacturing. These roles are not confined to IT-based positions but can lead to work in business intelligence, analysis, or data warehouse consultancy. The rewards for such positions are also inviting: starting salaries range from €40,000, with senior data scientists commanding annual salaries of more than €100,000.

Online Learning at IT Sligo is ranked number one for Most Flexible Learning for Students in the Good University Guide 2021. With more than 150 online courses available, IT Sligo is Ireland's leading online provider. Through innovative online teaching methods, students anywhere in the world can study and graduate with fully accredited online qualifications matched to industry demand.

Applications are now open for the Masters in Data Science at IT Sligo, starting part-time, online on Monday, January 17th, 2022. Apply here - http://www.itsligo.ie/datascience

Link:

IT Sligo helping to close the data science skills gap with flexible, online learning - The Irish Times

Read More..

Insights on the Healthcare Artificial Intelligence Global Market to 2026 – Featuring Google, IBM and Intel Among Others – Yahoo Finance

Dublin, Jan. 04, 2022 (GLOBE NEWSWIRE) -- The "Healthcare Artificial Intelligence (AI) Market - Global Outlook & Forecast 2021-2026" report has been added to ResearchAndMarkets.com's offering.

The healthcare artificial intelligence market is expected to reach USD 44.5 billion by 2026, growing at a CAGR of 46.21%.

Several pharmaceutical companies are implementing innovative technologies to boost their growth in the global healthcare industry. A collaboration between GSK and Exscientia used an AI platform to identify a small-molecule compound for targeted therapeutics and characterise its behaviour towards the specific target. AI is becoming an incredible platform in the pharmaceutical industry.

For instance, Novartis announced Microsoft as a strategic partner in AI and data science to set up an AI innovation lab. In the past year, more than 50 companies have received approvals for machine learning and AI algorithms. During the COVID-19 pandemic, AI played a significant role in the healthcare industry. An analytics study by Accenture of combined clinical applications demonstrated the potential of AI to save approximately USD 150 billion per annum in the US healthcare system by 2026.

The following factors are likely to contribute to the growth of the healthcare artificial intelligence market during the forecast period:

Increase in patient volume & complexities associated with data fueling demand for AI.

The shrinking operational workforce in healthcare facilities propelling the need for AI.

Technological advancement & innovations in AI influencing end-users in the market.

Rising Investment in advanced drug discovery & development process augmenting the adoption of AI.

Key Highlights

The healthcare providers segment accounted for the largest market share with around 48% compared to others in 2020.

According to the research, the publisher estimated that APAC would witness the highest growth in the healthcare artificial intelligence (AI) market during the forecast period.

The study considers a detailed scenario of the present healthcare artificial intelligence market and its market dynamics for the period 2021-2026. It covers a detailed overview of several market growth enablers, restraints, and trends. The report offers both the demand and supply aspects of the market. It profiles and examines leading companies and other prominent ones operating in the market.


Vendor Analysis

Giant players are focusing on organic growth strategies to enhance their product portfolios in the healthcare artificial intelligence (AI) market. Several initiatives by these players will complement growth strategies that are gaining traction among end-users in the market. The rising number of startups collaborating with key vendors to promote their artificial intelligence in healthcare applications is creating heavy competition in the market.

Key Questions Answered:
1. How big is the healthcare artificial intelligence (AI) market?
2. Which region has the highest share in the healthcare artificial intelligence market?
3. Who are the key players in the healthcare AI market?
4. What are the latest market trends in the healthcare artificial intelligence market?
5. What is the use of AI in the healthcare market?

Key Topics Covered:

1 Research Methodology

2 Research Objectives

3 Research Process

4 Scope & Coverage
4.1 Market Definition: 4.1.1 Inclusions; 4.1.2 Exclusions; 4.1.3 Market Estimation Caveats
4.2 Base Year
4.3 Scope Of The Study: 4.3.1 Market Segmentation By Component; 4.3.2 Market Segmentation By Application; 4.3.3 Market Segmentation By Technology; 4.3.4 Market Segmentation By End-User; 4.3.5 Market Segmentation By Geography

5 Report Assumptions & Caveats
5.1 Key Caveats
5.2 Currency Conversion
5.3 Market Derivation

6 Market At A Glance

7 Introduction
7.1 Healthcare Artificial Intelligence (AI)

8 Market Opportunities & Trends
8.1 Rising Investments In Advanced Drug Discovery & Development Processes
8.2 Mergers, Acquisitions, & Collaborations With Life Science & Medical Device Companies
8.3 Influx/Emergence Of Many Startups In The Healthcare AI Industry

9 Market Growth Enablers
9.1 Increase In Patient Volume & Complexities Associated With Data
9.2 Shrinking Operational Workforce In Healthcare Facilities
9.3 Technological Advancements & Innovations In AI
9.4 Growing Need To Reduce Healthcare Costs Using IT & AI Technologies

10 Market Restraints
10.1 High Installation & Implementation Cost Of AI & Related Platforms
10.2 Lack Of Skilled AI Workforce & Resistance Among Healthcare Professionals
10.3 Stringent & Ambiguous Regulations For Healthcare Software & AI Technologies
10.4 Absence Of Interoperability Among Commercially Available AI Solutions Coupled With Data Privacy Issues

11 Market Landscape
11.1 Market Overview
11.2 Market Size & Forecast
11.3 Five Forces Analysis: 11.3.1 Threat Of New Entrants; 11.3.2 Bargaining Power Of Suppliers; 11.3.3 Bargaining Power Of Buyers; 11.3.4 Threat Of Substitutes; 11.3.5 Competitive Rivalry

12 Component
12.1 Market Snapshot & Growth Engine
12.2 Market Overview
12.3 Hardware: 12.3.1 Market Overview; 12.3.2 Market Size & Forecast; 12.3.3 Hardware: Geography Segmentation
12.4 Software & Services: 12.4.1 Market Overview; 12.4.2 Market Size & Forecast; 12.4.3 Software & Services: Geography Segmentation

13 Application
13.1 Market Snapshot & Growth Engine
13.2 Market Overview
13.3 Hospital Workflow Management: 13.3.1 Market Overview; 13.3.2 Market Size & Forecast; 13.3.3 Hospital Workflow Management: Geography Segmentation
13.4 Medical Imaging & Diagnosis: 13.4.1 Market Overview; 13.4.2 Market Size & Forecast; 13.4.3 Medical Imaging & Diagnosis: Geography Segmentation
13.5 Drug Discovery & Precision Medicine: 13.5.1 Market Overview; 13.5.2 Market Size & Forecast; 13.5.3 Drug Discovery & Precision Medicine: Geography Segmentation
13.6 Patient Management: 13.6.1 Market Overview; 13.6.2 Market Size & Forecast; 13.6.3 Patient Management: Geography Segmentation

14 Technology
14.1 Market Snapshot & Growth Engine
14.2 Market Overview
14.3 Machine Learning: 14.3.1 Market Overview; 14.3.2 Market Size & Forecast; 14.3.3 Machine Learning: Geography
14.4 Querying Method: 14.4.1 Market Overview; 14.4.2 Market Size & Forecast; 14.4.3 Querying Method: Geography Segmentation
14.5 Natural Language Processing: 14.5.1 Market Overview; 14.5.2 Market Size & Forecast; 14.5.3 Natural Language Processing: Geography Segmentation
14.6 Other Technology: 14.6.1 Market Overview; 14.6.2 Market Size & Forecast; 14.6.3 Other Technology: Geography Segmentation

15 End-User
15.1 Market Snapshot & Growth Engine
15.2 Market Overview
15.3 Healthcare Providers: 15.3.1 Market Overview; 15.3.2 Market Size & Forecast; 15.3.3 Healthcare Providers: Geography Segmentation
15.4 Pharma-Biotech & Medical Device Companies: 15.4.1 Market Overview; 15.4.2 Market Size & Forecast; 15.4.3 Pharma-Biotech & Medical Device Companies: Geography Segmentation
15.5 Payers: 15.5.1 Market Overview; 15.5.2 Market Size & Forecast; 15.5.3 Payers: Geography Segmentation
15.6 Others: 15.6.1 Market Overview; 15.6.2 Market Size & Forecast; 15.6.3 Other End User: Market By Geography

16 Geography
16.1 Market Snapshot & Growth Engine
16.2 Geographic Overview

17 North America

18 Europe

19 APAC

20 Latin America

21 Middle East & Africa

22 Competitive Landscape
22.1 Competition Overview
22.2 Market Share Analysis: 22.2.1 Google; 22.2.2 IBM; 22.2.3 Intel; 22.2.4 Medtronic; 22.2.5 Microsoft; 22.2.6 NVIDIA; 22.2.7 Siemens Healthineers

23 Key Company Profiles (each profile covers Business Overview, Product Offerings, Key Strategies, Key Strengths, and Key Opportunities)
23.1 GOOGLE
23.2 INTERNATIONAL BUSINESS MACHINES (IBM)
23.3 INTEL CORPORATION
23.4 MEDTRONIC
23.5 MICROSOFT CORPORATION
23.6 NVIDIA CORPORATION
23.7 SIEMENS HEALTHINEERS

24 Other Prominent Vendors (each profile covers Business Overview and Product Offerings)
24.1 ARTERYS
24.2 CAPTION HEALTH
24.3 ENLITIC
24.4 CATALIA HEALTH
24.5 GENERAL VISION
24.6 PHILIPS
24.7 STRYKER
24.8 SHIMADZU RECURSION PHARMACEUTICALS
24.9 GE HEALTHCARE
24.10 REMEDY MEDICAL
24.11 SUBTLE MEDICAL
24.12 NETBASE QUID
24.13 BIOSYMETRICS
24.14 SENSELY
24.15 INFORMAI
24.16 BIOCLINICA
24.17 OWKIN
24.18 BINAH.AI
24.19 ONCORA MEDICAL
24.20 QURE.AI TECHNOLOGIES
24.21 LUNIT
24.22 CARESYNTAX
24.23 ANJU SOFTWARE
24.24 IMAGIA CYBERNETICS
24.25 DEEP GENOMICS
24.26 WELLTOK INC.
24.27 MDLIVE
24.28 MAXQ AI
24.29 QVENTUS
24.30 WORKFUSION

25 Report Summary
25.1 Key Takeaways
25.2 Strategic Recommendations

26 Quantitative Summary

27 Appendix

For more information about this report visit https://www.researchandmarkets.com/r/it4jn7

See the original post:

Insights on the Healthcare Artificial Intelligence Global Market to 2026 - Featuring Google, IBM and Intel Among Others - Yahoo Finance

Read More..

An Exclusive Interview with Abhishek Rungta, Founder and CEO, INT – Analytics Insight

INT is leveraging big data analytics to provide web development, digital marketing, etc.

Big data analytics offers companies and individuals huge scope to unlock business potential with different kinds of real-time data: structured, unstructured, and semi-structured. Big data is helping in multiple areas such as web development, mobile development, understanding consumers, and many more. The global big data analytics market size is expected to hit US$684.12 billion in 2030 with a CAGR of 13.5%.

Here is an exclusive interview with Abhishek Rungta, Founder and CEO, INT, to enlighten the readers about how INT is leveraging big data analytics for web development to be a full-stack software product engineering company.

INT. (Indus Net Technologies) is a full-stack software product engineering company with a team of over 750 full-stack agile solutions experts who can help a business to experience a smooth digital transformation. It has been unlocking the business potential with technology and delivering a seamless user experience since 1997.

INT. offers web development, mobile development, digital marketing, dedicated hiring, analytics, and product design services.

Garbage in Garbage Out: Organizations have a huge volume of unstructured data which at times becomes difficult to process for insight building. INT. works extensively to structure this big data so that it can be consumed by the analytical tools for the generation of insights.

Big data analytics has brought a revolution in almost every industry across the globe. One might have heard of a trending proverb that data is the new oil.

Every business wants to implement the concept of digital first as soon as possible. The usage of big data analytics by businesses is growing by the day as they want to understand customer behaviour insights, re-develop products and create new revenue streams.

According to a report by Analytics Insight, the global big data analytics market is expected to reach almost US$420.98 billion by 2027. So, more and more companies will utilize big data analytics, and data professionals will be in high demand in the coming years.

AI has opened the way for smarter job execution with real-time analysis and more interaction between humans and machines. On the other hand, the Internet of Things (IoT) has enabled devices to communicate with one another, and with humans, without constant human intervention.

The IoT combined with artificial intelligence (AI) has the potential to create intelligent robots capable of simulating smart behaviour and aiding decision-making with little or no human intervention.

While IoT collects vast volumes of data by connecting devices to the internet, AI aids in the assimilation and evaluation of this data. Machine learning (a subset of AI) in IoT devices uses highly powerful sensors to find trends and detect any errors in data collecting.

After collecting data, big data analytics can be used for a better understanding of customer behaviour and enhancing customer experience.

The future of artificial intelligence (AI) and machine learning is bright in India. Machine learning is a subset of artificial intelligence. According to an Accenture report, AI has the potential to add US$957 billion, or 15% of India's current gross value, by 2035. As per The AI Index 2021 Annual Report, Karnataka had the most AI start-ups in 2019, with 356, followed by Maharashtra with 215 and both Andhra Pradesh and Telangana with 111.

In India, IoT (Internet of Things) has gained fame with the introduction of Amazon's Alexa, Google Home, smart locks, smart lighting, etc. A report has revealed that the Indian IoT market is expected to expand at a compound annual growth rate of 13.2% from 2020 to 2025. The top four areas of IoT funding are lifestyle/wearables, embedded computing, industrial internet, and connected homes.

Well, talking about the evolving trends will be incomplete without mentioning cloud computing! According to a NASSCOM report, until 2022, investments in cloud management, storage networks, security, and backup services are predicted to increase by 31% YoY. And India alone recorded nearly 379,000 job openings for cloud roles in 2020 and this demand is likely to increase with time.

My leadership mantra is to be a leader whom people trust and respect. I strongly believe that a leader's true asset is its people.

The team at INT operates as a family and doesn't believe in hierarchy. Whenever I am in Kolkata, I don't sit in a separate cabin at the office. I sit with different teams every day so that my co-workers don't feel hesitant in approaching me. I try to know what motivates them. After all, a leader should know how to drive success and productivity.

Here are a few major challenges that the big data analytics industry is facing today:

Lack of data science professionals: There is a massive shortage of professionals like data scientists, data analysts, and data engineers. Many companies are upskilling their existing eligible employees so that they gain knowledge about big data analytics.

Security of data: Securing data is often neglected by companies, as they remain mostly busy with understanding, storing, and analysing their data sets. As a result, companies can lose a substantial amount of money if they fail to protect their data.

Poor visualization: Valuable data can be overlooked when it is combined with irrelevant data. It can create a faulty interpretation of the information to the audience. This misconception can lead to erroneous insights and bad business decisions, all while claiming to be backed by data.

The pandemic has accelerated the need for the digital transformation of businesses. Consequently, many industries like banking, financial services, ed-tech, logistics, etc. are increasingly looking for talent in the big data analytics field to stay competitive in the post-pandemic world.

These industries are offering lucrative packages and perks to hire and retain data science professionals. The World Economic Forum predicts that data scientists and analysts will become the No. 1 emerging role in the world by 2022.

The number of job openings in big data analytics is touching new heights every day. So, I don't see any immediate signs of a downturn in employment trends in the big data analytics industry. The US Bureau of Labor Statistics has reported that the rise of data science needs will create roughly 11.5 million job openings by 2026.

Visit link:

An Exclusive Interview with Abhishek Rungta, Founder and CEO, INT - Analytics Insight

Read More..

Top 10 Best Machine Learning Companies to Join in 2022 – Analytics Insight

Machine learning is a blessing. Here are the 10 best machine learning companies to join in 2022

Machine learning is a blessing. The industrial sector saw a radical shift when machine learning and AI came into the limelight. Machine learning companies are gradually evolving at a faster pace and have emerged as key players in the IT industry. Machine learning refers to the development of intelligent algorithms and statistical models that improve at a task without being explicitly programmed. For example, ML can make a predictive analytics app more precise over time. ML frameworks and models require an amalgamation of data science, engineering, and development skills. As we become ever more dependent on technology to make our lives faster and smoother, machine learning has become an integral part of them. It is now widely used by organizations around the world, which have started building in-house data science teams. Some of these teams focus primarily on analysing business data to generate valuable insights, while the rest try to incorporate machine learning capabilities into their company's products.



More:

Top 10 Best Machine Learning Companies to Join in 2022 - Analytics Insight

Read More..

A Guide to ECCO: Python Based Tool for Explainability of Transformers – Analytics India Magazine

Accountability is required for any decision-making tool in an organization. Machine learning models are already being used to automate time-consuming administrative tasks and to make complex business decisions. To ensure proper scrutiny of the model and the business decisions it drives, scientists and engineers must understand the inner mechanics of their models, which are commonly treated as a black box. This is no longer an insurmountable problem, as various tools, such as ELI5, are available to track the inner mechanics of a model. In this article, we'll look at how to explain the inner workings of language models like transformers using a toolbox called ECCO.

Let's start the discussion by understanding the explainability of machine learning models.

Explainability in machine learning refers to the process of explaining a machine learning model's decision to a human. The term model explainability refers to the ability of a human to understand an algorithm's decision or output. It's the process of deciphering the reasoning behind a machine learning model's decisions and outcomes. With black-box machine learning models, which develop and learn directly from data without human supervision or guidance, this is an important concept to understand.

Traditionally, a human developer would write the code for a system or model. With machine learning, the system instead evolves from the data: machine learning improves the algorithm's ability to perform a specific task or action by learning from examples. Because the underlying functionality of the model was developed by the system itself, it can be difficult to understand why the system made a particular decision once it is deployed.

Machine learning models are used to classify new data or predict trends by learning relationships between input and output data. The model will identify these patterns and relationships within the dataset. This means that the deployed model will make decisions based on patterns and relationships that human developers may not be aware of. The explainability process helps human specialists comprehend the algorithm's decisions. After that, the model can be explained to non-technical stakeholders.

Machine learning explainability can be achieved using a variety of tools and techniques that vary in approach and machine learning model type. Traditional machine learning models may be simpler to comprehend and explain, but more complex models, such as deep neural networks, can be extremely difficult to grasp.

When machine learning has a negative impact on business profits, it earns a bad reputation. This is frequently the result of a misalignment between the data science and business teams. Based on this, there are a few areas where explainability helps, such as:

Understanding how your models make decisions reveals previously unknown vulnerabilities and flaws. Control is simple with these insights. When applied across all models in production, the ability to quickly identify and correct mistakes in low-risk situations adds up.

In high-risk industries like healthcare and finance, trust is critical. Before ML solutions can be used and trusted, all stakeholders must have a thorough understanding of what the model does. If you claim that your model is better at making decisions and detecting patterns than humans, you must be able to back it up with evidence. Experts in the field are understandably skeptical of any technology that claims to be able to see more than they can.

When a model makes a bad or rogue decision, it's critical to understand the factors that led to that decision, as well as who is to blame for the failure, in order to avoid similar issues in the future. Data science teams can use explainability to give organizations more control over AI tools.

The terms explainability and interpretability are frequently used interchangeably in the disciplines of machine learning and artificial intelligence. While they are very similar, it is instructive to note the distinctions, if only to get a sense of how tough things may become as you advance deeper into machine learning systems.

The degree to which a cause and effect may be observed inside a system is referred to as interpretability. To put it another way, it is your capacity to predict what will happen if the input or computational parameters are changed.

Explainability, on the other hand, relates to how well a machine's or deep learning system's internal mechanics can be articulated in human terms. It's easy to overlook the subtle contrast between the two, but consider this: interpretability is the ability to comprehend the mechanics without necessarily knowing why, while explainability is the ability to explain in depth what is actually happening.

Many recent advances in NLP have been powered by the transformer architecture, yet we have had little insight into why Transformer-based NLP models have been so successful in recent years. To improve the transparency of Transformer-based language models, ECCO, an open-source library for the explainability of Transformer-based NLP models, was created.

ECCO offers tools and interactive explorable explanations to help examine and build intuition about a model's behaviour: input saliency visualizes token importance for a given sentence; hidden state evaluation is applied to all layers of a model to determine the role of each layer; and non-negative matrix factorization of neuron activations uncovers underlying firing patterns tied to linguistic properties of the input tokens, showing how groups of neurons spike or respond while making a prediction.

In this section, we will take a look at how ECCO can be used to understand how transformer models work while predicting sequence-based output. Mainly, we'll see how weights are distributed at the final layer while predicting the next token, and we will also analyze all layers of the selected model.

To start with ECCO, we can install it using the pip command:

pip install ecco

Also make sure you have installed PyTorch.

First, we will start by generating a single token by passing a string to the model. GPT-2 is used because of how well it generates the next tokens in a sequence, much as a human would. The code below shows how we load the pre-trained model and use it for prediction. The generate method takes an input sequence, and we can additionally specify how many tokens the model should generate by passing generate=<some number>.

While initializing the pre-trained model, we set activations=True so that we capture the firing status of all the neurons.

Now we'll generate a token using the generate method.
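The code from the original post is not reproduced here; a minimal sketch of these two steps, assuming ecco's from_pretrained and generate API and a placeholder prompt (the article's actual input string is not shown), might look like this:

import ecco

# activations=True asks ecco to capture neuron firings for later analysis
# (the article refers to this flag as activation=True)
lm = ecco.from_pretrained('gpt2', activations=True)

text = "..."  # placeholder: the article's input string is not shown
output = lm.generate(text, generate=1, do_sample=False)  # generate= sets how many tokens to produce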

From the method, tokens 6 and 5 are generated as 'first' and 'of', respectively.

The model has a total of 6 decoder layers and the last layer is the decision layer where the appropriate token is chosen.

Now we will observe the status of the last layer and see the top 15 tokens that the model considered. Here we observe the status for position/token 6, which can be achieved with the output.layer_predictions method as below.

output.layer_predictions(position=6, layer=5, topk=15)

As we can see, the token 'first' comes up with the highest contribution.

Similarly, we can check how different candidate tokens would rank at the output layer. This can be done by explicitly passing the token IDs to the rankings_watch method. The IDs themselves can easily be obtained from the tokenizer of the pre-trained model we selected initially.

Below are the generated token IDs.
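The listing of IDs from the original post is not reproduced here; one way to look up IDs for candidate tokens, assuming the ecco model object exposes its Hugging Face tokenizer as lm.tokenizer, is sketched below (the candidate strings are illustrative, not the article's exact choices, and the resulting IDs depend on the GPT-2 vocabulary):

candidates = [" the", " first", " and"]                    # hypothetical candidate tokens
ids = [lm.tokenizer.encode(tok)[0] for tok in candidates]  # first BPE ID of each candidate
print(ids)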

Now we'll supply these IDs to see the rankings.

output.rankings_watch(watch=[262, 717, 621], position=6)

At the decision layer, we can see that the first rank is achieved by the token 'first', with the rest not even close to it. Thus we can say the model has correctly identified the next token and assigned proper weights to the possible tokens.
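Beyond layer predictions and rankings, the input-saliency and activation-factorization features described earlier can be run on the same output object. A hedged sketch, assuming ecco's saliency and run_nmf methods:

output.saliency()                      # visualize how much each input token contributed to the prediction
nmf = output.run_nmf(n_components=8)   # factorize neuron activations into a small set of firing patterns
nmf.explore()                          # interactive view of those factors across the input tokens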

In this article, we have seen what the explainability of a model is and how important it is when deploying such a model to production. As language models become more common, tools are needed to aid in debugging models, explaining their behavior, and developing intuitions about their inner mechanics. Ecco is one such tool, combining ease of use, visual interactive explorables, and a variety of model explainability methods. This article focused on ML model explainability and offered a glimpse of the ECCO toolbox.

Read this article:

A Guide to ECCO: Python Based Tool for Explainability of Transformers - Analytics India Magazine

Read More..

What to know about the Minnesota redistricting plans going before a special judicial panel this week – MinnPost

A passel of lawyers will gather Tuesday morning in a large conference room in the Minnesota Judicial Center and take their last shot at influencing the special five-judge panel charged with drawing new congressional and legislative districts for the state.

The oral arguments in Wattson v. Simon will be the final public part of the process triggered 10 months ago by a lawsuit asking the Minnesota Supreme Court to address the assertion that the 2020 Census made the states current political lines unconstitutional.

After the hearing, the panel will spend six weeks doing two things. The first will be drawing eight new congressional districts, 67 state Senate districts and 134 state House districts. The second will be waiting, at least until Feb. 15, to be certain that the divided state Legislature will fail in its redistricting duties.

Lawyers for each of the four groups proposing new maps, known as intervenors, will be given time to make the case that their vision is the correct one and that the visions of the other three plaintiffs are not.


On Dec. 7, the four groups filed documents detailing their plans; on Dec. 17, all four filed briefs defending their plan and critiquing the plans of the others. Those briefs give an early look at what will be talked about during oral arguments Tuesday.

A typical civil case involves two parties: a plaintiff and a respondent. This one involves five: four that have proposed maps and Secretary of State Steve Simon, who was sued. The map-makers are known as the Wattson Plaintiffs (for lead plaintiff Peter Wattson, a former legislative lawyer involved in past redistricting efforts); the Anderson Plaintiffs (representing Republican Party interests); the Sachs Plaintiffs (representing DFL interests); and the Corrie Plaintiffs (for lead plaintiff Bruce Corrie, who are advocating for maximum representation for communities of color).

A fifth group, though not formal intervenors, has submitted a series of friend-of-court filings, with a Dec. 8 filing accepted by the court but a Dec. 29 filing commenting on the four intervenors' plans rejected (December 8 Minnesota Special Redistricting Panel Brief; December 29 Minnesota Motion for Leave). Calling themselves the Citizen Data Scientists, the group of 12 Minnesota residents is made up of professors, practitioners, and researchers in data science, computer science, mathematics, statistics, and engineering at some of Minnesota's leading institutions of higher education. They applied computational redistricting, a relatively new field that uses high-performance computers and optimization algorithms to systematically search through millions of possible combinations of district boundaries.

Here is a sampling of how the four intervenors defended their own work and attacked the others in the lengthy briefs that were filed in mid-December.

The Wattson plaintiffs' proposal follows a least-change approach that advocates that court-drawn lines make just enough changes to restore population balance while following other legal mandates set by the panel. Based on the 2020 Census, Minnesota's congressional districts should have 713,312 residents, state Senate districts should have 85,172 and state House districts should have 42,586.

The other principles set out by the judicial panel include: not harming communities of color; not overly dividing local government boundaries; not dividing the reservations of American Indian tribes; crafting districts that are contiguous and convenient for voters; preserving communities of people with shared interests; and avoiding drawing lines with the purpose of protecting, promoting or defeating any incumbent, candidate or political party.

The Wattson plaintiffs' proposal for new congressional districts.

"The plans submitted by the other parties in this matter fail to adhere to this Panel's redistricting principles for some obvious reasons, and some not so obvious reasons … A less obvious but very important reason is that the plans of the Anderson Plaintiffs and Sachs Plaintiffs were drawn for the purpose of promoting, protecting or defeating an incumbent, candidate or party," notes the Wattson plaintiffs' brief. "The districts created by these parties can be explained on no ground other than attempting to gain a partisan advantage."


The Wattson Plaintiffs have argued that the only way to know if a plan was drawn to help an incumbent or party is to know where incumbents live and how proposed lines would impact future elections. This comes despite the panel's assertion that it will not draw districts based on the residence of incumbent office holders and will not consider past election results when drawing districts.

Wattson forges ahead anyway, citing a partisan index the plaintiffs created to apply past election results to new lines.

One example that Wattson cites is how the DFL-friendly Sachs Plaintiffs plan shifts voters from the 3rd Congressional District (now held by DFL Rep. Dean Phillips) and the 5th Congressional District (now held by DFL Rep. Ilhan Omar) to make the 2nd Congressional District (now held by DFL Rep. Angie Craig) safer for Democrats.

"The net effect of these changes is that CD 5 is much less convenient. It is sandwiched between CD 3 and CD 4 and is shaped like a T or a hammer," the Wattson brief states.

The Wattson brief also points out that the Corrie Plaintiffs' new 8th Congressional District includes three GOP incumbents: U.S. Reps. Pete Stauber, Michelle Fischbach and Tom Emmer, while the DFL-leaning Sachs Plaintiffs' plan puts both Emmer and Fischbach in the same district.

"By just narrowly including Representative Emmer in CD 7 (Corrie Plaintiffs' Plan) and narrowly including Representative Fischbach in CD 6 (Sachs Plaintiffs' Plan), with no justification other than population, it is apparent that these pairings were done to defeat Republican incumbents," the Wattson brief states.

The GOP-leaning group of intervenors said they base their congressional plan on a geographic distribution of seats established in previous redistricting processes.

"Each of the Opposing Parties' congressional redistricting plans propose drastic reconfigurations to Minnesota's existing congressional districts and fail to meet this Panel's redistricting criteria," the Anderson brief states, by combining rural and suburban communities into the same district. "Doing so negatively impacts the ability for rural voters to elect representatives that reflect their priorities and concerns."


"The Anderson Congressional Plan, on the other hand, preserves the unique interests of rural, suburban/exurban, and urban Minnesotans."

Anderson takes issue with a new 8th Congressional District proposed by the Corrie Plaintiffs that reaches across the northern part of the state from North Dakota to Lake Superior.

The Anderson plaintiffs' proposal for new congressional districts.

Anderson also accuses the DFL-leaning plans of helping the DFL win more seats in Congress: "By moving first ring suburbs, which have natural affinities with and similarities to Minneapolis and St. Paul, to districts comprised largely of highly suburban and exurban areas, these parties put more DFL-leaning voters in the perennially toss-up Third and Second districts," Anderson wrote. "At the same time, removing first ring suburbs and adding outer suburban voters to the urban Fourth and Fifth districts pose no real risk to DFL candidates, incumbents, or the party, because the Fourth and Fifth districts have had highly reliable DFL majorities for decades."

The DFL-leaning group relies heavily on testimony given during the five-judge panel's public hearings in October and criticizes the others, especially Anderson and Wattson, for not taking that testimony into account. (For their part, those intervenors say Sachs cherry-picks testimony that supports their decisions and disregards the rest.)

Sachs also accuses the Wattson plaintiffs of overly strict adherence to their least-change philosophy. "Rather than draw districts that are responsive to the state's geography and demographics, they instead pursue what they characterize as a least-change approach, one that rigidly focuses on calcified lines on a map and not the wishes and needs of Minnesotans statewide," the Sachs brief states. "Their overemphasis on staticity for its own sake has produced proposed maps that are non-responsive to the clear wishes of Minnesotans as expressed to the Panel and that will consequently fail to accurately reflect the human geography of the state."

The Sachs plaintiffs' proposal for new congressional districts.

Sachs also criticizes Wattson for using election analyses and incumbent location data. "The Sachs Plaintiffs maintain that these sorts of partisan considerations ask the Panel to delve into troubling political waters," Sachs stated. "Whether the parties' proposed plans avoid impermissible political entanglements should instead be judged based on the degree to which they otherwise satisfy the Panel's neutral redistricting criteria, particularly evidence in the record regarding the suitability of joining communities within the same district and dividing others among different districts."


Sachs also objects that Anderson and Wattson continue to have a First Congressional District that runs across the entire border with Iowa, accusing them of slavish devotion to prior district lines. The Sachs plan instead joins the southwest counties with a new 7th Congressional district that would run north and south from Iowa to Canada.

While both Corrie and Sachs criticize the Wattson plan for its least-change approach and its desire to avoid splitting local governments and precincts, they do so with different conclusions. Said Sachs: "the Wattson Plaintiffs have ignored the Redistricting Principles laid out by this Panel, and instead prioritized their own principles, particularly preserving voting precincts and ensuring political competitiveness based on past election results."

But Corrie sees much different motives: "In stark contrast to the Panel's directive, the Wattson brief makes clear that its maps were created to ensure each incumbent is protected and unabashedly describes how districts were created based on where incumbents live and how to solidify their votes. Throughout their discussion, the Wattson Plaintiffs make scant mention of Minnesota's BIPOC communities. Rather, they pursue incumbent protection in the guise of protecting minority voting rights, perhaps hoping this Panel will not see they have directly contravened this Panel's Redistricting Principles."

The Corrie plaintiffs' proposal for new congressional districts.

The Corrie Plaintiffs' House Plan has 24 districts with a 30% or greater minority voting-age population. The Sachs Plaintiffs' House Plan also has 24, but the Wattson Plaintiffs' has only 21, and the Anderson Plaintiffs' has only 18. The Corrie House Plan is the only plan that creates a district (HD 2B) where American Indian/Native American residents constitute 44.5% of the district population, giving this community the ability to elect candidates of choice when voting in alliance with others.

And Corrie explains its choice to spread the 8th Congressional District from east to west as a way to get the state's tribal nations into a single district.

"As the only map proposal that places all of northern Minnesota in one district, thereby bringing together the three largest American Indian reservations (Red Lake Nation, White Earth Nation, and Leech Lake Band of Ojibwe) as well as four other tribal reservations (such as Bois Forte Band of Chippewa, Fond du Lac Band of Lake Superior Chippewa, Mille Lacs Band of Ojibwe, and Grand Portage Band of Lake Superior Chippewa) and trust lands, the Corrie Congressional Map is the only map that abides by the Court's Redistricting Principles."

Here is the original post:

What to know about the Minnesota redistricting plans going before a special judicial panel this week - MinnPost

Read More..

Upcoming NSTDA Supercomputer in Thailand to Use Nvidia A100 GPUs – CDOTrends

The upcoming supercomputer of Thailand's National Science and Technology Development Agency (NSTDA) will harness hundreds of GPUs, making it the largest public high-performance computing system in Southeast Asia, says Nvidia.

Powered by 704 Nvidia A100 Tensor Core GPUs, the new system will be 30 times faster than the current TARA HPC system. According to information from Nvidia's product page, the A100 is available in 40GB or 80GB variants and offers up to 294 times higher AI inference performance over traditional CPUs.

The new supercomputer will be hosted at the NSTDA Supercomputer Centre (ThaiSC) to drive research by engineers and computational and data scientists from academia, government, and industry sectors. It is expected to support research projects in areas such as pharmaceuticals, renewable energy, and weather forecasting.

"The new supercomputer at NSTDA will expand and enhance research in Thailand, speeding up the development of breakthroughs that benefit individuals and industries in the country," said Dennis Ang, senior director of enterprise business for worldwide field operations in the SEA and ANZ region at Nvidia.

"NVIDIA A100 incorporates building blocks across hardware, networking, software, libraries, optimized AI models, and applications to enable extreme performance for AI and HPC," said Ang.

"We chose NVIDIA A100 because it is currently the leading solution for HPC-AI in the market. Even more important is that many HPC-AI software applications are well supported by NVIDIA technology, and the list will keep growing," explained Manaschai Kunaseth, chief of operations at ThaiSC.

When operational, the additional power of the new supercomputer will allow users at ThaiSC to scale up existing research projects. Specifically, the new supercomputer will accelerate innovation for Thailand's efforts with more advanced modeling, simulation, AI, and analytics capabilities.

Kwanchiva Thangthai, of the National Electronics and Computer Technology Center's Speech and Text Understanding Team, expects to see massive efficiency gains in speech recognition research pipelines. "We can gain competitive performance and provide a free-of-charge Thai speech-to-text service for everyone via AIForThai," she said.

The supercomputer is expected to commence operation in the second half of 2022.

Image credit: iStockphoto/sdecoret

Original post:

Upcoming NSTDA Supercomputer in Thailand to Use Nvidia A100 GPUs - CDOTrends

Read More..

January 2022: Insight into how metabolites affect health aided by new data platforms – Environmental Factor Newsletter

Gary Siuzdak, Ph.D., from the Scripps Research Institute, highlighted exciting technologies that he said will advance the field of metabolomics and a wide range of scientific discovery, during a Dec. 7 NIEHS lecture. Metabolomics is the large-scale study of chemical reactions involving metabolites, which are small molecules that play important roles in cells, tissues, and organisms.

According to Siuzdak, research in this field originally focused on identifying metabolites that serve as biological signs of disease, which scientists call biomarkers. However, metabolomics has evolved into a more comprehensive tool for understanding how metabolites themselves can influence health and illness.

"The most important area where metabolomics can be applied is in looking for active metabolites that affect physiology," Siuzdak said. "For example, metabolites can impact and even improve the way we respond to medicine or exposure to toxic agents."

Siuzdak developed data analysis platforms called XCMS and METLIN that enable scientists to discover how metabolites can alter critical biological processes, and the tools have been cited in more than 10,000 scientific projects, he noted.

Through XCMS and METLIN, which now contains detailed data on 860,000 molecular standards, the Scripps Center for Metabolomics has strengthened research worldwide, across a variety of disciplines, said Siuzdak, the center's director.

Continued development of databases like METLIN is vital to the success of the metabolomics field, noted David Crizer, Ph.D., a chemist in the NIEHS Division of the National Toxicology Program. He is a member of the institute's Metabolomics Cross-Divisional Group, which hosted Siuzdak's talk (see sidebar).

METLIN is designed to help scientists identify molecules in organisms, whether metabolites, toxicological agents, or other chemical entities, according to Siuzdak. He noted that the database encompasses more than 350 chemical classes, and there now are more than 50,000 registered users in 132 countries.

"Our goal is to identify as many metabolites and other chemical entities as possible, and given the advances in other fields of biology, this data is long overdue," Siuzdak said.

"We are finding metabolites that were previously unknown, quite regularly," he added. "The more comprehensive METLIN is, the better chance we have of eventually identifying all molecules. To this end, I am constantly looking for ways to facilitate growth of the platform."

A metabolite called indole-3-propionic acid (IPA) is of particular interest to Siuzdak. IPA is a human gut-derived metabolite originally identified by his lab in a 2009 paper in the Proceedings of the National Academy of Sciences, and it has since been examined in thousands of studies. Researchers have discovered that it is a multifunctional molecule that can aid immune function, among other roles.

"In retrospect, it makes sense that a metabolite derived from a gut microbe could modulate the immune system, which is probably why it still generates so much excitement," he said.

IPA could be especially relevant with respect to autoimmune diseases, Siuzdak added.

"For example, most people who die from COVID-19 don't succumb to the virus itself but to an overactive immune response that causes them to develop respiratory ailments," he said. A metabolite that modulates this effect could be very beneficial, noted Siuzdak.

"Overall, we are pursuing one primary goal in the development of METLIN, which is to use experimental data generated from molecular standards to help identify these key, physiologically relevant molecules," he said.

Citation: Wikoff WR, Anfora AT, Liu J, Schultz PG, Lesley SA, Peters EC, Siuzdak G. 2009. Metabolomics analysis reveals large effects of gut microflora on mammalian blood metabolites. Proc Natl Acad Sci U S A 106(10):3698-3703.

(John Yewell is a contract writer for the NIEHS Office of Communications and Public Liaison.)

Read the original post:

January 2022: Insight into how metabolites affect health aided by new data platforms - Environmental Factor Newsletter

Read More..

Why Banks Are Slow to Embrace Cloud Computing – The New York Times

In North America, banks handle only 12 percent of their tasks on the cloud, but that could double in the next two years, the consulting firm Accenture said in a survey. Jamie Dimon, chief executive of JPMorgan Chase, said the bank needed to adopt new technologies such as artificial intelligence and cloud technology as fast as possible.


Wells Fargo plans to move to data centers owned by Microsoft and Google over several years; Morgan Stanley is also working with Microsoft. Bank of America has saved $2 billion a year in part by building its own cloud. Goldman said in November that it would team up with Amazon Web Services to give clients access to mountains of financial data and analytical tools.

Cloud services enable banks to rent data storage and processing power from providers including Amazon, Google or Microsoft, which have their own data centers dotted around the globe. After moving to the cloud, banks can access their data on the internet and use the tech companies' computing capacity when needed, instead of running their own servers year-round.

Seeing a big opportunity to sell cloud-computing services to Wall Street, some tech giants have hired former bankers who can use their knowledge of the rules and constraints under which banks operate to pitch the industry.

Scott Mullins, AWS's head of business development for financial services, previously worked at JPMorgan and Nasdaq. Yolande Piazza, vice president for financial services at Google Cloud, is the former chief executive of Citi FinTech, an innovation unit at Citigroup. Bill Borden at Microsoft and Howard Boville at IBM are Bank of America alumni.

"Cloud providers are moving at a much faster development pace when you think of security, compliance and control structures, compared with individual banks," said Mr. Borden, a corporate vice president for worldwide financial services at Microsoft. The cloud, Mr. Borden and the other executives said, enables companies to increase their computer processing capabilities when they need it, which is much cheaper than running servers on their own premises.

But glitches do occur. One week after Goldman teamed up with Amazon, an AWS outage halted webcasts from a conference hosted by the bank that convened chief executives from the biggest U.S. financial firms. The glitch also caused problems for Amazon's Alexa voice assistant, Disney's streaming service and Ticketmaster. AWS and its competitor, Microsoft Azure, both had outages recently.

Banking regulators in the United States, including the Federal Reserve, Federal Deposit Insurance Corporation and Office of the Comptroller of the Currency, have jointly underscored the need for lenders to manage risks and have backup systems in place when they outsource technology to cloud providers. The European Banking Authority warned firms about concentration risk, or becoming overly reliant on a single tech company.

The Financial Industry Regulatory Authority, which oversees broker-dealers, firms that engage in trading activity, has already moved all its technology to the cloud. The group previously spent tens of millions of dollars a year to run its own servers but now rents space on AWS servers for a fraction of that amount, said Steven J. Randich, FINRA's chief information officer.

Go here to read the rest:
Why Banks Are Slow to Embrace Cloud Computing - The New York Times

Read More..