Category Archives: Machine Learning
Top Machine Learning Services in the Cloud – Datamation
Machine Learning services in the cloud are a critical area of the modern computing landscape, providing a way for organizations to better analyze data and derive new insights. Accessing these services via the cloud tends to be efficient in terms of both cost and staff hours.
Machine Learning (often abbreviated as ML) is a subset of Artificial Intelligence (AI) that attempts to 'learn' from data sets in several different ways, including both supervised and unsupervised learning. Many different technologies can be used for machine learning, including a variety of commercial tools as well as open source frameworks.
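As a rough, hand-rolled illustration of the difference (the data points and labels below are invented for the example), a supervised method predicts from labelled examples, while an unsupervised one finds structure in the data without any labels:

```python
# Toy illustration of supervised vs. unsupervised learning.
# The data points and labels here are made up for the example.

def nearest_neighbor_predict(train, labels, x):
    """Supervised: predict the label of the closest training point."""
    best = min(range(len(train)), key=lambda i: abs(train[i] - x))
    return labels[best]

def assign_clusters(points, centroids):
    """Unsupervised: assign each point to its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: abs(p - centroids[c]))
            for p in points]

# Supervised: labelled examples guide the prediction.
train, labels = [1.0, 2.0, 10.0, 11.0], ["low", "low", "high", "high"]
print(nearest_neighbor_predict(train, labels, 9.5))  # -> high

# Unsupervised: no labels; structure emerges from the data alone.
print(assign_clusters([1.2, 10.5, 1.8], centroids=[1.5, 10.0]))  # -> [0, 1, 0]
```

Real cloud services wrap far more sophisticated versions of both ideas, but the split between learning from labelled data and finding structure without labels is the same.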
While organizations can choose to deploy machine learning frameworks on premises, doing so is typically a complex and resource-intensive exercise. Machine learning benefits from specialized hardware, including inference chips and optimized GPUs, and machine learning frameworks can be challenging to deploy and configure properly. This complexity has led to the rise of machine learning services in the cloud, which provide the right hardware and optimally configured software so that organizations can easily get started with machine learning.
There are several key features that are part of most machine learning cloud services.
AutoML - The automated machine learning feature automatically helps to build the right model.
Machine Learning Studio - The studio concept is all about providing a developer environment where machine learning models and data modelling scenarios can be built.
Open source framework support - The ability to support an existing framework such as TensorFlow, MXNet or Caffe is important, as it helps to enable model portability.
When evaluating the different options for machine learning services in the cloud, consider the following criteria:
In this Datamation top companies list, we spotlight the vendors that offer the top machine learning services in the cloud.
Value proposition for potential buyers: Alibaba is a great option for users whose machine learning data sets reside around the world, and especially in Asia, where Alibaba is a leading cloud service.
Value proposition for potential buyers: Amazon Web Services has the broadest array of machine learning services in the cloud today, leading with its SageMaker portfolio that includes capabilities for building, training and deploying models in the cloud.
Value proposition for potential buyers: Google's set of Machine Learning services is also expansive and growing, with both generic and purpose-built services for specific use cases.
Value proposition for potential buyers: IBM Watson Machine Learning enables users to run models on any cloud, or solely on the IBM Cloud.
Value proposition for potential buyers: For organizations that have already bought into the Microsoft Azure cloud, Azure Machine Learning is a good fit, providing a cloud environment to train, deploy and manage machine learning models.
Value proposition for potential buyers: Oracle Machine Learning is a useful tool for organizations already using Oracle Cloud applications, helping them build data mining notebooks.
Value proposition for potential buyers: Salesforce Einstein is a purpose built machine learning platform that is tightly integrated with the Salesforce platform.
Combating the coronavirus with Twitter, data mining, and machine learning – TechRepublic
Social media can send up an early warning sign of illness, and data analysis can predict how it will spread.
The coronavirus illness (nCoV) is now an international public health emergency, bigger than the SARS outbreak of 2003. Unlike SARS, this time around scientists have better genome sequencing, machine learning, and predictive analysis tools to understand and monitor the outbreak.
During the SARS outbreak, it took five months for scientists to sequence the virus's genome. By contrast, the first 2019-nCoV case was reported in December, and scientists had the genome sequenced by January 10, only a month later.
Researchers have been using mapping tools to track the spread of disease for several years. Ten European countries started Influenza Net in 2003 to track flu symptoms as reported by individuals, and the American version, Flu Near You, started a similar service in 2011.
Lauren Gardner, a civil engineering professor at Johns Hopkins and the co-director of the Center for Systems Science and Engineering, led the effort to launch a real-time map of the spread of the 2019-nCoV. The site displays statistics about deaths and confirmed cases of coronavirus on a worldwide map.
Este Geraghty, MD, MS, MPH, GISP, chief medical officer and health solutions director at Esri, said that since the SARS outbreak in 2003 there has been a revolution in applied geography through web-based tools.
"Now as we deploy these tools to protect human lives, we can ingest real-time data and display results in interactive dashboards like the coronavirus dashboard built by Johns Hopkins University using ArcGIS," she said.
With this outbreak, scientists have another source of data that did not exist in 2003: Twitter and Facebook. In 2014, Chicago's Department of Innovation and Technology built an algorithm that used social media mining and illness prediction technologies to target restaurants inspections. It worked: The algorithm found violations about 7.5 days before the normal inspection routine did.
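A real system like Chicago's combines many signals and trained models, but the core idea of mining posts for illness signals can be sketched in a few lines of Python. The posts and keyword list below are invented; this is a toy illustration of the concept, not the city's algorithm:

```python
# Toy sketch of using social media posts as an early illness signal.
# The posts and symptom keywords are invented for illustration; a real
# system would use trained language models, not keyword counts.

SYMPTOM_TERMS = {"fever", "cough", "sick", "flu"}

def symptom_score(post):
    """Fraction of words in a post that are symptom-related."""
    words = post.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?'") in SYMPTOM_TERMS)
    return hits / len(words) if words else 0.0

posts = [
    "My whole village is sick, something's going on here",
    "Great weather for a picnic today",
]
flagged = [p for p in posts if symptom_score(p) > 0.05]
print(len(flagged))  # -> 1
```

In practice the flagged posts would feed into downstream models alongside other data sources, rather than triggering alerts on their own.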
Theresa Do, MPH, leader of the Federal Healthcare Advisory and Solutions team at SAS, said that social media can be used as an early indicator that something is going on.
"When you're thinking on a world stage, a lot of times they don't have a lot of these technological advances, but what they do have is cell phones, so they may be tweeting out 'My whole village is sick, something's going on here,'" she said.
Do said an analysis of social media posts can be combined with other data sources to predict who is most likely to develop illnesses like the coronavirus illness.
"You can use social media as a source but then validate it against other data sources," she said. "It's not always generalizable, but it can be a sentinel source."
Do said predictive analytics has made significant advances since 2003, including refining the ability to combine multiple data sources. For example, algorithms can look at names on plane tickets and compare that information with data from other sources to predict who has been traveling to certain areas.
"Algorithms can allow you to say 'with some likelihood' it's likely to be the same person," she said.
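The kind of probabilistic name matching Do describes can be sketched with Python's standard library. The names and threshold below are invented; real record-linkage systems combine many richer features than string similarity:

```python
import difflib

# Hedged sketch of the record-linkage idea described above: comparing
# a name from one data source (e.g. a plane ticket) against names in
# another, and reporting a match "with some likelihood". The names
# and data are illustrative, not from any real system.

def match_likelihood(name_a, name_b):
    """Similarity ratio between two names, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()

ticket_name = "Jonathan Smith"
registry = ["John Smith", "Joana Smythe", "Mary Jones"]
best = max(registry, key=lambda n: match_likelihood(ticket_name, n))
print(best, round(match_likelihood(ticket_name, best), 2))
```

The output is a candidate match with a likelihood score, which is exactly the hedged "it's likely to be the same person" conclusion described above.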
The current challenge is identifying gaps in the data. She said that researchers have to balance between the need for real-time data and privacy concerns.
"If you think about the different smartwatches that people wear, you can tell if people are active or not and use that as part of your model, but people aren't always willing to share that because then you can track where someone is at all times," she said.
Do said that the coronavirus outbreak resembles the SARS outbreak, but that governments are sharing data more openly this time.
"We may be getting a lot more positives than they're revealing and that plays a role in how we build the models," she said. "A country doesn't want to be looked at as having the most cases but that is how you save lives."
This map from Johns Hopkins shows reported cases of 2019-nCoV as of January 30, 2020 at 9:30 pm. The yellow line in the graph is cases outside of China while the orange line shows reported cases inside the country.
Image: 2019-nCoV Global Cases by Johns Hopkins Center for Systems Science and Engineering
The impediments to implementing AI and machine learning – Which-50
While the benefits of implementing artificial intelligence (AI) and machine learning into a business are reasonably clearly understood, there are still some impediments in the way.
In a recent LogMeIn report, entitled "Transforming the Frontline of Customer Engagement," the challenges organisations face when applying AI and machine learning were addressed directly.
The biggest overall challenge is managing organisational change, according to the report. The more senior the executive, the more likely they are to express concern about software and data integrations.
An executive working as Head of Digital & Customer Care in the retail sector said, "Everyone is working towards an integrated system."
The days when agents were expected to jump between systems running non-integrated software are passing. This still exists, but it's viewed as untenable.
Instead, the goal is to be able to integrate tickets from social media, email, phone calls and other channels through a single portal.
Our experts stressed that technologies such as machine learning, AI and automation are enablers, not replacements.
Luke Shaw, Head of Ecommerce at Sigma Healthcare, said there may be areas where company leaders are overestimating the impact of machine learning and AI, at least as it's applied at the moment.
Another misunderstanding is the amount of work required. The reality is it takes a huge amount of human input to make these technologies work in the way that companies want them to, according to an executive.
They said, "Obviously, the keywords with machine learning and artificial intelligence are 'learning' and 'artificial'. You do need to teach these technologies the nuances of your particular business and your particular customer base in order for them to deliver the results you want."
The pace of change is accelerating, and often the workforce is left playing catch-up. This can raise issues around compliance and governance, and even basic ethics.
The executives quoted in the report said some leaders may underestimate the requirements to implement machine learning and AI at the enterprise level. You do still need team members who will help train the technology and monitor for areas of improvement.
There is also the issue of a shortage of talent to help implement AI. Sharon Melamed, Managing Director at Matchboard, said, "It's all well and good for the C-level to set a mandate for AI-first or push for more AI initiatives, but there is often a shortage of talent to implement this vision.
"Technology has moved fast and the workforce is playing catch-up, and there's lots to think about along the way in terms of ethics and governance.
"Some leaders may under-estimate the manpower required to launch and successfully operate an enterprise-grade chatbot, for example, thinking it's just a case of buying software. It's not. You need UX and conversation designers, marketing, IT, analytics and project management resources, just for starters."
Athina Mallis is the editor of the Which-50 Digital Intelligence Unit of which LogMeIn is a corporate member. Members provide their insights and expertise for the benefit of the Which-50 community. Membership fees apply.
ScoreSense Leverages Machine Learning to Take Its Customer Experience to the Next Level – Yahoo Finance
One Technologies Partners with Arrikto to Uniquely Tailor its ScoreSense Consumer Credit Platform to Each Individual Customer
DALLAS, Jan. 30, 2020 /PRNewswire/ -- To provide customers with the most personalized credit experience possible, One Technologies, LLC has partnered with data management innovator Arrikto Inc. (https://www.arrikto.com/) to incorporate Machine Learning (ML) into its ScoreSense credit platform.
"To truly empower consumers to take control of their financial future, we must rely on insights from real data, not on assumptions and guesswork," said Halim Kucur, Chief Product Officer at One Technologies, LLC. "The innovations we have introduced provide data-driven intelligence about customers' needs and wants before they know this information themselves."
"ScoreSense delivers state-of-the-art credit information through their ongoing investment in the most cutting-edge machine learning products the industry has to offer," said Constantinos Venetsanopoulos, Founder and CEO of Arrikto Inc. "Our partnership has been a big success because One Technologies aligns seamlessly with the most forward-looking developers in the ML space and understands the tremendous value of data for serving customers better."
ScoreSense (https://www.scoresense.com) serves as a one-stop digital resource where consumers can access credit scores and reports from all three main credit bureaus (TransUnion, Equifax, and Experian) and comprehensively pinpoint the factors that most affect their credit.
About One Technologies
One Technologies, LLC harnesses the power of technology, analytics and its people to create solutions that empower consumers to make more informed decisions about their financial lives. The firm's consumer credit products include ScoreSense, which enables members to seamlessly access, interact with, and understand their credit profiles from all three main bureaus using a single application. The ScoreSense platform is continually updated to give members deeper insights, personalized tools and one-on-one Customer Care support that can help them make the most sense of their credit.
One Technologies is headquartered in Dallas and was established in October 2000. For more information, please visit https://onetechnologies.net/.
Media Contact
Laura Marvin
JConnelly for One Technologies
646-922-7774
OT@jconnelly.com
View original content to download multimedia: http://www.prnewswire.com/news-releases/scoresense-leverages-machine-learning-to-take-its-customer-experience-to-the-next-level-300995934.html
SOURCE One Technologies, LLC
Blue Prism Adds Conversational AI, Automated Machine Learning and Integration with Citrix to Its Digital Workforce – AiThority
Company Continues to Build Out Intelligent Automation Skills That can be Instantly Accessed and Downloaded by All
Looking to empower enterprises with the latest and most innovative intelligent automation solutions, Blue Prism announced the addition of DataRobot, ServisBOT and Ultima to its Technology Alliance Program (TAP) as affiliate partners. These partners extend Blue Prism's reach by making their software accessible to customers via Blue Prism's Digital Exchange (DX), an intelligent automation app store and online community.
Blue Prism's DX is unique in that every week new intelligent automation capabilities get added to the forum, which has resulted in tens of thousands of assets being downloaded, making it the ideal online community for augmenting and extending traditional RPA deployments. The latest capabilities on the DX include dealing with conversational AI (working with chatbots), adding automated machine learning, as well as new integrations with Citrix. With just a few clicks, users can drag and drop these new capabilities into Blue Prism's Digital Workforce, no coding required.
"Blue Prism's vision of providing a Digital Workforce for Every Enterprise is extended with our DX community, which continues to push the boundaries of intelligent automation," says Linda Dotts, SVP Global Partner Strategy and Programs for Blue Prism. "Our DX ecosystem is the catalyst and cornerstone for driving broader innovations with our Digital Workforce. It provides everyone with an a la carte menu of automation options that are drag-and-drop easy to use."
Below is a quick summary of the new capabilities being brought to market by these TAP affiliate partners:
DataRobot: The integration of DataRobot with Blue Prism provides enterprises with the intelligent automation needed to transform business processes at scale. By combining RPA with AI, the integration automates data-driven predictions and decisions to improve the customer experience, as well as process efficiencies and accuracy. The resulting business process improvements help move the bottom line for businesses by removing repetitive, replicable, and routine tasks for knowledge workers so they can focus on more strategic work.
"The powerful combination of RPA with AI, what we call intelligent process automation, unlocks tremendous value for enterprises who are looking to operationalize AI projects and solve real business problems," says Michael Setticasi, VP of Alliances at DataRobot. "Our partnership with Blue Prism will extend our ability to deliver intelligent process automation to more customers and drive additional value to global enterprises."
ServisBOT: ServisBOT offers the integration of an insurance-focused chatbot solution with Blue Prism's Robotic Process Automation (RPA), enabling customers to file an insurance claim with their provider using the convenience and 24/7 availability of a chatbot. This integration with ServisBOT's natural language technology adds a claims chatbot skill to the Blue Prism platform, helping insurance companies increase efficiencies and reduce costs across the complete claims management journey and within a Blue Prism defined workflow.
"Together we are providing greater efficiencies in managing insurance claims through chatbots combined with AI-powered automation," says Cathal McGloin, CEO of ServisBOT. "This drives down operational costs while elevating a positive customer experience through faster claims resolution times and reduced friction across all customer interactions."
Ultima: The integration of Ultima IA-Connect with Blue Prism enables fast, secure automation of business processes over Citrix Cloud and Citrix Virtual Apps and Desktops sessions (formerly known as XenApp and XenDesktop). The new IA-Connect tool allows users to automate processes across Citrix ICA or Microsoft RDP virtual channels, without needing to resort to screen scraping or surface automation.
"We know customers who decided not to automate because they were nervous about using cloud-based RPA or because running automations over Citrix was simply too painful," says Scott Dodds, CEO of Ultima. "We've addressed these concerns, with IA-Connect now available on the DX. It gives users the ability to automate their business processes faster while helping reduce overall maintenance and support costs."
3 books to get started on data science and machine learning – TechTalks
Image credit: Depositphotos
This post is part of AI education, a series of posts that review and explore educational content on data science and machine learning.
With data science and machine learning skills being in high demand, there's increasing interest in careers in both fields. But with so many educational books, video tutorials and online courses on data science and machine learning, finding the right starting point can be quite confusing.
Readers often ask me for advice on the best roadmap for becoming a data scientist. To be frank, there's no one-size-fits-all approach, and it all depends on the skills you already have. In this post, I will review three very good introductory books on data science and machine learning.
Based on your background in math and programming, the two fundamental skills required for data science and machine learning, youll surely find one of these books a good place to start.
Data scientists and machine learning engineers sit at the intersection of math and programming. To become a good data scientist, you don't need to be a crack coder who knows every single design pattern and code optimization technique. Neither do you need to have an MSc in math. But you must know just enough of both to get started. (You do need to up your skills in both fields as you climb the ladder of learning data science and machine learning.)
If you remember your high school mathematics, then you have a strong base to begin the data science journey. You don't necessarily need to recall every formula they taught you in school. But concepts of statistics and probability such as medians and means, standard deviations, and normal distributions are fundamental.
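All of those fundamentals can be experimented with directly in Python's standard library, with no external packages. The sample numbers below are invented:

```python
from statistics import mean, median, stdev, NormalDist

# The core statistics concepts mentioned above, using only the
# standard library (the sample data is made up).
scores = [82, 85, 88, 90, 95, 67, 88]

print(mean(scores))    # arithmetic mean
print(median(scores))  # middle value of the sorted data
print(stdev(scores))   # sample standard deviation

# A normal distribution fitted to the data: roughly what fraction
# of values would fall below 90?
nd = NormalDist(mu=mean(scores), sigma=stdev(scores))
print(round(nd.cdf(90), 2))
```

Playing with small examples like this is a quick way to check whether your high-school statistics are still within reach before picking up any of these books.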
On the coding side, knowing the basics of popular programming languages (C/C++, Java, JavaScript, C#) should be enough. You should have a solid understanding of variables, functions, and program flow (if-else, loops) and a bit of object-oriented programming. Python knowledge is a strong plus for a few reasons: First, most data science books and courses use Python as their language of choice. Second, the most popular data science and machine learning libraries are available for Python. And finally, Python's syntax and coding conventions are different from those of other languages such as C and Java. Getting used to it takes a bit of practice, especially if you're used to coding with curly brackets and semicolons.
Written by Sinan Ozdemir, Principles of Data Science is one of the best intros to data science that I've read. The book keeps the right balance between math and coding, theory and practice.
Using examples, Ozdemir takes you through the fundamental concepts of data science such as different types of data and the stages of data science. You will learn what it means to clean your data, normalize it and split it between training and test datasets.
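As a minimal sketch of those stages, here is what cleaning, normalizing, and splitting a data set might look like in plain Python. The raw numbers are invented for illustration:

```python
import random

# A minimal sketch of the data-preparation stages described above:
# drop bad records, normalize values to [0, 1], then split into
# training and test sets. The raw data is invented for illustration.

raw = [3.0, None, 7.5, 12.0, None, 4.5, 9.0, 6.0]

# 1. Clean: drop missing values.
clean = [x for x in raw if x is not None]

# 2. Normalize: rescale to the [0, 1] range.
lo, hi = min(clean), max(clean)
normalized = [(x - lo) / (hi - lo) for x in clean]

# 3. Split: shuffle, then hold out 20% as a test set.
random.seed(42)  # fixed seed so the split is reproducible
shuffled = normalized[:]
random.shuffle(shuffled)
cut = int(len(shuffled) * 0.8)
train, test = shuffled[:cut], shuffled[cut:]

print(len(train), len(test))  # -> 4 2
```

Libraries like Pandas and scikit-learn wrap these steps in one-liners, but the book's examples make clear this is what those one-liners are doing.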
The book also contains a refresher on basic mathematical concepts such as vector math, matrices, logarithms, Bayesian statistics, and more. Every mathematical concept is interspersed with coding examples and introductions to the relevant Python data science libraries for analyzing and visualizing data. But you have to bring your own Python skills; the book doesn't have a Python crash course or an introductory chapter on the programming language.
What makes the learning curve of this book especially smooth is that it doesn't go too deep into the theories. It gives you just enough knowledge to make optimal use of Python libraries such as Pandas and NumPy, and classes such as DataFrame and LinearRegression.
Granted, this is not a deep dive. If you're the kind of person who wants to get to the bottom of every data science and machine learning concept and learn the logic behind every library and function, Principles of Data Science will leave you a bit disappointed.
But again, as I mentioned, this is an intro, not a book that will make you career-ready in data science. It's meant to familiarize you with what this growing field is. And it does a great job at that, bringing together all the important aspects of a complex field in less than 400 pages.
At the end of the book, Ozdemir introduces you to machine learning concepts. Compared to other data science textbooks, this section of Principles of Data Science falls a bit short, both in theory and practice. The basics are there, such as the difference between supervised and unsupervised learning, but I would have liked a bit more detail on how different models work.
The book does give you a taste of different ML algorithms such as regression models, decision trees, K-means, and more advanced topics such as ensemble techniques and neural networks. The coverage is enough to whet your appetite to learn more about machine learning.
As the name suggests, Data Science from Scratch takes you through data science from the ground up. The author, Joel Grus, does a great job of showing you all the nitty-gritty details of coding data science. And the book has plenty of examples and exercises to go with the theory.
The book provides a Python crash course, which is good for programmers who have a good knowledge of another programming language but don't have any background in Python. What's really good about Grus's intro to Python is that, aside from the very basic stuff, he takes you through some of the advanced features for handling arrays and matrices that you won't find in general Python tutorial textbooks, such as list comprehensions, assertions, iterables and generators, and other very useful tools.
Moreover, the second edition of Data Science from Scratch, published in 2019, leverages some of the advanced features of Python 3.6, including type annotations (which you'll love if you come from a strongly typed language like C++).
What makes Data Science from Scratch a bit different from other data science textbooks is its insistence on doing everything from scratch. Instead of introducing you to the NumPy and Pandas functions that will calculate coefficients and, say, mean absolute errors (MAE) and mean squared errors (MSE), Grus shows you how to code them yourself.
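In that from-scratch spirit, both metrics fit in a few lines of plain Python. The predictions below are invented toy numbers:

```python
# The two error metrics mentioned above, written from scratch in the
# spirit of the book. Inputs are invented toy predictions.

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average squared difference; penalizes large errors more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

print(mean_absolute_error(y_true, y_pred))  # -> 0.875
print(mean_squared_error(y_true, y_pred))   # -> 1.3125
```

Once you have written these yourself, the library equivalents stop being black boxes.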
He does, of course, remind you that the book's sample code is meant for practice and education and will not match the speed and efficiency of professional libraries. At the end of each chapter, he provides references to documentation and tutorials of the Python libraries that correspond to the topic you have just learned. But the from-scratch approach is fun nonetheless, especially if you're one of those I-have-to-know-what-goes-on-under-the-hood types of people.
One thing you'll have to consider before diving into this book is that you'll need to bring your math skills with you. In the book, Grus codes fundamental math functions, starting from simple vector math and moving to more advanced statistical concepts such as calculating standard deviations, errors, and gradient descent. However, he assumes that you already know how the math works. It's okay if you're fine with just copy-pasting the code and seeing it work. But if you've picked up this book because you want to make sense of everything, then have your calculus textbook handy.
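As a flavor of that math-in-code style, here is a minimal gradient descent on a one-variable function. The function and step size are illustrative, not taken from the book:

```python
# A from-scratch sketch of the gradient descent idea referenced above:
# minimize f(w) = (w - 3)^2 by repeatedly stepping against the
# gradient f'(w) = 2 * (w - 3). The values are illustrative.

def gradient_descent(start, learning_rate=0.1, steps=100):
    w = start
    for _ in range(steps):
        grad = 2 * (w - 3)      # derivative of (w - 3)^2
        w -= learning_rate * grad
    return w

w = gradient_descent(start=0.0)
print(round(w, 4))  # -> 3.0, the minimum of the function
```

The same loop, generalized to vectors of parameters and a loss over data, is what trains most machine learning models.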
After the basics, Data Science from Scratch goes into machine learning, covering various algorithms, including the different flavors of regression models and decision trees. You also get to delve into the basics of neural networks followed by a chapter on deep learning and an introduction to natural language processing.
In short, I would describe Data Science with Python as a fully hands-on introduction to data science and machine learning. It's the most practice-driven book on data science and machine learning that I've read. The authors have done a great job of bringing together the right data samples and practice code to get you acquainted with the principles of data science and machine learning.
The book contains minimal theoretical content and mostly teaches you by taking you through coding labs. If you have a decent computer and an installation of Anaconda or another Python distribution that comes bundled with Jupyter Notebooks, you can probably go through all the exercises with minimal effort. I highly recommend writing the code yourself and avoiding copy-pasting it from the book or sample files, since the entire goal of the book is to learn through practice.
You'll find no Python intro here. You'll dive straight into NumPy, Pandas, and scikit-learn. There's also no deep dive into mathematical concepts such as correlations, error calculations, z-scores, etc., so you'll need to get help from your math book whenever you need a refresher on any of those topics.
Alternatively, you can just type in the code and see Python's libraries work their magic. Data Science with Python does a decent job of showing you how to put together the right pieces for any data science and machine learning project.
Data Science with Python provides a solid intro to data preparation and visualization, and then takes you through a rich assortment of machine learning algorithms as well as deep learning. There are plenty of good examples and templates you can use for other projects. The book also gives an intro to XGBoost, a very useful optimization library, and the Keras neural network library. You'll also get to fiddle around with convolutional neural networks (CNNs), the cornerstone of current advances in computer vision.
Before starting this book, I strongly recommend that you go through a gentler introductory book that covers more theory, such as Ozdemir's Principles of Data Science. It will make the ride less confusing. The combination of the two will leave you with a very strong foundation to tackle more advanced topics.
These are just three of the many data science books that are out there. If you've read other awesome books on the topic, please share your experience in the comments section. There are also plenty of great interactive online courses, like Udemy's Machine Learning A-Z: Hands-On Python & R In Data Science (I will be reviewing this one in the coming weeks).
While an intro to data science will give you a good foothold in the world of machine learning and the broader field of artificial intelligence, there's a lot of room for expanding that knowledge.
To build on this foundation, you can take a deeper dive into machine learning. There are plenty of good books and courses out there. One of my favorites is Aurelien Geron's Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow (also scheduled for review in the coming months). You can also go deeper into one of the sub-disciplines of ML and deep learning, such as CNNs, NLP or reinforcement learning.
Artificial intelligence is complicated, confusing, and exciting at the same time. The best way to understand it is to never stop learning.
Want To Be AI-First? You Need To Be Data-First. – Forbes
Those that implement AI and machine learning projects learn quickly that machine learning projects are not application development projects. Much of the value of a machine learning project rests in the models, training data, and configuration information that guides how the model is applied to the specific machine learning problem. The application code is mostly a means to implement the machine learning algorithms and "operationalize" the machine learning model in a production environment. That's not to say that application code is not necessary; after all, the computer needs some way to operationalize the machine learning model. But focusing a machine learning project on the application code is missing the big picture. If you want to be AI-first, your project needs a data-first perspective.
Use data-centric methodologies and data-centric technologies
Therefore it follows that if you're going to have a data-first perspective, you need to use a data-first methodology. There's certainly nothing wrong with Agile methodologies as a way of iterating towards success, but Agile on its own leaves much to be desired as it's focused on functionality and delivery of application logic. There are already data-centric methodologies out there that have been proven in many real-world scenarios. One of the most popular is the Cross Industry Standard Process for Data Mining (CRISP-DM), which focuses on the steps needed for successful data projects. In the modern age, it makes sense to merge the notably non-agile CRISP-DM with Agile Methodologies to make it more relevant. While this is still a new area for most enterprises implementing AI projects, we see this sort of merged methodology approach to be more successful than trying to shoehorn all the aspects of an AI project into existing application-focused Agile methodologies.
It stands to reason that if you have a data-centric perspective on AI, then you need to pair your data-centric methodologies with data-centric technologies. This means that your choice of tooling to implement all those artifacts detailed above needs to be, first and foremost, data-focused. Don't use code-centric IDEs when you should be using data notebooks. Don't use enterprise integration middleware platforms when you should be using tools that focus on model development and maintenance. Don't use so-called machine learning platforms that are really just a pile of cloud-based technologies or overgrown big data management platforms. The tools you use should support the machine learning goals you need, which are in turn supported by the activities you need to do and the artifacts you need to create. Just because a GPU provider has a toolset doesn't mean that it's the right one to use. Just because a big enterprise vendor or a cloud vendor has a "stack" doesn't mean it's the right one. Start from the deliverables and the machine learning objectives and work your way backwards.
Another big consideration is where and how machine learning models will be deployed - or in AI-speak "operationalized". AI models can be implemented in a remarkably wide range of places: from "edge" devices sitting disconnected from the internet to mobile and desktop applications; from enterprise servers to cloud-based instances; and all manner of autonomous vehicles and craft. Each of these locations is a place where AI models and implementations can and do exist. This degree of operationalization heterogeneity highlights even more how ludicrous the idea of a single machine learning platform is. How can one platform simultaneously provide AI capabilities in a drone, a mobile app, an enterprise implementation, and a cloud instance? Even if you source all this technology from a single vendor, it will be a collection of different tools that sit under a single marketing umbrella rather than a single, cohesive, interoperable platform that makes any sense.
Build data-centric talent
All this methodology and technology can't assemble itself. If you're going to be successful at AI projects, you're going to need to be successful at building an AI team. And if the data-centric perspective is the correct one for AI, then it makes sense that your team also needs to be data-centric. The talent needed to build apps or manage enterprise systems is not the same as the talent needed to build AI models, tune algorithms, work with training data sets, and operationalize ML models. The core of your AI team needs to be data scientists, data engineers, and the folks responsible for putting machine learning models into operation. While there's always a need for coding, development, and project management, finding and growing your data-centric talent is key to the long-term success of your AI initiatives.
The primary challenge with building data talent is that it's hard to find and grow, chiefly because data isn't code. You need folks who know how to wrangle lots of data sources, compile them into clean data sets, and then extract information needles from data haystacks. In addition, the language of AI is math, not programming logic. So a strong data team is also strong in the right kinds of math: able to select and implement AI algorithms, properly tune hyperparameters, and properly interpret testing and validation results. Simply guessing and changing training data sets and hyperparameters at random is not a good way to create AI projects that deliver value. As such, data-centric talent grounded in a fundamental understanding of machine learning math and algorithms, combined with an understanding of how to deal with big data sets, is crucial to AI project success.
Prepare to continue to invest for the long haul
It should be pretty obvious at this point that the set of activities for AI is indeed very much data-centric, and that the activities, artifacts, tools, and team need to follow from that data-centric perspective. The biggest challenge is that so much of that ecosystem is still being developed and is not fully available to most enterprises. AI-specific methodologies are still being tested in large-scale projects. AI-specific tools and technologies are still being developed and enhanced, with evolutionary changes released at a rapid pace. AI talent continues to be tight, and we're just starting to see investment in growing this skill set.
As a result, organizations that need to be successful with AI, even with this data-centric perspective, need to be prepared to invest for the long haul. Find your peer groups to see what methodologies are working for them and continue to iterate until you find something that works for you. Find ways to continuously update your team's skills and methods. Realize that you're on the bleeding edge with AI technology and prepare to reinvest in new technology on a regular basis, or invent your own if need be. Even though the history of AI spans at least seven decades, we're still in the early stages of making AI work for large scale projects. This is like the early days of the Internet or mobile or big data. Those early pioneers had to learn the hard way, making many mistakes before realizing the "right" way to do things. But once those ways were discovered, organizations reaped big rewards. This is where we're at with AI. As long as you have a data-centric perspective and are prepared to continue to invest for the long haul, you will be successful with your AI, machine learning, and cognitive technology efforts.
Visit link:
Want To Be AI-First? You Need To Be Data-First. - Forbes
This Python Package ‘Causal ML’ Provides a Suite of Uplift Modeling and Causal Inference with Machine Learning – MarkTechPost
Causal ML is a Python package for uplift modeling, which estimates heterogeneous treatment effects (HTE), and for causal inference methods that use machine learning (ML) algorithms based on recent research. It provides a standard interface that allows users to estimate the Conditional Average Treatment Effect (CATE) or Individual Treatment Effect (ITE) from experimental or observational data.
The Causal ML package provides eight cutting-edge uplift modeling algorithms that combine causal inference and ML. Essentially, it estimates the causal impact of an intervention T on an outcome Y for users with observed features X, without strong assumptions on the model form. As mentioned earlier, the package deals with uplift modeling, which estimates heterogeneous treatment effects (HTE), so starting with general causal inference and then learning about HTE and uplift modeling will definitely help.
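To make the idea of estimating CATE concrete, here is a minimal sketch of one common uplift approach: a T-learner, which fits separate outcome models for treated and control users and takes the difference of their predictions as the per-user treatment effect. This is an illustration only, using plain numpy, synthetic data, and a made-up treatment effect; it is not Causal ML's own API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: outcome depends on feature x, with a heterogeneous
# treatment effect tau(x) = 1 + 2x that grows with x.
n = 2000
x = rng.uniform(0, 1, n)
t = rng.integers(0, 2, n)                          # random treatment assignment
y = 0.5 * x + t * (1.0 + 2.0 * x) + rng.normal(0, 0.1, n)

X = np.column_stack([np.ones(n), x])               # design matrix with intercept

# T-learner: fit separate linear outcome models for treated and control,
# then estimate CATE as the difference of their predictions.
beta_t, *_ = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)
beta_c, *_ = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)

cate = X @ beta_t - X @ beta_c                     # per-user uplift estimate
ate = cate.mean()                                  # average treatment effect
print(round(ate, 2))                               # close to E[1 + 2x] = 2.0
```

Causal ML's meta-learners follow the same pattern with more sophisticated base models and a uniform estimation interface.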
The GitHub repository contains a good example, in a Jupyter Notebook, of how to use all these algorithms.
Some Use Cases:
The Causal ML package currently supports the following methods:
GitHub: https://github.com/uber/causalml
Documentation: https://causalml.readthedocs.io/en/latest/about.html
Read: Using Causal Inference to Improve the Uber User Experience
Installation (Source: https://causalml.readthedocs.io/en/latest/installation.html )
causalml is available on PyPI and can be installed from pip or from source as follows:
From pip:
From source:
Asif Razzaq is an AI Tech Blogger and Digital Health Business Strategist with robust medical device and biotech industry experience and an enviable portfolio in the development of health apps, AI, and data science. An astute entrepreneur, Asif has distinguished himself as a startup management professional by successfully growing startups from the launch phase into profitable businesses. This has earned him awards including the SGPGI NCBL Young Biotechnology Entrepreneurs Award.
See the original post here:
This Python Package 'Causal ML' Provides a Suite of Uplift Modeling and Causal Inference with Machine Learning - MarkTechPost
Data Transparency and Curation Vital to Success of Healthcare AI – HealthLeaders Media
Amid advances in precision medicine, healthcare is facing the twin challenges of having to curate and tailor the use of patient data to drive genomics-powered breakthroughs.
That was the takeaway from the AI & data sciences track of last week's Precision Medicine World Conference in Santa Clara, California.
"There aren't a lot of physicians saying, 'Bring me more AI,' " said John Mattison, MD, emeritus CMIO and assistant medical director of Kaiser Permanente. "Every physician is saying bring me a safer and more efficient way to deliver care."
Mattison recalled his prolonged conversations with the original developers of IBM's Watson AI technology. "Initially they had no human curation whatsoever," he said. "As Stanford has published over and over again, most of medical published literature is subsequently refuted or ignored, because it's wrong. The original Watson approach was pure machine curation of reported literature without any human curation."
But human curation is not without its own biases. Watson's value to Kaiser was further eroded by Watson's focus on oncology patient data from Memorial Sloan Kettering Cancer Center and MD Anderson Cancer Center, Mattison said.
"I don't really want curation from those two institutions, because they're fee for service, and you get all these biases. The amount of money the drug companies spend on lobbying doctors to use their more expensive novel drugs is remarkably influential. If you're involved in clinical care, you want to take the best output of machine learning and you want to make sure that you have good human curation," which in Kaiser's case, emphasizes value-based care over fee-for-service, he added.
A key in human curation of machine learning and AI is how transparent the curation is, and how accessible the authoring environment for such curation is, so clinicians can make appropriate substitutions for their own requirements, Mattison said.
A current challenge for health systems is that they are being approached by machine learning and AI companies that remain in stealth mode and are not up-front about how and where their technology will share patient data, which makes it difficult for chief data officers to introduce the technology to the health system.
"Using [the patient data] for some commercial, unexpected purpose is very different than using it for the purpose that you have agreed with the health system that you're going to be using it with," said Cora Han, JD, chief health data officer with UC Health, the umbrella organization for UCSF, UCLA, UC Irvine, UC Davis, UC San Diego, and UC Riverside health systems.
Related: Opinion: An 'Epic' Pushback as U.S. Prepares for New Era of Empowering Patient Health Data
A recurring theme during the conference was the need for a third party to provide trusted certification that machine learning and AI algorithms are free from bias, such as confirmation bias or ascertainment bias, meaning basing algorithms on a cohort of patients who do not represent the entire population served by the health system.
"We have no certification groups right now that certify these things as being fair," said Atul Butte, MD, director of UCSF's Bakar Computational Health Sciences Institute. "Imagine a world in five to 10 years where we're only going to buy or license methods or algorithms that have been certified as being fair in our population, in the University of California."
UCLA Health has met or exceeded the goal of representing its own demographics within Atlas, the system's community health initiative that "aims to recruit 150,000 patients across the health system with the goal of creating California's largest genomic resource that can be used for translational medicine," according to the UCLA Health website.
"We are a far cry from [meeting] L.A. county" demographics, said Clara Lajonchere, PhD, deputy director of the UCLA Institute for Precision Health. Currently, 15% of Atlas patients are Latino, and 6%7% are African-American. "While those rates exceed that of some of the other large-scale studies, it still really underlies how critical diversity is going to be."
Recent alliances such as the Google/Ascension agreement, or the Mayo Clinic/nference startup for drug development are further enabling the kind of volume, velocity, and variety that will drive machine learning and AI innovations in healthcare, Han said.
HIPAA, which has enabled business associates such as nference to safely enter patient-sharing relationships with providers such as Mayo, can work against the principle of transparency. "If a tech company signs a BAA with a hospital system, [outsiders] don't get to see that contract," Butte said. "We could take it on faith that all the right terms were put in that contract, but sometimes just naming two entities in a sentence seems sinister and ominous in some ways."
Health systems with more than 100 years of trust associated with their brand find themselves partnering with startups with little or no such trust, and this creates additional tension in the healthcare system.
In addition, concerns linger that deidentified data will somehow be able to be reidentified through the course of its use and sharing by innovative startups.
"Whole genomes, it's hard to deidentify those," Han said. "These are issues that we will be working through."
"We just need to develop a set of standards about how privacy is controlled," said Brook Byers, founder and partner with Kleiner Perkins, a Silicon Valley venture capital firm.
Related: Epic's CEO Is Urging Hospital Customers to Oppose Rules That Would Make It Easier to Share Medical Info
Scott Mace is a contributing writer for HealthLeaders.
See the original post:
Data Transparency and Curation Vital to Success of Healthcare AI - HealthLeaders Media
Patenting Considerations for Artificial Intelligence in Biotech and Synthetic Biology – Part 2: Key Issues in Patent Subject Matter Eligibility -…
In our first blog in this multi-part series, we explored key considerations for protecting artificial intelligence (AI) inventions in biotech and synthetic biology. In this part 2 of the series, we will examine some key considerations and hurdles in patenting machine learning-based biotech or synthetic biology inventions.
In this series, we are focusing on artificial intelligence inventions, but as Alan Turing aptly pointed out, that neologism is a suitcase term because you can stuff a lot of intelligence classifications and different types of technologies into it. Many of the ground-breaking AI developments in biotech are in the AI subfield of Machine Learning. First, we will briefly discuss what is meant by Machine Learning and discuss some relevant terms. Second, we will review some real world challenges in patenting AI inventions.
What is Machine Learning?
Machine learning (ML) is basically a term covering algorithms that use statistics to find and apply patterns in digitally stored data, which can be images, numbers, words, etc. (For a user-friendly overview of the different terms, please see Karen Hao's article "What is Machine Learning?" from the MIT Tech Review, available here.) Deep learning is a subfield of machine learning.
Source: https://www.edureka.co/blog/ai-vs-machine-learning-vs-deep-learning/
There are three general types of ML algorithms: Supervised Learning, Unsupervised Learning, and Reinforcement Learning. The MIT Tech Review published this helpful flow chart to explain what kind of ML the algorithm is using, though if you want a more technical explanation this is a helpful resource.
An ML algorithm is a way of classifying information, and a neural network is a type of algorithm that is meant to classify information the way a human brain does. For example, a neural network can look at pictures, recognize certain elements such as pixel colors, and classify the pictures according to what they show. Neural networks are made up of nodes. A node is an individual computation in which an algorithm assigns significance (or weight) to each input; the weighted sum of that information is then passed through an activation function, which determines what, if anything, is done with the output.
Here's a diagram of what one node might look like:
Image Credit: Skymind
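The node computation described above - a weighted sum of the inputs plus a bias, passed through an activation function - can be sketched in a few lines of code. The particular weights, bias value, and choice of sigmoid activation here are arbitrary illustrative choices, not from any specific network:

```python
import numpy as np

def node(inputs, weights, bias):
    """One neural-network node: weight each input, sum with the bias,
    then pass the result through a sigmoid activation function."""
    z = np.dot(weights, inputs) + bias      # assign significance (weight) to each input
    return 1.0 / (1.0 + np.exp(-z))         # activation decides the node's output

# Example: three inputs, three illustrative weights, and a small bias.
out = node(np.array([0.5, -1.2, 3.0]), np.array([0.4, 0.1, 0.2]), 0.1)
print(out)  # a value between 0 and 1, thanks to the sigmoid
```

A layer is just many such nodes applied to the same inputs, and stacking layers gives the deep networks discussed next.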
A neural network is several nodes working together. Deep Learning (DL) refers to neural networks in which more than three layers are stacked.
Image Credit: Oracle
DL has spawned many of the most significant advancements in biotech in the past few years and is continuing to drive advancements. For example, DL can predict how genetic variation alters cellular processes involved in pathogenesis, use patient data to characterize disease progression, or speed up computational methods to predict protein structure.
Patenting Machine Learning Inventions
Applying for patent protection presents certain risks, especially for computer-based inventions. If your invention is merely a way to improve the functioning of a computer, without tying it to a practical application, then there is a significant risk that the patent office may ultimately reject the application because it is based on ineligible subject matter. Abstract ideas are subject matter that is ineligible for patent protection and can include mental processes (concepts performed by the human mind), methods of organizing human activity (such as fundamental economic concepts or managing interactions between people), or mathematical relationships, formulas, or calculations. This last category is particularly important for AI-based inventions. For example, under U.S. law, an invention that is a stand-alone algorithm is likely to be seen as no more than abstract mathematics and, therefore, not eligible for patent protection.
Mathematical calculations that can be performed by the human mind are "the basic tools of scientific and technological work," which are "free to all men and reserved exclusively to none." Mayo Collaborative Servs. v. Prometheus Labs., 566 U.S. 66 (2012). This may seem an absurd restriction to some: the human mind might be able to carry out the millions of calculations a neural network can perform, even if there is no guarantee that a human mind could finish those calculations in one lifetime. However, permitting patents on basic calculations would cripple scientific exploration and advancement. Therefore, to be eligible for patent protection, an invention centered on an algorithm must significantly advance a specific technical application, not merely use an algorithm to solve a problem. The patent application must explain in detail how the claimed algorithm interacts with the physical infrastructure of the computer, network, or both, and explain the real-world problem the invention is meant to address.
As previously discussed here and here, the tying of algorithms to real world solutions is a requirement in many jurisdictions globally, including the European Patent Office (EPO) and Israel. For example, new guidelines issued by the European Patent Office stress that the AI inventions must have an application for a specific field of technology. In this respect, patent offices are taking a somewhat technical approach and considering AI elements of an invention as any other software element.
Many AI patents face an uphill battle for patentability due to the use of computer systems and algorithms and the rapidly evolving law surrounding subject matter eligibility. To address the changes in law and stem the many patent application rejections, the U.S. Patent and Trademark Office (USPTO) issued Revised Patent Subject Matter Eligibility Guidance in January 2019 and Patent Eligibility Guidance Update in October 2019 which included examples for the revised subject matter eligibility. USPTO director Andrei Iancu stated recently that rejections of AI related patent applications have dropped from 60% to about 32% since the January 2019 guidelines issued.
The USPTO's Example 39 from the October 2019 Patent Eligibility Guidance Update provides a very helpful example of an allowable patent claim for a method of training a neural network for facial detection. The invention attempts to solve the problem of inaccurate facial recognition by using an expanded training set of facial images and then addressing false positives by retraining the algorithm on a new set of images.
The example claim recites "a computer-implemented method of training a neural network for facial detection comprising: [a set of digital images] training the neural network in a first stage using the first training set; creating a second training set for a second stage of training comprising the first training set and digital non-facial images that are incorrectly detected as facial images after the first stage of training; and training the neural network in a second stage using the second training set."
The USPTO's analysis of this claim finds that it is patent-eligible subject matter, despite including an algorithm, because while some of the limitations may be based on mathematical concepts, those mathematical concepts are not recited in the claims. This shows that when an invention involves a neural network, a key focus of the claims should be the inventive means of achieving the result, not the underlying mathematical concepts. For while the claim does mention the computer-implemented method, it does not recite any mathematical relationships, formulas, or calculations.
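The two-stage recipe in Example 39 can be illustrated with a toy sketch: train a first-stage model, collect non-facial examples it wrongly flags as faces, and retrain on the original set plus those false positives. Everything below is a stand-in invented for illustration - a tiny logistic regression in place of the neural network, and random feature vectors with noisy labels in place of images:

```python
import numpy as np

rng = np.random.default_rng(1)

def train(X, y, epochs=300, lr=0.5):
    """Tiny logistic-regression stand-in for the neural network."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y)) / len(y)  # gradient step on log-loss
    return w

def predict(w, X):
    return 1 / (1 + np.exp(-(X @ w))) > 0.5

# Stage 1: train on an initial set of "facial" (1) vs. "non-facial" (0) vectors.
X1 = rng.normal(0, 1, (400, 3))
y1 = (X1[:, 0] + rng.normal(0, 0.5, 400) > 0).astype(float)  # noisy labels
w1 = train(X1, y1)

# Collect non-facial examples incorrectly detected as facial after stage 1.
false_pos = X1[predict(w1, X1) & (y1 == 0)]

# Stage 2: second training set = first set plus the misdetected non-facial images.
X2 = np.vstack([X1, false_pos])
y2 = np.concatenate([y1, np.zeros(len(false_pos))])
w2 = train(X2, y2)
```

The claim's patentable weight sits in this training procedure - the staged construction of the second training set - rather than in the gradient math inside `train`.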
One example of an invention that uses deep learning is U.S. Patent No. 10,196,427, "Epitope focusing by variable effective antigen surface concentration." This invention provides compositions and methods for the generation of an antibody or immunogenic composition, such as a vaccine, through epitope focusing by variable effective antigen surface concentration.
According to the disclosure and the abstract, the invention relies heavily on in silico bioinformatics, meaning scientific experiments or research conducted or produced by means of computer modeling or computer simulation, for collecting and analyzing complex biological data. For example, the disclosure describes neural networks used to generate a map of the protein surfaces of a particular antigen or to generate an in silico library of antigenic variants. The abstract describes one step of the invention as generating in silico a library of potential antigens for use in the immunogenic composition.
However, the claims avoid tripping up on the subject matter eligibility requirement by not reciting the algorithms or the use of a computer in the claims. The claims merely describe what the computer is used to accomplish, without mentioning that the calculations are performed in silico.
For example, claim 1 recites "a method for eliciting an immune response in a human subject, the method comprising: delivering at least six antigens to the human subject, wherein each of the at least six antigens comprises: a target epitope that is common to each of the at least six antigens; and one or more non-conserved regions that are outside of the target epitope; wherein the at least six antigens are delivered such that each individual antigen of the at least six antigens is delivered in an amount that is insufficient to be immunogenic to the human subject on its own, while the at least six antigens are delivered in a combined amount that is sufficient to generate an immune response to the target epitope in the human subject." Claim 1 and the remaining claims, all dependent, may contain limitations that are based on mathematical concepts, but the claim language does not recite those mathematical concepts.
Researchers have made many significant advances in the diagnosis of different kinds of cancer through ML. Patenting these types of aggregated-data inventions can be a challenge: inventions that merely present the results of collecting and analyzing information, without additional elements that identify a particular tool for the presentation or application of the data, are likely abstract ideas. Inventions that involve mathematical manipulation of data, without additional elements that apply that abstract idea, are unpatentable.
In a recent example, the U.S. Patent Trial and Appeal Board (PTAB) affirmed an Examiner's determination that Application No. 13/417,188, aimed at using ML to modernize cancer treatment, failed subject matter eligibility. 2018 Pat. App. LEXIS 3052, *3 (PTAB April 19, 2018). In that case, the invention was a way to connect multiple genomic alterations, such as copy number, DNA methylation, somatic mutations, mRNA expression, and microRNA expression, to create an "[i]ntegrated pathway analysis [] expected to increase the precision and sensitivity of causal interpretations for large sets of observations."
Claim 1 of the patent application read as follows: 1. A method of conveying biological sequence data, comprising: generating a data packet including a first header containing network routing information, a second header containing header information pertaining to the biological sequence data, and a payload containing a representation of the biological sequence data relative to a reference sequence; storing the data packet in a queue in communication with a network interface; and transmitting the data packet over a network accessible through the network interface.
The patent application was rejected by the USPTO as ineligible subject matter because the claimed method of generating a dynamic pathway map (DPM) was merely algorithmic concepts involving the mathematical manipulation of data. The Examiner determined that the claims do not include additional elements/steps appended to the abstract idea that are sufficient to amount to significantly more than mathematical concepts and that, even though the additional elements appended to the abstract idea integrated multiple data sources to identify reproducible and interpretable molecular signatures of tumorigenesis and progression, those elements were routine and conventional techniques for collecting data.
In addition to the abstract idea issues, the 13/417,188 application was also rejected by the USPTO for double patenting, which means that another patent application filed by the same inventors presumably covered the same technology. Interestingly, the USPTO issued Patent No. 10,192,641 on that other application. That other application included a limitation in claim 1 that reads: "formulating a treatment option for the patient based on the reference pathway activity of the factor graph, wherein at least one of the above method operations is performed through a processor." This limitation may have provided the missing additional steps appended to the abstract idea to amount to significantly more than mathematical concepts.
Many nascent protein engineering technology companies are developing fascinating sustainably sourced products using ML. One such company is Arzeda, which is developing scratch-proof computer screens for cell phones using a renewable source you might not believe: tulips. Arzeda has ported the metabolic pathway responsible for making a natural molecule called tulipalin, found in tulips, into industrial microbes. Arzeda is harnessing the power of machine learning to combine protein design, pathway design, HT screening, and strain construction to create and improve designer fermentation strains for virtually any chemical.
Arzeda's U.S. Patent No. 10,025,900 describes its invention as providing computational methods for engineering, selecting, and/or identifying proteins with a desired activity, but as we have seen with the other successful applications, the claims do not state the mathematical equations, but rather the process used to obtain the desired results. Here is part of claim 1 of the '900 patent:
(c) computationally selecting one or more amino acid sequences having structural homology and/or sequence homology to the template protein having the enzymatic activity;
(d) providing a structural model for each of the amino acid sequences selected in step (c);
(e) selecting the amino acid sequences satisfying the functional site description comprising steps of computationally docking a ligand and optimizing positioning of amino acid side chains and main chain atoms of the amino acid sequences; and
(f) recombinantly expressing and confirming the enzymatic activity for at least one of the amino acid sequences that satisfies the functional site description selected from step (e), thereby making the protein having the enzymatic activity.
Key Lessons in Patenting Machine Learning Inventions
There are two key takeaways from the USPTO guidelines and the successful ML-based patent applications. First, focus the claims on how the desired result is achieved. The EPO guidelines impose a comparable requirement of an inventive step and a "further technical effect," which you can read about here. Second, use caution when reciting specific mathematical equations within the claim language.
Our next post in this series will focus on the challenges and benefits of protecting your AI biotech inventions under trade secret law, and how to determine what kind of IP protection, patents or trade secrets, would be most beneficial for your AI biotech invention.