Category Archives: Machine Learning
Top Five Data Privacy Issues that Artificial Intelligence and Machine Learning Startups Need to Know – insideBIGDATA
In this special guest feature, Joseph E. Mutschelknaus, a director in Sterne Kessler's Electronics Practice Group, addresses some of the top data privacy compliance issues that startups dealing with AI and ML applications face. Joseph prosecutes post-issuance proceedings and patent applications before the United States Patent & Trademark Office. He also assists with district court litigation and licensing issues. Based in Washington, D.C., and renowned for more than four decades for dedication to the protection, transfer, and enforcement of intellectual property rights, Sterne, Kessler, Goldstein & Fox is one of the most highly regarded intellectual property specialty law firms in the world.
Last year, the Federal Trade Commission (FTC) hit both Facebook and Google with record fines relating to their handling of personal data. The California Consumer Privacy Act (CCPA), which is widely viewed as the toughest privacy law in the U.S., came online this year. Nearly every U.S. state has its own data breach notification law. And the limits of the EU's General Data Protection Regulation (GDPR), which impacts companies around the world, are being tested in European courts.
For artificial intelligence (AI) startups, data is king. Data is needed to train machine learning algorithms, and in many cases is the key differentiator from competitors. Yet personal data, that is, data relating to an individual, is also subject to an increasing array of regulations.
As last year's $5 billion fine on Facebook demonstrates, the penalties for noncompliance with privacy laws can be severe. In this article, I review the top five privacy compliance issues that every AI or machine learning startup needs to be aware of and have a plan to address.
1. Consider how and when data can be anonymized
Privacy laws are concerned with regulating personally identifiable information. If an individual's data can be anonymized, most of the privacy issues evaporate. That said, the usefulness of data is often premised on being able to identify the individual it is associated with, or at least on being able to correlate different data sets that are about the same individual.
Computer scientists may recognize a technique called a one-way hash as a way to anonymize data used to train machine learning algorithms. Hash operations work by converting data into a number in such a manner that the original data cannot be derived from the number alone. For example, if a data record has the name "John Smith" associated with it, a hash operation may convert the name "John Smith" into a numerical form from which it is mathematically difficult or impossible to derive the individual's name. This anonymization technique is widely used, but is not foolproof. The European data protection authorities have released detailed guidance on how hashes can and cannot be used to anonymize data.
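To make the "not foolproof" point concrete, here is a minimal Python sketch using the standard library's hashlib and hmac modules. The function names, the candidate list, and the key are hypothetical and chosen only for illustration. The sketch shows that a plain hash of a name can be reversed by anyone who can guess and hash likely names, and that a keyed hash changes the output unless the secret key is known.

```python
import hashlib
import hmac


def pseudonymize(name: str) -> str:
    """Plain one-way hash: the same name always yields the same token."""
    normalized = name.strip().lower().encode("utf-8")
    return hashlib.sha256(normalized).hexdigest()


def pseudonymize_keyed(name: str, secret_key: bytes) -> str:
    """Keyed hash (HMAC): the token also depends on a secret key."""
    normalized = name.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()


token = pseudonymize("John Smith")

# Dictionary attack: hash a list of guessed names and compare tokens.
# The candidate list is hypothetical.
candidates = ["Jane Doe", "John Smith", "Alice Jones"]
recovered = [c for c in candidates if pseudonymize(c) == token]

# With a keyed hash, the same guesses fail unless the key is also known.
keyed_token = pseudonymize_keyed("John Smith", b"hypothetical-secret-key")
keyed_guess = pseudonymize_keyed("John Smith", b"attacker-guess")
```

Here `recovered` contains the original name even though the hash itself was never inverted, which is one reason a bare hash of a low-entropy field such as a name may still count as identifiable data.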
Another factor to consider is that many of these privacy regulations, including the GDPR, cover not just data where an individual is identified, but also data where an individual is identifiable. There is an inherent conflict here. Data scientists want a data set that is as rich as possible. Yet, the richer the data set is, the more likely an individual can be identified from it.
For example, The New York Times wrote an investigative piece on location data. Although the data was anonymized, the Times was able to identify the data record describing the movements of New York City Mayor Bill de Blasio, by simply cross-referencing the data with his known whereabouts at Gracie Mansion. This example illustrates the inherent limits to anonymization in dealing with privacy compliance.
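The cross-referencing idea can be sketched in a few lines of Python. Everything below is hypothetical: the pseudonymous IDs, the place labels, and the hours are invented, and real location data would use coordinates and timestamps rather than labels. The point is only that counting matches against a handful of publicly known whereabouts can single out one record.

```python
from collections import Counter

# Hypothetical "anonymized" pings: (pseudonymous_id, place, hour_of_day).
pings = [
    ("u1", "residence", 7), ("u1", "office", 10), ("u1", "residence", 22),
    ("u2", "residence", 7), ("u2", "gym", 10),
    ("u3", "office", 10), ("u3", "diner", 22),
]

# Publicly known whereabouts of one person (also hypothetical).
known_whereabouts = {("residence", 7), ("office", 10), ("residence", 22)}


def best_match(pings, known):
    """Return (pseudonymous_id, match_count) for the ID with the most matches."""
    hits = Counter(pid for pid, place, hour in pings if (place, hour) in known)
    return hits.most_common(1)[0] if hits else None


match = best_match(pings, known_whereabouts)
```

No names appear anywhere in `pings`, yet the record that matches all three known sightings is very likely the person in question.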
2. What is needed in a compliant privacy policy
If anonymization is not possible in the context of your business, the next step is obtaining the consent of the data subjects. This can be tricky, particularly in cases where the underlying data is surreptitiously gathered.
Many companies rely on privacy policies as a way of getting data subjects' consent to collect and process personal information. For this to be effective, the privacy policy must explicitly and particularly state how the data is to be used. Generally stating that the data may be used to train algorithms is usually insufficient. If your data scientists find a new use for the data you've collected, you must return to the data subjects and get them to agree to an updated privacy policy. The FTC regards a company's noncompliance with its own privacy policy as an unfair or deceptive trade practice subject to investigation and possible penalty. This sort of noncompliance was the basis for the $5 billion fine assessed against Facebook last year.
3. How to provide a right to be forgotten
To comply with many of these regulations, including the GDPR and CCPA, you must provide not only a way for a data subject to refuse consent, but also a way for a data subject to withdraw consent already given. This is sometimes called a right to erasure or a right to be forgotten. In some cases, a company must provide a way for subjects to restrict uses of data, offering data subjects a menu of ways the company can and cannot use collected data.
In the context of machine learning, this can be very tricky. Some algorithms, once trained, are difficult to untrain. The ability to remove personal information has to be baked into the system design at the outset.
4. What processes and safeguards need to be in place to properly handle personal data
Privacy compliance attorneys need to be directly involved in the product design effort. Even in big, sophisticated companies, compliance issues usually arise when those responsible for privacy compliance aren't aware of or don't understand the underlying technology.
The GDPR requires certain companies to designate data protection officers who are responsible for compliance. There are also record-keeping and auditing obligations in many of these regulations.
5. How to ensure that data security practices are legally adequate
Having collected personal data, you are under an obligation to keep it secure. The FTC regularly brings enforcement actions against companies with unreasonably bad security practices and has detailed guidelines on what practices it considers appropriate.
If a data breach does occur, you should immediately contact a lawyer. Every U.S. state has its own laws governing data breach notification, and each imposes different requirements in terms of notification and possibly remuneration.
Collecting personal data is an essential part of many machine learning startups. Lack of a well-constructed compliance program can be an Achilles' heel to any business plan. It is a recipe for an expensive lawsuit or government investigation that could be fatal to a young startup business. So, a comprehensive compliance program has to be an essential part of any AI/ML startup's business plan.
COVID-19 Impacts: Machine Learning Market will Accelerate at a CAGR of about 39% through 2020-2024 | The Increasing Adoption of Cloud-based Offerings…
LONDON--(BUSINESS WIRE)--Technavio has been monitoring the machine learning market and it is poised to grow by $11.16 bn during 2020-2024, progressing at a CAGR of about 39% during the forecast period. The report offers an up-to-date analysis regarding the current market scenario, latest trends and drivers, and the overall market environment.
Technavio suggests three forecast scenarios (optimistic, probable, and pessimistic) considering the impact of COVID-19.
The market is fragmented, and the degree of fragmentation will accelerate during the forecast period. Alibaba Group Holding Ltd., Alphabet Inc., Amazon.com Inc., Cisco Systems Inc., Hewlett Packard Enterprise Development LP, International Business Machines Corp., Microsoft Corp., Salesforce.com Inc., SAP SE, and SAS Institute Inc. are some of the major market participants. To make the most of the opportunities, market vendors should focus more on the growth prospects in the fast-growing segments, while maintaining their positions in the slow-growing segments.
The increasing adoption of cloud-based offerings has been instrumental in driving the growth of the market. However, the shortage of skilled personnel might hamper market growth.
Machine learning market 2020-2024 : Segmentation
Machine learning market is segmented as below:
Machine learning market 2020-2024 : Scope
Technavio presents a detailed picture of the market by the way of study, synthesis, and summation of data from multiple sources. Our machine learning market report covers the following areas:
This study identifies the increasing use of machine learning in customer experience management as one of the prime reasons driving the machine learning market growth during the next few years.
Machine learning market 2020-2024 : Vendor Analysis
We provide a detailed analysis of around 25 vendors operating in the machine learning market, including Alibaba Group Holding Ltd., Alphabet Inc., Amazon.com Inc., Cisco Systems Inc., Hewlett Packard Enterprise Development LP, International Business Machines Corp., Microsoft Corp., Salesforce.com Inc., SAP SE, and SAS Institute Inc. Backed with competitive intelligence and benchmarking, our research reports on the machine learning market are designed to provide entry support, customer profiles and M&As, as well as go-to-market strategy support.
Machine learning market 2020-2024 : Key Highlights
Table Of Contents :
Executive Summary
Market Landscape
Market Sizing
Five Forces Analysis
Market Segmentation by End-user
Customer Landscape
Geographic Landscape
Drivers, Challenges, and Trends
Vendor Landscape
Vendor Analysis
Appendix
About Us
Technavio is a leading global technology research and advisory company. Their research and analysis focuses on emerging market trends and provides actionable insights to help businesses identify market opportunities and develop effective strategies to optimize their market positions. With over 500 specialized analysts, Technavio's report library consists of more than 17,000 reports and counting, covering 800 technologies, spanning across 50 countries. Their client base consists of enterprises of all sizes, including more than 100 Fortune 500 companies. This growing client base relies on Technavio's comprehensive coverage, extensive research, and actionable market insights to identify opportunities in existing and potential markets and assess their competitive positions within changing market scenarios.
2 books to deepen your command of python machine learning – TechTalks
This post is part of AI education, a series of posts that review and explore educational content on data science and machine learning. (In partnership with Paperspace)
Mastering machine learning is not easy, even if you're a crack programmer. I've seen many people come from a solid background of writing software in different domains (gaming, web, multimedia, etc.) thinking that adding machine learning to their roster of skills is another walk in the park. It's not. And every single one of them has been dismayed.
I see two reasons for why the challenges of machine learning are misunderstood. First, as the name suggests, machine learning is software that learns by itself as opposed to being instructed on every single rule by a developer. This is an oversimplification that many media outlets with little or no knowledge of the actual challenges of writing machine learning algorithms often use when speaking of the ML trade.
The second reason, in my opinion, is the many books and courses that promise to teach you the ins and outs of machine learning in a few hundred pages (and the ads on YouTube that promise to net you a machine learning job if you pass an online course). Now, I don't want to vilify any of those books and courses. I've reviewed several of them (and will review some more in the coming weeks), and I think they're invaluable sources for becoming a good machine learning developer.
But they're not enough. Machine learning requires both good coding and math skills and a deep understanding of various types of algorithms. If you're doing Python machine learning, you have to have in-depth knowledge of many libraries and also master the many programming and memory-management techniques of the language. And, contrary to what some people say, you can't escape the math.
And all of that can't be summed up in a few hundred pages. Rather than a single volume, the complete guide to machine learning would probably look like Donald Knuth's famous The Art of Computer Programming series.
So, what is all this tirade for? In my exploration of data science and machine learning, I'm always on the lookout for books that take a deep dive into topics that are skimmed over by the more general, all-encompassing books.
In this post, I'll look at Python for Data Analysis and Practical Statistics for Data Scientists, two books that will help deepen your command of the coding and math skills required to master Python machine learning and data science.
Python for Data Analysis, 2nd Edition, is written by Wes McKinney, the creator of pandas, one of the key libraries used in Python machine learning. Doing machine learning in Python involves loading and preprocessing data in pandas before feeding it to your models.
Most books and courses on machine learning provide an introduction to the main pandas components such as DataFrames and Series and some of the key functions such as loading data from CSV files and cleaning rows with missing data. But the power of pandas is much broader and deeper than what you see in a chapter's worth of code samples in most books.
In Python for Data Analysis, McKinney takes you through the entire functionality of pandas and manages to do so without making it read like a reference manual. There are lots of interesting examples that build on top of each other and help you understand how the different functions of pandas tie in with each other. You'll go in-depth on things such as cleaning, joining, and visualizing data sets, topics that are usually only discussed briefly in most machine learning books.
You'll also get to explore some very important challenges, such as memory management and code optimization, which can become a big deal when you're handling very large data sets in machine learning (which you often do).
What I also like about the book is the finesse that has gone into choosing subjects to fit in the 500 pages. While most of the book is about pandas, McKinney has taken great care to complement it with material about other important Python libraries and topics. You'll get a good overview of array-oriented programming with NumPy, another important Python library often used in machine learning in concert with pandas, and some important techniques in using Jupyter Notebooks, the tool of choice for many data scientists.
All this said, don't expect Python for Data Analysis to be a very fun book. It can get boring because it just discusses working with data (which happens to be the most boring part of machine learning). There won't be any end-to-end examples where you'll get to see the result of training and using a machine learning algorithm or integrating your models in real applications.
My recommendation: You should probably pick up Python for Data Analysis after going through one of the introductory or advanced books on data science or machine learning. Having that introductory background on working with Python machine learning libraries will help you better grasp the techniques introduced in the book.
While Python for Data Analysis improves your data-processing and data-manipulation coding skills, the second book we'll look at, Practical Statistics for Data Scientists, 2nd Edition, will be the perfect resource to deepen your understanding of the core mathematical logic behind many key algorithms and concepts that you often deal with when doing data science and machine learning.
The book starts with simple concepts such as different types of data, means and medians, standard deviations, and percentiles. Then it gradually takes you through more advanced concepts such as different types of distributions, sampling strategies, and significance testing. These are all concepts you have probably learned in math class or read about in data science and machine learning books.
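As a rough illustration of those starting concepts, and not an excerpt from the book, the sketch below computes a mean, median, standard deviation, and percentile with Python's standard statistics module. The sample values are hypothetical, and the "inclusive" quantile method is one of several conventions for computing percentiles.

```python
import statistics

# Hypothetical sample, invented for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
median = statistics.median(values)
spread = statistics.pstdev(values)  # population standard deviation

# quantiles(..., n=100) returns the 99 percentile cut points;
# index 89 is therefore the 90th percentile.
percentiles = statistics.quantiles(values, n=100, method="inclusive")
p90 = percentiles[89]

# One extreme value moves the mean far more than the median.
with_outlier = values + [100]
mean_shift = statistics.mean(with_outlier) - mean
median_shift = statistics.median(with_outlier) - median
```

Comparing `mean_shift` with `median_shift` shows why the median is often described as the more robust estimate of location.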
But again, the key here is specialization.
On the one hand, the depth that Practical Statistics for Data Scientists brings to each of these topics is greater than you'll find in machine learning books. On the other hand, every topic is introduced along with coding examples in Python and R, which makes it more suitable than classic statistics textbooks. Moreover, the authors have done a great job of disambiguating the way different terms are used in data science and other fields. Each topic is accompanied by a box that provides all the different synonyms for popular terms.
As you go deeper into the book, you'll dive into the mathematics of machine learning algorithms such as linear and logistic regression, K-nearest neighbors, trees and forests, and K-means clustering. In each case, like the rest of the book, there's more focus on what's happening under the algorithm's hood rather than using it for applications. But the authors have again made sure the chapters don't read like classic math textbooks and the formulas and equations are accompanied by nice coding examples.
Like Python for Data Analysis, Practical Statistics for Data Scientists can get a bit boring if you read it end to end. There are no exciting applications or a continuous process where you build your code through the chapters. But on the other hand, the book has been structured in a way that you can read any of the sections independently without the need to go through previous chapters.
My recommendation: Read Practical Statistics for Data Scientists after going through an introductory book on data science and machine learning. I definitely recommend reading the entire book once, though to make it more enjoyable, go topic by topic in between your exploration of other machine learning courses. Also keep it handy. You'll probably revisit some of the chapters from time to time.
I would definitely count Python for Data Analysis and Practical Statistics for Data Scientists as two must-reads for anyone who is on the path of learning data science and machine learning. Although they might not be as exciting as some of the more practical books, you'll appreciate the depth they add to your coding and math skills.
Deep learning’s role in the evolution of machine learning – TechTarget
Machine learning had a rich history long before deep learning reached fever pitch. Researchers and vendors were using machine learning algorithms to develop a variety of models for improving statistics, recognizing speech, predicting risk and other applications.
While many of the machine learning algorithms developed over the decades are still in use today, deep learning -- a form of machine learning based on multilayered neural networks -- catalyzed a renewed interest in AI and inspired the development of better tools, processes and infrastructure for all types of machine learning.
Here, we trace the significance of deep learning in the evolution of machine learning, as interpreted by people active in the field today.
The story of machine learning starts in 1943 when neurophysiologist Warren McCulloch and mathematician Walter Pitts introduced a mathematical model of a neural network. The field gathered steam in 1956 at a summer conference on the campus of Dartmouth College. There, 10 researchers came together for six weeks to lay the ground for a new field that involved neural networks, automata theory and symbolic reasoning.
The distinguished group, many of whom would go on to make seminal contributions to this new field, gave it the name artificial intelligence to distinguish it from cybernetics, a competing area of research focused on control systems. In some ways these two fields are now starting to converge with the growth of IoT, but that is a topic for another day.
Early neural networks were not particularly useful -- nor deep. Perceptrons, the single-layered neural networks in use then, could only learn linearly separable patterns. Interest in them waned after Marvin Minsky and Seymour Papert published the book Perceptrons in 1969, highlighting the limitations of existing neural network algorithms and causing the emphasis in AI research to shift.
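The limitation is easy to reproduce. The following is a minimal sketch of a single-layer perceptron in plain Python. The function names, learning rate, and epoch count are choices made for this illustration and are not taken from the book. The same training loop learns the linearly separable AND function but cannot fit XOR, which no single straight line can separate.

```python
def train_perceptron(samples, epochs=50, lr=1.0):
    """Classic perceptron rule on 2-input samples; returns (weights, bias)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - pred
            if delta != 0:
                # Nudge the decision boundary toward the misclassified point.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:  # every sample classified correctly
            break
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the trained perceptron classifies correctly."""
    correct = sum(
        (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == target
        for (x1, x2), target in samples
    )
    return correct / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(AND, *train_perceptron(AND))
xor_accuracy = accuracy(XOR, *train_perceptron(XOR))
```

Training on AND stops early once every sample is correct, while training on XOR runs out of epochs without ever classifying all four points.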
"There was a massive focus on symbolic systems through the '70s, perhaps because of the idea that perceptrons were limited in what they could learn," said Sanmay Das, associate professor of computer science and engineering at Washington University in St. Louis and chair of the Association for Computing Machinery's special interest group on AI.
The 1973 publication of Pattern Classification and Scene Analysis by Richard Duda and Peter Hart introduced other types of machine learning algorithms, reinforcing the shift away from neural nets. A decade later, Machine Learning: An Artificial Intelligence Approach by Ryszard S. Michalski, Jaime G. Carbonell and Tom M. Mitchell further defined machine learning as a domain driven largely by the symbolic approach.
"That catalyzed a whole field of more symbolic approaches to [machine learning] that helped frame the field. This led to many Ph.D. theses, new journals in machine learning, a new academic conference, and even helped to create new laboratories like the NASA Ames AI Research branch, where I was deputy chief in the 1990s," said Monte Zweben, CEO of Splice Machine, a scale-out SQL platform.
In the 1990s, the evolution of machine learning made a turn. Driven by the rise of the internet and increase in the availability of usable data, the field began to shift from a knowledge-driven approach to a data-driven approach, paving the way for the machine learning models that we see today.
The turn toward data-driven machine learning in the 1990s was built on research done by Geoffrey Hinton at the University of Toronto in the mid-1980s. Hinton and his team demonstrated the ability to use backpropagation to build deeper neural networks.
"This was a major breakthrough enabling new kinds of pattern recognition that were previously not feasible with neural nets," Zweben said. This added new layers to the networks and a way to strengthen or weaken connections back across many layers in the network, leading to the term deep learning.
Although possible in a lab setting, deep learning did not immediately find its way into practical applications, and progress stalled.
"Through the '90s and '00s, a joke used to be that 'neural networks are the second-best learning algorithm for any problem,'" Washington University's Das said.
Meanwhile, commercial interest in AI was starting to wane because the hype around developing an AI on par with human intelligence had gotten ahead of results, leading to an AI winter, which lasted through the 1980s. What did gain momentum was a type of machine learning using kernel methods and decision trees that enabled practical commercial applications.
Still, the field of deep learning was not completely in retreat. In addition to the ascendancy of the internet and increase in available data, another factor proved to be an accelerant for neural nets, according to Zweben: namely, distributed computing.
Machine learning requires a lot of compute. In the early days, researchers had to keep their problems small or gain access to expensive supercomputers, Zweben said. The democratization of distributed computing in the early 2000s enabled researchers to run calculations across clusters of relatively low-cost commodity computers.
"Now, it is relatively cheap and easy to experiment with hundreds of models to find the best combination of data features, parameters and algorithms," Zweben said. The industry is pushing this democratization even further with practices and associated tools for machine learning operations that bring DevOps principles to machine learning deployment, he added.
Machine learning is also only as good as the data it is trained on, and if data sets are small, it is harder for the models to infer patterns. As the data created by mobile, social media, IoT and digital customer interactions grew, it provided the training material deep learning techniques needed to mature.
By 2012, deep learning attained star status after Hinton's team won ImageNet, a popular data science challenge, for their work on classifying images using neural networks. Things really accelerated after Google subsequently demonstrated an approach to scaling up deep learning across clusters of distributed computers.
"The last decade has been the decade of neural networks, largely because of the confluence of the data and computational power necessary for good training and the adaptation of algorithms and architectures necessary to make things work," Das said.
Even when deep neural networks are not used directly, they indirectly drove -- and continue to drive -- fundamental changes in the field of machine learning, including the following:
Deep learning's predictive power has inspired data scientists to think about different ways of framing problems that come up in other types of machine learning.
"There are many problems that we didn't think of as prediction problems that people have reformulated as prediction problems -- language, vision, etc. -- and many of the gains in those tasks have been possible because of this reformulation," said Nicholas Mattei, assistant professor of computer science at Tulane University and vice chair of the Association for Computing Machinery's special interest group on AI.
In language processing, for example, a lot of the focus has moved toward predicting what comes next in the text. In computer vision as well, many problems have been reformulated so that, instead of trying to understand geometry, the algorithms are predicting labels of different parts of an image.
The power of big data and deep learning is changing how models are built. Human analysis and insights are being replaced by raw compute power.
"Now, it seems that a lot of the time we have substituted big databases, lots of GPUs, and lots and lots of machine time to replace the deep problem introspection needed to craft features for more classic machine learning methods, such as SVM [support vector machine] and Bayes," Mattei said, referring to the Bayesian networks used for modeling the probabilities between observations and outcomes.
The art of crafting a machine learning problem has been taken over by advanced algorithms and the millions of hours of CPU time baked into pretrained models so data scientists can focus on other projects or spend more time on customizing models.
Deep learning is also helping data scientists solve problems with smaller data sets and solve problems in cases where the data has not been labeled.
"One of the most relevant developments in recent times has been the improved use of data, whether in the form of self-supervised learning, improved data augmentation, generalization of pretraining tasks or contrastive learning," said Juan José López Murphy, AI and big data tech director lead at Globant, an IT consultancy.
These techniques reduce the need for manually tagged and processed data. This is enabling researchers to build large models that can capture complex relationships representing the nature of the data and not just the relationships representing the task at hand. López Murphy is starting to see transfer learning being adopted as a baseline approach, where researchers can start with a pretrained model that only requires a small amount of customization to provide good performance on many common tasks.
There are specific fields where deep learning provides a lot of value, in image, speech and natural language processing, for example, as well as time series forecasting.
"The broader field of machine learning is enhanced by deep learning and its ability to bring context to intelligence. Deep learning also improves [machine learning's] ability to learn nonlinear relationships and manage dimensionality with systems like autoencoders," said Luke Taylor, founder and COO at TrafficGuard, an ad fraud protection service.
For example, deep learning can find more efficient ways to autoencode the raw text of characters and words into vectors representing the similarities and differences of words, which can improve the efficiency of the machine learning algorithms used to process it. Deep learning algorithms that can recognize people in pictures make it easier to use other algorithms that find associations between people.
More recently, there have been significant jumps using deep learning to improve the use of image, text and speech processing through common interfaces. People are accustomed to speaking to virtual assistants on their smartphones and using facial recognition to unlock devices and identify friends in social media.
"This broader adoption creates more data, enables more machine learning refinement and increases the utility of machine learning even further, pushing even further adoption of this tech into people's lives," Taylor said.
Early machine learning research required expensive software licenses. But deep learning pioneers began open sourcing some of the most powerful tools, which has set a precedent for all types of machine learning.
"Earlier, machine learning algorithms were bundled and sold under a licensed tool. But, nowadays, open source libraries are available for any type of AI applications, which makes the learning curve easy," said Sachin Vyas, vice president of data, AI and automation products at LTI, an IT consultancy.
Another factor in democratizing access to machine learning tools has been the rise of Python.
"The wave of open source frameworks for deep learning cemented the prevalence of Python and its data ecosystem for research, development and even production," Globant's López Murphy said.
Many of the different commercial and free options got replaced, integrated or connected to a Python layer for widespread use. As a result, Python has become the de facto lingua franca for machine learning development.
Deep learning has also inspired the open source community to automate and simplify other aspects of the machine learning development lifecycle. "Thanks to things like graphical user interfaces and [automated machine learning], creating working machine learning models is no longer limited to Ph.D. data scientists," Carmen Fontana, IEEE member and cloud and emerging tech practice lead at Centric Consulting, said.
For machine learning to keep evolving, enterprises will need to find a balance between developing better applications and respecting privacy.
Data scientists will need to be more proactive in understanding where their data comes from and the biases that may inadvertently be baked into it, as well as develop algorithms that are transparent and interpretable. They also need to keep pace with new machine learning protocols and the different ways these can be woven together with various data sources to improve applications and decisions.
"Machine learning provides more innovative applications for end users, but unless we're choosing the right data sets and advancing deep learning protocols, machine learning will never make the transition from computing a few results to providing actual intelligence," said Justin Richie, director of data science at Nerdery, an IT consultancy.
"It will be interesting to see how this plays out in different industries and if this progress will continue even as data privacy becomes more stringent," Richie said.
More:
Deep learning's role in the evolution of machine learning - TechTarget
What I Learned From Looking at 200 Machine Learning Tools – Machine Learning Times – machine learning & data science news – The Predictive…
Originally published in Chip Huyen Blog, June 22, 2020
To better understand the landscape of available tools for machine learning production, I decided to look up every AI/ML tool I could find. The resources I used include:
After filtering out applications companies (e.g. companies that use ML to provide business analytics), tools that aren't being actively developed, and tools that nobody uses, I got 202 tools. See the full list. Please let me know if there are tools you think I should include but aren't on the list yet!
Disclaimer
This post consists of 6 parts:
I. Overview
II. The landscape over time
III. The landscape is under-developed
IV. Problems facing MLOps
V. Open source and open-core
VI. Conclusion
I. OVERVIEW
One generalization of the ML production flow that I agree with breaks it into 4 steps:
I categorize the tools based on which step of the workflow each one supports. I don't include Project setup since it requires project management tools, not ML tools. This isn't always straightforward since one tool might help with more than one step. Their ambiguous descriptions don't make it any easier: "we push the limits of data science," "transforming AI projects into real-world business outcomes," "allows data to move freely, like the air you breathe," and my personal favorite: "we lived and breathed data science."
I put the tools that cover more than one step of the pipeline into the category that they are best known for. If they're known for multiple categories, I put them in the All-in-one category. I also add an Infrastructure category for companies that provide infrastructure for training and storage. Most of these are cloud providers.
To continue reading this article click here.
Continue reading here:
What I Learned From Looking at 200 Machine Learning Tools - Machine Learning Times - machine learning & data science news - The Predictive...
Protecting inventions which use Machine Learning and Artificial Intelligence – Lexology
There has been a lot of talk recently about the DABUS family of patent applications where DABUS, an artificial intelligence (AI), was named as an inventor. This has prompted a lot of discussion around whether an inventor must be a human being and there is no doubt that this discussion will continue as AI finds its way into more and more aspects of our lives.
However, one of the other parts of the discussion around AI in patents is around the patentability of inventions which apply machine learning (ML) and AI based concepts to the solution of technical problems.
Why consider patent protection?
Patents protect technical innovations and technical solutions to problems. They can offer broad legal protection for the technical concept you develop, albeit in exchange for disclosure of the invention.
Here in the UK, a patent can give you the right to prevent others from exploiting your invention and can help you to mark out legal exclusivity around a patented product.
Can I not just keep the invention a secret?
It is an option to utilise the invention as a trade secret, but protecting a trade secret involves considerable effort to implement the technical and administrative environment which will keep it secret. This can include changing your physical workplace to confine certain environments where trade secret-protected inventions are being used. It can also include implementing technical measures to prevent unauthorised individuals from accessing trade secrets. Such technical measures are particularly important for AI and ML-focused inventions as they are often embodied in computer program code which can simply be transferred from one computer to another.
What is perhaps more pertinent is that if your AI or ML-enabled concept is to be implemented in association with hardware which is to be sold publicly, then this will by definition negate the value of the concept as a trade secret as it will become publicly available. It may require decompilation or reverse engineering to access the code, but this does not mean that the code is secret.
There may be additional know-how associated with your invention which is worth protecting as a trade secret, but as one part of a suite of IP rights (including patents) which are focused on protecting your invention.
How much information does the patent application require?
All patent applications are drafted for the skilled person, who in this context would be somebody skilled in the techniques of ML and AI, although not necessarily an expert. That is to say, the application needs to contain enough information to enable such a person to put the invention into effect.
This should include technical information about features which provide an advantage over previous systems and clear identification of advantageous features and why they are advantageous. This will give your Patent Attorney the best possible chance of framing the invention in a way which convinces patent offices around the world to grant a patent.
It is also advisable to include disclosure of at least one set of training data and details of how the model has been trained.
In the context of AI and ML it is particularly important to draw attention to technically advantageous features as some patent offices will need a lot of convincing to grant patents for these inventions. It is particularly useful to draw attention to features which solve technical problems or are motivated by technical considerations rather than economic or commercial considerations.
The EPO have stressed that patents will be granted when ML or AI based inventions are limited to a specific technical application or require a specific technical implementation which is directed to a technical purpose. These advantages and details of implementation will enable a patent attorney skilled in drafting patent applications for ML/AI to present your invention in the best possible light from the perspective of the EPO or the UKIPO, as they will enable us to clearly set out how the invention delivers the technical application and solves the technical problem.
Our software patents team are specifically noted for their skill in drafting computer implemented inventions for the UKIPO and the EPO.
Although a lot of information is required, we do not necessarily need program code. It would help, however, to at least include a pseudocode description of the invention so that we can understand how the invention works as a series of steps; this helps with the description.
Are AI and ML not just like software, i.e. cannot be patented?
It is possible to patent software-based inventions but, like other inventions, the invention needs to solve a technical problem. This is the same with inventions which apply AI and ML.
AI and ML inventions are treated in Europe like other mathematical methods in that they are rejected as excluded from patentability if they do not solve a technical problem. It is best to illustrate this by example.
If your invention improves a technique which is used to analyse data (for example, it improves K-means clustering with no other benefit to a technical field), then you can expect to face considerable obstacles to obtaining a patent to protect your invention. However, if your invention applies K-means clustering to achieve a specific improvement to a specific technical system, then you are likely to face fewer obstacles to obtaining a patent for your invention.
That is to say, when considering whether you wish to pursue patent protection for the technology you have developed then focus on what the innovation achieves in a technical field.
What if the technique has been applied elsewhere? Can I still get a patent?
Referring back to our K-means clustering example, if you see that K-means clustering has been used in sensing of rain droplets on a car window to determine the appropriate setting for the windscreen wipers, then that does not necessarily mean that you cannot get a patent for K-means clustering applied to determining the likelihood of a denial of service attack on a server.
That is to say, if you are applying known technology to a new field and solving a technical problem in that field, there is an arguable case for a patentable invention.
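For readers unfamiliar with the technique in the example, the following is a minimal sketch of plain K-means clustering, the kind of step-by-step description a pseudocode disclosure might contain. The traffic measurements (requests per second, distinct source addresses per second), the function name and the choice of starting centroids are all hypothetical and chosen only for illustration; nothing here is taken from a real patent application or detection system.

```python
import math


def kmeans(points, initial_centroids, iterations=20):
    """Plain K-means (Lloyd's algorithm) on a list of numeric tuples."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments


# Hypothetical server measurements: (requests per second, distinct sources per second).
traffic = [(10.0, 8.0), (12.0, 9.0), (11.0, 10.0),
           (900.0, 3.0), (950.0, 2.0), (1000.0, 4.0)]
centroids, labels = kmeans(traffic, initial_centroids=[traffic[0], traffic[3]])
print(labels)
```

The clustering itself is the same mathematics whatever the data describes. What differs between the windscreen and server examples is the technical system the output feeds into, which is where the article suggests the patentability argument lies.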
Are there differences between Europe, US and other jurisdictions?
The approach to these inventions across jurisdictions can be different and complete consistency is difficult to guarantee. However, in drafting your patent application we would seek to make the language as flexible as possible in order to admit differing interpretations of the law across jurisdictions and to give the prosecution of your patent applications in those jurisdictions the greatest possible chance of success.
What do I do next?
If you have developed technology which applies AI or ML, then consider whether you could achieve patent protection for that invention. Contact one of our software patent experts to discuss the invention and your options.
It is also useful to note that having a pending patent application can be a useful deterrent for competitors and the uncertainty created for third parties by the existence of the patent application can provide you with the space in the market to establish your exclusivity, develop your customer base and build your brand.
Read this article:
Protecting inventions which use Machine Learning and Artificial Intelligence - Lexology
Machine learning finds use in creating sharper maps of ‘ecosystem’ lines in the ocean – Firstpost
EOS Jul 01, 2020 14:54:08 IST
On land, it's easy for us to see divisions between ecosystems: A rain forest's fan palms and vines stand in stark relief to the cacti of a high desert. Without detailed data or scientific measurements, we can tell a distinct difference in the ecosystems' flora and fauna.
But how do scientists draw those divisions in the ocean? A new paper proposes a tool to redraw the lines that define an ocean's ecosystems, lines originally penned by the seagoing oceanographer Alan Longhurst in the 1990s. The paper uses unsupervised learning, a machine learning method, to analyze the complex interplay between plankton species and nutrient fluxes. As a result, the tool could give researchers a more flexible definition of ecosystem regions.
Using the tool on global modeling output suggests that the ocean's surface has more than 100 different regions or as few as 12 if aggregated, simplifying the 56 Longhurst regions. The research could complement ongoing efforts to improve fisheries management and satellite detection of shifting plankton under climate change. It could also direct researchers to more precise locations for field sampling.
A sea turtle in the aqua blue waters of Hawaii. Image: Rohit Tandon/Unsplash
Coccolithophores, diatoms, zooplankton, and other planktonic life-forms float on much of the ocean's sunlit surface. Scientists monitor plankton with long-term sampling stations and peer at their colors by satellite from above, but they don't have detailed maps of where plankton lives worldwide.
Models help fill the gaps in scientists' knowledge, and the latest research relies on an ocean model to simulate where 51 types of plankton amass on the surface oceans worldwide. The latest research then applies the new classification tool, called the systematic aggregated ecoprovince (SAGE) method, to discern where neighborhoods of like-minded plankton and nutrients appear.
SAGE relies, in part, on a type of machine learning algorithm called unsupervised learning. The algorithm's strength is that it searches for patterns unprompted by researchers.
To compare the tool to a simple example, if scientists told an algorithm to identify shapes in photographs like circles and squares, the researchers could supervise the process by telling the computer what a square and circle looked like before it began. But in unsupervised learning, the algorithm has no prior knowledge of shapes and will sift through many images to identify patterns of similar shapes itself.
Using an unsupervised approach gives SAGE the freedom to let patterns emerge that the scientists might not otherwise see.
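The contrast between the two kinds of learning can be sketched with the shapes example. This is only an illustration and not the SAGE method, whose algorithm the article does not describe. The single "roundness" feature (close to 1.0 for a circle, close to 0.785 for a square), the sample values and all function names are assumptions made for the sketch.

```python
def fit_supervised(samples, labels):
    """Supervised: labels are given, so learn one mean value per named class."""
    grouped = {}
    for x, label in zip(samples, labels):
        grouped.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in grouped.items()}


def predict(class_means, x):
    """Assign x to the class whose learned mean is closest."""
    return min(class_means, key=lambda label: abs(class_means[label] - x))


def group_unsupervised(samples, max_gap):
    """Unsupervised: no labels; split the sorted values wherever a gap exceeds max_gap."""
    ordered = sorted(samples)
    groups = [[ordered[0]]]
    for x in ordered[1:]:
        if x - groups[-1][-1] > max_gap:
            groups.append([x])  # large jump: start a new group
        else:
            groups[-1].append(x)
    return groups


# Hypothetical roundness scores measured from six photographs.
roundness = [0.98, 0.79, 0.99, 0.78, 0.97, 0.80]
names = ["circle", "square", "circle", "square", "circle", "square"]

# Supervised: the computer is told which photos are circles and which are squares.
class_means = fit_supervised(roundness, names)
print(predict(class_means, 0.95))

# Unsupervised: the same numbers with no names attached.
groups = group_unsupervised(roundness, max_gap=0.1)
print(groups)
```

The unsupervised function finds two groups without ever being told that circles or squares exist. It cannot name them, but it can reveal that the data falls into distinct clusters, which is the property the researchers rely on.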
"While my human eyes can't see these different regions that stand out, the machine can," first author and physical oceanographer Maike Sonnewald at Princeton University said. "And that's where the power of this method comes in." This method could be used more broadly by geoscientists in other fields to make sense of nonlinear data, said Sonnewald.
A machine-learning technique developed at MIT combs through global ocean data to find commonalities between marine locations, based on how phytoplankton species interact with each other. Using this approach, researchers have determined that the ocean can be split into over 100 types of provinces, and 12 megaprovinces, that are distinct in their ecological makeup.
Applying SAGE to model data, the tool noted 115 distinct ecological provinces, which can then be boiled down into 12 overarching regions.
One region appears in the center of nutrient-poor ocean gyres, whereas other regions show productive ecosystems along the coast and equator.
"You have regions that are kind of like the regions you'd see on land," Sonnewald said. One area in the heart of a desert-like region of the ocean is characterized by very small cells. There's just not a lot of plankton biomass. The region that includes Peru's fertile coast, however, has a huge amount of stuff.
If scientists want more distinctions between communities, they can adjust the tool to see the full 115 regions. But having only 12 regions can be powerful too, said Sonnewald, because "it demonstrates the similarities between the different [ocean] basins." The tool was published in a recent paper in the journal Science Advances.
Oceanographer Francois Ribalet at the University of Washington, who was not involved in the study, hopes to apply the tool to field data when he takes measurements on research cruises. He said identifying unique provinces gives scientists a hint of how ecosystems could react to changing ocean conditions.
"If we identify that an organism is very sensitive to temperature, so then we can start to actually make some predictions," Ribalet said. Using the tool will help him tease out an ecosystem's key drivers and how it may react to future ocean warming.
Jenessa Duncombe. Text 2020. AGU.
This story has been republished from Eos under the Creative Commons 3.0 license. Read the original story.
Read more:
Machine learning finds use in creating sharper maps of 'ecosystem' lines in the ocean - Firstpost
Fake data is great data when it comes to machine learning – Stacey on IoT
It's been a few years since I last wrote about the idea of using synthetic data to train machine learning models. After having three recent discussions on the topic, I figured it's time to revisit the technology, especially as it seems to be gaining ground in mainstream adoption.
Back in 2018, at Microsoft Build, I saw a demonstration of a drone flying over a pipeline as it inspected it for leaks or other damage. Notably, the drone's visual inspection model was trained using both actual data and simulated data. Use of the synthetic data helped teach the machine learning model about outliers and novel conditions it wasn't able to encounter using traditional training. It also allowed Microsoft researchers to train the model more quickly and without the need to embark on as many expensive, data-gathering flights as they would have had to otherwise.
The technology is finally starting to gain ground. In April, a startup called Anyverse raised €3 million ($3.37 million) for its synthetic sensor data, while another startup, AI.Reverie, published a paper about how it used simulated data to train a model to identify planes on airport runways.
After writing that initial story, I heard very little about synthetic data until my conversation earlier this month with Dan Jeavons, chief data scientist at Shell. When I asked him about Shell's machine learning projects, using simulated data was one that he was incredibly excited about because it helps build models that can detect problems that occur only rarely.
"I think it's a really interesting way to get info on the edge cases that we're trying to solve," he said. "Even though we have a lot of data, the big problem that we have is that, actually, we often only had a very few examples of what we're looking for."
In the oil business, corrosion in factories and pipelines is a big challenge, and one that can lead to catastrophic failures. That's why companies are careful about not letting anything corrode to the point where it poses a risk. But that also means the machine learning models can't be trained on real-world examples of corrosion. So Shell uses synthetic data to help.
As Jeavons explained, Shell is also using synthetic data to try and solve the problem of people smoking at gas stations. Shell doesn't have a lot of examples because the cameras don't always catch the smokers; in other cases, they're too far away or aren't facing the camera. So the company is working hard on combining simulated synthetic data with real data to build computer vision models.
"Almost always the things we're interested in are the edge cases rather than the general norm," said Jeavons. "And it's quite easy to detect the edge [deviating] from the standard pattern, but it's quite hard to detect the specific thing that you want."
In the meantime, startup AI.Reverie endeavored to learn more about the accuracy of synthetic data. The paper it published, "RarePlanes: Synthetic Data Takes Flight," lays out how its researchers combined satellite imagery of planes parked at airports that was annotated and validated by humans with synthetic data created by machine.
When using just synthetic data, the model was only about 55% accurate, whereas when it only used real-world data that number jumped to 73%. But by making real-world data 10% of the training sample and using synthetic data for the rest, the model's accuracy came in at 69%.
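To make the 10% figure concrete, here is a minimal sketch of how a training sample with a fixed share of real examples could be assembled. It is not the procedure from the RarePlanes paper; the function name, the placeholder image records and the pool sizes are hypothetical.

```python
import random


def mix_training_set(real, synthetic, real_fraction, total, seed=0):
    """Sample a training set in which `real_fraction` of the items are real."""
    n_real = round(total * real_fraction)
    n_synthetic = total - n_real
    if n_real > len(real) or n_synthetic > len(synthetic):
        raise ValueError("not enough examples to build the requested mix")
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    mixed = rng.sample(real, n_real) + rng.sample(synthetic, n_synthetic)
    rng.shuffle(mixed)  # avoid feeding all real examples first
    return mixed


# Hypothetical pools: scarce human-annotated images, plentiful generated ones.
real_images = [("real", i) for i in range(50)]
synthetic_images = [("synthetic", i) for i in range(500)]

train = mix_training_set(real_images, synthetic_images, real_fraction=0.10, total=200)
n_real = sum(1 for source, _ in train if source == "real")
print(n_real, len(train) - n_real)
```

Keeping the real fraction as an explicit parameter makes it easy to rerun the same experiment at different ratios and see how much accuracy each extra batch of expensive real data buys.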
Paul Walborsky, the CEO of AI.Reverie (and the former CEO at GigaOM; in other words, my former boss), says that synthetic data is going to be a big business. Companies using such data need to account for ways that their fake data can skew the model, but if they can do that, they can achieve robust models faster and at a lower cost than if they relied on real-world data.
So even though IoT sensors are throwing off petabytes of data, it would be impossible to annotate all of it and use it for training models. And as Jeavons points out, those petabytes of data may not have the situation you actually want the computer to look for. In other words, expect the wave of synthetic and simulated data to keep on coming.
"We're convinced that, actually, this is going to be the future in terms of making things work well," said Jeavons, "both in the cloud and at the edge for some of these complex use cases."
See the rest here:
Fake data is great data when it comes to machine learning - Stacey on IoT
Decisions and NLP Logix Announce Partnership to bring the Power of Machine Learning to Business Process Management – Benzinga
JACKSONVILLE, Fla., July 1, 2020 /PRNewswire-PRWeb/ -- The Decisions no-code workflow and rules platform was designed to enable businesses to automate and optimize their digital processes in a way that non-programming staff can manage. NLP Logix was founded with the mission to bring the power of machine learning to industry by becoming its customers' outsourced data science team. With the combination of the Decisions platform and NLP Logix's machine learning tools and team, the ability to quickly and affordably integrate artificial intelligence into workflows is now here.
"We were brought in to automate a number of financial processes for a very large non-profit," said Matt Berseth, Lead Data Scientist for NLP Logix. "They had already deployed the Decisions platform to automate their workflows and we were able to easily embed a number of machine learning models, one of which reviewed and approved financial applications, and the efficiency gains have been amazing."
A great example of the power of the new partnership between Decisions and NLP Logix is the loan origination process, which is almost entirely driven by rules and workflow, and in which any human interactions are repetitive decisions based on experience. The Decisions platform automates the gathering and review of the loan application, while the machine learning models, which have been trained using years of application approval decisions by trained humans, make a final approval recommendation.
"After working with NLP Logix, we quickly realized that the addition of a trained data science team which can train and deploy machine learning models very quickly, accurately and at scale, was a very valuable addition to the Decisions platform," said Athena Harrell. "And to have a partner like NLP Logix that has the talent and team that can also implement the Decisions solution is icing on the cake."
About Decisions
Decisions is a leading provider of Business Process Management/Workflow/Rule Technology and is headquartered in Chesapeake, VA. Decisions technology is deployed as the basis of multiple commercial applications in medical, finance, logistics and operations software. In addition, Decisions technology is used directly by companies on almost all continents, ranging from small/mid-size companies to over a dozen Fortune 500. For more information go to http://www.decisions.com
About NLP Logix
NLP Logix is an artificial intelligence/machine learning product and automation solutions provider, which has evolved over the last nine years into one of the fastest growing teams of machine learning practitioners. Our team of experts has extensive experience leveraging natural language processing, computer vision, neural networks, and predictive modeling to help companies revolutionize how they operate. NLP Logix delivers automation and machine learning solutions to customers across a wide swath of industries, including financial services, energy, healthcare, government, human resources, and many more.
SOURCE NLP Logix
Original post:
Decisions and NLP Logix Announce Partnership to bring the Power of Machine Learning to Business Process Management - Benzinga
Machine Learning in Medical Imaging Market Strategies and Insight Driven Transformation 2020-2030 – Cole of Duty
Prophecy Market Insights recently presented its Machine Learning in Medical Imaging market report, which provides insights into the various segments and sub-segments of the market. The market study throws light on the various factors that are projected to impact the overall dynamics of the Machine Learning in Medical Imaging market over the forecast period (2019-2029).
The Machine Learning in Medical Imaging research study contains 100+ market data Tables, Pie Charts, Graphs & Figures spread through Pages and easy to understand detailed analysis. This Machine Learning in Medical Imaging market research report estimates the size of the market using information on key retailer revenues, upstream and downstream development of the industry, industry progress, key highlights related to companies, and market segments and applications. This study also analyzes the market status, market share, growth rate, sales volume, future trends, market drivers, market restraints, revenue generation, opportunities and challenges, risks and entry barriers, sales channels, and distributors.
Get Sample Copy of This Report @ https://www.prophecymarketinsights.com/market_insight/Insight/request-sample/3599
The Global Machine Learning in Medical Imaging market 2020-2030 report is an in-depth study compiled to supply the latest insights into the market. The report contains predictions of Machine Learning in Medical Imaging market size, revenue, CAGR, consumption, profit margin, price, and other substantial factors. Along with a detailed manufacturing and production analysis, the report also includes the consumption statistics of the industry to inform about Machine Learning in Medical Imaging market share. The value and consumption analysis in the report helps businesses determine which strategy will be most helpful in expanding their share of the Machine Learning in Medical Imaging market. The report also provides information about Machine Learning in Medical Imaging market traders and distributors, their contact information, import/export and trade analysis, and price analysis and comparison. In addition, the key players in the Machine Learning in Medical Imaging industry are profiled in the research report.
The Machine Learning in Medical Imaging market is covered with segment analysis and PEST analysis. PEST analysis describes the political, economic, social and technological aspects of the macro-environment from the perspective of the Machine Learning in Medical Imaging market, which helps market players understand the factors that can affect a business's accomplishments and performance in a particular market segment.
Segmentation Overview:
By Type (Supervised Learning, Unsupervised Learning, Semi-Supervised Learning, and Reinforcement Learning)
By Application (Breast, Lung, Neurology, Cardiovascular, Liver, and Others)
By Region (North America, Europe, Asia Pacific, Latin America, and Middle East & Africa)
The competitive landscape of the Machine Learning in Medical Imaging market is presented with detailed insights into the company profiles, including developments such as mergers & acquisitions, collaborations, partnerships, new production, expansions, and SWOT analysis.
Machine Learning in Medical Imaging Market Key Players:
The research scope provides comprehensive market size, and other in-depth market information details such as market growth-supporting factors, restraining factors, trends, opportunities, market risk factors, market competition, product and services, product advancements and up-gradations, regulations overview, strategy analysis, and recent developments for the mentioned forecast period.
The report analyzes various geographical regions like North America, Europe, Asia-Pacific, Latin America, Middle East, and Africa and incorporates clear market definitions, arrangements, producing forms, cost structures, improvement approaches, and plans. Besides, the report provides a key examination of regional market players operating in the specific market and analysis and outcomes related to the target market for more than 20 countries.
Request Discount @ https://www.prophecymarketinsights.com/market_insight/Insight/request-discount/3599
The report responds to significant inquiries about the Global Machine Learning in Medical Imaging Market. Some important questions answered in the Machine Learning in Medical Imaging Market Report are:
Contact Us:
Mr. Alex (Sales Manager)
Prophecy Market Insights
Phone: +1 860 531 2701