Category Archives: Machine Learning

An Open Source Alternative to AWS SageMaker – Datanami

There's no shortage of resources and tools for developing machine learning algorithms. But when it comes to putting those algorithms into production for inference, outside of AWS's popular SageMaker, there's not a lot to choose from. Now a startup called Cortex Labs is looking to seize the opportunity with an open source tool designed to take the mystery and hassle out of productionalizing machine learning models.

Infrastructure is almost an afterthought in data science today, according to Cortex Labs co-founder and CEO Omer Spillinger. A ton of energy goes into choosing how to attack problems with data (why, use machine learning, of course!). But when it comes to actually deploying those machine learning models into the real world, it's relatively quiet.

"We realized there are two really different worlds to machine learning engineering," Spillinger says. "There's the theoretical data science side, where people talk about neural networks and hidden layers and backpropagation and PyTorch and TensorFlow. And then you have the actual systems side of things, which is Kubernetes and Docker and Nvidia and running on GPUs and dealing with S3 and different AWS services."

Both sides of the data science coin are important to building useful systems, Spillinger says, but it's the development side that gets most of the glory. AWS has captured a good chunk of the market with SageMaker, which the company launched in 2017 and which has been adopted by tens of thousands of customers. But aside from a handful of vendors working in the area, such as Algorithmia, the general model-building public has been forced to go it alone when it comes to inference.

A few years removed from UC Berkeley's computer science program and eager to move on from their tech jobs, Spillinger and his co-founders were itching to build something good. When it came to deciding what to do, they decided to stick with what they knew: working with systems.

"We thought that we could try and tackle everything," he says. "We realized we're probably never going to be that good at the data science side, but we know a good amount about the infrastructure side, so we can help people who actually know how to build models get them into their stack much faster."

Cortex Labs' software begins where the development cycle leaves off. Once a model has been created and trained on the latest data, Cortex Labs steps in to handle deployment into customers' AWS accounts using its Kubernetes engine (AWS is the only supported cloud at this time; on-prem inference clusters are not supported).

"Our starting point is a trained model," Spillinger says. "You point us at a model, and we basically convert it into a Web API. We handle all the productionalization challenges around it."
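
For illustration, here is a minimal sketch of what "converting a trained model into a web API" can look like in practice. It is not Cortex's actual interface; the FastAPI framework, the model.pkl file, and the /predict route are assumptions made for this example.

# Minimal sketch (not Cortex's actual API): exposing a previously trained
# scikit-learn model as a JSON prediction endpoint with FastAPI.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Assumes a model trained and saved elsewhere, e.g. pickle.dump(clf, open("model.pkl", "wb"))
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

class Payload(BaseModel):
    features: list[float]  # one flat feature vector per request

@app.post("/predict")
def predict(payload: Payload):
    # scikit-learn expects a 2-D array: one row per sample
    prediction = model.predict([payload.features])
    return {"prediction": prediction.tolist()}

# Run with, e.g.: uvicorn serve:app --port 8080  (assuming this file is saved as serve.py)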

That could mean shifting inference workloads from CPUs to GPUs in the AWS cloud, or vice versa. It could mean automatically spinning up more AWS servers under the hood when calls to the ML inference service spike, and spinning them back down when demand drops. On top of its built-in AWS cost-optimization capabilities, the Cortex Labs software logs and monitors all activity, which is a requirement in today's security- and regulatory-conscious climate.

"Cortex Labs is a tool for scaling real-time inference," Spillinger says. "It's all about scaling the infrastructure under the hood."

Cortex Labs delivers a command line interface (CLI) for managing deployments of machine learning models on AWS

"We don't help at all with the data science," Spillinger says. "We expect our audience to be a lot better than us at understanding the algorithms and understanding how to build interesting models and understanding how they affect and impact their products. But we don't expect them to understand Kubernetes or Docker or Nvidia drivers or any of that. That's what we view as our job."

The software works with a range of frameworks, including TensorFlow, PyTorch, scikit-learn, and XGBoost, and the company is open to supporting more. "There are going to be lots of frameworks that data scientists will use, so we try to support as many of them as we can," Spillinger says.

Cortex Labs' software knows how to take advantage of EC2 spot instances, and it integrates with AWS services like Elastic Kubernetes Service (EKS), Elastic Container Service (ECS), Lambda, and Fargate. The Kubernetes management alone may be worth the price of admission.

"You can think about it as a Kubernetes that's been massaged for the data science use case," Spillinger says. "There are some similarities to Kubernetes in the usage. But it's a much higher level of abstraction, because we're able to make a lot of assumptions about the use case."

There's a lack of publicly available tools for productionalizing machine learning models, but that's not to say they don't exist. The tech giants, in particular, have been building their own platforms for doing just this. Airbnb, for instance, has its Bighead offering, while Uber has talked about its system, called Michelangelo.

"But the rest of the industry doesn't have these machine learning infrastructure teams, so we decided we'd basically try to be that team for everybody else," Spillinger says.

Cortex Labs' software is distributed under an open source license and is available for download from its GitHub page. Making the software open source is critical, Spillinger says, because of the need for standards in this area. There are proprietary offerings in this arena, but in his view they don't have a chance of becoming the standard, whereas Cortex Labs does.

"We think that if it's not open source, it's going to be a lot more difficult for it to become a standard way of doing things," Spillinger says.

Cortex Labs isn't the only company talking about the need for standards in the machine learning lifecycle. Last month, Cloudera announced its intention to push for standards in machine learning operations, or MLOps. Anaconda, which develops a data science platform, is also backing the push for standards.

Eventually, the Oakland, California-based company plans to develop a managed service offering based on its software, Spillinger says. But for now, the company is eager to get the tool into the hands of as many data scientists and machine learning engineers as it can.

Related Items:

It's Time for MLOps Standards, Cloudera Says

Machine Learning Hits a Scaling Bump

Inference Emerges As Next AI Challenge

Read the original here:
An Open Source Alternative to AWS SageMaker - Datanami

Jenkins Creator Launches Startup To Speed Software Testing with Machine Learning — ADTmag – ADT Magazine

Jenkins Creator Launches Startup To Speed Software Testing with Machine Learning

Kohsuke Kawaguchi, creator of the open source Jenkins continuous integration/continuous delivery (CI/CD) server, and Harpreet Singh, former head of the Bitbucket group at Atlassian, have launched a startup that's using machine learning (ML) to speed up the software testing process.

Their new company, Launchable, which emerged from stealth mode on Thursday, is developing a software-as-a-service (SaaS) product with the ability to predict the likelihood of a failure for each test case, given a change in the source code. The service will use ML to extract insights from the massive and growing amount of data generated by the increasingly automated software development process to make its predictions.

"As a developer, I've seen this problem of slow feedback from tests first-hand," Kawaguchi told ADTmag. "And as the guy who drove automation in the industry with Jenkins, it seemed to me that we could make use of all that data the automation is generating by applying machine learning to the problem. I thought we should be able to train the machine on the model and apply quantifiable metrics, instead of relying on human experience and gut instinct. We believe we can predict, with meaningful accuracy, what tests are more likely to catch a regression, given what has changed, and that translates to faster feedback to developers."

The strategy here is to run only a meaningful subset of tests, in the order that minimizes the feedback delay.
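
As a rough illustration of the idea (not Launchable's actual model), the toy Python sketch below ranks tests by how often they have failed in past builds that touched the same files; the build history, file names, and test names are all made up.

# Toy sketch (not Launchable's model): rank tests by how often they failed
# historically when the files touched by the current change were also modified.
from collections import defaultdict

# Hypothetical history: (changed_files, failed_tests) for each past build.
history = [
    ({"parser.py"}, {"test_parser"}),
    ({"parser.py", "lexer.py"}, {"test_parser", "test_lexer"}),
    ({"api.py"}, set()),
]

def failure_scores(changed_files, history):
    """Score each test by co-occurrence of its failures with the changed files."""
    scores = defaultdict(float)
    for past_files, failed in history:
        overlap = len(changed_files & past_files)
        if overlap == 0:
            continue
        for test in failed:
            scores[test] += overlap / len(past_files)
    return scores

def prioritized_subset(changed_files, all_tests, history, budget=2):
    scores = failure_scores(changed_files, history)
    ranked = sorted(all_tests, key=lambda t: scores.get(t, 0.0), reverse=True)
    return ranked[:budget]  # run only the highest-risk tests first

print(prioritized_subset({"parser.py"}, ["test_parser", "test_lexer", "test_api"], history))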

Kawaguchi (known as "KK") and Singh worked together at CloudBees, the chief commercial supporter of Jenkins. Singh left that company in 2018 to serve as GM of Atlassian's Bitbucket cloud group. Kawaguchi became an elite developer and architect at CloudBees, and he's been a part of the community throughout the evolution of this technology. His departure from the company was amicable: Its CEO and co-founder Sacha Labourey is an investor in the startup, and Kawaguchi will continue to be involved with the Jenkins community, he said.

Software testing has been a passion of Kawaguchi's since his days at Sun Microsystems, where he created the Hudson CI server; Jenkins emerged as a fork of Hudson in 2011. Singh also worked at Sun and served as the first product manager for Hudson before working on Jenkins. They will serve as co-CEOs of the new company. They reportedly snagged $3.2 million in seed funding to get the ball rolling.

"KK and I got to talking about how the way we test now impacts developer productivity, and how machine learning could be used to address the problem," Singh said. "And then we started talking about doing a startup. We sat next to each other at CloudBees for eight years; it was an opportunity I couldn't pass up."

An ML engine is at the heart of the Launchable SaaS, but it's really all about the data, Singh said.

"We saw all these sales and marketing guys making data-driven decisions -- even more than the engineers, which was kind of embarrassing," Singh said. "So it became a mission for us to change that. It's kind of our north star."

The co-execs are currently talking with potential partners and recruiting engineers and data scientists. They offered no hard release date, but they said they expect a version of the Launchable SaaS to become generally available later this year.

Posted by John K. Waters on 01/23/2020 at 7:18 AM

Continued here:
Jenkins Creator Launches Startup To Speed Software Testing with Machine Learning -- ADTmag - ADT Magazine

How Machine Learning Will Lead to Better Maps – Popular Mechanics

Despite Qatar being one of the richest countries in the world, its digital maps are lagging behind. While the country is adding new roads and constantly improving old ones in preparation for the 2022 FIFA World Cup, Qatar isn't a high priority for the companies that actually build out maps, like Google.

"While visiting Qatar, weve had experiences where our Uber driver cant figure out how to get where hes going, because the map is so off," Sam Madden, a professor at MIT's Department of Electrical Engineering and Computer Science, said in a prepared statement. "If navigation apps dont have the right information, for things such as lane merging, this could be frustrating or worse."

Madden's solution? Quit waiting around for Google and feed machine learning models a whole buffet of satellite images. It's faster, cheaper, and way easier to obtain satellite images than it is for a tech company to drive around grabbing street-view photos. The only problem: Roads can be occluded by buildings, trees, or even street signs.

So Madden, along with a team composed of computer scientists from MIT and the Qatar Computing Research Institute, came up with RoadTagger, a new piece of software that can use neural networks to automatically predict what roads look like behind obstructions. It's able to guess how many lanes a given road has and whether it's a highway or residential road.

RoadTagger uses a combination of two kinds of neural nets: a convolutional neural network (CNN), which is mostly used in image processing, and a graph neural network (GNN), which helps to model relationships and is useful with social networks. This system is what the researchers call "end-to-end," meaning it's only fed raw data and there's no human intervention.

First, raw satellite images of the roads in question are fed into the convolutional neural network. The graph neural network then divides the roadway into 20-meter sections, or "tiles." The CNN pulls relevant road features out of each tile and shares that data with nearby tiles, so information about the road propagates along the graph. If one tile is covered up by an obstruction, RoadTagger can look to the surrounding tiles to predict what lies in the obscured one.
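
To make the CNN-plus-GNN idea concrete, here is a minimal PyTorch sketch (not the actual RoadTagger implementation): a small CNN encodes each tile image, and one round of message passing along the road graph lets occluded tiles borrow context from their neighbors. The layer sizes, tile dimensions, and adjacency matrix are arbitrary choices for the example.

# Minimal sketch (not RoadTagger's code): per-tile CNN features plus one round of
# neighbor averaging along the road graph, feeding a lane-count classifier.
import torch
import torch.nn as nn

class TileEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):               # x: (num_tiles, 3, H, W) satellite crops
        return self.conv(x).flatten(1)  # (num_tiles, 32) feature vectors

def message_pass(features, adjacency):
    # adjacency: (num_tiles, num_tiles) 0/1 matrix linking neighboring tiles on the road
    deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1)
    neighbor_mean = adjacency @ features / deg
    return torch.relu(features + neighbor_mean)  # each tile absorbs its neighbors' context

encoder = TileEncoder()
lane_head = nn.Linear(32, 6)             # classify 1-6 lanes per tile

tiles = torch.randn(5, 3, 64, 64)        # 5 consecutive tiles (random stand-in images)
adjacency = torch.tensor([[0., 1, 0, 0, 0],
                          [1, 0, 1, 0, 0],
                          [0, 1, 0, 1, 0],
                          [0, 0, 1, 0, 1],
                          [0, 0, 0, 1, 0]])

features = message_pass(encoder(tiles), adjacency)
lane_logits = lane_head(features)        # per-tile lane-count predictions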

A given tile may show only two lanes of road. A human can easily tell that a four-lane road shrouded by trees is merely blocked from view, but a computer normally couldn't make such an assumption. RoadTagger creates a more human-like intuition in a machine learning model, the research team says.

"Humans can use information from adjacent tiles to guess the number of lanes in the occluded tiles, but networks cant do that," Madden said. "Our approach tries to mimic the natural behavior of humans ... to make better predictions."

The results are impressive. In testing out RoadTagger on occluded roads in 20 U.S. cities, the model correctly counted the number of lanes 77 percent of the time and inferred the correct road types 93 percent of the time. In the future, the team hopes to include other new features, like the ability to identify parking spots and bike lanes.

Original post:
How Machine Learning Will Lead to Better Maps - Popular Mechanics

What is Machine Learning? | Types of Machine Learning …

Machine learning is commonly divided into three types:

Supervised Learning: "Train me!"

Unsupervised Learning: "I am self-sufficient in learning."

Reinforcement Learning: "My life, my rules!" (trial and error)

Supervised learning is the type where learning is guided by a teacher. A labeled dataset acts as the teacher, and its role is to train the model or the machine. Once the model is trained, it can start making predictions or decisions when new data is given to it.

Unsupervised learning, by contrast, has no teacher: the model learns through observation and finds structure in the data. Given a dataset, it automatically finds patterns and relationships by creating clusters. What it cannot do is add labels to the clusters; it cannot say this is a group of apples or mangoes, but it will separate the apples from the mangoes.

Suppose we present images of apples, bananas, and mangoes to the model. Based on patterns and relationships, it creates clusters and divides the dataset into them. If new data is later fed to the model, it assigns it to one of the existing clusters.
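
As a minimal illustration of that clustering idea, the scikit-learn sketch below groups made-up fruit feature vectors without ever seeing a label; the two features (redness, elongation) are invented for the example.

# Minimal scikit-learn sketch of the unsupervised idea above: the model groups
# fruit examples by feature similarity without ever being told "apple" or "mango".
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical per-image features: [average redness, average elongation]
features = np.array([
    [0.9, 0.2], [0.85, 0.25],   # apple-like
    [0.2, 0.9], [0.25, 0.85],   # banana-like
    [0.6, 0.5], [0.65, 0.45],   # mango-like
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
print(kmeans.labels_)             # cluster ids, e.g. [0 0 2 2 1 1] -- no names attached

# A new example is simply assigned to the nearest existing cluster.
print(kmeans.predict([[0.88, 0.22]]))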

Reinforcement learning is the ability of an agent to interact with its environment and find out what the best outcome is. It follows the concept of trial and error. The agent is rewarded or penalized with a point for a correct or wrong answer, and on the basis of the positive reward points gained, the model trains itself. Once trained, it is ready to predict when presented with new data.
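
A tiny trial-and-error sketch of that loop, with made-up actions and reward probabilities: the agent samples actions, collects rewards and penalties, and gradually prefers the action that pays off.

# Toy sketch of the reinforcement idea above: the agent tries actions, is rewarded
# or penalized, and drifts toward the action with the best average reward.
import random

true_reward = {"A": 0.2, "B": 0.8}            # hidden from the agent
totals = {a: 0.0 for a in true_reward}
counts = {a: 0 for a in true_reward}

def estimate(action):
    return totals[action] / counts[action] if counts[action] else 0.0

for step in range(1000):
    if random.random() < 0.1:                 # explore occasionally
        action = random.choice(list(true_reward))
    else:                                      # otherwise exploit the best estimate so far
        action = max(true_reward, key=estimate)
    reward = 1 if random.random() < true_reward[action] else -1   # hit or miss
    totals[action] += reward
    counts[action] += 1

print({a: round(estimate(a), 2) for a in true_reward})   # learned preference for "B"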

The rest is here:
What is Machine Learning? | Types of Machine Learning ...

Vectorspace AI Datasets are Now Available to Power Machine Learning (ML) and Artificial Intelligence (AI) Systems in Collaboration with Elastic -…

SAN FRANCISCO, Jan. 22, 2020 /PRNewswire/ -- Vectorspace AI (VXV) announces datasets that power data engineering, machine learning (ML) and artificial intelligence (AI) systems. Vectorspace AI alternative datasets are designed for predicting unique hidden relationships between objects including current and future price correlations between equities.

Vectorspace AI enables data, ML, and Natural Language Processing/Understanding (NLP/NLU) engineers and scientists to save time by testing a hypothesis or running experiments faster, achieving an improvement in bottom-line revenue and information discovery. Vectorspace AI datasets underpin much of ML and AI by improving returns from companies' R&D divisions, for example in discovering hidden relationships in drug development.
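
For readers unfamiliar with the general idea, a plain price-correlation matrix between equities can be computed from daily returns as in the pandas sketch below. This is purely illustrative, not Vectorspace AI's dataset format, and the tickers and prices are made up.

# Illustrative only: a basic correlation matrix between a few equities,
# computed from daily returns with pandas.
import pandas as pd

prices = pd.DataFrame({
    "AAPL": [150.0, 151.2, 149.8, 152.5, 153.0],
    "MSFT": [300.0, 302.1, 299.5, 305.0, 306.2],
    "XOM":  [60.0, 59.5, 60.3, 59.8, 59.2],
})

returns = prices.pct_change().dropna()
correlation_matrix = returns.corr()   # rows/columns: tickers, values: correlation of returns
print(correlation_matrix.round(2))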

"We are happy to be working with Vectorspace AI based on their most recent collaboration with us based on the article we published titled 'Generating and visualizing alpha with Vectorspace AI datasets and Canvas'. They represent the tip of the spear when it comes to advances in machine learning and artificial intelligence. Our customers and partners will certainly benefit from our continued joint development efforts in ML and AI," Shaun McGough, Product Engineering, Elastic.

Increasing the speed of discovery in every industry remains the aim of Vectorspace AI, along with a particular goal of engineering machines to trade information with one another, exchanging and transacting data in a way that minimizes a selected loss function. Data vendors such as Neudata.co, asset management companies, and hedge funds including WorldQuant use Vectorspace AI datasets to improve and protect 'alpha'.

Limited releases of Vectorspace AI datasets will be available in partnership with Amazon and Microsoft.

About Vectorspace AI (vectorspace.ai)

Vectorspace AI focuses on context-controlled NLP/NLU (Natural Language Processing/Understanding) and feature engineering for hidden relationship detection in data for the purpose of powering advanced approaches in Artificial Intelligence (AI) and Machine Learning (ML). Our platform powers research groups, data vendors, funds and institutions by generating on-demand NLP/NLU correlation matrix datasets. We are particularly interested in how we can get machines to trade information with one another or exchange and transact data in a way that minimizes a selected loss function. Our objective is to enable any group analyzing data to save time by testing a hypothesis or running experiments with higher throughput. This can increase the speed of innovation, novel scientific breakthroughs and discoveries. For a little more on who we are, see our latest reddit AMA on r/AskScience or join our 24 hour communication channel here. Vectorspace AI offers NLP/NLU services and alternative datasets consisting of correlation matrices, context-controlled sentiment scoring, and other automatically engineered feature attributes. These services are available utilizing the VXV token and VXV wallet-enabled API. Vectorspace AI is a spin-off from Lawrence Berkeley National Laboratory (LBNL) and the U.S. Dept. of Energy (DOE). The team holds patents in the area of hidden relationship discovery.

SOURCE Vectorspace AI

vectorspace.ai

Excerpt from:
Vectorspace AI Datasets are Now Available to Power Machine Learning (ML) and Artificial Intelligence (AI) Systems in Collaboration with Elastic -...

Red Hat Survey Shows Hybrid Cloud, AI and Machine Learning are the Focus of Enterprises – Computer Business Review

Open source enterprise software firm Red Hat, now a subsidiary of IBM, has conducted its annual survey of its customers, which highlights just how prevalent artificial intelligence and machine learning are becoming, while a talent and skills gap is still slowing down companies' ability to enact digital transformation plans.

Here are the top three takeaways from Red Hat's customer survey:

When asked to best describe their company's approach to cloud infrastructure, 31 percent stated that they run a hybrid cloud, while 21 percent said their firm has a private-cloud-first strategy in place.

The main reason cited for operating a hybrid cloud strategy was the security and cost benefits it provides. Some respondents noted that data integration was easier within a hybrid cloud.

Not everyone is fully sure about their approach yet: 17 percent admitted they are still in the process of establishing a cloud strategy, while 12 percent said they have no plans at all to focus on the cloud.

When it comes to digital transformation, there has been a notable rise in the number of firms that have undertaken transformation projects. In 2018, under a third of respondents (31 percent) said they were implementing new processes and technology; this year that number has nearly doubled, with 58 percent confirming they are introducing new technology.

Red Hat notes: "The drivers for these projects vary, and they also vary by the role of the respondent. System administrators care most about simplicity. IT architects focus on user experience and innovation. For managers, simplicity, user experience, and innovation are all tied for top priority. Developers prioritize innovation, which, overall, was cited as the most important reason to do digital transformation projects."

However, one in ten of those surveyed said they are facing a talent and skills gap that is slowing down the pace at which they can transform their business. The skills gap is being made worse by the number of new technologies being brought to market, such as artificial intelligence, machine learning, and containerisation, the use of which is expected to grow significantly in the next 24 months.

Artificial intelligence and machine learning models and processes are the clear emerging technology for firms in 2019, with 30 percent saying that they are planning to implement an AI or ML project within the next 12 months.

However, enterprises are worried about the compatibility and complexity of implementing AI or ML, with 29 percent stating they are worried about evolving software stacks.

One in five respondents (22 percent) are worried about getting access to the right data. "The data aspect in particular is something that we often see overlooked; obtaining relevant data and cleansing or transforming it in ways that make it a useful input for models can be one of the most challenging aspects of an AI project," Red Hat notes.

Red Hat's survey was compiled from 876 qualified responses gathered from Red Hat customers during August and September of 2019.

See the original post:
Red Hat Survey Shows Hybrid Cloud, AI and Machine Learning are the Focus of Enterprises - Computer Business Review

Learning that Targets Millennial and Generation Z – HR Exchange Network

Both Millennials and Generation Z can be categorized as digital natives, and the way in which they learn reflects that reality. A company's learning programs must reflect it as well.

Utilizing technologies such as microlearning, which is usually delivered via mobile technology, or machine learning can engage these individuals in the way they are accustomed to consuming information.

Microlearning is the delivery of learning in bite-sized pieces. It can take many different forms, such as an animation or a video. In either case, the information is delivered in a short amount of time, in as little as two to three minutes. In most cases, microlearning happens on a mobile device or tablet.

When should microlearning be used?

Think of it as a way to engage employees already on the job. It can be used to deliver quick bits of information that will be immediately relevant to their daily responsibilities. To be more pointed, microlearning is the bridge between formal training and application. At least one study shows that within six weeks of formal training, 85% of the content consumed will have been lost. Microlearning can deliver that information in the interim and can be used at the moment of application.

Microlearning shouldn't be used to replace formal training, but rather as a complement, which makes it perfect for developing and retaining high-quality talent.

Amnesty International piloted a microlearning strategy to launch its global campaign on Human Rights Defenders. The program used the learning approach to build a culture of human rights. It allowed Amnesty to discuss human rights issues in a quick, relevant, and creative manner. As such, learners were taught how to talk to people in everyday life about human rights and human rights defenders.

Dell has also used the strategy, implementing a digital campaign to encourage 14,000 sales representatives around the world to adopt elements of its Net Promoter Score methodology. Using mobile technology and personal computers, the company was able to achieve an 11% to 19% uptake among sales reps globally.

Machine learning can also be used as a strategy. Machine learning, a branch of artificial intelligence, provides systems with the ability to automatically learn and improve from experience without being explicitly programmed to do so.

For the purpose of explanation, the example of an AI-controlled multiple-choice test is relevant. If a person taking the test marks an incorrect answer, the AI then gives them a slightly easier question. If that question is answered incorrectly as well, the AI follows with a question lower still in difficulty. When the student begins to answer questions correctly, the difficulty of the questions increases; likewise, a person answering questions correctly from the start continues to get more difficult questions. This allows the AI to determine which topics the student understands least. In doing so, learning becomes personalized and specific to the student.
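
The adaptive logic described above can be sketched in a few lines. This is a toy illustration rather than any particular vendor's system; the difficulty scale and question flow are invented for the example.

# Toy sketch of the adaptive-test logic above: difficulty drops after a wrong
# answer and rises after a correct one, revealing which topics need work.
def next_difficulty(current, answered_correctly, min_level=1, max_level=5):
    step = 1 if answered_correctly else -1
    return max(min_level, min(max_level, current + step))

def run_quiz(answers_correct, start_level=3):
    """answers_correct: sequence of booleans, one per question answered."""
    level, trace = start_level, []
    for correct in answers_correct:
        trace.append(level)
        level = next_difficulty(level, correct)
    return trace

# A learner who misses the first two questions and then recovers:
print(run_quiz([False, False, True, True, True]))   # -> [3, 2, 1, 2, 3]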

But technology isn't the sole basis for disseminating information. Learning programs should also focus on creating more experience opportunities that offer development in either leadership or talent. Those programs should also prioritize retention. Programs such as mentoring and coaching are great examples.

Dipankar Bandyopadhyay led this charge when he was Vice President of HR Global R&D and Integration Planning Lead, Culture & Change Management, for the Monsanto Company. Monsanto achieved this through its Global Leadership Program for Experienced Hires.

"A couple of years ago, we realized we had a need to supplement our talent pipeline, essentially in our commercial organization and businesses globally, really building talent for key leadership roles within the business, which play really critical influence roles and help drive organizational strategy in these areas. With this intention, we created the Global Commercial Emerging Leaders Program," Bandyopadhyay said. "Essentially, what it does is focus on getting external talent into Monsanto through different industry segments. This allows us to broaden our talent pipeline, bringing in diverse points of view from very different industry segments (i.e., consumer goods, investment banking, the technology space, etc.). The program selects, onboards, assimilates and develops external talent to come into Monsanto."

Microlearning and machine learning are valuable in developing the workforce, but they are not the only tools available. Additionally, it's important to note that an organization can't simply provide development and walk away. There has to be data and analysis that tracks employee learning success. There also need to be strategies in place to make sure workers are retaining that knowledge. Otherwise, it is a waste of money.

NEXT: How L&D Can Help Itself

Visit link:
Learning that Targets Millennial and Generation Z - HR Exchange Network

Uncover the Possibilities of AI and Machine Learning With This Bundle – Interesting Engineering

If you want to be competitive in an increasingly data-driven world, you need to have at least a baseline understanding of AI and machine learning, the driving forces behind some of today's most important technologies.

The Essential AI & Machine Learning Certification Training Bundle will introduce you to a wide range of popular methods and tools used in these lucrative fields, and it's available for over 90 percent off at just $39.99.

This 4-course bundle is packed with over 280 lessons that will introduce you to NLP, computer vision, data visualization, and much more.

After an introduction to the basic terminology of the field, you'll explore the interconnected worlds of AI and machine learning through instruction that focuses on neural networks, deep architectures, large-scale data analysis, and much more.

The lessons are easy to follow regardless of your previous experience, and there are plenty of real-world examples to keep you on track.

Don't get left behind during the AI and machine learning revolution. The Essential AI & Machine Learning Certification Training Bundle will get you up to speed for just $39.99, over 90 percent off for a limited time.

Prices are subject to change.

This is a promotional article about one of Interesting Engineering's partners. By shopping with us, you not only get the materials you need, but you're also supporting our website.

Go here to see the original:
Uncover the Possibilities of AI and Machine Learning With This Bundle - Interesting Engineering

Five Reasons to Go to Machine Learning Week 2020 – Machine Learning Times – machine learning & data science news – The Predictive Analytics Times

When deciding on a machine learning conference, why go to Machine Learning Week 2020? This five-conference event, May 31 to June 4, 2020 at Caesars Palace, Las Vegas, delivers brand-name, cross-industry, vendor-neutral case studies purely on machine learning's commercial deployment, plus the hottest topics and techniques. In this video, Predictive Analytics World founder Eric Siegel spills the details and lists five reasons this is the most valuable machine learning event to attend this year.

Note: This article is based on the transcript of a special episode of The Dr. Data Show; click here to view.

In this article, I give five reasons that Machine Learning Week (May 31 to June 4, 2020 at Caesars Palace, Las Vegas) is the most valuable machine learning event to attend this year. MLW is the largest annual five-conference blow-out in the Predictive Analytics World conference series, of which I am the founder.

First, some background info. Your business needs machine learning to thrive and even just survive. You need it to compete, grow, improve, and optimize. Your team needs it, your boss demands it, and your career loves machine learning.

And so we bring you Predictive Analytics World, the leading cross-vendor conference series covering the commercial deployment of machine learning. By design, PAW is where to meet the who's who and keep up on the latest techniques.

This June in Vegas, Machine Learning Week brings together five different industry-focused events: PAW Business, PAW Financial, PAW Industry 4.0, PAW Healthcare, and Deep Learning World. This is five simultaneous two-day conferences all happening alongside one another at Caesars Palace in Vegas. Plus, a diverse range of full-day training workshops, which take place in the days just before and after.

Machine Learning Week delivers brand-name, cross-industry, vendor-neutral case studies purely on machine learning deployment, and the hottest topics and techniques.

This mega event covers all the bases for both senior-level expert practitioners as well as newcomers, project leaders, and executives. Depending on the topic, sessions and workshops are either demarcated as the Expert/practitioner level, or for All audiences. So, you can bring your team, your supervisor, and even the line-of-business managers you work with on model deployment. About 60-70% of attendees are on the hands-on practitioner side, but, as you know, successful machine learning deployment requires deep collaboration between both sides of the equation.

PAW and Deep Learning World also take place in Germany, and Data Driven Government takes place in Washington, DC, but this article is about Machine Learning Week, so see predictiveanalyticsworld.com for details about the others.

Here are the five reasons to go.

Five Reasons to Go to Machine Learning Week June 2020 in Vegas

1) Brand-name case studies

Number one, you'll access brand-name case studies. At PAW, you'll hear directly from the horse's mouth precisely how Fortune 500 analytics competitors and other companies of interest deploy machine learning and the kind of business results they achieve. More than most events, we pack the agenda as densely as possible with named case studies. Each day features a ton of leading in-house expert practitioners who get things done in the trenches at these enterprises and come to PAW to spill the inside scoop. In addition, a smaller portion of the program features rock-star consultants, who often present on work they've done for one of their notable clients.

2) Cross-industry coverage

Number two, you'll benefit from cross-industry coverage. As I mentioned, Machine Learning Week features these five industry-focused events. This amounts to a total of eight parallel tracks of sessions.

Bringing these all together at once fosters unique cross-industry sharing, and achieves a certain critical mass in expertise about methods that apply across industries. If your work spans industries, Machine Learning Week is one-stop shopping. Not to mention that convening the key industry figures across sectors greatly expands the networking potential.

The first of these, PAW Business, itself covers a great expanse of business application areas across many industries. Marketing and sales applications, of course. And many other applications in retail, telecommunications, e-commerce, non-profits, etc., etc.

The track topics of PAW Business 2020

PAW Business is a three-track event with track topics that include analytics operationalization and management (i.e., the business side), core machine learning methods and advanced algorithms (i.e., the technical side), innovative business applications covered as case studies, and a lot more.

PAW Financial covers machine learning applications in banking (including credit scoring), insurance applications, fraud detection, algorithmic trading, innovative approaches to risk management, and more.

PAW Industry 4.0 and PAW Healthcare are also entire universes unto themselves. You can check out the details about all four of these PAWs at predictiveanalyticsworld.com.

And the newer sister event Deep Learning World has its own website, deeplearningworld.com. Deep learning is the hottest advanced form of machine learning with astonishing, proven value for large-signal input problems, such as image classification for self-driving cars, medical image processing, and speech recognition. These are fairly distinct domains, so Deep Learning World does well to complement the four Predictive Analytics World events.

3) Pure-play machine learning content

Number three, you'll get pure-play machine learning content. PAW's agenda is not watered down with much coverage of other kinds of big data work. Instead, it's ruthlessly focused specifically on the commercial application of machine learning, also known as predictive analytics. The conference doesn't cover data science as a whole, which is a much broader and less well-defined area that, for example, can include standard business intelligence reporting and such. And we don't cover AI per se. Artificial intelligence is at best a synonym for machine learning that tends to over-hype, or at worst an outright lie that promises mythological capabilities.

4) Hot new machine learning practices

Number four, you'll learn the latest and greatest, the hottest new machine learning practices. Now, we launched PAW over a decade ago, so far delivering value to over 14,000 attendees across more than 60 events. To this day, PAW remains the leading commercial event because we keep up with the most valuable trends.

For example, Deep Learning World, which launched more recently in 2018, covers deep learning's commercial deployment across industry sectors. This relatively new form of neural networks has blossomed, both in buzz and in actual value. As I mentioned, it scales machine learning to process, for example, complex image data.

And what had been PAW Manufacturing for some years has now changed its name to PAW Industry 4.0. As such, the event now covers a broader area of inter-related work applying machine learning for smart manufacturing, the Internet of Things (IoT), predictive maintenance, logistics, fault prediction, and more.

In general, machine learning continues to widen its adoption and to be applied in new, innovative ways across sectors, in marketing, financial risk, fraud detection, workforce optimization, and healthcare. PAW keeps up with these trends and covers today's best practices and the latest advanced modeling methods.

5) Vendor-neutral content

And finally, number five, you'll access vendor-neutral content. PAW isn't run by an analytics vendor, and the speakers aren't trying to sell you on anything but good ideas. PAW speakers understand that vendor-neutral means those in attendance must be able to implement the practices covered and benefit from the insights delivered without buying any particular analytics product.

During the event, some vendors are permitted to deliver short presentations during a limited minority of demarcated sponsored sessions. These sessions are often substantive and of great interest as well. In fact, you can access all the sponsors and tap into their expertise at will in the exhibit hall, where they're set up for just that purpose.

By the way, if you're an analytics vendor yourself, check out PAW's various sponsorship opportunities. Our events bring together a great crowd of practitioners and decision makers.

Summary Five Reasons to Go

1) Brand-name case studies

2) Cross-industry coverage

3) Pure-play machine learning content

4) Hot new machine learning practices

5) Vendor-neutral content

and those are the reasons to come to Machine Learning Week: brand-name, cross-industry, vendor-neutral case studies purely on machine learning's commercial deployment, and the hottest topics and techniques.

Machine Learning Week not only delivers unique knowledge-gaining opportunities, it's also a universal meeting place: the industry's premier networking event. It brings together the who's who of machine learning and predictive analytics, the greatest diversity of expert speakers, perspectives, experience, viewpoints, and case studies.

This all turns the normal conference stuff into a much richer experience, including the keynotes, expert panels, and workshop days, as well as opportunities to network and talk shop during the lunches, coffee breaks, and reception.

I encourage you to check out the detailed agenda to see all the speakers, case studies, and advanced methods covered. Each of the five conferences has its own agenda webpage, or you can view the entire five-conference, eight-track mega-agenda at once. The latter view pertains if you're considering registering for the full Machine Learning Week pass, or if you'll be attending along with other team members in order to divide and conquer.

Visit our website to see all these details, register, and sign up for informative event updates by email.

Or, to learn more about the field in general, check out our Predictive Analytics Guide; our publication, The Machine Learning Times, which includes revealing PAW speaker interviews; and episodes of this show, The Dr. Data Show, which, by the way, is about the field of machine learning in general rather than about our PAW events.

This article is based on a transcript from The Dr. Data Show.

CLICK HERE TO VIEW THE FULL EPISODE

About the Dr. Data Show. This new web series breaks the mold for data science infotainment, captivating the planet with short webisodes that cover the very best of machine learning and predictive analytics. Click here to view more episodes and to sign up for future episodes of The Dr. Data Show.

About the Author

Eric Siegel, Ph.D., founder of the Predictive Analytics World and Deep Learning World conference series and executive editor of The Machine Learning Times, makes the how and why of predictive analytics (aka machine learning) understandable and captivating. He is the author of the award-winning book Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, the host of The Dr. Data Show web series, a former Columbia University professor, and a renowned speaker, educator, and leader in the field. Follow him at @predictanalytic.

The rest is here:
Five Reasons to Go to Machine Learning Week 2020 - Machine Learning Times - machine learning & data science news - The Predictive Analytics Times

Adventures With Artificial Intelligence and Machine Learning – Toolbox

Since October of last year I have had the opportunity to work with a startup working on automated machine learning, and I thought I would share some thoughts on the experience and on what one might want to consider at the start of a journey with a "data scientist in a box."

I'll start by saying that machine learning and artificial intelligence have almost forced themselves into my work several times in the past eighteen months, all in slightly different ways.

The first brush was back in June 2018, when one of the developers I was working with wanted to demonstrate to me a scoring model for loan applications, based on the analysis of some other transactional data that indicated loans that had previously been granted. The model had no explanation and no details other than the fact that it allowed you to stitch together a transactional dataset, which it assessed using a naive Bayes algorithm. We had a run at showing this to a wider audience, but the appetite for examination seemed low, and I suspect the real reason was that we didn't have real data and only had a conceptual problem to be solved.
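
For context, a loan-scoring model of that general shape can be put together in a few lines with scikit-learn's naive Bayes implementation. The sketch below is not the model described; the applicant features and labels are made up.

# Minimal sketch (made-up data, not the model described): scoring loan
# applications with Gaussian naive Bayes in scikit-learn.
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical features per past application: [income (k), debt ratio, years employed]
X = np.array([[55, 0.30, 4], [72, 0.25, 7], [38, 0.55, 1], [29, 0.60, 2], [90, 0.20, 10]])
y = np.array([1, 1, 0, 0, 1])   # 1 = loan was granted and repaid, 0 = not

model = GaussianNB().fit(X, y)

new_applicant = np.array([[48, 0.40, 3]])
print(model.predict_proba(new_applicant))   # probability of each class -- the "score"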

The second go was about six months later, when another colleague in the same team came up with a way to classify data sets; in fact, he developed a flexible training engine and data-tagging approach for determining whether certain columns in data sets were likely to be names, addresses, phone numbers, or email addresses. On face value you would think this to be something simple, but in reality it is of course only as good as the training data, and in this instance we could easily confuse the system and the data tagging with things like social security numbers that looked like phone numbers, postcodes that were simply numbers and ultimately could be anything, and so on. Names were only as good as the locality from which the names training data was sourced, and cities, towns, streets, and provinces all mostly worked OK but almost always needed region-specific training data. At any rate, this method of classifying contact data for the most part met the rough objectives of the task at hand, and so we soldiered on.
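
A rough sketch of that kind of column tagging, using simple patterns rather than a trained engine, is shown below. The patterns and thresholds are invented, and the example also hints at why locale matters: a nine-digit identifier can match a phone-number pattern just as easily as a phone number does.

# Rough sketch (not the engine described): guessing a column's semantic type by the
# share of values matching simple patterns. Note the ambiguity it leaves, e.g. a
# US SSN like "123-45-6789" also matches the phone pattern.
import re

PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s\-().]{7,15}$"),
    "postcode": re.compile(r"^\d{4,6}$"),
}

def classify_column(values, threshold=0.8):
    """Return the best-matching type if enough values fit its pattern, else 'unknown'."""
    best_type, best_share = "unknown", 0.0
    for col_type, pattern in PATTERNS.items():
        share = sum(bool(pattern.match(v.strip())) for v in values) / len(values)
        if share > best_share:
            best_type, best_share = col_type, share
    return best_type if best_share >= threshold else "unknown"

print(classify_column(["alice@example.com", "bob@example.org", "carol@example.net"]))  # email
print(classify_column(["94110", "10001", "60614"]))                                    # postcode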

A few months later I was called over to a developer's desk and asked for my opinion on a side project that one of the senior developers and architects had been working on. The objective was ambitious but impressive. The solution had been built in response to three problems in the field. The first problem to be solved was decoding why certain records were deemed to be related to one another when to the naked eye they seemed not to be, or vice versa. While this piece didn't involve any ML per se, the second part of the solution did, in that it self-configured thousands of combinations of alternative fuzzy matching criteria to determine an optimal set of duplicate-record matching rules.

This was understandably more impressive, and practically understandable, almost self-explanatory. It would serve as a great utility for a consultant, a data analyst, or a relative layperson to find explainability in how potential duplicate records were determined to have a relationship. This was specifically important because it could immediately provide value to field services personnel and clients. In addition, the developer had cunningly introduced a manual matching option that allowed a user to evaluate two records and decide, through visual assessment, whether they could potentially be considered related to one another.
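
As a simple illustration of fuzzy duplicate matching in general (not the tool described), the sketch below scores two contact records with per-field string similarity from Python's standard library; the field weights and the 0.8 threshold are arbitrary choices.

# Minimal sketch: scoring two contact records as potential duplicates with
# per-field fuzzy similarity from the standard library.
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def duplicate_score(rec1, rec2, weights={"name": 0.5, "email": 0.3, "city": 0.2}):
    """Weighted average of per-field similarities; 1.0 means identical."""
    return sum(w * similarity(rec1[f], rec2[f]) for f, w in weights.items())

a = {"name": "Jon Smith",  "email": "jon.smith@example.com", "city": "Oakland"}
b = {"name": "John Smith", "email": "jsmith@example.com",    "city": "Oakland"}

score = duplicate_score(a, b)
print(score, "-> possible duplicate" if score > 0.8 else "-> distinct")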

In some respects, what was produced was exactly the way I like to see products produced. The field describes the problem; the product management organization translates that into more elaborate stories and looks for parallels in other markets, across other business areas, and for ubiquity. Once those initial requirements have been gathered, it is then up to engineering and development to come up with a prototype that works toward solving the issue.

The more experienced the developer, of course, the more comprehensive the result may be, and even the more mature the initial iteration may be. Product is then in a position to pitch the concept back at the field, to clients, and to a selective audience to get their perspective on the solution and how well it matches the need for solving the previously articulated problem.

The challenge comes when you have a less tightly honed intent, a less specific message, and a more general problem to solve, and this brings us to the latest aspect of machine learning and artificial intelligence that I picked up.

One of the elements of dealing with data validation and data preparation is the last mile of action that you have in mind for that data. If your intent is as simple as "let's evaluate our data sources, clean them up, and make them suitable for online transaction processing," then that's a very specific mission. You need to know what you want to evaluate, what benchmark you wish to evaluate it against, and then have some sort of remediation plan so that the data supports the use case for which it's intended, say, supporting customer calls into a call centre. The only area where you might consider artificial intelligence and machine learning in this instance might be determining matches against the baseline, but then the question is whether you simply have a Boolean decision or whether, in fact, some sort of stack ranking is relevant at all. It could be argued either way, depending on the application.

When you're preparing data for something like a decision beyond data quality, though, the mission is perhaps a little different. Effectively, your goal may be to cut the cream of opportunities off the top of a pile of contacts, leads, opportunities, or accounts. As such, you want to use some combination of traits within the data set to determine the influencing factors that lead to a better (or worse) outcome. Here, linear regression analysis for scoring may be sufficient. The devil, of course, lies in the details, and unless you're intimately familiar with the data and the proposition you're trying to resolve, you have to do a lot of trial-and-error experimentation and validation. For statisticians and data scientists this is all very obvious, and you could say it is a natural part of the work they do. Effectively, the challenge here is feature selection: a way of reducing complexity in the model that you will ultimately apply to the scoring.

The journey I am on right now with a technology partner focuses on ways to optimise the features so that only the most necessary and useful features need to be considered. This, in turn, makes the model potentially simpler and faster to execute, particularly at scale. So while the regression analysis still needs to be done, determining what matters, what has significance, and what should be retained versus discarded in the model design is all factored into the model building in an automated way. This doesn't necessarily apply to all kinds of AI and ML work, but for this specific objective it is perhaps more than adequate, and it doesn't require a data scientist to start delivering a rapid yield.
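
One common way to automate part of that feature-selection step, offered here only as a generic illustration and not as the partner's approach, is to let an L1-penalized model zero out uninformative features before fitting the final scoring model, as in this scikit-learn sketch with synthetic data.

# Generic sketch: L1 (Lasso) feature selection followed by a plain regression
# on the surviving features; the data is synthetic and only two features matter.
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                                    # 10 candidate features
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)   # only features 0 and 3 matter

selector = SelectFromModel(Lasso(alpha=0.05)).fit(X, y)
kept = selector.get_support(indices=True)
print("features kept:", kept)              # expect roughly [0, 3]

final_model = LinearRegression().fit(X[:, kept], y)
print("coefficients:", final_model.coef_.round(2))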

Read the original:
Adventures With Artificial Intelligence and Machine Learning - Toolbox