AI Weekly: The promise and limitations of machine programming tools – VentureBeat

Machine programming, which automates the development and maintenance of software, is becoming supercharged by AI. During its Build developer conference in May, Microsoft detailed a new feature in Power Apps that taps OpenAI's GPT-3 language model to assist people in choosing formulas. Intel's ControlFlag can autonomously detect errors in code. And Facebook's TransCoder converts code from one programming language into another.

The applications of computer programming are vast in scope. And as computers become ubiquitous, the demand for quality code draws an ever-growing number of aspiring programmers to the profession. After years of study to become proficient at coding, experts learn to convert abstracts into concrete, executable programs. But they spend the majority of their work hours not programming. According to a study from the University of Cambridge, at least half of developers' efforts are spent debugging, which costs the software industry an estimated $312 billion per year.

AI-powered code suggestion and review tools promise to cut development costs substantially while allowing coders to focus on more creative, less repetitive tasks, according to Justin Gottschlich, principal AI scientist at Intel's machine programming division. Gottschlich is spearheading the work on ControlFlag, which fuses machine learning, formal methods, programming languages, and compilers to learn normal coding patterns and flag abnormalities in code that are likely to cause a bug.
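To make the idea of learning "normal" patterns from unlabeled code a little more concrete, here is a minimal sketch. It is not ControlFlag's actual algorithm. The function names, the tiny corpus of conditions, and the 20% rarity threshold are all assumptions chosen for illustration. The sketch abstracts each condition into a pattern, counts how often each pattern occurs, and flags conditions whose pattern is rare.

```python
import re
from collections import Counter


def abstract_condition(cond):
    """Map a condition to a structural pattern (hypothetical helper).

    Identifiers become VAR, integer literals become NUM, and whitespace
    is dropped, so "x == 0" and "count == 10" share one pattern.
    """
    def placeholder(match):
        return "NUM" if match.group(0)[0].isdigit() else "VAR"

    abstracted = re.sub(r"[A-Za-z_]\w*|\d+", placeholder, cond)
    return re.sub(r"\s+", "", abstracted)


def find_rare_patterns(conditions, max_share=0.2):
    """Return the conditions whose pattern makes up less than
    max_share of the corpus (an assumed, arbitrary threshold)."""
    patterns = [abstract_condition(c) for c in conditions]
    counts = Counter(patterns)
    total = len(patterns)
    return [c for c, p in zip(conditions, patterns)
            if counts[p] / total < max_share]


# Five comparisons and one assignment used as a condition.
corpus = ["x == 0", "y == 1", "count == 10", "i == 3", "n == 7", "flag = 0"]
print(find_rare_patterns(corpus))  # ['flag = 0']
```

No rule saying "assignment inside a condition is suspicious" was written by hand here. The condition is flagged only because it deviates from what the rest of the corpus does, which is the contrast with rules-based tools that the article draws.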

"Prior to machine learning- or AI-based programming systems, programmers had dozens, perhaps hundreds, of tools to help them be more productive, produce code with fewer logic errors, improve the software's performance, and so on. However, nearly all of these systems were 'rules-based,'" Gottschlich told VentureBeat via email. "While useful, rules-based systems are inherently limited in scope by the rules that have been programmed into them. As such, if new kinds of things occur, the systems would need to be updated by humans. Moreover, these rules-based systems have always been prone to human error in creating the rules encoded in them. For example, programmers may accidentally create a rule to find a certain type of bug, but incorrectly define the rules to find it. This hidden bug in the rules system could go undetected forever."

Gottschlich asserts that AI-based systems offer benefits over the rules-based systems of yesteryear because AI can learn on its own in an unsupervised fashion, enabling it to draw on massive code databases. With unsupervised learning, an algorithm is fed unknown data for which no previously defined labels exist. The system must teach itself to classify the data by processing it to learn from its structure.

For example, ControlFlag was trained on over 1 billion unlabeled lines of code to identify stylistic variations in programming language. As for TransCoder, it learned to translate between C++, Java, and Python by analyzing a GitHub corpus containing over 2.8 million repositories. Microsoft trained a bug-spotting program on a dataset of 13 million work items and bugs from 47,000 developers across Azure DevOps and GitHub repositories. And code review platform DeepCode's algorithms were taught using billions of lines of code captured from public open source projects.

There's a difference between AI-powered coding tools that can generate code from whole cloth and those that augment a programmer's workflow, of course. The latter is more common. Startups such as Tabnine (formerly Codota) are developing platforms that suggest and autocomplete scripts in Python, C, HTML, Java, Scala, Kotlin, and JavaScript. Ponicode taps AI to check the accuracy of code. Intel's Machine Inferred Code Similarity engine can determine when two pieces of code perform similar tasks, even when they use different structures and algorithms. And DeepCode offers a machine learning-powered system for whole-app code reviews, as does Amazon.

"Currently, we see a lot of AI-powered assistants, enabling software engineers to gain velocity and accuracy in their work. And the reason for the availability of more assistant tools than automation tools is that AI-powered automation has simply not yet reached the level of accuracy required," Ponicode CEO Patrick Joubert told VentureBeat. "Our industry is still young, and even though we can already see the potential of automation with AI-based code generators, we have to acknowledge that automatically generated code is still pretty unmaintainable and the overall quality is not meeting the right standards yet. While some engineers are working on the future of AI-powered automation, my team and I, along with many other stakeholders, are dedicated to creating tools that can be used today. Within a few years I believe there will be enough tools to cover all steps of the development lifecycle."

For Joubert, the most intriguing categories of machine programming tools today are autocompletion and code analysis. Autocompletion systems like Tabnine and Kite employ AI to analyze semantics and make sense of code, autocompleting functions with a sense of the code's semantic content and purpose. As for code analysis tools like Snyk and DeepCode, they're dedicated to finding vulnerabilities in the code and suggesting actions to resolve them, often with surprising speed and precision.

"When we see the numerous leaks and bugs from any software, including the ones built by leading multinationals, we can agree that [the software] industry has not yet matured. AI-powered coding tools are mostly meant to enhance the developer experience and empower them, thanks to greater velocity and greater efficiency," Joubert added. "Behind these developer-focused benefits, I believe we are on the way to allowing software engineers to build industrial-grade software, where quality, innovation, and speed are reached systematically. Autocompletion [in particular is] enabling software engineers to focus on the most complex part of their codebase and removing the burden of manually writing long strings of code."

Despite their potential, both AI-powered code generators and coding assistance tools have their limitations. For example, while GitHub alone has over 250 million code repositories, most of the data is unannotated. There are only a few examples that describe precisely what the code does, posing a particular challenge for any system that can't learn from unlabeled data.

In an effort to address this, IBM recently released CodeNet, a 14-million-sample labeled dataset with 500 million lines of code written in 55 programming languages. The company claims that the rich annotations added to CodeNet make it suitable for a diverse set of tasks as opposed to other datasets specialized for specific programming tasks. Already, researchers at IBM have conducted several experiments with CodeNet, including code classification, code similarity evaluation, and code completion.

"It is my speculation that in the next decade, code semantics understanding systems are likely to be one of the most important areas of machine programming in the coming decade," Joubert said. "It depends on the domain the machine programming system is being applied to. For small programs, such as unit tests or regression tests, full program synthesizers are a reality today. Yet, for larger programs, it's currently computationally intractable for machine programming systems to generate the potential thousands or millions of lines of code without the assistance of a programmer."

Boris Paskalev, the cofounder and CEO of DeepCode, calls creating a couple of lines of code with AI "more of a toy" than a productivity breakthrough. While techniques like natural language processing work well with text because there are fixed limits on the words and syntax that need to be understood, code isn't the same, he argues.

"Since there are no formal rules for software development, [programming] is an art that requires a complete understanding of code and a developer's intentions to produce something that works as expected without bugs," Paskalev told VentureBeat. "As far as we've come in using machine learning and neural networks for code, we're still only in the invention-of-the-wheel phase; machine learning is already proving to be very useful for code, but only after it goes through a semantic machine learning representation of the code: making sure all semantic facts, variables, transitions, and logical interrelations are clearly represented and considered by the learning model."

To Paskalev's point, recent studies suggest that AI has a ways to go before it can reliably generate code. In June, a team of researchers at the University of California at Berkeley, Cornell, the University of Chicago, and the University of Illinois at Urbana-Champaign released APPS, a benchmark for code generation from natural language specifications. The team tested several types of models on APPS, including OpenAI's GPT-2, GPT-3, and an open source version of GPT-3 called GPT-Neo. In experiments, they discovered that the models could learn to generate code that solves easier problems but not without syntax errors. Approximately 59% of GPT-3's solutions for introductory problems had errors, while the best-performing model, GPT-Neo, attained only 10.15% accuracy.

"When generating code from whole cloth, there are typically challenges around both specifying the intent and consuming the results," Tabnine CEO Dror Weiss told VentureBeat. "User intent can be specified in natural language, by providing examples, writing code in a higher-level language, or in other means. But in most cases, this intent does not provide a full specification of the desired behavior. Also, the generated code may be following a different route than what the developer had in mind. As such, it may be challenging for the developer to judge whether the code performs the desired operation exactly."

Facebook AI researchers Baptiste Rozière and Marie-Anne Lachaux, who worked on TransCoder, agree with Tabnine's assessment. "It is inherently difficult to generate correct code from unspecific natural language problem descriptions that could correspond to several different code snippets. An easier task would be to generate code from an input that is more specific and closer to the output code, like pseudo-code or code written in a different language," they told VentureBeat. "A huge obstacle to the adoption of methods generating large amounts of code without human supervision is that they would need to be extremely reliable to be used easily. Even a tool that could generate methods with 99% accuracy would fail to generate a working codebase of hundreds of functions. It could speed up the code generation process but would still require human testing and intervention."
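The arithmetic behind the 99% remark can be sketched in a few lines. This is a back-of-the-envelope illustration and not a calculation from the researchers. It assumes that each generated function is correct independently with the same probability and that the codebase works only if every function is correct. The function name is hypothetical.

```python
def codebase_success_probability(per_function_accuracy, num_functions):
    """Probability that every generated function is correct, assuming
    independent errors (a simplifying assumption for illustration)."""
    return per_function_accuracy ** num_functions


for n in (10, 100, 300):
    p = codebase_success_probability(0.99, n)
    print(f"{n} functions: {p:.3f}")
# 10 functions: 0.904
# 100 functions: 0.366
# 300 functions: 0.049
```

Under these assumptions, a generator that is right 99% of the time per function yields a fully working 300-function codebase only about one time in twenty, which is why human testing remains part of the loop.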

Rozière and Lachaux also point out that tasks around code generation are generally much harder than classification tasks because the model has a lot of freedom and can create many different outputs, making it hard to control the correctness of the generation. Moreover, compared with natural languages, programming languages are very sensitive to small errors. A one-character difference can change the semantics of the code and make the output faulty.
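A small, made-up example shows how little it takes. The two functions below are hypothetical and differ by a single character in the loop condition (`<` versus `<=`), yet they return different results for the same input.

```python
def count_below(limit):
    """Collect 0, 1, ..., limit - 1."""
    result = []
    i = 0
    while i < limit:   # strict comparison
        result.append(i)
        i += 1
    return result


def count_below_one_char_off(limit):
    """Same code with one extra character in the condition."""
    result = []
    i = 0
    while i <= limit:  # "<=" instead of "<" also includes limit itself
        result.append(i)
        i += 1
    return result


print(count_below(3))               # [0, 1, 2]
print(count_below_one_char_off(3))  # [0, 1, 2, 3]
```

Both versions are syntactically valid and run without error, so nothing short of a test or a careful review would reveal which one matches the developer's intent.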

"Current machine learning algorithms may not be able to generalize well enough to different problems to match human performance for coding interviews without larger datasets or much better unsupervised pre-training methods," Rozière and Lachaux said.

Paskalev thinks it'll be at least five to ten years until natural language processing enables developers to create "meaningful components" or even entire apps from a simple description. But Gottschlich is more optimistic. He notes that AI-powered coding tools aren't just valuable in writing code, but also when it comes to lower-hanging fruit like upgrading existing code. Migrating an existing codebase to a modern or more efficient language like Java or C++, for example, requires expertise in both the source and target languages, and it's often costly. The Commonwealth Bank of Australia spent around $750 million over the course of five years to convert its platform from COBOL to Java.

"Deep learning already enables us to cover the smaller tasks, the repetitive and redundant ones which clutter a software engineer's routine. Today, AI can free software engineers from tedious tasks slowing them down and decreasing their creativity," Gottschlich said. "The human mind remains far superior when it comes to creation, innovation, and designing the most complex parts of our software. Enabling them to increase velocity in these exciting, high added value parts of their work is, I believe, the most interesting way to leverage the power of machine learning today."

Joubert and Weiss say that the potential business value of machine programming also can't be ignored. An estimated 19% to 23% of software development projects fail, with that statistic holding steady for the past couple of decades. Standish Group found that "challenged" projects (i.e., those that fail to meet scope, time, or budget expectations) account for about 52% of software projects. Often, a lack of user involvement and clear requirements are to blame for missed benchmarks.

"We see a great number of new tools using AI to enhance legacy code and help existing assets reach industrial-grade standards. We can elevate developer legacy code management workflows and be part of reducing the hefty level of technical debt built up over the past 50 years in the software industry," Joubert said. "The days when developers had to write and read code line by line are gone. I'm excited to see how the other steps in the software development lifecycle are going to be transformed and how tools will reach the same level that Kite or Snyk have attained. Leveraging AI to build efficient, one-purpose, tested, secure, and documented code effortlessly is going to profoundly change the way software companies can create incremental value and innovation."

From Weiss' perspective, AI-powered coding tools can reduce "costly interactions" between developers, like Q&A sessions and repetitive code review feedback, while shortening the project onboarding process. "[These] tools make all developers in the enterprise better. They take the collective code intelligence of the organization and make it available, during development time, to all developers. This allows any developer on the team to punch above their weight," he said.

Thanks for reading,

Kyle Wiggers

AI Staff Writer
