Category Archives: Alphazero
When Mohammad Haft-Javaherian, a student at the Massachusetts Institute of Technology, attended MIT's Green AI Hackathon in January, it was out of curiosity to learn about the capabilities of a new supercomputer cluster being showcased at the event. But what he had planned as a one-hour exploration of a cool new server drew him into a three-day competition to create energy-efficient artificial-intelligence programs.
The experience resulted in a revelation for Haft-Javaherian, who researches the use of AI in healthcare: "The clusters I use every day to build models with the goal of improving healthcare have carbon footprints," Haft-Javaherian says.
The processors used in the development of artificial intelligence algorithms consume a lot of electricity. And in the past few years, as AI usage has grown, its energy consumption and carbon emissions have become an environmental concern.
"I changed my plan and stayed for the whole hackathon to work on my project with a different objective: to improve my models in terms of energy consumption and efficiency," says Haft-Javaherian, who walked away with a $1,000 prize from the hackathon. He now considers carbon emissions an important factor when developing new AI systems.
But unlike Haft-Javaherian, many developers and researchers overlook or remain oblivious to the environmental costs of their AI projects. In the age of cloud-computing services, developers can rent online servers with dozens of CPUs and strong graphics processors (GPUs) in a matter of minutes and quickly develop powerful artificial intelligence models. And as their computational needs rise, they can add more processors and GPUs with a few clicks (as long as they can foot the bill), not knowing that with every added processor, they're contributing to the pollution of our green planet.
The recent surge in AI's power consumption is largely caused by the rise in popularity of deep learning, a branch of artificial-intelligence algorithms that depends on processing vast amounts of data. "Modern machine-learning algorithms use deep neural networks, which are very large mathematical models with hundreds of millions, or even billions, of parameters," says Kate Saenko, associate professor at the Department of Computer Science at Boston University and director of the Computer Vision and Learning Group.
These many parameters enable neural networks to solve complicated problems such as classifying images, recognizing faces and voices, and generating coherent and convincing text. But before they can perform these tasks with optimal accuracy, neural networks need to undergo training, which involves tuning their parameters by performing complicated calculations on huge numbers of examples.
"To make matters worse, the network does not learn immediately after seeing the training examples once; it must be shown examples many times before its parameters become good enough to achieve optimal accuracy," Saenko says.
All this computation requires a lot of electricity. According to a study by researchers at the University of Massachusetts, Amherst, the electricity consumed during the training of a transformer, a type of deep-learning algorithm, can emit more than 626,000 pounds of carbon dioxide, nearly five times the lifetime emissions of an average American car. Another study found that AlphaZero, Google's Go- and chess-playing AI system, generated 192,000 pounds of CO2 during training.
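As a rough back-of-the-envelope check on the figures quoted above, the following sketch works out the car baseline they imply and expresses the AlphaZero figure in the same unit. The variable names are illustrative, and "nearly five times" is treated as exactly five, so the reference value used in the study itself may differ slightly.

```python
# Figures quoted in the article, in pounds of CO2.
TRANSFORMER_LBS = 626_000
ALPHAZERO_LBS = 192_000

# "Nearly five times" is treated as exactly five here (an assumption).
CAR_MULTIPLE = 5

# Car baseline implied by the comparison in the text.
implied_car_lbs = TRANSFORMER_LBS / CAR_MULTIPLE

# AlphaZero's training emissions expressed in the same "cars" unit.
alphazero_in_cars = ALPHAZERO_LBS / implied_car_lbs

print(f"Implied car baseline: {implied_car_lbs:,.0f} lb")
print(f"AlphaZero training: about {alphazero_in_cars:.1f} cars")
```

On these assumptions the two headline figures differ by a factor of a little over three, which is consistent with the article's point that costs vary widely between systems.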
To be fair, not all AI systems are this costly. Transformers are used in a fraction of deep-learning models, mostly in advanced natural-language processing systems such as OpenAI's GPT-2 and Google's BERT, which was recently integrated into the company's search engine. And few AI labs have the financial resources to develop and train expensive AI models such as AlphaZero.
Also, after a deep-learning model is trained, using it requires much less power. "For a trained network to make predictions, it needs to look at the input data only once, and it is only one example rather than a whole large database. So inference is much cheaper to do computationally," Saenko says.
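The gap between training and inference described above can be illustrated with a simple count of passes over the data. This is a minimal sketch, not a measurement: the function names, the dataset size, the number of epochs and the assumption that a backward pass costs about twice a forward pass are all hypothetical choices made for illustration.

```python
def training_passes(num_examples: int, epochs: int) -> int:
    """Every training example is shown to the network once per epoch."""
    return num_examples * epochs


def training_to_inference_ratio(
    num_examples: int, epochs: int, backward_factor: float = 2.0
) -> float:
    """Cost of a full training run relative to one inference call.

    One inference call is a single forward pass (cost 1.0). Each training
    pass is a forward pass plus a backward pass, where the backward pass is
    assumed to cost `backward_factor` forward passes.
    """
    per_pass_cost = 1.0 + backward_factor
    return training_passes(num_examples, epochs) * per_pass_cost


# Hypothetical workload: one million examples shown ten times each.
passes = training_passes(1_000_000, 10)
ratio = training_to_inference_ratio(1_000_000, 10)
print(f"Training passes: {passes:,}")
print(f"Training cost in single-inference units: {ratio:,.0f}")
```

The absolute numbers are arbitrary; what the sketch shows is that training cost scales with dataset size times number of epochs, while a single prediction does not.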
Many deep-learning models can be deployed on smaller devices after being trained on large servers. Many applications of edge AI now run on mobile devices, drones, laptops, and IoT (Internet of Things) devices. But even small deep-learning models consume a lot of energy compared with other software. And given the expansion of deep-learning applications, the cumulative costs of the compute resources being allocated to training neural networks are developing into a problem.
"We're only starting to appreciate how energy-intensive current AI techniques are. If you consider how rapidly AI is growing, you can see that we're heading in an unsustainable direction," says John Cohn, a research scientist with IBM who co-led the Green AI hackathon at MIT.
According to one estimate, by 2030, more than 6 percent of the world's energy may be consumed by data centers. "I don't think it will come to that, though I do think exercises like our hackathon show how creative developers can be when given feedback about the choices they're making. Their solutions will be far more efficient," Cohn says.
"CPUs, GPUs, and cloud servers were not designed for AI work. They have been repurposed for it and, as a result, are less efficient than processors that were designed specifically for AI work," says Andrew Feldman, CEO and cofounder of Cerebras Systems. He compares the usage of heavy-duty generic processors for AI to "using an 18-wheel truck to take the kids to soccer practice."
Cerebras is one of a handful of companies that are creating specialized hardware for AI algorithms. Last year, it came out of stealth with the release of the CS-1, a huge processor with 1.2 trillion transistors, 18 gigabytes of on-chip memory, and 400,000 processing cores. Effectively, this allows the CS-1, the largest computer chip ever made, to house an entire deep learning model without the need to communicate with other components.
"When building a chip, it is important to note that communication on-chip is fast and low-power, while communication across chips is slow and very power-hungry," Feldman says. By building a very large chip, Cerebras keeps the computation and the communication on a single chip, dramatically reducing overall power consumed. GPUs, on the other hand, cluster many chips together through complex switches. This requires frequent communication off-chip, through switches and back to other chips. This process is slow, inefficient, and very power-hungry.
The CS-1 uses a tenth of the power and space of a rack of GPUs that would provide the equivalent computation power.
Satori, the new supercomputer that IBM built for MIT and showcased at the Green AI hackathon, has also been designed to perform energy-efficient AI training. Satori was recently rated as one of the world's greenest supercomputers. "Satori is equipped to give energy/carbon feedback to users, which makes it an excellent laboratory for improving the carbon footprint of both AI hardware and software," says IBM's Cohn.
Cohn also believes that the energy sources used to power AI hardware are just as important. Satori is now housed at the Massachusetts Green High Performance Computing Center (MGHPCC), which is powered almost exclusively by renewable energy.
"We recently calculated the cost of a high workload on Satori at MGHPCC compared to the average supercomputer at a data center using the average mix of energy sources. The results are astounding: One year of running the load on Satori would release as much carbon into the air as is stored in about five fully-grown maple trees. Running the same load on the 'average' machine would release the carbon equivalent of about 280 maple trees," Cohn says.
Yannis Paschalidis, the Director of Boston University's Center for Information and Systems Engineering, proposes a better integration of data centers and energy grids, which he describes as "demand-response" models. "The idea is to coordinate with the grid to reduce or increase consumption on-demand, depending on electricity supply and demand. This helps utilities better manage the grid and integrate more renewables into the production mix," Paschalidis says.
For instance, when renewable energy supplies such as solar and wind power are scarce, data centers can be instructed to reduce consumption by slowing down computation jobs and putting low-priority AI tasks on pause. And when there's an abundance of renewable energy, the data centers can increase consumption by speeding up computations.
The smart integration of power grids and AI data centers, Paschalidis says, will help manage the intermittency of renewable energy sources while also reducing the need to have too much stand-by capacity in dormant electricity plants.
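The demand-response behaviour described above could be sketched as a small scheduling policy. This is only an illustration under stated assumptions: the `Job` class, the `plan_speeds` function, the thresholds and the speed factors are all hypothetical and do not describe any real data center or grid interface.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    low_priority: bool


def plan_speeds(
    jobs: list[Job],
    renewable_fraction: float,
    scarce_below: float = 0.3,
    abundant_above: float = 0.7,
) -> dict[str, float]:
    """Map each job name to a speed factor; 0.0 means the job is paused.

    `renewable_fraction` is the assumed share of renewables in the current
    grid supply (0.0 to 1.0), as signalled by the utility.
    """
    speeds = {}
    for job in jobs:
        if renewable_fraction < scarce_below:
            # Renewables are scarce: pause low-priority work, slow the rest.
            speeds[job.name] = 0.0 if job.low_priority else 0.5
        elif renewable_fraction > abundant_above:
            # Renewables are abundant: speed up all computation.
            speeds[job.name] = 1.5
        else:
            # Normal conditions: run at the regular rate.
            speeds[job.name] = 1.0
    return speeds


jobs = [Job("production-training", False), Job("hyperparameter-sweep", True)]
for fraction in (0.1, 0.5, 0.9):
    print(fraction, plan_speeds(jobs, fraction))
```

A real deployment would need a signal from the grid operator and a way to checkpoint paused jobs; the sketch only captures the decision rule.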
Scientists and researchers are looking for ways to create AI systems that don't need huge amounts of data during training. After all, the human brain, which AI scientists try to replicate, uses a fraction of the data and power that current AI systems use.
During this year's AAAI Conference, Yann LeCun, a deep-learning pioneer, discussed self-supervised learning, deep-learning systems that can learn with much less data. Others, including cognitive scientist Gary Marcus, believe that the way forward is hybrid artificial intelligence, a combination of neural networks and the more classic rule-based approach to AI. Hybrid AI systems have proven to be more data- and energy-efficient than pure neural-network-based systems.
"It's clear that the human brain doesn't require large amounts of labeled data. We can generalize from relatively few examples and figure out the world using common sense. Thus, 'semi-supervised' or 'unsupervised' learning requires far less data and computation, which leads to both faster computation and less energy use," Cohn says.
Read the original post:
AI Could Save the World, If It Doesn't Ruin the Environment First - PCMag Portugal
According to his website, Gary Marcus, a notable figure in the AI community, has published extensively in fields ranging from human and animal behaviour to neuroscience, genetics, linguistics, evolutionary psychology and artificial intelligence.
Spanning everything from AI to evolutionary psychology, this is considered a remarkable range of topics to cover for a man as young as Marcus.
On his website, Marcus calls himself a scientist, a best-selling author, and an entrepreneur. He is also a founding member of Geometric Intelligence, a machine learning company acquired by Uber in 2016. However, Marcus is widely known for his debates with machine learning researchers like Yann LeCun and Yoshua Bengio.
Marcus leaves no stone unturned to flaunt his ferocity in calling out the celebrities of the AI community.
However, call it an act of benevolence or a search for neutral ground, he also downplays his criticisms through his "we agree to disagree" tweets.
Last week, Marcus did what he does best when he tried to reboot and shake up AI once again as he debated Turing Award winner Yoshua Bengio.
In this debate, hosted by Montreal.AI, Marcus criticized Bengio for not citing him in Bengio's work and complained that this would devalue Marcus's contribution.
Marcus, in his arguments, tried to explain how pervasive hybrids are in the field of AI by citing the example of Google, which, according to him, is actually a hybrid of a knowledge graph, a classic form of symbolic knowledge, and deep learning in the form of a system called BERT.
Hybrids are all around us
Marcus also insists on the requirement of thinking in terms of nature and nurture, rather than nature versus nurture when it comes to the understanding of the human brain.
He also laments how much of machine learning, historically, has avoided nativism.
Marcus also pointed out that Bengio had misrepresented him as saying that deep learning doesn't work.
"I don't care what words you want to use, I'm just trying to build something that works."
Marcus argued for symbols, pointing out that DeepMind's chess-winning AlphaZero program is a hybrid involving symbols because it uses Monte Carlo Tree Search. "You have to keep track of your trees, and trees are symbols."
Bengio dismissed the notion that a tree search is a symbol system. Rather, "It's a matter of words," Bengio said. "If you want to call those symbols, but symbols to me are different, they have to do with the discreteness of concepts."
Bengio also shared his views on how deep learning might be extended to deal with computational capabilities rather than taking the old techniques and combining them with neural nets.
Bengio admitted that he completely agrees that a lot of current systems that use machine learning have also used a bunch of handcrafted rules and code designed by people.
While Marcus pressed Bengio for hybrid systems as a solution, Bengio patiently reminded him that hybrid systems have already been built, which led Marcus to admit that he had misunderstood Bengio!
This goof-up was followed by Bengio's takedown of symbolic AI and his explanation of why there is a need to move on from good old-fashioned AI (GOFAI). In a nod to Daniel Kahneman, Bengio used the two-system theory to explain how richer representation is required in the presence of an abundance of knowledge.
To this, Marcus quickly responded by saying, "Now I would like to emphasise on our agreements." This was followed by one more hour of conversation between the speakers and a Q&A session with the audience.
The debate ended with the moderator, Vincent Boucher, thanking the speakers for a "hugely impactful debate", which, for a large part, was hugely pointless.
Gary Marcus has been playing or trying to play the role of an antagonist that would shake up the hype around AI for a long time now.
In his interview with Synced, when asked about his relationship with Yann LeCun, Marcus said that the two are friends as well as enemies. While calling out LeCun for making ad hominem attacks on him, he also approves of many of his frenemy's perspectives.
Deliberate or not, Marcus's online polemics to bring down the hype around AI usually end up hyping his own antagonism. What the AI community needs is the likes of Nassim Taleb, who is known for his relentless, eloquent and technically intact arguments. Taleb has been a practitioner and an insider who doesn't give a damn about being an outsider.
Marcus, on the other hand, calls himself a cognitive scientist; however, his contribution to the field of AI cannot be called groundbreaking. There is no doubt that Marcus should be appreciated for positioning himself in the line of fire in the celebrated era of AI. However, one can't help but wonder two things when one listens to Marcus's antics/arguments:
There is definitely a thing or two Marcus can learn from Taleb's approach to debunking pseudo babble. A very popular example is Taleb's takedown of Steven Pinker, who also happens to be a dear friend and mentor to Marcus.
That said, the machine learning research community did witness something similar in the form of David Duvenaud and Smerity, when they took a detour from the usual "we shock you with jargon" research and added a lot of credibility to the research community. While Duvenaud trashed his own award-winning work, Stephen "Smerity" Merity used his paper to investigate the trouble with naming inventions and unwanted sophistication.
There is no doubt that there is a lot of exaggeration about what AI can do. Not to forget the subtle land grab amongst researchers for papers, which can mislead the community into mistaking vanity for an indication of innovation. As we venture into the next decade, AI can use a healthy dose of scepticism and debunking from the Schmidhubers and Smeritys of its research world to be more reliable.
For the past five years, I've been working with Sophia, the world's most expressive humanoid robot (and the first robot citizen), and the other amazing creations of social robotics pioneer Dr. David Hanson. During this time, I've been asked a few questions over and over again.
Some of these are not so intriguing, like "Can I take Sophia out on a date?"
But there are some questions that hold more weight and lead to even deeper moral and philosophical discussions, questions such as "Why do we really want robots that look and act like humans, anyway?"
This is the question I aim to address.
The easiest answer here is purely practical. Companies are going to make, sell and lease humanoid robots because a significant segment of the population wants humanoid robots. If some people aren't comfortable with humanoid robots, they don't need to buy or rent them.
I stepped back from my role as chief scientist of Hanson Robotics earlier this year so as to devote more attention to my role as CEO of SingularityNET, but I am still working on the application of artificial general intelligence (AGI) and decentralized AI to social robotics.
At the web summit this November, I demonstrated the OpenCog neural-symbolic AGI engine and the SingularityNET blockchain-based decentralized AI platform controlling David Hanson's Philip K. Dick robot (generously loaned to us by Dan Popa's lab at the University of Louisville). The ability of modern AI tools to generate philosophical ruminations in the manner of Philip K. Dick (PKD) is fascinating, beguiling and a bit disorienting. You can watch a video of the presentation here to see what these robots are like.
While the presentation garnered great enthusiasm, I also got a few people coming to me with the "Why humanoid robots?" question but with a negative slant. Comments in the vein of "Isn't it deceptive to make robots that appear like humans even though they don't have humanlike intelligence or consciousness?"
To be clear, I'm not in favor of being deceptive. I'm a fan of open-source software and hardware, and my strong preference is to be transparent with product and service users about what's happening behind the magic digital curtain. However, the bottom line is that "it's complicated."
There is no broadly agreed theory of consciousness, of the nature of human or animal consciousness, or of the criteria a machine would need to fulfill to be considered as conscious as a human (or more so).
And intelligence is richly multidimensional. Technologies like AlphaZero and Alexa, or the AI genomic analysis software used by biologists, are far smarter than humans in some ways, though sorely lacking in certain aspects such as self-understanding and generalization capability. As research pushes gradually toward AGI, there may not be a single well-defined threshold at which "humanlike intelligence" is achieved.
A dialogue system like the one we're using in the PKD robot incorporates multiple components: some human-written dialogue script fragments, a neural network for generating text in the vein of PKD's philosophical writings, and some simple reasoning. One thread in our ongoing research focuses on more richly integrating machine reasoning with neural language generation. As this research advances, the process of the PKD robot coming to "really understand what it's talking about" is probably going to happen gradually rather than suddenly.
It's true that giving a robot a humanoid form, and especially an expressive and reactive humanlike face, will tend to bias people to interact with the robot as if it really had human emotion, understanding and culture. In some cases this could be damaging, and it's important to take care to convey as accurately as feasible to the people involved what kind of system they're interacting with.
However, I think the connection that people tend to feel with humanoid robots is more of a feature than a bug. I wouldn't want to see human-robot relationships replace human-human relationships. But that's not the choice we're facing.
McDonald's, for instance, has bought an AI company and is replacing humans with touchpad-based kiosks and automated voice systems, for cost reasons. If people are going to do business with machines, let them be designed to create and maintain meaningful social and emotional connections with people.
As well as making our daily lives richer than they would be in a world dominated by faceless machines, humanoid robots have the potential to pave the way toward a future in which humans and robots and other AIs interact in mutually compassionate and synergetic ways.
As today's narrow AI segues into tomorrow's AGI, how will emerging AGI minds come to absorb human values and culture?
Hard-coded rules regarding moral values can play, at best, a very limited role, e.g., in situations like a military robot deciding who to kill, or a loan-officer AI deciding who to loan funds to. The vast majority of real-life ethical decisions are fuzzy, uncertain and contextual in nature, the kind of thing that needs to be learned by generalization from experience and by social participation.
The best way for an AI to absorb human culture is the way kids do, through rich participation in human life. Of course, the architecture of the AI's mind also matters. It has to be able to represent and manipulate thought and behavior patterns as nebulous as human values. But the best cognitive architecture won't help if the AI doesn't have the right experience base.
So my ultimate answer to "Why should we have humanoid robots?" is not just because people want them, or because they are better for human life and culture than faceless kiosks, but because they are the best way I can see to fill the AGI mind of the future with human values and culture.