Category Archives: Machine Learning
Psychologists use machine learning algorithm to pinpoint top predictors of cheating in a relationship – PsyPost
According to a study published in the Journal of Sex Research, relationship characteristics like relationship satisfaction, relationship length, and romantic love are among the top predictors of cheating within a relationship. The researchers used a machine learning algorithm to pinpoint the top predictors of infidelity among over 95 different variables.
While a host of studies have investigated predictors of infidelity, the research has largely revealed mixed and often contradictory findings. Study authors Laura M. Vowels and her colleagues aimed to address these inconsistencies by using machine learning models. This approach allowed them to compare the relative predictive power of various relationship factors within the same analyses.
"The research topic was actually suggested by my co-author, Dr. Kristen Mark, who was interested in understanding predictors of infidelity better. She has previously published several articles on infidelity and is interested in the topic," explained Vowels, a principal researcher for Blueheart.io and postdoctoral researcher at the University of Lausanne.
Vowels and her team pooled data from two different studies. The first data set came from a study of 891 adults, the majority of whom were married or cohabitating with a partner (63%). Around 54% of the sample identified as straight, 21% identified as bisexual, 11% identified as gay, and 7% identified as lesbian. A second data set was collected from both members of 202 mixed-sex couples who had been together for an average of 9 years, the majority of whom were straight (93%).
Data from the two studies included many of the same variables such as demographic measures like age, race, sexual orientation, and education, in addition to assessments of participants' sexual behavior, sexual satisfaction, relationship satisfaction, and attachment styles. Both studies also included a measure of in-person infidelity (having interacted sexually with someone other than one's current partner) and online infidelity (having interacted sexually with someone other than one's current partner on the internet).
Using machine learning techniques, the researchers analyzed the data sets together first for all respondents and then separately for men and women. They then identified the top ten predictors for in-person cheating and for online cheating. Across both samples and among both men and women, higher relationship satisfaction predicted a lower likelihood of in-person cheating. By contrast, higher desire for solo sexual activity, higher desire for sex with one's partner, and being in a longer relationship predicted a higher likelihood of in-person cheating. In the second data set only, greater sexual satisfaction and romantic love predicted a lower likelihood of in-person infidelity.
When it came to online cheating, greater sexual desire and being in a longer relationship predicted a higher likelihood of cheating. Never having had anal sex with one's current partner decreased the likelihood of cheating online, a finding the authors say likely reflects more conservative attitudes toward sexuality. In the second data set only, higher relationship and sexual satisfaction also predicted a lower likelihood of cheating.
"Overall, I would say that there isn't one specific thing that would predict infidelity. However, relationship-related variables were more predictive of infidelity compared to individual variables like personality. Therefore, preventing infidelity might be more successful by maintaining a good and healthy relationship rather than thinking about specific characteristics of the person," Vowels told PsyPost.
Consistent with previous studies, relationship characteristics like romantic love and sexual satisfaction surfaced as top predictors of infidelity across both samples. The researchers say this suggests that the strongest predictors for cheating are often found within the relationship, noting that "addressing relationship issues may buffer against the likelihood of one partner going out of the relationship to seek fulfillment."
"These results suggest that intervening in relationships when difficulties first arise may be the best way to prevent future infidelity. Furthermore, because sexual desire was one of the most robust predictors of infidelity, discussing sexual needs and desires and finding ways to meet those needs in relationships may also decrease the risk of infidelity," the authors report.
The researchers emphasize that their analysis involved predicting past experiences of infidelity from an array of present-day assessments. They say that this design may have affected their findings, since couples who had previously dealt with cheating within the relationship may have worked through it by the time they completed the survey.
"The study was exploratory in nature and didn't include all the potential predictors," Vowels explained. "It also predicted infidelity in the past rather than current or future infidelity, so there are certain elements like relationship satisfaction that might have changed since the infidelity occurred. I think in the future it would be useful to look into other variables and also look at recent infidelity because that would make the measure of infidelity more reliable."
The study, "Is Infidelity Predictable? Using Explainable Machine Learning to Identify the Most Important Predictors of Infidelity", was authored by Laura M. Vowels, Matthew J. Vowels, and Kristen P. Mark.
See the original post here:
Psychologists use machine learning algorithm to pinpoint top predictors of cheating in a relationship - PsyPost
MIT: Forcing ML Models to Avoid Shortcuts (and Use More Data) for Better Predictions – insideHPC
CAMBRIDGE, Mass. If your Uber driver takes a shortcut, you might get to your destination faster. But if a machine learning model takes a shortcut, it might fail in unexpected ways.
In machine learning, a shortcut solution occurs when the model relies on a simple characteristic of a dataset to make a decision, rather than learning the true essence of the data, which can lead to inaccurate predictions. For example, a model might learn to identify images of cows by focusing on the green grass that appears in the photos, rather than the more complex shapes and patterns of the cows.
A new study by researchers at MIT explores the problem of shortcuts in a popular machine-learning method and proposes a solution that can prevent shortcuts by forcing the model to use more data in its decision-making.
By removing the simpler characteristics the model is focusing on, the researchers force it to focus on more complex features of the data that it hadn't been considering. Then, by asking the model to solve the same task in two ways (once using those simpler features, and again using the complex features it has now learned to identify), they reduce the tendency for shortcut solutions and boost the performance of the model.
One potential application of this work is to enhance the effectiveness of machine learning models that are used to identify disease in medical images. Shortcut solutions in this context could lead to false diagnoses and have dangerous implications for patients.
"It is still difficult to tell why deep networks make the decisions that they do, and in particular, which parts of the data these networks choose to focus upon when making a decision. If we can understand how shortcuts work in further detail, we can go even farther to answer some of the fundamental but very practical questions that are really important to people who are trying to deploy these networks," says Joshua Robinson, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author of the paper.
Robinson wrote the paper with his advisors, senior author Suvrit Sra, the Esther and Harold E. Edgerton Career Development Associate Professor in the Department of Electrical Engineering and Computer Science (EECS) and a core member of the Institute for Data, Systems, and Society (IDSS) and the Laboratory for Information and Decision Systems; and Stefanie Jegelka, the X-Consortium Career Development Associate Professor in EECS and a member of CSAIL and IDSS; as well as University of Pittsburgh assistant professor Kayhan Batmanghelich and PhD students Li Sun and Ke Yu. The research will be presented at the Conference on Neural Information Processing Systems in December.
The long road to understanding shortcuts
The researchers focused their study on contrastive learning, which is a powerful form of self-supervised machine learning. In self-supervised machine learning, a model is trained using raw data that do not have label descriptions from humans. It can therefore be used successfully for a larger variety of data.
A self-supervised learning model learns useful representations of data, which are used as inputs for different tasks, like image classification. But if the model takes shortcuts and fails to capture important information, these tasks won't be able to use that information either.
For example, if a self-supervised learning model is trained to classify pneumonia in X-rays from a number of hospitals, but it learns to make predictions based on a tag that identifies the hospital the scan came from (because some hospitals have more pneumonia cases than others), the model won't perform well when it is given data from a new hospital.
For contrastive learning models, an encoder algorithm is trained to discriminate between pairs of similar inputs and pairs of dissimilar inputs. This process encodes rich and complex data, like images, in a way that the contrastive learning model can interpret.
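As a rough, hedged illustration of this setup (a generic InfoNCE-style contrastive loss, not code from the MIT paper), the sketch below assumes two embedded "views" of the same batch and treats each matching pair as similar and every other pairing as dissimilar:

```python
# Generic contrastive (InfoNCE-style) loss sketch, assuming PyTorch is available.
# anchor_emb and positive_emb are embeddings of two views of the same batch of inputs.
import torch
import torch.nn.functional as F

def info_nce_loss(anchor_emb, positive_emb, temperature=0.1):
    anchor = F.normalize(anchor_emb, dim=1)
    positive = F.normalize(positive_emb, dim=1)
    logits = anchor @ positive.T / temperature   # similarity of every anchor to every candidate
    labels = torch.arange(anchor.size(0))        # the true (similar) pair sits on the diagonal
    # Maximizing the diagonal similarities while suppressing the rest is what trains
    # the encoder to discriminate similar from dissimilar pairs.
    return F.cross_entropy(logits, labels)
```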
The researchers tested contrastive learning encoders with a series of images and found that, during this training procedure, they also fall prey to shortcut solutions. The encoders tend to focus on the simplest features of an image to decide which pairs of inputs are similar and which are dissimilar. Ideally, the encoder should focus on all the useful characteristics of the data when making a decision, Jegelka says.
So, the team made it harder to tell the difference between the similar and dissimilar pairs, and found that this changes which features the encoder will look at to make a decision.
"If you make the task of discriminating between similar and dissimilar items harder and harder, then your system is forced to learn more meaningful information in the data, because without learning that it cannot solve the task," she says.
But increasing this difficulty resulted in a tradeoff: the encoder got better at focusing on some features of the data but became worse at focusing on others. "It almost seemed to forget the simpler features," Robinson says.
To avoid this tradeoff, the researchers asked the encoder to discriminate between the pairs the same way it had originally, using the simpler features, and also after the researchers removed the information it had already learned. Solving the task both ways simultaneously caused the encoder to improve across all features.
Their method, called implicit feature modification, adaptively modifies samples to remove the simpler features the encoder is using to discriminate between the pairs. The technique does not rely on human input, which is important because real-world data sets can have hundreds of different features that could combine in complex ways, Sra explains.
From Cars to COPD
The researchers ran one test of this method using images of vehicles. They used implicit feature modification to adjust the color, orientation, and vehicle type to make it harder for the encoder to discriminate between similar and dissimilar pairs of images. The encoder improved its accuracy across all three features (texture, shape, and color) simultaneously.
To see if the method would stand up to more complex data, the researchers also tested it with samples from a medical image database of chronic obstructive pulmonary disease (COPD). Again, the method led to simultaneous improvements across all features they evaluated.
While this work takes some important steps forward in understanding the causes of shortcut solutions and working to solve them, the researchers say that continuing to refine these methods and applying them to other types of self-supervised learning will be key to future advancements.
"This ties into some of the biggest questions about deep learning systems, like 'Why do they fail?' and 'Can we know in advance the situations where your model will fail?' There is still a lot farther to go if you want to understand shortcut learning in its full generality," Robinson says.
This research is supported by the National Science Foundation, National Institutes of Health, and the Pennsylvania Department of Health's SAP SE Commonwealth Universal Research Enhancement (CURE) program.
Written by Adam Zewe, MIT News Office
Paper: Can contrastive learning avoid shortcut solutions?
Original post:
MIT: Forcing ML Models to Avoid Shortcuts (and Use More Data) for Better Predictions - insideHPC
Top Machine Learning Tools Used By Experts In 2021 – Analytics Insight
The amount of data generated on a day-to-day basis is humongous, so much so that the term "big data" was coined to describe it. Big data is usually raw and cannot be used directly to meet business objectives. Thus, transforming this data into a form that is easy to understand is important. This is exactly where machine learning comes into play. With machine learning in place, it is possible to understand customer demands, behavioral patterns and a lot more, thereby enabling a business to meet its objectives. For this very purpose, companies and experts rely on certain machine learning tools. Here is our pick of the top machine learning tools used by experts in 2021. Have a look!
Keras is a free and open-source Python library popularly used for machine learning. Designed by Google engineer François Chollet, Keras acts as an interface for the TensorFlow library. In addition to being user-friendly, this machine learning tool is quick, easy to use, and runs on both CPU and GPU. Keras is written in Python and functions as an API for neural networks.
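As a hypothetical illustration of that API (none of this code comes from the article), a minimal Keras model can be defined and compiled in a few lines:

```python
# Minimal Keras sketch; the layer sizes and training data are placeholders.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),  # 20 input features assumed
    tf.keras.layers.Dense(1, activation="sigmoid"),                   # binary output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=5)  # X_train / y_train would be your own data
```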
Yet another widely used machine learning tool across the globe is KNIME. It is easy to learn, free and ideal for data reporting, analytics, and integration platforms. One of the many remarkable features of this machine learning tool is that it can integrate codes of programming languages like Java, JavaScript, R, Python, C, and C++.
WEKA, designed at the University of Waikato in New Zealand, is a tried-and-tested solution for open-source machine learning. This machine learning tool is considered ideal for research, teaching ML models, and creating powerful applications. It is written in Java and supports platforms like Linux, Mac OS, and Windows. It is extensively used for teaching and research purposes, and also for industrial applications, for the sole reason that the algorithms employed are easy to understand.
Shogun, an open-source and free-to-use software library for machine learning, is quite easily accessible for businesses of all backgrounds and sizes. Shogun's core is written entirely in C++, but it can be accessed from other development languages, including R, Python, Ruby, Scala, and more. From regression and classification to hidden Markov models, this machine learning tool has you covered.
If you are a beginner, there cannot be a better machine learning tool to start with than RapidMiner, because it doesn't require any programming skills in the first place. This machine learning tool is considered ideal for text mining, data preparation, and predictive analytics. Designed for business leaders, data scientists, and forward-thinking organisations, RapidMiner has grabbed attention for all the right reasons.
TensorFlow is yet another machine learning tool that has gained immense popularity in no time. This open-source framework blends neural network models with other machine learning strategies. With its ability to run on both CPU and GPU, TensorFlow has managed to make it to the list of favourite machine learning tools.
Go here to read the rest:
Top Machine Learning Tools Used By Experts In 2021 - Analytics Insight
New exhibition to investigate the history of AI & machine learning in art. – FAD magazine
Gazelli Art House is to present Code of Arms, a group exhibition investigating the history of artificial intelligence (AI) and machine learning in art. The exploration of implementing code and AI in art in the 1970s and 80s comes at a time of rapid change in our understanding and appreciation of computer art.
The exhibition brings together pioneer artists in computer and generative art such as Georg Nees (b.1926), Frieder Nake (b.1938), Manfred Mohr (b.1938) and Vera Molnar (b.1924), and iconic artists employing AI in their practice such as Harold Cohen (b.1928), Lynn Hershman Leeson (b.1941), and Mario Klingemann (b.1970).
Code of Arms follows the evolution of the medium through the works of exhibited artists. Harold Cohen's painting Aspect (1964), a work shown at the Whitechapel Gallery in 1965 (Harold Cohen: Paintings 1960-1965), marks the artist's earliest point of enquiry, unfolding his scientific and artistic genius. Cohen, who was most famous for creating the computer program AARON, a predecessor of contemporary AI technologies, implemented the program in his work from the 1980s onwards, as seen in the drawings from this period in the exhibition.
Much of the early computer art explored geometric forms and structure, employing technology that was still in its infancy. Plotter drawings carried out by flatbed precision plotters and early printouts by Manfred Mohr, Georg Nees, Frieder Nake and Vera Molnar from the mid-1960s through the 1980s are an excellent representation of that period: the artists focused on visual forms rather than addressing the underlying meaning and ethics of using computers in their art. The artists saw machines as an external force that would allow them to explore the visual aspect of the works and experiment with form in an objective manner. Coming from different backgrounds, they worked alongside each other and made an immense contribution to early computer art.
Initially working as an abstract expressionist artist, Manfred Mohr (b. 1938) was inspired by Max Bense's information aesthetics, which defined his approach to the creative process from the 1960s onwards. Encouraged by the computer music composer Pierre Barbaud, whom he met in 1967, Mohr programmed his first computer drawings in 1969. On display are Mohr's plotter drawings of the 70s and 80s alongside a generative software piece from 2015.
Georg Nees (1926-2016) was a German academic who showed some of the world's first computer graphics created with a digital computer in 1965. In 1970, at the 35th Venice Biennale, he presented his sculptures and architectural designs, which he continued to work on through the 1980s, as seen in his drawings in this exhibition.
Frieder Nake (b. 1938) was actively pursuing computer art in the 1960s. With over 300 works produced and shown at various exhibitions (including Cybernetic Serendipity at the ICA, London in 1968), Nake brought his background in computer science and mathematics into his art practice. At The Great Temptation exhibition at the ZKM in 2005, Nees said: "There it was, the great temptation for me, for once not to represent something technical with this machine but rather something useless: geometrical patterns." Alongside his iconic 60s plotter drawings, Nake's recent body of work (Sets of Straight Lines, 2018) will be on view as a reminder of the artist's ability to transform and move away from geometric abstraction.
Vera Molnar (b. 1924) is a Hungarian-French artist who is considered a pioneer of computer and generative art. Molnar has created combinational images since 1959; her first non-representational images (abstract geometric and systematic paintings) were produced in 1946. Her plotter drawings from the 80s are displayed alongside her later canvas and works on paper (Double Signe Sans Signification, 2005; Deux Angles Droits, 2006), demonstrating the artist's consistency and dedication to the process over three decades.
The exhibition moves on to explore relationships between digital technologies and humans through works by Lynn Hershman Leeson (b.1941), an American artist and filmmaker working in moving image, collage, drawing and new media. The artist, who has recently been the focus of a solo exhibition at the New Museum, New York, will show a series of her rare drawings from the 60s and 70s, as well as her seminal work Agent Ruby, commissioned by SFMoMA (2001), an algorithmic work that interacts with online users through a website, shaping the AI's memory, knowledge and moods. Leeson is known for the first interactive piece using Videodisc (Lorna, 1983), and Deep Contact (1984), the first artwork to incorporate a touch screen.
Mario Klingemann brings neural networks, code and algorithms into the contemporary context. The artist investigates systems of today's society employing deep learning, generative and evolutionary art, glitch art, and data classification. The exhibition features his recent digital artwork Memories of Passersby I (Solitaire Version), 2018, and the prints Morgan le Fay and Cobalamime from 2017.
Mark Westall
Mark Westall is the Founder and Editor of FAD magazine, founder and co-publisher of Art of Conversation, and founder of the platform @worldoffad.
Read more here:
New exhibition to investigate the history of AI & machine learning in art. - FAD magazine
Machine Learning May Help Predict Success of Prescription Opioid Regulations | Columbia Public Health – Columbia University
Hundreds of laws aimed at reducing inappropriate prescription opioid dispensing have been implemented in the United States, yet due to the complexity of the overlapping programs, it has been difficult to evaluate their impact. A new study by researchers at Columbia University Mailman School of Public Health uses machine learning to evaluate the laws and their relation to prescription opioid dispensing patterns. They found that the presence of prescription drug monitoring programs (PDMPs) that give prescribers and dispensers access to patient data was linked to high-dispensing and high-dose dispensing counties. The findings are published in the journal Epidemiology.
"The aim of our study was to identify individual and prescription opioid-related law provision combinations that were most predictive of high opioid dispensing and high-dose opioid dispensing in U.S. counties," said Silvia Martins, MD, PhD, associate professor of epidemiology at Columbia Mailman School. "Our results showed that not all prescription drug monitoring program laws are created equal or influence effectiveness, and there is a critical need for better evidence on how law variations might affect opioid-related outcomes. We found that a machine learning approach could help to identify what determines a successful prescription opioid dispensing model."
Using 162 prescription opioid law provisions capturing prescription drug monitoring program access, reporting and administration features, pain management clinic provisions, and prescription opioid limits, the researchers examined various approaches and models to attempt to identify the laws most predictive of county-level high dispensing and high-dose dispensing in different overdose epidemic phases (the prescription opioid phase, 2006-2009; the heroin phase, 2010-2012; and the fentanyl phase, 2013-2016) to further explore pattern shifts over time.
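As a purely generic sketch of this kind of analysis (synthetic data and an arbitrary model choice, not the authors' actual pipeline), one could rank binary law-provision indicators by how much they contribute to predicting a high-dispensing label:

```python
# Hypothetical feature-ranking sketch with scikit-learn; all data here are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 162))             # 162 binary law-provision indicators (made up)
y = ((X[:, 0] == 1) & (X[:, 5] == 0)).astype(int)   # synthetic "high dispensing" label

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=5, random_state=0)
top10 = np.argsort(imp.importances_mean)[::-1][:10]
print("Most predictive provision indices (synthetic):", top10)
```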
PDMP patient data access provisions most consistently predicted high-dispensing and high-dose dispensing counties. Pain management clinic-related provisions did not generally predict dispensing measures in the prescription opioid phase but became more discriminant of high dispensing and high-dose dispensing counties over time, especially in the fentanyl period.
"While further research employing diverse study designs is needed to better understand how opioid laws generally, and specifically, can limit inappropriate opioid prescribing and dispensing to reduce opioid-related harms, we feel strongly that the results of our machine learning approach to identify salient law provisions and combinations associated with dispensing rates will be key for testing which law provisions and combinations of law provisions work best in future research," noted Martins.
The researchers observe that there are at least two major challenges to evaluating the impacts of prescription opioid laws on opioid dispensing. First, U.S. states often adopt widely different versions of the same general type of law, making it particularly important to examine the specific provisions that make these laws more or less effective in regards to opioid-related harms. Second, states tend to enact multiple law types simultaneously, making it difficult to isolate the effect of any one law or specific provisions.
"Machine learning methods are increasingly being applied to similar high-dimensional data problems, and may offer a complementary approach to other forms of policy analysis, including as a screening tool to identify policies and law provision interactions that require further attention," said Martins.
Co-authors are Emilie Bruzelius, Jeanette Stingone, Hanane Akbarnejad, Christine Mauro, Megan Marzial, Kara Rudolph, Katherine Keyes, and Deborah Hasin, Columbia University Mailman School; Katherine Wheeler-Martin and Magdalena Cerdá, NYU Grossman School of Medicine; Stephen Crystal and Hillary Samples, Rutgers University; and Corey Davis, Network for Public Health Law.
The study was supported by the National Institute on Drug Abuse, grants DA048572, DA047347, DA048860 and DA049950; the Agency for Healthcare Research and Quality, grant R18 HS023258; and the National Center for Advancing Translational Sciences and the New Jersey Health Foundation, grant TR003017.
Read more from the original source:
Machine Learning May Help Predict Success of Prescription Opioid Regulations | Columbia Public Health - Columbia University
An Illustrative Guide to Extrapolation in Machine Learning – Analytics India Magazine
Humans excel at extrapolating in a variety of situations. For example, we can use arithmetic to solve problems with infinitely big numbers. One can ask whether machine learning can do the same thing and generalize to cases that are arbitrarily far from the training data. Extrapolation is a statistical technique for estimating values that extend beyond a particular collection of data or observations. In this article we shall explain its primary aspects, contrast it with interpolation, and attempt to connect it to machine learning. The following are the main points to be discussed in this article.
Let's start the discussion by understanding extrapolation.
Extrapolation is a sort of estimation of a variable's value beyond the initial observation range based on its relationship with another variable. Extrapolation is similar to interpolation, which generates estimates between known observations, but it is more uncertain and has a higher risk of producing meaningless results.
Extrapolation can also refer to a method's expansion, presuming that similar methods are applicable. Extrapolation is a term that refers to the process of projecting, extending, or expanding known experience into an unknown or previously unexperienced area in order to arrive at a (typically speculative) understanding of the unknown.
Extrapolation is a method of estimating a value outside of a defined range. Let's take a general example. If you're a parent, you may recall your youngster calling any small four-legged critter a cat because their first classifier employed only a few traits. They were also able to correctly identify dogs after being trained to extrapolate and factor in additional attributes.
Even for humans, extrapolation is challenging. Our models are interpolation machines, no matter how clever they are. Even the most complicated neural networks may fail when asked to extrapolate beyond the limitations of their training data.
Machine learning has traditionally only been able to interpolate data, that is, generate predictions about a scenario that lies between two other, known situations. Because machine learning only learns to model existing data locally as accurately as possible, it cannot extrapolate; that is, it cannot make predictions about scenarios outside of the known conditions. It takes time and resources to collect enough data for good interpolation, and it necessitates data from extreme or dangerous settings.
In regression problems, we use data to generalize a function that maps a set of input variables X to a set of output variables y. A y value can be predicted for any combination of input variables using this function mapping. When the input variables lie within the range of the training data, this procedure is referred to as interpolation; if the point of estimation lies outside of this region, it is referred to as extrapolation.
The grey and white sections in the univariate example in the figure above show the extrapolation and interpolation regimes, respectively. The black lines reflect a selection of polynomial models that were used to make predictions within and outside of the training data set.
The models are well constrained in the interpolation regime, so their predictions collapse into a narrow band. Outside of that domain, however, the models diverge and produce radically disparate predictions. The cause of this large divergence is the absence of information during training that would confine the model to predictions with smaller variance (despite these being the same model with slightly different hyperparameters, trained on the same set of data).
This is the risk of extrapolation: model predictions outside of the training domain are particularly sensitive to training data and model parameters, resulting in unpredictable behaviour unless the model formulation contains implicit or explicit assumptions.
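A small toy sketch of this divergence (illustrative only, not the study's figure) is to fit polynomials of several degrees to the same noisy data and compare their predictions inside and outside the training range:

```python
# Illustrative only: polynomial fits agree inside the training range but diverge outside it.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 30)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, x_train.size)

x_eval = np.linspace(-0.5, 1.5, 9)  # includes points outside the [0, 1] training range
for degree in (3, 5, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    print(f"degree {degree}:", np.round(np.polyval(coeffs, x_eval), 2))
# Predictions near the interior points are similar across degrees;
# at x = -0.5 and x = 1.5 they differ wildly.
```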
In the absence of training data, most learners do not specify the behaviour of their final functions. They're usually built to be universal approximators, or as close as possible, with few modelling constraints. As a result, in places where there is little or no data, there is very little prior control over the function. Consequently, we can't regulate the behaviour of the prediction function at extrapolation points in most machine learning scenarios, and we can't tell when this is a problem.
Extrapolation should not be a problem in theory; in a static system with a representative training sample, the chances of having to anticipate a point of extrapolation are essentially zero. However, most training sets are not representative, and they are not derived from static systems, therefore extrapolation may be required.
Even empirical data derived from a product distribution can appear to have a strong correlation pattern when scaled up to high dimensions. Because functions are learned based on an empirical sample, they may be able to extrapolate effectively even in theoretically dense locations.
Extrapolation works with linear and other types of regression to some extent, but not with decision trees or random forests. In a decision tree or random forest, the input is sorted and filtered down into leaf nodes that have no direct relationship to other leaf nodes in the tree or forest. This means that, while the random forest is great at sorting data, the results can't be extrapolated because it doesn't know how to classify data outside of the domain.
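A minimal sketch of this behaviour, assuming scikit-learn is available (the estimators and data here are illustrative, not from the article):

```python
# A random forest flat-lines outside its training range; linear regression extends the trend.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

X_train = np.arange(0, 10, 0.5).reshape(-1, 1)
y_train = 3 * X_train.ravel() + 2                 # a simple linear trend

X_out = np.array([[12.0], [20.0]])                # points outside the training domain
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
linear = LinearRegression().fit(X_train, y_train)

print("forest:", forest.predict(X_out))   # stays near the largest training target (~30.5)
print("linear:", linear.predict(X_out))   # continues the trend (~38 and ~62)
```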
A good decision on which extrapolation method to use is based on a prior understanding of the process that produced the existing data points. Some experts have recommended using causal factors to assess extrapolation approaches. We will look at a few of them below. These are purely mathematical methods that you should relate to your problem appropriately.
Linear extrapolation is the process of drawing a tangent line at the end of the known data and extending it beyond that point. Only use linear extrapolation to extend the graph of an essentially linear function, or not too far beyond the existing data, to get good results. If the two data points closest to the point x* to be extrapolated are (x_{k-1}, y_{k-1}) and (x_k, y_k), linear extrapolation produces the estimate y(x*) = y_{k-1} + ((x* - x_{k-1}) / (x_k - x_{k-1})) * (y_k - y_{k-1}).
A polynomial curve can be built using all of the known data or just a small portion of it (two points for linear extrapolation, three points for quadratic extrapolation, etc.). The curve that results can then be extended beyond the available data. The most common way of polynomial extrapolation is to use Lagrange interpolation or Newton's method of finite differences to generate a Newton series that matches the data. The data can then be extrapolated using the obtained polynomial.
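As a hedged sketch (the library choice is an assumption; the article names the method, not an implementation), SciPy's Lagrange helper can fit a polynomial through a few known points and then be evaluated beyond them:

```python
# Polynomial extrapolation via Lagrange interpolation; the points are illustrative.
import numpy as np
from scipy.interpolate import lagrange

x_known = np.array([1.0, 2.0, 3.0, 4.0])
y_known = np.array([1.0, 4.0, 9.0, 16.0])   # samples of y = x**2

poly = lagrange(x_known, y_known)           # degree-3 polynomial through the known points
print(poly(6.0))                            # extrapolated estimate beyond x = 4 (36.0 here)
```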
Five points near the end of the given data can be used to fit a conic section. If the conic section is an ellipse or a circle, it will loop back and rejoin itself when extrapolated. An extrapolated parabola or hyperbola will not rejoin itself, but it may curve back toward the X-axis. A conic-section template (on paper) or a computer can be used for this form of extrapolation.
Further, we will see a simple Python implementation of linear extrapolation.
The technique is beneficial when the linear function is known. It's done by drawing a tangent and extending it beyond the limit. When the projected point is close to the rest of the points, linear extrapolation delivers a decent result.
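A simple sketch of such an implementation, assuming the two known points nearest the target are used (the function name and sample points are illustrative):

```python
# Linear extrapolation from the two known points nearest to x_star.
def linear_extrapolate(x_star, x1, y1, x2, y2):
    slope = (y2 - y1) / (x2 - x1)
    return y1 + slope * (x_star - x1)

# Points on y = 2x + 1; extrapolating to x = 10 recovers 21.
print(linear_extrapolate(10, 3, 7, 4, 9))
```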
Extrapolation is a helpful technique, but it must be used in conjunction with an appropriate model for describing the data, and it has limitations once you leave the training region. Its applications include predicting in situations where you have continuous data, such as time, speed, and so on. Prediction is notoriously imprecise, and the accuracy falls as the distance from the learned area grows. In situations where extrapolation is required, the model should be updated and retrained to lower the margin of error. Through this article, we have understood extrapolation and interpolation mathematically, related them to machine learning, and seen their effect on ML systems. We have also seen where extrapolation fails in particular, and the methods that can be used.
Visit link:
An Illustrative Guide to Extrapolation in Machine Learning - Analytics India Magazine
We mapped every large solar plant on the planet using satellites and machine learning – The Conversation UK
An astonishing 82% decrease in the cost of solar photovoltaic (PV) energy since 2010 has given the world a fighting chance to build a zero-emissions energy system which might be less costly than the fossil-fuelled system it replaces. The International Energy Agency projects that PV solar generating capacity must grow ten-fold by 2040 if we are to meet the dual tasks of alleviating global poverty and constraining warming to well below 2°C.
Critical challenges remain. Solar is intermittent, since sunshine varies during the day and across seasons, so energy must be stored for when the sun doesn't shine. Policy must also be designed to ensure solar energy reaches the furthest corners of the world and places where it is most needed. And there will be inevitable trade-offs between solar energy and other uses for the same land, including conservation and biodiversity, agriculture and food systems, and community and indigenous uses.
Colleagues and I have now published in the journal Nature the first global inventory of large solar energy generating facilities. Large in this case refers to facilities that generate at least 10 kilowatts when the sun is at its peak. (A typical small residential rooftop installation has a capacity of around 5 kilowatts).
We built a machine learning system to detect these facilities in satellite imagery and then deployed the system on over 550 terabytes of imagery using several human lifetimes of computing.
We searched almost half of Earth's land surface area, filtering out remote areas far from human populations. In total we detected 68,661 solar facilities. Using the area of these facilities, and controlling for the uncertainty in our machine learning system, we obtain a global estimate of 423 gigawatts of installed generating capacity at the end of 2018. This is very close to the International Renewable Energy Agency's (IRENA) estimate of 420 GW for the same period.
Our study shows solar PV generating capacity grew by a remarkable 81% between 2016 and 2018, the period for which we had timestamped imagery. Growth was led particularly by increases in India (184%), Turkey (143%), China (120%) and Japan (119%).
Facilities ranged in size from sprawling gigawatt-scale desert installations in Chile, South Africa, India and north-west China, through to commercial and industrial rooftop installations in California and Germany, rural patchwork installations in North Carolina and England, and urban patchwork installations in South Korea and Japan.
Country-level aggregates of our dataset are very close to IRENA's country-level statistics, which are collected from questionnaires, country officials, and industry associations. Compared to other facility-level datasets, we address some critical coverage gaps, particularly in developing countries, where the diffusion of solar PV is critical for expanding electricity access while reducing greenhouse gas emissions. In developed and developing countries alike, our data provides a common benchmark unbiased by reporting from companies or governments.
Geospatially-localised data is of critical importance to the energy transition. Grid operators and electricity market participants need to know precisely where solar facilities are in order to know accurately the amount of energy they are generating or will generate. Emerging in-situ or remote systems are able to use location data to predict increased or decreased generation caused by, for example, passing clouds or changes in the weather.
This increased predictability allows solar to reach higher proportions of the energy mix. As solar becomes more predictable, grid operators will need to keep fewer fossil fuel power plants in reserve, and fewer penalties for over- or under-generation will mean more marginal projects will be unlocked.
Using the back catalogue of satellite imagery, we were able to estimate installation dates for 30% of the facilities. Data like this allows us to study the precise conditions which are leading to the diffusion of solar energy, and will help governments better design subsidies to encourage faster growth.
Knowing where a facility is also allows us to study the unintended consequences of the growth of solar energy generation. In our study, we found that solar power plants are most often in agricultural areas, followed by grasslands and deserts.
This highlights the need to carefully consider the impact that a ten-fold expansion of solar PV generating capacity will have in the coming decades on food systems, biodiversity, and lands used by vulnerable populations. Policymakers can provide incentives to instead install solar generation on rooftops which cause less land-use competition, or other renewable energy options.
The GitHub code and data repositories from this research have been made available to facilitate more research of this type and to kickstart the creation of a complete, open, and current dataset of the planet's solar energy facilities.
View original post here:
We mapped every large solar plant on the planet using satellites and machine learning - The Conversation UK
Sama has been named the Best in Machine Learning Platforms at the 2021 AI TechAwards – HapaKenya – HapaKenya
Sama, a training data provider in Artificial Intelligence (AI) projects, has announced it has received the 2021 AI TechAward for Best in Machine Learning Platforms.
The annual awards, presented by AI DevWorld, celebrate companies leading in technical innovation, adoption, and reception in the AI and machine learning industry and among the developer community.
The winners for the 2021 awards were selected from more than 100 entries submitted per category globally, and announced during a virtual AI DevWorld conference. The conference targeted software engineers and data scientists interested in AI as well as AI dev professionals looking for a landscape view on the newest AI technologies.
For over a decade, organizations such as Google, Microsoft, NVIDIA and others have continued to rely on Sama to deliver secure, high-quality training data and model validation for their machine learning projects.
Sama hires over 90% of its workforce from low-income backgrounds and marginalized populations, including unemployed urban and rural youth that are traditionally excluded from the digital economy.
The company has remained committed to connecting people to dignified digital work and paying them living wages, helping to address some of the world's most pressing challenges. This includes reducing poverty, empowering women, and mitigating climate change. As a result, it has helped over 56,000 people lift themselves out of poverty, increased wages of workers up to 4 times, and provided over 11,000 hours of training to youth and women, who comprise over 50% of its workforce in both its Kenya and Uganda offices.
Earlier this year, the company was recognized as one of the fastest-growing private companies in America on the 2021 Inc. 5000 list for the second year in a row.
Here is the original post:
Sama has been named the Best in Machine Learning Platforms at the 2021 AI TechAwards - HapaKenya - HapaKenya
As machine learning becomes standard in military and politics, it needs moral safeguards | TheHill – The Hill
Over the past decade, the world has experienced a technological revolution powered by machine learning (ML). Algorithms remove the decision fatigue of purchasing books and choosing music, and the work of turning on lights and driving, allowing humans to focus on activities more likely to optimize their sense of happiness. Futurists are now looking to bring ML platforms to more complex aspects of human society, specifically warfighting and policing.
Technology moralists and skeptics aside, this move is inevitable, given the need for rapid security decisions in a world with information overload. But as ML-powered weapons platforms replace human soldiers, the risk of governments misusing ML increases. Citizens of liberal democracies can and should demand that governments pushing for the creation of intelligent machines for warfighting include provisions maintaining the moral frameworks that guide their militaries.
In his popular book The End of History, Francis Fukuyama summarized debates about the ideal political system for achieving human freedom and dignity. From his perspective in the middle of 1989, months before the unexpected fall of the Berlin Wall, no systems other than democracy and capitalism could generate wealth, pull people out of poverty and defend human rights; both communism and fascism had failed, creating cruel autocracies that oppressed people. Without realizing it, Fukuyama prophesied democracy's proliferation across the world. Democratization soon occurred through grassroots efforts in Asia, Eastern Europe and Latin America.
These transitions, however, wouldn't have been possible unless the military acquiesced to these reforms. In Spain and Russia, the military attempted a coup before recognizing the dominant political desire for change. China instead opted to annihilate reformers.
The idea that the military has veto power might seem incongruous to citizens of consolidated democracies. But in transitioning societies, the military often has the final say on reform due to its symbiotic relationship with the government. In contrast, consolidated democracies benefit from the logic of Clausewitz's trinity, where there is a clear division of labor between the people, the government and the military. In this model, the people elect governments to make decisions for the overall good of society while furnishing the recruits for the military tasked with executing government policy and safeguarding public liberty. The trinity, though, is premised on a human military with a moral character that flows from its origins among the people. The military can refuse orders that harm the public or represent bad policy that might lead to the creation of a dictatorship.
ML risks destabilizing the trinity by removing the human element of the armed forces and subsuming them directly into the government. Developments in ML have created new weapons platforms that rely less and less on humans, as new warfighting machines are capable of provisioning security or assassinating targets with only perfunctory human supervision. The framework of machines acting without human involvement risks creating a dystopian future where political reform will become improbable, because governments will no longer have human militaries restraining them from opening fire on reformers. These dangers are evident in China, where the government lacks compunction in deploying ML platforms to monitor and control its population while also committing genocide.
In the public domain, there is some recognition of these dangers of misusing ML for national security. But there hasn't been a substantive debate about how ML might shape democratic governance and reform. There isn't a nefarious reason for this. Rather, it's that many of those who develop ML tools have STEM backgrounds and lack an understanding of broader social issues. From the government side, leaders in agencies funding ML research often don't know how to consume ML outputs, relying instead on developers to explain what they're seeing. The government's measure for success is whether it keeps society safe. Throughout this process, civilians operate as bystanders, unable to interrogate the design process for ML tools used for war.
In the short term, this is fine because there aren't entire armies made of robots, but the competitive advantage offered by mechanized fighting not limited by frail human bodies will make intelligent machines essential to the future of war. Moreover, these terminators will need an entire infrastructure of satellites, sensors, and information platforms powered by ML to coordinate responses to battlefield advances and setbacks, further reducing the role of humans. This will only amplify the power governments have to oppress their societies.
The risk that democratic societies might create tools that lead to this pessimistic outcome is high. The United States is engaged in an ML arms race with China and Russia, both of which are developing and exporting their own ML tools to help dictatorships remain in power and freeze history.
There is space for civil society to insert itself into ML, however. ML succeeds and fails based on the training data used for algorithms, and civil society can collaborate with governments to choose training data that optimizes the warfighting enterprise while balancing the need to sustain dissent and reform.
By giving machines moral safeguards, the United States can create tools that instead strengthen democracy's prospects. Fukuyama's thesis is only valid in a world where humans can exert their agency and reform their governments through discussion, debate and elections. The U.S., in the course of confronting its authoritarian rivals, shouldn't create tools that hasten democracy's end.
Christopher Wall is a social scientist for Giant Oak, a counterterrorism instructor for Naval Special Warfare, a lecturer on statistics for national security at Georgetown University and the co-author of the recent book, The Future of Terrorism: ISIS, al-Qaeda, and the Alt-Right. Views of the author do not necessarily reflect the views of Giant Oak.
Read the original:
As machine learning becomes standard in military and politics, it needs moral safeguards | TheHill - The Hill
Board of the International Organisation of Securities Commissions (IOSCO) publishes final guidance report for artificial intelligence and machine…
Subsequent to the consultation report published by IOSCO in June 2021, the final guidance report (IOSCO Report) for artificial intelligence (AI) and machine learning (ML), entitled "The use of artificial intelligence and machine learning by market intermediaries and asset managers", was released on 7 September 2021.
Per the IOSCO Report, market intermediaries and asset managers tend to achieve cost reductions and improve efficiency through the use of AI and ML. While market intermediaries, asset managers and investors receive benefits including efficiency enhancement, cost reduction and resource sparing, a concern is the amplification of risks that affect the interests of consumers and other market participants.
In light of the above, the IOSCO Report sets out some recommended measures to ensure that the interests of investors and other relevant stakeholders are protected. Further, Annex 1 and Annex 2 to the IOSCO Report outline the regulators' responses to the challenges arising from AI and ML, and the guidance issued by supranational bodies, respectively.
Read the rest here:
Board of the International Organisation of Securities Commissions (IOSCO) publishes final guidance report for artificial intelligence and machine...