
Analyzing the potential of AlphaFold in drug discovery – MIT News

Over the past few decades, very few new antibiotics have been developed, largely because current methods for screening potential drugs are prohibitively expensive and time-consuming. One promising new strategy is to use computational models, which offer a potentially faster and cheaper way to identify new drugs.

A new study from MIT reveals the potential and limitations of one such computational approach. Using protein structures generated by an artificial intelligence program called AlphaFold, the researchers explored whether existing models could accurately predict the interactions between bacterial proteins and antibacterial compounds. If so, then researchers could begin to use this type of modeling to do large-scale screens for new compounds that target previously untargeted proteins. This would enable the development of antibiotics with unprecedented mechanisms of action, a task essential to addressing the antibiotic resistance crisis.

However, the researchers, led by James Collins, the Termeer Professor of Medical Engineering and Science in MIT's Institute for Medical Engineering and Science (IMES) and Department of Biological Engineering, found that these existing models did not perform well for this purpose. In fact, their predictions performed little better than chance.

"Breakthroughs such as AlphaFold are expanding the possibilities for in silico drug discovery efforts, but these developments need to be coupled with additional advances in other aspects of modeling that are part of drug discovery efforts," Collins says. "Our study speaks to both the current abilities and the current limitations of computational platforms for drug discovery."

In their new study, the researchers were able to improve the performance of these types of models, known as molecular docking simulations, by applying machine-learning techniques to refine the results. However, more improvement will be necessary to fully take advantage of the protein structures provided by AlphaFold, the researchers say.

Collins is the senior author of the study, which appears today in the journal Molecular Systems Biology. MIT postdocs Felix Wong and Aarti Krishnan are the lead authors of the paper.

Molecular interactions

The new study is part of an effort recently launched by Collins' lab called the Antibiotics-AI Project, which has the goal of using artificial intelligence to discover and design new antibiotics.

AlphaFold, AI software developed by DeepMind and Google, has accurately predicted protein structures from their amino acid sequences. This technology has generated excitement among researchers looking for new antibiotics, who hope that they could use the AlphaFold structures to find drugs that bind to specific bacterial proteins.

To test the feasibility of this strategy, Collins and his students decided to study the interactions of 296 essential proteins from E. coli with 218 antibacterial compounds, including antibiotics such as tetracyclines.

The researchers analyzed how these compounds interact with E. coli proteins using molecular docking simulations, which predict how strongly two molecules will bind together based on their shapes and physical properties.

This kind of simulation has been successfully used in studies that screen large numbers of compounds against a single protein target, to identify compounds that bind the best. But in this case, where the researchers were trying to screen many compounds against many potential targets, the predictions turned out to be much less accurate.

By comparing the predictions produced by the model with actual interactions for 12 essential proteins, obtained from lab experiments, the researchers found that the model had false positive rates similar to true positive rates. That suggests that the model was unable to consistently identify true interactions between existing drugs and their targets.

Using a measurement often used to evaluate computational models, known as auROC, the researchers also found poor performance. "Utilizing these standard molecular docking simulations, we obtained an auROC value of roughly 0.5, which basically says you're doing no better than if you were randomly guessing," Collins says.
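For concreteness, the auROC score described above can be computed directly as a pairwise rank statistic: it is the probability that a randomly chosen true interaction is scored above a randomly chosen non-interaction, so a score of 0.5 means the ranking is indistinguishable from chance. The following minimal sketch (not from the study) implements that definition:

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the fraction of (positive, negative) pairs in which the positive
    example receives the higher score, counting ties as 0.5."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A model that ranks every true interaction above every false one
# scores 1.0; a model that scores everything identically scores 0.5.
print(auroc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))  # 1.0
print(auroc([1, 0, 1, 0], [0.7, 0.7, 0.7, 0.7]))  # 0.5
```

In practice a library routine such as scikit-learn's `roc_auc_score` computes the same quantity; the explicit pairwise form is shown here only to make the "no better than random guessing" interpretation concrete.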

The researchers found similar results when they used this modeling approach with protein structures that have been experimentally determined, instead of the structures predicted by AlphaFold.

"AlphaFold appears to do roughly as well as experimentally determined structures, but we need to do a better job with molecular docking models if we're going to utilize AlphaFold effectively and extensively in drug discovery," Collins says.

Better predictions

One possible reason for the model's poor performance is that the protein structures fed into the model are static, while in biological systems, proteins are flexible and often shift their configurations.

To try to improve the success rate of their modeling approach, the researchers ran the predictions through four additional machine-learning models. These models are trained on data that describe how proteins and other molecules interact with each other, allowing them to incorporate more information into the predictions.

"The machine-learning models learn not just the shapes, but also chemical and physical properties of the known interactions, and then use that information to reassess the docking predictions," Wong says. "We found that if you were to filter the interactions using those additional models, you can get a higher ratio of true positives to false positives."
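The filtering step Wong describes can be sketched as a simple consensus gate: a docking hit survives only if the auxiliary models, trained on chemical and physical properties of known interactions, also rate it highly. The function shape, the score scale, and the protein-compound pairings below are illustrative assumptions, not the paper's actual pipeline:

```python
def consensus_filter(hits, rescoring_models, threshold=0.5):
    """Keep a predicted protein-compound interaction only if every
    auxiliary machine-learning model scores it at or above threshold.
    `hits` maps each (protein, compound) pair to its feature vector;
    each model maps a feature vector to a score in [0, 1]."""
    return [
        pair
        for pair, features in hits.items()
        if all(model(features) >= threshold for model in rescoring_models)
    ]

# Toy example: two stand-in rescoring models over a 2-feature vector,
# and two hypothetical docking hits.
models = [lambda f: f[0], lambda f: f[1]]
hits = {("FabZ", "drug_A"): (0.9, 0.8),   # kept: both models agree
        ("LpxC", "drug_B"): (0.9, 0.1)}   # dropped: one model disagrees
print(consensus_filter(hits, models))     # [('FabZ', 'drug_A')]
```

Requiring unanimity trades recall for precision, which is exactly the effect described in the study: fewer predicted interactions survive, but a larger share of the survivors are true positives.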

However, additional improvement is still needed before this type of modeling could be used to successfully identify new drugs, the researchers say. One way to do this would be to train the models on more data, including the biophysical and biochemical properties of proteins and their different conformations, and how those features influence their binding with potential drug compounds.

"This study both lets us understand just how far we are from realizing full machine-learning-based paradigms for drug development, and provides fantastic experimental and computational benchmarks to stimulate and direct and guide progress towards this future vision," says Roy Kishony, a professor of biology and computer science at Technion (the Israel Institute of Technology), who was not involved in the study.

With further advances, scientists may be able to harness the power of AI-generated protein structures to discover not only new antibiotics but also drugs to treat a variety of diseases, including cancer, Collins says. "We're optimistic that with improvements to the modeling approaches and expansion of computing power, these techniques will become increasingly important in drug discovery," he says. "However, we have a long way to go to achieve the full potential of in silico drug discovery."

The research was funded by the James S. McDonnell Foundation, the Swiss National Science Foundation, the National Institute of Allergy and Infectious Diseases, the National Institutes of Health, and the Broad Institute of MIT and Harvard. The Antibiotics-AI Project is supported by the Audacious Project, the Flu Lab, the Sea Grape Foundation, and the Wyss Foundation.


Lecturer / Senior Lecturer in Computer Science, Formal Methods and Logic job with UNSW Sydney | 307657 – Times Higher Education

Summary: Join an organisation that is shaping the future direction of computing in Australia in a role that conducts independent research and delivers excellent teaching in Formal Methods and Logic in Computer Science.

Job Details

The Opportunity

Join the School of Computer Science and Engineering (CSE) as a Lecturer/Senior Lecturer. You will be conducting independent research and delivering excellent teaching.

This position is in the area of Formal Methods and Logic in Computer Science, with preference for algorithmic verification and applications towards areas including security foundations, distributed computing, hybrid systems and autonomous systems.

Formal Methods and Logic are well represented at the School. The Formal Methods Group works on developing the theoretical foundations for reasoning about computational systems, and enabling computers themselves to perform such reasoning, to support the development of computational systems to the highest levels of assurance concerning their correctness, security and reliability. Specific areas of current focus are foundational models and logics for reasoning about fault-tolerant distributed computing, information flow, privacy, machine learning and smart contracts in blockchain systems. In teaching, we are developing a pedagogy in which students are first motivated to reason informally but rigorously (e.g., using assertions and invariants) about program correctness and program derivation, and then introduced to program verifiers such as Dafny. Advanced teaching areas covered by the group include theory of computing, concurrency theory, and algorithmic verification. The School's Trustworthy Systems Group concentrates on provable correctness for an actual operating-system kernel (seL4) and the Knowledge Representation Group applies logic in AI.
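As a toy illustration of the assertion-and-invariant style of reasoning described above (purely illustrative, and written in Python with runtime `assert` statements rather than in Dafny, which proves such properties statically at compile time):

```python
def sum_to(n):
    """Sum the integers 0..n, checking the loop invariant
    total == i*(i+1)//2 at runtime. A verifier such as Dafny would
    discharge the invariant and the postcondition once, statically."""
    total, i = 0, 0
    while i < n:
        assert total == i * (i + 1) // 2   # loop invariant
        i += 1
        total += i
    assert total == n * (n + 1) // 2       # postcondition
    return total

print(sum_to(10))  # 55
```

The invariant holds on loop entry (0 == 0), is preserved by each iteration, and together with the exit condition `i == n` implies the postcondition, which is the reasoning pattern students practice informally before meeting a mechanical verifier.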

The ideal candidate will have a track record and an ongoing research program of use-inspired basic research in Formal Methods in which new methods and theories are developed that help to write programs that are provably correct, with applications to the programming challenges of today and the future (e.g., privacy, security, reliability and autonomy). It is desirable that the candidate has made contributions to the theoretical foundations of model checkers, program verifiers (e.g., Dafny) or automatic theorem provers, and has experience using them on actual program-development projects. The candidate will have enthusiasm for conveying both theory and practice to students.

Evidence of publications in these venues (or similar) would be desirable:

This is an opportunity to join an organisation that is helping to shape the future direction of computing in Australia. The students and research produced in CSE can impact the world!

The role of Lecturer/Senior Lecturer reports to the Head of School and has no direct reports.

The School

Computer Science and Engineering (CSE) in the Faculty of Engineering at UNSW is one of the largest Schools of its kind in Australia, with the greatest impact on society through our academic excellence in teaching, research, and commercial and social engagement. The School is the largest within the Faculty of Engineering, with over 3,400 students, 60 academic staff (growing to 70 over the coming year) and an operating budget of over $20 million. CSE is undergoing a period of expansion, advertising and recruiting for over 10 new academic staff in 2022.

Our academic staff have research focus in areas including Artificial Intelligence, Biomedical Image Computing, Data and Knowledge, Embedded Systems, Networked Systems and Security, Human Centred Computing, Programming Languages and Compilers, Service Oriented Computing, Theoretical Computer Science and Trustworthy Systems.

CSE offers undergraduate programs in Software Engineering, Computer Engineering, Computer Science and Bioinformatics, as well as a number of combined degrees with other disciplines. CSE attracts excellent students who have an outstanding record in international competitions. People join CSE for the opportunity to work with top-tier students and to join a community of scholars who support them to achieve their full potential. CSE attracts the brightest students as we offer the most technically challenging computing degrees in Australia. The challenges we present ensure our students reach their greatest potential and are ready to have a lasting impact on society.

Our school is located in the heart of Sydney, Australia's largest centre for computationally driven business, design and culture. This vibrant nexus brings together a diversity of creative engineering and design forces, where world-leading education allows our thousands of students and researchers to become world-leading and world-building innovators. CSE students take an active role in the creation of a vibrant student experience, with many student societies, and are actively involved in teaching and learning opportunities within the school. For further information about the School, please visit http://www.cse.unsw.edu.au

UNSW

UNSW is currently implementing a ten-year strategy to 2025 and our ambition for the next decade is nothing less than to establish UNSW as Australia's global university. Following extensive consultation in 2015, UNSW identified three strategic priority areas. Firstly, a drive for academic excellence in research and education. Universities are often classified as research intensive or teaching intensive. UNSW is proud to be an exemplar of both. We are amongst a limited group of universities worldwide capable of delivering research excellence alongside the highest quality education on a large scale. Secondly, a passion for social engagement, which improves lives through advancing equality, diversity, open debate and economic progress. Thirdly, a commitment to achieving global impact through sharing our capability in research and education in the highest quality partnerships with institutions in both developed and emerging societies. We regard the interplay of academic excellence, social engagement and global impact as the hallmarks of a great forward-looking 21st century university.

Skills & Experience

Lecturer (Level B)

Senior Lecturer (Level C)

Additional details about the specific responsibilities for this position can be found in the position description.

To Apply: If you are interested in an academic career in a role that conducts independent research and delivers excellent teaching, please click the apply now button and submit your CV, Cover Letter and systematic responses to the Skills and Experience.

Applicants are actively encouraged not to include conference/journal/CORE rankings but should instead focus on the impact of their research outputs in describing the excellence of their research. Clarity concerning individual contributions to group outputs is essential.

Please note applications will not be accepted if sent to the contact listed below.

Contact:

Eugene Aves, Talent Acquisition Consultant

E: eugene.aves@unsw.edu.au

Applications close: 11:50 pm (Sydney time), on Wednesday 10th October 2022

UNSW is committed to equity, diversity and inclusion. Applications from women, people of culturally and linguistically diverse backgrounds, those living with disabilities, members of the LGBTIQ+ community, and people of Aboriginal and Torres Strait Islander descent are encouraged. UNSW provides workplace adjustments for people with disability, and access to flexible work options for eligible staff. The University reserves the right not to proceed with any appointment.


Research Fellow, Computer Science job with NATIONAL UNIVERSITY OF SINGAPORE | 307509 – Times Higher Education

Job Description

The Computer Science Department at National University of Singapore is seeking a motivated postdoctoral researcher with expertise in artificial intelligence, specifically multi-agent reinforcement learning, starting immediately.

Qualifications

Job requirements:

How to apply:

To apply for this position, please send your CV to Harold Soh at harold@comp.nus.edu.sg with a cover letter and a brief statement of previous work and research interests. The initial term of appointment will be 1 year. The selected candidate will be offered a competitive salary and benefits.

NUS is a world-class university that provides an outstanding and supportive research environment. Its School of Computing is ranked within the top 10 among the computer science departments in the world. Singapore is a vibrant, well-connected city with low taxes, and a research hub in Asia.

Covid-19 Message

At NUS, the health and safety of our staff and students are among our utmost priorities, and COVID-vaccination supports our commitment to ensure the safety of our community and to make NUS as safe and welcoming as possible. Many of our roles require a significant amount of physical interaction with students, staff and members of the public. Even for job roles that may be performed remotely, there will be instances where on-campus presence is required.

Taking into consideration the health and well-being of our staff and students and to better protect everyone in the campus, applicants are strongly encouraged to have themselves fully COVID-19 vaccinated to secure successful employment with NUS.

More Information

Location: Kent Ridge Campus
Organization: School of Computing
Department: Department of Computer Science
Employee Referral Eligible: No
Job requisition ID: 17128


Collegiate achievements and honors for Sauk Valley-area students – Sauk Valley Media

College students from the Sauk Valley area who achieve academic recognition.

Spring Commencement

Dixon. Gretchen Bushman, Bachelor of Science, Communication Sciences and Disorders

Summer Dean's List

Erie. Bailey C Youngberg

Morrison. Terrie Carroll

Forreston. Gavin M Fuchs

Shannon. Kaylee N Hammer

College of Liberal Arts and Sciences

Award in Environmental Science

West Brooklyn. Pamela Taylor

Undergraduate Summer Research and Artistry Award

West Brooklyn. Pamela Taylor

Johnny Carson Center for Emerging Media Arts Cohort selections

Dixon. Laynie Berkey

Spring graduates

Byron. Eryn Murphy, Bachelor of Science, Health Promotion, Magna Cum Laude

Erie. Alexis Verkruysse, Bachelor of Science, Elementary Education, Cum Laude

Morrison. Kaleb Banks, Bachelor of Science, Wildlife Ecology and Mgt, Cum Laude

May and summer graduates

Erie. Shannon Fry, Master of Science in Nursing

Spring Honors

Forreston. Sierra Reining.

Mount Carroll. Natalie Limesand

Spring Dean's List

Dixon. Kate Bonnell

Oregon. Jadyn Bothe

Polo. Cole Faivre, Valeria Viteri-Pflucker

Morrison. Cassie Osborn

Jadyn Bothe, of Oregon, IL, a first-year majoring in Neuroscience.

Summer graduates

Byron. Nolan J Bielskis, Bachelor of Science, Law Enforcement & Justice Administration

Forreston. Gavin M Fuchs, Bachelor of Science, Exercise Science

Rock Falls. Kristen Lynn Shumard, Master of Science in Education, Educational Leadership

Sterling. Emily A Heitman, Bachelor of Science, Law Enforcement & Justice Administration; Keonna C Lauts, Bachelor of Business, Management

Summer Graduation List

Byron. Caleb Denton, Bachelor of Science, Mechanical Engineering

Creston. Abigail Kerns, Bachelor of Science, Nursing

Dixon. Jennifer LeMoine, Bachelor of Arts, Environmental Studies; Jill Silvest, Master of Science in Education, Educational Administration

Leaf River. Samantha Poe, Bachelor of Fine Arts, Art Studio and Design - Art Studio

Morrison. Kay Smith, Bachelor of General Studies, General Emphasis; Micheal Baumann, Master of Science in Education, Physical Education - Exercise Physiology

Prophetstown. Zachary Stanhoff, Bachelor of Science, Mathematics - General Mathematics

Rochelle. Sandra Galvan, Master of Science, Applied Human Development & Family Sciences: Marriage and Family Therapy; Laura Lopez, Bachelor of Science, Nursing; Kyle Seebach, Master of Science, Data Analytics

Sterling. Natalie Ramos, Bachelor of Arts, Psychology

Steward. Kelly Wakefield, Master of Science in Education, SPED-Assistive Technology Used by Persons with Visual Impairments

Spring Dean's List

Dixon. Cadyn Grafton, Grace Mitchell, Paige Stees, Emma Rapp, Kaitlyn Ortgiesen, Ashley Winters, Julia Heller, Anna Logan.

Franklin Grove. Connor Colby.

Sublette. Margaret Vaessen.

Erie. Kadin Shaheen.

Morrison. Lindsey Houldson.

Prophetstown. Emily Brooks, Sydney Minseen.

Rock Falls. Kassandra Estrella, Andrew Cannell.

Sterling. Brianna Juarez, Hunter Carrell, Grace Gould, Michael Frank, Brooke Wilson, Priscila Espinoza Castillo.

Mount Carroll. Trevor Bickelhaupt, Olivia Charles, Brianna Brice.

Savanna. Chance Williams.

Byron. Hector Hernandez.

Oregon. Claudia Reckamp.

Polo. Gabriel Boothe, Molly Duncan, Teagan Prescott.

Rochelle. Sterling Devers, Addison Curtis, Tara Leininger,

Stillman Valley. Brooke Mickey.

Bronze tablet list

Sterling. Kolten Conckle, liberal arts and sciences

Polo. Molly Duncan, agricultural, consumer and environmental sciences

Spring Semester graduates

Amboy. Lauren Gerdes. Master of Human Resources and Industrial Relations. Human Resources and Industrial Relations

Dixon. Patrick Johnson. Grainger Engineering. Bachelor of Science in Industrial Engineering. Industrial Engineering

Dixon. Ashley Winters. Liberal Arts and Sciences. Bachelor of Arts in Liberal Arts and Sciences. Political Science

Sterling. Briana Emini. Education. Bachelor of Science in Elementary Education. Elementary Education

Milledgeville. Kelley Parks. Master of Science in Agricultural Leadership, Education, and Communications Agricultural Leadership, Education, and Communications

Savanna. Jordan Anderson. Liberal Arts and Sciences. Bachelor of Arts in Liberal Arts and Sciences. Communication

Savanna. Glen Johnston. Agricultural, Consumer and Environmental Sciences. Bachelor of Science in Agricultural Leadership, Education, and Communications. Agricultural Leadership, Education, and Communications

Morrison. Krysta Mapes. Master of Science in Library and Information Science. Library and Information Science

Morrison. Brenna Rickels. Master of Accounting Science. Accountancy

Rock Falls. Daniela Cervantes. Grainger Engineering. Bachelor of Science in Computer Science. Computer Science

Rock Falls. Karley Crady. Media. Bachelor of Science in Journalism. Journalism

Rock Falls. Nolan Moeller. Master of Science in Accountancy. Accountancy

Rock Falls. Faith Sandrock. Agricultural, Consumer and Environmental Sciences. Bachelor of Science in Agricultural and Consumer Economics. Agricultural and Consumer Economics

Sterling. Mitchell Clodfelter. Agricultural, Consumer and Environmental Sciences. Bachelor of Science in Agricultural and Consumer Economics. Agricultural and Consumer Economics

Sterling. Kolten Conklen. Hunter-Scott. Liberal Arts and Sciences. Bachelor of Arts in Liberal Arts and Sciences. Global Studies. East Asian Languages and Cultures. Highest Distinction. Summa Cum Laude

Sterling. Brianna Juarez. Agricultural, Consumer and Environmental Sciences. Bachelor of Science in Animal Sciences. Animal Sciences

Sterling. Sarah Ogg. Liberal Arts and Sciences. Bachelor of Science in Liberal Arts and Sciences. Molecular and Cellular Biology

Sterling. Logan Rocha. Grainger Engineering. Bachelor of Science in Computer Engineering. Computer Engineering

Sterling. Jerry Rodriguez. Fine and Applied Arts. Bachelor of Science in Architectural Studies. Architectural Studies

Sterling. Katelyn Smoot. Agricultural, Consumer and Environmental Sciences. Bachelor of Science in Animal Sciences. Animal Sciences

Sterling. Jacqueline Walters. Master of Science in Agricultural Education. Agricultural Education

Ashton. Seth McMillan. Master of Science in Agricultural Leadership, Education, and Communications. Agricultural Leadership, Education, and Communications

Byron. Annabella Andreen. Applied Health Sciences. Bachelor of Science in Interdisciplinary Health Sciences. Interdisciplinary Health Sciences. Highest Honors

Byron. Rachael Bell. Grainger Engineering. Bachelor of Science in Bioengineering. Bioengineering

Byron. Eric Hoshaw. Grainger Engineering. Bachelor of Science in Engineering Physics. Engineering Physics

Dixon. Tayla Schwarz. Liberal Arts and Sciences. Bachelor of Science in Liberal Arts and Sciences. Psychology

Forreston. Christian Groenewold. Bachelor of Science in Civil Engineering. Civil Engineering, Honors

Forreston. Christina Lewis. Master of Education in Education Policy, Organization and Leadership. Education Policy, Organization and Leadership

Monroe Center. Caroline Hickey. Education. Bachelor of Science in Early Childhood Education. Early Childhood Education

Monroe Center. Joseph Madrid. Fine and Applied Arts. Bachelor of Science in Architectural Studies. Architectural Studies

Oregon. Benjamin Libman. Media. Bachelor of Science in Advertising. Advertising

Oregon. Paul Reckamp. Master of Science in Electrical and Computer Engineering. Electrical and Computer Engineering

Oregon. Sophie West. Liberal Arts and Sciences. Bachelor of Arts in Liberal Arts and Sciences. English

Polo. Molly Duncan. Agricultural, Consumer and Environmental Sciences. Bachelor of Science in Crop Sciences. Crop Sciences. Highest Honors

Polo. Randal Gabaldon. Gies Business. Bachelor of Science in Finance and in Marketing


Niki Narayani Named SEC Co-Runner of the Week – Vanderbilt University

NASHVILLE, Tenn. Senior Niki Narayani has been named SEC Co-Runner of the Week as announced by the conference office Tuesday morning.

"Niki has given the program its first glimpse into what we are working relentlessly to display this season," said Althea Thomas, Vanderbilt director of cross country, track and field. "Hard work coupled with faith and execution was the formula Niki used to compete in the first meet and is the formula that has gotten her recognition among our SEC peers. She is a great leader for our program and a catalyst for the year."

Narayani was the women's 5k winner with a time of 18:00.88 in her come-from-behind victory, bringing Vandy to a total of 24 points. She paced the efforts of the Commodores, who began the competition with a 45-second delayed start.

Narayani's finish set the tone for the rest of the team as Vanderbilt finished in the top 11 and completed the race in less than 20 minutes. The Vanderbilt women are currently ranked No. 6 in the South Region, according to a poll by the U.S. Track & Field and Cross Country Coaches Association. The Dores finished ahead of Lipscomb, which is ranked seventh in the same region.

The Commodores have the weekend off before heading to Bloomington, Indiana, for the Coaching Tree Invitational on Sept. 16, hosted by Indiana University.


Ten questions about the hard limits of human intelligence – Aeon

Despite his many intellectual achievements, I suspect there are some concepts my dog cannot conceive of, or even contemplate. He can sit on command and fetch a ball, but I suspect that he cannot imagine that the metal can containing his food is made from processed rocks. I suspect he cannot imagine that the slowly lengthening white lines in the sky are produced by machines also made from rocks like his cans of dog food. I suspect he cannot imagine that these flying repurposed dog food cans in the sky look so small only because they are so high up. And I wonder: is there any way that my dog could know that these ideas even exist? It doesn't take long for this question to spread elsewhere. Soon I start to wonder about concepts that I don't know exist: concepts whose existence I can never even suspect, let alone contemplate. What can I ever know about that which lies beyond the limits of what I can even imagine?

Attempting to answer this question only leads us to more questions. In this essay, I'm going to run through a sequence of 10 queries that provide insight into how we might begin conceiving of what's at stake in such a question and how to answer it, and there is much at stake. The question of what we can know of that which lies beyond the limits of our imagination is partially about the biological function of intelligence, and partially about our greatest cognitive prostheses, particularly human language and mathematics. It's also about the possibility of a physical reality that far exceeds our own, or endless simulated realities running in the computers of advanced nonhuman lifeforms. And it's about our technological progeny, those children who will one day cognitively eclipse us. From the perspective of my 10 queries, human exceptionalism becomes very shaky. Perhaps we are more like dogs (or single-celled paramecia) than we'd care to admit. Though human history is filled with rhapsodic testimony to human ingenuity and intelligence, this sequence of questions paints a different picture: I want to emphasise how horribly, and perhaps horrifyingly, limited and limiting our achievements are: our language, science, and mathematics.

And so, the first question in the sequence is simple:

1. On some ill-defined objective scale, are we smart or are we stupid?

For vast stretches of time, the highest level of intelligence on Earth seems to have increased very slowly, at best. Even now, our brains process sensory-motor information using all kinds of algorithmic shenanigans that allow us to do as little actual thinking as possible. This suggests that the costs associated with intelligence are high. It turns out that brains are extraordinarily expensive metabolically on a per-unit-mass basis, far more than almost all other organs (the heart and liver being the exceptions). So, the smarter an organism is, the more food it needs, or it dies. Evolutionarily speaking, it is stupid to be smart.

We do not have a good understanding of exactly how our neural hardware grants us abstract intelligence. We do not understand how brain makes mind. But given that more intelligence requires more brain mass, which results in more metabolic costs, one would expect us to have the lowest possible level of abstract intelligence required for surviving in the precise ecological niche in which Homo sapiens developed: the barest minimum intelligence needed to scrape through a few million years of hunting and gathering until we got lucky and stumbled into the Neolithic Revolution.

Is this conclusion correct? To gain insight into the question of whether were smart or stupid, note that there are multiple types of intelligence. The ability to sense the external world is one such type of cognitive capability; the ability to remember past events is another; the ability to plan a future sequence of actions is another. And there are myriad cognitive capabilities that other organisms have but that we lack. This is true even if we consider only intelligences that we have created: modern digital computers vastly outperform us computationally in myriad ways. Moreover, the small set of those cognitive tasks that we can still perform better than our digital computers is substantially shrinking from year to year.

This will continue to change. The capabilities of future terrestrial organisms will likely exceed the current level of our digitally augmented intelligence. This sense of cognitive expansion is not unique to our current moment in history. Think about the collective cognitive capability of all organisms living on Earth. Imagine a graph showing this collective capability changing over billions of years. Arguably, no matter what precise time-series analysis technique we use, and no matter how we formalise cognitive capability, we will conclude that the trend line has a strictly positive slope. After all, in no period has the highest level of some specific cognitive capability held by any entity in the terrestrial biosphere shrunk; the entire biosphere has never lost the ability to engage in certain kinds of cognitive capability. Also, there is not just growth over time in the degree of each cognitive capability among all terrestrial species, but a growth in the kinds of cognitive capability. Life has become only smarter, and smarter in different ways. If we simply extrapolate this trend into the future, we're forced to conclude that some future organisms will have cognitive capabilities that no currently living Terran species has, including us.

Despite preening in front of our collective mirror about how smart we are, it seems that we have highly limited cognitive abilities compared with those that we (or other Terran organisms) will have in the future.

However, before getting too comfortable with this conclusion, we need to look a little closer at our graph of collective capability. Up until around 50,000 years ago, the collective intelligence on Earth was increasing gradually and smoothly. But then there was a major jump as modern Homo sapiens started on a trajectory that would ultimately produce modern science, art and philosophy. It may appear as though we are still part of this major jump, this vast cognitive acceleration, and that our kinds of intelligence far exceed those of our hominin ancestors.

2. Why does there appear to be a major chasm between the cognitive capabilities of our hominin ancestors and the cognitive capabilities of modern scientists, artists and philosophers?

There is no evident fitness benefit for a savannah-forged hairless ape to be able to extract from the deepest layers of physical reality cognitive palaces like the Standard Model of particle physics, Chaitin's incompleteness theorem, or the Zen parable Ten Verses on Oxherding. In fact, there are likely major fitness costs to having such abilities. So why do we have them?

To grapple with this, it's helpful to focus on the most universal of humanity's achievements, the most graphic demonstrations of our cognitive abilities: our science and mathematics. Our ability to exploit science and mathematics has provided us with cognitive prostheses and extended minds, from printing presses to artificial intelligences. Furthermore, the capabilities of those extended minds have been greatly magnified over time by the cumulative collective process of culture and technological development. In turn, these extended minds have accelerated the development of culture and technology. This feedback loop has allowed us to expand our cognitive capabilities far beyond those generated solely by genotypic evolution. The loop may even be the cause of the chasm between the cognitive capabilities of our hominin ancestors and the cognitive capabilities of the modern scientists, artists and philosophers.

Though the feedback loop has inflated our original cognitive capabilities (those generated by genotypic evolution), it is not clear that it has provided us with any wholly new cognitive capabilities. In fact, it might never be able to. Perhaps future forms of science and mathematics, generated via the feedback loop, will be forever constrained by the set of cognitive capabilities we had when we first started running the loop.

This suggests a different kind of resolution to the chasm between the cognitive abilities of our hominin ancestors and those of modern humans. Maybe the gap is not really a chasm at all. Perhaps it is more accurately described as a small divot in a vast field of possible knowledge. In an article titled 'The Unreasonable Effectiveness of Mathematics in the Natural Sciences' (1960), the Hungarian-American theoretical physicist Eugene Wigner asked why our mathematical theories work so well at capturing the nature of our physical reality. Maybe the answer to Wigner's question is that our mathematics isn't very effective at all. Maybe our mathematics can capture only a tiny sliver of reality. Perhaps the reason it appears to us to be so effective is because our range of vision is restricted to that sliver, to those few aspects of reality that we can conceive of.

The interesting question is not why our augmented minds seem to have abilities greater than those necessary for the survival of our ancestors. Rather, it's whether our augmented minds will ever have the minimal abilities necessary for grasping reality.

3. Even aided by our extended minds, can we ever create entirely new forms of science and mathematics that could access aspects of physical reality beyond our conception, or are we forever limited to merely developing the forms we already have?

In 1927, an earlier version of this question was suggested by the English scientist John Burdon Sanderson Haldane in his book of essays Possible Worlds. 'Now, my own suspicion,' he wrote, 'is that the universe is not only queerer than we suppose, but queerer than we can suppose.' In the years that followed, similar verbal baubles suggested that the Universe may be 'stranger' or 'odder' than we can 'imagine' or 'conceive'. But, having other fish to fry, the authors of these early texts rarely fleshed out what they meant. They often implied that the Universe may be stranger than we can currently imagine due to limitations in current scientific understanding, rather than inherent limitations of what we can ever do with future efflorescences of our minds. Haldane, for example, believed that once we embraced different points of view, reality would open itself to us: 'one day man will be able to do in reality what in this essay I have done in jest, namely, to look at existence from the point of view of non-human minds.'

In the decades since, other forms of this question have appeared in academic literature, mostly in studies of the hard problem of consciousness and the closely related mind-body problem. This work on consciousness and minds has echoed Haldane by chasing the point of view of octopuses, viruses, insects, plants, and even entire ecosystems in the search for intelligence beyond the human.

The question of whether we are in a simulation or not is actually rather trivial

Many of these investigations have been informal, reflecting the squishy, hard-to-pin-down nature of the hard problem of consciousness. Fortunately, we can approach the underlying question of whether we can think beyond our current limits in a more rigorous manner. Consider the recently (re)popularised idea that our physical universe might be a simulation produced in a computer operated by some super-sophisticated race of aliens. This idea can be extended ad infinitum: perhaps the aliens simulating our universe might themselves be a simulation in the computer of some even more sophisticated species in a sequence of ever-more sophisticated aliens. Going in the other direction, in the not-too-distant future we might produce our own simulation of a universe, complete with entities who have cognitive capabilities. Perhaps those simulated entities can produce their own simulated universe, and so on and so on. The result would be a sequence of species, each running a computer simulation that produces the one just below it, with us somewhere in the sequence.

This question of whether we are in a simulation or not is actually rather trivial: yes, in some universes we are a simulation, and no, in some other universes we are not. For argument's sake though, let's restrict attention to universes in which we are indeed simulated. This leads us to our next question.

4. Is it possible for an entity that exists only in a computer simulation to run an accurate computer simulation of the higher entity that simulated them?

If the answer is no, then whatever we contemplate in our universe is only a small subset of what can be known by those who reside higher in the sequence of more complex simulations. It would also mean that there are deep aspects of reality that we cannot even imagine.

Of course, the answer to this question depends on the precise definitions of terms such as simulation and computer. Formal systems theory and computer science provide many theorems that suggest that, whatever definitions we adopt, the answer to the question is indeed no. However, rather than expounding on these theorems that suggest our cognitive abilities are limited, I'd like to take a step back. These theorems are examples of the content of our mathematics, examples of our mathematical ability and ideas. Much of this content already suggests our cognitive abilities are too limited to fully engage with reality. But what about other aspects of our mathematics?

5. Does the form, rather than the content, of our science and mathematics suggest that the cognitive abilities of humans are also severely constrained?

Open any mathematics textbook and you'll see equations linked by explanatory sentences. Human mathematics is really the sum total of every equation and explanatory sentence inside every mathematics textbook ever written.

Now notice that each of those sentences and equations is a finite sequence of marks on the page, a finite sequence of visual symbols consisting of the 52 letters of the Latin alphabet, as well as special symbols such as + and =. For example, 1 + 1 + y = 2x is a sequence of eight elements from a finite set of marks. What we call mathematical proofs are strings of such finite sequences strung together.
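The combinatorial point can be made concrete with a toy sketch in Python. The six-mark alphabet below is invented for illustration; the point is that any equation, such as the one above, is just one sequence drawn from a finite pool of possible sequences.

```python
from itertools import product

# A toy "alphabet" of mathematical marks: a few Latin letters plus
# special symbols. (Invented for illustration; real mathematics uses
# a larger, but still finite, set of marks.)
alphabet = ["x", "y", "1", "2", "+", "="]

# Every finite sequence of 3 marks over this alphabet: 6**3 = 216 of them.
length3 = ["".join(seq) for seq in product(alphabet, repeat=3)]
print(len(length3))  # 216

# The equation "1+1+y=2x" is one particular sequence of eight marks;
# there are exactly 6**8 such sequences of length 8, a finite number.
target = "1+1+y=2x"
print(len(target), all(ch in alphabet for ch in target))  # 8 True
print(6 ** 8)  # 1679616
```

A proof, on this view, is nothing more than a concatenation of such finite sequences, itself a finite sequence over the same alphabet.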

This feature of human mathematics has implications for an understanding of reality in the broadest sense. To paraphrase Galileo, all our current knowledge about physics, our formal understanding of the foundations of physical reality, is written in the language of mathematics. Even the less formal sciences are still structured in terms of human language, using finite strings of symbols, like mathematics. This is the form of our knowledge. Our understanding of reality is nothing more than a large set of finite string sequences, each containing elements from a finite set of possible symbols.

Note that any sequence of marks on a page has no more meaning in and of itself than the sequences one might find in the entrails of a sacrificed sheep, or in the pattern of cracks in a heated tortoise shell. This observation isn't new. Much work in philosophy is a reaction to this observation that our science and mathematics is just a set of finite sequences of symbols with no inherent meaning. This work tries to formalise the precise way that such finite sequences might refer to something outside of themselves: the so-called symbol-grounding problem in cognitive science and philosophy. The field of mathematics has reacted to this observation in a similar way, expanding formal logic to include modern model theory (the study of the relationships between sentences and the models they describe) and metamathematics (the study of mathematics using mathematics).

What is truly stunning about the fact that modern science and mathematics are formulated through a sequence of marks is its exclusivity: nothing other than these finite sequences of symbols is ever found in modern mathematical reasoning.

6. Are these finite strings of symbol sequences, the form of our mathematics and languages, necessary features of physical reality, or do they instead reflect the limits of our ability to formalise aspects of reality?

This question immediately gives rise to another:

7. How would our perception of reality change if human mathematics were expanded to include infinite strings of symbol sequences?

Infinite proofs with an infinite number of lines would never reach their conclusion in finite time, if evaluated at a finite speed. To reach their conclusion in finite time, our cognitive abilities would need to implement some kind of hypercomputation or super-Turing computing, which are fancy ways of referring to speculative computers more powerful than any we can currently construct. (An example of a hypercomputer is a computer on a rocket that approaches the speed of light, and so exploits relativistic time dilation to squeeze an arbitrarily large amount of computation into a finite amount of time.)

But even with hypercomputation, this suggested extension of our current form of mathematics would still be presented in terms of human mathematics. What would a mathematics be like whose very form could not be described using a finite sequence of symbols from a finite alphabet?

The American philosopher Daniel Dennett and others have pointed out that the form of human mathematics, and of our sciences more generally, just happens to exactly coincide with the form of human language. Indeed, starting with Ludwig Wittgenstein, it has become commonplace to identify mathematics as a special case of human language, with its own kind of grammar like that which arises in human conversation.

I marvel at the limits of human language, and the fact that these limitations appear to be universal

The design of inter-human communication matches that of formal logic and Turing-machine theory. Some philosophers have taken this as a wonderful stroke of fortune. We happen to have a cognitive prosthesis human language capable of capturing formal logic. They presume this means we are also capable of fully capturing the laws of the physical universe.

A cynic might comment, with heavy irony: 'How lucky can you get? Humans have exactly the cognitive capabilities needed to capture all aspects of physical reality, and not a drop more!' A cynic might also wonder whether an ant, who is only capable of formulating the rules of the Universe in terms of pheromone trails, would conclude that it is a great stroke of fortune that ants happen to have the cognitive capability of doing precisely that; or whether a phototropic plant would conclude that it is a stroke of fortune that they happen to have the cognitive capability to track the Sun, since that must mean that they can formulate the rules of the Universe.

Linguists such as Noam Chomsky and others have marvelled at the fact that human language allows recursion, that we can produce arbitrary sequences of symbols from a finite alphabet. They marvel at the fact that humans can create what appears to be an amazingly large set of human languages. But I marvel at the limits of human language. I marvel at the limits of our science and mathematics. And I marvel at the fact that these limitations appear to be universal.

8. Is it a lucky coincidence that mathematical and physical reality can be formulated in terms of our current cognitive abilities, or is it just that, tautologically, we cannot conceive of any aspects of mathematical and physical reality that cannot be formulated in terms of our cognitive capabilities?

Consider a single-celled, oblong paramecium, the kind that float in oceans or stagnant pools. It may seem obvious, but a paramecium, like my dog, cannot conceive of the concept of a question concerning issues that have no direct impact on its behaviour. A paramecium cannot understand the possible answers we have considered for our questions concerning reality, but neither would it understand the questions themselves. More fundamentally, though, no paramecium can even conceive of the possibility of posing a question concerning physical reality. Insofar as the cognitive concept of questions and answers might be a crucial tool to any understanding of physical reality, a paramecium lacks the tools needed to understand physical reality. It presumably does not even understand what understanding reality means, in the sense that we are using the term. Ultimately, this is due to limitations in the kind of cognitive capabilities paramecia possess. But are we so different? We almost surely have similar kinds of limitations in terms of our cognitive capabilities. So, the penultimate (and ironically self-referential) question in this essay is:

9. Just as the notion of a question is forever beyond a paramecium, are there cognitive constructs that are necessary for understanding physical reality, but that remain unimaginable due to the limitations of our brains?

It may help to clarify this question by emphasising what it is not. This question does not concern limitations on what we can know about what it is that we can never know. We can conceive of many things even if they can never be known. But among those things that we can never know is a strictly smaller subset of things that we cannot imagine. The issue is what we can ever perceive of that smaller set.

For example, we can conceive of other branches of the many worlds of quantum mechanics, even if we cannot know what happens in those branches. I am not here concerned with this kind of unknowable. Nor am I concerned with values of variables that are unknown to us simply because we cannot directly observe them, such as the variables of events outside our Hubble sphere, or events within the event horizon of a black hole. These events can never be known to us for the simple reason that our ancillary engineering capabilities are not up to the task, not for any reasons intrinsic to limitations of the science and maths our minds can construct. They can be known, but we cannot find a path to such knowledge.

The concern here is what kinds of unknowable cognitive constructs might exist that we can never even be aware of, never mind describe (or implement).

It seems likely that our successors will have a larger set of things they can imagine than our own

The paramecium cannot even conceive of the cognitive construct of a question in the first place, never mind formulate or answer a question. I wish to draw attention to the issue of whether there are cognitive constructs that we cannot conceive of but that are as crucial to understanding physical reality as the simple construct of a question. I am emphasising the possibility of things that are knowable, but not to us, because we are not capable of conceiving of that kind of knowledge in the first place.

This returns us to an issue that was briefly discussed above, of how the set of what-we-can-imagine might evolve in the future. Suppose that what-can-be-known-but-not-even-conceived-of is non-empty. Suppose we can know something about that which we truly can't imagine.

10. Is there any way that we could imagine testing whether our future science and mathematics can fully capture physical reality?

From a certain perspective, this question might appear to be a scientific version of a conspiracy theory, writ large. One might argue that it is not so different to other grand unsolvable questions. We also can't prove that ghosts don't exist, either theoretically or empirically; nor that Marduk, the patron god of ancient Babylon, doesn't really pull the strings in human affairs. However, there are at least three reasons to suspect that we actually can find the answer to (some aspects of) the question. First, we could make some inroads if we ever constructed a hypercomputer and exploited it to consider the question of what knowledge is beyond us. Second, more speculatively, as our cognitive abilities grow, we might be able to establish the existence of what we can never conceive of through observation, simulation, theory or some other process. In other words, it may be that the feedback loop between our extended minds and our technology does let us break free of the evolutionary accident that formed our hominin ancestors' brains. Third, suppose we encounter extraterrestrial intelligence and can plug into, for example, some vast galaxy-wide web of interspecies discourse, containing a cosmic repository of questions and answers. To determine whether there are aspects of physical reality that are knowable but that humans cannot even conceive of might require nothing more than posing that question to the cosmic forum, and then learning the answers that are shared.

Consider our evolutionary progeny in the broadest sense: not just future variants of our species that evolve from us via conventional neo-Darwinian evolution, but future members of any species that we consciously design, organic or inorganic (or both). It seems quite likely that the minds of such successors will have a larger set of things they can imagine than our own.

It also seems likely that these cognitively superior children of ours will be here within the next century. Presumably we will go extinct soon after their arrival (like all good parents making way for their children). So, as one of our last acts on our way out the door, as we gaze up at our successors in open-mouthed wonder, we can simply ask our questions of them.

Parts of this essay were adapted from the article What Can We Know About That Which We Cannot Even Imagine? (2022) by David Wolpert.

Published in association with the Santa Fe Institute, an Aeon Strategic Partner.


Ten questions about the hard limits of human intelligence - Aeon


Filings buzz in the mining industry: 30% increase in big data mentions in Q2 of 2022 – Mining Technology

Mentions of big data within the filings of companies in the mining industry rose 30% between the first and second quarters of 2022.

In total, the frequency of sentences related to big data between July 2021 and June 2022 was 279% higher than in 2016, when GlobalData, from whom our data for this article is taken, first began to track the key issues referred to in company filings.

When companies in the mining industry publish annual and quarterly reports, ESG reports and other filings, GlobalData analyses the text and identifies individual sentences that relate to disruptive forces facing companies in the coming years. Big data is one of these topics - companies that excel and invest in these areas are thought to be better prepared for the future business landscape and better equipped to survive unforeseen challenges.

To assess whether big data is featuring more in the summaries and strategies of companies in the mining industry, two measures were calculated. Firstly, we looked at the percentage of companies which have mentioned big data at least once in filings during the past twelve months - this was 57% compared to 24% in 2016. Secondly, we calculated the percentage of total analysed sentences that referred to big data.
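The two measures described above can be sketched in a few lines of Python. The company names and counts below are hypothetical toy data, not GlobalData's actual figures; the point is only to show how each percentage is computed.

```python
# Hypothetical toy data: for each company, the total number of analysed
# sentences in its filings and how many of those mention big data.
filings = {
    "CompanyA": {"sentences": 2780, "big_data": 10},
    "CompanyB": {"sentences": 1500, "big_data": 0},
    "CompanyC": {"sentences": 900,  "big_data": 4},
}

# Measure 1: percentage of companies with at least one big-data mention.
mentioning = sum(1 for f in filings.values() if f["big_data"] > 0)
pct_companies = 100 * mentioning / len(filings)

# Measure 2: big-data sentences as a percentage of all analysed sentences.
total_sentences = sum(f["sentences"] for f in filings.values())
big_data_sentences = sum(f["big_data"] for f in filings.values())
pct_sentences = 100 * big_data_sentences / total_sentences

print(f"{pct_companies:.0f}% of companies, {pct_sentences:.2f}% of sentences")
# -> 67% of companies, 0.27% of sentences
```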

Of the 10 biggest employers in the mining industry, Caterpillar was the company which referred to big data the most between July 2021 and June 2022. GlobalData identified 25 big data-related sentences in the United States-based company's filings - 0.4% of all sentences. Sibanye-Stillwater mentioned big data the second most - the issue was referred to in 0.18% of sentences in the company's filings. Other top employers with high big data mentions included Honeywell, ThyssenKrupp and CIL.

Across all companies in the mining industry the filing published in the second quarter of 2022 which exhibited the greatest focus on big data came from Erdemir. Of the document's 2,780 sentences, 10 (0.4%) referred to big data.

This analysis provides an approximate indication of which companies are focusing on big data and how important the issue is considered within the mining industry, but it also has limitations and should be interpreted carefully. For example, a company mentioning big data more regularly is not necessarily proof that they are utilising new techniques or prioritising the issue, nor does it indicate whether the company's ventures into big data have been successes or failures.

GlobalData also categorises big data mentions by a series of subthemes. Of these subthemes, the most commonly referred to topic in the second quarter of 2022 was 'data analytics', which made up 72% of all big data subtheme mentions by companies in the mining industry.


Pecan AI Leaps Over the Skills Gap to Enable Data Science On Demand – Datanami

As the big data analytics train keeps rolling on, there are still kinks to work out when implementing it in the business world. Building and maintaining a big data infrastructure capable of quickly turning large data sets into actionable insights requires data science expertise, a skillset in high demand but often in short supply. There is also a skills gap between data scientists, analysts, and business users, and while several low- or no-code platforms have aimed to resolve this, complexity remains for certain use cases.

One company looking to bridge the gap between business analytics and data science is Pecan AI. The company says its no-code predictive analytics platform is designed for business users across sales, marketing, and operations, as well as the data analytics teams that support them.

Pecan was built under the assumption that the demand for data science far exceeds the supply of data scientists. "We said from the get-go, we wanted to help non-data scientists, specifically BI analysts, to basically leap through the gap of data science knowledge with our platform," Pecan AI CEO Zohar Bronfman told Datanami in an interview.

The Pecan AI platform allows users to connect their various data sources through its no-code integration capabilities. A drag-and-drop, SQL-based user interface enables users to create machine learning-ready data sets. Pecan's proprietary AI algorithms can then build, optimize, and train predictive models using deep neural networks and other ML tools, depending on the needs of the specific use case. With less statistical knowledge required, along with automated data preparation and feature selection, the platform removes some of the technical barriers that BI analysts may face when leveraging data science.

"Interestingly enough, in most of the data science use cases, you would spend, as a data scientist, more time and effort on getting the data right, extracting it, cleansing it, collating it, structuring it, and many other things that basically define data science use cases. And that's what we've been able to automate, so that analysts who have never done this before will be able to do so," said Bronfman.

Additionally, the platform offers monitoring features to continually analyze data for more accurate predictions, prioritize features as their importance changes over time, and monitor model performance via a live dashboard.

"In data science, the changes that happen around us are very, very impactful and meaningful, and also potentially dangerous," said Bronfman, referencing how patterns of customer behavior can change as a reaction to factors such as inflation and supply chain disruptions, rendering current models obsolete. According to Bronfman, to continue delivering accurate predictions, the platform automatically looks for changes in patterns within data, and once it identifies a change, the models are retrained and updated by feeding new data into the algorithms to accommodate the more recent patterns.

An example Pecan AI dashboard showing a predicted churn rate. Source: Pecan AI

Bronfman and co-founder and CTO Noam Brezis started Pecan AI in 2016. The two met in graduate school while working toward PhDs in computational neuroscience, and their studies led them to research recent advancements in AI, including its capacity for automating data mining and statistical processes. Brezis became a data analyst with a focus on business analytics, and he was surprised to find that data science know-how was often relegated to highly specialized teams, isolated from the business analysts who could benefit the most from data science's predictive potential. Bronfman and Brezis saw an opportunity to build a SQL-oriented platform that could leverage the power of data science for a BI audience while eliminating much of the manual data science work.

Pecan AI serves a variety of use cases including sales analytics, conversion, and demand forecasting. Bronfman is especially enthusiastic about Pecan's predictive analytics capabilities for customer behavior, an area in which he sees three main pillars. The first pillar is acquisition, a stage when companies may be asking how to acquire and engage with new customers. "For the acquisition side of things, predicted lifetime value has been one of the key success stories for us," Bronfman said of Pecan's predictive lifetime value models. "Those models eventually give you a very good estimation, way before things actually happen, of how well your campaigns are going to do from the marketing side. Once you have a predicted lifetime value model in place, you can wait just a couple of days with the campaign and say, 'Oh, the ally is going to disinvest in a month or three months' time, so I should double down my spend on this campaign,' or, in other cases, 'I should refrain from investing more.'"

The second customer behavior pillar is monetization, a time when companies may be asking how they can offer the customer a better experience to encourage their continued engagement. "If you have the opportunity to offer an additional product, service, [or] brand, whatever that might be, you need to optimize both for what you are offering, and not less importantly, when you are offering [it]. So again, our predictions are able to tell you at the customer level, who should be offered what and when," said Bronfman.

Finally, the third pillar is retention, an area where Bronfman notes it is far more economically efficient to retain customers than to acquire new ones. "For the retention side of things, the classic use case, which has been extremely valuable and gotten us excited, is churn prediction. Churn is a very interesting data science domain because predicting churn has been notoriously challenging, and it's a classic case where if you're not doing it right, you might, unfortunately, get to a place where you are accurate with your predictions but you are ineffective."

Pecan AI co-founders: CEO Zohar Bronfman and CTO Noam Brezis.

When predicting churn, Bronfman says that time is of the essence: "When a customer has already made a final decision to churn, even if you're able to predict it before they've communicated it, you won't be able in most cases, to change their mind. But if you're able to predict churn way in advance, which is what we specialize in, then you still have this narrow time window of opportunity to preemptively engage with the customer to give them a better experience, a better price, a better retargeting effort, whatever that might be, and increase your retention rates."
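Pecan's actual models are proprietary, but the general shape of early churn prediction can be illustrated with a toy logistic model. Everything below is invented for illustration: the two features (days since the customer's last session, support tickets filed) and the synthetic data are assumptions, not Pecan's method.

```python
import math
import random

random.seed(0)

def make_customer(churned):
    # Synthetic data: churned customers tend to have been inactive
    # longer and to have filed more support tickets.
    days = random.gauss(30 if churned else 5, 5)
    tickets = random.gauss(4 if churned else 1, 1)
    return [1.0, days, tickets], 1 if churned else 0  # leading 1.0 = bias term

data = [make_customer(i % 2 == 0) for i in range(200)]

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1 / (1 + math.exp(-z))

# Train a logistic model by plain stochastic gradient descent on log-loss.
w = [0.0, 0.0, 0.0]
for _ in range(1000):
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for i in range(3):
            w[i] -= 0.01 * (p - y) * x[i]

# Score a customer well before they announce any decision: a high
# probability flags them for a preemptive retention offer.
risk = sigmoid(sum(wi * xi for wi, xi in zip(w, [1.0, 28.0, 4.0])))
print(f"churn risk: {risk:.2f}")
```

The early-warning window in the quote corresponds to scoring on behavioral signals (inactivity, tickets) rather than waiting for an explicit cancellation.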

Investors and customers alike seem keen on what Pecan has to offer, and the company is seeing significant growth. So far, the company has raised a total of $116 million, including its latest Series C funding round of $66 million occurring in February, led by Insight Partners, with participation from GV and existing investors S-Capital, GGV Capital, Dell Technologies Capital, Mindset Ventures, and Vintage Investment Partners.

Pecan recently announced it has more than doubled its revenue in the first half of this year, with its annual recurring revenue increasing by 150%. Its customer count increased by 121%, with mobile gaming companies Genesis and Beach Bum and wellness brand Hydrant joining its roster which already includes Johnson & Johnson and CAA Club Group. The company also expanded its number of employees to 125 for a 60% increase.

Bronfman says Pecan's growth stems from a strong tailwind of two factors: "Analysts are loving the fact that they can evolve, upskill, and start being data scientists on demand. But also, we came to realize that business stakeholders love that they can drive quick and effective data science without necessarily requiring data science resources."

Related Items:

Pecan AI Announces One-Click Model Deployment and Integration with Common CRMs

Foundry Data & Analytics Study Reveals Investment, Challenges in Business Data Initiatives

Narrowing the AI-BI Gap with Exploratory Analysis


Data Scientist Training: Resources and Tips for What to Learn – Dice Insights

Data science is a complex field that requires its practitioners to think strategically. On a day-to-day basis, it requires aspects of database administration and data analysis, along with expertise in statistical modeling (and even machine learning algorithms). It also needs, as you might expect, a whole lot of training before you can plunge into a career as a data scientist.

There are a variety of training options out there for data scientists at all points in their careers, from those just starting out to those looking to master the most cutting-edge tools. Here are some platforms and training tips for all data scientists.

Kevin Young, senior data and analytics consultant at SPR, says that many data scientists treat Kaggle as a go-to learning resource. Kaggle is a Google-owned machine learning competition platform with a series of friendly courses to get beginners started on their data science journey.

Topics covered range from Python to deep learning and more. "Once a beginner gains a base knowledge of data science, they can jump into machine learning competitions in a collaborative community in which people are willing to share their work with the community," Young says.

In addition to Kaggle, there are lots of other online resources that data scientists (or aspiring data scientists) can use to boost their knowledge of the field. Here are some free resources:

And here are some that will cost (although you'll earn a certification or similar proof of completion at the end):

This is just a portion of what's out there, of course. Fortunately, the online education ecosystem for data science is large enough to accommodate all kinds of learning styles.

Seth Robinson, vice president of industry research at CompTIA, explains that individuals near the beginning of a data science career will need to build familiarity with data structures, database administration, and data analysis.

"Database administration is the most established job role within the field of data, and there are many resources teaching the basics of data management, the use of SQL for manipulating databases, and the techniques of ensuring data quality. Beyond traditional database administration, an individual could learn about newer techniques involving non-relational databases and unstructured data," he adds.

"Training for data analysis is newer, but resources such as CompTIA's Data+ certification can add skills in data mining, visualization, and data governance. From there, specific training around data science is even more rare, but resources exist for teaching or certifying advanced skills in statistical modeling or strategic data architecture," Robinson says.

Young cites two main segments of data science training: model creation and model implementation.

Model creation training is the more academic application of statistical models to an engineered dataset to create a predictive model; this is the training that most intro to data science courses cover.

"This training provides the bedrock foundations for creating models that will provide predictive results," he says. "Model creation training is usually taught in Python, and covers the engineering of the dataset, creation of a model, and evaluation of that model."
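The three steps Young names (engineer the dataset, create the model, evaluate it) can be sketched in a few lines of pure Python. This is an illustrative toy, not a course curriculum: ordinary least squares stands in for "model creation," and RMSE stands in for "evaluation."

```python
# A minimal sketch of the model-creation workflow: engineer a dataset,
# fit a simple model, evaluate it. Pure-Python least squares; in
# practice a course would use scikit-learn or statsmodels.

def fit_line(xs, ys):
    """Fit y = a*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def rmse(xs, ys, a, b):
    """Root-mean-square error of the fitted line on the dataset."""
    n = len(xs)
    return (sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / n) ** 0.5

# "Engineered" toy dataset: a nearly linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2), round(rmse(xs, ys, a, b), 3))
```

The same three-step shape (data, fit, score) carries over to far more complex models.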

Model implementation training covers the step after the model is created: getting the model into production. This training is often vendor- or cloud-specific, focused on getting the model to make predictions on live incoming data. "This type of training would be through cloud providers such as AWS giving in-person or virtual education on their machine learning services such as SageMaker," Young explains.

"These cloud services provide the ability to take machine learning models produced on data scientists' laptops and persist the model in the cloud, allowing for continual analysis. This type of training is vital, as the time and human capital required are usually much greater in the model implementation phase than in the model creation phase," Young says.

This is because when models are created, they often use a smaller, cleaned dataset from which a single data scientist can build a model. When that model is put into production, engineering teams, DevOps engineers, and/or cloud engineers are often needed to create the underlying compute resources and automation around the solution.

"The more training the data scientist has in these areas, the more likely the project will be successful," he says.

Young says one of the lessons learned during the pandemic is that professionals in technology roles can be productive remotely. "This blurs the lines a bit on the difference between boot camps and online courses, as many boot camps have moved to a remote model," he says. "This puts an emphasis on having the ability to ask questions of a subject matter expert, irrespective of whether you are in a boot camp or online course."

He adds that certifications can improve an organization's standing with software and cloud vendors. "This means that candidates for hire move to the top of the resume stack if they have certifications that the business values," Young says.

For aspiring data scientists deciding between boot camps and online courses, he says the most important aspect to compare is probably the career resources offered. "A strong boot camp should have a resource dedicated to helping graduates find employment after the boot camp," he says.

Robinson adds that it's important to note that data science is a relatively advanced field.

"All technology jobs are not created equal," he explains. "Someone considering a data science career should recognize that the learning journey is likely to be more involved than it would be for a role such as network administration or software development."

Young agrees, adding that data scientists need to work in a collaborative environment, with other data scientists and subject matter experts reviewing their work. "Data science is a fast-developing field," he says. "Although fundamental techniques do not change, how those techniques are implemented does change as new libraries are written and integrated with the underlying software on which models are built."

From his perspective, a good data scientist is always learning, and any strongly positioned company should offer reimbursement for credible training resources.

Robinson notes that in-house resources vary from employer to employer, but points to a macro trend of organizations recognizing that workforce training needs to be a higher priority. "With so many organizations competing for so few resources, companies are finding that direct training or indirect assistance for skill building can be a more reliable option for developing the exact skills needed, while improving the employee experience in a tight labor market," he says.


Excerpt from:

Data Scientist Training: Resources and Tips for What to Learn - Dice Insights


Asia Pacific will lead the new wave of transformation in data innovation: Nium’s CTO Ramana Satyavarapu – ETCIO South East Asia

Ramana Satyavarapu, Chief Technology Officer, Nium

In a market such as Asia Pacific, the sheer volume of data and the various emerging types of data create innumerable complexities for businesses that still need to build their data strategies from the ground up. Even organisations that have understood the importance of data have yet to instil strong data management practices. According to research revealed by Accenture, while only 3 of the 10 most valuable enterprises were actively taking a data-driven approach in 2008, that number has risen to 7 out of 10 today. All of it points to the fact that designing data-driven business processes is the only effective way to achieve fast-paced results and goals for organisations across sectors.

To further decode the nuances of the data landscape, with a special focus on the Asia Pacific region, we conducted an exclusive interaction with Ramana Satyavarapu, the Chief Technology Officer of Nium. Ramana is an engineering leader with a strong track record of delivering great products, organising and staffing geographically and culturally diverse teams, and mentoring and developing people. Throughout his career, he has delivered highly successful software products and infrastructure at big tech companies such as Uber, Google and Microsoft. With a proven track record of result-oriented execution by bringing synergy within engineering teams to achieve common goals, Ramana has a strong passion for quality and strives to create customer delight through technological innovation.

In this feature, he shares his outlook on the most relevant data management practices, effective data functionalities, building headstrong data protection systems, and leveraging optimal data insights for furthering business value propositions.

Ramana, what according to you are the most effective functions of data in the evolution of tech and innovation in Asia Pacific?

Data is becoming ubiquitous, especially in Asia Pacific, because of the sheer number of people going digital. The amount of data available is huge. I will streamline its functions into three main areas:

First, understand the use case. Second, build just enough systems for storing, harnessing, and mining this data. For this, don't build everything in-house. There's a lot of infrastructure out there. Data engineering has now turned into Lego building; you don't have to build the Legos from the ground up. Just build the design structure using existing pieces such as S3, Redshift, and Google Storage. You can leverage all of these things to harness data. Third, make sure the data is always encrypted and secure, and that there are absolutely robust, rock-solid, and time-tested protections around the data. That has to be taken very seriously. Those would be my three main principles while dealing with data.

How would you describe the importance of data discovery and intelligence to address data privacy and data security challenges?

When you have a lot of data, reiterating my point about big datasets and their big responsibility, the number of security challenges and surface-area attacks will be significantly higher. In order to address data privacy and security challenges, more than data discovery and intelligence, one has to think in terms of two aspects. First, where we are storing the data is a vault, and we need to make sure the pin of the vault is super secure; it's a systems engineering problem more than a data problem. Second, you need to understand what kind of data this is. No single vault is rock solid. Instead, how do we make sure that an intelligent piece of data is secure? Store it in different vaults so that individually, even if one is hacked or exposed, it doesn't hurt entirely. The aggregation of the data will be protected. Therefore, it must be a twofold strategy. Understand the data and mine it intelligently, so that you can save it not just in a single vault, but in ten different vaults. In layman's terms, you don't put all your cash in a single bank or system. That way the loss is mitigated, and no one can aggregate and get ahold of all the data at once. Also, make sure that we have solid security engineering practices to ensure the systems are protected from all kinds of hacks and security vulnerabilities.
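The "many vaults" idea, where no single store reveals the aggregate, can be illustrated with a simple splitting scheme. This XOR secret-sharing sketch is an illustrative stand-in, not a description of Nium's actual architecture; the variable names are hypothetical.

```python
# Illustrative sketch: split a sensitive value into random shares so
# that no single vault reveals it; only the combination of all shares
# reconstructs the original. (XOR sharing; all shares are required.)

import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_secret(data: bytes, n_shares: int) -> list[bytes]:
    """Split data into n shares; every share is needed to reconstruct."""
    shares = [secrets.token_bytes(len(data)) for _ in range(n_shares - 1)]
    shares.append(reduce(xor_bytes, shares, data))  # last share closes the XOR
    return shares

def combine_shares(shares: list[bytes]) -> bytes:
    """XOR all shares back together to recover the original value."""
    return reduce(xor_bytes, shares)

card_number = b"4111111111111111"
vaults = split_secret(card_number, 3)  # store each share in a different vault
assert combine_shares(vaults) == card_number
assert all(v != card_number for v in vaults)  # no single vault leaks the value
```

A breach of any one vault yields only random-looking bytes; the attacker would need all of them at once, which is exactly the aggregation risk the interview describes mitigating.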

The interpretative value of data provides immense scope for evaluating business processes. What role does data analytics play in the evolution of business success?

There is a natural point where the functional value proposition that can be added or given to the customer will start diminishing. There will be a natural point where data will be the differentiator. I'll give a pragmatic example which everybody knows: the difference between Google search and Microsoft Bing search, both of which use comparably similar kinds of algorithms. But the results are significantly different! That's because one adopts fantastic data engineering practices. It's all about the insights and the difference that they can provide. At some point, the value addition from the algorithm diminishes, and the quality of the insights that you can draw from the data will be the differentiator.

There are twofold advantages to data insights and analytics. One is providing value to the customer beyond functionality. In the context of, say, Nium, or payments, or anyone who's doing global money movement, suppose we've identified an xyz company doing a massive money movement on the first day of every month, say to the Philippines or Indonesia. Instead of doing it on the first day of every month, why don't you do it on the last day of the previous month? That has historically proven to be a better interchange or FX rate. At the end of the day, it's all about supply and demand. Doing it one day earlier can save a huge amount on FX conversion, which benefits the business by a quantifiable amount; that is very powerful. Those kinds of insights can be provided to the customers by Nium. Being a customer-centric company, it's our strong value proposition: we grow when the customer grows. The second is the business intelligence that can be drawn from the data; offering a new value proposition to the customer and improving their processes is important.

For example, if we see that on average a customer's transactions are taking x days or minutes, or that a customer's acceptance rate is low, then we can improve the value, the reliability, and the availability of the system using analytics. We had a massive customer in the past, none other than McDonald's. We were looking at the data and we observed a very specific pattern in the transaction decline rate. Viewed independently, you'll notice that only a few transactions are being declined. But if you look at it on a global scale, that's a significant amount of money and customer loss. When we analysed it further, we identified that this was happening with a very specific type of point-of-sale device on the east coast at peak hour. We sent a detailed report to McDonald's saying we had identified this kind of pattern. McDonald's then contacted the point-of-sale device manufacturer and said that at this peak, for these kinds of transactions, your devices are failing. That would have saved them hundreds of thousands of dollars.

Saachi, the whole idea is having a clear strategy for how we are going to use the data, and we need to demystify this whole data problem space. There are data lakes, warehouses, machine learning, data mining, all of which are super complex terms. At the end of the day, break it down, and it's really not that complex if you keep it simple.

In a world continually dealing with new-age data, mention some best data management practices for tech leaders.

Again, there's no one set of practices that will solve all your data problems. Then you'd have to call me the data guru or something! To keep it simple, for the three main aspects that I talked about (collection, aggregation, and insights), there are specific management practices for each.

First, when it comes to data collection, focus on how to deal with heterogeneity. Data is inherently heterogeneous. From CSV files to text files to satellite images, there's no standard. Find a good orchestration layer and good, reliable retry logic, with enough availability of ETLs, to make sure this heterogeneous data is consistently and reliably collected. That's number one. I'm a big believer that what cannot be measured is what does not get done. Measure, measure, measure. In this case, have some validators, have some quality checks on consistency, reliability, freshness, timeliness, all the different parameters of whether the data is coming to us in an accurate way. That's the first step.
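The validators and quality checks described here can be as simple as a function that inspects each incoming batch before it enters the pipeline. This is a hedged sketch; the field names (`value`, `ts`) and the 24-hour freshness threshold are illustrative assumptions, not anything prescribed in the interview.

```python
# A minimal sketch of batch-level data-quality checks: completeness,
# type consistency, and freshness. Returns a list of issues; an empty
# list means the batch passed.

from datetime import datetime, timedelta, timezone

def validate_batch(records, max_age_hours=24):
    """Return human-readable data-quality issues for a batch of records."""
    issues = []
    now = datetime.now(timezone.utc)
    for i, rec in enumerate(records):
        if rec.get("value") is None:
            issues.append(f"row {i}: missing value")
        elif not isinstance(rec["value"], (int, float)):
            issues.append(f"row {i}: non-numeric value {rec['value']!r}")
        ts = rec.get("ts")
        if ts is None:
            issues.append(f"row {i}: missing timestamp")
        elif now - ts > timedelta(hours=max_age_hours):
            issues.append(f"row {i}: stale record ({ts.isoformat()})")
    return issues

fresh = datetime.now(timezone.utc)
batch = [
    {"value": 3.2, "ts": fresh},
    {"value": "n/a", "ts": fresh},                    # type inconsistency
    {"value": 1.0, "ts": fresh - timedelta(days=3)},  # stale record
]
print(validate_batch(batch))  # flags the second and third rows
```

In a real pipeline these checks would run inside the orchestration layer, with failures routed to alerting rather than printed.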

Second is standardisation. Whether it's web-crawled data, Twitter information, traffic wave information, or even satellite images. There was a dataset where we were measuring the number of sheep eating grass in New Zealand, so we were using image processing techniques to count the sheep. And why is that useful? Using that, you can predict the supply of merino wool sweaters in the world. If the sheep are reduced, there is less wool, and therefore the jacket will be costly. How do we store such data, though? Start with a time series and a standard identification. Every dataset, every data row, and every data cell has to be idempotent. Make sure that every piece of data, and the transformations of it, are traceable. Just have a time series with a unique identifier for each data value so that it can be consistently accessed. That's the second.
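The idempotency and unique-identifier idea can be sketched as deterministic keying: derive each data cell's identifier from its source, timestamp, and field, so that replaying the same load overwrites rather than duplicates. The key scheme and names below are illustrative assumptions, not a described Nium design.

```python
# A sketch of idempotent, traceable storage: every data cell gets a
# deterministic key derived from (source, timestamp, field), so the
# same input always maps to the same row.

import hashlib

def cell_key(source: str, ts_iso: str, field: str) -> str:
    """Deterministic identifier: identical inputs yield identical keys."""
    raw = f"{source}|{ts_iso}|{field}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]

store = {}

def upsert(source, ts_iso, field, value):
    """Idempotent write: reloading the same cell overwrites, never duplicates."""
    store[cell_key(source, ts_iso, field)] = value

upsert("satellite", "2022-07-01T00:00:00Z", "sheep_count", 1420)
upsert("satellite", "2022-07-01T00:00:00Z", "sheep_count", 1420)  # replayed load
print(len(store))  # one entry: the replay overwrote, it did not duplicate
```

Because the key is derived rather than generated, any transformation downstream can cite it, which is what makes the lineage traceable.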

Third, start small. Everyone presents people with machine learning or advanced data mining. Those are complex. Start with linear regressions and start identifying outliers. Start doing pattern matching. These are not rocket science to implement; start with them. Machine learning, in my opinion, is like a ten-pound hammer. It's very powerful, but you want to have the right surface area and the right nail to hit. If you use a ten-pound hammer on a pushpin, the wall's going to break. You need the right surface area or problem space to apply it. Even with ML, start with something like supervised learning, then move on to semi-supervised learning, then unsupervised learning, and then go to clustering, in a very phased manner.
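The "start small" advice can be made concrete: a z-score rule flags outliers in a few lines of pure Python, no ML library required. The transaction-volume numbers below are made up for illustration.

```python
# A minimal sketch of simple outlier detection: flag values more than
# `threshold` standard deviations from the mean. No libraries needed.

def zscore_outliers(values, threshold=2.0):
    """Return the values that lie more than threshold*std from the mean."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [v for v in values if abs(v - mean) > threshold * std]

daily_volumes = [102, 98, 101, 97, 103, 99, 250, 100]  # one anomalous day
print(zscore_outliers(daily_volumes))  # → [250]
```

Only once a rule this simple stops being good enough does it make sense to reach for the heavier hammer of supervised learning.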

That would be my approach: divide it into collection, with good validators and quality checks to ensure reliability; standardisation, in the form of a time series; and then pattern recognition or other simple techniques, from which you can progress gradually to how you want to mine the data and provide the insights.

To summarise, keep the data problem simple. Make sure you have a clear understanding of it: what is the use case we are aiming to solve, before we attempt to build a huge data lake or data infrastructure? Being pragmatic about the usage of data is very important. Again, data is super powerful. With lots of data come lots of responsibilities; take it very seriously. Customers and users are entrusting us with their personal data, and that comes with a lot of responsibility. I urge every leader, engineer, and technologist out there to take it very seriously. Thank you!

Continued here:

Asia Pacific will lead the new wave of transformation in data innovation: Nium's CTO Ramana Satyavarapu - ETCIO South East Asia
