Category Archives: Data Science

Stats and data have ‘relevance everywhere’, says this Limerick … – SiliconRepublic.com

Prof Norma Bargary leads UL's Professional Diploma in Data Analytics. Here, she talks about her personal research interests and her career so far.

Prof Norma Bargary's love of maths and statistics was ignited when she was in her first year of a Mathematical Sciences degree at the University of Limerick (UL).

She recalls doing a module in statistics taught by Prof Ailish Hannigan. "I immediately loved the subject because I could see its relevance everywhere, and decided to specialise in statistics for the final two years of my degree programme. I have worked as a statistician since then," she tells SiliconRepublic.com.

These days, Bargary is a professor herself at UL's Department of Mathematics and Statistics, where she holds the chair in data science and statistical learning.

She was among the first recipients of the prestigious Senior Academic Leadership Initiative (SALI), an initiative that promotes gender balance at senior academic levels in higher education institutions.

Her research interests range from sports data to how media professionals get to grips with data and communicate it effectively (or not, as the case may be).

She said she and her team have developed a series of studies to understand how comfortable journalists are with numbers. They designed interventions to build journalists' numeracy skills, with Bargary saying that the ultimate aim of the studies is to improve how numbers are communicated to the public.

Where sports are concerned, she says she is very interested in data that are measured using sensors. "For example, I work a lot with motion capture data, which measures people's movement patterns when doing tasks like running, jumping, kicking and rowing."

"The data that are produced by these systems can be thought of as curves or functions; my research develops new ways to model such data using an area of statistics called functional data analysis."
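For readers curious what treating measurements as curves can look like in practice, here is a minimal, illustrative Python sketch: it smooths one noisy synthetic "joint angle" trace into a function using SciPy's spline tools. The signal and the smoothing parameter are invented for illustration and are not taken from Bargary's research.

```python
# Minimal functional-data-style sketch: treat a noisy motion-capture trace
# as a smooth curve, then work with the curve (and its derivative) directly.
# The signal below is synthetic; the smoothing factor s is an arbitrary choice.
import numpy as np
from scipy.interpolate import UnivariateSpline

t = np.linspace(0.0, 1.0, 200)  # normalized time within one movement cycle
angle = np.sin(2 * np.pi * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)

curve = UnivariateSpline(t, angle, s=0.5)   # fit one smooth function to the trace
velocity = curve.derivative()               # derivatives come for free on curves

print(curve(0.25), velocity(0.25))          # evaluate the function, not raw samples
```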

As well as her own research, Bargary is at the forefront when it comes to the development of UL's data science education programmes.

She leads the UL@Work Professional Diploma in Data Analytics, a programme that gives learners who are already working full time the skills they need to break into the industry.

The programme is designed in consultation with stakeholders already working in the sector. "Data analytics is a great area for career beginners and pivoters alike," says Bargary, highlighting the shortage of people with skills in that area.

What kind of skills do people need to have a career in data analytics? She lists critical thinking skills, such as how to formulate good research or business questions and identify the data needed to answer those questions.

In terms of technical skills, she says very good statistical skills and the ability to code in languages like R or Python are a must.

"Data is now a highly valuable resource and companies are increasingly striving to become data-driven. In order to do that, and use data to its fullest, strong data analytics expertise is essential."

Bargary and the team behind the Professional Diploma in Data Analytics worked closely with industry partners to ensure the programme teaches learners these skillsets.

Those undertaking the programme learn how to work with data throughout the data analytics pipeline, from data collection to data cleaning, wrangling, visualisation, modern statistical and predictive modelling techniques, and the communication of results back to key stakeholders via interactive reports and dashboards.
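As a rough illustration of that pipeline in code, the sketch below walks from loading through cleaning, wrangling, modelling and a one-line "communication" step in Python. The file name and column names are hypothetical placeholders, not course material.

```python
# Compressed, illustrative data-analytics pipeline: collect -> clean ->
# wrangle -> model -> communicate. "sales.csv" and its columns are made up.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("sales.csv")                    # collection: load raw data
df = df.dropna(subset=["ad_spend", "revenue"])   # cleaning: drop incomplete rows
df["ad_spend_k"] = df["ad_spend"] / 1000         # wrangling: derive a feature

model = LinearRegression()                       # modelling: a simple predictive model
model.fit(df[["ad_spend_k"]], df["revenue"])

# communication: the kind of headline number a dashboard might surface
print(f"Each extra 1k of ad spend is worth about {model.coef_[0]:.0f} in revenue")
```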

And data analytics and stats have wider societal implications, too. "Now more than ever, we are faced with enormous societal challenges such as climate change, sustainability, housing and food shortages. Data and modelling have really important roles to play in helping us to untangle these issues and ultimately trying to address them."

Any professional who works with data (such as journalists, as Bargary mentioned) needs to know that it is not to be messed around with, however. Quality rather than quantity is what should be aimed for.

Miscommunications and errors can occur when someone has failed to grasp the data they are working with.

"The biggest misconception is that measuring lots of data means it must contain useful information," warns Bargary. This is a mistake she wants to correct.

"Lots of data does not equal lots of useful data. If the data you collect are poor quality then there is nothing that data science can do."


Go here to see the original:

Stats and data have 'relevance everywhere', says this Limerick ... - SiliconRepublic.com

Active Travel England Partners With Alan Turing Institute To Leverage Data Into Investment – Forbes

Manchester's Gay Village. (Photo by Christopher Furlong/Getty Images)

Active Travel England has commissioned the Alan Turing Institute to create software and data science techniques to support local authority delivery of walking and cycling schemes. The institute, based at the British Library in London, is the UK's national data science and artificial intelligence institute.

The collaboration will run for two years at a total cost of $250,000 and enable the development of new functionality in the Active Travel Infrastructure Platform (ATIP), which helps councils to map out proposed schemes and see the impact they could have locally.

These new tools will be paired with existing data sources, such as OpenStreetMap, to create solutions that will help build the evidence needed to meet the national government's stated objectives on active travel, including for 50% of short trips in urban areas to be made by walking, wheeling and cycling by 2030.

The new software engineering and data science techniques will complement data collection and analysis work done by Active Travel England's head of data, Dr. Robin Lovelace.

"The partnership with Alan Turing Institute is hugely important," Lovelace told Forbes.com.

"Transport models and datasets represent leverage points in the transport planning system," he stressed.

"The lack of data and robust analysis of active modes has led to them not being taken seriously. New datasets can ensure that investment goes where it's most needed."

Active Travel Minister Jesse Norman said the partnership will enable local councils to draw on the latest technology and maximize active travel's environmental, economic and health benefits.

Meanwhile, the government he represents last month revealed swingeing cuts to England's active travel budget.

According to the Walking and Cycling Alliance (WCA), a body made up of cycling and walking organizations including British Cycling, Living Streets, Ramblers and Sustrans, the cuts amount to a two-thirds reduction in promised capital investment in walking and cycling.

"It is heartbreaking to see vital active travel budgets wiped away in England, at the exact time when they are most essential to U.K. economic, social and environmental prospects," said a WCA statement.

"It is incredibly disappointing that the active travel budget has seen such extensive cuts at a time when we need to really make progress on decarbonisation and when people need cheap transport choices," said a joint statement by Conservative MP Selaine Saxby and Labour MP Ruth Cadbury, co-chairs of the All Party Parliamentary Group on Cycling and Walking.

They added: "We understand that there are pressures on the public purse, but active travel schemes frequently have much higher benefit:cost ratios than road-building schemes, many of which are still going ahead despite falling value for money for taxpayers."

I was Press Gazette's Transport Journalist of the Year, 2018. I'm also a historian; my most recent books include "Roads Were Not Built for Cars" and "Bike Boom", both published by Island Press, Washington, D.C.

See the original post here:

Active Travel England Partners With Alan Turing Institute To Leverage Data Into Investment - Forbes

Data Science Certification Programs and Their Benefits – IndiaCSR

Data science is a rapidly growing field that involves analyzing, interpreting, and making predictions based on large sets of data. As the field continues to expand, certification programs have become increasingly popular as a way for individuals to validate their skills and knowledge. In this article, we will look at the various types of data science programs available, the benefits of obtaining a certification, and the factors to consider when selecting a program.

Data science programs come in different types and are offered by various providers. The most common types of certification programs are vendor-specific, industry-specific, and vendor-neutral.

Vendor-specific certification programs are offered by technology companies, such as Microsoft and Amazon, and are designed to validate an individual's proficiency in their specific technologies. Industry-specific certification programs, on the other hand, are tailored to a particular industry, such as healthcare or finance. Vendor-neutral certification programs, such as those offered by the Data Science Association and the Institute of Electrical and Electronics Engineers (IEEE), are not tied to a specific technology or industry. Ed-tech platforms like Great Learning also offer well-curated programs in association with world-class universities.

The requirements for certification programs vary depending on the provider and the type of program. Some programs require passing an exam, while others require completing a course or submitting a project.

There are several benefits to obtaining a data science certification. Here are some of the most significant benefits:

Certification programs provide recognition and credibility to individuals, validating their skills and knowledge in the field. This recognition can be particularly valuable for job seekers, as it demonstrates to potential employers that they have the necessary qualifications for the job.

Many certification programs offer additional resources, such as online forums, workshops, and study materials, which can help you stay up-to-date with the latest trends and best practices in the field.

A data science certification provides tangible evidence of your knowledge and expertise in the field. Employers and colleagues can see that you have invested time and effort into gaining a deep understanding of the subject matter.

Data science certifications can also open up new career opportunities and advancement paths. Individuals with certifications are often preferred by employers, which can lead to promotions, salary increases, and other benefits.

Certification programs can also help individuals enhance their skills and acquire new ones. The coursework and exams required for certification can help individuals develop new skills and deepen their understanding of the field.

Certification programs can also provide networking opportunities. Individuals in certification programs can connect with other professionals in the field, allowing them to expand their network and potentially find new job opportunities.

Learning data science can lead to higher salary potential. According to a study, certified professionals in the IT field can earn up to 12% more than their non-certified peers.

When selecting a data science certification program, it's essential to consider several factors, including the provider, the program requirements, and the cost.

When choosing a certification program, it is important to consider all of these factors in order to select the best program for your needs. Research the provider, determine the requirements, and consider the cost to ensure you are making the right decision for your future.

In conclusion, pursuing a postgraduate qualification in data science can provide significant benefits to individuals looking to enhance their skills. Such programs provide the opportunity to develop data science expertise and apply it to a specific industry. Furthermore, they offer the chance to learn the latest technologies and techniques, as well as become a certified data scientist. Certificate programs are a great way to gain recognition and credibility in the workplace, as well as to further your career. Therefore, if you are looking to break into the data science field or simply want to expand your existing knowledge, looking into data science programs could be the perfect choice for you.

Read more here:

Data Science Certification Programs and Their Benefits - IndiaCSR

What you need to know to accelerate your cloud and data strategy – CIO

At Choice Hotels, cloud is a tool to help the hospitality giant achieve corporate goals. That can include making progress on immediate objectives, such as environmental sustainability, while keeping an eye on trendy topics such as the metaverse and ChatGPT.

"We're investing in technology, we're investing in leveraging the cloud to do meaningful things while we figure out what does tomorrow look like," said CIO Brian Kirkland.

Kirkland will describe key points on how cloud is enabling business value, including its sustainability initiatives, at CIO's Future of Cloud & Data Summit, taking place virtually on April 12.

The day-long conference will drill into key areas of balancing data security and innovation, emerging technologies, and leading major initiatives.

The program kicks off with a big-picture view of how the cloud will change the way we live, work, play, and innovate from futurist and Delphi Group Chairman and Founder Tom Koulopoulos. Afterward, he will answer questions in a lively discussion with attendees.

Before organizations map an architectural approach to data, the first thing that they should understand is data intelligence. Stewart Bond, IDC's vice president for data integration and intelligence software, will dissect this foundational element and how it drives strategy as well as answer audience questions about governance, ownership, security, privacy, and more.

With that foundation, CIOs can move on to considering emerging best practices and options for cloud architecture and cloud solution optimization. David Linthicum, chief cloud strategy officer at Deloitte Consulting and a contributor to InfoWorld, will delve into strategies that deliver real business value, a mandate that every IT leader is facing now.

Want to know how top-performing companies are approaching aspects of cloud strategy? Hear how Novanta Inc. CIO Sarah Betadam led a three-year journey to becoming a fully functional data-driven enterprise. Later, learn how Tapestry, home to luxury consumer brands such as Coach and Kate Spade, developed a cloud-first operating model in a conversation between CIO Ashish Parmar and Vice President of Data Science and Engineering Fabio Luzzi.

Another top trend is AI. Phil Perkins, the co-author of The Day Before Digital Transformation, will discuss the most effective applications of AI being used today and what to expect next.

At some organizations, data can be a matter of life and death. Learn about a data-focused death investigations case management system used to influence public safety in a conversation between Gina Skagos, executive officer, and Sandra Parker, provincial nurse manager, at the Province of Ontario's Office of the Chief Coroner.

Throughout the summit, sponsors including IBM, CoreStack, VMware, and Palo Alto Networks will offer thought leadership and solutions on subjects such as new models of IT consumption, cloud security, and optimizing hybrid multi-cloud infrastructures.

Check out the full summit agenda here. The event is free to attend for qualified attendees. Don't miss out; register today.

Continue reading here:

What you need to know to accelerate your cloud and data strategy - CIO

Finding Life Purpose in Turning Vision Into Reality | Maryland Smith – Robert H. Smith School of Business

This Q&A has been edited and condensed.

What is your job title and where do you work?

I am Principal Product Manager at Opendoor, which buys and sells homes and makes instant cash offers online. I work at the headquarters in San Francisco.

What does a day in your role look like and how do you approach new projects?

On a day-to-day basis, I work with a diverse group of stakeholders to execute the product vision I've developed to further the company's growth. My responsibilities include identifying new opportunities aimed at driving profitability and growth, making sure stakeholders are aligned on product vision, working with design and engineering teams to plan and deliver initiatives, partnering with analytics/data science teams to measure impact, and presenting the project's outcomes and long-term mission to leadership.

I approach new projects by first understanding the problem space, customer needs, and why the project matters to Opendoor. Once I build a good understanding, I then start writing product memos, plan the kick-off of the project and perform the pre-mortems.

Is there something about your professional journey that people would find surprising?

I have been a product manager since day one of my career and have been building products for over a decade. Building consumer products in different industries has left me with a unique edge that allows me to quickly connect the dots and ship products quickly. I have always found new ways to improve myself and have been open to feedback. Learning and trying new technologies (for example, currently working on an AI project leveraging GPT3) has enabled me to keep improving at my craft.

Tell us about your path from graduation to your current job.

Immediately after graduating from Smith, I joined Home Depot as a product manager and shipped highly impactful products that brought in record revenue and that got me promoted to senior product manager. In that role, I managed two product areas. One of them was risk and fraud. I gained vast expertise in product strategy and combating bad actors in the digital world. To gain more knowledge in my field, I decided to take a job as payment and risk product manager at Eventbrite, where I was responsible for creating products that reduced risk and fraud-related losses and along the way I started building monetization products. After success at Eventbrite, I was offered a job at Opendoor. I accepted because it was a chance to leverage my years of product management experience with solving complex problems, enabling me to make a difference for millions of homebuyers and sellers in the U.S.

Are you where you thought you would be in your career? What are your goals?

Yes, I'm where I thought I would be. At this stage of my career, I wanted to directly impact critical outcomes for a business and be at the forefront of solving complex problems that sit at the intersection of technology, data science and user experience. My goal is to create solutions and products that make a meaningful difference in people's lives at Opendoor, and that is what we are doing.

What Smith resources or relationships did you leverage for your career?

The Smith School curriculum is very well designed and that enriched my learning experience. I was able to build a deep connection with some of the faculty who mentored me and were always open to discussing ideas. These relationships helped me secure funding for startups I co-founded, helped me hire students to assist with research projects and provided recommendations for career opportunities.

How has your Smith education helped with your career? Were there specific classes, experiential projects, team projects or internships that have been especially helpful to you?

I've seen great value in applying concepts in the real world that I learned from my coursework at Smith. At different times in my career, I've gone back to the slides taught by Professor Siva Viswanathan when I needed to brush up on theories and fundamentals applicable to driving profitability and growth for consumer and marketplace technology companies. Also, getting to run two startups while I was at Smith set me off on a successful path.

Why did you decide to get a business degree and why did you choose Smith?

I decided to earn a Master of Science in Information Systems degree from Smith because the coursework sharpened my skills in digital strategy, data science, and business strategy. These are all critical for my career.

What about your personal journey has led to your success?

I have always been a curious person who liked helping people. As a kid, I used to love the Captain Planet character in a cartoon I watched (Captain Planet and the Planeteers). He was responsible for saving the world. I have always wanted to use my skills to make the lives of people around me better, whether it was helping someone with their studies, career coaching or creating products that make a meaningful difference in people's lives.

Is there anything else you would like to add?

Everyone has something they can bring to the table due to the unique experiences they had growing up. I would say, from my experience and echoing the University of Maryland's brand tagline "Fearlessly Forward", that one should try to discover their Ikigai (life purpose) and spend the rest of their lives perfecting that which will make this world a little better for their being in it.

Read the original:

Finding Life Purpose in Turning Vision Into Reality | Maryland Smith - Robert H. Smith School of Business

New podcast explores whether data can solve big problems – MIT Sloan News


From sports betting to policing to the dark side of the data economy, data and society intersect at many points. The new podcast Data Nation, from the MIT Institute for Data, Systems, and Society, takes a closer look at these intersections, including whether data can be used to solve societal problems and answer difficult questions.

Hosted by Liberty Vittert, SB '10, a data science professor at Washington University in St. Louis, and MIT professor Munther Dahleh, the director of the Institute for Data, Systems, and Society, the podcast examines a wide range of issues, from misinformation to credit scores. Season 2 launched in March with a look at how brain data can shed light on sleep and anesthesia.

Guests include MIT professors, journalists, and experts in the field. Here's a look at some highlights from Season 1.

The digital economy is fueled by data, which companies can use to reach users more effectively and develop products.

A vast amount of that data flows to platforms like Facebook for free. These platforms don't buy or sell customer data; users willingly share it with them, thus allowing the companies to run more-targeted ads.

While those big companies deserve scrutiny, people should also look at more obscure data brokers: companies that exclusively buy and sell consumer information, according to MIT Sloan professor Dean Eckles. "There [are] a lot of ways that data is flowing around, often involving the companies that you haven't necessarily heard of," he said. "This is happening a little bit more behind the scenes." Eckles contends that there should be more regulatory scrutiny for these data brokers, which often don't have a direct relationship with consumers.

People should "be conscious of their data footprint and know the basics of how online advertising works, and maybe take some steps to preserve [their] privacy," he said, noting that actions like turning off app tracking and using encrypted messaging services work well.

The sports industry has been an early adopter of data and analytics, with teams and players using them to gain a competitive edge. With sports betting now legal in a majority of states, sports analytics can be used by people off the field, too.

Anette (Peko) Hosoi, a mechanical engineering professor at MIT who studies sports data and technology, said people placing bets on sports should consider the extent to which the outcome is skill-based, as the rules of some sports reward skill more than others. Hosoi's research has found that basketball and baseball are more skill-based compared with hockey and football.

"Any activity that you do is going to have some elements of skill and some elements of luck," Hosoi said. "You're really asking, where does it sit on this spectrum? ... When you're betting on sports, having those statistical algorithms and having that statistical knowledge makes a difference."

Bettors should keep in mind the intended outcome when they decide how to place bets. For instance, if your aim is to have fun with your family, something relatively random, like football, would fit the bill. But if you're trying to make money, "go for something more deterministic, like basketball," Hosoi said. People can make more informed bets by focusing on high-skill sports that generate a lot of data.
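A toy simulation makes the skill-versus-luck spectrum concrete: below, two teams' game "scores" blend a fixed skill gap with random noise, and the better team's win rate is tracked as the skill weighting grows. The weights and skill values are invented for illustration and are not Hosoi's estimates for any sport.

```python
# Toy Monte Carlo for the skill-vs-luck spectrum: the more outcomes depend
# on skill, the more often the better team wins. All numbers are illustrative.
import random

def better_team_win_rate(skill_weight: float, trials: int = 100_000) -> float:
    wins = 0
    for _ in range(trials):
        score_a = skill_weight * 1.0 + (1 - skill_weight) * random.random()  # skill 1.0
        score_b = skill_weight * 0.8 + (1 - skill_weight) * random.random()  # skill 0.8
        wins += score_a > score_b
    return wins / trials

for w in (0.1, 0.5, 0.9):  # mostly luck -> mostly skill
    print(f"skill weight {w}: better team wins {better_team_win_rate(w):.0%} of games")
```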

Opioid use is a leading cause of injury-related deaths in the U.S., and opioid-related deaths reached a record high in 2022.

Businesses can play a role in addressing this epidemic, according to MIT Sloan professor Andrew Lo.

Because opioids are addictive, they've generated lots of revenue for companies, a business incentive that ultimately caused the crisis, Lo said. In 2022, drug distributors and wholesalers finalized an opioid settlement that is now up to $32 billion, a figure that only hints at how much money was generated in revenue during the years leading up to the crisis, he said.

The key now is incentivizing companies to develop nonaddictive pain medicines. Companies need to understand that they will be financially rewarded and earn goodwill if they're able to do so, Lo said. "If you do that at a large enough scale, the chances are you're going to hit one or two or three different really successful, really powerful drugs that can deal with both the crisis as well as with pain management," he said.

With massive amounts of historic and location-specific data available, police are able to analyze when and where various types of crime have taken place, for example, and allocate resources accordingly.

But there is reason to be wary of these approaches, said S. Craig Watkins, the Martin Luther King Jr. visiting professor at MIT, particularly among communities of color and many working-class or poor individuals.

"For the communities who bear the brunt of these systems, who are disproportionately profiled and surveilled as a result of these systems, there's just no possible way that they could see these technologies as a net benefit in any way, shape, form, or fashion," he said. "It's going to require a strategic effort in terms of convincing them that these systems can lead to sort of a net benefit."

For positive impacts to come to fruition, there need to be clear procedures, policies, and practices for data-informed profiling and policing, Watkins said, and organizations should be intentional about their adoption and deployment of these systems. "We can't assume that just by virtue of them existing and by virtue of us adopting and deploying them, they will generate these net benefits," he said.

Read next: Data literacy for leaders

Read the rest here:

New podcast explores whether data can solve big problems - MIT Sloan News

What Is the Role of Data Governance in Healthcare? – HealthTech Magazine

How Does Data Governance Impact Healthcare Organizations?

Data is omnipresent in healthcare organizations. "The more accessible and reliable it is, the more likely you are to develop insights from it," says Jonathan Shannon, associate vice president of healthcare strategy at LexisNexis. On the other hand, he adds, "your entire business suffers in various ways when data governance policies are poor."

Krishnan describes three telltale signs of poor governance:

"It's one thing if a patient gets the same marketing materials twice," she says. (That can happen if, say, there are records for Sam Smith and Sam S. Smith at the same mailing address.) "But I don't want my diagnosis to be wrong because physicians don't have access to all of my records."
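To show why "Sam Smith" versus "Sam S. Smith" is a genuinely hard matching problem, here is a minimal duplicate-detection sketch using only Python's standard library; real master-data tools use far richer matching logic, and the records and similarity threshold below are invented.

```python
# Minimal near-duplicate detection: flag record pairs with the same address
# and highly similar names. Records and the 0.8 threshold are illustrative.
from difflib import SequenceMatcher

records = [
    {"name": "Sam Smith",    "address": "12 Elm St"},
    {"name": "Sam S. Smith", "address": "12 Elm St"},
    {"name": "Pat Jones",    "address": "9 Oak Ave"},
]

def likely_same_person(a: dict, b: dict, threshold: float = 0.8) -> bool:
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return a["address"] == b["address"] and name_sim >= threshold

for i in range(len(records)):
    for j in range(i + 1, len(records)):
        if likely_same_person(records[i], records[j]):
            print("possible duplicate:", records[i]["name"], "<->", records[j]["name"])
```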

More broadly, poor data governance in healthcare can have significant business implications. Shannon points to the referral process. There are multiple benefits to referring patients to in-network providers: Patients avoid the high cost of seeing physicians not covered by their insurance plan, and organizations keep patient defections to a minimum.

"If provider directories are inaccurate, and data from the Centers for Medicare and Medicaid Services indicates that 49 percent of them are, then it's that much harder to make in-network referrals," he says. "Very important procedures may be sent up the road and out of the network because someone didn't know."

Finally, poor governance poses security and regulatory risks. The 21st Century Cures Act requires organizations to make data available to other healthcare stakeholders, including patients. This requires a delicate balance between security and availability, Shannon says.

"Now, there's more pressure to make data more accessible," he says, "primarily with open application programming interfaces. Without data governance, you can't support open access with APIs."
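As a hedged illustration of what open-API access often looks like in this space, the snippet below queries a FHIR-style endpoint (FHIR being a widely used healthcare data-exchange standard) for patients by name. The base URL is a hypothetical placeholder, and real deployments require authentication that is omitted here.

```python
# Illustrative FHIR-style API call. The endpoint is a hypothetical placeholder
# and real servers require auth; a FHIR search returns a Bundle of resources.
import requests

BASE = "https://fhir.example-hospital.org"  # hypothetical endpoint
resp = requests.get(f"{BASE}/Patient", params={"name": "Smith"}, timeout=10)
resp.raise_for_status()

bundle = resp.json()
for entry in bundle.get("entry", []):
    print(entry["resource"].get("id"), entry["resource"].get("name"))
```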


When it comes to addressing data governance, healthcare organizations tend to fall into one of three buckets, Krishnan says. Some are just getting started and need help putting a general framework in place. Others have a framework but also have many data silos; this is especially tricky when data is on-premises and in both public and private cloud environments.

Still others have made progress but worry about the implications of duplicate records within newly unified data sets, whether it's difficulty with regulatory compliance or a lag time to get business-ready data to the teams that need it. These organizations want help to scale for new applications, Krishnan says. "They want to be enterprise-ready and future-proof."

A common starting point is what Shannon refers to as a "one-time cleanup" of the organization's data repository.

"By definition, data grows and changes over time. If your organization has been in business for decades, you've been accumulating data for decades," he says.

When technology upgrades are on the horizon, its both expensive and counterproductive to move millions of unnecessary records those that are duplicates, incomplete, from deceased patients, and so on. Through a combination of referential and probabilistic modeling methods, Shannon says a repository with data from 7 million patients could be trimmed to 1 million patient records. As a result, the repository is more accurate, less expensive to maintain, and well suited for use with next-generation applications for decision support, population health management, and predictive analytics and modeling.

For Money, addressing organizational culture is fundamental to improving data governance.

"You can't buy data governance off the shelf," she says. "It has to be understood from the highest level of the organization to the bottom. It should be invisible. No matter the organization, what you're doing is producing quality, usable, effective products, and data governance is a tool to make that happen."

Continue reading here:

What Is the Role of Data Governance in Healthcare? - HealthTech Magazine

The Berkson-Jekel Paradox and its Importance to Data Science – KDnuggets

If you are a Data Scientist, or an aspiring one, you will know the importance of statistics in the sector. Statistics help Data Scientists collect, analyze, and interpret data by identifying patterns and trends, and then make future predictions.

A statistical paradox occurs when a statistical result contradicts expectations. It can be very difficult to pinpoint the exact cause, as it is hard to understand the data without further methods. However, paradoxes are an important element for Data Scientists, as they give a lead on what could possibly be causing misleading results.

Several statistical paradoxes are relevant to data science, including Simpson's paradox and the Berkson-Jekel paradox.

In this article, we will be focusing on the Berkson-Jekel paradox and its relevance to Data Science.

The Berkson-Jekel paradox occurs when two variables are correlated in the data as a whole, yet when the data is grouped or subsetted, the correlation disappears or changes direction. To put it in layman's terms, the correlation is different in different subgroups of the data.

The Berkson-Jekel paradox is named after the first statisticians who described it, Joseph Berkson and John Jekel. The paradox was discovered while the two statisticians were studying the correlation between smoking and lung cancer. During their study, they found a correlation between people who had been hospitalized for pneumonia and lung cancer, in comparison to the general population. However, further research showed that the correlation was due to smokers being hospitalized for pneumonia more often than people who did not smoke.
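The selection effect at the heart of the paradox is easy to reproduce. In the illustrative simulation below, two traits are generated independently in a population, yet once the data is restricted to hospitalized people (admission being more likely when either trait is present), a correlation appears among the admitted, here a negative one. All probabilities are invented for demonstration.

```python
# Simulating Berkson-style selection: two independent traits become
# correlated once you condition on hospital admission. Numbers are invented.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
smoker = rng.random(n) < 0.3        # trait 1, independent of trait 2
pneumonia = rng.random(n) < 0.1     # trait 2

# admission is more likely if either trait is present -> selection effect
admitted = rng.random(n) < (0.02 + 0.30 * smoker + 0.50 * pneumonia)

def corr(a, b):
    return np.corrcoef(a.astype(float), b.astype(float))[0, 1]

print("whole population:", round(corr(smoker, pneumonia), 3))                     # ~0.0
print("admitted only:   ", round(corr(smoker[admitted], pneumonia[admitted]), 3)) # non-zero
```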

Based on the statisticians' first research on the Berkson-Jekel paradox, you might say that more research was required to figure out the exact reasoning behind the correlation. However, there are also other reasons why the Berkson-Jekel paradox occurs.

Statistical reasoning is very important in Data Science, and a main challenge is dealing with misleading results. As a data scientist, you want to ensure that you are producing accurate results that can be used in decision-making and for future predictions. Producing incorrect predictions or misleading results is the last thing you want.

There are a few methods that you can use to avoid the Berkson-Jekel Paradox:

If you are dealing with misleading results because the sample data is not representative of the population, one solution is to use data from a variety of sources. This will help you to get a more representative sample of the population, research the variables further, and gain a better understanding.

Misleading outputs can hold a company back. Therefore, when working with data, data professionals need to understand the limitations of the data they're working with, the different variables and the relationships between them, and how to prevent misleading results.

If you would like to know more about Simpson's Paradox, have a read of this: Simpson's Paradox and its Implications in Data Science

If you would like to know more about the other statistical paradoxes, have a read of this: 5 Statistical Paradoxes Data Scientists Should Know

Nisha Arya is a Data Scientist, Freelance Technical Writer and Community Manager at KDnuggets. She is particularly interested in providing Data Science career advice and tutorials, along with theory-based knowledge around Data Science. She also wishes to explore the different ways Artificial Intelligence can benefit the longevity of human life. A keen learner, she seeks to broaden her tech knowledge and writing skills, while helping guide others.

Original post:

The Berkson-Jekel Paradox and its Importance to Data Science - KDnuggets

Introduction to Data Science with Python: How is it Beneficial? – Analytics Insight

Learn how beneficial data science with Python can be in this simplified guide with meaningful resources.

The need for more effective and efficient data storage grew significantly as the globe entered the era of big data in recent decades. Businesses utilizing big data put a lot of effort into developing frameworks that can store large amounts of data. Eventually, frameworks like Hadoop were developed, aiding in the storage of enormous volumes of data.

When the storage issue was resolved, attention turned to processing the data that had already been saved. Data science has emerged as the method of the future for handling and evaluating data in this situation. Data science is becoming a crucial component of any industry dealing with massive volumes of data. Businesses currently employ experts and data scientists who take the data and transform it into a useful resource.

Let's now get into data science and the advantages of using Python for data science.

Let's start by studying data science and then using Python to learn about it. Finding and examining data in the real world is fundamental to data science, which then employs this knowledge to address practical business issues.

Now that you are aware of what data science is, let's first discuss Python before delving deeply into the subject of data science with Python.

We require a programming language or tool, such as Python, for data science. Although there are other data science tools, such as SAS and R, this post will concentrate on Python and how it may help with data science.

Python has recently gained a lot of popularity as a programming language. Its usage in data science, the Internet of Things, artificial intelligence, and other technologies has increased its appeal.

Since it has extensive mathematical and statistical features, Python is utilized as a programming language for data science. That is one of the key reasons why Python is used by data scientists all around the world. If you follow trends over the previous few years, Python has emerged as the preferred programming language, particularly for data science.

Python is one of the most popular programming languages for data science for several additional reasons, including:

Speed: Python is comparatively quicker than many other programming languages.

Availability: There are several packages created by other users that are readily available and may be utilized.

Design objective: Python's syntax is simple to comprehend and intuitive, making it easier to create applications with readable code.

Python is a straightforward programming language to learn, and it supports certain fundamental operations like adding numbers and printing statements. But you must import certain libraries if you wish to undertake data analysis. Common examples include (a short sketch using several of them follows this list):

Pandas: Tool for working with structured data.

NumPy: A powerful library that helps you create n-dimensional arrays

SciPy: Offers scientific features like Fourier analysis and linear algebra

Matplotlib: Mostly used for visualization.

Scikit-learn: Used for machine learning operations.
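Here is a small, self-contained sketch that exercises several of the libraries above together; the data is randomly generated purely for illustration.

```python
# Tiny workflow touching NumPy, pandas, SciPy and Matplotlib together.
# The data is synthetic; this only illustrates how the libraries combine.
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 50)                           # NumPy: build an array
y = 3 * x + np.random.default_rng(1).normal(0, 2, x.size)
df = pd.DataFrame({"x": x, "y": y})                  # pandas: structured data

fit = stats.linregress(df["x"], df["y"])             # SciPy: simple statistics
print(f"slope ~ {fit.slope:.2f}, r ~ {fit.rvalue:.2f}")

plt.scatter(df["x"], df["y"], s=10)                  # Matplotlib: visualize
plt.plot(df["x"], fit.intercept + fit.slope * df["x"])
plt.savefig("fit.png")
```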

Original post:

Introduction to Data Science with Python: How is it Beneficial? - Analytics Insight

Scientific Journeys: From genetics to the environment and back – Environmental Factor Newsletter

Last October, David Reif, Ph.D., joined what he calls his "dream team" in the NIEHS Division of Translational Toxicology. As head of the Predictive Toxicology Branch, he leads a multidisciplinary group focused on predicting how individuals and populations respond to environmental exposures. The group aims to improve public health through the development and promotion of cutting-edge, computer-based methods and research models.

Reif recently talked with Environmental Factor about why he transitioned from academia to NIEHS, his journey from genetics to toxicology, and what most excites him about the future.

Environmental Factor: What drew you to NIEHS?

David Reif: Team science. I'm a data scientist who enjoys tackling big problems that require expertise beyond my own. I had been a professor for 10 years. I really liked it, but I found that the projects that most motivated me were those involving long-term collaborations. I had been following and using team-built tools coming out of NIEHS, teaching them to my students and incorporating them in my own research. Then this opportunity came up, and I thought it would be great to be on the inside to advance translational toxicological research and predict how gene-by-environment interactions can influence human health.

EF: What makes the Predictive Toxicology Branch unique?

DR: We have a mix that doesn't exist anywhere else in the world. It's basically my dream super lab, my dream team. We have computational quantitative biologists, computational chemists, scientists working on geospatial health analytics, and researchers promoting in vitro [cell-based] models and new approach methodologies, all together in one branch.

EF: Can you share how your education and training shaped your career path?

DR: My graduate training was in human genetics and statistics, but I really wanted to study environmental health problems. At that time, in my view, the tools for measuring the environment's impact on health were unsophisticated. But scientists had just mapped the human genome, so it was an exciting time, and it felt like everything was possible.

I completed postdoctoral training at the U.S. Environmental Protection Agency, in the just-launched National Center for Computational Toxicology, which had the atmosphere of a startup company. The new center marked the beginning of programs to rapidly test all the chemicals we didn't know about. I was there for seven years as a statistician [principal investigator], and I really invested in the field. I took formal courses in toxicology, and I started going to Society of Toxicology conferences. I stayed in the toxicology and environmental health research space as a professor at North Carolina State University, even though I joined a genetics department.

EF: Is the shift from genetics to toxicology common?

DR: I don't know if it's common, but I think it's conducive because genetics is a mechanism for both responding to the environment in the near term and a way to transmit information across generations. And you can apply genetics and genomics to lots of different kinds of problems. In some projects here, I'm full circle back to doing clinical studies involving human exposures, but now we have a vastly more sophisticated characterization of the environment to consider.

For example, the exposome, which represents the totality of our environmental exposures, could not be effectively measured back when I was training. Huge progress has been made recently, much like genetic and genomics technologies advanced quickly when I was earning my Ph.D. I feel like we're now in the exposomics era, and it's letting new kinds of science come to the fore.

EF: What is your vision for the Predictive Toxicology Branch?

DR: We have an opportunity to use machine learning, artificial intelligence, and many other cutting-edge tools to produce scientific knowledge that translates directly to human health. It is truly predictive data science that doesn't exist elsewhere, because we can make a prediction, build models based on tons of data, and then test those in new experiments. We can predict what we think is going to happen, and we can generate the data to validate that prediction.

I want our branch to be a destination for people to come and learn these skills, to establish the branch as a center of excellence for training, and to attract the best scientific talent.

EF: With many chemicals in the environment today, what should the average person know?

DR: One of the best things about government science is that it doesn't have an agenda. It's about truth and robustness. I'm hoping that we can provide those things to people, so when they ask questions about what they are exposed to and how to avoid harm, we can be a trusted source for that information.

I think we're at a cool inflection point where the confluence of technology, talent, and awareness are all coming together. Things have to change, and we're ready to change them.

(Caroline Stetler is Editor-in-Chief of the Environmental Factor, produced monthly by the NIEHS Office of Communications and Public Liaison.)

More:

Scientific Journeys: From genetics to the environment and back - Environmental Factor Newsletter