Category Archives: Data Science

FAIR Skies Ahead for Biomedical Data Project Looking to Benefit Research Community – Datanami

Dec. 22, 2023: The San Diego Supercomputer Center at UC San Diego, along with the GO FAIR Foundation, the National Center for Atmospheric Research, the Ronin Institute and other partners, will conduct data landscaping work funded by the Frederick National Laboratory for Cancer Research, operated by Leidos Biomedical Research, Inc., on behalf of the National Institute of Allergy and Infectious Diseases (NIAID). SDSC's Research Data Services Director Christine Kirkpatrick leads the GO FAIR U.S. Office at SDSC and serves as PI for the new project.

The NIAID Data Landscaping and FAIRification project seeks to benefit biomedical researchers and the broader community that generate and analyze infectious, allergic and immunological data. Using the FAIR Principles as a guide, the project team (which offers a broad background in ensuring that metadata, a set of data that describes and gives information about other data, for biomedical research is findable, accessible, interoperable and reusable, or FAIR) will provide guidance on approaches to enhance the quality of metadata within NIAID- and NIH-supported repositories and resources that harbor data and metadata.

Structured trainings and guidance will be offered to support stakeholders, drawing on components of the model pioneered by GO FAIR, leveraging established M4M workshops and adopting FAIR Implementation Profiles (FIPs). This work will be underpinned by interviews with stakeholders and an assessment to explore the relationship between FAIR resources and scientific impact. The initial period of the federally funded contract, which runs from Sept. 20, 2023 to Sept. 30, 2024, is valued at $1.3 million.

Highlights of the team's expertise include co-authoring the FAIR Guiding Principles, facilitating metadata for machines (M4M) workshops, developing the FAIR Implementation Profile approach, and contributing to improvements on data policy and metadata practices and standards.

"Our team is elated to be working with our NIAID project sponsors at the Office of Data Science and Emerging Technologies (ODSET) through Leidos Biomedical Research," remarked Kirkpatrick, PI of the landscaping project. "NIAID is renowned for its significant data resources and impactful scientific research. Having the chance to apply our collective expertise in research data management in support of the NIAID mission areas of infectious disease, allergy and immunology will be both impactful to the FAIR ecosystem and meaningful work for our team. Further, I believe this work will become more common in the future as organizations begin to see data as a strategic asset, rather than focus on the cost of storing it."

The project runs alongside another key project in the Leidos Biomedical Research portfolio, the NIAID Data Ecosystem Discovery Portal, led by The Scripps Research Institute. The project team will work hand in hand with the Scripps team to ensure repository improvements maximize the Discovery Portal's ability to search across the wide array of data assets produced by NIAID-funded research.

The project team includes co-authors of the 2016 FAIR Principles paper (Barend Mons and Erik Schultes), leaders in research data consortia, scholars in informatics and biomedical research, and pioneers in FAIR training, interoperability practices and methodology for assessing scientific impact. Team members are Chris Erdmann, Doug Fils, John Graybeal, Nancy Hoebelheinrich, Kathryn Knight, Natalie Meyers, Bert Meerman, Barbara Magagna, Keith Maull and Matthew Mayernik. These experts are complemented by world-class systems integrators and project managers from SDSC: Alyssa Arce, Julie Christopher and Kevin Coakley.

Source: Christine Kirkpatrick and Julie Christopher, SDSC Research Data Services

Read the original:

FAIR Skies Ahead for Biomedical Data Project Looking to Benefit Research Community - Datanami

Websites to Apply for Paid Data Science Internships – Analytics Insight

Securing a paid internship in the field of Data Science is a significant stepping stone for aspiring data scientists. The experience gained during an internship not only enhances practical skills but also opens doors to valuable networking opportunities. In this article, we explore some of the best websites where aspiring data scientists can find paid internship opportunities to kickstart their careers.

LinkedIn has evolved into a powerhouse for professional networking and job opportunities. Companies often post internship positions directly on LinkedIn, making it a crucial platform for aspiring data scientists. Follow relevant companies, join groups, and stay updated with the latest internship postings.

Indeed is a widely used job search engine that aggregates internship postings from various sources. It allows users to filter positions based on location, salary, and job type, providing a comprehensive and user-friendly interface for finding paid Data Science internships.

Glassdoor not only provides job listings but also offers insights into company reviews and interview experiences. This can be invaluable when researching potential employers. The platform often features paid Data Science internships from leading companies, helping candidates make informed decisions.

Internshala is a platform specifically designed for internships, making it a go-to resource for students and recent graduates. The website features a dedicated section for Data Science internships, allowing users to filter positions based on location, duration, and stipend.

Kaggle is renowned for hosting data science competitions, but it also serves as a platform for job postings and internships. The Kaggle Jobs board frequently features paid Data Science internships from companies looking to engage with the vibrant Kaggle community.

For those interested in the startup ecosystem, AngelList is a platform that connects startups with potential employees and interns. Data Science interns can find unique opportunities to contribute to innovative projects in a startup environment.

Similar to Indeed, SimplyHired is a job search engine that aggregates internship listings. It simplifies the job search process by providing a streamlined interface, allowing users to search for paid Data Science internships with ease.

Chegg Internships is a platform connecting students with internship opportunities across various industries, including Data Science. It features a range of internships with detailed information about the company, job responsibilities, and stipends.

CareerBuilder is another comprehensive job search engine that includes a variety of internship opportunities. It allows users to search for Data Science internships based on location, industry, and company size.

Dice is a platform primarily focused on technology-related opportunities, making it an excellent resource for those seeking Data Science internships in the tech industry. It features internships from both established companies and startups.

The journey to securing a paid Data Science internship begins with exploring diverse platforms that cater to the specific needs and preferences of aspiring data scientists. By leveraging the resources provided by these websites, candidates can discover exciting opportunities, gain practical experience, and embark on a rewarding career in the dynamic field of Data Science.

Continue reading here:

Websites to Apply for Paid Data Science Internships - Analytics Insight

Mapping the relations between Manhattan Project scientists using network science – Phys.org

by Ingrid Fadelli, Phys.org

The Manhattan Project was a top-secret program that culminated in the development of the first atomic bombs during World War 2. This covert and controversial research endeavor involved many gifted and reputable scientists, including physicist J. Robert Oppenheimer.

Milán Janosov, Founder of Geospatial Data Consulting and Chief Data Scientist at Baoba, recently set out to map the relationships between scientists who took part in the Manhattan Project using methods rooted in network science. Network science is a field of research that explores the intricate connections between people in a group or between the individual parts of networked systems. The work is published on the arXiv preprint server.

"I have been working with social networks and mapping unusual datasets to uncover hidden connections for a while," Janosov said. "During this journey, I also mapped hidden networks of scientists, including for instance, the network of Nobel laureates in another project released earlier this year. So, I already had a history of mapping scientists' networks. After watching the long-awaited Oppenheimer movie, I decided to also untangle the collaboration and social connections behind the Manhattan project, which if one of the largest, most impactful scientific collaborations of human history."

The release of the popular movie Oppenheimer in July this year reawakened significant public interest in the Manhattan Project and the substantial research efforts that led to the development of the atomic bomb. This inspired Janosov, a trained network scientist with a background in physics, to explore this topic in his research.

"A practical and traditionally accepted way of building networks of scientists relies on shared publications," Janosov explained. "However, even today, some of the Manhattan Project's science is classified, so that direction would have distorted the picture. So, I decided to drop this steer away from classified and private data to the most public information platform availableWikipedia."

To map the relationships between the different scientists involved in the Manhattan Project, Janosov first collected each scientist's Wikipedia page and compiled these pages into a dataset. Subsequently, he used language processing techniques to analyze the texts included in these pages.

"This approach allowed me to quantify how often each laureate's page refers to others," Janosov said. "This was all I needed to build their network, in which each scientist was a node linked based on Wikipedia mentions and references. For instance, the Wiki page of Oppenheimer mentions Enrico Fermi more than 10 times, leading to a strong link between the two physicists."

The map created by Janosov represents the most renowned scientists involved in the Manhattan Project as dots and the connections between these scientists as lines that connect the dots. These dots and lines create an intricate web of relationships, highlighting research circles that closely collaborated at the time.

"It's exciting to see how the network's community structure outlines the different departments and historically well-known cliques who worked in the projects, such as the Theoretical Division with Feynman or World War II refugees around Borh," Janosov said. "However, my favorite part is about the Hungarian immigrants who run under the nickname Martians: Teller, Wigner, Szilard, and Neuman, who played a foundational role in the dawn of the atomic. As it turns out, also in this network, using the right coloring, their strong connectedness is also clearly visible."

The colorful map of The Manhattan Project created by Janosov is one of the most recent examples of just how valuable network science can be for creating representations of human connections and visual maps of complex systems with many interacting components. Future studies in this rapidly evolving area of research could shed some new light on a wide array of topics rooted in both science and the humanities.

"Nowadays, I am most focused on questions related to urban planning, geospatial data science, and sustainability," Janosov added. "I am currently exploring a crucial question in this domain, where network science can also be appropriately applied."

More information: Milan Janosov, Decoding the Manhattan Project's Network: Unveiling Science, Collaboration, and Human Legacy, arXiv (2023). DOI: 10.48550/arxiv.2310.01043

Journal information: arXiv

© 2023 Science X Network

See the original post here:

Mapping the relations between Manhattan Project scientists using network science - Phys.org

6 Great Beginner-Friendly Tips to Overcome Your First Data Science Project – Towards Data Science

Doing your first project might be the single most important milestone in your data science journey. However, knowing what the first steps are in this undertaking is often fraught with challenges. I'm here to assure you that it doesn't have to be like that.

In this article, I'm going to share with you exactly what you need to know to begin your first project.

My goal is to clear up any misconceptions that you might have about how to start your first data science project and give you the confidence to begin as soon as possible.

These are six essential insights that will cut through your apprehension about projects. The last one has the potential to transform the trajectory of your entire career.

Let's dive in!

Why do a project in the first place?

Is it to show your skills to prospective employers? Is it to use as a conversation starter when reaching out to people on LinkedIn?

See original here:

6 Great Beginner-Friendly Tips to Overcome Your First Data Science Project - Towards Data Science

FINRA and CFTC make critical hires towards integrating data analytics and surveillance technology – FinanceFeeds

The CFTC has announced Ted Kaouk as Chief Data Officer and John Coughlan as Chief Data Scientist. FINRA has appointed Feral Talib as Head of Surveillance and Market Intelligence.

The Commodity Futures Trading Commission (CFTC) and the Financial Industry Regulatory Authority (FINRA) have both announced critical appointments, signifying an increased focus on data analytics and market surveillance.

With the CFTC appointing Ted Kaouk as Chief Data Officer and John Coughlan as Chief Data Scientist, and FINRA announcing Feral Talib as Head of Surveillance and Market Intelligence, the two regulatory organizations are taking significant steps towards integrating advanced data analytics and surveillance technology in the regulatory framework, reflecting a proactive stance in adapting to the rapidly evolving financial markets.

CFTC Chairman Rostin Behnam recently announced the appointment of Ted Kaouk as the Chief Data Officer and Director of the Division of Data (DOD).

This strategic move aims to bolster the CFTC's data-driven approach to regulatory oversight. Kaouk, with an impressive background, including roles as Chief Data Officer at the Office of Personnel Management (OPM) and the Department of Agriculture, brings a wealth of experience in data integration and strategy development. His appointment is seen as a crucial step in enhancing the CFTC's ability to make informed policy decisions.

Alongside Kaouk, John Coughlan has been named the agency's first Chief Data Scientist. Coughlan has spent eight years at the CFTC, and his appointment marks a new era in data science for the agency. His expertise in machine learning and data analytics will be instrumental in advancing the CFTC's use of artificial intelligence for effective oversight of the derivatives markets.

Chairman Behnam underscored the importance of these appointments: "The massive shifts in financial markets driven by advances in technology put the CFTC at the center of a new era of financial data, empowering us to more efficiently and effectively execute our mission.

"With these new critical hires, the CFTC is upskilling our data science staff, and increasing capacity and capability to be at the forefront of market innovations. We now have the team in place to set a strategy with concrete benchmarks and a clear path forward."

Meanwhile, FINRA has appointed Feral Talib as the Executive Vice President and Head of Surveillance and Market Intelligence. Talib's role, a new addition to FINRA, is set to begin on January 2. His extensive experience in market surveillance, most notably as the Global Head of Market Surveillance at BNP Paribas Group, positions him as a key player in strengthening FINRA's surveillance capabilities.

Talib will be responsible for leading FINRA's surveillance program, which is crucial for maintaining the integrity of the U.S. securities markets. His focus will be on continuous innovation, ensuring that the surveillance systems keep pace with the evolving and complex nature of modern financial markets.

Stephanie Dumont, Executive Vice President and Head of Market Regulation and Transparency Services at FINRA, praised Talib's expertise and track record: "Feral has extensive experience leading surveillance programs that will bolster our mission of protecting investors and promoting market integrity. He has a proven track record of overseeing comprehensive surveillance portfolios while utilizing cutting-edge surveillance techniques. Feral's surveillance expertise will help us continue to innovate and enhance the effectiveness and technological sophistication of our surveillance program. Feral will be a key addition to FINRA's ongoing leadership in regulatory surveillance."

"I am excited to be joining FINRA and to have the opportunity to lead its Surveillance and Market Intelligence unit. Robust surveillance is vital to ensuring fair markets and protecting investors, and we are at the cusp of an evolutionary leap in surveillance capabilities through the use of artificial intelligence and our ability to process unstructured data. By combining advanced surveillance and detection with FINRA's traditional investigative expertise, I look forward to continuing and building upon the excellent work FINRA is already doing in this space," Talib said.

Read the rest here:

FINRA and CFTC make critical hires towards integrating data analytics and surveillance technology - FinanceFeeds

Top KDnuggets Posts of 2023: Free Learning Resources and More – KDnuggets

Happy holidays, everyone.

With 2023 almost in the books, KDnuggets is happy to share that we are bringing to a close our most successful year yet! We have experienced unparalleled levels of readership this year, have brought on scores of new readers, and covered topics worthy of our audience's time, all while fostering relationships with our partners and sponsors.

As the year comes to an end, it's time to review what you, the readers, have made the most popular posts of the year on KDnuggets.

This list is based on the number of raw views of all posts published on the site between January 1, 2023, and the date of writing (December 14, 2023). We will be publishing a pared-down schedule over the coming few weeks, and with the holidays upon us it makes more sense to perform this assessment now and get it out of the way. Do be sure to keep this publication date caveat in mind, however.

Also keep in mind that KDnuggets' traffic increased dramatically toward the end of the year, so articles published later in the year are, by and large, better represented than those published earlier.

And now, without further ado, here are the 20 most popular KDnuggets posts published in 2023.

See any common themes? We sure do!

We want to thank our immensely talented writing staff for their hard work all year long! It's great to see that their insights and expertise do not go unnoticed by our readers. You can find out more about the writing staff here.

We also thank each and every community member who has submitted an article for publication throughout the year. These additional insights are also very much appreciated, and we are happy to be able to provide a platform for quality data science related content to reach a wider readership than it otherwise might. Keep those submissions coming in the new year.

Thanks again, and we will see you in 2024.

See the original post here:

Top KDnuggets Posts of 2023: Free Learning Resources and More - KDnuggets

From Troops to Tech Leaders: National University Paves the Way for Service Members to Transition into AI and Data … – PR Web

With prestigious National Science Foundation grant, Veteran-founded National University announces new bachelor's degree helping active-duty military and Veterans up- and reskill for careers in data science and AI

SAN DIEGO, Dec. 21, 2023 /PRNewswire-PRWeb/ -- National University, a nonprofit and Veteran-founded Minority Serving Institution that serves 50,000 degree-seeking students and 80,000 workforce and professional development students annually, today announced the development of a new B.S. Data Science degree aimed at helping military service members and Veterans transition into the in-demand fields of data science and artificial intelligence (AI) technology. This effort is made possible through a $500,000 grant from the National Science Foundation (NSF) and will play a vital role in addressing the growing demand for professionals in these rapidly evolving sectors.

"Even as we grapple with understanding the vast and still emerging capabilities of artificial intelligence, we face an equally important obligation to ensure that this new AI and technology-driven economy is accessible for every learner and worker, including those who have served our nation in uniform," said Dr. Jodi Reeves, National University department chair of data science and the associate director for education, diversity, and outreach for TILOS. "The U.S. armed services have long been at the forefront of artificial intelligence and data innovation, so we have a profound opportunity to align the talents of dedicated service members in transition and Veterans to meet the needs of a fast-changingand increasingly tech-driveneconomy where these same technologies will play an outsized role."

Every year, approximately 500,000 members of the U.S. military depart active duty and begin the transition to the civilian workforce. The initiative comes as a response to the escalating demand for skilled individuals in data science and AI technology, with data science being the most in-demand job in the industry. As these fields continue to shape industries across the globe, the demand for experts in this domain is at an all-time high. According to the Bureau of Labor Statistics, data scientist roles are projected to grow 35% between 2022 and 2032, at a much faster rate than the average for all other occupations.

In 2021, National University was selected to be part of a prestigious team awarded a $20 million grant from the National Science Foundation (NSF) for its contribution to The Institute for Learning-enabled Optimization at Scale (TILOS), an AI institute led by the University of California, San Diego, alongside partners such as Yale University, MIT, the University of Pennsylvania, and the University of Texas at Austin. Through the grant, National University will receive $100,000 annually for five years to support faculty members developing new curriculum, in order to help the university serve its diverse student demographics, including working adults and members of the military community.

The AI reskilling program is one of several programs and initiatives created by National University in response to the growing demand for skilled workers who are passionate about the field of data science and AI technology. Since the AI institute's inception, the AI specialization of the M.S. Data Science program has grown to be the largest specialization at National University, with 56% of the students being military-affiliated. The new B.S. Data Science degree will include concentrations in AI and machine learning, cybersecurity analytics, and bioinformatics. New courses begin in February 2024.

"To make good on our commitment to serving military and Veteran students to the best of our ability, we need to continuously find ways to bridge the divide between civilian systems of education and employment and our modern military we must evolve the programs we offer, and the ways in which we serve this unique population of learners," said Meg O'Grady, Senior Vice President, Military and Government Programs at National University. "This is about creating inclusive pathways to careers in data, technology and artificial intelligence for Veterans and service members in transitionand finding new ways to align military skills and credentials with the needs of the emerging AI economy."

Throughout its 50-year history, National University has established a strong reputation for its focus on serving military-connected students and Veterans. Today, its student population reflects the shifting and highly diverse demographics of higher education today. Approximately 70 percent of its students take the majority of their classes online. More than 25 percent identify as Hispanic, and 10 percent identify as Black. More than 80 percent of undergraduates are transfer students. The average age of its students is 33. And about 1 in 4 students are active-duty service members or Veterans.

Established in 1971 by retired U.S. Navy Capt. David Chigos, the university also has a rich history of commitment to military personnel. Recognized as a 2022-2023 Military Friendly School and a participant in the Yellow Ribbon Program, National University proudly offers over 190 degree programs, tailored to meet the needs of active-duty military members, Veterans, and their dependents.

National University's four-week course structure is designed to accommodate the unique demands of military life, enabling students to pursue their degrees without disrupting training, service and deployment schedules. In addition, the university's Veteran Center assists in the transition from military to civilian life, offering guidance and support services. National University also utilizes transfer-friendly policies that enable students to leverage previously earned college credits, professional certifications, and military training.

To learn more about National University's offerings, visit our website at NU.edu.

About National University

National University, a Veteran-founded nonprofit, has been dedicated to meeting the needs of hard-working adults by providing accessible, affordable higher education opportunities since 1971. As San Diego's largest private nonprofit university, NU offers more than 190 online and on-campus programs with flexible four-week and eight-week classes and one-to-one graduate education models designed to help students reach their goals while balancing busy lives. Since its founding, the NU community has grown to 130,000 learners served per year (50,000 degree-seeking students and 80,000 workforce and professional development students) and 230,000 alumni around the globe, many of whom serve in helping industries such as business, education, health care, cybersecurity, and law and criminal justice. To learn more about National University's new possibilities in education, including next-generation education, credential-rich education, and whole human education, visit NU.edu.

Media Contact

Ashleigh Webb, National University, 760-889-3494, [emailprotected], https://www.nu.edu/

SOURCE National University

Read the original:

From Troops to Tech Leaders: National University Paves the Way for Service Members to Transition into AI and Data ... - PR Web

React Props Explained With Examples – Built In

React has a different approach to data flow and manipulation than other frameworks, and that's why it can be difficult in the beginning to understand concepts like props, state and others.

Props is a special keyword in React that stands for properties and is used for passing data from one component to another. Data with props are passed in a unidirectional flow from parent to child.

We're going to focus on React's Props feature and how to use it. Props is a special keyword in React that stands for properties, and it's used for passing data from one component to another.

To understand how props work, first, you need to have a general understanding of the concept of React components. Well cover that and more in this article.

React is a component-based library that divides the UI into little reusable pieces. In some cases, those components need to communicate or send data to each other, and the way to pass data between components is by using props.

As I shared, props is a keyword in React that passes data from one component to another. But the important part here is that data with props are being passed in a unidirectional flow. This means it's passed one way from parent to child.

Props data is read-only, which means that data coming from the parent shouldn't be changed by child components.

Now, let's see how to use props with an example.

More on React: A Guide to React Hooks With Examples

I will be explaining how to use props step by step. There are three steps to using React props:
1. Define an attribute and its value(s).
2. Pass the attribute to the child component(s) by using props.
3. Render the props data.

In this example, we have a ParentComponent including another ChildComponent:
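
The article's original snippets aren't reproduced in this excerpt, so the examples that follow are illustrative sketches assuming plain function components. The parent might look roughly like this, rendering the child several times:

```jsx
function ParentComponent() {
  return (
    <div>
      <h1>I'm the parent component.</h1>
      <ChildComponent />
      <ChildComponent />
      <ChildComponent />
    </div>
  );
}
```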

And this is our ChildComponent:
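
Sketched the same way, a child that renders a hard-coded string:

```jsx
const ChildComponent = () => {
  return <p>I'm the 1st child!</p>;
};
```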

The problem here is that when we call the ChildComponent multiple times, it renders the same string again and again:
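
Something like:

```
I'm the 1st child!
I'm the 1st child!
I'm the 1st child!
```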

But what we'd like to do here is get dynamic outputs, because each child component may have different data. Let's see how we can solve this issue by using props.

We already know that we can assign attributes and values to HTML tags:
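
For instance, an href attribute and value on an anchor tag (the URL is just a placeholder):

```jsx
<a href="https://example.com">Example link</a>
```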

Likewise, we can do the same for React components. We can define our own attributes and assign values with interpolation { }:
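
For example, a custom text attribute on the child, sketched as:

```jsx
<ChildComponent text={"I'm the 1st child"} />
```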

Here, I'm declaring a text attribute on the ChildComponent and then assigning a string value: "I'm the 1st child".

Now, the ChildComponent has a property and a value. Next, we need to pass it via props.

Let's take the "I'm the 1st child!" string and pass it by using props.

Passing props is very simple. Just as we pass arguments to a function, we pass props into a React component, and props bring all the necessary data. Arguments passed to a function:
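
For instance, with an ordinary function (an illustrative example, not taken from the original article):

```jsx
function addition(firstNum, secondNum) {
  return firstNum + secondNum;
}

addition(1, 2); // returns 3
```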

Arguments passed to a React component:
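
And the component equivalent, where the single props parameter collects whatever attributes the parent passed down (again a sketch):

```jsx
const ChildComponent = (props) => {
  return <p>I'm the 1st child!</p>;
};
```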

We've created an attribute and its value, then we passed it through props, but we still can't see it because we haven't rendered it yet.

A prop is an object. In the final step, we will render the props object by using string interpolation: {props}.

But first, log props to console and see what it shows:
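
Roughly like this:

```jsx
const ChildComponent = (props) => {
  console.log(props); // { text: "I'm the 1st child" }
  return <p>I'm the 1st child!</p>;
};
```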

As you can see, props returns an object. In JavaScript, we can access object elements with dot notation. So, let's render our text property with an interpolation:
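
As a sketch:

```jsx
const ChildComponent = (props) => {
  return <p>{props.text}</p>;
};
```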

And that's it. We've rendered the data coming from the parent component. Before closing, let's do the same for the other child components:
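
Assuming the parent now passes a different string to each child:

```jsx
function ParentComponent() {
  return (
    <div>
      <h1>I'm the parent component.</h1>
      <ChildComponent text={"I'm the 1st child"} />
      <ChildComponent text={"I'm the 2nd child"} />
      <ChildComponent text={"I'm the 3rd child"} />
    </div>
  );
}
```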

As we can see, each ChildComponent renders its own prop data. This is how you can use props for passing data and converting static components into dynamic ones.

More on React: How to Make API Calls in React With Examples

To recap:
- Props stands for properties and is used for passing data from one component to another.
- Props data flows one way, from parent to child.
- Props data is read-only, so child components shouldn't change data coming from the parent.

Understanding React's approach to data manipulation takes time. I hope my post helps you to become better at React.

View post:

React Props Explained With Examples - Built In

Introducing Microsoft Fabric: Will Power BI Be Replaced in 2024? – DataDrivenInvestor

Hey there, I'm excited to share my personal journey with a revolutionary analytics platform I recently discovered, Microsoft Fabric.

Microsoft Fabric represents a paradigm shift in the way Power BI users interact with and visualize data. By leveraging advanced technologies and cutting-edge design principles, Microsoft Fabric introduces a new era of intuitive and immersive data experiences.

Microsoft Fabric is an end-to-end analytics solution with full-service capabilities including data movement, data lakes, data engineering, data integration, data science, real-time analytics, and business intelligence, all backed by a shared platform providing robust data security, governance, and compliance.

Your organization no longer needs to stitch together individual analytics services from multiple vendors. Instead, use a streamlined solution that's easy to connect, onboard, and operate.

When I first learned about Microsoft Fabric, I was impressed by its promise to reshape how everyone accesses, manages, and acts on data and insights. In the past, it was a challenge to connect every data source and analytics service together; now, it's all possible on a single, AI-powered platform.

For instance, imagine having a data estate sprawled across different sources and platforms. As a data engineer, it's a daunting task to connect and curate data from these different sources.

One of the biggest hurdles in data analysis is managing AI models. But with Microsoft

The rest is here:

Introducing Microsoft Fabric: Will Power BI Be Replaced in 2024? - DataDrivenInvestor

The Future of Data Engineering in an AI-Driven Landscape – CXOToday.com

By Jeff Hollan

Jeff Hollan, Director of Product Management, Snowflake, highlights the anticipated developments in 2024 as artificial intelligence becomes integrated into the business operations

Data engineering will evolve and be highly valued in an AI world.

There's been a lot of chatter that the AI revolution will replace the role of data engineers. That's not the case, and in fact their data expertise will be more critical than ever, just in new and different ways. To keep up with the evolving landscape, data engineers will need to understand how generative AI adds value. The data pipelines built and managed by data engineers will be perhaps the first place to connect with large language models for organizations to unlock value. Data engineers will be the ones who understand how to consume a model and plug it into a data pipeline to automate the extraction of value. They will also be expected to oversee and understand the AI work.

Data scientists will have more fun.

Just as cloud infrastructure forced IT organizations to learn new skill sets by moving from builders of infrastructure and software to managers of third-party infrastructure and software vendors, data science leaders will have to learn to work with external vendors. It will be an increasingly important skill to be able to pick the right vendors of AI models to engage with, similar to how data scientists today choose which frameworks to use for specific use cases. The data scientist of tomorrow might be responsible for identifying the right vendors of AI models to engage with, determining how to feed the right context into a large language model (LLM), minimizing hallucinations, or prompting LLMs to answer questions correctly through context and formalizing metadata. These are all new and exciting challenges that will keep data scientists engaged and hopefully inspire the next generation to get into the profession.

BI analysts will have to uplevel.

Today, business intelligence analysts generally create and present canned reports. When executives have follow-up questions, the analysts then have to run a new query to generate a supplemental report. In the coming year, executives will expect to interact directly with data summarized in that overview report using natural language. This self-service will free up analysts to work on deeper questions, bringing their own expertise to what the organization really should be analyzing, and ultimately upleveling their role to solve some of the challenges AI cant.

(The author is Jeff Hollan, Director of Product Management, Snowflake, and the views expressed in this article are his own.)

Follow this link:

The Future of Data Engineering in an AI-Driven Landscape - CXOToday.com