Category Archives: Data Science

Microsoft DataNext: ‘Putting AI Innovation into Action’ ended on a successful note – Analytics India Magazine

The AI-focused virtual summit, DataNext: Putting AI Innovation into Action, successfully concluded on March 22, 2022. The event saw the participation of over 1,000 industry stakeholders across businesses, more than 40 speakers, and over 15 immersive demos and industry-specific solutions to improve business outcomes powered by data, analytics and AI.

These sessions gave in-depth insight into some of the best practices and megatrends, curated with exclusive live showcases of technological innovations, immersive demos, hands-on labs, and more. Some of the speakers and delegates who participated in the day-long digital event included Gramener CEO Anand S; Flipkart SVP and Head of Analytics and Applied Sciences Ravi Vijayaraghavan; Microsoft Asia GM of Data and AI Renee Lo; and Microsoft India CTO Dahnesh Dilkhush, among others.

At the opening session, Anand S from Gramener, who spoke on "Natural intelligence: Teaching humans to use data," highlighted the importance of telling stories with data and how businesses can leverage this to present their data vividly.

"Don't share the analysis. Don't share the data. Instead, share the insights as complete stories," said Anand S, adding that the real effectiveness of clear storytelling comes with practice.

Following that, in an exclusive spotlight session, Flipkart SVP Vijayaraghavan spoke on "Analytics and Data Science for E-commerce: Overview and Use Cases on Customer Experience." In one of the most insightful sessions of the Microsoft DataNext event, he explained how the Indian e-commerce giant uses analytics and data science techniques to solve multiple business problems and enhance the customer experience in real time. The session predominantly focused on applications of consumer modelling, leveraging various AI/ML techniques and statistical methods to drive a better experience for its customers.

This was followed by a keynote presentation from Microsoft Asia GM of Data and AI Renee Lo, who spoke on "Future of Decisions: A Chief Digital Officer's Top of Mind." Here, she addressed some of the underlying factors driving digital transformation today and how business leaders and CDOs can navigate through ambiguity.

She said that businesses should look at the entire landscape and not just a data platform. Further, she emphasised that businesses should look at their people, operating model and executive strategy, to ultimately work on big business use cases, impacting the entire organisation.

The one-day Microsoft DataNext event had exciting industry-focused roundtable discussions, deep-dive tech workshops and solution demos across diverse topics and formats.

The keynote speech was followed by a roundtable discussion on how CDOs can enable a data-driven culture within organisations. The session was moderated by Biswajit Das, Director at KPMG, and featured other industry leaders, including Microsoft India CTO of Azure Dahnesh Dilkhush; Innominds' Meduri Ravi Kumar; Happiest Minds Technologies' Ajay Agarwal; eClerx's Gokulraj Perumal; Capillary Technologies' Gaurav Mehta; CMS Info Systems' Rohit Kilam; and Cigniti Technologies' Rajesh Pawar.

Other tech roundtable discussions included:

Besides these tech roundtable discussions, Microsoft DataNext had a series of workshops around data modernisation, cloud-scale analytics, and Azure AI from leading solution experts across Asia, Australia and India, comprising topics such as

The Microsoft DataNext summit was curated in collaboration with Analytics India Magazine. The event aimed to help business and technology leaders understand and unlock the next wave of data and AI solutions for accelerated growth.


Confiz Data Summit sets the stage to celebrate technological excellence, and promote innovation in tech – TechJuice

In light of ever-growing technological innovation and the rising growth opportunities in the field of data science, an industry-leading software house headquartered in Pakistan, Confiz, decided to establish a globally recognized knowledge-sharing platform called the Confiz Data Summit. Hosted on 26th March 2022, the event is believed to be Pakistan's first-ever data and machine learning summit, organized to further Confiz's agenda of inspiring ideas and innovation in the data science community as the global big data market is set to reach $234.6 billion by 2026.

"At Confiz, we believe that hard work, innovation, and knowledge sharing are the key factors for technological advancement. The Confiz Data Summit is an effort to promote and propagate our values for students, professionals, and educators to bring into practice our efforts for a better and more knowledgeable data community in Pakistan," said Faheem Jabbar, keynote speaker at the Summit and Senior Data Architect at Confiz.

The event, the first of its kind, was planned in collaboration with the University of Management and Technology (UMT) and held on 26th March 2022. It was welcomed by the 300+ online registrants and 150+ on-site attendees as an opportunity to learn, share, and celebrate the latest trends and technologies shaping the future of data.

The speakers covered modern-day technology applications, unique business use cases, and the fast-expanding global technology landscape. The talks shared insights on vision technology, demand forecasting, the future of artificial intelligence and machine learning, data ingestion, and prominent data management techniques.

"Our goal for today is to dive deep into the intricate world of data science, discuss the recent developments in this fast-expanding industry, and dissect modern problem-solving techniques as crucial to leveraging data innovation," said Aitezaz Sheikh, Vice President of Engineering at Confiz.

Prominent speakers at the event included:

Syed Fahad Khalid – Lead Product Manager AI, Meta
Muhammad Fahad Bhatti – Specialist Data Governance Trainer
Nauman Bashir – Sr. Software Architect, Confiz
Naufil Hassan – Senior Machine Learning Engineer, Confiz
Jawwad Mansoor – Principal Data Scientist, Confiz
Dr. Fazeel Abid – Assistant Professor, UMT
Usman Barkat – Chief Innovation Officer, Algo

The event also served as a networking platform for those in attendance, including seasoned IT professionals, software developers, machine learning experts, data scientists, and university professors, who shared a common passion for revolutionary change-enablement in the field of data science.

The Confiz Data Summit proved to be a success, fueled by an overwhelming response from students and industry experts globally, and has set the stage for similar thought-leadership initiatives by other industry leaders in the technology landscape. Confiz has committed to carrying on the initiative annually to invest in a culture of knowledge sharing, collaborate with industry professionals, and support future data enthusiasts globally.

About Confiz

Confiz is a global technology service and solutions company, catering to small, medium, and large enterprises, with multiple Fortune 100 customers in retail, CPG, manufacturing, and other verticals. The company was founded in 2005 and has grown to a workforce of more than 600 team members working from offices in the US, Europe, the Middle East, and South Asia.

Its expertise areas include cloud, data analytics, business intelligence and footfall analytics solutions, Microsoft Power Platform, Microsoft Dynamics 365 ERP and CRM, and bespoke development and consulting services. Over the years, the organization has excelled through seasoned leadership, a highly qualified and efficient workforce, and a strong uncompromising focus on quality deliverables.


5 Ways to Scale AI Projects and Adopt Artificial Intelligence in Your Company – Analytics Insight

In the past few years, the tech sector has been in love with artificial intelligence (AI). With applications ranging from high-end data science to automated customer service, this technology is appearing all across the enterprise. The key to successfully scaling an AI project is identifying which challenges you will face along the way and how to solve them. Here is how you can accelerate AI adoption and scale AI projects correctly:

Organizations will have to extend the number of data sources and collect different types of data. The more diverse data sources are, the more depth AI-based algorithms will have and the better they will perform. Make sure to evaluate the authenticity and accuracy of each data source before feeding its data to AI-based models.

Developing a team is important for the success of an AI project. Once you have a team, you need to provide them with the right training, create an AI strategy, and establish internal and external customer communication channels.

Completing AI projects, or scaling them, is not easy. Finding individual data experts, data security analysts, machine learning engineers, and other specialists is difficult. And since AI-based algorithms are resource-intensive, there is a need for dedicated compute infrastructure such as a dedicated server.

To complete an AI project successfully, first find the best use case and partner with business leaders. Teams will also have to engage a broader ecosystem to get valuable insights, technology, and talent. Set clear goals and milestones to keep your team focused; otherwise, your AI projects can easily get derailed.

AI and ML models are only as good as the quality of the data you feed them. If you feed AI and machine learning models high-quality data, they will perform well. Once the data is free of inconsistencies and issues, ML and AI-based models can work flawlessly and deliver the desired results.
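A first pass at checking data quality can be automated with simple counts of missing values and duplicate records before any model training. The sketch below is a minimal illustration with invented field names; real pipelines would typically use a library such as pandas for the same checks.

```python
def quality_report(rows, required_fields):
    """Count missing values and duplicate records before training."""
    seen = set()
    report = {"missing": 0, "duplicates": 0, "total": len(rows)}
    for row in rows:
        # A record is "missing" if any required field is absent or None.
        if any(row.get(f) is None for f in required_fields):
            report["missing"] += 1
        # A record is a duplicate if its required fields repeat exactly.
        key = tuple(row.get(f) for f in required_fields)
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
    return report

# Hypothetical records with one missing value and one exact duplicate.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},
    {"age": 34, "income": 52000},
]
print(quality_report(rows, ["age", "income"]))
# -> {'missing': 1, 'duplicates': 1, 'total': 3}
```

A report like this makes the "inconsistencies and issues" in the paragraph above measurable, so they can be fixed before the model ever sees the data.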



Analytics Insight is an influential platform dedicated to insights, trends, and opinions from the world of data-driven technologies. It monitors developments, recognition, and achievements made by Artificial Intelligence, Big Data and Analytics companies across the globe.


Art and science come together in OSU professor’s exhibit at The Little Gallery – Orange Media Network

Zeva Rosenbaum, Photographer

Professor Jerri Bartholomew stands next to her art installations "Salmon" and "There Will Be Good Years: 2009-2021" (lower) at The Little Gallery's exhibition of her work, titled "Abstracted: Where Science Meets Art and Music," on March 4. This piece and others incorporate data from her microbiology work into art pieces created by herself and collaborating artists.

Jerri Bartholomew has combined her glass art and extensive microbiology work in a new exhibit, "Abstracted: Where Science Meets Art and Music," at The Little Gallery in Kidder Hall through April 8.

Bartholomew, a microbiology professor at Oregon State University, discovered her passion for microbiology, specifically in the field of fish-related diseases, during her graduate studies at OSU. Forty-one years later, she's still here and has worked across various roles in the microbiology department. Bartholomew is also director of the John L. Fryer Aquatic Health Laboratory.

According to Bartholomew, she's always had an interest in art. She said that, even though she chose science as a career, art has always been an important aspect of her life. She took glass art classes toward the end of her graduate studies and became very interested in it, describing glass art as a field in itself.

The inspiration behind her art, Bartholomew said, is her science. She said she's been creating scientifically based art pieces for over 20 years, and while she was on sabbatical in 2021, one of her goals was to explore the interconnectivity of art and science and find ways to get people interested in the science they do.

The art-sci network on the OSU Corvallis, Ore., campus is a community Bartholomew wants to draw more attention to, including its Seminarium group, intended for students who want to explore the connections between art and science.

Bartholomew said there are a lot of great art-sci connections at OSU that, despite having no current formal structure, are a strength to keep in mind.

The Little Gallery and Helen Wilhelm, the gallery curator, have featured other exhibits involving the art-sci community in the past. Bartholomew said The Little Gallery is a nice, personal space among other lesser-known gems on campus, and she hopes they'll have more exposure in the future.

According to Bartholomew, the title of the exhibit, "Abstracted," is inspired by the first part of a scientific manuscript: the abstract. She said this part of the paper distills the entire project and is the part most likely to be read. She said she wanted to see how far she could take abstraction in the artistic sense while still making the viewer curious about the science itself.

Bartholomew said she combines science and art in a few different ways. One of the pieces in the show includes data graphs taken from one of her research projects on the Klamath River, which she then incorporated into glass. She said art helped her to ask further questions of her own science that she may not have asked otherwise.

Another way she combined technology and art was by collaborating with musicians, Bartholomew said.

Jason Fick, OSU assistant professor of Music Technology, coordinator of Music Technology and Production, and president of the College Music Society, Northwest Region, sonified Bartholomew's data into "Murky Waters," a 16-minute digital musical piece included in the exhibit.

Fick said he and Bartholomew have been working together on this data for about two years after meeting in 2016. Fick said he created a software platform to read data and map it to sound behaviors in real-time. The data includes waterborne spore density, water flow and temperature.
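Parameter-mapping sonification of the kind described here rescales each data stream into a musical parameter range. The snippet below is a hypothetical, minimal illustration of that general technique, not Fick's actual software; the readings, ranges, and parameter names are invented.

```python
def linmap(x, lo, hi, out_lo, out_hi):
    """Linearly rescale x from [lo, hi] into [out_lo, out_hi]."""
    t = (x - lo) / (hi - lo)
    return out_lo + t * (out_hi - out_lo)

# Hypothetical readings: (spore density, water flow, temperature).
samples = [(120.0, 3.1, 14.0), (480.0, 2.4, 17.5)]

# Map each reading to a sound event: density drives pitch, flow drives
# loudness, and temperature drives filter brightness.
events = [
    {
        "pitch_hz": linmap(density, 0, 500, 110, 880),
        "amplitude": linmap(flow, 0, 5, 0.0, 1.0),
        "filter_hz": linmap(temp, 5, 25, 200, 4000),
    }
    for density, flow, temp in samples
]
```

In a real-time system like the one Fick describes, the same mapping would run continuously as new sensor values arrive, with the resulting events sent to a synthesizer.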

According to Bartholomew, she worked with Andrew Myers, instructor of Art at OSU, and Dana Reason, OSU assistant professor of Contemporary Music and musician, on a piece concerning the developmental cycles of parasites, titled "Microdestruction: Using Art and Music to Understand Parasite Development." Myers created a live drawing from the data while Reason composed a musical piece to tell its story.

Reason, who is also the coordinator of Popular Music Studies for the Ecampus and Corvallis campuses, said she hasn't seen the full show yet, but she's excited to share and celebrate this exhibit.

Prior to working on this show, Myers and Bartholomew co-taught a course titled "Art of the Microbiome," which combined art and microbiology, and both serve as advisors for Seminarium, an OSU student club that examines the intersection of art and science.

"In this project I am giving my visual interpretation using traditional drawing media, in collaboration with Dana Reason's performance, of the life cycle of the parasite central to Jerri's research," Myers said. "Collaborations, specifically interdisciplinary collaborations, are extremely interesting to me, and this one has been very rewarding."

Bartholomew said art and science don't have to be separate, and indeed were not until the past 150 years or so.

According to Bartholomew, next year she'll be working on the Klamath River dam removal, and she will be incorporating aspects of that into her art next winter.


Clemson data science and analytics master's program ranked among top in the nation – Clemson News

March 2, 2022

Fortune Education has ranked Clemson University's online Master of Science in data science and analytics (DSA) program as one of the best in the country.

Clemson's program, a collaboration between the College of Science and the Wilbur O. and Ann Powers College of Business, ranks 14th in Fortune Education's first-ever online ranking of data science graduate programs.

"There are few programs in the world that two departments develop," said Ellen Breazel, a senior lecturer in the School of Mathematical and Statistical Sciences and the co-coordinator of the program. "There are degrees that have both business and statistics, but the management department or the statistics/math department typically teaches them. Clemson's willingness to form a degree program that has equal shares in two departments means our students are getting the experts in both fields."

The demand for data scientists has grown exponentially, and there's no sign of it slowing down. The U.S. Bureau of Labor Statistics projects that data science-related jobs will grow by 28 percent through 2026.

Fortune Education's ranking considered selectivity and demand. A program's selectivity score factored in the average undergraduate GPA of incoming students, those students' average years of work experience, and the program's acceptance rate. The demand score measured the program's total enrollment and the number of applicants for the most recent year.
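Fortune does not publish its exact formula, but inputs like these are typically normalized and combined into weighted scores. The sketch below is purely hypothetical; the weights and normalization ranges are invented for illustration.

```python
def selectivity_score(avg_gpa, avg_years_experience, acceptance_rate):
    """Combine the three selectivity inputs into one score in [0, 1]."""
    gpa = avg_gpa / 4.0                           # normalize against a 4.0 scale
    exp = min(avg_years_experience / 10.0, 1.0)   # cap at an assumed 10 years
    selectivity = 1.0 - acceptance_rate           # lower acceptance -> more selective
    return 0.4 * gpa + 0.3 * exp + 0.3 * selectivity

def demand_score(enrollment, applicants, max_enrollment=500, max_applicants=2000):
    """Combine enrollment size and applicant count into one score in [0, 1]."""
    return 0.5 * min(enrollment / max_enrollment, 1.0) \
         + 0.5 * min(applicants / max_applicants, 1.0)

print(selectivity_score(avg_gpa=3.5, avg_years_experience=4, acceptance_rate=0.35))
# ≈ 0.665
```

The point of such a scheme is that each raw input lands on a common 0-to-1 scale before weighting, so a GPA, a year count, and a percentage can be compared at all.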

Clemsons first DSA cohort started coursework in 2020.

"Many programs speak of cohorts, but the reality is there's a limited difference," said Russ Purvis, professor of management and the DSA program co-director. "Our participants are very keen on the importance of such an approach. Some people may think this would not be important for an online program. However, it is quite the contrary. The technical stretches of the program demand students to lean on each other. This and other emotional intelligence skill sets are essential for the workplace and designed to be needed within the program to succeed."

There are currently 51 students in the program.

Seven students graduated last December. An additional 28 students are on track to graduate in 2022.

Craig Fick is one of the program's first graduates. The New Smyrna Beach, Florida, resident worked as a departmental director at a hospital in South Carolina. Some of his job responsibilities aligned with those of a data analyst, such as creating weekly reports and tracking and improving key performance indicators.

"I liked doing those tasks, and I knew there had to be easier ways of doing them. Ultimately, this led me to the data science program at Clemson," he said. "The program gave me a well-rounded understanding of the data science and analytics world. Coming from a nearly non-technical background, this was very important. The advanced coursework concepts that are covered, I would have had a rather hard time comprehending outside of the program."

The degree comprises 10 courses, five in mathematical and statistical sciences and five in management. There are no required prerequisites, but some background in quantitative reasoning through coursework or work experience is recommended.

Breazel and Purvis said more and more businesses realize the value of big data.

"Data-driven decision-making is powerful. Research shows that organizations using business analytics provide strategic planning with information useful in dealing with dynamic environments," Purvis said. "Business analytics is also useful when integrated into performance management systems."

For more information, visit the program's website or email msdsa@clemson.edu.


Winter Collaboratorium to Feature Research from Across the Data Science Institute, Biological Sciences Division, and Pritzker School of Molecular…

Published on Thursday, March 3, 2022

Students file into the Fall 2021 Collaboratorium at the Harper Center. (Photo credit: eClarke Photo)

The Collaboratorium unites University of Chicago students with researchers and faculty who are exploring commercialization opportunities for their work.

The program provides the opportunity for scientists and researchers who want to explore commercialization opportunities to showcase their work and network with students and alumni who may be interested in connecting to pursue further academic study, market research, a business partnership, or participation in an academic competition, such as the New Venture Challenge.

The event will be held in person at the Chicago Booth School of Business Harper Center in Hyde Park with a livestream option available.

>> Register for the Collaboratorium pitch and networking event here.

"The Collaboratorium connects UChicago community members from across campus in ways that they might not otherwise interact," explained Ellen Zatkowski, Polsky Center assistant director and manager of the Collaboratorium. "These connections between world-class scientists and talented students foster strong collaborations that have generated enormous impact by bringing cutting-edge technologies to the wider world."

The teams presenting include:

// Questions? Contact Ellen Zatkowski at ellen.zatkowski@chicagobooth.edu.


Analytics and Data Science News for the Week of March 4; Updates from Alteryx, Grow Inc., Stardog, and More – Solutions Review

The editors at Solutions Review have curated this list of the most noteworthy analytics and data science news items for the week of March 4, 2022. In this week's roundup: product news from Alteryx and Stardog, and Grow Inc. gets acquired by Epicor.

Keeping tabs on all the most relevant analytics and data science news can be a time-consuming task. As a result, our editorial team aims to provide a summary of the top headlines from the last week in this space. Solutions Review editors will curate vendor product news, mergers and acquisitions, venture capital funding, talent acquisition, and other noteworthy analytics and data science news items.

The availability of new Alteryx solutions in the cloud means users only need a browser to gain access to insights, with a setup that can be done in minutes. The company is addressing this market need by offering Alteryx Designer Cloud, Alteryx Machine Learning, Alteryx Auto Insights, and Trifacta Data Engineering Cloud in one unified suite, the Alteryx Analytics Cloud.

Read on for more.

Grow offers a no-code, full-stack business intelligence and data visualization tool. The product features data integration capabilities that enable users to connect, store, and blend data from hundreds of data sources. Grow then provides the ability to marry and transform disparate data sources so you can filter, slice, and explore different visualizations. The built-in data explorer defines how you want to navigate data via charts and graphs which are displayed in metrics and dashboards.


Stardog Designer is a no-code, visual environment for creating knowledge graphs, which helps data and analytics teams easily apply knowledge graph technology in their work. The tool, along with related innovations, completes the company's user experience for data and analytics teams so they can connect to data lakes, visually create semantic data models, and prepare and map source metadata to semantic models.


Voltron Data is one of the most significant contributors to Arrow. Arrow is a multi-language toolbox for accelerated data interchange and in-memory computing. The Voltron Enterprise Subscription for Arrow is tailored to organizations building and running applications that depend on Arrow. The service offers on-demand assistance from Arrow developers, simplified issue reporting, and direct access to leaders in the project.


Businesses can now use Cape Privacy to employ encryption-in-use, securely operationalize their most highly classified data, and run predictive machine learning models on encrypted data stored in private clouds or in a third-party data cloud. As a self-service, enterprise-grade platform, Cape Privacy empowers businesses to run as many data models as needed to gain the best possible insights.


For consideration in future data analytics news roundups, send your announcements to tking@solutionsreview.com.

Tim is Solutions Review's Editorial Director and leads coverage on big data, business intelligence, and data analytics. A 2017 and 2018 Most Influential Business Journalist and 2021 "Who's Who" in data management and data integration, Tim is a recognized influencer and thought leader in enterprise business software. Reach him via tking at solutionsreview dot com.


Welcome to the Age of the Engineer-Data Scientist | Transforming Data with Intelligence – TDWI


The growing enthusiasm for a new hybrid role raises significant questions. We answer them here.

The typical product development/simulation engineering team now enjoys access to a wealth of data that can and should be informing their product design and manufacturing processes. However, finding critical insight within these vast reservoirs of information is another matter. New skill sets are urgently needed. Specifically, engineers must be able to harness artificial intelligence (AI) and machine learning (ML) to support and accelerate better decision-making.

This fundamental shift is illustrated by the emergence of a new hybrid role: the engineer-data scientist. What's more, the success of these multi-skilled pioneers will be crucial to the future of the enterprises that are recruiting and training them. Ultimately, engineer-data scientists must shoulder the task of turning the undisputed potential of AI and ML into faster time to market and more efficient products that perform better for customers and end users.

It's a big ask, and the growing enthusiasm for the new role raises significant questions. Are engineers really the best people to pick up the data science baton? If they are, what skills do they need? On a more practical level, how can they acquire the mindset and capabilities of data scientists? What are the implications for organizations?

Gaining Traction

In engineering and beyond, the data science revolution is gaining traction. A recent PwC survey reports that 86 percent of respondents describe AI as a mainstream technology within their organization. However, in many respects, we have only scratched the surface. A Capgemini report reveals that by deploying AI at scale, automotive OEMs could increase profitability by 16 percent. There is frustration, too. The aforementioned PwC survey also notes that 76 percent of organizations are barely breaking even on AI.

In the search for better return on investment, the creation of the engineer-data scientist is a significant landmark. It reflects growing recognition that solutions should be driven by domain expertise. In other words, the people with granular understanding of the metadata and engineering challenges are the best people to apply the tools that will uncover insight and can thus navigate the best route forward.

Are engineers a good fit for the role? There are convincing arguments in their favor. To start with, although the impact of AI and ML will be revolutionary, it also represents an evolution from what has come before. There are clear parallels with the principles of established engineering techniques such as experiment design, as well as modern simulation and optimization tools. Across every discipline and sector, engineers are comfortable working with simulation, analytical modeling, and statistics.

A Small Step

Of course, the scale and speed at which AI and ML work (and their unprecedented ability to embed continual learning) are revolutionary. At the same time, given their existing capabilities, most engineers will find that embracing data science is a small step rather than a giant leap. By nature, engineers are curious and thrive on solving problems. Moreover, they do not work in the world of pure science. Their focus is on delivering commercially viable solutions. Ultimately, engineers are motivated by a practical desire to build something better. Instinctively, they will be drawn to tools that can help achieve this goal.

Engineers adopting data science are greatly helped by the latest low-code and no-code AI and ML tools. Democratization is hard at work, and the workflows are increasingly familiar and intuitive. However, prospective engineer-data scientists still need to develop new skills. The encouraging news from universities is that data science is an increasingly popular option in engineering courses. Given the urgency of the requirement, we will also see plenty of on-the-job training for more experienced engineers.

Acquiring Skills

"Where do I start?" is probably the most common question we hear from aspiring engineer-data scientists. The short answer is: with algorithms. AI and ML are essentially about matching and applying the right algorithm to the right problem. It is extremely unlikely that an engineer-data scientist will have to take on the job of actually writing these algorithms.
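The "match the right algorithm to the right problem" idea can be illustrated with a toy model-selection loop: fit several candidate models and keep whichever performs best on held-out data. The models and data below are deliberately simple and invented; in practice an engineer would lean on a library such as scikit-learn rather than hand-rolled fits.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def mse(predict, xs, ys):
    """Mean squared error of a fitted model on held-out data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy, roughly linear data (invented for illustration).
train_x, train_y = [0, 1, 2, 3], [0.1, 1.9, 4.1, 5.9]
test_x, test_y = [4, 5], [8.0, 10.1]

candidates = {"mean": fit_mean, "line": fit_line}
best = min(candidates,
           key=lambda name: mse(candidates[name](train_x, train_y), test_x, test_y))
print(best)  # -> line
```

The engineer's job is choosing which candidates to put in the dictionary and judging the result, not implementing the fitting routines themselves.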

Beyond this, engineers should be encouraged to get involved in projects where they will use AI and ML. From here, we can be confident that their inclination toward hands-on learning is an ideal springboard for a new wave of data science advocates.

This largely organic career path means that engineer-data scientists are always likely to prove an easy fit with their colleagues and organizations in general. Specialist data scientists will remain an important piece of the puzzle, scaling solutions developed by domain experts and building the necessary infrastructures. The difference made by the engineer-data scientist will be seen in design and manufacturing outcomes, not corporate restructures.

If further evidence were needed of the suitability of engineers to take on these new responsibilities, it can be found in a data science sector that is recruiting engineers to fill its own skills gap. Hopefully, the engineer-data scientist role will mitigate the risk of a brain drain. For anyone interested in smarter, more sustainable products, we need our engineers to keep on engineering. We also need them to turn their talents and attention to getting the very best from what data science offers.

About the Author

Brett Chouinard is the chief product and strategy officer at Altair, where he is responsible for the strategy and vision of Altair products, which includes facilitating the development, sales, and delivery of Altair's solutions. You can reach the author on LinkedIn.


Women Data Scientists of the World, Unite! – Ms. Magazine

The Women in Data Science Conference was created to help women achieve better representation in data science; by 2030, organizers want 30 percent of all data scientists to be women. The conference this year will be held on March 7, 2022, the day before International Women's Day. (Courtesy of Women in Data Science Conference)

The Women in Data Science Conference (WiDS) was born of a problem: How can we remove the barriers to success that traditionally bar women from accessing the increasingly critical field of data science?

WiDS co-founder Professor Margot Gerritsen is no stranger to this problem. Gerritsen, who received her Ph.D. in scientific computing and computational mathematics at Stanford University, recalls that as a woman and an international student pursuing a degree in computational science nearly three decades ago, there were few people she felt closely connected to, and fewer still who understood the challenges she faced in scientific fields. "You can't be what you can't see" wasn't yet a slogan, but Gerritsen knew she wanted to help break down the barriers she had faced in the field so that other women would not have to overcome the same obstacles.

Along with co-founders Karen Matthys and Esteban Arcaute, Gerritsen set out to help diversify data science. Their vision of an inclusive future for data science lies at the core of WiDS's mission.

The current field of data science, how data is collected and used, as well as who is allowed to collect and use it, is extremely limited. Because most data scientists are white men, the kind of data collected and how that data is analyzed often leave out important groups of people, including women, people of color, Indigenous peoples, LGBTQ+ people and more.

Gerritsen points out that these gaps in data science can be quite dangerous since limited perspectives and incomplete and biased sets of data are being used to make decisions that will affect everyone. To Professor Gerritsen, having a diverse group of data scientists at the decision table is vital to creating equitable solutions to the problems we face today.

Diversifying data science also allows people from all groups to access what Professor Gerritsen refers to as "the new oil" in an evolving economic world: data. Data is a resource that, like oil and gold, gives economic and political power to those who possess it. Gerritsen believes that diversifying data science ensures that people from all backgrounds can access this growing route to power, not just the Elon Musks and Jeff Bezoses of the world.

Because most data scientists are white men, the kind of data collected and how that data is analyzed often leaves out important groups of people.

Diversifying the field, according to Gerritsen, means identifying and removing traditional barriers to entry. WiDS models how to do this by focusing on accessibility. Rather than charging expensive entry fees, WiDS livestreams the conference and provides all WiDS programming for free, so those who cannot afford to participate in person are still able to access important information. For Gerritsen, who balanced being a single mother and a full-time worker early in her career, ensuring that women with diverse needs can access WiDS resources is of paramount importance.

Gerritsen also recognizes that a conference in the United States, representing the perspectives of U.S.-based data scientists, could not address the problems faced by women in data science in other regions of the world. To avoid this potential disconnect, WiDS created local conferences and programming in collaboration with local data scientists around the world, ensuring that women everywhere can access resources that speak to their community's particular needs. At the same time, this model reduces the negative financial and environmental consequences of conference travel.

Of course, many other barriers have been constructed around data science, which Gerritsen is helping to dismantle. WiDS provides mentorships to women in the field who previously have not been able to learn from data scientists who look like them. Gerritsen cites studies showing that for people to feel like they belong in a specific field, about 30 percent of the people in that field need to resemble them. This has inspired WiDS's initiative "30 by 30," a project aiming to have 30 percent of people in data science be women by the year 2030.

To Gerritsen, ensuring women can see other women in the field will help them destroy the myth that data science is a field exclusively for men. In constructing the conference and programming around accessibility, WiDS has turned what could have been another expensive and exclusive 20,000-person conference into a network of women working together to find solutions to the problems they face.

WiDS's goal of addressing the unique needs of women of all backgrounds manifests itself in one of the conference's most notable events: the two-month-long Datathon. Every year, WiDS challenges people from all experience levels and fields to work collaboratively with data to solve a problem facing the world.

This year, the Datathon challenged its participants to create solutions to climate change that center on energy efficiency. Working in mixed-gendered teams, participants use their unique backgrounds and experiences to contribute to the efforts of the whole WiDS community.

Gerritsen says Datathon inspires creative and collaborative solutions and creates interest in the field for women of all ages. WiDS provides a model for how to create an international coalition of women in data science, working regionally and nationally to diversify the field and solve some of the most urgent problems of our time.

The Women in Data Science Conference will broadcast live from Stanford University on March 7, 2022, from 8 a.m. to 5 p.m. PT, the day before International Women's Day. Tune in to the WiDS Worldwide Livestream throughout the day on March 7 to watch keynotes, tech talks, panel discussions and meet-the-speaker interviews.


More:

Women Data Scientists of the World, Unite! - Ms. Magazine

U of T expert on human-centered data science and the problem with the motto ‘move fast and break things’ – University of Toronto

"Move fast and break things" has become a cliché in entrepreneurship and computer science circles.

But Shion Guha, an assistant professor at the University of Toronto's Faculty of Information and a faculty affiliate at the Schwartz Reisman Institute for Technology and Society, says the motto, which was once Facebook's internal credo, is a bad fit for the technology sector since algorithms are susceptible to biases that can affect human lives. Instead, Guha advocates a human-centered approach to data science that prioritizes the best outcomes for people.

"I believe in worlds where data-driven decision-making has positive outcomes, but I don't believe in a world where we do this uncritically," he said. "I don't believe in a world where you just throw stuff at the wall and see what sticks, because that hasn't worked out at all."

Guha, the co-author of a new textbook on human-centered data science, spoke to the Schwartz Reisman Institute's Daniel Browne about the need for a more deliberate and compassionate approach to data science.

Can you tell us about your background?

My academic background is primarily in statistics and machine learning. I graduated with my PhD from Cornell in 2016, and then was an assistant professor at Marquette University for five years before joining the Faculty of Information last year. U of T is one of the first universities in the world to launch an academic program in human-centered data science, so I was nudged to apply.

My co-authors on the book [Human-Centered Data Science: An Introduction, MIT Press, March 2022] and I are some of the first people to have talked about the concept of human-centered data science, in a workshop at one of our main conferences in 2016. We decided to write a textbook about the field because we felt there was a missing link between what is taught in the classroom and what happens in practice. In the last few years, the field has talked a lot about algorithmic biases and unforeseen consequences of technology on society. And so, we decided that instead of writing an academic monograph, we wanted to write a practical textbook for students.

What does it mean for data science to be human-centered, and how does this approach differ from other methodologies?

The main idea is to incorporate human-centered design practices into data science to develop human-centered algorithms. Human-centered design is not a new thing; it's something that has been talked about a lot in the fields of design, human-computer interaction and so on. But those fields have always been a little divorced from AI, machine learning and data science.

Now, with the advent of this tremendous growth in data science jobs came all of these criticisms around algorithmic bias, which raises the question of whether we are training students properly. Are we teaching them to be cognizant of potential critical issues down the line? Are we teaching them how to examine a system critically? Most computer scientists tend to adopt a very positivist approach. But the fact is that we need multiple approaches, and human-centered data science encourages these practices. Right now, a lot of data science is very model-centered: the conversation is always around what model can most accurately predict something. Instead, the conversation should be: What can we do so that people have the best outcomes? It's a slightly different conversation; the values are different.

Human-centered data science starts off by developing a critical understanding of the socio-technical system under investigation. So, whether it's Facebook developing a new recommendation system, or the federal government trying to decide on facial recognition policy, understanding the system critically is often the first step. And we've actually failed a generation of computer science and statistics students because we never trained them in any of this. I believe in worlds where data-driven decision-making has positive outcomes, but I don't believe in a world where we do this uncritically. I don't believe in a world where you just throw stuff at the wall and see what sticks, because that hasn't worked out at all.

Next, we engage in a human-centered design process, which can be understood through three different lenses. First, there's theoretical design: the model should be drawn from existing theory, from what we know about how people interact in a system. For instance, a lot of my work is centered around how algorithms are used to make decisions in child welfare. So, I need to ensure whatever algorithm I develop draws from the best theories about social work and child welfare.

Second, there's something called participatory design, which means inviting all the stakeholders into the process to let them interpret the model. I might not know everything about child welfare, but my models are interpreted by specialists in that area. Participatory design ensures that the people who are affected by the system make the decisions about its interpretation and design.

The third process is called speculative design, which is about thinking outside the box. Let's think about a world where this model doesn't exist, but something else exists. How do we align this model with that world? One of the best ways to describe speculative approaches is the [British TV] series Black Mirror, which depicts technologies and systems that could happen.

Human-centered design practices are about taking these three aspects and incorporating them into the design of algorithms. But we don't stop there, because you can't just put something into society without extensive testing; you need to do longitudinal field evaluation. And I'm not talking about six-week evaluations, which are common; I'm talking about six months to a year before putting something into practice. So, all of this is a more critical and slowed-down design process.

What helps you to collaborate successfully with researchers in other disciplines?

I think one of the major impediments to collaboration between disciplines, or even sub-disciplines, is the different values people have. For instance, in my work in child welfare, the government has a set of values, optimizing between spending money and ensuring kids have positive outcomes, while the people who work in the system have different values: they want each child to have a positive outcome. When I come in as the data scientist, I'm trying to make sure the model I build reconciles these values.

My success story has been in working with child welfare services in Wisconsin. When they came to us, I cautioned them that we needed to engage with each other through ongoing conversations to make something successful. We had many stakeholders: researchers in child welfare, department heads, and street-level case workers. I brought them together many times to figure out how to reconcile their values, and that was one of the hardest things that I ever did, because people talk about their objectives but don't often talk about their values. It's a hard thing to say, "OK, this is how I really believe the system should work."

We conducted workshops for about a year to understand what they needed, and what we eventually realized was that they were not interested in building an algorithm that predicted risk-based probabilities; they were interested in something else: how to make sense of narratives, such as how to describe the story of a child in the system.

If a new child comes into the system, how can we look back and consider how this child displays the same features as other historical case studies? What positive outcomes can we draw upon to ensure this new child gets the services they need? It's a very different and holistic process; it's not a number, it's not a classification model.

If I had just been given some data, I would have developed a risk-based system that would have ultimately yielded poor outcomes. But because we engaged in that difficult community building process, we figured out that what they really wanted was not what they told me they wanted. And this was because of a value mismatch.

Similarly, when I go to machine learning conferences, there's a different kind of value mismatch. People are more interested in discussing the theoretical underpinnings of models. I am interested in that, but I'm also interested in telling the story of child welfare; I'm interested in pushing that boundary. But a lot of my colleagues are not interested in that; their part of academia values optimizing quantitative models, which is fine, but then you can't claim you're doing all these big things for society if that's really what your values are.

It's interesting to note how much initial effort is required, involving a lot of development that many wouldn't necessarily consider as part of system design.

You know, the worst slogan that I've ever heard in the technology sector, even though people seem to really like it for some reason, is "move fast and break things." Maybe for product recommendations that's fine, but you don't want to do that if you've got the lives of people on the line. You can't do that. I really think we need to slow down and be critical about these things. That doesn't mean that we don't build data-driven models. It means that we do so thoughtfully, and we recognize the various risks and potential issues down the line, and how to deal with them. Not everything can be dealt with quantitatively.

Issues around algorithmic fairness have become very popular and are the hottest area of machine learning right now. The problem is that we look at this from a very positivist, quantitative perspective, by seeking to make algorithms that are mathematically fair, so that different minority groups do not have disproportionate outcomes. Well, you can prove a theorem saying that and put it into practice, but here's the problem: models are not used in isolation. If you take a mathematically fair, unbiased algorithm and put it where people are biased, the interaction between the two will make the system's outcomes biased as well.
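To make the notion of "mathematically fair" concrete, here is a minimal sketch of one common fairness criterion, demographic parity, which asks that the rate of positive predictions be (nearly) equal across groups. This example is illustrative only, not from the interview, and the data and group labels are hypothetical.

```python
def positive_rate(predictions, groups, group):
    """Fraction of members of `group` who received a positive prediction (1)."""
    members = [p for p, g in zip(predictions, groups) if g == group]
    return sum(members) / len(members)

def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rates across all groups."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)

# Toy example: binary predictions for members of two groups, "a" and "b".
preds  = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)  # 0.75 for "a" vs 0.25 for "b" -> gap 0.5
```

Guha's point is that even if a metric like this gap is driven to zero on paper, biased people interacting with the model's outputs can reintroduce the very disparity the theorem ruled out.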

Human-AI interaction is really important. We can't pretend our systems are used in isolation. Most problems happen because the algorithmic decision-making process itself is poorly understood, and how people make a particular decision from the output of an AI system is something we don't yet understand well. This creates a lot of issues, yet the field of machine learning doesn't value that. The field values mathematical solutions, except it's a solution only if you view it in the context of a reductionist framework. It has nothing to do with reality.

What are some of the challenges around the use of algorithmic decision-making?

My co-authors and I identify three key dimensions of algorithmic decision-making. One dimension is that decisions are mediated by the specific bureaucratic laws, policies, and regulations that are inherent to that system. So, there are certain things you can do, and can't do, that are mandated by law. The second dimension is very important; we call it human discretion. For example, police may see a minor offense like jaywalking but choose to selectively ignore it because they are focused on more significant crimes. So, while the law itself is rigid, inside the confines of the law there is discretion.

The same thing happens with algorithmically mediated systems, where an algorithm gives an output, but a person might choose to ignore it. A case worker might know more about a factor that the algorithm failed to pick up on. This works the other way too, where a person might be unsure and go along with an algorithmic decision because they trust the system. So, there's a spectrum of discretion.

The third aspect is algorithmic literacy. How do people make decisions from numbers? Every system gives a separate visualization or output, and an average social worker on the ground might not have the training to interpret that data. What kinds of training are we going to give people who will implement these decisions?

Now, when we take these three components together, these are the main dimensions of how people make decisions from algorithms. Our group was the first to unpack this in the case of public services, and it has major implications for AI systems going forward. For instance, how you set up the system affects what kinds of opportunities the user has for exercising discretion. Can everyone override it? Can supervisors override it? How do we look at agreements and disagreements and keep a record of that? If I have a lot of experience and think that the algorithm's decision is wrong, I might disagree. However, I might also be afraid that if I don't agree, my supervisor will punish me.
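The record-keeping Guha raises can be sketched as a small data structure. This is a hypothetical illustration, not a system described in the interview: it simply logs the algorithm's recommendation alongside the worker's final call, so that agreements, disagreements and the rationale for overrides can be audited later.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionRecord:
    """One logged decision: what the algorithm suggested vs. what a person chose."""
    case_id: str
    algorithm_recommendation: str   # e.g. "flag_for_review" (hypothetical label)
    human_decision: str             # what the worker actually decided
    rationale: str                  # free-text reason, required for overrides
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def overridden(self) -> bool:
        # True when the worker exercised discretion against the algorithm.
        return self.human_decision != self.algorithm_recommendation

def override_rate(records):
    """Share of cases where workers disagreed with the algorithm's output."""
    return sum(r.overridden for r in records) / len(records)
```

Auditing a quantity like `override_rate` over time could help reveal whether workers are rubber-stamping outputs (near 0) or routinely distrusting them (near 1), the two ends of the spectrum of discretion described above.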

Studying the algorithmic decision-making process has been crucial for us in setting up the next series of problems and research questions. One of the things that I'm very interested in is changes in policy. For example, my work in Wisconsin was utilized to make changes that had positive outcomes. But a critical drawback is that I haven't engaged with legal scholars or the family court system.

One of the things I like about SRI is that it brings together legal scholars and data scientists, and I'm interested in collaborating with legal scholars to think about how to write AI legislation that will affect algorithmic decision-making processes. I think it demands a radical rethinking of how laws are drafted. I don't think we can engage in the same process anymore; we need to think beyond that and engage in some speculative design.

What is the most important thing that people need to know about data science today, and what are the challenges that lie ahead for the discipline?

Obviously, I'm very invested in human-centered data science. I really think this process works well, and since U of T began its program, the field has expanded to other universities and is gaining momentum. I really want to bring this to the education of our professional data science students, those who are going to immediately go out into industry and start applying these principles.

Broadly, the challenges for the discipline are the problems I've alluded to, and human-centered data science responds to these issues. We should not be moving fast, and we should not be breaking things, not when it comes to making decisions about people. It doesn't have to be high stakes, like child welfare. You can imagine something like Facebook or Twitter algorithms, where ostensibly you're doing recommendation systems, but that really has ramifications for democracy. There are lots of small things that have major unintended consequences down the line, even something like classroom algorithms that predict whether a child is doing well or not.

The other main challenge is this value mismatch problem I described. We need to teach our next generation of students to be more compassionate, to encourage them to think from other perspectives, and to center other people's values and opinions without centering their own. So how do we get better? Again, human-centered design has worked very well in other areas, and we can learn what worked well and apply it here. Why should we pretend that we have nothing to learn from other areas?

See the original post here:

U of T expert on human-centered data science and the problem with the motto 'move fast and break things' - University of Toronto