
Variable Names: Why They’re a Mess and How to Clean Them Up – Built In

Quick, what does the following code do?
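Say it looks something like this (the numbers and names are purely illustrative):

    import numpy as np

    def f(xs):
        y = []
        for i in range(len(xs)):
            y.append(xs[i] / 255.0 + 0.1)
        return np.array(y)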

It's impossible to tell, right? If you were trying to modify or debug this code, you'd be at a loss unless you could read the author's mind. Even if you were the author, a few days after writing this code you might not remember what it does because of the unhelpful variable names and use of magic numbers.

Working with data science code, I often see examples like the one above (or worse): code with variable names such as X, y, xs, x1, x2, tp, tn, clf, reg, xi, yi, ii and numerous unnamed constant values. To put it frankly, data scientists (myself included) are terrible at naming variables.

As I've grown from writing research-oriented data science code for one-off analyses to production-level code (at Cortex Building Intelligence), I've had to improve my programming by unlearning practices from data science books, courses and the lab. There are significant differences between deployable machine learning code and how data scientists learn to program, but we'll start here by focusing on two common and easily fixable problems:

Unhelpful, confusing or vague variable names

Unnamed magic constant numbers

Both these problems contribute to the disconnect between data science research (or Kaggle projects) and production machine learning systems. Yes, you can get away with them in a Jupyter Notebook that runs once, but when you have mission-critical machine learning pipelines running hundreds of times per day with no errors, you have to write readable and understandable code. Fortunately, there are best practices from software engineering we data scientists can adopt, including the ones we'll cover in this article.

Note: I'm focusing on Python since it's by far the most widely used language in industry data science. Some Python-specific naming rules (see here for more details) include:

Variable and function names use lowercase letters with words separated by underscores (snake_case)

Named constants are written in all capital letters with underscores

Class names use CamelCase, with the first letter of each word capitalized


There are three basic ideas to keep in mind when naming variables:

The variable name must describe the information represented by the variable. A variable name should tell you concisely in words what the variable stands for.

Your code will be read more times than it is written. Prioritize how easy your code is to read over how quick it is to write.

Adopt standard conventions for naming so you can make one global decision in a codebase instead of multiple local decisions.

What does this look like in practice? Let's go through some improvements to variable names.

If you've seen these several hundred times, you know they commonly refer to features and targets in a data science context, but that may not be obvious to other developers reading your code. Instead, use names that describe what these variables represent, such as house_features and house_prices.
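For example, if the data happens to live in a pandas DataFrame with a price column (the layout below is invented purely for illustration):

    import pandas as pd

    data = pd.DataFrame(
        {'square_feet': [1500, 2200], 'bedrooms': [3, 4], 'price': [310_000, 450_000]}
    )

    # Opaque: what do X and y actually hold?
    X = data.drop(columns=['price'])
    y = data['price']

    # Clear: the names describe the real-world quantities
    house_features = data.drop(columns=['price'])
    house_prices = data['price']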

What does the value represent? It could stand for velocity_mph, customers_served, efficiency or revenue_total. A name such as value tells you nothing about the purpose of the variable and just creates confusion.

Even if you are only using a variable as a temporary value store, still give it a meaningful name. Perhaps it is a value where you need to convert the units, so in that case, make it explicit:
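    temperature_fahrenheit = 98.6                        # raw reading, units not yet converted (placeholder value)
    temperature_celsius = (temperature_fahrenheit - 32) * 5 / 9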

If you're using abbreviations like these, make sure you establish them ahead of time. Agree with the rest of your team on common abbreviations and write them down. Then, in code review, make sure to enforce these written standards.

Avoid machine learning-specific abbreviations. These values represent true_positives, true_negatives, false_positives and false_negatives, so make it explicit. Besides being hard to understand, the shorter variable names can be mistyped. It's too easy to use tp when you meant tn, so write out the whole description.
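For instance, when tallying up a confusion matrix (the counts below are placeholders):

    # Cryptic and easy to mix up
    tp, fp, tn, fn = 91, 4, 82, 7

    # Unambiguous
    true_positives = 91
    false_positives = 4
    true_negatives = 82
    false_negatives = 7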

The above are examples of prioritizing ease of reading code instead of how quickly you can write it. Reading, understanding, testing, modifying and debugging poorly written code takes far longer than well-written code. Overall, trying to write code faster by using shorter variable names will actually increase your program's development and debugging time! If you don't believe me, go back to some code you wrote six months ago and try to modify it. If you find yourself having to decipher your own past code, that's an indication you should be concentrating on better naming conventions.

These are often used for plotting, in which case the values represent x_coordinates and y_coordinates. However, I've seen these names used for many other tasks, so avoid the confusion by using specific names that describe the purpose of the variables, such as times and distances or temperatures and energy_in_kwh.
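For instance, when plotting with matplotlib (the readings below are made up):

    import matplotlib.pyplot as plt

    times = [0, 1, 2, 3, 4]                     # hours since midnight
    energy_in_kwh = [0.0, 1.2, 2.6, 3.9, 5.1]   # cumulative consumption

    plt.plot(times, energy_in_kwh)
    plt.xlabel('Time (hours)')
    plt.ylabel('Energy (kWh)')
    plt.show()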


Most problems with naming variables stem from:

A lingering habit of keeping variable names as short as possible

Forgetting that variables stand for real-world quantities

On the first point, while languages like Fortran did limit the length of variable names (to six characters), modern programming languages have no restrictions, so don't feel forced to use contrived abbreviations. Don't use overly long variable names either, but if you have to favor one side, aim for readability.

With regard to the second point, when you write an equation or use a model (and this is a point schools forget to emphasize), remember that the letters or inputs represent real-world values!

We write code to solve real-world problems, and we need to understand the problem our model represents.

Let's see an example that makes both mistakes. Say we have a polynomial equation for finding the price of a house from a model. You may be tempted to write the mathematical formula directly in code:
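    # price = a * x1**2 + b * x2 + c, typed straight from the formula
    a = 0.02        # placeholder coefficient
    b = 8140.0      # placeholder coefficient
    c = 15000.0     # placeholder intercept
    x1 = 1500
    x2 = 3
    t = a * x1 ** 2 + b * x2 + c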

This is code that looks like it was written by a machine for a machine. While a computer will ultimately run your code, it'll be read by humans, so write code intended for humans!

To do this, we need to think not about the formula itself (the how) but about the real-world objects being modeled (the what). Let's write out the complete equation in terms of what each quantity represents (this is a good test to see whether you understand the model):
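    area_coefficient = 0.02       # placeholder value learned by the model
    bedroom_coefficient = 8140.0  # placeholder value learned by the model
    base_price = 15000.0          # placeholder intercept

    house_area_sqft = 1500
    bedroom_count = 3

    house_price = (
        area_coefficient * house_area_sqft ** 2
        + bedroom_coefficient * bedroom_count
        + base_price
    )

It is the same arithmetic as before, but now every name points back to something in the real world: a house's predicted price is a base price plus contributions from its area and its number of bedrooms.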

If you are having trouble naming your variables, it means you don't know the model or your code well enough. We write code to solve real-world problems, and we need to understand the problem our model represents.

While a computer will ultimately run your code, it'll be read by humans, so write code intended for humans!

Descriptive variable names let you work at a higher level of abstraction than a formula, helping you focus on the problem domain.

One of the important points to remember when naming variables is: consistency counts. Staying consistent with variable names means you spend less time worrying about naming and more time solving the problem. This point is relevant when you add aggregations to variable names.

So you've got the basic idea of using descriptive names, changing xs to distances, e to efficiency and v to velocity. Now, what happens when you take the average of velocity? Should this be average_velocity, velocity_mean, or velocity_average? Following these two rules will resolve this situation:

Decide on common abbreviations: avg for average, max for maximum, std for standard deviation and so on. Make sure all team members agree and write these down. (An alternative is to avoid abbreviating aggregations.)

Put the abbreviation at the end of the name. This puts the most relevant information, the entity described by the variable, at the beginning.

Following these rules, your set of aggregated variables might be velocity_avg, distance_avg, velocity_min, and distance_max. Rule two is a matter of personal choice, and if you disagree, that's fine. Just make sure you consistently apply the rule you choose.

A tricky point comes up when you have a variable representing the number of an item. You might be tempted to use building_num, but does that refer to the total number of buildings, or the specific index of a particular building?

Staying consistent with variable names means you spend less time worrying about naming and more time solving the problem.

To avoid ambiguity, use building_count to refer to the total number of buildings and building_index to refer to a specific building. You can adapt this to other problems such as item_count and item_index. If you don't like count, then item_total is also a better choice than num. This approach resolves ambiguity and maintains the consistency of placing aggregations at the end of names.

For some unfortunate reason, typical loop variables have become i, j, and k. This may be the cause of more errors and frustration than any other practice in data science. Combine uninformative variable names with nested loops (I've seen nested loops that use ii, jj, and even iii) and you have the perfect recipe for unreadable, error-prone code. This may be controversial, but I never use i or any other single letter for loop variables, opting instead for describing what I'm iterating over, such as:
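    building_count = 10   # placeholder count
    for building_index in range(building_count):
        print(f"Processing building {building_index + 1} of {building_count}")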

or
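    row_count, column_count = 4, 6   # placeholder dimensions
    for row_index in range(row_count):
        for column_index in range(column_count):
            print(f"Visiting the cell at row {row_index}, column {column_index}")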

This is especially useful when you have nested loops so you don't have to remember if i stands for row or column or if that was j or k. You want to spend your mental resources figuring out how to create the best model, not trying to figure out the specific order of array indexes.

(In Python, if you aren't using a loop variable, then use _ as a placeholder. This way, you won't get confused about whether or not the variable is used for indexing.)
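For instance:

    # the loop variable is never used, so signal that with an underscore
    for _ in range(3):
        print("Retrying the request...")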

All of these rules stick to the principle of prioritizing read-time understandability instead of write-time convenience. Coding is primarily a method for communicating with other programmers, so give your team members some help in making sense of your computer programs.

A magic number is a constant value without a variable name. I see these used for tasks like converting units, changing time intervals or adding an offset:
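    s, t, v = 60, 30, 0.5        # placeholder inputs
    s2 = s * 1.61                # converting units
    t2 = t + 15                  # changing a time interval
    v2 = v + 0.1                 # adding an offset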

(These variable names are all bad, by the way!)

Magic numbers are a large source of errors and confusion because:

Only one person, the author, knows what they represent.

Changing the value requires looking up all the locations where it's used and manually typing in the new value.

Instead of using magic numbers in this situation, we can define a function for conversions that accepts the unconverted value and the conversion rate as parameters:
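    def convert_currency(amount, conversion_rate):
        """Convert a monetary amount using the given conversion rate."""
        return amount * conversion_rate

    revenue_usd = 1_000                                                  # placeholder amount
    revenue_aud = convert_currency(revenue_usd, conversion_rate=1.38)    # placeholder rate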

If we use the conversion rate throughout a program in many functions, we could define a named constant in a single location:
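    USD_TO_AUD_CONVERSION_RATE = 1.38   # placeholder rate, defined once for the whole program

    revenue_usd, price_usd = 1_000, 25  # placeholder amounts
    revenue_aud = revenue_usd * USD_TO_AUD_CONVERSION_RATE
    price_aud = price_usd * USD_TO_AUD_CONVERSION_RATE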

(Remember, before we start the project, we should establish with our team that usd = US dollars and aud = Australian dollars. Standards matter!)

Here's another example:
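    MINUTES_PER_OBSERVATION = 15                                # how often readings arrive (placeholder)
    OBSERVATIONS_PER_DAY = 24 * 60 // MINUTES_PER_OBSERVATION   # 96 readings per day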

Using a NAMED_CONSTANT defined in a single place makes changing the value easier and more consistent. If the conversion rate changes, you don't need to hunt through your entire codebase to change all the occurrences, because you've defined it in only one location. It also tells anyone reading your code exactly what the constant represents. A function parameter is also an acceptable solution if the name describes what the parameter represents.

As a real-world example of the perils of magic numbers, in college, I worked on a research project with building energy data that initially came in 15-minute intervals. No one gave much thought to the possibility this could change, and we wrote hundreds of functions with the magic number 15 (or 96 for the number of daily observations). This worked fine until we started getting data in five and one-minute intervals. We spent weeks changing all our functions to accept a parameter for the interval, but even so, we were still fighting errors caused by the use of magic numbers for months.


Real-world data has a habit of changing on you. Conversion rates between currencies fluctuate every minute, and hard-coding in specific values means you'll have to spend significant time re-writing your code and fixing errors. There is no place for magic in programming, even in data science.

The benefit of adopting standards is that they let you make a single global decision instead of many local ones. Instead of choosing where to put the aggregation every time you name a variable, make one decision at the start of the project and apply it consistently throughout. The objective is to spend less time on concerns only peripherally related to data science (naming, formatting, style) and more time solving important problems (like using machine learning to address climate change).

If you are used to working by yourself, it might be hard to see the benefits of adopting standards. However, even when working alone, you can practice defining your own conventions and using them consistently. You'll still get the benefits of fewer small decisions, and it's good practice for when you inevitably have to develop on a team. Anytime you have more than one programmer on a project, standards become a must!


You might disagree with some of the choices I've made in this article, and that's fine! It's more important to adopt a consistent set of standards than the exact choice of how many spaces to use or the maximum length of a variable name. The key point is to stop spending so much time on accidental difficulties and instead concentrate on the essential difficulties. (Fred Brooks, author of the software engineering classic The Mythical Man-Month, has an excellent essay on how we've gone from addressing accidental problems in software engineering to concentrating on essential problems.)

Now let's go back to the initial code we started with and fix it up.

We'll use descriptive variable names and named constants.
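Keeping the same (illustrative) logic as the opening snippet:

    import numpy as np

    PIXEL_MAX_VALUE = 255.0      # pixel intensities range from 0 to 255
    BRIGHTNESS_OFFSET = 0.1      # constant shift applied after normalization

    def normalize_and_offset_pixels(pixel_values):
        adjusted_pixel_values = []
        for pixel_value in pixel_values:
            adjusted_pixel_values.append(
                pixel_value / PIXEL_MAX_VALUE + BRIGHTNESS_OFFSET
            )
        return np.array(adjusted_pixel_values)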

Now we can see that this code is normalizing the pixel values in an array and adding a constant offset to create a new array (ignore the inefficiency of the implementation!). When we give this code to our colleagues, they will be able to understand and modify it. Moreover, when we come back to the code to test it and fix our errors, we'll know precisely what we were doing.

Clarifying your variable names may seem like a dry activity, but if you spend time reading about software engineering, you realize what differentiates the best programmers is the repeated practice of mundane techniques such as using good variable names, keeping routines short, testing every line of code, refactoring, etc. These are the techniques you need to take your code from research or exploration to production-ready and, once there, you'll see how exciting it is for your data science models to influence real-life decisions.

This article was originally published on Towards Data Science.


Succeeding in Data Science Projects: Inputs that Could Help You – Analytics Insight

For organizations, investment in machine learning (ML), artificial intelligence (AI) and data science is growing. There is real potential for data science to create new insights and services for internal and external customers. However, this investment can be wasted if data science projects don't deliver for their customers. How can we ensure that these projects succeed?

To improve your odds of success with these projects, it is worth investing time to look at how data science works in practice, and at how your organization operates. Although it includes the word science in its title, data science in fact requires a mix of both art and science to produce the best results. With that in place, it is then possible to look at scaling up the results, which will help you turn data science findings into production operations for the business.

At the most basic level, data science involves coming up with ideas and then using data to test those hypotheses. Using a blend of different algorithms, designs and approaches, data scientists can search out new insights from the data that organizations create. Through trial, error and improvement, the teams involved can produce a range of new insights and discoveries, which can then be used to inform decisions or create new products. That work can in turn be used to develop machine learning (ML) algorithms and AI solutions.

The biggest risk around these projects is the gap between business expectations and reality. Artificial intelligence has received an immense amount of hype and attention over the past few years, which means that many projects begin with unrealistic expectations. To forestall this problem, set out how your projects will support overall business goals. You can then start small, with projects that are easy to understand and that can show improvements. Once you have set out some ground rules around what AI can deliver, and punctured the hype bubble around AI to make this all business as usual, you can keep the attention on the results that you deliver.

Another big problem is that teams don't have the necessary skills to translate their vision into effective processes. While the ideas might be sound, a lack of understanding of the nuances of applying machine learning and statistics in practice can lead to poor outcomes. To prevent these kinds of problems, it's important to establish a smoothly operating engineering culture that weaves data science work into the overall production pipeline. Rather than data science being a distinct team, work on how to integrate your data scientists into the production deployment process. This will help minimize the gap from data research and development to production.


While supporting creativity around data science, any work should keep the business objectives in mind. This puts the emphasis on the result you are hoping to achieve or discover by using data to prove (or disprove) a hypothesis, judged by how well that business objective was met. Alongside this, evaluate new technologies for any improvements in how they might help meet those goals. Keeping at the forefront is important for data scientists, but it is crucial to focus on how any new technology can help meet a specific and measurable business outcome.

With these ideas in mind, you can help your data science team take their creativity and apply it to find interesting results. When this research begins to surface insights, you can then look at how to move them into production. This involves building bridges from the data science development and research team to those responsible for running production systems, so that new models can be passed across.



Inspiring Innovation; New Short Talks Features Karl Schubert and Data Science Program – University of Arkansas Newswire


The November episode of Short Talks from the Hill features Karl Schubert, professor of practice and associate director of the Data Science Program. Schubert came back to the University of Arkansas after a 35-year career in private industry.

Schubert discusses his unusual path back to the university and his general desire to inspire innovation in students. He also discusses the creation of the multidisciplinary Data Science Program and a recent National Science Foundation grant of nearly $1 million to support low-income students interested in studying innovation in science, technology, engineering and math.

On the benefits of having a non-traditional background, Schubert says in the podcast: "I was viewed by the faculty as what I called non-denominational. You know, that is, that I wasn't in a department specifically. I was working for three deans, and so I didn't have any particular favoritism to any particular department or any particular college."

To listen to Schubert discuss his role at the university, go to ResearchFrontiers.uark.edu, the home of research news at the University of Arkansas, or visit the "On Air" and "Programs" link at KUAF.com.

Short Talks From the Hill highlights research, scholarly work, and creative activity at the University of Arkansas. Each segment features a university faculty member discussing his or her work. Previous podcasts can be found under the 'Short Talks From the Hill' link at ResearchFrontiers.uark.edu.

Thank you for listening!


How to succeed around data science projects – Information Age

Denise Gosnell, chief data officer at DataStax, discussed how preparation, process and open source can help to ensure success from data science projects

It's important to set out how your projects will support overall business goals.

For businesses, investment in machine learning, artificial intelligence (AI) and data science is growing. There is huge potential around data science to create new insights and services for internal and external customers. However, this investment can be wasted if data science projects don't fulfil their promises. How can we make sure that these projects succeed?

According to McKinsey, around half of all the companies they served have adopted AI in at least one function, and there is already a small cohort of companies that can ascribe at least 20% of their earnings before interest and taxes to AI. Around $341.8 billion will be spent on AI solutions during 2021, a rise of 15.2 percent year over year, according to IDC.

IDC also found around 28% of AI and ML initiatives have failed so far. Based on the figure above, that would equate to $88.1 billion of spend on tooling associated with failed projects. The analyst firm identified the lack of staff with necessary expertise and a lack of production-ready data as reasons for this. Alongside this, feeling unconnected and lacking an integrated development environment was another reason for projects not being successful.

To improve your chances of success around your projects, it is worth spending time to look at how data science works in practice, and how your organisation operates. While it includes the word science in its title, in fact data science requires a blend of both art and science in order to produce the best results. Using this, it's then possible to examine scaling up the results. This will help you successfully turn data science results into production operations for the business.

At the most simple level, data science involves coming up with ideas and then using data to test those theories. Using a mix of different algorithms, designs and approaches, data scientists can seek out new insights from the data that companies create. Based on trial, error and improvement, the teams involved can create a range of new insights and discoveries, which can then be used to inform decisions or create new products. This can then be used to develop machine learning (ML) algorithms and AI deployments.


The biggest risk around these projects is the gap between business expectations and reality. AI has received a huge amount of hype and attention over the past few years. This means that many projects have unrealistic expectations.

Unrealistic expectations can be in scope, speed and/or technologies. Great project managers understand how to navigate challenges in scope and speed; it is the misinterpretation of the promises of AI technologies that has been causing the biggest problems for new projects. Rather than being focused on improving a process or delivering one insight, AI gets envisioned as changing how a company runs from top to bottom, or it is assumed that a single project will deliver a change in profitability within months.

To prevent this problem, it's important to set out how your projects will support overall business goals. You can then start small with projects that are easy to understand and that can show improvements. Once you have set out some ground rules around what AI can deliver, and punctured the hype balloon around AI to make this all business as usual, you can keep the focus on the results that you deliver.

Another big problem is that teams don't have the necessary skills to translate their vision into effective processes. While the ideas might be sound, a lack of understanding around the nuances of applying machine learning and statistics in practice can lead to poor outcomes. This issue is also due to the hype around AI and ML: the demand for data science skills means that there is a lot of competition for those with experience, while even those starting out can command big salaries. This lack of real world experience is what can lead to problems over time.

Even with a realistic vision and experienced staff in place, AI projects can still fail to deliver results. In this case, the reason is normally that poor processes, inconsistent communication, and gaps between teams exist.

To prevent these kinds of problems, it's important to establish a smoothly operating engineering culture that weaves data science work into the overall production pipeline. Rather than data science being a distinct team, work on how to integrate your data scientists into the production deployment process. This will help minimise the gap from data research and development to production.

While it is important to support creativity around data science, any work should have the business goals in mind. This should put the emphasis on what result you are looking to achieve or discover by using data to prove (or disprove) a hypothesis based on how well that business goal was met.

The team at Netflix has written about this, and how their approach to shared hypothesis testing helps keep the team focused. By concentrating on specific objectives, you can avoid getting lost or spending time on projects that wont pay off.

Alongside this, it's important to evaluate new technologies for any improvements in how they might help meet goals. Keeping at the cutting edge is important for data scientists, but it is essential to focus on how any new technology can help meet that specific and measurable business outcome.

Based on these ideas, you can help your data science team take their creativity and apply it to discover interesting results. Once this research starts to find insights, you can then look at how to push this into production. This involves creating bridges from the data science development and research team to those responsible for running production systems, so that new models can be passed across.


One critical element here is that you should encourage everyone to use the same tools on each side. One of the biggest hurdles can be when the data science team delivers a new model and workflow around data, and then those responsible for running the model in production have to re-develop that model to work with the existing infrastructure that is in place. The emphasis here is to avoid the old trope of "This worked on my laptop!", as laptops can't be pushed to production and rework is expensive.

Using open source can help to achieve this consistency. From databases like Apache Cassandra, through to event streaming with Apache Pulsar, data enrichment with Apache Flink and analytics with Apache Spark, common tools used for working with data are mainly open source and easy to link together. Alongside this open data infrastructure, TensorFlow is important for how algorithms and machine learning models can be created and tested. You can use something like Apache Airflow to manage the workflow process that your team has in place. This makes it easier to build a stack that is common to everyone.

Alongside getting consistency on tools and infrastructure, both sides need to agree on common definitions and context. This involves setting the right goals and metrics so that everyone is aware of how the team will be evaluated over time. At the same time, it should also be an opportunity to keep re-assessing those metrics, so that the emphasis is always on delivering the right business outcomes. Anthropologist Marilyn Strathern described this as, "When a measure becomes a target, it ceases to be a good measure." This sees teams concentrating too specifically on metrics and measurement to the detriment of the overall goal.

Lastly, the role of testing should not be overlooked. Once new models are developed that should have the desired impact, those models should be tested to ensure that they work as expected and are not falling foul of issues within the test data or any biases that were not accounted for. Testing using the same tools and processes as will be used in production not only helps solidify the value that data science creates, but makes it easier to scale that work out. Skipping this step or not giving it the right degree of rigour leads to problems over time.

Data science has huge potential to help businesses improve their operations. It can be used to develop new products, show where to invest, and help people make better decisions in their roles.

To avoid the risk of failure, look at how you can build on an open source data stack to make the process around moving from initial discovery through to full production easier. This consistency should make it easier for your data scientists to work around data, and for your operational staff to implement those insights into production.


California Tries to Close the Gap in Math, but Sets Off a Backlash – The New York Times

If everything had gone according to plan, California would have approved new guidelines this month for math education in public schools.

But ever since a draft was opened for public comment in February, the recommendations have set off a fierce debate over not only how to teach math, but also how to solve a problem more intractable than Fermat's last theorem: closing the racial and socioeconomic disparities in achievement that persist at every level of math education.

The California guidelines, which are not binding, could overhaul the way many school districts approach math instruction. The draft rejected the idea of naturally gifted children, recommended against shifting certain students into accelerated courses in middle school and tried to promote high-level math courses that could serve as alternatives to calculus, like data science or statistics.

The draft also suggested that math should not be colorblind and that teachers could use lessons to explore social justice, for example, by looking out for gender stereotypes in word problems, or applying math concepts to topics like immigration or inequality.

The battle over math comes at a time when education policy, on issues including masks, testing and teaching about racism, has become entangled in bitter partisan debates. The Republican candidate for governor in Virginia, Glenn Youngkin, seized on those issues to help propel him to victory on Tuesday. Now, Republicans are discussing how these education issues can help them in the midterm elections next year.

Even in heavily Democratic California, a state with six million public school students and an outsize influence on textbook publishing nationwide, the draft guidelines encountered scathing criticism, with charges that the framework would inject woke politics into a subject that is supposed to be practical and precise.

"People will really go to battle for maths to stay the same," said Jo Boaler, a professor of education at Stanford University who is working on the revision. "Even parents who hated maths in school will argue to keep it the same for their kids."

The battle over math pedagogy is a tale as old as multiplication tables. An idea called new math, pitched as a more conceptual approach to the subject, had its heyday in the 1960s. About a decade ago, amid debates over the national Common Core standards, many parents bemoaned math exercises that they said seemed to dump line-by-line computation in favor of veritable hieroglyphs.

Today, the battles over the California guidelines are circling around a fundamental question: What, or whom, is math for?

Testing results regularly show that math students in the United States are lagging behind those in other industrialized nations. And within the country, there is a persistent racial gap in achievement. According to data from the civil rights office of the Education Department, Black students represented about 16 percent of high school students but 8 percent of those enrolled in calculus during the 2015-16 school year. White and Asian students were overrepresented in high-level courses.

"We have a state and nation that hates math and is not doing well with it," Dr. Boaler said.

Critics of the draft said the authors would punish high achievers by limiting options for gifted programs. An open letter signed by hundreds of Californians working in science and technology described the draft as "an endless river of new pedagogical fads that effectively distort and displace actual math."

Williamson M. Evers, a senior fellow at the Independent Institute and a former official with the Education Department during the administration of George W. Bush, was one of the authors of the letter and objected to the idea that math could be a tool for social activism.

"I think that's really not right," he said in an interview. "Math is math. Two plus two equals four."

Distress over the draft made it to Fox News. In May, Dr. Boaler's name and photograph were featured on an episode of "Tucker Carlson Tonight," an appearance she did not know about until she began receiving nasty letters from strangers.

Like some of the attempted reforms of decades past, the draft of the California guidelines favored a more conceptual approach to learning: more collaborating and problem solving, less memorizing formulas.

It also promoted something called de-tracking, which keeps students together longer instead of separating high achievers into advanced classes before high school.

The San Francisco Unified School District already does something similar. There, middle school math students are not split up but rather take integrated courses meant to build their understanding year by year, though older high school students can still opt into high-level classes like calculus.

Sophia Alemayehu, 16, a high school junior in San Francisco, advanced along that integrated track even though she did not always consider herself a gifted math student. She is now taking advanced calculus.

"In eighth and ninth grade, I had teachers tell me, 'Oh, you're actually really good at the material,'" she said. "So it made me think, maybe I'm good at math."

The model has been in place since 2014, yielding a few years of data on retention and diversity that has been picked over by experts on both sides of the de-tracking debate. And while the data is complicated by numerous variables, a pandemic now among them, those who support San Francisco's model say it has led to more students, and a more diverse set of students, taking advanced courses, without bringing down high achievers.

"You'll hear people say that it's the least common denominator that discourages gifted kids from advancing," Elizabeth Hull Barnes, the math supervisor for the district, said. "And then it's like, nope, our data refutes that."

But Dr. Evers, the former Education Department official, pointed to research suggesting that the data on math achievement in places like San Francisco was more cherry-picked than conclusive. He added that Californias proposed framework could take a more nuanced approach to de-tracking, which he saw as a blunt tool that did not take the needs of individual districts into account.

Other critics of de-tracking say it amounts to a drag on children who would benefit from challenging material and that it can hurt struggling students who might need more targeted instruction.

Divya Chhabra, a middle school math teacher in Dublin, Calif., said the state should focus more on the quality of instruction by finding or training more certified, experienced teachers.

Without that, she said, students with potential would quickly fall behind, and it would only hurt them further to take away options for advanced learning. "I feel so bad for these students," she said. "We are cutting the legs of the students to make them equal to those who are not doing well in math."

Tracking is part of a larger debate about access to college. Under the current system, students who are not placed in accelerated courses by middle school may never get the opportunity to take calculus, which has long been an informal gatekeeper for acceptance to selective schools.

According to data from the Education Department, calculus is not even offered in most schools that serve a large number of Black and Latino students.

The role of calculus has been a talking point among math educators for years, said Trena Wilkerson, the president of the National Council of Teachers of Mathematics. "If calculus is not the be-all, end-all thing, then we need everyone to understand what the different pathways can be, and how to prepare students for the future," she said.

California's recommendations aim to expand the options for high-level math, so that students could take courses in, say, data science or statistics without losing their edge on college applications. (The move requires buy-in from colleges; in recent years, the University of California system has de-emphasized the importance of calculus credits.)

For now, the revision process has reached a sort of interlude: The draft is being revised ahead of another round of public comment, and it will not be until late spring, or maybe summer, that the state's education board will decide whether to give its stamp of approval.

But even after that, districts will be free to opt out of the state's recommendations. And in places that opt in, academic outcomes in the form of test scores, retention rates and college readiness will add to the stormy sea of data about what kinds of math instruction work best.

In other words, the conversation is far from over.

"We've had a really hard time overhauling math instruction in this country," said Linda Darling-Hammond, the president of California's board of education. "We cannot ration well-taught, thoughtful mathematics to only a few people. We have to make it widely available. In that sense, I don't disagree that it's a social justice issue."


Training students at the intersection of power engineering and computer science WSU Insider – WSU News

A WSU research team has received a $1.2 million U.S. Department of Education grant to train graduate students at the intersection of artificial intelligence (AI), data science, and engineering to address challenges of the future electric power grid.

Led by Assefaw Gebremedhin, associate professor in the School of Electrical Engineering and Computer Science, the Graduate Assistance in Areas of National Need (GAANN) grant aims to enhance teaching and research in areas of national need.

"AI and the closely related area of data science affect nearly everything that we do," said Gebremedhin. "We need to have power engineers who speak both languages, who are trained to be good power engineers and are also able to do good data science."

In recent years, the US power grid has been rapidly evolving from a network of centralized fossil fuel-powered generation plants to a system that includes more distributed generation and renewable resources. As power becomes more decentralized, traditional ideas about power grid operations have been changing.

Distributed assets need to be controlled and managed differently than in the past.

"Climate change is also leading to an increase in extreme weather events, which means that the power system has to be more resilient and operate under fast-changing conditions," says Gebremedhin. Changes in technology are also allowing customers to be more actively and directly involved in controlling their energy use.

"These rapid transformations threaten power grid reliability," he said.

The US power industry is increasingly adopting machine learning and data analytics technologies to improve its reliability, resiliency, and efficiency.

Meanwhile, software that gets developed in the power industry as well as in many other engineering applications is increasingly getting more complex. Software engineers of the future would not only need to know how to build and maintain complex software, but they would also need to know how to extract knowledge from massive amounts of data and adapt that knowledge to consider different human factors.

As part of the grant, a total of eight U.S. PhD students will receive training, focusing on the application of AI and data science to power engineering and software engineering.

"The new workforce needs to be trained in traditional topics on electric and power engineering along with having an understanding of data science and machine learning, information and communication technology, and control and automation," he said.

With programs in power engineering, machine learning and AI, and software engineering, the School of EECS presents a unique opportunity to bridge the fields of computer science and power engineering.

"There are just a few schools in the country where you have these disciplines housed in the same school, which is a great asset," he said.

The three-year program will focus on recruitment of students from underrepresented groups in engineering and computer science, including women, black and Hispanic students. In addition to Gebremedhin, the program is led by three women faculty members in electrical engineering and computer science, Anamika Dubey, Venera Arnaoudova, and Noel Schulz. The students will receive training in teaching and mentoring and will also have opportunities to participate in internships through the Pacific Northwest National Laboratory.


A look at some of the AI and ML expert speakers at the iMerit ML DataOps Summit – TechCrunch

Calling all data devotees, machine-learning mavens and arbiters of AI. Clear your calendar to make room for the iMerit ML DataOps Summit on December 2, 2021. Join and engage with AI and ML leaders from multiple tech industries, including autonomous mobility, healthcare AI, technology and geospatial to name just a few.

Attend for free: There's nothing wrong with your vision; the iMerit ML DataOps Summit is 100% free, but you must register here to attend.

The summit is in partnership with iMerit, a leading AI data solutions company providing high-quality data across computer vision, natural language processing and content that powers machine learning and artificial intelligence applications. So, what can you expect at this free event?

Great topics require great speakers, and we'll have those in abundance. Let's highlight just three of the many AI and ML experts who will take the virtual stage.

Radha Basu: The founder and CEO of iMerit leads an inclusive, global workforce of more than 5,300 people 80% of whom come from underserved communities and 54% of whom are women. Basu has raised $23.5 million from investors, led the company to impressive revenue heights and has earned a long list of business achievements, awards and accolades.

Hussein Mehanna: Currently the head of Artificial Intelligence and Machine Learning at Cruise, Mehanna has spent more than 15 years successfully building and leading AI teams at Fortune 500 companies. He led the Cloud AI Platform organization at Google and co-founded the Applied Machine Learning group at Facebook, where his team added billions of revenue dollars.

DJ Patil: The former U.S. Chief Data Scientist at the White House Office of Science and Technology Policy, Patil's experience in data science and technology runs deep. He has held high-level leadership positions at RelateIQ, Greylock Partners, Color Labs, LinkedIn and eBay.

The iMerit ML DataOps Summit takes place on December 2, 2021. If your business involves data-, AI- and ML-driven technologies, this event is made for you. Learn, network and stay current with this fast-paced sector and do it for free. All you need to do is register. Start clicking.


Exploring, Monitoring and Modeling the Deep Ocean Are Goals of New Research – UT News | The University of Texas at Austin

AUSTIN, Texas – A team led by scientists from The University of Texas at Austin is attempting to boldly go where no man has gone before: the Earth's deepest oceans.

In the 1989 science fiction film The Abyss, a search and recovery team is tasked with finding a lost U.S. submarine that has vanished somewhere deep in uncharted waters of the Atlantic Ocean. Although the team's discovery of an extraterrestrial species living on the ocean floor is imaginative, it did highlight how little we know about what may be present in the deepest parts of the Earth's oceans.

Water covers more than 70% of the planet's surface, but only 10% of the undersea world has been explored. Oceans provide about 90% of living space on the planet by volume. They also absorb more than 90% of the Earth's radiative heat imbalance, leading to ocean warming, and about a third of anthropogenic carbon dioxide emissions, leading to ocean acidification.

Now, more than 30 years since the release of The Abyss, scientists have gained some new insights. For example, the deep ocean (below 200 meters, from the mesopelagic zone downward) could provide a vast repository for biodiversity providing critical climate regulation and housing a wealth of hydrocarbon, mineral and genetic resources. Nevertheless, the deep ocean remains a mostly unknown realm of our planet. Deep-ocean habitats are under increasing pressure from climate change and human activities such as seafloor mining, fishing and contamination.

Through its Accelerating Research through International Network-to-Network Collaborations (AccelNet) program, the National Science Foundation is funding a team led by the Oden Institute for Computational Engineering and Sciences at UT Austin to implement a Deep-Ocean Observing Strategy (iDOOS). The initiative brings together U.S. and international networks engaged in deep-ocean observing, mapping, exploration, modeling, research and sustainable management to leverage each other's efforts, knowledge and resources.

"By connecting deep-ocean observers across disciplines, expanding the observing community to include nontraditional partners, and linking data providers to users, iDOOS will enhance the deep-ocean capabilities of the Global Ocean Observing System (GOOS) and target societal needs," said project lead Patrick Heimbach, director of the Computational Research in Ice and Ocean Systems group at the Oden Institute and faculty member at the Jackson School of Geosciences.

iDOOS will address several of the stated Challenges of the IOC United Nations Decade of Ocean Science for Sustainable Development (2021-2030), in particular the goal to "ensure a sustainable [deep] ocean observing system across all ocean basins that delivers accessible, timely, and actionable data and information to all users."

One of the first programs to be endorsed by the U.N. Ocean Decade initiative, the initiative also tackles another key challenge that has been set: engaging with a range of stakeholders to develop or contribute to a comprehensive ocean cyberinfrastructure that supports a "digital-twin [deep] ocean," enabling applications from big data analytics to simulation-based science.

Through engagement with policymakers, regulators and science coordinators, iDOOS will raise awareness and support for deep-ocean science and bring science into critical decisions regarding climate, biodiversity and sustainability. It will foster a community of future leaders informed in deep-ocean observing, modeling, data science, sustainable development, and international law at a global level who are adept at communicating to regulators and policymakers, as well as to fellow scientists.

Heimbach and his research team at UT Austin will lead the project in partnership with experts from the Scripps Institution of Oceanography at the University of California San Diego, the Woods Hole Oceanographic Institution, the Monterey Bay Aquarium Research Institute, the University of Hawaii at Manoa, and The University of Rhode Island.


UVA Science and Engineering Faculty Win 12 NSF Career Awards – University of Virginia

From stopping deadly diseases to developing futuristic materials, from making self-driving vehicles smarter to studying global inequalities in pollution exposure, the University of Virginia's early career faculty are more deeply involved than ever in making people's lives safer, healthier and more efficient.

In 2021 so far, 12 UVA assistant professors have earned National Science Foundation Early Career Development Awards, among the most competitive and prestigious grants for science and engineering faculty in the first stages of their careers. That's up from eight CAREER Awards in 2020, and four to five awards per year before then.

"The CAREER Award is given to early career researchers who have the potential to make a significant impact through their careers as academic researchers and educators," Melur Ram Ramasubramanian, UVA's vice president for research, said. "Getting 12 of these prestigious awards for our faculty so far this year is impressive, and really shows the great talent we have across the University."

Meet the most recent UVA CAREER Award winners, whom NSF expects to become the next great leaders and role models in research and education.

In May 2020, Partners for Automated Vehicle Education shared results from a poll of 1,200 Americans about attitudes around autonomous vehicle technology. Three in four believed the technology was not ready for primetime; almost half indicated they would never ride in a self-driving car; and a fifth do not believe that autonomous vehicles will ever be safe.

The poll outlines the deep skepticism surrounding self-driving vehicles. Methods to improve and prove safety will be needed for broad-based acceptance. Behl's pioneering research at UVA is accelerating safety for autonomous vehicles.

Using auto racing as a platform, Behl has invented artificial intelligence methods to agilely maneuver an autonomous vehicle while pushing the limits of its steering, throttle and braking capabilities. His novel racing research is creating advanced algorithms that hold the key to safer autonomous vehicles, enabling them to avoid collisions even when they encounter unexpected challenges at high speeds while close to obstacles or other vehicles.

Demonstrating their skills in programming a full-sized, fully autonomous race car, Behl and his student Cavalier Autonomous Racing team clocked the fastest laps from a U.S. university team in the historic Indy Autonomous Challenge, held Oct. 23 at the Indianapolis Motor Speedway.

Fibrosis, the stiffening of normally soft or pliant living tissue, contributes significantly to about 40% of the overall deaths in the developed world.

"Yeah, it's a hell of a stat," Caliari said. "But that's because fibrosis, or chronic scarring, itself isn't a disease; it's an outcome of many different diseases."

The list includes some cancers, viral infections such as hepatitis, and idiopathic pulmonary fibrosis, a cruel condition in which scar tissue grows in the lungs, restricting the flow of oxygen. Idiopathic means the disease has no known cause. It has no cure, either.

Researchers like Caliari believe stopping or even reversing the progression of fibrosis is possible, but they need to know a lot more about what is happening in the body to make cells go from normal to a diseased state. Caliari is using biomaterials developed in his lab to open a window on that process.

Caliari's plans include partnering with the UVA chapter of the Society of Hispanic Professional Engineers to develop teaching modules on biomaterials concepts for elementary school students and initiating a high school summer research program involving labs in UVA's Fibrosis Initiative.

Holographic displays, color-changing frames and pliable screens are just a few of the innovations the next generation of smartphones may offer. And while engineers and coders will be responsible for making much of that technology possible, there's a good chance we'll also need to thank Gilliard.

Gilliard explores strategies for incorporating boron into chemical compounds to help him understand how to harness the element's unique capacity to carry and transfer electric charge and to produce the colors displayed by our cellphones and electronic devices.

"In collaboration with the chemical engineering department here at UVA, we've already started to explore the applications of some of these boron-based materials, and we're seeing that the utility is probably going to be pretty important going forward," Gilliard said.

His research may also make components in those devices more stable over time, less expensive to produce and less harmful to the environment.

"This is the technology that has resulted in your lights in your home lasting much longer than they did even five years ago," Gilliard said.

"We have a unique potential to be one of the leaders in this area of boron chemistry," he added. "There are not many people in the United States exploring these areas of chemistry, and increasing our ability to compete globally in this area of science is extremely important."

As we struggle to come to terms with the fact that more than half a million lives have been lost to COVID-19 in the United States, it can be easy to forget that nearly as many people die from malaria worldwide every single year.

Malaria is caused by a single-celled, mosquito-borne parasite, but like a virus, it can adapt to survive a variety of challenges that could wipe it out completely. Güler studies how the malaria parasite responds to changes in its environment that are hostile to its survival.

Over the course of its life cycle, the malaria parasite must be able to adapt to the conditions that allow it to survive in the body of a mosquito, in the liver of an infected host or in a host's bloodstream before it infects another mosquito. Evolution has also equipped it with the capacity to develop a resistance to the drugs that researchers develop to defeat it. Güler uses a powerful combination of laboratory studies and computational modeling to understand the complexities of the cellular behaviors that make the malaria parasite so resilient.

"The CAREER Award will help us look, specifically, at how the parasites respond to stress, so if they're in one of these new environments and it's stressful for them, maybe there's a limiting nutrient or a drug present and it's causing stress, what sort of programs are going on inside that cell that allow it to survive?" Güler said. "Ultimately, what we learn could help us find a better way to treat this disease."

The National Science Foundation places a priority on inventing new computing and networking technologies. The need is urgent, because such technologies will help researchers use big data sets to find solutions for complex global challenges.

The problem is that the amount of data available globally has outpaced the processing power needed to analyze it. International Data Corporation predicts that the collective sum of the world's data will grow to 175 zettabytes (175 trillion gigabytes) by 2025, a massive data explosion compared to the 4.4 zettabytes available in 2015.

Khan is developing revolutionary computer architectures that will make problem-solving with big data possible.

"Datasets are so large they must be broken up into bundles across multiple computers in a data center," Khan said. "Computations get bottlenecked as larger and larger data packets get moved from computer to computer in progression to a single processor."

Khan's research aims to redesign programmable switches and smart network interface cards to allow data to be processed in transit instead, a fundamental redesign of outdated computer infrastructure. Her research team has built the first prototype network that uses the revolutionary architecture, making data requests four times faster.

In the real world, this would mean people could update their social media or make online transactions, like purchasing tickets, lightning-fast compared to today.

"Reducing the amount of data that needs to be moved to that single point of processing dramatically speeds things up and fuels the entire system's capacity," Khan said. "We are expecting that processing in the reconfigured network will achieve more than 10-fold increases in processing speeds for scientific and machine-learning workloads."

Although the STEM workforce has shown considerable growth in recent years, Black workers, and especially Black women, remain underrepresented in the fields of science, technology, engineering and math. And according to a Pew Research Center study of trends in STEM degrees, the gap is unlikely to narrow any time soon. For Seanna Leath, an assistant professor of psychology, universities have a critical part to play in addressing Black women's retention in STEM fields.

Leath's CAREER award will allow her to explore how improving the academic, social and psychological well-being of Black college women can help attract them to STEM disciplines and allow them to thrive as students. Funding from the award will support longitudinal surveys and interview tools to assess Black undergraduate women's experiences over a four-year period; using the data she collects, Leath hopes to identify the most important factors affecting the motivation and retention of Black women in STEM degrees.

You might think that air pollution is an equal-opportunity threat, but there is a solid body of evidence suggesting that not everyone who lives in urban areas experiences the same level of exposure, which means that some communities are faced with a lower quality of life and a lower life expectancy.

Using a variety of airborne and ground-based data-collection methods, Pusede, an atmospheric chemist, is interested in advancing science's understanding of how variations in exposure to airborne pollutants occur in urban areas and why.

With the help of the CAREER award, Pusede will conduct field work in Dakar, Senegal, training U.S. and Senegalese students through an international collaboration of physical and social scientists. The collaboration will collect and integrate scientific data and demographic information from a wide range of sources to shed light on inequalities in pollutant exposure and their consequences.

Pusede's project will also include educational and public-outreach activities based on her research, including a middle-school curriculum aimed at encouraging students' interest in the STEM fields and demonstrating how those fields can help advance the cause of environmental justice.

Did you know that a tuna is a super swimmer?

"They're really fast, they're really strong, they're big, they're at the top of their food chain without any natural predators. They're a model organism for roboticists because they're phenomenal swimmers," Quinn said. "Besides being fast, tuna dart back and forth very quickly, in complex, high-speed maneuvers, and we're not sure how they do it."

Quinn is using his CAREER Award to find out. His Smart Fluids Systems Lab is using a tuna model rigged up to swim inside a tank to try to discover how liquids flow past the fish (a process called fluid dynamics) and govern high-speed, irregular or asymmetric swimming.

By mapping out these flows, bio-inspired roboticists, who currently have to rely on models of low-speed, regular or symmetric movements when designing and testing robots, will have the information they need to start modeling and designing fast, highly maneuverable water and aerial drones. Even though Quinn is studying swimming, the principles of fluid dynamics apply to both water and air propulsion, so his research will inform both.

"We'll be creating the first-ever flow visualizations of bio-inspired robots darting side-to-side," Quinn said. "Our measurements could lay the groundwork for a new generation of intelligent swimming and flying machines."

According to the U.S. Energy Information Administration, approximately 5% of the energy generated by power plants in the country is lost to resistance in the power lines used to transmit energy to homes and businesses. The complexity of the problem at the atomic level makes it difficult for researchers to understand exactly what properties of the electrons involved might lead to the ability to conduct current without resistance.

Direct imaging of electrons is almost impossible, but quantum simulation can shed light on the microscopic properties of these complex quantum systems. CAREER winner Peter Schauss, an assistant professor of physics who specializes in experimental atomic, molecular and optical physics, will use funding from the award to develop quantum simulations using atoms cooled to a few billionths of a degree and trapped in an artificial crystal of light. Using a quantum gas microscope with high-resolution imaging capabilities that will allow him to capture images of individual atoms, he'll search for answers that could lead to the next generation of superconductors.

In this modern age of materials, complex alloys lighten the weight of cars and planes to save fuel and help the environment, biomaterials replace human joints so we can remain active into our elder years, and graphene-coated smart screens put us in touch, literally, with individual creativity and global commerce.

These breakthroughs demonstrate the power of nanotechnology, a term introduced in 1974 to describe the precision machining of materials to the atomic scale. While experimentation, development and commercialization of nanomaterials have evolved, the textbook model describing how and when a material changes its form remains stuck in the 1970s.

Zhou has a plan to bring this model into the modern age and democratize materials design. He will use his CAREER Award to advance a valuable tool in alloy development called the CALPHAD method, which stands for CALculation of PHAse Diagrams.

"My grand vision is to make computational tools easy to use and valuable for all materials scientists," Zhou said.

Editor's note: Two additional faculty members from UVA earned CAREER Awards, but are no longer with the University.

View original post here:

UVA Science and Engineering Faculty Win 12 NSF Career Awards - University of Virginia


Microsoft Excel is still the data analytics gold standard. The pre-Black Friday sale can teach you fast. – The Next Web

TLDR: The Ultimate 2022 Pivot Tables and Dashboard in Excel Bundle brings all the pro tips of hardcore data analysis to any user for just $16.99.

Anybody can plunk some numbers into a rudimentary spreadsheet. That doesn't mean you're somehow now a Microsoft Excel expert. Not quite. That heritage business software has been around for decades because it's incredibly versatile, but if you don't understand some of the basics, then the true, subtle power of Excel is lost.

Which brings us to pivot tables. If you don't fully grasp pivot tables, the way that Excel users extract important data and aggregate it for display from much larger data sets, then you don't really get Excel. With the coursework in The Ultimate 2022 Pivot Tables and Dashboard in Excel Bundle ($16.99 after code SAVE15NOV from TNW Deals), anyone with an eye for data, analytics, and data science can become a pivot table pro and understand all the ways that powerful function works for Excel users.

The collection includes three courses, packed with over 22 hours of learning that can take even first-time pivot table users from the basics through to tips and tricks that only the Excel elite have mastered.

It all begins with Pivot Table for Beginners, your introduction to this interactive way of quickly summarizing large amounts of data. Users will get a feel for how pivot tables work, how to input data sets into those tables, and even how to clean your data so you can get the true, proper analysis that you want.
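If you have never built a pivot table, it can help to see the underlying idea outside of Excel. The snippet below is a minimal sketch in Python using pandas, whose pivot_table function performs an analogous operation; the column names and sample data here are hypothetical, chosen only to illustrate how grouping and aggregation work, not taken from the course.

# A minimal pandas sketch of what a pivot table does: group rows by one or
# more keys and aggregate a value column. The data below is made up purely
# for illustration.
import pandas as pd

sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "West"],
    "product": ["A", "B", "A", "A", "B"],
    "revenue": [100, 150, 200, 120, 80],
})

# Rows become regions, columns become products, and each cell holds the
# summed revenue for that combination -- the core pivot-table operation.
summary = pd.pivot_table(
    sales,
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
)
print(summary)

Running the sketch prints a small grid with regions as rows, products as columns and summed revenue in each cell, which is the same kind of summary the beginner course builds up interactively in Excel.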

The training escalates with Advanced Pivot Tables, as learners drill deeper into this powerful data analysis function. From those original basics, this course elevates the training to give users a working understanding of features like advanced sorting, slicers, timelines, calculated fields, pivot charts, conditional formatting, and more.

Finally, Dashboards in Excel lets users take that data visualization to new levels. This in-depth course gives students exposure to the essential formulas needed to create dashboards in Excel, along with Pivot Tables, Pivot Charts, Form Controls and more.

The training walks users through creating their own sales and HR dashboards with insightful step-by-step guides.

Normally a $249 collection of training, The Ultimate 2022 Pivot Tables and Dashboard in Excel Bundle is available now at one of its lowest prices of the year thanks to the current pre-Black Friday sale. Shoppers who enter the code SAVE15NOV during checkout can get the complete package for just $16.99.

Prices are subject to change.

See original here:

Microsoft Excel is still the data analytics gold standard. The pre-Black Friday sale can teach you fast. - The Next Web
