Category Archives: Machine Learning

Machine learning and hydrodynamic proxies for enhanced rapid tsunami vulnerability assessment | Communications … – Nature.com

Synthetic variables for shielding mechanism and debris impact as proxies for water velocity

To comprehensively analyze the individual contributions of the three approaches for accounting for water velocity, we systematically trained different eXtra Trees (XT) models33, each featuring a unique combination of input variables. The reference scenario (ID0) serves as both the initial benchmark and foundational baseline, encompassing the minimum set of variables retained across all subsequent scenarios. This baseline incorporates only basic input variables sourced from the original MLIT database, further enriched with some of the geospatial variables introduced by Di Bacco et al. characterized by the most straightforward computation23. Subsequently, the additional models are generated by iteratively introducing velocity-related (directly or indirectly) features into the model. This stepwise approach allows us to isolate the incremental improvements in predictive accuracy attributed to each individual component under consideration. Table 1 in Methods offers a concise overview of all tested variables, with those included in the reference scenario highlighted in italics.

The core results of the analysis, aimed at assessing how predictive performance varies among the trained models, are summarized in Fig. 1, which illustrates the global average accuracy (expressed in terms of hit rate (HR) on the test set) achieved by each model across ten training sessions. In the figure, each column represents a specific combination of input features, with x markers indicating the variables excluded from each model's training. Insights into the importance of individual input features for the models' predictive performance are provided by the circles, the size of which corresponds to the mean decrease in accuracy (mda) when each single variable is randomly shuffled.
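The shuffling-based importance measure described above can be sketched as follows. This is a minimal illustration, not the authors' code: `toy_model`, the two-column feature matrix and the number of repeats are hypothetical stand-ins, and any trained classifier exposed as a function from a feature row to a class label could be substituted.

```python
import random

def accuracy(model, X, y):
    """Hit rate: fraction of samples whose predicted class matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def mean_decrease_accuracy(model, X, y, n_repeats=10, seed=0):
    """Average drop in accuracy when one feature column at a time is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    mda = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        mda.append(sum(drops) / n_repeats)
    return mda

# Hypothetical stand-in for a trained classifier: the class depends only on
# feature 0 (an inundation-depth-like value); feature 1 is ignored.
def toy_model(row):
    return 1 if row[0] > 2.0 else 0

X = [[0.5, 10.0], [1.0, 20.0], [3.0, 30.0], [4.0, 40.0], [2.5, 50.0], [1.5, 60.0]]
y = [toy_model(row) for row in X]
importances = mean_decrease_accuracy(toy_model, X, y)
```

In this toy setting the ignored feature receives an importance of exactly zero, while shuffling the feature the classifier relies on lowers the hit rate.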

Circle size reflects the mean decrease in accuracy (mda) when individual variables are shuffled and x markers indicate excluded variables in model training.

The pair plot in Fig.2, illustrating the correlations and distributions among considered velocity-related variables as well as Distance across the seven damage classes in the MLIT dataset, has been generated to support the interpretation of the results and enrich the discussion. This graphical representation employs scatter plots to display the relationships between each pair of variables, while the diagonal axis represents kernel density plots for the individual features.

The pie chart summarizes the distribution of the various damage states within the dataset (shades from light pink to violet). The pair plot displays the relationships between each pair of variables, while the diagonal axis represents kernel density plots for the individual features.

The baseline model (ID0), established as a reference due to its exclusion of any velocity information, attains an average accuracy of 0.836. In ID1, the model exclusively incorporates the direct contribution of vsim, resulting in a modest improvement, with accuracy reaching 0.848. The subsequent model, ID2, closely resembling ID1 but replacing vsim with vc, demonstrates a decline in performance, with an accuracy value of 0.828. This decrease is attributed to the redundancy between vc and inundation depth (h), visible both in their shared importance as variables and in the decrease of h's importance compared to the previous case. Essentially, when both variables are included, the model might become confused because h, which could have been a relevant variable when introduced alone, may now appear less important due to the addition of vc, which basically provides the same information in a different format.

The analysis proceeds with the introduction of buffer-related proxies to account for possible dynamic water effects on damage. Initially, we isolate the effect of the two considered mechanisms: the shielding (ID3) exerted by structures within the buffers (NShArea and NSW) and the debris impact (NDIArea, ID4). In both instances, we observe an enhancement in accuracy, with values reaching 0.877 and 0.865, respectively. Their combined effect is considered in model ID5, yielding only a marginal overall performance improvement (0.878), due to the noticeable correlation between NShArea and NDIArea, especially for the more severe damage levels (Fig.2), with the two variables sharing their overall importance. Combination ID6, with the addition of vc, does not exhibit an increase in accuracy compared to the previous model (0.871), thus confirming the redundant contribution of a variable directly derived from another.

In the subsequent three input feature combinations, we explore the possible improvements in accuracy through the inclusion of vsim in conjunction with the considered proxies. In the case of ID7, where vsim is combined solely with the shielding effect, no enhancement is observed (0.870) compared to the corresponding simpler ID3. Similarly, when replacing shielding with the debris proxy (ID8), an overall accuracy of 0.867 is achieved, closely resembling the performance of ID4, which lacks direct velocity input. The highest accuracy (0.889) is instead obtained when all three contributions are included simultaneously (ID9). Hence, the inclusion of vsim appears to result only in a marginal enhancement of model performance, and its overall importance is lower than that of the two considered proxies. From a physical perspective, albeit without a noticeable correlation between the data points of vsim and NShArea (Fig. 2), this result can be explained by recognizing that flow velocity indirectly encapsulates the shielding effect arising from the presence of buildings, which are typically represented in hydrodynamic models as obstructions to wave propagation or through an increase in bottom friction for urban areas8,34,35,36. Since this alteration induced by the presence of buildings directly influences the hydrodynamic characteristics of the tsunami on land, the resulting values of vsim offer limited additional improvement to the model's predictive ability compared to what is already provided by h and NShArea. Moreover, the very weak correlation of the considered proxies with the primary response variable h (Fig. 2) reinforces their importance in the framework of a machine learning approach, since they provide distinct input information compared to flow velocity, which, instead, is directly related to h, as discussed for vc. Such observations then support the idea of regarding these proxies as suitable variables for capturing dynamic water effects on buildings.

In all previous combinations, observed field values (hMLIT) served as the primary data source for inundation depth information. However, for a more comprehensive analysis, we also introduced feature combination ID10, similar to ID9 but employing simulated inundation depths (hsim) in place of hMLIT. This model achieves accuracy levels comparable to its counterparts and exhibits a consistent feature importance pattern, albeit with a slight increase in the importance of the Distance variable.

For completeness, normalized confusion matrices, describing hit and misclassification rates among the different damage classes, are reported in Supplementary Fig. S1. These matrices reveal uniform error patterns across all models, with Class 5 consistently exhibiting higher misclassification rates as a result of its underrepresentation in the dataset, as illustrated in Fig. 2. Concerning the potential influence of such dataset imbalance on the results, it is worth noting that, for the primary aim of this study, it does not alter the overall outcomes in terms of the relative importance of the various features for damage predictions, since it affects all trained models in the same way.

Delving further into the analysis of the results, the objective shifts toward gaining a thorough understanding of the relationships between the variables influencing the damage mechanisms. Indeed, while we have shown that the inclusion of water velocity components or the adoption of a more comprehensive multi-variable approach enhances tsunami damage predictions, machine learning algorithms have often been criticized for their inherent black-box nature30,31,32.

To address this challenge, we have chosen to embrace the concept of explanation through visualization by illustrating how it remains possible to derive explicit and informative insights from the outcomes derived from a machine learning approach, all while embracing the inherent complexity arising from the multi-variable nature of the problem at hand.

The results of trained models are then translated into the form of traditional fragility functions, expressing the probability of exceeding a certain damage state as a function of inundation depth, for fixed values of the feature under investigation, distinguished for velocity-related (Fig. 3), site-dependent (Fig. 4) and structural building attributes (Fig. 5). In addition to the central value, the derived functions incorporate the 10th–90th confidence intervals to provide a comprehensive representation of predictive uncertainty associated with them.
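One way to read this translation step is sketched below. It is an illustration under stated assumptions rather than the paper's procedure: `toy_predict_proba` is a hypothetical stand-in for a trained classifier's class probabilities, the damage states are taken as 1 to 7, and the 10th–90th band is assumed to come from varying one background feature (here a hypothetical distance from the coast) while the depth is held fixed.

```python
import math
from statistics import median, quantiles

DAMAGE_STATES = [1, 2, 3, 4, 5, 6, 7]

def toy_predict_proba(h, distance):
    """Hypothetical stand-in for a trained classifier's class probabilities."""
    # Severity score grows with depth h (m) and shrinks with distance (m).
    score = 1.0 + 1.2 * h - 0.001 * distance
    weights = [math.exp(-0.5 * (ds - score) ** 2) for ds in DAMAGE_STATES]
    total = sum(weights)
    return [w / total for w in weights]

def exceedance(proba, ds):
    """P(damage state >= ds) from a vector of class probabilities."""
    return sum(p for state, p in zip(DAMAGE_STATES, proba) if state >= ds)

def fragility_curve(depths, background_distances, ds):
    """For each depth: (depth, 10th percentile, median, 90th percentile)."""
    curve = []
    for h in depths:
        probs = [exceedance(toy_predict_proba(h, d), ds)
                 for d in background_distances]
        # Nine decile cut points; "inclusive" keeps them inside the data range.
        deciles = quantiles(probs, n=10, method="inclusive")
        curve.append((h, deciles[0], median(probs), deciles[8]))
    return curve

depths = [0.5, 1.0, 2.0, 4.0, 6.0]
distances = [100, 300, 600, 900, 1500, 2000, 2600]
curve = fragility_curve(depths, distances, ds=5)
```

Each tuple in `curve` gives one point of the median fragility function together with the lower and upper edges of the shaded band.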

Fragility functions for fixed values of (a) direct velocity information (vsim), (b) the proxy for shielding effect (NShArea) and (c) the proxy for debris impact (NDIArea). The median fragility function is represented as a solid line, while the shaded area represents the 10th–90th confidence interval.

Fragility functions for fixed values of (a) coastal typology (CoastType) and (b) distance from the coastline (Distance). The median fragility function is represented as a solid line, while the shaded area represents the 10th–90th confidence interval.

Fragility functions for fixed values of (a) structural type (BS) and (b) number of floors (NF). The median fragility function is represented as a solid line, while the shaded area represents the 10th–90th confidence interval.

Starting with the analysis of the fragility functions obtained for fixed values of velocity-related variables (Fig. 3), it is possible to observe the substantial impact of the hydrodynamic effects, especially in more severe inundation scenarios. Notably, differences in the median fragility functions for the more damaging states (DS5) are only evident when velocity reaches high values (around 10 m/s), while those for 0.1 and 2 m/s practically overlap, albeit with a wide uncertainty band, demonstrating how the additional explanatory variables included in the model affect the damage process. More pronounced differences in the fragilities become apparent for lower damage states, under shallower water depths (h < 2 m) and slower flow velocities, although a substantial portion of the predictive power in non-structural damage scenarios relies predominantly on the inundation depth8,11,13. The velocity proxy accounting for the shielding effect (NShArea) mirrors the behavior observed for vsim, but with greater variability for DS7.

For instance, the probability of reaching DS7 with an inundation depth of 4 m drops from ~70% for an isolated building (NShArea=0) to roughly 40% for one located in a densely built-up area (NShArea=0.5). This substantial variation not only highlights the influence of this variable in describing the damage mechanism, but also explains its profound impact on the model's predictive performance shown in Fig. 1. Conversely, for less severe DS, the central values of the three considered fragility functions tend to converge onto a single line, indicating that the shielding mechanism primarily influences the process leading to the total destruction of buildings. Distinct patterns emerge for the velocity proxy related to debris impact (NDIArea), particularly for DS5, emphasizing its crucial role in predicting relevant structural damages.

For example, at an inundation depth of 4 m, the probability of reaching DS7 is 40% when NDIArea=0 (i.e., no washed-away structures in the buffer area for the considered building), but it rises to ~90% when NDIArea=0.3 (i.e., 30% of the buffer area with washed-away buildings). Moreover, similarly to NShArea, the width of the uncertainty band generally narrows with decreasing damage state, thus suggesting that inundation depth acts as the main predictor for minor damage. These results represent an advancement beyond the work of Reese et al.26, who first attempted to incorporate information on shielding and debris mechanisms into fragility functions based on a limited number of field observations for the 2009 South Pacific tsunami, and Charvet et al.8, who investigated the possible effect of debris impacts (through the use of a binary variable) on damage levels for the 2011 Great East Japan event.

Concerning morphological variables, Fig. 4 clearly shows the amplification effect induced by ria-type coasts, especially for the higher damage states, consistently with prior literature8,11,13,37,38. However, above 6 m, the median fragility curve for the plain coastal areas exceeds that of the ria-type region, in line with findings by Suppasri et al.37,38, who also described a similar trend. Nevertheless, it is worth observing that the variability introduced by other contributing features blurs the differences between the two coastal types, with the magnitude of the uncertainty band almost eclipsing the noticeable distinctions in the central values. This observation highlights the need to move beyond traditional univariate fragility functions in favor of multi-variable models, intrinsically capable of taking these complex interactions into account. Distance from the coast has emerged as a pivotal factor in predictive accuracy (Fig. 1) and this is also evident in the corresponding fragility functions computed for Distance values of 170, 950 and 2600 m (Fig. 4). A clear negative correlation exists between Distance and inundation depth (Fig. 2), with structures closer to the coast being more susceptible to damage, especially structural damage. In detail, more pronounced differences in the fragility patterns are observed for DS5 and DS6, where the probability of exceeding these damage states with a 2 m depth is almost null for buildings located within a distance of 1 km from the coast, while it increases to over 80% for those in close proximity to the coastline. This mirrors the observations resulting for NDIArea (Fig. 3), where greater distances result in less damage potential from washed-away buildings.

Figure 5 illustrates the fragility functions categorized by structural types (BS) and building characteristics represented in terms of NF. Overall, the observed patterns align with the findings discussed in the preceding figures. When focusing on the median curves, it becomes evident that these features exert minimal influence on the occurrence of non-structural damages, with overlapping curves and relatively narrow uncertainty bands for DS5, owing to the aforementioned dominance of inundation depth as the main damage predictor in such cases.

However, for the more severe damage states, distinctions become more marked. Reinforced-concrete (RC) buildings exhibit lower vulnerability, followed by steel, masonry and wood structures, with the latter two showing only minor differences among them. A similar trend is also evident for NF, with taller buildings being less vulnerable than shorter ones under severe damage scenarios. The most relevant differences emerge when transitioning from single or two-story buildings to multi-story dwellings. However, once again, it is worth noting that, beyond these general patterns, also highlighted in previous studies1,5,8,11,26,34,37, the influence of other factors tends to blur the distinctions among the central values of the different typologies, as visible, for instance, for the confidence interval for steel buildings, which encompasses both median fragility functions for wood and masonry structures.


Machine learning-guided realization of full-color high-quantum-yield carbon quantum dots – Nature.com

Workflow of ML-guided synthesis of CQDs

Synthesis parameters have great impacts on the target properties of resulting samples. However, it is intricate to tune various parameters for optimizing multiple desired properties simultaneously. Our ML-integrated MOO strategy tackles this challenge by learning the complex correlations between hydrothermal/solvothermal synthesis parameters and two target properties of CQDs in a unified MOO formulation, thus recommending optimal conditions that enhance both properties simultaneously. The overall workflow for the ML-guided synthesis of CQDs is shown in Fig.1 and Supplementary Fig.1. The workflow primarily consists of four key components: database construction, multi-objective optimization formulation, MOO recommendation, and experimental verification.

It consists of four key components: database construction, multi-objective optimization (MOO) formulation, MOO recommendation, and experimental verification.

Using a representative and comprehensive synthesis descriptor set is of vital importance in achieving the optimization of synthesis conditions36. We carefully selected eight descriptors to comprehensively represent the hydrothermal system, one of the most common methods to prepare CQDs. The descriptor list includes reaction temperature (T), reaction time (t), type of catalyst (C), volume/mass of catalyst (VC), type of solution (S), volume of solution (VS), ramp rate (Rr), and mass of precursor (Mp). To minimize human intervention, the bounds of synthesis parameters are determined primarily by the constraints of the synthesis methods and equipment used, instead of expert intuition. For instance, in employing the hydrothermal/solvothermal method to prepare CQDs, as the reactor inner pot is made of polytetrafluoroethylene, the usage temperature should not exceed 220 °C. Moreover, the capacity of the reactor inner pot used in the experiment is 25 mL, with general guidance of not exceeding 2/3 of this volume for reactions. Therefore, in this study, the main considerations of experimental design are to ensure experimental safety and accommodate the limitations of equipment. These practical considerations naturally led to a vast parameter space, estimated at 20 million possible combinations, as detailed in Supplementary Table 1. Briefly, the 2,7-naphthalenediol molecule, along with catalysts such as H2SO4, HAc, ethylenediamine (EDA) and urea, was adopted in constructing the carbon skeleton of CQDs during the hydrothermal or solvothermal reaction process (Supplementary Fig. 2). Different reagents (including deionized water, ethanol, N,N-dimethylformamide (DMF), toluene, and formamide) were used to introduce different functional groups into the architectures of CQDs, combined with other synthesis parameters, resulting in tunable PL emission. To establish the initial training dataset, we collected 23 CQDs synthesized under different randomly selected parameters. Each data sample is labelled with experimentally verified PL wavelength and PLQY (see Methods).

To account for the varying importance of multiple desired properties, an effective strategy is needed to quantitatively evaluate candidate synthesis conditions in a unified manner. A MOO strategy has thus been developed that prioritizes full-color PL wavelength over PLQY enhancement, by assigning an additional reward when the maximum PLQY of a color surpasses the predefined threshold for the first time. Given \(N\) explored experimental conditions \(\{(x_i, y_i^c, y_i^\gamma) \mid i = 1, 2, \ldots, N\}\), \(x_i\) indicates the \(i\)-th synthesis condition defined by 8 synthesis parameters, and \(y_i^c\) and \(y_i^\gamma\) indicate the corresponding color label and yield (i.e., PLQY) given \(x_i\); \(y_i^c \in \{c_1, c_2, \ldots, c_M\}\) for \(M\) possible colors, and \(y_i^\gamma \in [0, 1]\). The unified objective function is formulated as the sum of maximum PLQY for each color label, i.e.,

$$\sum\nolimits_{c_j} Y_{c_j}^{\max}, \tag{1}$$

where \(j \in \{1, 2, \ldots, M\}\) and \(Y_{c_j}^{\max}\) is 0 if \(\nexists\, y_i^c = c_j\); otherwise

$$Y_{c_j}^{\max} = \max_i \left[ \Big( y_i^\gamma + R \cdot \mathbb{1}\left( y_i^\gamma \ge \alpha \right) \Big) \cdot \mathbb{1}\left( y_i^c = c_j \right) \right]. \tag{2}$$

\(\mathbb{1}(\cdot)\) is an indicator function that outputs 1 if its argument is true and 0 otherwise. The term \(R \cdot \mathbb{1}(y_i^\gamma \ge \alpha)\) enforces a higher priority of full-color synthesis, where the PLQY for each color shall be at least \(\alpha\) (\(\alpha = 0.5\) in our case) to earn an additional reward of \(R\) (\(R = 10\) in our settings). \(R\) can be any real value larger than 1 (i.e., the maximum possible improvement of PLQY for one synthesis condition), to ensure the higher priority of exploring synthesis conditions for colors in which the yield has not achieved \(\alpha\). We set \(R\) to 10, such that the tens digit of the unified objective function's value clearly indicates the number of colors with maximum PLQYs exceeding \(\alpha\), and the units digit reflects the sum of maximum PLQYs (without the additional reward) for all colors. As defined by the ranges of PL wavelength in Supplementary Table 2, the seven primary colors considered in this work are purple (<420 nm), blue (≥420 and <460 nm), cyan (≥460 and <490 nm), green (≥490 and <520 nm), yellow (≥520 and <550 nm), orange (≥550 and <610 nm), and red (≥610 nm), i.e., \(M = 7\). Notably, the proposed MOO formulation unifies the two goals of achieving full color and high PLQY into a single objective function, providing a systematic approach to tune synthesis parameters for the desired properties.
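Equations (1) and (2) can be transcribed almost directly into code. The sketch below is an illustration, not the authors' implementation: the function names and the sample data are hypothetical, and only the values \(\alpha = 0.5\), \(R = 10\) and the seven color labels come from the text.

```python
COLORS = ["purple", "blue", "cyan", "green", "yellow", "orange", "red"]

def unified_objective(samples, colors=COLORS, alpha=0.5, reward=10.0):
    """Eq. (1): sum over colors of Y_max, with Y_max from Eq. (2).

    `samples` is a list of (color_label, plqy) pairs with plqy in [0, 1].
    A color with no sample contributes 0.
    """
    total = 0.0
    for color in colors:
        scores = [
            plqy + (reward if plqy >= alpha else 0.0)
            for label, plqy in samples
            if label == color
        ]
        total += max(scores) if scores else 0.0
    return total

def decode(value, reward=10.0):
    """Split a utility value into (colors at or above alpha, sum of best PLQYs).

    Valid while the sum of best PLQYs stays below `reward`, which holds for
    seven colors with PLQY <= 1.
    """
    n_colors = int(value // reward)
    return n_colors, value - reward * n_colors

# Hypothetical explored conditions: two red samples, one blue, one green.
samples = [("red", 0.62), ("red", 0.30), ("blue", 0.55), ("green", 0.20)]
value = unified_objective(samples)
n_colors, plqy_sum = decode(value)
```

Here red and blue clear the threshold and green does not, so the value is 10.62 + 10.55 + 0.20 = 21.37: a tens digit of 2 and a PLQY sum of 1.37. Read the same way, a utility value such as 75.44 corresponds to seven colors above the threshold with best PLQYs summing to 5.44.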

The MOO strategy is premised on the prediction results of ML models. Due to the high-dimensional search space and limited experimental data, it is challenging to build models that generalize well on unseen data, especially considering the nonlinear nature of the condition-property relationship37. To address this issue, we employed a gradient boosting decision tree-based model (XGBoost), which has proven advantageous in handling related material datasets (see Methods and Supplementary Fig.3)30,38. In addition, its capability to guide hydrothermal synthesis has been proven in our previous work (Supplementary Fig.4)21. Two regression models, optimized with the best hyperparameters through grid search, were fitted on the given dataset, one for PL wavelength and the other for PLQY. These models were then deployed to predict all unexplored candidate synthesis conditions. The search space for candidate conditions is defined by the Cartesian product of all possible values of eight synthesis parameters, resulting in ~20 million possible combinations (see Supplementary Table1). The candidate synthesis conditions, i.e., unexplored regions of the search space, are further ranked by MOO evaluation strategy with the prediction results.
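The enumerate-predict-rank loop described above might look like the following sketch. Several parts are assumptions made for illustration: the two predictor functions are hypothetical closed-form stand-ins for the fitted regression models, the grid has two parameters instead of eight, and each candidate is scored by the unified objective it would yield if its predicted properties were added to the explored set, since the excerpt does not spell out the exact ranking rule.

```python
from bisect import bisect_right
from itertools import product

COLORS = ["purple", "blue", "cyan", "green", "yellow", "orange", "red"]
EDGES = [420, 460, 490, 520, 550, 610]  # nm, lower bounds of blue ... red

def color_of(wavelength_nm):
    """Map a PL wavelength to its color label using the ranges in the text."""
    return COLORS[bisect_right(EDGES, wavelength_nm)]

def objective(samples, alpha=0.5, reward=10.0):
    """Unified objective over (wavelength, plqy) pairs."""
    best = {}
    for wavelength, plqy in samples:
        score = plqy + (reward if plqy >= alpha else 0.0)
        color = color_of(wavelength)
        best[color] = max(best.get(color, 0.0), score)
    return sum(best.values())

# Hypothetical stand-ins for the two fitted regressors (T in deg C, t in h).
def predict_wavelength(T, t):
    return 400.0 + 1.5 * (T - 120) + 4.0 * t

def predict_plqy(T, t):
    return max(0.0, min(1.0, 0.9 - 0.004 * abs(T - 180) - 0.02 * abs(t - 6)))

def recommend(explored, grid, k=2):
    """Rank unexplored grid points by the objective they would yield."""
    explored_conditions = {cond for cond, _, _ in explored}
    observed = [(w, q) for _, w, q in explored]
    ranked = []
    for cond in product(*grid.values()):  # Cartesian product of all values
        if cond in explored_conditions:
            continue
        predicted = (predict_wavelength(*cond), predict_plqy(*cond))
        ranked.append((objective(observed + [predicted]), cond))
    ranked.sort(reverse=True)
    return [cond for _, cond in ranked[:k]]

grid = {"T": [120, 160, 200, 220], "t": [2, 6, 10]}
explored = [((120, 2), 408.0, 0.55), ((200, 10), 560.0, 0.30)]
top_two = recommend(explored, grid)
```

With this toy setup the two recommended conditions are the ones predicted to open up a new color above the threshold with the highest PLQY, which mirrors the priority that the reward term is meant to enforce.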

Finally, the PL wavelength and PLQY values of the CQDs synthesized under the top two recommended synthesis conditions are verified through experiments and characterization, and the results are then added to the training dataset for the next iteration of the MOO design loop. The iterative design loops continue until the objectives are fulfilled, i.e., when the achieved PLQY for all seven colors surpasses 50%. In prior studies on CQDs, it is worth noting that only a limited number of CQDs with short-wavelength fluorescence (e.g., blue and green) have reached PLQYs above 50%39,40,41. On the other hand, their long-wavelength counterparts, particularly those with orange and red fluorescence, usually demonstrate PLQYs under 20%42,43,44. Underlining the efficacy of our ML-powered MOO strategy, we have set an ambitious goal for all fluorescent CQDs: the attainment of PLQYs exceeding 50%. The capacity to modulate the PL emission of CQDs holds significant promise for various applications, spanning from bioimaging and sensing to optoelectronics. Our four-stage workflow is crafted to forge an ML-integrated MOO strategy that can iteratively guide hydrothermal synthesis of CQDs for multiple desired properties, while also constantly improving the models' prediction performance.

To assess the effectiveness of our ML-driven MOO strategy in the hydrothermal synthesis of CQDs, we employed several metrics, which were specifically chosen to ascertain whether our proposed approach not only meets its dual objectives but also enhances prediction accuracy throughout the iterative process. The unified objective function described above measures how well the two desired objectives have been realized experimentally, and thus can be a quantitative indicator of the effectiveness of our proposed approach in guiding the CQD synthesis. The output of the unified objective function after a specific ML-guided synthesis loop is termed the objective utility value. The MOO strategy improves the objective utility value by a large margin of 39.27% to 75.44, denoting that the maximum PLQY in all seven colors exceeds the target of 0.5 (Fig. 2a). Specifically, at iterations 7 and 19, the number of color labels with maximum PLQY exceeding 50% increases by one, resulting in an additional reward of 10 each time. Even on the apparent plateaus, the two insets illustrate that the maximally achieved PLQY is continuously enhanced. For instance, during iterations 8 to 11, the maximum PLQY for cyan emission escalates from 59% to 94%, and the maximum PLQY for purple emission rises from 52% to 71%. Impressively, our MOO approach successfully fulfilled both objectives within only 20 iterations (i.e., 40 guided experiments).

(a) MOO's unified objective utility versus design iterations. (b) Colors explored with newly synthesized experimental conditions. Value ranges of colors defined by PL wavelength: purple (PL < 420 nm), blue (420 nm ≤ PL < 460 nm), cyan (460 nm ≤ PL < 490 nm), green (490 nm ≤ PL < 520 nm), yellow (520 nm ≤ PL < 550 nm), orange (550 nm ≤ PL < 610 nm), and red (610 nm ≤ PL). It shows that while high PLQY has been achieved for red, orange, and blue in the initial dataset, the MOO strategy purposefully enhances PLQYs for yellow, purple, cyan, and green, respectively, in subsequent synthesized conditions in groups of five iterations. (c) MSE between the predicted and real target properties. (d) Covariance matrix for correlation among the 8 synthesis parameters (i.e., reaction temperature T, reaction time t, type of catalyst C, volume/mass of catalyst VC, type of solution S, volume of solution VS, ramp rate Rr, and mass of precursor Mp) and 2 target properties, i.e., PLQY and PL wavelength. (e) Two-dimensional t-distributed stochastic neighbor embedding (t-SNE) plot for the whole search space, including unexplored (circular points), training (star-shaped points), and explored (square points) conditions, where the latter two sets are colored by real PL wavelengths.

Figure 2b reveals that the MOO strategy systematically explores the synthesis conditions for each color, addressing those that have not yet achieved the designed PLQY threshold, starting with yellow in the first 5 iterations and ending with green in the last 5 iterations. Notably, within each set of 5 iterations, a single color demonstrates an enhancement in its maximum PLQY. Initially, the PLQY for yellow surges to 65%, which is then followed by a significant rise in purple's maximum PLQY from 44% to 71% during the next set of 5 iterations. This trend continues with cyan and green, where the maximum PLQY escalates to 94% and 83% respectively. Taking into account both the training set (i.e., the first 23 samples) and the augmented dataset, the peak PLQY for all colors exceeds 60%. Several colors approach 70% (including purple, blue, and red), and some are near 100% (including cyan, green, and orange). This further underscores the effectiveness of our proposed ML technique. A more detailed visualization of the PL wavelength and PLQY along each iteration is provided in Supplementary Fig. 5.

The MOO strategy ranks candidate synthesis conditions based on ML prediction; thus, it is vital to evaluate the ML models' performance. Mean squared error (MSE), a metric commonly used for regression, is employed for evaluation and is computed from the PL wavelength and PLQY predicted by the ML models versus the experimentally determined values45. As shown in Fig. 2c, the MSE of PLQY drastically decreases from 0.45 to approximately 0.15 within just four iterations, a notable error reduction of 64.5%. The MSE eventually stabilizes around 0.1 as the iterative loops progress. Meanwhile, the MSE of PL wavelength remains consistently low, always under 0.1. The MSE of PL wavelength is computed after normalizing all values to the range of zero to one for a fair comparison, thus an MSE of 0.1 signifies a favorable deviation within 10% between the ML-predicted values and the experimental verifications. This indicates that the accuracies of our ML models for both PL wavelength and PLQY consistently improve, with predictions closely aligning with actual values after enhanced learning from augmented data. This demonstrates the efficacy of our MOO strategy not only in optimizing multiple desired properties but also in refining ML models.
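The normalization-then-MSE computation mentioned above can be sketched as follows. The wavelength values and the 400–650 nm normalization bounds are hypothetical and serve only to show the arithmetic; the excerpt does not state which bounds were used.

```python
def min_max_normalize(values, lo, hi):
    """Scale values linearly so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]

def mse(predicted, actual):
    """Mean squared error between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical predicted and measured PL wavelengths (nm).
predicted_nm = [450.0, 520.0, 600.0]
measured_nm = [460.0, 500.0, 610.0]

# Assumed normalization bounds, applied identically to both series.
LO_NM, HI_NM = 400.0, 650.0
wavelength_mse = mse(
    min_max_normalize(predicted_nm, LO_NM, HI_NM),
    min_max_normalize(measured_nm, LO_NM, HI_NM),
)
```

The normalized errors are -0.04, 0.08 and -0.04, so the MSE is (0.0016 + 0.0064 + 0.0016) / 3 = 0.0032. PLQY already lies in [0, 1], so `mse` can be applied to it directly.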

To unveil the correlation between synthesis parameters and target properties, we further calculated the covariance matrix. As illustrated in Fig.2d, the eight synthesis parameters generally exhibit low correlation among each other, indicating that each parameter contributes unique and complementary information for the optimization of the CQDs synthesis conditions. In terms of the impact of these synthesis parameters on target properties, factors such as reaction time and temperature are found to influence both PL wavelength and PLQY. This underscores the importance for both experimentalists and data-driven methods to adjust them with higher precision. Besides reaction time and temperature, PL wavelength and PLQY are determined by distinct sets of synthesis parameters with varying relations. For instance, the type of solution affects PLQY with a negative correlation, while solution volume has a stronger positive correlation with PLQY. This reiterates that, given the high-dimensional search space, the complex interplay between synthesis parameters and multiple target properties can hardly be unfolded without capable ML-integrated methods.
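A pairwise correlation matrix of this kind can be computed as sketched below. Pearson correlation is assumed here as the measure, and the three columns are hypothetical toy data; the excerpt does not state how categorical parameters such as catalyst or solution type were encoded.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical toy data: temperature (deg C), time (h) and PLQY.
data = {
    "T": [120, 160, 200, 220],
    "t": [2, 10, 6, 4],
    "PLQY": [0.30, 0.50, 0.70, 0.80],
}
corr = correlation_matrix(data)
```

In this toy table PLQY is an exact linear function of T, so `corr["T"]["PLQY"]` is 1 up to rounding, while the other off-diagonal entries are weaker.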

To visualize how the MOO strategy has navigated the expansive search space (~20 million) using only 63 data samples, we have compressed the initial training, explored, and unexplored space into two dimensions by projecting them into a new reduced embedding space using t-distributed stochastic neighbor embedding (t-SNE)46. As shown in Fig. 2e, discerning distinct clustering patterns by color proves challenging, which emphasizes the intricate task of uncovering the relationship between synthesis conditions and target properties. This complexity further underscores the critical role of an ML-driven approach in deciphering the hidden intricacies within the data. The efficacy of ML models is premised on the quality of training data. Thus, selecting training data that span as large a search space as possible is particularly advantageous to the models' generalizability37. As observed in Fig. 2e, our developed ML models benefit from the randomly and sparsely distributed training data, which in turn encourage the models to further generalize to previously unseen areas in the search space, and effectively guide the search for optimal synthesis conditions within this intricate multi-objective optimization landscape.

With the aid of the ML-coupled MOO strategy, we have successfully and rapidly identified the optimal conditions giving rise to full-color CQDs with high PLQY. The ML-recommended synthesis conditions that produced the highest PLQY of each color are detailed in the Methods section. Ten CQDs with the best optical performance were selected for in-depth spectral investigation. The resulting absorption spectra of the CQDs manifest strong excitonic absorption bands, and the normalized PL spectra of the CQDs display PL peaks ranging from 410 nm for purple CQDs (p-CQDs) to 645 nm for red CQDs (r-CQDs), as shown in Fig. 3a and Supplementary Fig. 6. This encompasses a diverse array of CQD types, including p-CQDs, blue CQDs (b-CQDs, 420 nm), cyan CQDs (c-CQDs, 470 nm), dark-cyan CQDs (dc-CQDs, 485 nm), green CQDs (g-CQDs, 490 nm), yellow-green CQDs (yg-CQDs, 530 nm), yellow CQDs (y-CQDs, 540 nm), orange CQDs (o-CQDs, 575 nm), orange-red CQDs (or-CQDs, 605 nm), and r-CQDs. Importantly, the PLQYs of most of these CQDs were above 60% (Supplementary Table 3), exceeding the majority of CQDs reported to date (Supplementary Table 4). Corresponding photographs of full-color fluorescence ranging from purple to red light under UV light irradiation are provided in Fig. 3b. Excellent excitation-independent behaviors of the CQDs have been further revealed by the three-dimensional fluorescence spectra (Supplementary Fig. 7). Furthermore, a comprehensive investigation of the time-resolved PL spectra revealed a notable trend. The monoexponential lifetimes of CQDs progressively decreased from 8.6 ns (p-CQDs) to 2.3 ns (r-CQDs) (Supplementary Fig. 8). This observation signifies that the lifetimes of CQDs diminish as their PL wavelength shifts towards the red end of the spectrum47. Moreover, the CQDs also demonstrate long-term photostability (>12 hours), rendering them potential candidates for applications in optoelectronic devices that require stable performance over extended periods of time (Supplementary Fig. 9). All the results together demonstrate the high quality and great potential of our synthesized CQDs.

Fig. 3: a Normalized PL spectra of CQDs. b Photographs of CQDs under 365 nm UV light irradiation. c Dependence of the HOMO and LUMO energy levels of CQDs.

To gain further insights into the properties of the synthesized CQDs, we calculated their bandgap energies using the experimentally obtained absorption band values (Supplementary Fig. 10 and Table 5). The calculated bandgap energies gradually decrease from 3.02 eV for p-CQDs to 1.91 eV for r-CQDs. In addition, we measured the highest occupied molecular orbital (HOMO) energy levels of the CQDs using ultraviolet photoelectron spectroscopy. As shown in the energy diagram in Fig. 3c, the HOMO values exhibit wave-like variations without any discernible pattern. This result further suggests the robust predictive and optimizing capability of our ML-integrated MOO strategy, which enabled the successful screening of these high-quality CQDs from a vast and complex search space using only 40 sets of experiments.
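The reported bandgap range can be cross-checked against the PL peaks with the photon-energy relation E = hc/λ. The sketch below is only that sanity check, assuming a direct energy-to-wavelength conversion; it is not the authors' procedure for extracting bandgaps from the absorption bands, which this excerpt does not describe. The function name is illustrative.

```python
# Rough sanity check (assumption: direct E = h*c / wavelength conversion,
# not the authors' bandgap-extraction method).
HC_EV_NM = 1239.841984  # h*c expressed in eV*nm


def bandgap_to_wavelength_nm(energy_ev: float) -> float:
    """Return the photon wavelength (nm) equivalent to an energy in eV."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_NM / energy_ev


# Bandgap values quoted in the text for the two ends of the color range.
for label, gap_ev in [("p-CQDs", 3.02), ("r-CQDs", 1.91)]:
    print(f"{label}: {gap_ev:.2f} eV ~ {bandgap_to_wavelength_nm(gap_ev):.1f} nm")
```

Under this assumption the two bandgaps map to roughly 410 nm and 650 nm, close to the 410 nm and 645 nm PL peaks quoted for p-CQDs and r-CQDs.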

To uncover the underlying mechanism of the tuneable optical effect of the synthesized CQDs, we carried out a series of characterizations to investigate their morphologies and structures (see Methods). X-ray diffraction (XRD) patterns with a single graphite peak at 26.5° indicate a high degree of graphitization in all CQDs (Supplementary Fig. 11)15. Raman spectra exhibit a stronger signal intensity for the ordered G band at 1585 cm⁻¹ than for the disordered D band at 1397 cm⁻¹, further confirming the high degree of graphitization (Supplementary Fig. 12)48. Fourier-transform infrared (FT-IR) spectroscopy was then performed to detect the functional groups in the CQDs. It clearly reveals NH2 and N-C stretching at 3234 and 1457 cm⁻¹, respectively, indicating the presence of abundant NH2 groups on the surface of the CQDs, except for orange CQDs (o-CQDs) and yellow CQDs (y-CQDs) (Supplementary Fig. 13)49. The C=C aromatic ring stretching at 1510 cm⁻¹ confirms the carbon skeleton, while three oxide-related peaks, i.e., O-H, C=O, and C-O stretching, were observed at 3480, 1580, and 1240 cm⁻¹, respectively, due to the abundant hydroxyl groups of the precursor. The FT-IR spectrum also shows an SO3 stretching vibration band at 1025 cm⁻¹, confirming the additional functionalization of y-CQDs by SO3H groups.

X-ray photoelectron spectroscopy (XPS) was adopted to further probe the functional groups in the CQDs (Supplementary Fig. 14 to 23). XPS survey spectra reveal three main elements in the CQDs, i.e., C, O, and N, except in o-CQDs and y-CQDs: both lack the N element, and y-CQDs additionally contain the S element. The high-resolution C1s spectrum of the CQDs can be deconvoluted into three peaks, namely a dominant C-C/C=C graphitic carbon bond (284.8 eV), C-O/C-N (286 eV), and carboxylic C=O (288 eV), revealing the structures of the CQDs. The N1s peak at 399.7 eV indicates the presence of N-C bonds, verifying the successful N-doping in the basal plane network structure of the CQDs, except for o-CQDs and y-CQDs. The separated O1s peaks at 531.5 and 533 eV indicate the two forms of oxyhydrogen functional groups, with C=O and C-O, respectively, consistent with the FT-IR spectra50. The S2p band of y-CQDs can be decomposed into two peaks at 163.5 and 167.4 eV, representing SO3/2p3/2 and SO3/2p1/2, respectively47,51. Combining the structural characterization results, the excellent fluorescence properties of the CQDs are attributed to N-doping, which reduces the non-radiative sites of the CQDs and promotes the formation of C=O bonds. The C=O bonds play a crucial role in radiative recombination and can increase the PLQY of the CQDs.

To gain deeper insights into the morphology and microstructure of the CQDs, we then performed transmission electron microscopy (TEM). The TEM images show uniformly shaped, monodisperse nanodots whose average lateral size increases gradually from 1.85 nm for p-CQDs to 2.3 nm for r-CQDs (Fig. 4a and Supplementary Fig. 24). This trend agrees with the corresponding PL wavelengths, providing further evidence for the quantum size effect of CQDs47. High-resolution TEM images further reveal the highly crystalline structures of the CQDs with well-resolved lattice fringes (Fig. 4b-c). The measured crystal plane spacing of 0.21 nm corresponds to the (100) graphite plane, further corroborating the XRD data. Our analysis suggests that the synthesized CQDs possess graphene-like high crystallinity, which gives rise to their superior fluorescence performance.

Fig. 4: a The lateral size and color of full-color fluorescent CQDs (inset: dependence of the PL wavelength on the lateral size of full-color fluorescent CQDs). Data correspond to mean ± standard deviation, n = 3. b, c High-resolution TEM images and the fast Fourier transform patterns of p-, b-, c-, g-, y-, o- and r-CQDs, respectively. d Boxplots of PL wavelength (left)/PLQY (right) and 7 synthesis parameters of CQDs. VC is excluded here as its value range is dependent on C, whose relationships with other parameters are not directly interpretable. The labels at the bottom indicate the minimum value (inclusive) for the respective bins. The bins on the left are the same as the discretization of colors in Supplementary Table 2, whereas the bins on the right are uniform. Each box spans vertically from the 25th percentile to the 75th percentile, with the horizontal line marking the median and the triangle indicating the mean value. The upper and lower whiskers extend from the ends of the box to the maximum and minimum data values.

Having used ML to explore the search space, we systematically examined the 63 samples using box plots to elucidate the interplay between the synthesis parameters and the resulting optical properties of the CQDs. As depicted in Fig. 4d, synthesis at high reaction temperature, with prolonged reaction time, and in low-polarity solvents tends to yield CQDs with a longer PL wavelength. These findings are consistent with general observations in the literature, which suggest that these parameters enhance precursor molecular fusion and nucleation growth, thereby yielding CQDs with increased particle size and longer PL wavelength47,52,53,54,55. Moreover, a survey of the existing literature implies that precursors and catalysts that typically combine electron donation and acceptance aid in producing long-wavelength CQDs56,57. Interestingly, diverging from these traditional findings, we successfully synthesized long-wavelength red CQDs under ML guidance using 2,7-naphthalenediol, which contains electron-donating groups, as the precursor and EDA, which is known for its electron-donating functionalities, as the catalyst. This significant breakthrough questions existing assumptions and offers new insights into the design of long-wavelength CQDs.

Concerning PLQY, we found that catalysts with stronger electron-donating groups (e.g., EDA) led to enhanced PLQY in CQDs, consistent with earlier observations made by our research team16. Remarkably, we also uncovered a significant impact of the synthesis parameters on the PLQY of CQDs. In the high-PLQY regime, strong positive correlations were discovered between PLQY and reaction temperature, reaction time, and solvent polarity, which have not previously been reported in the literature58,59,60,61. This insight could be applied to similar systems to improve PLQY.

Aside from the parameters discussed above, other factors such as ramp rate, the amount of precursor, and solvent volume also influence the properties of CQDs. Overall, the emission color and PLQY of CQDs are governed by complex, non-linear trends resulting from the interaction of numerous factors. It is noteworthy that the traditional methods used to adjust CQD properties often result in a decrease in PLQY as the PL wavelength redshifts4,47,51,54. However, using AI-assisted synthesis, we have successfully increased the PLQY of the resulting full-color CQDs to over 60%. This achievement highlights the advantages offered by ML-guided CQD synthesis and confirms the potential of ML-based methods to navigate the complex relationships among diverse synthesis parameters and multiple target properties within a high-dimensional search space.

Read the rest here:
Machine learning-guided realization of full-color high-quantum-yield carbon quantum dots - Nature.com

Do elephants have names for each other? – Nature.com

Elephants seem to use personalized calls to address members of their group, providing a rare example of naming in animals other than humans.

"There's a lot more sophistication in animal lives than we are typically aware," says Michael Pardo, a behavioural ecologist at Cornell University in Ithaca, New York. "Elephants' communication may be even more complex than we previously realized."

Other than humans, few animals give each other names. Bottlenose dolphins (Tursiops truncatus) and orange-fronted parakeets (Eupsittula canicularis) are known to identify each other by mimicking the signature calls of those they are addressing. By contrast, humans use names that have no inherent association with the people, or objects, they're referring to. Pardo had a hunch that elephants might also have names for each other, because of their extensive vocal communication and rich social relationships.

To find out, Pardo and his colleagues recorded, between 1986 and 2022, the deep rumbles of wild female African savannah elephants (Loxodonta africana) and their offspring in Amboseli National Park in southern Kenya, and in the Samburu and Buffalo Springs National Reserves in the country's north. The findings were published today in Nature Ecology & Evolution1.

The researchers analysed recordings of 469 rumbles using a machine-learning technique. The model correctly identified which elephant was being addressed 27.5% of the time, a much higher success rate than when the model was fed with random audio as a control. This suggests that the rumbles carry information that is intended only for a specific elephant.

Next, Pardo and his colleagues played recordings of these calls to 17 elephants and compared their reactions. The elephants became more vocal and moved more quickly towards the speaker when they heard their name compared with when they heard rumbles directed at other elephants. "They could tell if a call was addressed to them just by hearing that call," says Pardo.

The findings are a very promising start, although more evidence is needed to confirm whether elephants do indeed call each other by name, says Hannah Mumby, a behavioural and evolutionary ecologist at the University of Hong Kong. She adds that understanding elephants' social relationships and the role of each individual in the group is important for conservation efforts. "Conserving elephants goes far beyond population numbers," says Mumby.

The next question for the team involves working out how elephants encode information in their calls. "That would open up a whole range of other questions we could ask," says Pardo, such as whether elephants also name places or even talk about each other in the third person.

Link:
Do elephants have names for each other? - Nature.com

Predicting sales and cross-border e-commerce supply chain management using artificial neural networks and the … – Nature.com

This section presents a model for supply chain management in CBEC using artificial intelligence (AI). The approach provisions resources by using a collection of ANNs to forecast future demand. Before this method is described in depth, the specifications of the dataset used in this study are given.

The data for this study were obtained by examining the performance of seven active sellers in the international product trade over the course of a month. All of these sellers operate globally in the bulk physical product exchange market, which implies that all goods bought by clients have to be shipped by land, air, or sea. Each seller in this industry uses a minimum of four online sales platforms to trade its items. Each of the 945 records that make up the dataset assembled for each vendor contains data on the orders that consumers placed with that vendor. The bulk product transactions in the records range from a minimum of 3 to a maximum of 29 units. Every record is described by twenty-three attributes, including order registration time, date, month, method (platform type used), order volume, destination, product type, shipping method, active inventory level, product shipping delay history over the previous seven transactions, and product order volume history over the previous seven days. Each of these last two attributes is stored as a single numerical vector.

This section describes a CBEC system that incorporates a tangible product supply chain under the management of numerous retailers and platforms. The primary objective of this study is to enhance the supply chain performance in CBEC through the implementation of machine learning (ML) and Internet of Things (IoT) architectures. This framework comprises four primary components:

Retailers: They are responsible for marketing and selling products.

Common sales platform: It provides a platform on which retailers introduce and sell products.

Product warehouse: It is the place where each retailer stores its products.

Supply center: It is responsible for instantly providing the resources needed by retailers.

The CBEC system model comprises N autonomous retailers, all of which are authorized to engage in marketing and distribution of one or more products. Each retailer maintains a minimum of one warehouse for product storage. Additionally, retailers may utilize multiple online sales platforms to market and sell their products.

Consumers place orders via these electronic commerce platforms in order to acquire the products they prefer. Through the platform, the registered orders are transmitted to the product's proprietor. The retailer generates and transmits the sales form to the data center situated within the supply center as soon as it receives the order. The supply center is responsible for delivering the essential resources to each retailer in a timely manner. In traditional applications of the CBEC system, the supply center provides resources in a reactive capacity. This approach contributes to an extended order processing time, which ultimately erodes customer confidence and may result in the dissolution of the relationship. Proactive implementation of this procedure is incorporated into the proposed framework. Machine learning methods are applied to predict the number of orders that will be submitted by each agent at future time intervals. Following this, the allocation of resources in the storage facilities of each agent is ascertained by the results of these forecasts. In accordance with the proposed framework, the agent's warehouse inventory is modified in the data center after the sales form is transmitted to the data center. Additionally, a model based on ensemble learning is employed to forecast the quantity of upcoming orders for the product held by the retailer. The supply center subsequently acquires the required resources for the retailer in light of the forecast's outcome. The likelihood of inventory depletion and the time required to process orders are both substantially reduced through the implementation of this procedure.

As mentioned earlier, this framework also enhances the efficacy of the supply chain by integrating an IoT architecture. For this purpose, RFID technology is implemented in supply management. Every individual product in the proposed framework is assigned a unique RFID identification tag. Using passive identifiers reduces the system's ultimate implementation cost. In the proposed paradigm, the electronic device serves as an automated data carrier for the RFID-based asset management system. The architecture of this system integrates passive RFID devices that operate in the UHF band. In addition, tag reader gateways are installed in the product warehouses of each retailer to monitor merchandise entering and leaving the premises. The product entry and exit procedure begins when the tag reader extracts the unique identifier stored in the RFID tag. This identifier is then transmitted to the controller to which the reader node is connected. The controller node sends a query containing the product's unique identifier to the data center in order to acquire product information, including entry/exit authorization. Once the procedure is authorized, the controller node sends a storage command to the data center to register the product transfer information. This registration in turn updates the inventory of the retailer's product warehouse. The overall operation of the proposed system can therefore be divided into the following two phases:

Predicting the number of orders of each retailer in future time intervals using ML techniques.

Assigning resources to the warehouses of specific agents based on the outcomes of the predictions, and keeping the data center inventory for each agent's warehouse up to date.

The following sub-sections explain each of these phases.

Within this framework, the imminent order volume for each vendor is forecast using a weighted ensemble model. The number of prediction models is directly proportional to the number of retailers that participate in the CBEC system. To predict the future volume of customer orders for its retailer, each ensemble model combines the forecasts produced by its internal learning models. The supplier furnishes the requisite supplies to each agent in accordance with these projections. By proactively removing the delay that arises from the reactive supply of requested products, this methodology minimizes the overall duration of the supply chain product delivery process. The first step in forecasting sales volume is to identify, using a combination of FSFS and ANOVA, which attributes have the greatest bearing on the sales volume of a particular merchant. Sales projections are then generated by a weighted ensemble model that combines sales volume with the most pertinent features. For a specific retailer, each of the three ANN models comprising the ensemble is trained on the order patterns of that retailer. While ensemble learning can enhance the accuracy of predictions produced by learning systems, two additional factors should be considered in order to optimize its performance even further.

Acceptable performance of each learning model: Every learning component in an ensemble system has to perform satisfactorily for the combination of their outputs to lower the total prediction error. This calls for well-configured learning models, such that every model continues to operate as intended even when handling a variety of data patterns.

Output weighting: In most ensemble system application scenarios, the efficacy of the learning components differs: some learning models forecast the target variable with a lower error rate, while others show a higher error rate. Consequently, in contrast to traditional ensemble systems, the outputs of all predictive components should not be assigned an identical weight. To address this issue, a weighting strategy can be applied to the outputs of the learning components, thereby producing a weighted ensemble system.

CapSA is utilized in the proposed method to address these two concerns. The operation of the proposed weighted ensemble model for forecasting customer order volumes is illustrated in Fig.1.

Fig. 1: Operation of the proposed weighted ensemble model for predicting order volume.

As illustrated in Fig. 1, the ensemble model comprises three ANN-based predictive components that collaborate to forecast the order volume of a retailer. Every individual learning model is trained using a distinct subset of the sales history data of its retailer. The proposed method uses CapSA both to determine the optimal configuration and to adjust the weight vector of each ANN model. Note that the configuration of every ANN model is distinct from that of the other two models. The configuration and training of the models can be expedited with parallel processing. During the configuration phase, the parameter values of every ANN model are chosen so as to minimize the mean absolute error criterion. This mechanism yields an optimal configuration set of learning models, so that every component functions at its designated level. After the configuration of each ANN component is complete, the weight of the output of each predictive component is determined, again using CapSA. During this phase, CapSA attempts to determine the weight of each learning model's output in relation to its performance.

After CapSA has optimized the weight values, the assembled and weighted models can be used to predict the order volume for new samples. To this end, during the testing phase, the input features are provided to each of the predictive components ANN1, ANN2, and ANN3. The final output of the proposed model is computed as the weighted average of the outputs of these components.

The set of characteristics describing the sales pattern may contain irrelevant ones. Hence, the proposed approach employs one-way ANOVA to determine the significance of the input features and identify those associated with the sales pattern. The F-score of each feature is computed using the ANOVA test. Generally speaking, features with greater F values carry greater significance during the prediction stage. After the features have been ranked, the FSFS method is used to select the desired features. The primary function of FSFS is to determine the most appropriate subset of the ranked features. The algorithm builds this subset by iteratively selecting features from the input set in accordance with their ranking. Each time a new feature is added to the subset, the prediction error of the learning model is assessed. The feature addition procedure ends when adding a new feature degrades the performance of the learning model, and the feature subset with the smallest error is taken as the optimal subset. The components of the ensemble system are then trained on the resulting feature set in order to forecast sales volume.
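The ranking-then-selection procedure can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: how samples are grouped for the one-way ANOVA (for example, by binned sales volume) is an assumption, and `anova_f`, `forward_select`, the feature names, and the toy error function are all hypothetical.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for one feature's values split into groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def forward_select(ranked_features, error_fn):
    """Add features in rank order; stop at the first one that raises the error."""
    selected, best_error = [], float("inf")
    for feature in ranked_features:
        error = error_fn(selected + [feature])
        if error > best_error:
            break  # the new feature hurt performance, so keep the previous subset
        selected.append(feature)
        best_error = error
    return selected, best_error


# Toy usage: features already sorted by descending F-score (hypothetical names),
# with a stand-in error function whose minimum is at two features.
ranked = ["order_volume_history", "platform", "destination", "month"]
toy_error = lambda subset: abs(len(subset) - 2) + 1.0
print(forward_select(ranked, toy_error))
```

In practice `error_fn` would train the learning model on the candidate subset and return its validation error, which is the expensive part of the procedure.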

Within the proposed method, CapSA is responsible for identifying the most appropriate neural network topologies and the optimal weight values. As previously stated, the ensemble model comprises three ANNs, each of which forecasts the forthcoming sales volume for a specific retailer. Using CapSA, the configuration and training processes for each of these ANN models are conducted independently. This section explains how the optimal configuration is determined and how the weight vector of each ANN model is adjusted: the structure of the solution vector and the objective function are defined first, and the steps required to solve the resulting optimization problem with CapSA are outlined afterwards. The optimization algorithm uses the solution vector to determine the topology, the network biases, and the weights of the neuronal connections. As a result, every solution vector in the optimization process consists of two linked parts. The first part specifies the network topology. The second part holds the connection weights and biases that match the topology given in the first part. Consequently, the length of the solution vectors in CapSA is variable and depends on the defined topology. Because a neural network can have an unlimited number of topological states, the part of the solution vector that relates to topology must be restricted. The first part of the solution vector is constrained as follows in order to narrow down the search space:

Every neural network has exactly one hidden layer. As such, the first part of the solution vector consists of a single element, whose value is the number of neurons assigned to the hidden layer of the neural network.

The hidden layer of the neural network has a minimum of 4 and a maximum of 15 neurons.

The numbers of input features and target classes determine the dimensions of the input and output layers of the neural network, respectively. As a result, the first part of the solution vector, known as the topology determination, solely specifies the number of neurons in the hidden layer. The value in the first part therefore determines the number of neurons in the neural network, and with it the length of the second part of the solution vector. For a neural network with I input neurons, H hidden neurons, and P output neurons, the length of the second part of the solution vector in CapSA is equal to \(H\times (I+1)+P\times (H+1)\).
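A short sketch makes the two-part encoding concrete. It assumes a particular layout for the second part (all hidden-layer rows first, then output-layer rows, with each row holding a neuron's incoming weights followed by its bias); the excerpt does not specify the ordering, so this layout and the function names are illustrative only.

```python
def weights_length(n_inputs, n_hidden, n_outputs):
    """Length of the second part: H*(I+1) + P*(H+1), counting one bias per neuron."""
    return n_hidden * (n_inputs + 1) + n_outputs * (n_hidden + 1)


def split_solution(solution, n_inputs, n_outputs):
    """Split a solution vector into (H, hidden-layer rows, output-layer rows).

    Assumed layout: solution[0] is the hidden-layer size H; the remaining
    elements are hidden rows followed by output rows (weights, then bias).
    """
    n_hidden = int(solution[0])
    if not 4 <= n_hidden <= 15:  # bounds stated in the text
        raise ValueError("hidden layer must have between 4 and 15 neurons")
    params = solution[1:]
    if len(params) != weights_length(n_inputs, n_hidden, n_outputs):
        raise ValueError("second part does not match the declared topology")
    row_h, row_o = n_inputs + 1, n_hidden + 1
    cut = n_hidden * row_h
    hidden = [params[r * row_h:(r + 1) * row_h] for r in range(n_hidden)]
    output = [params[cut + r * row_o:cut + (r + 1) * row_o] for r in range(n_outputs)]
    return n_hidden, hidden, output


# Example: 10 input features, 4 hidden neurons, 1 output neuron.
print(weights_length(10, 4, 1))
```

Because the length of the second part depends on the first element, two candidate solutions with different hidden-layer sizes have different lengths, which is why the text describes the solution vectors as variable-length.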

In CapSA, optimal solutions are identified by applying a fitness function to each candidate. To this end, once the weights and topology of the neural network have been configured from the solution vector, the network produces outputs for the training samples. These outputs are then compared to the actual target values, and the mean absolute error criterion is applied to assess the neural network's performance and the optimality of the generated solution. CapSA's fitness function is thus defined as follows:

$$MAE=\frac{1}{N}\sum_{i=1}^{N}\left|{T}_{i}-{Z}_{i}\right|$$

(1)

In this context, N denotes the number of training samples, \(T_i\) is the target value for the i-th training sample, and \(Z_i\) is the output generated by the neural network for the i-th training sample. The proposed method uses CapSA to find a neural network structure that minimizes Eq. (1). In CapSA, the initial population is generated at random, and the search bounds for the second part of the solution vector are set to [-1, +1]. Thus, all weight values assigned to the connections between neurons and to the biases of the neural network fall within this range. CapSA determines the optimal solution through the following procedure:

Step 1: The initial population of Capuchin agents is initialized with random values.

Step 2: The fitness of each solution vector (Capuchin) is calculated based on Eq. (1).

Step 3: The initial speed of each Capuchin agent is set.

Step 4: Half of the Capuchin population is randomly selected as leaders and the rest are designated as follower Capuchins.

Step 5: If the number of algorithm iterations has reached the maximum G, go to Step 13; otherwise, repeat the following steps:

Step 6: The CapSA lifespan parameter is calculated as follows27:

$$\tau = \beta_{0} e^{\left(-\frac{\beta_{1} g}{G}\right)^{\beta_{2}}}$$

(2)

where g represents the current iteration number, and the parameters \(\beta_{0}\), \(\beta_{1}\), and \(\beta_{2}\) have values of 2, 21, and 2, respectively.

Step 7: Repeat the following steps for each Capuchin agent i (leader or follower):

Step 8: If i is a Capuchin leader, update its speed based on Eq. (3)27:

$$v_{j}^{i} = \rho v_{j}^{i} + \tau a_{1}\left(x_{best_{j}}^{i} - x_{j}^{i}\right) r_{1} + \tau a_{2}\left(F - x_{j}^{i}\right) r_{2}$$

(3)

where the index j represents the dimensions of the problem and \(v_{j}^{i}\) represents the speed of Capuchin i in dimension j. \(x_{j}^{i}\) indicates the position of Capuchin i for the j-th variable, and \(x_{best_{j}}^{i}\) is the best position of Capuchin i for the j-th variable so far. Also, \(r_{1}\) and \(r_{2}\) are two random numbers in the range [0,1]. Finally, \(\rho\) is the parameter weighting the previous speed, which is set to 0.7.

Step 9: Update the new position of the leader Capuchins based on their speed and movement pattern.

Step 10: Update the new position of the follower Capuchins based on their speed and the leaders' position.

Step 11: Calculate the fitness of the population members based on Eq. (1).

Step 12: If the position of the entire population has been updated, go to Step 5; otherwise, repeat the algorithm from Step 7.

Step 13: Return the solution with the lowest fitness value as the optimal configuration of the ANN model.
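Steps 6 and 8 can be sketched as follows. Several points are assumptions rather than statements from the excerpt: the lifespan exponent is read here as −β1·(g/G)^β2, a decaying form, because applying the power to the negative term as typeset would make τ grow rather than shrink; a1, a2, and F are not defined in this excerpt and are treated as two coefficients and a per-dimension reference position; and r1 and r2 are drawn once per update. The function names are illustrative.

```python
import math
import random


def lifespan(g, G, beta0=2.0, beta1=21.0, beta2=2.0):
    """Lifespan parameter tau at iteration g of G.

    Assumed reading of Eq. (2): tau = beta0 * exp(-beta1 * (g / G) ** beta2).
    """
    return beta0 * math.exp(-beta1 * (g / G) ** beta2)


def update_leader_velocity(v, x, x_best, F, tau, a1, a2, rho=0.7, rng=random):
    """Leader velocity update following the structure of Eq. (3).

    v, x, x_best, F are equal-length lists (one entry per dimension j).
    a1, a2 and F are placeholders: the excerpt does not define them.
    """
    r1, r2 = rng.random(), rng.random()  # assumption: one draw per update
    return [
        rho * vj + tau * a1 * (bj - xj) * r1 + tau * a2 * (fj - xj) * r2
        for vj, xj, bj, fj in zip(v, x, x_best, F)
    ]


# Example: tau at the start, midpoint, and end of a 100-iteration run.
for g in (0, 50, 100):
    print(g, lifespan(g, 100))
```

Under this reading τ starts at β0 = 2 and decays towards zero, so the pull towards the personal best and towards F weakens as the iteration count approaches G.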

Once each predictive component has been configured and trained, CapSA is used once more to assign the most advantageous weight to each of these components. The objective of optimal weight allocation is to determine the significance coefficient of the output produced by each of the predictive components ANN1, ANN2, and ANN3 with respect to the final output of the proposed ensemble system. In this implementation of CapSA, the optimization variables are therefore the coefficients of the three estimation components of the ensemble model: the solution vector of each Capuchin has a fixed length of three, and its elements are the weight coefficients assigned to the outputs of ANN1, ANN2, and ANN3, respectively. The search range of each optimization variable is the real interval from 0 to 1. Since the computational steps of CapSA were described in the preceding section, the only remaining point is the fitness function. CapSA assigns weights to the learning components using the following fitness function, which is based on the mean absolute error criterion:

$$fitness=\frac{1}{n}\sum_{i=1}^{n}\left|{T}_{i}-\frac{\sum_{j=1}^{3}{w}_{j}\times {y}_{j}^{i}}{\sum_{j=1}^{3}{w}_{j}}\right|$$

(4)

where \(T_{i}\) represents the actual value of the target variable for the i-th sample. Also, \(y_{j}^{i}\) represents the output estimated by the ANNj model for the i-th training sample, and \(w_{j}\) indicates the weight value assigned to the ANNj model via the solution vector. Finally, n is the number of training samples.

Each model is allocated a weight coefficient within the interval [0,1] that determines how much that model contributes to the final output of the ensemble model. Note that the weighting phase of the learning components is executed only once, after the training and configuration processes have been completed. Once CapSA has determined the optimal weight values for each learning component, the volume of forthcoming orders is predicted using the trained models and these weight values. Once the predictive outputs of all three ANN models have been obtained, the proposed weighted ensemble model computes the number of forthcoming orders as follows:

$$output=\frac{\sum_{i=1}^{3}{w}_{i}\times {y}_{i}}{\sum_{i=1}^{3}{w}_{i}}$$

(5)

Within this framework, \({w}_{i}\) and \({y}_{i}\) denote the weight assigned to the ANNi model and its predicted value, respectively, for the provided input sample. Ultimately, the retailer satisfies its future obligations in accordance with the value predicted by this ensemble model.
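As a purely illustrative instance of Eq. (5), take hypothetical weights \(w_1=0.5\), \(w_2=0.3\), \(w_3=0.2\) and hypothetical model outputs \(y_1=100\), \(y_2=120\), \(y_3=90\) (none of these values come from the original study):

$$output=\frac{0.5\times 100+0.3\times 120+0.2\times 90}{0.5+0.3+0.2}=\frac{50+36+18}{1.0}=104$$

Because the weighted sum is divided by the sum of the weights, the weights do not need to sum to one; only their relative sizes matter.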

By predicting the sales volume of the product for specific retailers, it becomes possible to procure the requisite resources for each retailer in alignment with the projected sales volume. By ensuring that the supplier's limited resources are distributed equitably, this mechanism attempts to maximize the effectiveness of the sales system. In the following analysis, the sales volume predicted by the model for retailer i is represented by \(p_i\), whereas the current inventory of that retailer (agent) is denoted by \(v_i\). Furthermore, the total distribution capacity of the supplier is represented as L. The supplier then allocates the requisite resources to the retailers as follows:

Sales volume prediction: Applying the model described in the previous part, the upcoming sales volume for each agent in the future time interval (\(p_i\)) is predicted.

Receiving warehouse inventory: The current inventory of every agent (\(v_i\)) is received through supply chain management systems.

Calculating the required resources: The amount of resources required for the warehouse of each retailer is calculated as follows:

$$S_{i} = \max \left( {0,p_{i} - v_{i} } \right)$$

(6)

Calculating each agent's share of allocatable resources: The share of each retailer in the allocatable resources is calculated by Eq. (7), where N represents the number of retailers:

$${R}_{i}=\frac{{S}_{i}}{\sum_{j=1}^{N}{S}_{j}}$$

(7)

Resource allocation: The supply center sends the needed resources to each agent's warehouse according to the allocated share (\(R_i\)).

Inventory update: The inventory of every agent is updated upon receipt of the new resources.
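The allocation steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `allocate_resources` is hypothetical, and the shipped quantity is assumed to be the share \(R_i\) multiplied by the supplier's capacity L, which the text implies but does not state as a formula.

```python
from typing import List, Sequence


def allocate_resources(
    predicted_sales: Sequence[float],
    inventories: Sequence[float],
    capacity: float,
) -> List[float]:
    """Split the supplier's capacity L among retailers by relative shortfall."""
    # Eq. (6): required resources S_i = max(0, p_i - v_i).
    shortfalls = [max(0.0, p - v) for p, v in zip(predicted_sales, inventories)]

    total_shortfall = sum(shortfalls)
    if total_shortfall == 0:
        # No retailer needs stock, so nothing is shipped.
        return [0.0] * len(shortfalls)

    # Eq. (7): share R_i = S_i / sum_j S_j.
    shares = [s / total_shortfall for s in shortfalls]

    # Assumption: each retailer receives its share of the total capacity L.
    return [r * capacity for r in shares]


# Hypothetical example: three retailers and a capacity of 40 units.
allocations = allocate_resources([100, 50, 30], [40, 50, 10], 40)
updated_inventories = [v + a for v, a in zip([40, 50, 10], allocations)]
```

When the capacity exceeds the total shortfall, this sketch ships more than a retailer needs; capping each allocation at \(S_i\) would be a natural refinement, though the text does not specify it.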

See the original post here:
Predicting sales and cross-border e-commerce supply chain management using artificial neural networks and the ... - Nature.com

5 Key Ways AI and ML Can Transform Retail Business Operations – InformationWeek

Odds are you've heard more about artificial intelligence and machine learning in the last two years than you had in the previous 20. That's because advances in the technology have been exponential, and many of the world's largest brands, from Walmart and Amazon to eBay and Alibaba, are leveraging AI to generate content, power recommendation engines, and much more.

Investment in this technology is substantial, with exponential growth projected -- the AI in retail market was valued at $7.14 billion in 2023, with the potential to reach $85 billion by 2032.

Brands of all sizes are eyeing this technology to see how it fits into their retail strategies. Let's take a look at some of the impactful ways AI and ML can be leveraged to drive business growth.

One of the major hurdles for retailers -- particularly those with large numbers of SKUs -- is creating compelling, accurate product descriptions for every new product added to their assortment. When you factor in the ever-increasing number of platforms on which a product can be sold, from third-party vendors like Amazon to social selling sites to a brand's own website, populating that amount of content can be unsustainable.

One of the areas in which generative AI excels is creating compelling product copy at scale. Natural language generation (NLG) algorithms can analyze vast amounts of product data and create compelling, tailored descriptions automatically. This copy can also be adapted to each channel, fitting specific parameters and messaging towards focused audiences. For example, generative AI engines understand the word count restrictions for a particular social channel. They can focus copy to those specifications, tailored to the demographic data of the person who will encounter that message. This level of personalization at scale is astonishing.

This use of AI has the potential to help brands achieve business objectives through product discoverability and conversion by creating compelling content optimized for search.

Another area in which AI and ML excel is in the cataloging and organizing of data. Again, when brands deal with product catalogs with hundreds of thousands of SKUs spread across many channels, it is increasingly difficult to maintain consistency and clarity of information. Product, inventory, and eCommerce managers spend countless hours attempting to keep all product information straight and up-to-date, and they still make mistakes.

Brands can leverage AI to automate tasks such as product categorization, attribute extraction, and metadata tagging, ensuring accuracy and scalability in data management across all channels. This use of AI takes the guesswork and labor out of meticulous tasks and can have wide-ranging business implications. More accurate product information means a reduction in returns and improved product searchability and discoverability through intuitive data architecture.

As online shopping has evolved over the past decade, consumer expectations have shifted. Customers rarely go to company websites and browse endless product pages to discover the product they're looking for. Rather, customers expect a curated and personalized experience, regardless of the channel through which they're encountering the brand. A report from McKinsey showed that 71% of customers expect personalization from a brand, and 76% get frustrated when they don't encounter it.

Brands have been offering personalized experiences for decades, but AI and ML unlock entirely new avenues for personalization. Once again, AI enables an unprecedented level of scale and nuance in personalized customer interactions. By analyzing vast amounts of customer data, AI algorithms can connect the dots between customer order history, preferences, location and other identifying user data and create tailored product recommendations, marketing messages, shopping experiences, and more.

This focus on personalization is key for business strategy and hitting benchmarks. Personalization efforts lead to increases in conversion, higher customer engagement and satisfaction, and better brand experiences, which can lead to long-term loyalty and customer advocacy.

Search functionalities are in a constant state of evolution, and the integration of AI and ML is that next leap. AI-powered search algorithms are better able to process natural language, enabling a brand to understand user intent and context, which improves search accuracy and relevance.

What's more, AI-driven search can provide valuable insights into customer behavior and preferences, enabling brands to optimize product offerings and marketing strategies. By analyzing search patterns and user interactions, brands can identify emerging trends, optimize product placement, and tailor promotions to specific customer segments. Ultimately, this enhanced search experience improves customer engagement while driving sales growth and fostering long-term customer relationships.

At its core, the main benefit of AI and ML tools is that they're always working and never burn out. This fact is felt strongest when applied to customer support. Tools like chatbots and virtual assistants enable brands to provide instant, personalized assistance around the clock and around the world. This automation reduces wait times, improves response efficiency, and frees staff to focus on higher-level tasks.

Much like personalization engines used in sales, AI-powered customer support tools can process vast amounts of customer data to tailor responses based on a customer's order history and preferences. Also, like personalization, these tools can be deployed to radically reduce the amount of time customer support teams spend on low-level inquiries like checking order status or processing returns. Leveraging AI in support allows a brand to allocate resources in more impactful ways without sacrificing customer satisfaction.

Brands are just scratching the surface of the capabilities of AI and ML. Still, early indicators show that this technology can have a profound impact on driving business growth. Embracing AI can put brands in a position to transform operational efficiency while maintaining customer satisfaction.

WorldView Launches Referral AI to Boost Home Health and Hospice Revenue – AiThority

WorldView, a leading provider of integrated healthcare technology to the top home health and hospice EHR/EMR platforms, announced the upcoming launch of Referral AI, an enhancement to automate intake referrals using a custom AI/ML model built specifically for the healthcare industry.

Referral AI uses AI/ML (Artificial Intelligence/Machine Learning) to scan and analyze dense referral document packets in seconds, detecting false positives and negatives, using custom rules to send confirmed referrals to the EHR/EMR system.

In a recent survey by WorldView, confidence in a referral being acted upon quickly was a top-ranking factor for 85 percent of referring partners. WorldView's Referral AI was designed to help agencies win more business and eliminate manual workflows related to the overload of documents in their inbox.

Home health and hospice agencies receive many forms of electronic documents in their inbox, including referrals for new patient service. Referrals must be acted on quickly, but with documents being dozens of pages, they often sit unread or, worse, are missed or overlooked. Over time, the referral can become invalid, resulting in lost revenue for the agency and posing a risk of delayed service for patients.

Referral AI is a custom AI/ML model built specifically for the home-based care industry and trained on 22+ years of data, outperforming off-the-shelf AI/ML models for similar tasks in speed and accuracy.

Referral AI benefits home health and hospice agencies through cutting-edge features:

Why Referral AI matters:

"When we started developing our Referral AI technology, we saw first-hand how other solutions released features that inevitably created more downstream issues," said Jared Robey, SVP at WorldView. "We leveraged our extensive dataset to build and train our AI/ML model, ensuring that referrals are identified accurately and routed to an intake team for prioritization. This investment allows WorldView to continue pushing automation limits to enhance user experience and increase financial success."

WorldView's Referral AI prioritizes rapid patient care and reduces the burden on back-office staff. By drastically cutting down the time needed for the intake process, Referral AI enables care coordination to begin almost immediately. The solution provides an organized and insightful overview of the referral packet, ensuring clinicians have quick access to the patient's clinical history, reasons for care, and critical findings. This clarity allows admitting clinicians to focus on delivering high-quality care without sifting through extensive documentation.

What is the Future of AI-Driven Employee Monitoring? – InformationWeek

How much work are you getting done, and how are you performing it? Artificial intelligence is poised to answer those questions, if it isn't already. Employers such as Walmart, Starbucks, and Delta, among others, are using AI company Aware to monitor employee messages, CNBC reports.

Employers have been monitoring workers long before the explosion of AI, but this technology's use in keeping tabs on employees has sparked debate. On one side, AI as an employee monitoring tool joins the ranks of other AI use cases touted as the future of work. On the other side, critics raise questions about the potential missteps and impact on employees.

How can AI be used in employee monitoring, and are there use cases that benefit both employers and employees?

Productivity tracking is at the forefront of the AI and employee monitoring conversation. Are employees working when they are on the clock? Answering this question is particularly top-of-mind for employers with people on remote or hybrid schedules.

"A lot of workers are doing something called productivity theater to show that they're working when they might not be," says Sue Cantrell, vice president of products and workforce strategy at consulting firm Deloitte Consulting.

AI can be used to sift through data to identify work patterns and measure employee performance against productivity metrics. "Fundamentally, the sector is about analytics and being able to process more data and understand patterns more quickly and make intelligent recommendations," Elizabeth Harz tells InformationWeek. She is CEO of Veriato, an insider risk management and employee monitoring company.

Veriato's customers most often use its AI-powered platform for insider risk management and user activity monitoring, according to Harz.

Insider risk is a significant cybersecurity concern. The Cost of Insider Risks Global Report 2023, conducted by the Ponemon Institute, found that 75% of incidents are caused by non-malicious insiders. "We believe using AI to help teams get more predictive instead of reactive in cyber is critical," Harz explains.

Using AI to monitor workers can be about their own safety, as well as that of the company. "People have different dynamics than they did when they went to the office Monday through Friday. It doesn't mean sexual harassment has gone away. It doesn't mean hostile work environments have gone away. It doesn't mean that things that happened previously have just stopped, but we need new tools to evaluate those things," says Harz.

AI also offers employers the opportunity to engage employees on performance quality and improvement. "If you're able to align the information you're getting on how a particular employee is executing the work relative to what you consider to be best practices, you can use that to create personalized coaching tools that employees ultimately do find beneficial or helpful," Stephanie Bell, chief programs and insights officer at the nonprofit coalition Partnership on AI, tells InformationWeek.

AI-driven employee monitoring has plenty of tantalizing benefits for employers. It can tap into the massive quantities of data employers are gathering on their workforce and identify patterns in productivity, performance, and safety. "GenAI really allows for language and sentiment analysis in a way that just really wasn't possible prior to LLMs," says Harz.

Measuring productivity seems like a rock-solid employer use case for AI, but productivity isn't always black and white. "Yes, it's easy to collect data on whether or not workers are online or not," Cantrell points out. "But is that really measuring outcomes or collecting data that can really help improve organizational value or benefits? That's open to question."

A more nuanced approach to measuring performance could be beneficial to both employer and employee. And enterprises are acknowledging the opportunities in moving away from traditional productivity metrics, like hours worked. Research from Deloitte Insights found that 74% of respondents recognize the importance of finding better ways to measure employee performance and value compared to traditional productivity metrics.

AI monitoring potentially has more benefits when it is used in a coaching capacity. "Where we see the real value is around using AI as a coach. When [it] monitors workers, for example, on their work calls and then [provides] coaching in the background or [uses] AI to infer skills from their daily work to suggest improvements for growth or development, or you're using AI to monitor people's movements on a factory floor to [make] suggestions for well-being," says Cantrell.

This kind of coaching tool is less about if an employee is moving their mouse or keeping their webcam on and more about how they are performing their work and ways they could improve.

AI monitoring tools also can be used to make workplaces safer for people. If integrated into video monitoring, for example, it can be used to identify unsafe workplace behaviors. Employers can follow up to make the necessary changes to protect the people working for them.

But like its many other applications, AI-driven employee monitoring requires careful consideration to actually realize its potential benefits. What data is being gathered? How is it being used? Does the use of AI have a business case? "You should have a very clear business rationale for collecting data. Don't just do it because you can," Cantrell cautions.

Realizing the positive outcomes of any technology requires an understanding of its potential pitfalls. Employee monitoring, for one, can have a negative impact on employees. Nearly half of employees (45%) who are monitored using technology report that their workplaces negatively affect their mental health, according to an American Psychological Association (APA) survey.

The perception of being watched at work can decrease people's trust in their employer. "They feel like their employer is spying on them, and it can have punitive consequences," says Cantrell.

Employee monitoring can also have a physical impact on workers. In warehouse settings, workers can be expected to hit high productivity targets in fast-paced, repetitive positions. Amazon, for example, is frequently scrutinized for its worker injury rate. In 2021, employees at Amazon's facilities experienced 34,000 serious injuries, an injury rate more than double that of non-Amazon warehouses, according to a study from the Strategic Organizing Center, a coalition of labor unions.

Amazon has faced fines for its worker injuries from agencies like the Occupational Safety and Health Administration (OSHA) and Washington's Department of Labor and Industries. The musculoskeletal injuries in these citations have been linked to the surveillance-fueled pace of work in Amazon warehouses by reports from the National Employment Law Project and the Strategic Organizing Center, Gabrielle Rejouis, a distinguished fellow with the Georgetown Law Center on Privacy & Technology and a senior fellow with the Workers' Rights Institute at Georgetown Law, tells InformationWeek in an email interview.

While AI may fuel workplace surveillance systems, it does not bear the sole responsibility for outcomes like this. "It's not like the AI is arbitrarily setting standards," says Bell. "These are managerial decisions that are being made by company leaders, by managers to push employees to this rate. They're using the technology to enable that decision-making."

People are an important part of the equation when looking at how AI employee monitoring is used, particularly if that technology is making suggestions that impact people's jobs.

AI tools could analyze conversations at a call center, monitoring things like emotional tone. Will AI recognize subtleties that a human easily could? Bell offers the hypothetical of a call center employee adopting a comforting tone and spending a little extra time on the phone with a customer who is closing down an account after the death of a spouse. The call is longer, and the emotional tone is not upbeat.

"That's the case where you want that person to take the extra time, and you want that person to match the emotional tone of the person on the other end of the line, not to maintain across-the-board standards," she says.

An AI monitoring system could flag that employee for failing to have an upbeat tone. Is there a person in the loop to recognize that the employee made the right choice, or will the employee be penalized?

Employee monitoring bolstered by AI capabilities also has the potential to impact the way employees interact with one another. "When you have this generalized surveillance, it really chills employee activity in speech, and in the research that I've done that turns up in making it harder for employees to build trusting relationships with each other," Bell shares.

Employers could potentially use AI monitoring tools to quell workers' ability to exercise their rights. "One of the most concerning ways that electronic worker surveillance and automated management benefit employers is that it can obscure management union busting," says Rejouis. "If surveillance can find any and every mistake a worker makes, employers can use this to provide a non-union justification for firing an organizer."

The regulatory landscape for AI's use in the workplace is still forming, but that doesn't mean employers are completely free of legal concerns when implementing AI employment monitoring tools.

Employee privacy is a paramount concern. "We need to make sure that we're complying with a variety of privacy laws," Domenique Camacho Moran, a partner and leader of the employment law practice at New York law firm Farrell Fritz, tells InformationWeek.

Workers generate the data used by monitoring tools. How is their data, much of it personal, being collected? Does that collection happen only in the workplace on work devices? Does it happen on personal devices? How is that data protected?

The Federal Trade Commission is paying attention to how AI management tools are impacting privacy. "As worker surveillance and AI management tools continue to permeate the workplace, the commission has made clear that it will protect Americans from potential harms stemming from these technologies," Benjamin Wiseman, associate director of the division of privacy and identity protection at the FTC, said in remarks at a Harvard Law School event in February.

With the possibility of legal and regulatory scrutiny, what kind of policies should enterprises be considering?

"Be clear with workers about how the data is being used. Who's going to see it?" says Cantrell. "Involve workers in co-creating data privacy policies to elevate trust."

Bias in AI systems is an ongoing concern, and one that could have legal ramifications for enterprises using this technology in employee monitoring tools. The use of AI in hiring practices, and the potential for bias, is already the focus of legislation. New York, for example, passed a law regarding the use of AI and automated tools in the hiring process in an attempt to combat bias. Thus far, compliance with the law has been low, according to the Society for Human Resource Management (SHRM). But that does not erase the fact that bias in AI systems exists.

"How do we make sure that AI monitoring is non-discriminatory? We know that was the issue with respect to AI being used to filter and sort resumes in an application process. I worry that the same issues are present in the workplace," says Camacho Moran.

Any link between the use of AI and discrimination opens enterprises to legal risk.

Employee monitoring faces pushback on a number of fronts already. The International Brotherhood of Teamsters, the union representing employees of UPS, fought for a ban on driver-facing cameras in the UPS contract, the Washington Post reports.

The federal government is also investigating the use of employee monitoring. In 2022, the National Labor Relations Board (NLRB) released a memo on surveillance and automated management practices.

"[NLRB] General Counsel Jennifer Abruzzo announced her intention to protect employees, to the greatest extent possible, from intrusive or abusive electronic monitoring and automated management practices through vigorously enforcing current law and by urging the board to apply settled labor-law principles in a new framework," according to an NLRB press release.

While conversation about new regulatory and legal frameworks is percolating, it could be quite some time before they come to fruition. "I don't think we understand enough about what it will be used for for us to have a clear path towards regulation," says Camacho Moran.

Whether it is union pushback, legal action by individual employees, or regulation, challenges to the use of AI in employee monitoring are probable. "It's hard to figure out who's going to step in to say enough is enough, or if anybody will," says Camacho Moran.

That means enterprises looking to mitigate risk will need to focus on existing laws and regulations for the time being. "Start with the law. We know you can't use AI to discriminate. You can't use AI to harass. It's not appropriate to use AI to write your stories. And, so we start with the law that we know," says Camacho Moran.

Employers can tackle this issue by developing internal taskforces to understand the potential business cases for the use of AI in employee monitoring and to create organization-wide policies that align with current regulatory and legal frameworks.

"This is going to be a part of every workplace in the next several years. And so, for most employers, the biggest challenge is if you're not going ahead and looking at this issue, the people in your organization are," says Camacho Moran. "Delay is likely to result in inconsistent usage among your team. And that's where I think the legal risk is."

What exactly is the future of work? You can argue that the proliferation of AI is that future, but the technology is evolving so quickly, and so many of its use cases are still nascent, that it is hard to say what exactly that future will look like years or decades down the road.

If AI-driven employee monitoring is going to be a part of every workplace, what does responsible use look like?

The answer lies in creating a dialogue between employers and employees, according to Bell. "Are employers looking for use cases that employees themselves would seek out?" she asks.

For example, a 2023 Gartner survey found that 34% of digital workers would be open to monitoring if it meant getting training classes and/or career development paths, and 33% were open to monitoring if it would help them access information to do their jobs.

"A big part of this is just recognizing the subject matter expertise of the folks who are doing the work themselves and where they could use support," Bell continues. "Because ultimately that's going to be a contribution back to business outcomes for the employer."

Transparency and privacy are also important facets of responsible use. Do employees know when and how their data is being collected and analyzed by AI?

Consent is an important, if tricky, element of employee monitoring. Can workers opt out of this type of monitoring? If opting out is an option, can employees do so without the threat of losing their jobs?

"Most workplaces are at-will workplaces, which means an employer does not need justification for firing an employee," Rejouis points out. "This makes it harder for employees to meaningfully refuse certain changes to the workplace out of fear of losing their jobs."

When allowing employees to opt out isn't possible, say for video monitoring safety on a manufacturing floor, there are still ways to protect workers' privacy. "Data can be anonymized and aggregated so that we're protecting people's privacy," says Cantrell.

While AI can be implemented as a powerful monitoring tool, the technology itself needs regular monitoring. Is it measuring what it is supposed to be measuring? Has any bias been introduced into the system? Is it actually providing benefits for employers and employees? "We always need some human judgment involved and awareness of what the potential downsides and bias that the AI is bringing us could be," says Cantrell.

Most enterprises are not building their own AI systems for worker monitoring. They are working with third-party vendors that offer employee monitoring tools powered by AI. Part of responsible use is understanding how those vendors are managing the potential risk and downsides of their AI systems.

"The way we're approaching it at Veriato is, just as you can imagine, being extremely thoughtful about what features we release in the wild and what we keep with beta customers and customer advisory panels that just really test and run things for a longer period of time than we would with some other releases to make sure that we have positive experiences for our partners," Harz shares.

Any innovation boom, AI or otherwise, comes with a period of trial and error. Enterprise leadership teams are going to find out what does and does not work.

Bell emphasizes the importance of keeping employees involved in the process. While many organizations are rushing to implement the buzziest tools, they could benefit from slowing down and identifying use cases first. "Start with the problem statement rather than the tool," she says. "Starting with the problem statement, I think, is almost always going to be the fastest way to identify where something is going to deliver anyone value and be embraced by the employees who would be using it."

Cantrell considers the use of AI in employee monitoring "a goldmine or a landmine." "It can bring dramatic benefits for both workers and organizations if done right. But if not done right and not done responsibly, workforce trust can really diminish and it can be a what I call a landmine," she says.

IT Pros Love, Fear, and Revere AI: The 2024 State of AI Report – InformationWeek

Is AI a boon or a bane to job security? A security tool or a vulnerability? Mature enterprise technology or immature toy? Essential enterprise technology or threat to humanity?

According to survey respondents from InformationWeek's latest State of AI Report, it's all of the above.

More than a year after generative artificial intelligence became widely available to the public, we polled 292 people directly involved with AI at their organizations.

Unsurprisingly, results reveal that adoption of AI is widespread, and businesses are using the technology for a wide range of different tasks, with 85% of respondents describing their organization's approach to AI as "pioneering" or "curious but cautious."

But expectations about this novel technology are also quite different from reality. So far, AI hasn't significantly affected headcount, and respondents overwhelmingly feel their own jobs are safe from its reach.

On the other hand, concerns around data security, hallucinations, and the reliability of outcomes are weighing on respondents' minds. 53% say that, if unchecked, artificial intelligence poses a threat to humanity.

Download this free report to learn how IT departments are investing in AI now and what's guiding their plans for the future.

The war for AI talent is heating up – The Economist

Pity OpenAIs HR department. Since the start of the year the maker of ChatGPT, the hit artificial-intelligence (AI) chatbot, has lost about a dozen top researchers. The biggest name was Ilya Sutskever, a co-founder responsible for many of the startups big breakthroughs, who announced his resignation on May 14th. He did not give a reason, though many suspect that it is linked to his attempt to oust Sam Altman (pictured), the firms boss, last December. Whatever the motivation, the exodus is not unusual at OpenAI. According to one estimate, of the 100-odd AI experts the firm has hired since 2016, about half have left.

That reflects not Mr Altman's leadership but a broader trend in the technology industry, one that OpenAI itself precipitated. Since the launch of ChatGPT in November 2022, the market for AI labour has been transformed. Zeki Research, a market-intelligence firm, reckons that around 20,000 companies in the West are hiring AI experts. Rapid advances in machine learning and the potential for a "platform shift" (tech-speak for the creation of an all-new layer of technology) have changed the types of skills employers are demanding and the places where those who possess them are going. The result is a market where AI talent, previously hoarded at tech giants, is becoming more distributed.

View post:
The war for AI talent is heating up - The Economist

AI Stethoscope Demonstrates ‘The Power as Well as the Risk’ of Emerging Technology – The Good Men Project

By Michael Leedom

The modest stethoscope has joined the Artificial Intelligence (AI) revolution, tapping into the power of machine learning to help health-care providers screen for diseases of the heart and lung.

This year, NCH Healthcare in Naples, Fla., became the first health-care system in the U.S. to incorporate AI into its primary care clinics to screen for heart disease. The health technology company Eko Health supplied primary care physicians with digital stethoscopes linked to a deep-learning algorithm. Following a 90-day pilot program involving more than 1,000 patients with no known heart problems, the physicians discovered 136 had murmurs suggestive of structural heart disease.

"Leveraging this technology to uncover heart valve disease that might otherwise have gone undetected is exciting," says Bryan Murphey, President of the NCH Medical Group, which signed an annual agreement in January with Eko to use stethoscopes with the AI platform. "The numbers made sense to help our patients in a non-invasive way in the primary care setting," says Murphey.

Eko's AI tool, the SENSORA Cardiac Disease Detection Platform, enables stethoscopes to identify atrial fibrillation and heart murmurs. The platform added another algorithm, cleared by the U.S. Food and Drug Administration (FDA) in April, for the detection of heart failure using the Eko stethoscope's built-in electrocardiogram (ECG) feature.

AI-enhanced stethoscopes showed more than a twofold improvement over humans in identifying audible valvular heart disease, according to a study published in Circulation in November 2023. The AI showed a 94.1 per cent sensitivity for the detection of valve disease, outperforming the primary care physicians' 41.2 per cent. The findings were confirmed with an echocardiogram of each patient.
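
For readers unfamiliar with the metric, sensitivity is the share of patients who truly have a condition that a test correctly flags. The short Python sketch below shows the arithmetic. The patient counts in it are hypothetical, chosen only because they reproduce the reported percentages; the study's actual counts are not given here, and the function name is our own.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of patients with confirmed disease that the test flagged."""
    total_with_disease = true_positives + false_negatives
    if total_with_disease == 0:
        raise ValueError("need at least one patient with confirmed disease")
    return true_positives / total_with_disease


# Hypothetical counts (NOT from the study): 17 patients with
# echocardiogram-confirmed valve disease, of whom the AI flags 16
# and the physicians flag 7.
ai = sensitivity(true_positives=16, false_negatives=1)
physicians = sensitivity(true_positives=7, false_negatives=10)
ratio = ai / physicians

print(f"AI: {ai:.1%}, physicians: {physicians:.1%}, ratio: {ratio:.2f}x")
```

Dividing the two reported figures (94.1 by 41.2) gives a ratio of roughly 2.3, which is where "more than a twofold improvement" comes from. Sensitivity alone says nothing about false alarms, which a separate measure, specificity, would capture.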

Stethoscopes join the growing number of AI health-care applications that promise increased efficiency and improved diagnostic performance with machine learning. In recent years, the FDA has cleared hundreds of AI algorithms for use in medical practice. But as the health-care field employs AI for more services, skeptics point to risks posed by over-reliance on this "black box," including the potential biases built into AI datasets and the gradual loss of clinician skills.

Since its adoption more than 200 years ago, the stethoscope has served as both a routine exam tool and a visible reminder of the doctor's training. It is recognizable worldwide and, for most clinicians, has remained an analog instrument. The first electronic stethoscopes were created more than 20 years ago and feature enhancements to amplify sound and allow for digital recording.

Analog and digital stethoscopes both rely on the ability of the health-care provider to hear and interpret the sounds, which may be the first indication a patient may have a new disease. However, this is not a skill every health-care practitioner masters. The faint, low-pitched whooshing of an incompetent heart valve or the subtle crackling of interstitial lung disease may go unnoticed even by the ears of experienced physicians.

Enter AI, which can mimic the human brain using neural networks consisting of algorithms that, in the case of stethoscopes, are trained with thousands of heart or lung recordings. Instead of relying on explicit program instructions, an AI system uses machine learning to train itself through advanced pattern recognition.

The effectiveness of artificial neural networks to diagnose cardiovascular disease has been demonstrated in controlled clinical trials.

AI improved the diagnosis of heart failure by analyzing ECGs performed on more than 20,000 adult patients in a randomized controlled trial published in Nature Medicine. The intervention group was more likely to be sent for a confirmatory echocardiogram, resulting in 148 new diagnoses of left ventricular systolic dysfunction.

A neural network algorithm correctly predicted 355 more patients who developed cardiovascular disease compared to traditional clinical prediction based on American College of Cardiology guidelines, according to a cohort study of nearly 25,000 incidents of cardiovascular disease.

"These machines are very good at finding patterns that are even beyond human perception. But there's both the power as well as the risk," says Paul Yi, Director of the University of Maryland Medical Intelligent Imaging Center.

The risks include limitations in performance if AI models are not properly trained. The accuracy of the AI algorithm depends on the collection of sufficient data that is representative of the population at large.

"The generalizability is a big issue," says Gaurav Choudhary, Director of Cardiovascular Research at Brown University. "These AI models require a large amount of data, and these data are not easy to come by." Choudhary notes that once an algorithm is approved by the FDA, it cannot be simply revised as new recordings become available. Changes to a particular AI algorithm require a new submission to the FDA before use.

In January 2024, the World Health Organization published new guidelines for health-care policies and practices for AI applications. Its authors warned of several risks inherent in the use of AI tools, including the existence of bias in datasets, the transparency of the algorithms employed and the erosion of medical provider skills.

AI algorithms that interpret heart and lung recordings may not have been trained on the full spectrum of possible sounds if the data does not include a wide range of patients and ambient noises.

"This technology has to be validated across a variety of murmurs in a variety of clinical environments and situations," says Andrew Choi, Professor of Medicine and Radiology at George Washington University. "Many of our patients are not the ideal patients," he adds, noting that initial validation typically involves patients with clear heart sounds. In real-world practice, there will be older patients, obese patients and noisy emergency departments that may compromise the precision of the AI model.

Another complication is the inscrutable nature of the algorithm. Without a clear understanding of how these systems make decisions, it may be difficult for health-care providers to discuss a management plan with patients, particularly if the AI output appears incompatible with other clinical information during the examination.

"Explainability is sort of a holy grail," says Paul Friedman, Chair of the Department of Cardiovascular Medicine at Mayo Clinic and one of the developers of the AI tech that Eko Health uses. Over time, he says, more studies may elucidate how these systems process information. AI uncertainty is similar to our incomplete understanding of how certain medications actually work, he suggests. Both are used because they are consistently effective.

"I'm not dismissive of the importance of trying to crack the black box, but I think that's a subject for research," he says.

The introduction of AI in the exam room could enhance diagnostic performance while disrupting the relationship between health-care provider and patient. The provider may become complacent and gradually dependent on AI for answers to clinical questions, while the patient may feel that the care is becoming depersonalized and lose confidence in the doctor.

The subconscious transfer of decision-making to an automated system is called automation bias, one of many cognitive biases the health-care provider must confront. There are many reasons providers may forgo medical training and uncritically accept the heuristics of AI, including inexperience, complex workloads and time constraints, according to a systematic review of the phenomenon.

It is still unclear how AI will ultimately influence the physician-patient interaction, says Yi. "I think that's kind of the last mile of AI in medicine. It's this human-computer interaction piece where we know that this AI works well in the lab, but how does it work when it interacts with humans? Does it make them second guess what they're doing? Or does it give them false confidence?"

The number of AI-enhanced devices submitted to the FDA has soared since 2015, with almost 700 AI medical algorithms cleared for market. Most applications are for radiology. AI is already being integrated into academic medical centres across North America for a variety of tasks, including diagnosing disease, projecting length of hospitalization, monitoring wearable devices and performing robotic surgery.

At Unity Health in Toronto, more than 50 AI-based innovations have been developed to improve patient care since 2017. One of these is a tool used at St. Michael's Hospital since 2020 called CHARTWatch, which sifts electronic health records, including recent test results and vital signs, to predict which patients are at risk of clinical deterioration. The algorithm proved to be lifesaving during the COVID pandemic, leading to a 26 per cent drop in unanticipated mortality.

"I think AI is really going to transform health care," says Omer Awan, Professor of Radiology at the University of Maryland School of Medicine. He is not concerned that AI will take over physician jobs, instead predicting that AI will continue to improve efficiency and help reduce physician burnout.

Research continues on how best to incorporate AI into the primary care setting, including ethical issues such as data privacy, legal liability and informed consent. The adoption of AI may infringe on patient autonomy if medical decisions are made using algorithms without regard for patient preferences, according to a literature review.

Murphey says he is eager to see Eko Health's AI-paired stethoscopes improve the screening for early heart disease but remains cautious about too much use of technology.

"I want to stay connected to the patient. I take pride in my patient examinations," he says. "I think that's one of the important things we provide to patients in the primary care setting, and I'm not looking to sever that part of the relationship."

This post was previously published on HEALTHYDEBATE.CA and is republished under a Creative Commons license.

Original post:
AI Stethoscope Demonstrates 'The Power as Well as the Risk' of Emerging Technology - The Good Men Project