Enterprise architects are comfortable using public cloud services to help quickly build applications. The (REST API + stateless microservice + database) model for cloud-native apps is a pattern that has been key to scaling the cloud (any server will do), and increased use of abstractions such as Kubernetes further simplifies the operational deployment and management of microservices. But there's a class of applications for which today's cloud-native stacks are an awkward fit.
I'm not referring to lift+shift legacy applications that are expensive to run in the cloud (and will take forever to re-write). Instead, my focus is a class of applications that are increasingly important to enterprise customers that analyze streaming data from production, products, supply chain partners, employees and more - to deliver real-time insights to help drive the business. Increasingly these apps are put into the edge computing category. In this piece I'll dissect the requirements for these apps and (hopefully) convince you that, for streaming data apps at least, the edge is not a place. Instead it's a new way to compute on streaming data.
There are many uses for advanced computing embedded in next-gen products at the edge, from cars to compressors. The engineers who develop them will use the best CPUs, ML acceleration, hypervisors, Linux and other technology to build vertically integrated solutions: better cars, compressors and drones. Is this edge computing? Sure, but not in a generic sense: these are tightly integrated solutions rather than computing systems that could be used for a broad set of applications - interesting, but only narrowly. But what about the data that these and other connected products produce? Smart devices (with a CPU and lots to say) are being connected at a rate of about 2M/hour, and there already are a billion mobile devices in our hands - so networks are increasingly flooded with streaming data. Applications that drink from this firehose are becoming common, and we need tools to help devs create them.
The term edge computing implies a generic capability that is different from cloud computing. While there are often requirements such as data volume reduction, latency or security/compliance concerns that dictate an on-prem component of an application, other than these, does edge computing have unique requirements? It does: real-time analysis of streaming data demands that we kick the REST + database habit. But there is nothing that is unique to the physical edge. This is great news because it means that edge applications can run on cloud infrastructure, or on prem. Edge computing is definitely a thing, but it's about processing streaming data from the edge, as opposed to running the application at the physical edge. Edge applications that process streaming data from real-world things have to:
These requirements for stream processing software are independent of where the solution runs. If the data sources are fixed, then it makes sense to co-locate some compute nearby, particularly to deal with data reduction. But a cloud-hosted stack is needed for any solution that processes data from mobile devices.
OK, what's different? Whereas cloud computing builds on a powerful triumvirate (stateless APIs, microservices and stateful databases), streaming applications need different infrastructure abstractions:
A key observation is that things in the real world change state concurrently and independently, and their state changes are context specific, based on the state of (other things in) their environment. It is the state changes in things that are critical for applications, and not raw data. Moreover, whereas databases are good at capturing relationships that are fixed (buses have engines), they are poor at capturing dynamically computed relationships (a truck with bad braking behavior is near an inspector). Real-world relationships between data sources are fluid, and based on computed relationships such as bad braking behavior, the application should respond differently. Finally, effects of changes are immediate, local and contextual (the inspector is notified to stop the truck). The dynamic nature of relationships suggests a graph database, and indeed a graph of related things is what is needed. But in this case, to satisfy the need to process continuously, the graph itself needs to be fluid and computation must occur in the graph.
Each web agent actively processes raw data from a single source and keeps its state in memory (as well as persisting its state in the background to protect against failures). Web agents are thus stateful, concurrent digital twins of real-world data sources that at all times mirror the relevant state of the real-world things they represent. The diagram shows web agents created as digital twins of sensors in a smart city environment.
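To make the idea of one stateful agent per data source concrete, here is a minimal Python sketch. It is not the SwimOS API: `TwinAgent`, `AgentRegistry`, the source IDs and the reading fields are all hypothetical names chosen for illustration, and background persistence is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TwinAgent:
    """Hypothetical stand-in for a web agent: one per source, state held in memory."""
    source_id: str
    state: Dict[str, float] = field(default_factory=dict)
    updates: int = 0

    def on_data(self, reading: Dict[str, float]) -> None:
        # Fold the raw reading into the agent's in-memory state.
        self.state.update(reading)
        self.updates += 1


class AgentRegistry:
    """Creates an agent the first time a source is seen, then routes data to it."""

    def __init__(self) -> None:
        self.agents: Dict[str, TwinAgent] = {}

    def ingest(self, source_id: str, reading: Dict[str, float]) -> TwinAgent:
        if source_id not in self.agents:
            self.agents[source_id] = TwinAgent(source_id)
        agent = self.agents[source_id]
        agent.on_data(reading)
        return agent


# Hypothetical smart-city sources: two readings from one sensor, one from another.
registry = AgentRegistry()
registry.ingest("sensor/loop-17", {"vehicle_count": 3})
registry.ingest("sensor/loop-17", {"vehicle_count": 5})
registry.ingest("sensor/signal-4", {"phase": 1})
```

The point of the sketch is only that no database read sits between the arrival of a reading and the agent's current state: each source has exactly one owner of its state, and that owner lives in memory.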
Web agents link to each other based on context computed from changes in the data, dynamically building a graph that reflects real-world relationships like proximity, containment, or even correlation. Linked agents see each other's state changes in real time. Agents can concurrently compute on their own state and that of agents they are linked to. They analyze, learn and predict, and continuously stream enriched insights to UIs, brokers, data lakes and enterprise applications.
The diagram shows that the sensors at an intersection link to the intersection digital twin. Intersections link to their neighbors, enabling them to see state changes in real time. The links are dynamically computed: membership/containment and proximity are used to build the graph using insights computed by the digital twins from their own data.
SwimOS benefits from in-memory, stateful, concurrent computation that yields a performance improvement of several orders of magnitude over database-centric analysis, simply because all state is in memory and web agents can compute immediately when data arrives or when a contextual state change occurs in another linked agent. Streaming implementations of key analytical, learning and prediction algorithms are included in SwimOS, but it is also easy to interface to existing platforms such as Spark.
SwimOS moves analysis, learning and prediction into the dynamically constructed graph of web agents. An application is thus a dynamically constructed graph built from data. This dramatically simplifies application creation: the developer simply describes the objects and their inputs, and the calculations that they use to determine how to link. When data arrives, SwimOS creates a web agent for each data source, each of which independently and concurrently computes as data flows over it, and in turn streams its insights, such as predictions or analytical results. Each is responsible for consuming raw data from its real-world sibling, and links dynamically to other agents based on computed relationships. As data flows over the graph, each web agent computes insights using its own state and that of other agents to which it is linked.
You can see SwimOS in action in an application that consumes over 4TB per day of data from the traffic infrastructure in Palo Alto, CA, to predict the future state of each intersection (click on the blue dots to see the predicted phases, and the colors for down-timers of each light). This application runs in many cities in the USA, and delivers predictions, on a per-intersection basis, via an Azure-hosted API to customers that need to route their vehicles through each city. The source code for the application is part of the SwimOS GitHub repo. Starting with SwimOS is easy: the site is complete with docs and tutorials.
In SwimOS the graph is a living, dynamically changing structure. In the earlier example (a truck with bad braking behavior is near an inspector), the relationship is dynamically computed and ephemeral; the link between the inspector and the truck is thus also ephemeral, built when the truck enters a geo-fence around the inspector, and broken when it leaves.
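The truck and inspector example can be sketched in a few lines of Python. This is an illustration of an ephemeral, computed link, not SwimOS code: the `InspectorTwin` class, the flat (x, y) coordinates in metres, the 500 m radius and the rule that a link requires both proximity and bad braking are all assumptions made for the sketch.

```python
import math
from dataclasses import dataclass, field
from typing import Set, Tuple

Point = Tuple[float, float]  # (x, y) in metres on a flat local grid (a simplification)


@dataclass
class InspectorTwin:
    """Hypothetical twin of an inspector that links to nearby badly braking trucks."""
    position: Point
    fence_radius_m: float
    linked_trucks: Set[str] = field(default_factory=set)

    def on_truck_update(self, truck_id: str, position: Point, bad_braking: bool) -> None:
        # The link is computed from current state, so it is rebuilt on every update.
        inside_fence = math.dist(self.position, position) <= self.fence_radius_m
        if inside_fence and bad_braking:
            self.linked_trucks.add(truck_id)      # link built on entering the fence
        else:
            self.linked_trucks.discard(truck_id)  # link broken on leaving (or recovery)


inspector = InspectorTwin(position=(0.0, 0.0), fence_radius_m=500.0)

# Truck enters the geo-fence while braking badly: the link appears.
inspector.on_truck_update("truck-42", (100.0, 100.0), bad_braking=True)
linked_while_inside = set(inspector.linked_trucks)

# Truck drives out of the geo-fence: the link disappears.
inspector.on_truck_update("truck-42", (900.0, 0.0), bad_braking=True)
linked_after_leaving = set(inspector.linked_trucks)
```

Nothing here is stored as a fixed relationship. The link exists only for as long as the computed condition holds, which is the property a schema of fixed relationships struggles to express.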
An application in SwimOS is a dynamic graph of linked web agents that continuously and concurrently compute as data flows over the graph. As each web agent modifies its own state, that state is immediately visible to linked web agents in the graph. This is achieved through streaming: a link is in effect a streaming API, and takes the form of a URI just like a REST API call. SwimOS uses a protocol called WARP, which runs over WebSockets, to synchronize state and deliver streamed insights. The key difference is this: each web agent transforms its own raw data into state changes, and those state changes are streamed to linked web agents in the graph. They in turn compute based on their own states, and the states of agents to which they are linked.
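As a rough illustration of a link behaving like a streaming API, the Python sketch below pushes state changes from one agent to the agents linked to it, and each recipient recomputes from its own state plus its neighbors' state. It runs in a single process and does not model WARP or WebSockets; the `Agent` class, the URIs, the `queue` field and the `load` insight are hypothetical.

```python
from typing import Dict, List


class Agent:
    """Hypothetical agent addressed by a URI; state changes are pushed along links."""

    def __init__(self, uri: str) -> None:
        self.uri = uri
        self.state: Dict[str, int] = {}
        self.neighbor_state: Dict[str, Dict[str, int]] = {}
        self.load = 0  # derived insight: own queue plus linked neighbors' queues
        self._subscribers: List["Agent"] = []

    def link_to(self, other: "Agent") -> None:
        # Subscribe this agent to the other agent's state changes.
        other._subscribers.append(self)
        self.neighbor_state[other.uri] = dict(other.state)
        self.compute()

    def set_state(self, key: str, value: int) -> None:
        if self.state.get(key) == value:
            return  # not a state change, so nothing is streamed
        self.state[key] = value
        self.compute()
        for subscriber in self._subscribers:
            subscriber.on_neighbor_change(self.uri, key, value)

    def on_neighbor_change(self, uri: str, key: str, value: int) -> None:
        self.neighbor_state.setdefault(uri, {})[key] = value
        self.compute()

    def compute(self) -> None:
        neighbors = sum(s.get("queue", 0) for s in self.neighbor_state.values())
        self.load = self.state.get("queue", 0) + neighbors


a = Agent("/intersection/a")
b = Agent("/intersection/b")
b.link_to(a)              # b now sees a's state changes

a.set_state("queue", 4)   # streamed to b, which recomputes
b.set_state("queue", 2)   # b recomputes from its own state and a's
```

Note the contrast with a REST call: `b` never polls `a`. It names `a` once when it links, and from then on receives only the changes.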
A SwimOS application is built from the data, given a simple object definition for each source. A single app is easy to build. But there is another benefit in this approach: if an application is re-used in multiple sites, no changes are required. For example, data in a smart city application is used to build an application that predicts future traffic behavior in the city. But note that there is no specific model required for each city. Instead, only the local context for each intersection is used, by the digital twin of the intersection itself, to learn and predict. In each city, the app for the city is built from the data, without needing to change a line of code.
A word about transformations: in traditional analytical stacks (for example, using Spark) the data is transported to the application, which is forced to transform data to state, for each thing, and save it to a database, before the analytical app can operate on the state of the real-world sources. SwimOS transforms raw data to the relevant state of the source in each web agent. The data-to-state transformation is done by each digital twin, and the graph of web agents is a mirror of the states of the real-world sources. Then, each digital twin can compute locally, at high resolution, using its own state and the states of linked web agents. This proceeds concurrently, at CPU and memory speed, without needing to access a database. The performance increase is huge, and the reductions of infrastructure required are commensurate. We frequently find that a SwimOS application uses less than 5% of the infrastructure of a database-centric implementation of the same application.
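A data-to-state transformation inside a single twin might look something like the Python sketch below. It is an assumption-laden illustration rather than anything taken from SwimOS: the `TruckTwin` class, the three-sample rolling window and the 6.0 m/s² threshold for "bad" braking are invented for the example.

```python
from collections import deque
from statistics import mean
from typing import Deque, Optional


class TruckTwin:
    """Hypothetical twin that turns raw deceleration samples into a braking state."""

    def __init__(self, window: int = 3, threshold: float = 6.0) -> None:
        self.samples: Deque[float] = deque(maxlen=window)  # rolling window, in memory
        self.threshold = threshold                         # assumed cut-off in m/s^2
        self.braking = "ok"

    def on_sample(self, decel_mps2: float) -> Optional[str]:
        """Consume one raw sample; return the new state only if it changed."""
        self.samples.append(decel_mps2)
        new_state = "bad" if mean(self.samples) > self.threshold else "ok"
        if new_state == self.braking:
            return None  # raw data arrived, but the state did not change
        self.braking = new_state
        return new_state


twin = TruckTwin()
raw = [2.0, 3.0, 9.0, 9.0, 9.0, 2.0, 1.0]
changes = [twin.on_sample(x) for x in raw]
# changes == [None, None, None, "bad", None, None, "ok"]
```

Seven raw samples go in and two state changes come out. Only those two would need to be streamed to linked agents, which is where much of the reduction in data volume and infrastructure comes from.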
Finally, SwimOS digital twins are real-time streaming mirrors of real-world things in the edge environment. They compute and stream insights on the fly. Visualization tools are vital for such applications. SwimOS includes a set of JavaScript and TypeScript bindings that enable developers to quickly develop browser-based UIs that update in real time, simply because they subscribe to the streamed insights computed by web agents.
In summary: edge computing is definitely a thing, but the computing need not occur at the edge. Instead, what is needed is an ability to compute (anywhere) on streaming data from large numbers of dynamically changing devices in the edge environment. This in turn demands an architectural pattern for stateful, distributed computing. SwimOS is an example of a stateful, real-time platform for applications that process real-time streaming data.
Simon Crosby is the CTO of SWIM.AI. Previously, Simon was a co-founder and CTO of Bromium, a security technology company. At Bromium, Simon built a highly secure virtualized system to protect applications. Prior to Bromium, Crosby was the co-founder and CTO of XenSource before its acquisition by Citrix, and later served as the CTO of the Virtualization and Management Division at Citrix. Previously, Crosby was a principal engineer at Intel. Crosby was also the founder of CPlane, a network-optimization software vendor. Simon has been a tenured faculty member at the University of Cambridge. Simon Crosby was named as one of InfoWorld's Top 25 CTOs.
Read the rest here:
Is Edge Computing a Thing? - InfoQ.com