The data generated by automation systems can reveal new insights - but only if it can be accessed efficiently. Emerson's Silvia Gonzalez identifies the main reasons why automated systems produce data silos and explains how an edge-to-cloud Industrial Internet of Things (IIoT) solution simplifies access to this valuable information.
Automated systems generate prodigious amounts of data that can be used to improve business results. However, this valuable data must be accessed, managed and analyzed effectively, and for a variety of technical and business reasons it often remains inaccessible. New architectures are changing this by combining flexible, efficient edge computing with a cloud computing model. They provide a concrete, practical way to analyze data, derive new insights and make the results available to stakeholders. In this article, we look at the main reasons behind data silos and how a modern edge-to-cloud IIoT solution makes this data easy to access and leverage.
Existing infrastructures at the root of data silos
Until recently, most production data came from PLCs, HMIs, SCADA and historian systems operating within the operational technology (OT) domain. The primary objective of these systems is to provide the control and visibility needed to maximize operational efficiency and availability. Accessing and analyzing the associated data beyond immediate production objectives is therefore a secondary concern.
OT infrastructure is designed and tailored to meet operational needs, which involves choices such as: selecting proprietary communication protocols that meet performance requirements but offer little flexibility and no interoperability across vendors; limiting sensing and data collection to what is strictly needed, to maximize system reliability and simplicity; implementing localized, site-bound architectures to reduce exposure to cyber threats; and applying proprietary lockdowns to protect intellectual property and ensure reliable machine operation, often at the expense of connectivity.
The systems that result from these decisions are extremely effective at meeting their business objectives, but they suffer from data blind spots and do not benefit from analysis of all the data that could potentially be collected.
Within an OT environment, data sources appear to be open, but they are actually quite difficult to access from applications outside that environment, where the data would be easier to analyze. In addition, many potentially useful data sources, such as environmental conditions, condition monitoring information and energy consumption, are not required for production or equipment control and are therefore never collected by the automated systems. Big data analytics capabilities are becoming ever more powerful, but restricted access to siloed data continues to limit their potential.
Types of data silos
Data silos are very diverse. They originate in the machines, the plant, and other systems that are integrated with the OT or responsible for managing auxiliary equipment. The siloed data can be as specific as a single temperature reading or as extensive as a historical log showing how many times an operator has received an alarm. The most common types of siloed data include:
Isolated data This is the simplest case, but not always the easiest to solve. Consider a stand-alone temperature transmitter with 4-20 mA connectivity or a Modbus interface. It needs to be connected to an edge device (a PLC, edge controller, gateway or similar) to make its data stream accessible. Very often this data is not essential for controlling the machine and is therefore not available through the traditional PLC/SCADA data sources already in place. Retrieving it through the nearest machine's PLC can void OEM warranties, because it forces changes to the programming logic (a minimal code sketch of this kind of stand-alone polling appears after this list).
Ignored data Resources that are connected to OT systems but produce data that is never used. Many smart edge devices provide fundamental, in-depth data. A smart datalogger can provide essential electrical information such as volts, amps, kilowatts and kilowatt-hours over hard-wired or industrial communication protocols. However, deeper data sets, such as total harmonic distortion (THD), are often not transmitted because the application does not call for them, the communication link has low bandwidth, or the system has limited data storage capacity. The data is there, but it is never accessed.
Subsampled data Resources generating data that is sampled at an insufficient rate. Even when an intelligent device provides data to control systems over a communication bus, the sampling rate may be too low, the latency too high, or the data set so large that the results cannot be exploited. In some cases the data is summarized before being published, resulting in a loss of fidelity.
Inaccessible data Resources that generate data (often unprocessed, but nonetheless important, for diagnostics for example) in a format that is generally inaccessible or unavailable through traditional industrial systems. Some intelligent devices hold on-board data, such as error logs, that is not exposed via standard communication protocols but would be very useful for analyzing events that cause downtime.
Non-digitized data Data generated manually by staff on paper, notepads and whiteboards, and never captured digitally. In many companies, employees fill out test and inspection forms and other quality documents on paper, with no plan to integrate this information digitally. A more modern approach is to capture this data with digital tools and move toward a paperless factory.
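To make the isolated-data case more concrete, the following is a minimal sketch of how an edge device might poll a stand-alone Modbus TCP temperature transmitter without touching the machine's PLC. It assumes the Python pymodbus library, a hypothetical transmitter address and a vendor-specific register map; none of these details refer to a particular product.

```python
# Minimal sketch: polling a stand-alone Modbus TCP temperature transmitter
# from an edge device, independently of the machine's PLC.
# Assumptions: pymodbus 3.x, a transmitter reachable at 192.168.1.50, and a
# vendor-specific map in which holding register 0 holds temperature in
# tenths of a degree Celsius.
from pymodbus.client import ModbusTcpClient

def read_temperature(host: str = "192.168.1.50", port: int = 502) -> float:
    client = ModbusTcpClient(host, port=port)
    if not client.connect():
        raise ConnectionError(f"Could not reach transmitter at {host}:{port}")
    try:
        # Register address, count and slave id are device-specific; older
        # pymodbus versions use unit= instead of slave=.
        result = client.read_holding_registers(address=0, count=1, slave=1)
        if result.isError():
            raise IOError("Modbus read failed")
        return result.registers[0] / 10.0  # assumed scaling: 0.1 degC per count
    finally:
        client.close()

if __name__ == "__main__":
    print(f"Temperature: {read_temperature():.1f} degC")
```

Once a reading like this is available on the edge device, it can be contextualized and forwarded to higher-level systems as described in the following sections.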
Leveraging data from the edge in the cloud
Data silos are of particular interest to companies that want to analyze their operational performance within a production facility or across multiple facilities. These companies are looking for solutions that can transmit siloed data from the field to the cloud for recording, visualization, mining and in-depth analysis. This connectivity to higher-level enterprise IT systems, on the plant floor and in the cloud, is essential because it allows many types of edge data to be historized and analyzed for deeper, longer-term analytical results, far beyond the usual performance reporting tied to short-term production goals.
When an end user or OEM takes siloed data out of traditional data sources and feeds it into cloud-hosted applications and services, many opportunities open up, including remote monitoring, predictive diagnostics and root cause analysis. These capabilities allow them to plan across different machines, plants and facilities, perform long-term data analysis and like-for-like asset comparisons within and across plants, manage the machine fleet, analyze cross-domain data (deep learning), gain insight into production bottlenecks, and identify the origin of process-level anomalies, even when they are only detected later in the production process.
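As a simple illustration of the kind of long-term, cross-site analysis described above, the sketch below compares energy intensity across plants using a historized data export. The file name, column names and use of pandas are illustrative assumptions; a real IIoT platform would typically provide its own query and analytics tooling.

```python
# Illustrative sketch only: comparing historized edge data across plants.
# Assumes a CSV export with columns "plant", "timestamp", "kwh" and
# "units_produced"; actual schemas depend on the historian or cloud platform.
import pandas as pd

history = pd.read_csv("plant_history.csv", parse_dates=["timestamp"])

# Energy intensity (kWh per unit produced) by plant and by month, as a
# simple like-for-like comparison across facilities.
history["month"] = history["timestamp"].dt.to_period("M")
monthly = history.groupby(["plant", "month"]).agg(
    kwh=("kwh", "sum"), units=("units_produced", "sum")
)
monthly["kwh_per_unit"] = monthly["kwh"] / monthly["units"]

# Flag plant-months that sit well above the fleet-wide median.
threshold = 1.2 * monthly["kwh_per_unit"].median()
print(monthly[monthly["kwh_per_unit"] > threshold])
```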

Create an edge solution
The goal of IIoT initiatives is to address the challenges posed by data silos and effectively connect data at the edge with the cloud, where it can be analyzed. IIoT solutions integrate field hardware, software at the edge and in the cloud, and communication protocols, all designed to work together to move data securely and efficiently, particularly for analytics.
Edge solutions can be an integral part of the automated systems or installed in parallel to capture data those systems do not require. Many users opt for the latter approach, which lets them obtain the data they need without affecting existing production systems. Whichever approach is chosen, these new digital capabilities must be able to connect to all of the forms of siloed data identified above.
Edge connectivity solutions come in a variety of forms, including compact or larger PLCs connected to industrial PCs (IPCs), edge-enabled controllers, and IPCs running SCADA or edge software suites. Equipment deployed at the edge may need hard-wired I/O and/or industrial communication protocol capabilities to interact with all of the edge data sources. The resulting data may need to be pre-processed, or at least organized with context. Maintaining context is especially important in manufacturing environments, where hundreds or even thousands of discrete sensors monitor and drive the mechanical and physical actions of machines. Modern automation software helps preserve these relationships and their context.
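What "organizing data with context" can look like in practice is sketched below: a raw reading is wrapped in a structured record that preserves the site, line, asset, signal name, engineering unit and timestamp before it leaves the edge. The record layout and field names are illustrative assumptions; edge software suites generally handle this through their own tag or namespace models.

```python
# Minimal sketch: adding context to a raw edge reading before it is sent on.
# The record structure and field names are illustrative assumptions; edge
# software suites usually manage this through their own tag/namespace models.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class ContextualizedReading:
    site: str          # plant or facility identifier
    line: str          # production line
    asset: str         # machine or device the sensor belongs to
    signal: str        # signal name, e.g. "bearing_temperature"
    value: float       # raw or scaled value
    unit: str          # engineering unit
    timestamp: str     # ISO 8601, UTC

def contextualize(raw_value: float) -> str:
    reading = ContextualizedReading(
        site="Plant-A",
        line="Line-3",
        asset="Filler-01",
        signal="bearing_temperature",
        value=raw_value,
        unit="degC",
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(reading))

print(contextualize(61.4))
```

Carrying this context with every value is what allows cloud-side analytics to relate thousands of individual sensor readings back to specific machines and production lines.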
Finally, the data must be transmitted to higher-level systems using protocols such as MQTT or OPC UA. Today's IoT/IT standards are evolving to ensure consistency and future flexibility of data and communications. It is important that solutions remain flexible while still adhering to these standards, as opposed to custom installations that become impossible to maintain over the long term. Once an edge solution is in place and able to obtain the data, the next step is to make it available to higher-level IT systems through effective communications with cloud-hosted software.
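To give a sense of how such contextualized data might be handed off to higher-level systems, the sketch below publishes a JSON payload over MQTT using the paho-mqtt client. The broker address, topic structure and payload are assumptions for illustration; production deployments typically add TLS, authentication and a standardized namespace such as Sparkplug B on top of plain MQTT.

```python
# Minimal sketch: publishing a contextualized reading to an MQTT broker.
# Broker address, topic naming and payload are illustrative assumptions;
# real deployments add TLS, credentials and a standard namespace
# (e.g. Sparkplug B) on top of plain MQTT.
import json
import paho.mqtt.publish as publish

payload = json.dumps({
    "site": "Plant-A",
    "asset": "Filler-01",
    "signal": "bearing_temperature",
    "value": 61.4,
    "unit": "degC",
})

publish.single(
    topic="plant-a/line-3/filler-01/bearing_temperature",
    payload=payload,
    qos=1,
    hostname="broker.example.com",  # assumed broker address
    port=1883,
)
```

An equivalent hand-off could be made by exposing the same values through an OPC UA server address space; MQTT's publish/subscribe model is often favored when many edge nodes report to a central broker over constrained networks.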
Connecting the edge and the cloud
Cloud-hosted software offers a range of benefits, starting with lower costs: the user pays only for what they use and avoids investing in the purchase and management of IT infrastructure. Cloud computing is often described as an "elastic computing" environment, because additional computing or storage resources can be added in real time whenever they are needed.
The cloud also spares the user the problems of configuring IT hardware and software, as well as those related to deployment, management, performance, security and updates. Resources can instead be focused on achieving the primary business objectives.
In essence, the cloud makes it possible to process big data sets efficiently by scaling compute power to the needs of the analysis. Data can be accessed and put to use at any time, from anywhere, using any device capable of running a web browser.
Data security can be improved by using distributed storage servers along with backup and failover options. Development is also faster with platforms that are operational from day one; only an Internet connection and login credentials are required.
A cloud architecture is particularly well suited to the needs of organizations implementing IIoT data projects. The cloud is the enabling infrastructure for many IIoT projects, and the combination of these two technologies enables innovative interactions between humans, objects and machines, giving rise to new business models based on intelligent products and services.
Conclusion
Data silos are all too common in manufacturing sites and production facilities around the world. They are the unfortunate result of legacy technologies that cannot handle the data and of traditional design philosophies focused on core functionality at the expense of data connectivity. The importance and usefulness of big data analytics has only recently become apparent, and end users are now striving to build this capability into new systems and to add it to existing operations.
Edge-to-cloud data connectivity creates value because it offers a variety of ways to visualize, record, process and analyze data in depth. All IIoT solutions that bridge data between OT and IT rely on digital capabilities that can connect either to traditional automation elements, such as PLCs, or directly to data sources alongside any existing system. These edge resources must be able to pre-process data to some degree and add context before passing it to cloud systems for analysis.