Industrial Data: From Generation to Storage in Data Lakes

In industrial settings, most data starts out unstructured. Converting it into a structured format and storing it in a data lake matters for two reasons. First, structured data is easier to analyze and works with a wide range of analysis tools, which enables applications such as process optimization, quality improvement, and predictive maintenance. Second, a data lake centralizes data management, which improves efficiency and reduces the cost of building and maintaining complex data pipelines.

Data Generation

Equipment and systems in industrial environments generate massive amounts of data. The main sources are:

  • Control PCs: Responsible for process control and monitoring.
  • PLC (Programmable Logic Controller): Communicates with field devices like sensors and actuators to collect data and control processes.

Data Collection

The generated data is centralized for efficient management:

  • CIM (Computer Integrated Manufacturing) PC: Collects and integrates data from Control PCs and PLCs to enhance process efficiency.
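
To make the collection step concrete, here is a minimal sketch of merging records from several devices into one time-ordered stream. The record fields (`ts`, `src`, `value`) and the `collect` function are hypothetical names chosen for illustration, and the sketch assumes each device already delivers its records sorted by timestamp.

```python
import heapq


def collect(*sources):
    """Merge per-device record streams into one list ordered by timestamp.

    Assumption: every source is already sorted by its 'ts' field,
    which is what heapq.merge requires to produce a sorted result.
    """
    return list(heapq.merge(*sources, key=lambda rec: rec["ts"]))


# Hypothetical records from a Control PC and a PLC.
control_pc = [
    {"ts": 1, "src": "control_pc", "value": 0.91},
    {"ts": 4, "src": "control_pc", "value": 0.93},
]
plc = [
    {"ts": 2, "src": "plc", "value": 71.0},
    {"ts": 3, "src": "plc", "value": 72.4},
]

merged = collect(control_pc, plc)
```

A real CIM PC would typically receive this data over industrial protocols rather than from in-memory lists; the sketch only shows the integration idea.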

Control Systems, Data Collection, FDC System

Data passes through several systems for processing and analysis:

  • Control Systems: Monitor process states and alarms, and manage work instructions and scheduling.
  • Data Collection System: Gathers data from various sources.
  • FDC (Fault Detection and Classification) System: Analyzes data to detect and classify anomalies.
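
As a rough illustration of what "detect and classify" can mean, the sketch below checks readings against fixed control limits and labels the ones that fall outside. This is a deliberately simple stand-in, not how any particular FDC product works; the `Reading` type, the limit values, and the labels are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor_id: str  # hypothetical sensor identifier
    value: float    # measured value in the sensor's own unit


def classify(reading, lower, upper):
    """Label a reading 'low', 'high', or 'normal' against fixed limits."""
    if reading.value < lower:
        return "low"
    if reading.value > upper:
        return "high"
    return "normal"


def detect_faults(readings, lower, upper):
    """Return (sensor_id, label) for every reading outside the limits."""
    faults = []
    for reading in readings:
        label = classify(reading, lower, upper)
        if label != "normal":
            faults.append((reading.sensor_id, label))
    return faults
```

For example, with limits of 20.0 and 90.0, a reading of 96.5 would be labeled "high" and a reading of 12.0 would be labeled "low". Production systems usually rely on statistical or model-based methods rather than fixed thresholds.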

ETL Process

Finally, data is stored in the data lake through the ETL (Extract, Transform, Load) process:

  • Extraction: Data is extracted from the FDC system.
  • Transformation: Data is cleaned, standardized, and converted to the target format where necessary.
  • Loading: The transformed data is stored in the data lake.
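
The three steps above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the "FDC system" is just an in-memory list of dictionaries, the "data lake" is a local directory, and the field names (`sensor_id`, `value`, `label`) and the output file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def extract(fdc_records):
    """Extraction: read records from the (here, in-memory) FDC source."""
    return list(fdc_records)


def transform(records):
    """Transformation: drop records without a value, standardize the rest."""
    cleaned = []
    for rec in records:
        value = rec.get("value")
        if value is None:
            continue  # cleaning: skip incomplete records
        cleaned.append({
            "sensor_id": str(rec["sensor_id"]).strip().upper(),
            "value": float(value),
            "label": rec.get("label", "normal"),
        })
    return cleaned


def load(records, lake_dir):
    """Loading: write records as JSON Lines into the lake directory."""
    lake_dir = Path(lake_dir)
    lake_dir.mkdir(parents=True, exist_ok=True)
    out = lake_dir / "fdc_records.jsonl"
    with out.open("w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")
    return out


if __name__ == "__main__":
    sample = [
        {"sensor_id": " t2 ", "value": "96.5", "label": "high"},
        {"sensor_id": "t9", "value": None},
    ]
    # A temporary directory stands in for the data lake in this demo.
    with tempfile.TemporaryDirectory() as tmp:
        path = load(transform(extract(sample)), tmp)
        print(path.read_text(encoding="utf-8"))
```

Keeping the three steps as separate functions mirrors the stages listed above and makes each one easy to test or replace on its own.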

Data Lake

A data lake has the following characteristics:

  • Stores and manages large volumes of structured and unstructured data.
  • Can store raw data directly.
  • Data stored in the data lake can be transferred to data warehouses or other analysis tools for further processing and analysis.
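
Storing raw data directly usually means writing each payload unchanged under a predictable path so later tools can find it. The sketch below assumes a local directory as the lake and a `source=.../date=...` folder layout, which is a common partitioning convention but not something this article prescribes; the function names and the example source name are hypothetical.

```python
from datetime import date
from pathlib import Path


def raw_object_path(lake_root, source, day, filename):
    """Build a partitioned path for a raw file: raw/source=<s>/date=<d>/<file>."""
    return (
        Path(lake_root)
        / "raw"
        / f"source={source}"
        / f"date={day.isoformat()}"
        / filename
    )


def store_raw(lake_root, source, day, filename, payload):
    """Write the payload bytes unchanged and return the path used."""
    path = raw_object_path(lake_root, source, day, filename)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(payload)
    return path
```

With this layout, a downstream job that feeds a data warehouse can select a single source or a single day by listing one directory instead of scanning the whole lake.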

Conclusion

Systematically collecting and managing the vast amount of data generated in industrial environments is crucial for maximizing manufacturing efficiency and enhancing competitiveness. Proper data storage and management in data lakes enable real-time analysis and decision support, contributing to overall operational efficiency.
