Zhongtong data security system construction

Posted by graebner at 2020-02-27

With the rapid development of the Internet, data technology and its applications are profoundly changing human society. In recent years, however, data leakage incidents have emerged one after another, involving large companies such as Facebook and Huazhu. From the GDPR and the Personal Information Protection Law to the Cybersecurity Law, data security has been raised to a new level of importance.

As an express delivery company, Zhongtong holds users' personal information, including addresses and mobile phone numbers, because its business processes require it. From order placement through collection, transfer and delivery, personal information runs through the entire business process, and the back-end business systems must provide functions for processing and querying it.

Ensuring the security of sensitive user information has become the most important goal of the information security team.

To do data security well, we first need to understand the data lifecycle, and then select appropriate protection measures for each stage based on the current state of the enterprise and its business scenarios.

Figure 1 data lifecycle

Other approaches in the industry are also worth learning from, such as building the data security system from a risk control perspective, which divides it into data governance, risk identification, risk control, and basic data information.

Figure 2 data risk

In practice, we combine the two approaches above, set priorities according to our actual situation, and start with the control measures that improve protection capability most quickly. The main points are as follows:

In traditional express delivery, the names, telephone numbers and addresses of the sender and the recipient are printed on the waybill with no effective protection.

To protect sensitive information on the waybill, we adopted privacy waybill technology. Personal information on the waybill is desensitized, with * replacing part of the sensitive information. The courier can view the details and contact the recipient only by scanning the code with a dedicated app. At the same time, the user's mobile phone number is replaced by a privacy number, so the courier can contact the user without knowing the real number.

Beyond waybills, sensitive user information also exists widely in the enterprise's internal systems and needs protection there as well. Common data protection measures for application systems are desensitization, permission control, watermarking, behavior risk control, and auditing.

Desensitization: there are two main methods. Which to choose depends on the business scenario, work efficiency, and so on.

Full desensitization: employees can see only part of the data, such as the first three and last four digits of a mobile phone number, for example 135****1234

Desensitization with a clear-text interface: the data is desensitized by default, but an interface for viewing the clear text is retained, mainly for scenarios where personal information must be verified. This interface is connected to behavior risk control and auditing
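The partial masking described above can be sketched as follows. This is a minimal illustration, not the platform's actual code: the function name `mask_mobile` and its parameters are assumptions made for the example.

```python
def mask_mobile(number: str, keep_head: int = 3, keep_tail: int = 4,
                mask_char: str = "*") -> str:
    """Mask the middle of a phone number, keeping the head and tail digits.

    Hypothetical helper: the defaults mirror the "first three, last four"
    rule described in the text.
    """
    if len(number) <= keep_head + keep_tail:
        # Too short to mask meaningfully: hide everything rather than leak it.
        return mask_char * len(number)
    hidden = len(number) - keep_head - keep_tail
    return number[:keep_head] + mask_char * hidden + number[-keep_tail:]


print(mask_mobile("13512341234"))  # 135****1234
```

Masking on the server side, before the data reaches the page, means the clear text never leaves the back end unless the separate clear-text interface is called.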

Watermark: watermarks mainly deter screenshots and photos, and are divided into visible watermarks and invisible (dark) watermarks. An invisible watermark can be extracted by special processing.

Behavior risk control: this mainly adopts the zero trust architecture idea. A trust value is calculated in real time from context such as identity, permissions, behavior track and environment, and secondary verification or interception is then applied to a degree that depends on the trust value.

Audit: a traditional but very effective means of tracing back. The difficulty lies in recording everything, in detail. Using non-invasive design and traffic analysis to collect effective behavior logs automatically and in real time remains a big challenge.

User sensitive information and business sensitive data ultimately have to be persisted in storage systems. However, because the business is complex and flexible, sensitive data has to take part in online business interactions, so it is not possible to encrypt all sensitive data statically. Instead, a specific storage protection scheme has to be adopted for each business scenario.

In addition, the intranet is not an absolutely secure environment, so, following the zero trust model, sensitive data in transit needs two-way (mutual) authentication and encryption, which places very high demands on the performance and availability of the supporting security systems.
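As a rough illustration of the trust-value idea, the sketch below scores a request from four context signals and maps the score to an action. The signal names, weights and thresholds are all invented for the example; a real system would derive them from its own data.

```python
from dataclasses import dataclass


@dataclass
class AccessContext:
    """Context signals for one request (all fields are hypothetical)."""
    identity_verified: bool    # e.g. the user passed strong authentication
    has_permission: bool       # the user's role covers the requested data
    unusual_behavior: bool     # the behavior track deviates from the norm
    trusted_environment: bool  # known device and network


def trust_score(ctx: AccessContext) -> int:
    """Return a trust value from 0 to 100 using illustrative weights."""
    score = 0
    score += 30 if ctx.identity_verified else 0
    score += 30 if ctx.has_permission else 0
    score += 20 if not ctx.unusual_behavior else 0
    score += 20 if ctx.trusted_environment else 0
    return score


def decide(score: int) -> str:
    """Map a trust value to an action; the thresholds are assumptions."""
    if score >= 80:
        return "allow"
    if score >= 50:
        return "secondary_verification"
    return "block"


ctx = AccessContext(True, True, unusual_behavior=True, trusted_environment=False)
print(decide(trust_score(ctx)))  # secondary_verification
```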

Data storage risk control for traditional relational databases

With the support of the R&D efficiency department, we launched IDB, a unified database management platform, and migrated all data access and operations onto it, including permission control, DDL, DML, and database and table sharding. The platform serves as the only entry point to internal data: direct client access and other access paths are blocked, all database operations are funneled through it, and control and auditing are applied at the same time. It also supports data desensitization and other functions, as shown in Figure 3.

Figure 3 data desensitization configuration

An audit function was added to the platform. It records SQL statements, operators, times, and so on, providing basic information for later auditing and tracing, as shown in Figure 4.

Figure 4 database log audit

Data storage risk control for the big data platform

Together with third-party partners, we developed a big data security management and control platform, which embeds security controls into the production process without changing the structure of the existing big data platform. Its functions mainly include resource management, permission management, desensitization management, audit tracing, and high-risk operation control, as shown in Figure 5.

Figure 5 big data security control platform

You can customize the security control policy and configure the desensitization policy according to the business requirements, as shown in Figure 6.

Figure 6 desensitization strategy

You can also define custom high-risk operations, which are then identified, alerted on and intercepted, as shown in Figure 7.
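One simple way to picture custom high-risk rules is a set of patterns checked against each statement before it runs. The sketch below is an assumption-laden illustration: the rule names and regular expressions are invented here and are not the platform's actual rule format.

```python
import re

# Hypothetical rule set: rule name -> pattern matched against the statement.
HIGH_RISK_RULES = {
    "drop_object": re.compile(r"^\s*drop\s+(table|database)\b", re.I),
    "truncate": re.compile(r"^\s*truncate\b", re.I),
    # DELETE or UPDATE with no WHERE clause anywhere in the statement.
    "write_without_where": re.compile(
        r"^\s*(delete|update)\b(?!.*\bwhere\b)", re.I | re.S
    ),
}


def check_statement(sql: str) -> list[str]:
    """Return the names of all high-risk rules that the statement triggers."""
    return [name for name, rule in HIGH_RISK_RULES.items() if rule.search(sql)]


def should_intercept(sql: str) -> bool:
    """Intercept (and alert on) any statement that triggers at least one rule."""
    return bool(check_statement(sql))


print(check_statement("DELETE FROM orders"))               # ['write_without_where']
print(check_statement("DELETE FROM orders WHERE id = 1"))  # []
```

Pattern matching like this is only a first filter; a production system would more likely parse the statement, since regular expressions can be fooled by comments or string literals.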

Figure 7 high risk operation

Unified encryption and decryption service

For the two scenarios above, and for the safe use of other data stores such as caches, message queues and ES indexes, encryption must be applied at both the transmission layer and the storage layer. However, encrypting in a way that ensures security without affecting the business is still very difficult. Two points need to be considered:

1. Who performs the encryption? If each business system encrypts its own data, each one faces the problems of algorithm selection, key management, and so on.

2. In offline scenarios, if the data sets to be joined for analysis are encrypted in different ways, they must be decrypted before they can be joined, which reduces data liquidity.

Based on these two points, we chose to build a unified encryption and decryption service. The overall architecture is shown in Figure 8. The platform performs the encryption and decryption of sensitive data, which keeps the algorithms and keys secure. Each business system only needs to call the interface to encrypt sensitive data before storing it in the database. The two-way authentication and transport encryption of internal sensitive-data interfaces require certificate management, revocation and rotation, which depend on a PKI system. We are exploring building our own PKI and will share it in a later article.

Figure 8 encryption and decryption service architecture

The specific encryption scheme is tokenization. Data is encrypted as it enters the system, so that tokens can be used throughout the entire data interaction process.

In designing the token, considering the complexity and variety of business scenarios, the token has the same length as the original data, and the leading and trailing characters of the original data can be retained. For example, for a mobile phone number, the first three and last three digits after the area code are kept, and the middle five digits are replaced by random placeholder characters, so the data type can be quickly identified without knowing the plaintext.
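The token layout described above can be sketched with a small in-memory vault. This is an illustration under stated assumptions, not the service's implementation: the class `TokenVault`, the use of uppercase letters for the random middle, and the dictionary storage are all invented for the example. A real service would keep the mapping in protected storage and check authorization before detokenizing.

```python
import secrets
import string


class TokenVault:
    """Hypothetical length-preserving tokenizer backed by an in-memory map."""

    def __init__(self, keep_head: int = 3, keep_tail: int = 3):
        self.keep_head = keep_head
        self.keep_tail = keep_tail
        self._token_by_plain: dict[str, str] = {}
        self._plain_by_token: dict[str, str] = {}

    def tokenize(self, plaintext: str) -> str:
        # The same plaintext always maps to the same token, so joins still work.
        if plaintext in self._token_by_plain:
            return self._token_by_plain[plaintext]
        middle_len = len(plaintext) - self.keep_head - self.keep_tail
        if middle_len <= 0:
            raise ValueError("value too short for this token layout")
        head = plaintext[:self.keep_head]
        tail = plaintext[len(plaintext) - self.keep_tail:]
        while True:
            # Random letters make a token easy to tell apart from a real number.
            middle = "".join(
                secrets.choice(string.ascii_uppercase) for _ in range(middle_len)
            )
            token = head + middle + tail
            if token not in self._plain_by_token:  # keep tokens unique
                break
        self._token_by_plain[plaintext] = token
        self._plain_by_token[token] = plaintext
        return token

    def detokenize(self, token: str) -> str:
        # A real service would verify the caller's authorization here.
        return self._plain_by_token[token]


vault = TokenVault()
token = vault.tokenize("13512345678")
print(token)                    # e.g. 135QXKTB678 (the middle is random)
print(vault.detokenize(token))  # 13512345678
```

Because the token has the same length as the original value, it fits existing database columns without a schema change.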

The construction of unified encryption and decryption service has the following advantages:

The business system stores only tokens, so the database structure needs almost no changes

The business system only needs to call the encryption and decryption interface, so encryption components are not developed repeatedly

The business system has no access to the encryption algorithm, key or ciphertext, which reduces the risk of disclosure

Decryption is controllable. Only authorized systems can call the decryption interface, which makes management and monitoring easier

After tokenization, internal systems interact with each other using tokens and decrypt only where plaintext is required, such as scenarios that reach the user: sending email, sending SMS, making phone calls, and so on.

In big data analysis scenarios, token uniqueness is preserved, so join analysis is not affected. Dimension information in the data (birthday, mobile operator, etc.) can be computed in advance to meet most offline analysis needs and reduce unnecessary decryption. Only a few scenarios need clear text, such as data delivery and user return visits, and these can be decrypted in a closed loop through the data distribution platform.

Figure 9 flow of sensitive data

Encryption and sharing

For data sharing, a hash algorithm is used, and key management ensures that only authorized parties can match and use the data. Both parties agree on an irreversible algorithm and apply it to their own data. One party then submits a request carrying its processed data to the other party for matching (collision), and the requested party returns yes or no, so the overlap between the two parties' data can be found without exchanging plaintext.
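A minimal sketch of this yes/no matching, assuming both parties share a secret key and use HMAC-SHA256 as the agreed irreversible algorithm. The source names neither the algorithm nor the interface, so `blind`, `RequestedParty` and the key handling are illustrative assumptions.

```python
import hashlib
import hmac


def blind(value: str, shared_key: bytes) -> str:
    """Keyed, irreversible digest of a value (HMAC-SHA256 is an assumption)."""
    return hmac.new(shared_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class RequestedParty:
    """Hypothetical party that answers yes/no to digest lookups."""

    def __init__(self, values, shared_key: bytes):
        # Only digests are kept for matching; plaintext is not compared.
        self._digests = {blind(v, shared_key) for v in values}

    def contains(self, digest: str) -> bool:
        return digest in self._digests


key = b"example-shared-key"  # placeholder; a real key comes from key management
party_b = RequestedParty(["13512345678", "13800001111"], key)

print(party_b.contains(blind("13512345678", key)))  # True
print(party_b.contains(blind("13900002222", key)))  # False
```

Note that for low-entropy values such as phone numbers, anyone holding the key can enumerate candidates and test them, so the protection rests on keeping the key restricted to the authorized parties.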

Data lifecycle closed loop

Data distribution platform: the closed loop at the end of the data flow. See the previously published article: Zhongtong data security distribution platform practice

All data decryption, upload and download, batch export, etc. are routed through the data distribution platform for centralized control.

Data discovery, classification and grading are the most basic and important step in data security and data asset management, but in reality they are often the weakest part. The main reasons are a lack of attention and a lack of corresponding tools and technologies. In practice, this work is indeed difficult. The specific methods are as follows:

1. Scan the database and trace back to the application

Scan the database, identify the sensitive data stored in it, and then trace back from the database to the applications, achieving classification and grading at both the database field level and the application system level.

Advantages: simple to implement; only the database needs to be scanned

Disadvantages: the mapping between databases and applications must be maintained; applications that only forward data without storing it cannot be detected

2. Database scanning combined with traffic analysis

Scan the database to classify and grade the data in database fields.

Traffic analysis: classify and grade application systems by analyzing the data in their interfaces.

Advantages: coverage is complete; applications that do not store data can also be identified, including online systems and back-end systems

Disadvantages: whether the traffic is complete depends on the network architecture, and the traffic must be cleaned

Given our actual situation and technical capabilities, we chose database scanning combined with traffic analysis to implement data discovery, classification and grading.
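The database scanning step can be illustrated by classifying a column from a sample of its values. The sketch below is a simplified assumption: the two patterns, the labels and the 80% threshold are chosen for the example, and a real scanner would cover many more data types and also look at column names.

```python
import re

# Hypothetical detectors: label -> pattern that a single value must match.
PATTERNS = {
    "mobile_phone": re.compile(r"^1[3-9]\d{9}$"),  # mainland China mobile format
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
}


def classify_column(samples, threshold: float = 0.8):
    """Return the first label matched by at least `threshold` of the values.

    Empty values are ignored; returns None if no label reaches the threshold.
    """
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return None
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= threshold:
            return label
    return None


column = ["13512345678", "13800001111", "", "13999998888", "13712345678"]
print(classify_column(column))                    # mobile_phone
print(classify_column(["order-1", "order-2"]))    # None
```

Using a match ratio rather than requiring every value to match tolerates the dirty data that real columns usually contain.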

Before the emergence of UEBA, people used various means to detect and monitor known threats. UEBA uses machine learning to analyze user and entity behavior, and can detect both known and unknown threats through offline and real-time analysis.

The commonly recommended approach is to take business logs as input and use AI to run models and rules against them. Whether the log dimensions of the business systems are complete directly affects the accuracy of the results.

Here we introduce another kind of data: internal traffic data. Under a layer 2 network architecture, we can easily obtain the traffic mirror we want, analyze the traffic, and find anomalies through AI analysis. The traffic contains a lot of noise, which must be taken into account in the analysis.

The noise does not necessarily have to be excluded: under the same conditions, the differences between users should still be obvious. Finally, log anomalies and traffic anomalies are analyzed and scored together to improve accuracy.

By processing metadata, a data asset map is formed. From a global perspective, information is merged and organized to show data volume, data changes, data storage, data usage and so on, providing a reference for data management departments and decision makers.

According to the role of the viewer, the map can be divided into data user, data developer, data owner and data security perspectives. Content and functions are displayed per role and integrated into our unified security management and operations platform.

Data has no boundaries; wherever data flows, there should be data security.

At present, information security fields such as data security, business security, application security and network security overlap and relate to one another to some extent. They complement each other and need to be planned and built from a broader perspective.

Our efforts are directed at using management and technical measures matched to business scenarios to better resolve data security risks in the enterprise's internal and external data application scenarios, offering our security capabilities as services, and building a more complete security ecosystem.