This article summarizes a talk given by Nie Jun, a guest speaker at the South China session of the ThreatBook Threat Intelligence Salon.
Matrix Monitoring in Practice: Improving Security Effectiveness
Throughout my career I have worked on security construction on the enterprise side (Party A). Along the way I noticed that enterprises and vendors (Party B) approach enterprise security differently. A vendor starts from how to sell its products or services to customers more effectively, while an enterprise user cares about how to solve the actual security problems in front of it. There is a gap between these two goals. However, as enterprises become better informed, and as more vendors such as ThreatBook focus on delivering services that actually work for customers, the gap between Party A and Party B will gradually narrow.
In my view, the security mindset of the security staff matters most. If that mindset is off in direction or reasoning, the resulting protection will differ greatly. For in-house security staff, what matters most is how they go about solving security problems, and from what angle and at what level they look at those problems.
In protecting banks and securities firms, my security mindset is this: information security is a contest, a war, between people, and the two sides are fighting for control of information assets. So what should we pay attention to in this contest?
When we fight, we always wish for a magic-mirror tool that, once installed, reveals every attack against us. In reality no such tool exists. There is no silver bullet and no once-and-for-all solution. From ten years of security construction in the financial industry I have distilled a broadly applicable framework, of which matrix monitoring, the topic of today's talk, is a part. The framework has four modules: the security protection framework, the security operations framework, the security verification framework, and the security measurement framework.
The security protection framework refers to the set of front-end security sensors that the whole security system needs. These sensors form the "Skynet" (dragnet) of information security. This is foundational work and the main battlefield of traditional security, and it requires years of sustained investment. Sensor deployment follows the principle of defense in depth.
Beyond the obvious security devices, many other data sources can serve as sensors in practice: monitoring points enriched with threat intelligence, such as that provided by ThreatBook; at the network layer, firewalls with built-in IPS functions, such as Check Point's SmartDefense, which make good sensors; ACS logs that record logins and command operations on switches and routers; bastion host logs; virtual host operation logs; at the host layer, security client data; data from out-of-band management systems such as KVM and iLO; work order data from the ITIL system; and at the application layer, application logs from the OA and official document systems.
A large part of an enterprise's basic security work is building these sensors to address individual, point-like security problems and needs. They are the most important part of the security protection framework. At this level we can find many interesting things. For example, if an operations engineer bulk-downloads data from a file server within a short time, that person may be preparing to resign.
Once the sensors are producing a large volume of alerts, how do we find the real anomalies among them? That is the job of the security operations framework. Some anomalies, such as Trojans and viruses, can be detected with a single detection rule. Others require correlation analysis, with judgments based on models and thresholds. These are the things the security operations framework handles.
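To make the difference between a single-rule match and a threshold-based correlation concrete, here is a minimal Python sketch. It is not the speaker's implementation: the event fields, the signature name `trojan.generic`, and the threshold of 50 downloads within 10 minutes are all hypothetical values chosen for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Hypothetical values, chosen only for illustration.
KNOWN_BAD_SIGNATURES = {"trojan.generic"}
DOWNLOAD_THRESHOLD = 50
WINDOW = timedelta(minutes=10)


def single_rule_alerts(events):
    """Single detection rule: flag events carrying a known-bad signature."""
    return [e for e in events if e.get("signature") in KNOWN_BAD_SIGNATURES]


def bulk_download_alerts(events):
    """Correlation rule: flag users with more than DOWNLOAD_THRESHOLD
    file downloads inside a sliding time window."""
    recent = defaultdict(deque)  # user -> download timestamps in the window
    flagged = set()
    for e in sorted(events, key=lambda ev: ev["time"]):
        if e.get("action") != "file_download":
            continue
        stamps = recent[e["user"]]
        stamps.append(e["time"])
        # Drop timestamps that have fallen out of the window.
        while e["time"] - stamps[0] > WINDOW:
            stamps.popleft()
        if len(stamps) > DOWNLOAD_THRESHOLD:
            flagged.add(e["user"])
    return sorted(flagged)


start = datetime(2024, 1, 1, 9, 0, 0)
events = [
    {"time": start + timedelta(seconds=i), "user": "ops01", "action": "file_download"}
    for i in range(60)
]
events.append({"time": start, "user": "dev02", "action": "login",
               "signature": "trojan.generic"})
events.append({"time": start, "user": "dev03", "action": "file_download"})

signature_hits = single_rule_alerts(events)
bulk_users = bulk_download_alerts(events)
```

In this sample, `dev02` is caught by the single rule, while `ops01` is caught only by the correlation rule because no individual download event is suspicious on its own.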
Once genuinely useful anomalies have been identified, they enter a tracking and follow-through process within the security operations framework. In an enterprise, tracking and confirming security alerts one by one costs security staff a great deal of time and energy. This is where an incident management process comes in, together with protective tools such as WAF and IPS. A security client or agent is deployed on each host, and tracking proceeds according to machine rules and the operations staff. How, then, do we make sure this system itself does not fail? That is the role of the security verification framework, which currently uses two methods. The first is white-box verification: simulate an attack event, then check whether it is handled within the expected time (for example, 24 hours). The second is black-box verification: invite white hats to run crowdsourced testing against the enterprise and submit vulnerabilities.
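The white-box check described above can be sketched roughly as follows. This is an assumption-laden illustration: the record fields (`id`, `injected_at`, `handled_at`) are hypothetical, and the 24-hour window is simply the example figure from the text.

```python
from datetime import datetime, timedelta

# The 24-hour figure is the example window mentioned in the text.
EXPECTED_HANDLING_TIME = timedelta(hours=24)


def verify_handling(simulated_events, now):
    """Return ids of simulated attacks that were not handled in time."""
    failures = []
    for ev in simulated_events:
        deadline = ev["injected_at"] + EXPECTED_HANDLING_TIME
        handled_at = ev.get("handled_at")
        if handled_at is None:
            # Still open: counts as a failure only after the deadline.
            if now > deadline:
                failures.append(ev["id"])
        elif handled_at > deadline:
            failures.append(ev["id"])
    return failures


t0 = datetime(2024, 1, 1, 8, 0, 0)
simulated = [
    {"id": "sim-1", "injected_at": t0, "handled_at": t0 + timedelta(hours=3)},
    {"id": "sim-2", "injected_at": t0, "handled_at": t0 + timedelta(hours=30)},
    {"id": "sim-3", "injected_at": t0, "handled_at": None},
]
failed = verify_handling(simulated, now=t0 + timedelta(hours=48))
```

Here `sim-2` fails because it was handled late and `sim-3` fails because it was never handled, which is exactly the kind of silent breakdown the verification framework is meant to expose.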
The security measurement framework uses a set of security effectiveness indicators to measure whether the protection system actually works. Used flexibly, it lets us assess the current security posture and plan for the next three years.
Building and refining these four frameworks takes a long time and has to be done iteratively.
That is my overall thinking on security. Next I will talk about why monitoring matters and the pitfalls we ran into. Security sensors are our eyes, and whether those eyes work depends on how the monitoring is arranged. Monitoring is just as important as the business systems and security systems themselves. In practice we have found that defenses are breached because security monitoring has failed, in the following ways.
First, important protective facilities are not deployed, or their utilization is low. As more and more equipment falls outside centralized, unified management, asset management gets worse. How to ensure that newly deployed equipment receives the corresponding monitoring is therefore a problem that needs attention. Take antivirus as an example: its detection rate for known viruses and malicious programs is essentially 100%, but in many enterprises the installation rate and the healthy-running rate stay low for a long time after deployment. The detection capability exists, yet nothing is detected.
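The two rates mentioned above can be tracked with a few lines of Python. The sketch below assumes hypothetical definitions: the installation rate is installed hosts divided by hosts in scope, and the healthy-running rate is installed hosts whose agent is running with current signatures divided by installed hosts. The field names are illustrative.

```python
def coverage_rates(hosts):
    """Return (installation rate, healthy-running rate) for a host list.

    Assumed definitions, for illustration only:
    - installation rate = installed hosts / hosts in scope
    - healthy rate      = installed hosts that are running with
                          current signatures / installed hosts
    """
    total = len(hosts)
    installed = [h for h in hosts if h["installed"]]
    healthy = [h for h in installed
               if h["running"] and h["signatures_current"]]
    install_rate = len(installed) / total if total else 0.0
    healthy_rate = len(healthy) / len(installed) if installed else 0.0
    return install_rate, healthy_rate


hosts = [
    {"name": "win-01", "installed": True, "running": True, "signatures_current": True},
    {"name": "win-02", "installed": True, "running": False, "signatures_current": True},
    {"name": "win-03", "installed": True, "running": True, "signatures_current": True},
    {"name": "win-04", "installed": False, "running": False, "signatures_current": False},
]
install_rate, healthy_rate = coverage_rates(hosts)
```

Reporting the two numbers separately matters because a high installation rate can hide agents that are installed but stopped or out of date.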
Second, the monitoring function of the security sensor itself fails. A sensor is a program, and programs can fail in many ways. In addition, if the monitored data cannot reach the back-end processing platform promptly, the effect is the same as if the sensor had failed.
Third, the security monitoring rules on the security operations platform may not be effective, clear, and consistent. We ran into a case where a commercial product, in order to improve user experience, automatically corrected "bad" rules: when the alerts triggered by a rule reached a certain threshold, the product assumed the rule was faulty and adjusted it on the user's behalf. This can wrongly suppress useful rules or cause valid high-severity alerts to be missed.
Fourth, other failures occur in the alerting path, such as email and SMS notifications failing or accounts being disabled by mistake.
So what should be done about it?
Our answer has four parts: accurate assets, clear monitoring standards, matrix monitoring, and continuous tracking and remediation.
For asset accuracy, we built our own asset management platform and systematized asset management. It inventories IP addresses, domain names, hosts, and other information, replacing the earlier Excel files with an automated process. It is linked to the IP address and domain name systems, pulls their data automatically into the asset management system, and is also tied to automated scanning and asset identification tools. Over the past two years more and more companies have been offering asset management and discovery on the Internet-facing side. They are worth looking into, but the most important thing is to integrate them with the enterprise's internal management.
In terms of clear monitoring standards, here is a schematic diagram:
The horizontal direction shows which monitoring configurations each type of security device needs, and the vertical direction shows which monitored objects each monitoring item collects from. For example, our platform collects logs from certain devices and our security client is deployed on certain hosts; a check mark indicates that this type of device must have this monitoring item. For log collection, we consider that every monitored device type must have its logs collected. For the security client, we may only monitor hosts and firewalls, such as the Check Point and Juniper devices in the figure. N/A means not applicable. Each item has an owner who defines the monitoring standard. Matrix monitoring is then deployed, reports are generated separately from the horizontal and vertical dimensions, and the two are compared automatically to uncover monitoring failures.
The horizontal dimension checks these configurations on the device itself. For example, on a Windows server it checks whether the log monitoring process is present.
The vertical result looks at the monitored objects from a single dimension, such as log monitoring. For example, from the SOC platform you can see whether logs are being collected from every Windows host. A host whose logs cannot be collected is treated as being outside the monitoring environment.
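One way to make such a monitoring standard machine-readable is a small nested mapping. The sketch below is only an illustration: the device types, monitoring items, and cell values are hypothetical and do not reproduce the figure from the talk.

```python
REQUIRED = "required"
NOT_APPLICABLE = "n/a"

# Hypothetical monitoring standard: rows are device types, columns are
# monitoring items. Cell values are illustrative, not the speaker's.
MONITORING_STANDARD = {
    "windows_server": {
        "log_collection": REQUIRED, "security_client": REQUIRED},
    "checkpoint_firewall": {
        "log_collection": REQUIRED, "security_client": REQUIRED},
    "network_switch": {
        "log_collection": REQUIRED, "security_client": NOT_APPLICABLE},
}


def items_for(device_type):
    """Horizontal view: monitoring items one device type must have."""
    row = MONITORING_STANDARD[device_type]
    return sorted(item for item, need in row.items() if need == REQUIRED)


def device_types_for(item):
    """Vertical view: device types that one monitoring item must cover."""
    return sorted(
        device for device, row in MONITORING_STANDARD.items()
        if row.get(item) == REQUIRED
    )


switch_items = items_for("network_switch")
client_scope = device_types_for("security_client")
```

Keeping the standard as data rather than as a diagram means both the horizontal and the vertical report can be generated from the same source.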
Comparing the horizontal and vertical results then yields one of four outcomes:
If both the horizontal and vertical results are fault, the monitoring is missing and must be remediated immediately;
If one of the two results is fault and the other is OK, one of the results is wrong, possibly because of a program problem or a network failure. The comparison result is recorded as diff and needs further investigation;
If both the horizontal and vertical results are OK, the monitoring is configured and working normally;
If both the horizontal and vertical results are N/A, the item does not apply.
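The four-way comparison above can be sketched in Python as follows. The status strings and host names are illustrative, and the handling of a mixed case where only one side reports N/A is an assumption: the text does not cover it, so the sketch treats it as a diff to be investigated.

```python
def compare(horizontal, vertical):
    """Combine one horizontal and one vertical check result."""
    if horizontal == "N/A" and vertical == "N/A":
        return "N/A"        # the item does not apply
    if horizontal == "OK" and vertical == "OK":
        return "OK"         # configured and working
    if horizontal == "FAULT" and vertical == "FAULT":
        return "MISSING"    # monitoring absent, remediate immediately
    # One side disagrees with the other. Treating a lone N/A as a diff
    # is an assumption; the source only describes the four pure cases.
    return "DIFF"


# Keys are (host, monitoring item); all names are hypothetical.
horizontal_results = {
    ("win-01", "log_collection"): "OK",
    ("win-02", "log_collection"): "FAULT",
    ("win-03", "log_collection"): "OK",
    ("sw-01", "security_client"): "N/A",
}
vertical_results = {
    ("win-01", "log_collection"): "OK",
    ("win-02", "log_collection"): "FAULT",
    ("win-03", "log_collection"): "FAULT",
    ("sw-01", "security_client"): "N/A",
}

matrix = {
    cell: compare(horizontal_results[cell], vertical_results[cell])
    for cell in horizontal_results
}
```

In this sample `win-03` comes out as a diff: the host believes its log process is fine, but the platform is not receiving its logs, which points to a program or network problem between the two.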
Finally, link the horizontal results, vertical results, and comparison results to the asset inventory, and look for equipment in the asset list on which monitoring has not been deployed, so that the monitoring system covers everything.
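Tying monitoring results back to the asset list amounts to a set comparison. The sketch below is a minimal illustration with hypothetical IP addresses and host names; it also reports the reverse case, monitored addresses that the asset list does not know about, which is one way errors in the asset list itself can surface.

```python
def coverage_gaps(assets, results):
    """Compare the asset inventory with monitoring results.

    Returns (unmonitored, unknown):
    - unmonitored: assets with no monitoring result at all
    - unknown: monitored addresses missing from the asset inventory
    """
    unmonitored = sorted(set(assets) - set(results))
    unknown = sorted(set(results) - set(assets))
    return unmonitored, unknown


# Hypothetical inventory (IP -> host name) and comparison results.
asset_inventory = {
    "10.0.0.1": "win-01",
    "10.0.0.2": "win-02",
    "10.0.0.3": "win-03",
}
monitoring_results = {
    "10.0.0.1": "OK",
    "10.0.0.2": "DIFF",
    "10.0.0.9": "OK",
}

unmonitored, unknown = coverage_gaps(asset_inventory, monitoring_results)
```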
Once these gaps are found, remediation has to be tracked continuously. Our approach is to generate a matrix monitoring report every day, prioritize the open problems, and bring the high-priority items into supervised remediation. At the start we were quite confident, because we had taken monitoring seriously for many years. Yet once matrix monitoring was running, we found a large number of monitoring gaps, and it took more than n hours to complete the basic monitoring work. We also integrated the daily matrix monitoring report with the daily security duty shift: non-conforming items are shown graphically on the monitoring interface, the on-duty staff inspect them every day and record them in detail in the duty log, and the team leader then supervises their remediation. Anything not remediated in time stays bright red on the visual monitoring platform.
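A daily report with prioritization could look roughly like the sketch below. The priority scheme is an assumption made for illustration: items where monitoring is missing outright rank above items where the two checks merely disagree (labeled here as `MISSING` and `DIFF`), and older items rank above newer ones. The field names are hypothetical.

```python
# Assumed priority order: lower number means more urgent.
PRIORITY = {"MISSING": 0, "DIFF": 1}


def daily_report(findings):
    """Return open findings, most urgent first."""
    open_items = [f for f in findings if f["status"] in PRIORITY]
    return sorted(
        open_items,
        key=lambda f: (PRIORITY[f["status"]], -f["days_open"], f["host"]),
    )


findings = [
    {"host": "win-03", "item": "log_collection", "status": "DIFF", "days_open": 1},
    {"host": "win-02", "item": "log_collection", "status": "MISSING", "days_open": 3},
    {"host": "win-01", "item": "log_collection", "status": "OK", "days_open": 0},
    {"host": "fw-01", "item": "log_collection", "status": "MISSING", "days_open": 5},
]

report = daily_report(findings)
report_hosts = [f["host"] for f in report]
# Items escalated to supervised remediation under the assumed scheme.
escalated = [f["host"] for f in report if f["status"] == "MISSING"]
```

Items with status `OK` are dropped from the report, so the on-duty staff only see what still needs action.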
After deploying matrix monitoring we saw several benefits. First, more than 20 errors in host names and IP addresses in the asset list were found within one working day and quickly fixed. Monitoring coverage is now close to 100%. We can also check whether new equipment has been brought into asset management and security monitoring, and whether monitoring has failed on existing equipment. As a result, our monitoring no longer depends simply on the staff's sense of responsibility and on luck, and it has improved over what we had before.
In my opinion, the idea behind matrix monitoring is general. It can be used not only to keep security operations effective and sustainable, but also in most other operations fields. My view is that however strong a security technology or product is, it must be operable by people who do not have deep expertise. Products like ThreatBook's manage to do this. When I evaluate a commercial product, I first consider what it will cost me to operate and whether I can operate it at all. If I cannot, the product or solution is useless to me. However effective a monitoring technology is, without security operations behind it, it is a Maginot Line that provides no real defense. Under matrix monitoring, the front-line security duty staff need no special skills, and even recent graduates can do the job. The front-end display is simple and direct: if something is bright red, notify the owner to handle it, and essentially nothing else is required. This is security engineering capability in action, and I believe security engineering is an important indicator of how mature an enterprise's security construction is.
That concludes my talk. Thank you.