Introduction
This article is based on the draft of a talk I gave at a securities and fund industry exchange on October 28, 2016. I have been working for more than ten years, and apart from routine work reporting I had never given a talk on stage; this was my first. I shared my experience and understanding of security operations for enterprises in the financial industry: the problems faced, the security operations architecture, the supporting tools and required resources, the difficulties of security operations, why security detection fails, and the maturity of security operations. I also offer some thoughts on the grey areas of enterprise security and on whitelists versus blacklists. My experience and industry background are limited, so my views may not be universally representative, and my ability and vision are limited too. Criticism and corrections are welcome. If this offers some reference and help to colleagues who need it, the effort will not have been in vain.
Problems faced
1. Security requirements
In the early stage of building enterprise information security, the work consists mainly of purchasing security equipment, deploying security management and control systems, and carrying out daily maintenance. From the network layer, virtualization layer and system layer to the application layer, data layer and user layer, a series of security devices and software is deployed and kept running stably. Yet the security situation does not improve noticeably and security problems keep occurring. The root cause is the lack of effective security operations. This article discusses how an enterprise can build an effective security operations system.
The financial industry is a licensed industry under strict supervision. Its typical feature is that sustainable, steady business development must be underpinned by security. Financial institutions have full-time security teams and personnel, and the annual security budget is guaranteed and grows year by year. The security needs of enterprises are as follows:
2. Problems
Given the enterprise's security requirements, three main problems arise:
(1) The whole picture of enterprise security
What is the whole picture of enterprise security? Let us take a microscope to what a security administrator does every day: checking whether all kinds of security devices and software are running normally, including hardware alarms, applications and processes, performance and capacity, tablespace, storage space, and system and application logs; checking and responding to alarms from security devices and systems, including anti-virus logs and alarms, firewall logs and alarms, intrusion detection logs and alarms, Internet monitoring alarms, honeypot systems, data leakage prevention logs and alarms, reports and logs from audit systems such as database audit and firewall rule audit, and information from external third-party vulnerability reporting platforms; handling all kinds of security testing requests and work orders; fixing and rechecking all kinds of security vulnerabilities; where the administrator is also responsible for managing branches, supervising branch security management; filling in security statements and reports, reporting suspicious situations and tracing them further; pushing forward security projects, both managerial and technical; and, for some, spending a great deal of energy on security inspections and internal and external audits. Anyone who has done front-line security operation and maintenance will recognize these scenarios. They are a snapshot of enterprise security, but not the whole picture. If an enterprise has only a few people, servers and products, the work above is the whole of its security work. I define this work as the security protection framework. If there are tens of thousands of servers, hundreds of programmers and hundreds of systems, then in addition to vulnerability detection and repair, security testing, and attack and defense, enterprise security must also consider security operations.
In terms of workload, security protection and security operations each account for 50%.
(2) Keeping security service quality within a stable range
Building the security protection framework means deploying a large number of protective devices and measures. This significantly improves detection capability, but it also raises questions. How do we respond effectively to all the security logs and alarms? As the number of security devices grows rapidly, how do we make sure the devices remain effective? And how do we keep the quality of people's work stable while the number of devices and alarms grows rapidly? The last question concerns the stability of work quality: our goal is to eliminate the influence of individuals on the quality of the security service the team provides. Consider the difference between a banquet and fast food. A banquet depends on the performance of a famous chef. If the chef is in a bad mood today, or has been replaced, the quality of the dishes may drop sharply. KFC, by contrast, tastes exactly the same everywhere, because every operation is standardized and turned into a process. People with only a junior high school education and no culinary training can do the job after a short period of training under strict management. The standardized processes and management of fast food almost completely eliminate the human factor and keep the quality of service stable, without ups and downs. The goal of security operations, or the problem they must solve, is to keep the quality of security service within a stable range as far as possible while the enterprise grows, its business and systems become more complex, and resource investment stays unchanged.
(3) Improving security engineering capability
Another problem security operations must solve is improving security engineering capability. For example, many experienced security engineers can investigate a server suspected of being compromised by examining its processes and log records. That is a personal skill. If this skill can be turned into automated security monitoring, with emergency response and handling carried out through a security platform, then security staff who lack the skill can also become a force against attackers. This is the benefit of improving security engineering capability, and it is a problem security operations must solve.
Approach to security operations
1. Architecture
To keep the security operations architecture flexible and extensible, I recommend dividing it into four functional modules: the security protection framework, the security operation and maintenance framework, the security verification framework and the security measurement framework. The security protection framework provides real-time detection capability through the continuous deployment of security monitoring systems. These systems are the security "sensors" and serve as the "eye of heaven" for the security operation and maintenance framework. The currently popular situational awareness and intrusion awareness rely mainly on the security protection framework. The security operation and maintenance framework collects the monitoring information from each sensor in the protection framework, processes it through blacklist, whitelist and greylist handling and correlation analysis (many vendors call this big data intelligent analysis; as I understand it, it is just rule-based data mining), outputs alarms through a unified display platform, feeds them into the event handling platform and its process, and brings in manual intervention. The security operation and maintenance framework also includes regular review of security events and reporting to management, which may matter more than handling any single event. The security verification framework ensures, through black-box and white-box verification, that the protection framework and the operation and maintenance framework remain effective. The security measurement framework measures and evaluates the quality of security operations through a set of security metrics and drives targeted, continuous improvement, so that quality rises in a spiral.
(1) Security protection framework
The purpose of the security protection framework is to deploy as many effective security sensors as possible. Together these sensors form the "Skynet" of information security. This is the basic work and the main battlefield of traditional security, and it takes years of continuous investment and accumulation. Sensor deployment follows the concept of defense in depth, as shown in the following diagram:
In practice there may be far more sensors than these. At the network layer, for example, you can collect firewall monitoring information, especially deny records. Some firewalls also have IPS functions, such as Check Point SmartDefense, which is a particularly useful security sensor. Other sources include ACS server information for switches and routers; bastion host login records; virtual host operation records at the virtualization layer; Windows and Linux host logs; monitoring information from security clients deployed on hosts; database audit system records; AD system information; storage and backup system operation records; information from out-of-band management systems such as KVM and iLO; ITIL work order information; and application system information such as OA system application logs, SAP system application information, official document transmission system logs and FTP data transmission logs. A large part of an enterprise's basic security work is building security sensors of all kinds to address specific security problems and needs one by one. For example, an enterprise may have many firewalls. Managing the effectiveness and compliance of firewall rules may require firewall rule audit tools such as AlgoSec and FireMon, and the audit findings can serve as input to the security operation and maintenance framework. If you want to monitor malicious addresses accessed from the intranet or from servers, you can collect open source malicious address libraries like arcosi.
Two questions need to be considered when building the security protection framework. First, should sensors send raw monitoring information to the security operation and maintenance framework, or alarms they have already processed? For firewalls, IPS and similar protection systems, send the full raw information where possible. For Windows and Linux host logs, compliance checks, login and logout records and similar information, consider filtering at the source so that only security-related information is passed on. Second, should business security monitoring be included? ayazero of Huawei holds that enterprise security covers seven areas: ① network security; ② platform and business security; ③ information security in the broad sense; ④ IT risk management, IT audit and internal control; ⑤ business continuity management; ⑥ security brand marketing and channel maintenance; and ⑦ other needs of CXOs. For traditional industries, ①③④⑤ are suggested. For Internet companies, ①②⑤. For the financial industry, I suggest ①③④. For strong security teams, ①③④⑥⑦.
(2) Security operation and maintenance framework
The goal of the security operation and maintenance framework is to become the brain, nerve center, eyes, ears, hands and feet of enterprise security. In modern military doctrine, the US military introduced the C4ISR command system: command, control, communications, computers, intelligence, surveillance and reconnaissance. By analogy, a complete automated command system for information security operations should include the following subsystems:
"Brain": the infrastructure platform. It is the technical foundation of the automated command system and needs large capacity, high speed and strong compatibility.
"Eyes and ears": the security intelligence, surveillance and reconnaissance system. It collects and processes the security information from each sensor in the security protection framework and provides real-time monitoring of abnormal behavior.
"Nerve center": the data analysis system. Using intelligent analysis algorithms and data mining techniques, it automates the processing of security information and puts decision-making on a scientific footing, so that security control equipment is managed efficiently. The main technology is intelligent analysis algorithms and models and their implementation.
"Hands and feet": the security control system. It is the tool that collects and displays security information and carries out the security control instructions issued by the command system. It consists mainly of security control technologies and equipment such as anti-virus, host security clients and firewalls, and provides real-time control of abnormal behavior.
In practice, the enterprise implements the security operation and maintenance framework by deploying a SIEM, SOC or similar platform for unified collection and storage of security detection information. Most SIEM and SOC platforms support built-in or custom blacklist detection rules for real-time detection. Some enterprises add a big data mining and intelligent analysis platform to make up for the weak analytics of SIEM and SOC platforms. Event handling follows a standardized process that is incorporated into the ITIL event management process, with notifications given through work orders, alarm emails, SMS and so on. Confirmation and traceability analysis of security events is done mainly by hand. Attacks that are identified with 100% certainty are blocked automatically. Closed-loop management of security events is carried out through daily meetings, weekly reports and monthly reports.
(3) Security verification framework
The security verification framework addresses the question of whether security is effective. It verifies the functions of the security protection framework and the security operation and maintenance framework. It is the blue army of enterprise security: in peacetime the blue army plays the adversary, which helps to find, evaluate, fix, confirm and improve weak points in the two frameworks in good time. It has two parts: white-box testing (process verification) and black-box testing (result verification).
White-box testing (process verification) means building an automated verification platform that verifies 100% of the control measures in the security protection framework and is integrated visually into the security operation and maintenance platform, so that the failure of any control measure is discovered within 24 hours. Through the automatic verification platform:
Given these objectives, automated verification requires that every verification event be generated by automatically simulating a real event, not by inserting records. Each verification event must carry a unique identifier so that it can be recognized as a verification event, and the trigger times of verification events should be spread evenly to avoid bursts. The security operation and maintenance platform must be able to monitor the systems and rules that fail verification, generate alarms and notify the security operations staff to intervene.
Black-box testing (result verification) means finding your own vulnerabilities and weaknesses before the adversary does, through multi-channel penetration testing and red-blue confrontation exercises. The most common multi-channel mechanism today is crowdsourced security testing. Red-blue exercises require in-house security staff with strong attack and defense skills, or they can be carried out by external professional organizations, to test the effectiveness of the security protection framework and the security operation and maintenance framework.
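The scheduling and identification requirements above can be sketched in a few lines. This is only an illustration under stated assumptions: the `VERIFY-` prefix, the field names and the function names are hypothetical, and the actual simulation of a real event (for example a genuine login attempt from a non-bastion address) is outside the sketch.

```python
import uuid
from datetime import datetime, timedelta

VERIFY_PREFIX = "VERIFY-"  # hypothetical marker that tags verification events


def build_schedule(rule_ids, day_start, window_hours=24):
    """Plan one verification event per rule, spread evenly over the window."""
    step = timedelta(hours=window_hours) / max(len(rule_ids), 1)
    schedule = []
    for i, rule_id in enumerate(rule_ids):
        schedule.append({
            "rule_id": rule_id,
            # Unique identifier, so the platform can tell tests from real events.
            "event_id": VERIFY_PREFIX + uuid.uuid4().hex,
            "fire_at": day_start + i * step,
        })
    return schedule


def is_verification_event(event_id):
    """True when the identifier belongs to a verification event."""
    return event_id.startswith(VERIFY_PREFIX)


def failed_rules(schedule, alerted_event_ids):
    """Rules whose verification event never produced an alert."""
    return [s["rule_id"] for s in schedule
            if s["event_id"] not in alerted_event_ids]
```

Any rule returned by `failed_rules` at the end of the window would be raised as an alarm to the security operations staff, which is one way to meet the 24-hour discovery target.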
(4) Security metrics framework
The security measurement framework is used to evaluate the effectiveness of security, which is very difficult to do. I think it can be divided into several dimensions:
First, the technical dimension. This includes the anti-virus installation rate and healthy-running rate; the detection rate and false alarm rate of intrusion detection; the response time and handling time for security events; and the time needed to investigate a high-risk vulnerability warning and to complete the fix. The availability of the security operation and maintenance platform and the event convergence rate can also be considered. For compliance, you can track the compliance rate, the number of non-compliant items, and the number and severity of internal and external audit findings.
Second, the effectiveness of security operations. This includes the coverage rate, the detection rate, and the attack-defense success rate. How many businesses and systems are under security's protection, how many grey areas nobody looks after, and how deep and how fast security can be pushed within the enterprise all depend on a combination of technology and soft skills, and success or failure rests mainly on the security team leader. The detection rate and the attack-defense success rate are effective indicators of how effective security is. Even if the security team cannot guarantee that nothing will happen, it cannot live by luck and probability. The direction of effort is to keep improving both rates.
Third, security satisfaction and security value. Security value is reflected in security's ability to support the business: TCO and ROI, how many resources security consumes, how many businesses it supports, and how well. It is also reflected in security's internal influence and its effect on the business: whether the team does narrow security or broad security, and whether it has a positive or a negative effect on the business. Security satisfaction is a composite indicator. As I understand it, it is the highest requirement placed on a security team and its people. The team must meet the security expectations of senior leaders and business departments, the expectations of peer IT teams, and the expectations of its own members. Providing the best security service so that the users of security, its customers, are satisfied is truly challenging.
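Several of the technical indicators in the first dimension reduce to simple ratios and averages. The sketch below is a minimal illustration, not a definitive metric definition: the record fields (`true_positive`, `attack_id`, `response_minutes`) are hypothetical, and the detection rate is assumed to be measured against a list of simulated attacks, for example from a red-blue exercise.

```python
from statistics import mean


def ratio(part, whole):
    """Return part/whole as a fraction, or 0.0 when there is nothing to measure."""
    return part / whole if whole else 0.0


def security_metrics(alerts, simulated_attacks, hosts_total, hosts_with_av):
    """Compute a few technical indicators from hypothetical alert records."""
    true_alerts = [a for a in alerts if a["true_positive"]]
    detected = {a["attack_id"] for a in true_alerts if a.get("attack_id")}
    return {
        # Share of hosts with anti-virus installed.
        "av_install_rate": ratio(hosts_with_av, hosts_total),
        # Share of simulated attacks that produced at least one true alert.
        "detection_rate": ratio(len(detected & set(simulated_attacks)),
                                len(simulated_attacks)),
        # Share of all alerts that turned out to be false alarms.
        "false_alarm_rate": ratio(len(alerts) - len(true_alerts), len(alerts)),
        # Average time from alert to first response.
        "mean_response_minutes": (mean(a["response_minutes"] for a in alerts)
                                  if alerts else 0.0),
    }
```

Tracking these values per week or per month gives the measurement framework a baseline against which continuous improvement can be judged.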
2. Tools
Security operations tools include the SIEM platform, the ITIL platform and security control automation tools. The SIEM platform collects and stores security information in one place and detects anomalies and raises alarms based on detection rules. The ITIL platform receives the security event information sent by the SIEM platform, generates work orders from it, and pushes them to security operations staff to handle and close. The security control automation tools act on the control instructions issued by the SIEM platform. For example, when an external attack source is detected, an automated instruction can have the firewall or IPS block it. When a suspicious process is found on a host, the security client collects a sample of the process file for further manual analysis. When a suspicious operation is found on a user's computer on the office intranet, apparently performed by a program rather than by hand, the security client can prompt the user to confirm manually. At present, security control automation tools are not highly commercialized.
(1) SIEM detection rules
With appropriate detection rules, a SIEM is a powerful tool that can detect security events other security tools cannot capture. Generally, SIEM detection rules fall into three types:
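The division of labor among the three tools can be sketched as a small dispatcher. Everything named here is hypothetical (the alert kinds, the action names, the `Orchestrator` class); no real SIEM or ITIL product API is implied. The rule that blocking is automated only for confirmed attacks follows the earlier statement that attacks identified with 100% certainty are blocked automatically.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    kind: str                # e.g. "external_attack", "suspicious_process"
    target: str              # IP address, host name or user workstation
    confirmed: bool = False  # True only when the attack is certain


# Hypothetical mapping from alert kind to automated control instruction.
ACTIONS = {
    "external_attack": "block_source_on_firewall",
    "suspicious_process": "collect_process_sample",
    "unattended_operation": "prompt_user_to_confirm",
}


@dataclass
class Orchestrator:
    tickets: list = field(default_factory=list)
    instructions: list = field(default_factory=list)

    def handle(self, alert):
        # Every alert becomes a work order so that it is handled and closed.
        self.tickets.append(
            {"kind": alert.kind, "target": alert.target, "status": "open"})
        action = ACTIONS.get(alert.kind)
        if action is None:
            return None  # unknown kinds are left to manual handling
        # Blocking is disruptive, so it is automated only for confirmed attacks.
        if action == "block_source_on_firewall" and not alert.confirmed:
            return None
        self.instructions.append((action, alert.target))
        return action
```

In this sketch an unconfirmed external attack still produces a work order, so a front-line engineer decides whether to block.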
① Single detection condition rule
An alarm is triggered when a single specific condition is met. For example, if the source of a login to a server host is not a bastion host address, an alarm is raised. This is the simplest type of rule. It relies mainly on the monitoring capability of the security sensors and on rule filtering. An attack always produces anomalies. The key is how to summarize and extract the characteristics of those anomalies for detection. For example, in Advanced Security of Internet Enterprises, ayazero describes detecting privilege escalation by examining the parent process of a high-privilege process (SYSTEM? uid=0?) such as a shell (bash? cmd.exe?), which is a good case of summarizing and extracting anomaly characteristics for detection.
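The bastion host example can be written as a one-condition check. This is a sketch only: the bastion addresses and the event fields (`host`, `src_ip`) are assumptions made for illustration, and a real SIEM would express the same condition in its own rule language.

```python
import ipaddress

# Hypothetical bastion host addresses; replace with the real ones.
BASTION_HOSTS = {
    ipaddress.ip_address("10.0.0.10"),
    ipaddress.ip_address("10.0.0.11"),
}


def check_login(event):
    """Return an alert when a server login does not come from a bastion host."""
    source = ipaddress.ip_address(event["src_ip"])
    if source not in BASTION_HOSTS:
        return {
            "rule": "login_not_from_bastion",
            "host": event["host"],
            "src_ip": str(source),
        }
    return None  # login came through a bastion host, nothing to report
```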
② Cross-platform correlation of security monitoring information
The most typical rules of this kind are attack alerts based on asset vulnerabilities, which correlate vulnerability scan results with intrusion detection alerts. Another example is a firewall deny log that contains a malicious IP address listed in arcosi. Rules of this type correlate monitoring information across platforms and systems, and many imaginative detection rules can be derived from them: detecting security violations, detecting data leakage, even detecting that an employee may be about to resign. The effectiveness of these rules depends on two things. First, there should be as many types of security sensor as possible, and each sensor should monitor as widely as possible. Second, the rule designer, like the attacker, needs a vivid imagination and a devious mind.
③ Detection rules for slow, low-frequency attacks over a long period
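Both correlation examples amount to joining records from two different sources. The sketch below assumes hypothetical record layouts (`dst_host`, `cve`, `src_ip`, `dst_ip`) and in-memory lists; it is meant to show the shape of the correlation rather than how any particular SIEM implements it.

```python
def vulnerable_target_alerts(ids_alerts, scan_findings):
    """IDS alerts whose target host is known to have the attacked vulnerability."""
    vulnerable = {(f["host"], f["cve"]) for f in scan_findings}
    return [a for a in ids_alerts if (a["dst_host"], a["cve"]) in vulnerable]


def denied_malicious_traffic(firewall_denies, malicious_ips):
    """Firewall deny records where either end is on the malicious address list."""
    bad = set(malicious_ips)
    return [r for r in firewall_denies
            if r["src_ip"] in bad or r["dst_ip"] in bad]
```

An IDS alert against a host that the scanner has already flagged for the same vulnerability deserves a higher priority than the same alert against a patched host, which is why this correlation is worth the extra data source.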
Most security tools identify potential security events in isolation. For example, an IDS detects suspicious traffic from one workstation and then similar traffic from 20 others. On the IDS management panel each is treated as a separate event (though some IDS vendors offer more advanced functions). In a SIEM, rules can be written to trigger different alarms according to event frequency: if 21 similar events arrive from the IDS within a few minutes, a rule fires. If the attacker uses a slow, low-frequency attack over a long period to penetrate the intranet, a SIEM rule can search for a specific event over a long time span and alert when the count within that span reaches a threshold. This kind of rule also works for logs that are not themselves security events. Take DNS tunnel detection. A DNS tunnel encodes C&C traffic as DNS requests, which are sent from the infected machine, reach the C&C server through the enterprise's DNS server, and return as responses to the enterprise's DNS server, which forwards them to the infected intranet machine. Normal DNS queries occur at a certain frequency, whereas a DNS tunnel has to send a large number of DNS packets. A rule that fires when a single intranet machine queries the same domain name more than a threshold number of times (for example 1000 queries in 10 minutes) can therefore detect DNS tunnels effectively. SIEM detection rules can also alert when a traffic source departs from its past pattern, or when legitimate, previously normal traffic suddenly shows an exponential rise or fall. For example, a web server that has generated a steady number of logs over the past 90 days suddenly starts generating 10 times the normal number, which may be a sign that the host has been compromised and is being used to attack other hosts.
With SIEM rules, the security team can also alert on the standard deviation of traffic, for example when traffic deviates from the norm by 10 standard deviations.
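The frequency threshold and the standard deviation check described above can be sketched as follows. The class and function names are hypothetical, timestamps are assumed to be seconds arriving in order, and `domain` is assumed to be already reduced to the parent domain, since tunnel traffic typically varies the subdomain labels.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


class FrequencyRule:
    """Alert when one (host, domain) pair reaches `threshold` queries
    within `window` seconds. Defaults mirror 1000 queries in 10 minutes."""

    def __init__(self, threshold=1000, window=600):
        self.threshold = threshold
        self.window = window
        self.seen = defaultdict(deque)

    def observe(self, ts, host, domain):
        q = self.seen[(host, domain)]
        q.append(ts)
        # Drop queries that have fallen out of the sliding window.
        while q and ts - q[0] > self.window:
            q.popleft()
        return len(q) >= self.threshold


def deviates(history, current, sigmas=10):
    """True when `current` is more than `sigmas` standard deviations
    away from the mean of `history`."""
    mu = mean(history)
    sd = pstdev(history)
    if sd == 0:
        return current != mu  # perfectly flat baseline: any change deviates
    return abs(current - mu) / sd > sigmas
```

The same `FrequencyRule` works for the long, slow case by widening `window` to days and lowering `threshold`, at the cost of keeping more timestamps in memory.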
(2) SIEM health monitoring
In many attack and defense cases, the defender's failure comes down to the failure of security protection, and health problems of the SIEM platform are among the more common causes. They include failure of the collectors that gather monitoring information from the security sensors, failure of SIEM detection rules, failure of alarm delivery, and failure of alarm handling. For collectors, the main items to monitor are the performance of the collector's physical machine, whether data is being collected normally, and whether log parsing is working or the parser is throwing exceptions. Failure of SIEM detection rules includes invalid conditions, invalid thresholds and ineffective rules. Sometimes an unreasonable alarm threshold causes such frequent alarms that the SIEM platform disables the rule automatically, leaving it ineffective. Failure of alarm delivery includes invalid email or SMS gateway configuration, invalid user configuration, network failure, unexpected configuration changes and wrong phone numbers. Failure of alarm handling is mainly down to human factors: too many alarm messages lead to selective neglect, and too many false positives drown out the real threats.
The security of the security sensors themselves deserves special mention. At 2:00 p.m. on March 20, 2013, in South Korea, more than 32,000 computers and some ATMs at three banks (Shinhan, Nonghyup and Jeju) and two television stations (KBS, the Korean Broadcasting System, and MBC, the Munhwa Broadcasting Corporation) crashed at the same time and could not be restarted. The attackers first compromised the virus definition update server of AhnLab, a Korean anti-virus vendor, and used the virus definition update mechanism to distribute malware to users' computers, where it was installed and executed. The investigation found that another anti-virus product, ViRobot, had also been exploited by the attackers. Imagine that the update accepted by a security sensor you deployed inside the enterprise turns out to be malware: it is a chilling thought. The importance of sensor security cannot be overemphasized. Several principles apply. Control instructions must be limited to a fixed, predefined set, and it is strictly forbidden to leave an interface on the sensor side that executes arbitrary system commands. Update packages must be audited before being uploaded to the update server, updates may only use installation packages that are on the update server, and it is best to verify the MD5 of each update package. Control instructions must be issued only after manual review and confirmation. For availability, it is best to roll out updates in batches and by region. Otherwise the production network may be congested by a large number of simultaneous update downloads, and that will not be forgiven.
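The digest check on update packages can be sketched with the standard library. The function names and the idea of a set of approved digests recorded at audit time are assumptions for illustration. MD5 is the default only because the text names it; MD5 is no longer collision-resistant, so passing `algorithm="sha256"` is the safer choice where the update mechanism allows it.

```python
import hashlib
import hmac


def package_digest(data, algorithm="md5"):
    """Hex digest of an update package given as bytes."""
    return hashlib.new(algorithm, data).hexdigest()


def is_approved_package(data, approved_digests, algorithm="md5"):
    """True when the package matches a digest recorded at audit time."""
    actual = package_digest(data, algorithm)
    # compare_digest avoids leaking the match position through timing.
    return any(hmac.compare_digest(actual, d.lower())
               for d in approved_digests)
```

A sensor would refuse to install any package for which `is_approved_package` returns `False`, regardless of which server offered it.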
3. Resource requirements
(1) Process and mechanism
Effective and efficient security operations processes and mechanisms are very important. Generally there are two standardized processes: the security incident handling process and the continuous improvement process. The incident handling process defines which level of event is handled by whom, when, and to what standard. An external attack scan and a sustained attempt, discovered internally, to guess high-privilege account passwords from a branch must not have the same severity level. The former is at most an ordinary event or an event to watch: a front-line security engineer issues an instruction that automatically blocks the external IP address on the firewall for a period of time. The latter must be defined as a high-risk event: an experienced second-line security engineer or security expert should contact the branch immediately and trace the source. It may be a Trojan targeting China's financial industry, it may be a surprise attack by the network blue army, or a real attacker may have got in. Whatever it is, your security awareness capability has improved, and security finally no longer depends on luck and probability. The continuous improvement process is the closed-loop management of security events. Every security event must end with one of two conclusions: false alarm or real. If it is a false alarm, the SIEM detection rule or the sensor's monitoring must be improved. If it is real, the good news is that detection works and the bad news is that the attacker got in, so improvements should target the layers the attacker broke through. Continuous improvement requires security incident reviews every day, every week and every month, because an important incident may have been overlooked by front-line staff, or missed for other reasons.
The quality of the continuous improvement process may well determine the quality of security operations as a whole.
(2) Organization and personnel
The organization chart of the large-scale security department we expect should look like this:
In practice, the organization chart of security department is as follows:
Ideals are full-figured, reality is bony: there is always a gap between the two. In terms of team size, Internet companies such as Alibaba and Tencent have security teams of about 2,000 people out of more than 30,000 employees in total, so the security team accounts for about 7% of staff. The financial industry is still far from this proportion. The head-office security team of a domestic joint-stock bank generally has about 10 to 20 people, while head-office IT staff number from several hundred to several thousand. Securities companies generally have security teams of 2 to 5 people; the Huatai security team has 7, which is quite large. Security operations cannot be achieved without organization and people. For a securities company, I recommend staffing security operations in a 1:2:3 ratio. One person operates and maintains the security operations platform, including its servers and applications; this can be handed over to the IT department's operations team. Two security staff back each other up: one is responsible for building security sensors, the other for detection rules and second-line work, namely incident investigation, review and reporting, and continuous improvement. Three outsourced front-line staff are responsible for 7×12 incident response and for preliminary investigation and confirmation. A joint-stock bank needs 2 to 3 times as many security operations staff, and the number of outsourced staff can be increased depending on the type and number of events.
Thoughts on security operations
1. Difficulties in building security operations
Why does security in the Internet industry lead the rest of the field? Heavy investment in people and money? Free market competition? I think the most important reason is that, under pressure to solve real security problems, Internet companies adopted the fastest and most effective solutions available. The traditional security solutions of traditional industries would not work for the problems and needs of the Internet industry. So the key phrase for Internet security is: solve practical problems effectively.
Before 2010, when I talked with peers in the domestic financial industry, security thinking was still at the stage of regulatory compliance plus equipment deployment. I think that was reasonable, because security should match demand. The financial industry is a licensed industry, and regulatory compliance is its first and most important security requirement, so at that stage security teams did everything they could to meet compliance goals. At the same time, owing to objective factors such as the legal protection the state gives the financial industry, its business systems faced far lower risk than those of the Internet industry. After 2010, with the rapid growth of online banking and mobile finance and the further deterioration of the domestic Internet security environment, the security needs of the financial industry began to change profoundly: actual security problems now had to be solved effectively. Regulatory compliance and equipment deployment have improved year after year, yet security incidents still happen. Which way forward? My understanding is that the focus should shift from deploying and operating equipment to making security effective.
The core of security operations is the security operation and maintenance framework, and the platform that carries this framework is the SIEM or SOC platform. A question I often see in financial industry WeChat groups is: why do SOC projects fail so easily? I take it to be the same question as: what are the difficulties of security operations? There are three.
(1) The maturity of the enterprise's own infrastructure is not high
The quality of security operations is closely tied to the maturity of the enterprise's own infrastructure. If asset management, IP management, domain name management, operation and maintenance of basic security equipment, process management, performance management and so on are incomplete or even chaotic, can security operations stand on their own? The installation rate and normal rate (the share of installed clients that are actually running properly) of anti-virus and security clients are dismal. When a problem is detected on an IP address, neither the IP nor the asset behind it can be located. Detected security events have no proper incident management process or tool behind them. When internal employees ignore the standards and cause security vulnerabilities, nothing holds them to account. What can security operations do in that situation? It is better to get the individual security controls right first and only then consider security operations. For example, start by running the anti-virus client well.
(2) Security operation and maintenance is not a cure-all
Security operation and maintenance cannot cure every disease. The security operation and maintenance framework has no monitoring capability of its own; monitoring depends on the security protection framework, and the SOC platform generates no information by itself. A series of security sensors has to be built through the security protection framework before the enterprise has strong monitoring capability, a pair of security eyes. Building security operation and maintenance therefore cannot replace building security protection: the security systems and equipment that ought to be deployed still have to be deployed.
(3) It is hard to persevere
I hesitated for a long time over whether to include this difficulty, but since this is an account of my own experience, I think it still needs to be written down. We always naively wish for a hand of God that would solve every problem for us. Security problems are often very hard, and our instinct is to hope for a solution that costs little and takes little time. But things always turn out otherwise. There are no quick wins or shortcuts in security. In fact, nothing about operations is grand; it is mostly trivial, tedious, plain, even frustrating work. That is why persevering with security operations is hard. Following up every alert, holding the daily security meeting, doing the weekly security analysis, and doing all of it well every single day is the hardest part.
2. Why security detection fails
Single-point detection and defense and enterprise-scale detection and defense are two different things. Many single-point measures are very effective, but once they are rolled out at enterprise scale, detection starts to fail; in serious cases the measure cannot be rolled out at all and has to be withdrawn.
In practice, an attack that went undetected is an excellent opportunity to improve the enterprise's security operation capability: it means that some link in the chain was weak enough for detection to fail. The usual troubleshooting order is: insufficient single-point detection depth -> insufficient coverage -> availability problems in the security operation and maintenance platform -> alert quality problems -> people problems.
The first is insufficient single-point detection. The cause may be a badly written detection regular expression, an attack technique that was not anticipated, or a type of activity that the existing security protection framework cannot monitor at all. Targeted improvement and upgrading is enough to fix this.
The second is insufficient coverage. No security monitoring product is deployed on the affected machine or in the affected network zone, so even though the monitoring capability exists, detection fails because it is not deployed there. For example, if the installation rate and the normal rate of anti-virus clients are each only 80%, then even a known malicious program has only about a 60% chance of being detected (0.8 × 0.8 = 0.64). This is in fact the current state of security at many enterprises: the monitoring equipment and capability are there, yet detection fails. Worse, we often ignore these gray areas and instead invest heavily in testing and deploying concept-driven security products such as anti-APT, threat intelligence and situational awareness. But how can these monitoring products do their job in isolation from such basics? The solution is to raise the deployment rate and the normal rate.
As for the gray areas of enterprise security, several deserve attention: (1) assets that nobody looks after, especially Internet-facing assets. Many of the vulnerabilities reported on WooYun draw vendor replies like: "This is our (test / soon to be decommissioned / unused / outsourced staff's ...) device and we have shut it down." Besides servers, such assets include allocated Internet IP addresses and domain names, and systems and applications that are outside security monitoring; (2) management back ends, high-risk ports and file upload points that are open to the Internet; (3) third-party applications with publicly disclosed vulnerabilities; (4) weak passwords of every kind: system, application and user weak passwords. I think that solving the password problem properly would eliminate at least 50% of an enterprise's security problems.
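One way to read the 80% and 60% figures above is as a product of rates. The following is a minimal sketch under that assumption: installation rate and normal rate are treated as independent fractions that multiply, and the function name `effective_detection_rate` and the optional `sensor_hit_rate` parameter are made up for illustration.

```python
def effective_detection_rate(install_rate: float, normal_rate: float,
                             sensor_hit_rate: float = 1.0) -> float:
    """Chance that a known threat on a random host is detected.

    Assumption: the three factors are independent and simply multiply.
    sensor_hit_rate=1.0 means the client always catches a known sample
    when it is installed and running normally.
    """
    for name, value in (("install_rate", install_rate),
                        ("normal_rate", normal_rate),
                        ("sensor_hit_rate", sensor_hit_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return install_rate * normal_rate * sensor_hit_rate


if __name__ == "__main__":
    # 80% installed, 80% of those running normally, as in the text.
    print(f"{effective_detection_rate(0.8, 0.8):.0%}")    # 64%
    # Raising both rates to 98% lifts coverage without any new product.
    print(f"{effective_detection_rate(0.98, 0.98):.0%}")  # 96%
```

Under this reading, improving the deployment rate and the normal rate moves the result more than any change to the detection engine itself, which is the point the paragraph makes.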
The third is the availability of the security operation and maintenance platform. The earlier section on SIEM health monitoring covered this problem, which is also one of the important reasons why security detection fails.
The fourth is alert quality. The most common criticism of the SOC is that it collects a huge amount of data yet often cannot tell which alerts really deserve attention. Low alert effectiveness means a lot of manual confirmation and high management cost. Poorly designed detection rules produce too many alerts, and security operators end up selectively ignoring them.
The fifth is people. As I see it, mechanism and process problems are also people problems. If all the causes above have been ruled out and detection still fails, the failure should be attributed to people: for example, rushing to confirm and close alerts just before going off work, or lacking the security skills to investigate and judge a real security problem.
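The troubleshooting order described in this section can be written down as an ordered checklist. This is only an illustrative sketch: the `Cause` enum, the `first_root_cause` function and the shape of the `findings` mapping are hypothetical, not part of any real SOC tool.

```python
from enum import Enum
from typing import Dict, Optional


class Cause(Enum):
    # Declared in the order the text recommends checking them.
    DETECTION_DEPTH = "insufficient single-point detection depth"
    COVERAGE = "insufficient coverage"
    PLATFORM_AVAILABILITY = "platform availability problem"
    ALERT_QUALITY = "alert quality problem"
    PEOPLE = "people problem"


def first_root_cause(findings: Dict[Cause, bool]) -> Optional[Cause]:
    """Return the earliest cause in the checklist that the review confirmed.

    `findings` maps a cause to True when the post-incident review found it.
    Causes missing from the mapping are treated as not confirmed.
    """
    for cause in Cause:  # Enum iteration follows definition order
        if findings.get(cause, False):
            return cause
    return None


if __name__ == "__main__":
    review = {Cause.PEOPLE: True, Cause.COVERAGE: True}
    result = first_root_cause(review)
    print(result.value if result else "no cause confirmed")
    # prints "insufficient coverage"
```

Checking in this fixed order keeps a review from blaming people before the technical links have been ruled out.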
3. Whitelist or blacklist
At present the vast majority of protection measures and detection rules, however much they are hyped, are basically built on the blacklist principle: whatever matches a blacklist rule raises an alert. The advantages of a blacklist are obvious: false positives are low and the rules are easy to understand. The disadvantage is a high miss rate; detecting a security threat comes down to probability and luck. From the standpoint of security effectiveness, whitelists are likely to get more and more attention. The disadvantages of a whitelist are high false positives and high operating cost, so it needs security detection with self-learning ability (artificial intelligence, for short) to produce detection rules that converge automatically or semi-automatically. I hope mature commercial products appear soon to solve this pain point for enterprises.
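The trade-off between the two principles can be shown with a toy example. The sketch below is an illustration only: the process names, the two sets and the helper functions are invented, and the "learning" step is a manual analyst confirmation rather than any real self-learning algorithm.

```python
from typing import Iterable, List, Set

# Illustrative data only; real rules would be far richer than process names.
KNOWN_BAD: Set[str] = {"mimikatz.exe", "nc.exe"}               # blacklist
KNOWN_GOOD: Set[str] = {"sshd", "nginx", "java", "backup.sh"}   # whitelist baseline


def blacklist_alerts(observed: Iterable[str], bad: Set[str]) -> List[str]:
    """Alert only on items that match a known-bad entry."""
    return [item for item in observed if item in bad]


def whitelist_alerts(observed: Iterable[str], good: Set[str]) -> List[str]:
    """Alert on everything that is not in the known-good baseline."""
    return [item for item in observed if item not in good]


def learn_benign(good: Set[str], confirmed_benign: Iterable[str]) -> Set[str]:
    """Return a new baseline that also contains analyst-confirmed benign items."""
    return good | set(confirmed_benign)


if __name__ == "__main__":
    observed = ["nginx", "java", "new_report_tool", "custom_backdoor"]

    print(blacklist_alerts(observed, KNOWN_BAD))
    # []  (the unknown backdoor is missed)

    print(whitelist_alerts(observed, KNOWN_GOOD))
    # ['new_report_tool', 'custom_backdoor']  (backdoor caught, one false positive)

    baseline = learn_benign(KNOWN_GOOD, ["new_report_tool"])
    print(whitelist_alerts(observed, baseline))
    # ['custom_backdoor']
```

The last step is where the operating cost sits: every false positive has to be confirmed and folded back into the baseline, which is why the text calls for rules that converge automatically or semi-automatically.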
4. What kind of security and security operations are needed
What kind of security and security operations does an enterprise need? Whatever suits it best, that is, whatever gives the highest return on investment. An enterprise's security investment is tied to its scale and profitability. A large, profitable company in a growth period will increase budget and headcount; when business stagnates, security investment will not grow. This is because on the enterprise side (Party A) security is not the core business. The information technology department is already a middle- and back-office function of the company, and the security team is the middle and back office of the information technology department, the back office of the back office, so to speak. So what fits you is best.
Enterprise security construction can be described in stages. The first stage: if the basic security system is incomplete and the team is still firefighting, or the security system is inadequately built, APT attacks can be set aside for now; first do the work that stops the bleeding quickly, which is basic security work. This work is far from glamorous, but it is the most basic and most useful "life-saving" work. It needs little extra investment, can prevent 80% of security problems, and gives the enterprise a basic security guarantee. The second stage is system building: monitoring and protection measures, security standards and security processes are put in place, generally using the ISO 27001 framework plus commercial solutions plus a small amount of in-house development. The third stage is advanced security construction. At this stage ordinary commercial products can hardly meet the enterprise's needs; the stage is characterized by in-house development, automation and intelligence, and its core is solving the actual security problems the enterprise runs into. Few enterprises reach this stage, but those that do largely represent the future direction of the industry.
In my personal view, security operations also come in levels of maturity:
(1) Level 1: spontaneous level. Some basic security measures and controls have been deployed. Most people and money go into single-point defense, the enterprise depends heavily on vendors, and it has no overall control over its own security.
(2) Level 2: basic level. The concept of security operations exists and has been put into practice. A relatively complete security protection system has been established, security operations keep it effective, and there are individuals or teams with attack and defense skills who can solve practical security problems.
(3) Level 3: automation level. The enterprise can monitor, respond, handle and even counterattack automatically, has overall control over its own security status and capabilities, can perceive intrusions, and can sustain a certain level of attack and defense confrontation.
(4) Level 4: intelligent level. The whitelist principle is used for protection, detection is intelligent in the true sense, and behavior that deviates from the normal pattern can be identified.
(5) Level 5: Skynet level. Heaven's net is vast, and though its mesh is wide nothing slips through: no malicious act can hide. What does security at this level look like? I don't know; I have never seen it.
In any case, the best principle is to stick to what suits you. If what you need is a bicycle and what arrives is a private jet, the result may not be good.
The best things in life are the experiences we live through and cannot forget. The more painful the process, the better the result. If, like me, you go through all sorts of rather "painful" experiences in enterprise security, you will certainly be grateful for them and miss them later.
Some of the views and content in this article draw on the following works: 1. Advanced Guide to Internet Enterprise Security, by Zhao Yan, Jiang Hu and Hu Qianwei; 2. Precaution: Information Security Methods and Practices Guided by Intelligence, by Allan Liska; 3. Network Security Monitoring: Collection, Detection and Analysis, by Chris Sanders and Jason Smith.
Note appended:
- Nie Jun is an information security practitioner with more than ten years of information security experience in the financial industry. His experience covers information security, ITIL and bank internal control, and he is a member of the Standing Committee of the International Disaster Recovery Association, China. He is fond of reading without insisting on understanding everything, has a cheerful personality, and likes football.
- This official account carries different people with different views. Their perspectives and positions differ, so they will be biased and will not always agree with each other, but all of them seek truth, goodness and beauty.