Joint Entity and Event Extraction with Generative Adversarial Imitation Learning
By: Tongtao Zhang, Heng Ji & Avirup Sil
Contributed by: Tongtao Zhang
Editor's note: Data Intelligence publishes the latest research from the group of Professor Heng Ji, an internationally renowned young scholar in natural language processing at Rensselaer Polytechnic Institute in the United States: "Joint Entity and Event Extraction with Generative Adversarial Imitation Learning".
Article summary:
Most traditional methods treat entity and event extraction as a multi-class classification problem. However, the loss functions used for such problems (for example, cross-entropy) focus only on raising the probability of the correct class and suppress the probabilities of the wrong classes only indirectly. As a result, the model is not trained thoroughly on the more difficult instances. Put plainly, the model learns what is right, but not what is wrong or how serious a given mistake is.
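To make the point about cross-entropy concrete, here is a minimal Python sketch. The label names and probabilities are made up for illustration and do not come from the paper. It shows that the loss value depends only on the probability assigned to the correct class, so a mild mistake and a severe one receive the same loss whenever that probability is equal.

```python
import math

def cross_entropy(probs, gold_index):
    """Negative log-probability of the gold label under a predicted distribution."""
    return -math.log(probs[gold_index])

# Hypothetical label set for one trigger candidate (illustration only).
labels = ["Attack", "Transport", "Meet"]
gold = labels.index("Attack")

# Both predictions give the gold label probability 0.4.
mild_error = [0.4, 0.3, 0.3]    # remaining mass spread over the wrong labels
severe_error = [0.4, 0.6, 0.0]  # remaining mass piled onto one wrong label

loss_mild = cross_entropy(mild_error, gold)
loss_severe = cross_entropy(severe_error, gold)

# The loss cannot tell the two cases apart.
print(loss_mild == loss_severe)
```

A reward signal that scores the wrong labels explicitly is one way to separate these two cases, which is the motivation the summary gives for borrowing from reinforcement learning.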
In this paper, the authors introduce the concept of "reward" from reinforcement learning. For simple instances, the difference between the reward and the penalty is usually small. For difficult instances (for example, errors that are made repeatedly, typically on a polysemous trigger word) or for more egregious errors (for example, treating a person-name entity as the place where an event occurs), the difference becomes large, and the stronger penalty pushes the system toward the right decision.
The core of the method is how to estimate rewards and penalties. The authors use a generative adversarial network (GAN): the manual annotation plays the role of the "expert", that is, the real data in the GAN, while the system (the "apprentice") takes the position of the generator and outputs the generated data. Both the generated data and the real data are fed to the discriminator, which serves as the reward estimator. If the generated data is consistent with the real data, the instance is simple and the gap between the reward and the penalty output by the discriminator is small; if the generated data is frequently inconsistent with the real data, the gap is large.
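The reward-estimation idea can be illustrated with a toy sketch. It is not the authors' implementation: a real discriminator would be a learned model, whereas this example assumes a simple lookup table holding one score per (instance, label) pair. The function names, instance IDs, labels, and learning rate are all hypothetical. The table is updated with a GAN-style objective that raises the score of the expert's label and lowers the score of the apprentice's label.

```python
import math
from collections import defaultdict

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_reward_estimator(rounds, lr=0.5):
    """Tabular stand-in for the discriminator (an assumption of this sketch).

    rounds: iterable of (instance_id, expert_label, agent_label).
    Each step does gradient ascent on log D(expert) + log(1 - D(agent)),
    where D = sigmoid(score).
    """
    score = defaultdict(float)
    for inst, expert, agent in rounds:
        # Evaluate both discriminator outputs before changing any score.
        d_expert = sigmoid(score[(inst, expert)])
        d_agent = sigmoid(score[(inst, agent)])
        score[(inst, expert)] += lr * (1.0 - d_expert)  # push expert label up
        score[(inst, agent)] -= lr * d_agent            # push agent label down
    return score

def reward_gap(score, inst, expert_label, other_label):
    """Difference between the estimated rewards of two labels for one instance."""
    return sigmoid(score[(inst, expert_label)]) - sigmoid(score[(inst, other_label)])

# Hypothetical data: on "easy" the apprentice always agrees with the expert;
# on "hard" (say, a polysemous trigger) it keeps choosing the wrong event type.
rounds = [("easy", "Attack", "Attack")] * 20 + [("hard", "End-Position", "Attack")] * 20
score = train_reward_estimator(rounds)

gap_easy = reward_gap(score, "easy", "Attack", "End-Position")
gap_hard = reward_gap(score, "hard", "End-Position", "Attack")
print(gap_easy, gap_hard)
```

When the apprentice matches the expert, the two updates cancel and the gap stays near zero. When it repeatedly disagrees, the scores drift apart and the gap grows, which mirrors the behavior described above.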
In this way, the model is given more information, so it can learn from its mistakes and further improve its quality.
About the authors:
Tongtao Zhang is a Ph.D. student in the Department of Computer Science at Rensselaer Polytechnic Institute in the United States and a member of Professor Ji's BLENDER Lab. His research focuses on multimodal event extraction, spanning natural language processing, computer vision, and machine learning. His goal is to integrate techniques from these fields to build a more comprehensive knowledge base. He received a bachelor's degree in applied physics from Donghua University, a bachelor's degree in German from Shanghai International Studies University, and a master's degree in electrical engineering from Columbia University.
Dr. Heng Ji is a professor in the Department of Computer Science at Rensselaer Polytechnic Institute in the United States. She holds the "Edward P. Hamilton" honorary title (named after a former director of Rensselaer Polytechnic Institute), which is awarded to the institute's most outstanding professors. She is an internationally renowned young scholar in the field of natural language processing. She received her bachelor's and master's degrees in computational linguistics from Tsinghua University, and her master's and doctoral degrees in computer science from New York University. Her research focuses on natural language processing and its connections with data mining, social science, and visualization. She was selected as a "Young Scientist" and a member of the Future of Computing Council of the World Economic Forum in 2016 and 2017. In 2013, she was named to the "AI's 10 to Watch" list published by IEEE Intelligent Systems, a well-known international journal in artificial intelligence. Her awards include an NSF CAREER Award in 2009, Google Research Awards in 2009 and 2014, a Sloan Junior Faculty Award in 2012, an IBM Watson Faculty Award, Bosch Research Awards in 2015 and 2016, the PACLIC 2012 best paper runner-up, the SDM 2013 best paper award, and the ICDM 2013 best paper award. She was invited to join the Air Force data analysis expert group and is an expert on the U.S. ARL project on information fusion and knowledge network construction.
Since 2010, she has coordinated the TAC knowledge base competition held by the National Institute of Standards and Technology (NIST). She has served as program committee chair of international conferences such as NAACL 2018, NLP-NABD 2018, NLPCC 2015, and CSCKG 2016; co-chair of ACL 2017 Demo, NAACL 2012, ACL 2013, EMNLP 2013, NLPCC 2014, EMNLP 2015, NAACL 2016, and others; information extraction area chair for ACL 2016, NAACL 2019, ACL 2019, and other international conferences; vice program committee chair of IEEE/WIC/ACM WI 2013, CCL 2015, and other international conferences; content analysis track chair of WWW 2015; and finance chair of the International Joint Conference on Artificial Intelligence (IJCAI) 2016.
Dr. Avirup Sil is a research scientist on the information extraction and natural language processing team at IBM Research AI. He also chairs IBM's natural language processing professional community. His research focuses on information extraction, including entity recognition and linking and relation extraction; his current work focuses on question answering algorithms. He is a senior program committee member for the major conferences in computational linguistics and has served as area chair many times. He also conducted research on spatiotemporal information extraction in the machine learning group at Microsoft Research. He holds 12 U.S. patents in artificial intelligence research and applications.