DI-2021 @ KDD 2021: Cha Zhang Announce

NLP

conference

Announcing Cha Zhang’s invited talk: Visual Document Intelligence in the Wild.

Author

synesis

Published

July 29, 2021

DI-2021 invited talk announcement. Image: LinkedIn.

Document Intelligence Workshop @ KDD2021 proudly presents the first of our six invited talks, to be given by Cha Zhang, IEEE Fellow and Partner Engineering Manager @ Microsoft Azure AI!

Title: Visual Document Intelligence in the Wild (8/15 8:10-8:50am PDT)

Abstract: Recent progress in AI has brought Optical Character Recognition (OCR) and document understanding to a whole new level. In this talk, we will first provide an overview of Microsoft’s latest OCR engine (aka OneOCR), which applies the latest deep learning techniques to recognize mixed printed and handwritten text in over 100 languages, with text lines along arbitrary orientations (even flipped), and with varying degrees of quality and distortion. OneOCR achieves industry-leading accuracy on a wide range of application scenarios such as document, invoice, receipt, business card, slide, menu, book cover, poster, GIF/MEME, street view, product label, handwritten note and whiteboard. We then introduce another breakthrough technology developed at Microsoft for document understanding: LayoutLM. LayoutLM bridges computer vision and language, producing state-of-the-art results on a number of tasks, including document segmentation, classification, TextVQA, and others. Combining OneOCR and LayoutLM, we created the Form Recognizer API in Azure AI, which extracts text, key-value pairs, tables, and structures from documents in the wild. I will demonstrate some of the capabilities of Form Recognizer, highlight its core component technologies, and explain the roadmap ahead.

Bio: Cha Zhang is a Partner Engineering Manager at Microsoft Cloud & AI. He received the B.S. and M.S. degrees from Tsinghua University, Beijing, China in 1998 and 2000, respectively, both in Electronic Engineering, and the Ph.D. degree in Electrical and Computer Engineering from Carnegie Mellon University, in 2004. After graduation, he worked at Microsoft Research for 12 years investigating research topics including multimedia signal processing, computer vision and machine learning. He has published more than 150 technical papers and holds more than 50 U.S. patents. He served as Program Co-Chair for VCIP 2012 and MMSP 2018, and General Co-Chair for ICME 2016. He is a Fellow of the IEEE. Since joining Cloud & AI, he has led teams to ship industry-leading technologies in Microsoft Cognitive Services such as emotion recognition, optical character recognition and document understanding.

DI2021 Invited Talk by Cha Zhang: https://document-intelligence.github.io/DI-2021/talks/#talk_cha
DI2021 Program: https://document-intelligence.github.io/DI-2021/program/

Program committee (alphabetical): Doug Burdick, Dave Lewis, Yijuan Lu, Hamid Motahari, Sandeep Tata
Chair: Benjamin Han

Originally posted on LinkedIn.