Data Science Day @ Columbia University
Columbia University’s Data Science Institute Presents:

Authors/Collaborators are listed in alphabetical order.

Wednesday, April 6 • 9:45am - 10:30am
Mining Images, Speech, Text and Social Ties for Insights and Important Events

Shih-Fu Chang | Exploring Multimedia Recognition Tools in Big Data Applications
Advances in computer vision and the growth of digital photos and videos have created new opportunities to integrate content-recognition tools with mobile apps and large-scale systems. If you want more information about a building, product or bottle of wine, it’s now possible to search the Web with an image on your phone. New 3D sensors and search tools allow users to scan real-world objects and find matching models to make new products. Emerging multimedia-recognition tools are making it possible to track and summarize breaking news from streaming video and social media. This technology is also embedded in smart search engines that can mine video footage from sporting events, roads and security cameras to flag key events, from touchdowns to traffic accidents to criminal activity. I will give an overview of the novel technologies we are developing and discuss open issues.

Julia Hirschberg | Applications for Detecting Emotion in Text and Speech
Identifying the emotional content of written and spoken language is increasingly useful in business, medicine and security. Large data sets of text and speech, including social media, interviews and phone conversations, can be used to train systems to detect consumer reactions to products and services (and to flag ‘fake’ reviews), to diagnose medical conditions such as depression, and to identify deception in a wide variety of government, business and social service settings. Each application picks up subtle cues that may indicate whether a speaker is angry, happy, disgusted, afraid, sad or surprised. Similar approaches have been used to distinguish among personality traits, and to infer how tired, drunk or bored someone might be.

Kathy McKeown | Tracking Events Through Time: Objective and Personal Views
The chaos following Hurricane Sandy in 2012 brought home the need for a faster, more accurate way to filter the oceans of text streaming over social media and news sites during and after a crisis. We have been working on an automated method for monitoring and summarizing news as events unfold. Our method can flag new information as it becomes available, and generate updates. This can be extremely useful during emergencies as well as for tracking a wide variety of everyday events. In a related project, we’ve come up with a way to automatically identify the most compelling part of a personal narrative, what we call the “most reportable event.” I will discuss the natural language processing techniques that underlie this work, and future research directions.

Tian Zheng | Mapping Subpopulations within Big Networks
Estimating the size of stigmatized groups such as the homeless, people with HIV and commercial sex workers remains difficult, even in the digital age. Those belonging to marginalized subpopulations may be difficult to reach by phone, or in online surveys, or may simply prefer to keep sensitive personal information to themselves. Advances in network science are now allowing researchers to move past these obstacles to learn more about hard-to-reach demographic groups. My colleagues and I have developed a modeling framework to infer the size and other hidden features of subpopulations within a large study sample. Our method produces inferential results that are easy to interpret and relevant for visualizing, monitoring and understanding structures underlying large, complex networks.

Shih-Fu Chang

Senior Executive Vice Dean and Richard Dicker Professor of Telecommunications and Professor of Computer Science, Columbia Engineering
Shih-Fu Chang is Richard Dicker Chair Professor, Director of the Digital Video and Multimedia Lab, and Senior Executive Vice Dean of The Fu Foundation School of Engineering and Applied Science at Columbia University. He is an active researcher leading development of theories, algorithms...

Julia Hirschberg

Percy K. and Vida L. W. Hudson Professor of Computer Science and Department Chair, Columbia Engineering
Julia Hirschberg is Percy K. and Vida L. W. Hudson Professor of Computer Science at Columbia University and chair of the Department. She does research in prosody, spoken dialogue systems, and emotional and deceptive speech. She received her PhD in Computer Science from the University of Pennsylvania in 1985. She worked at Bell Laboratories and AT&T Labo...

Kathy McKeown

Director, Data Science Institute
A leading scholar and researcher in the field of natural language processing, McKeown focuses her research on big data; her interests include text summarization, question answering, natural language generation, multimedia explanation, digital libraries, and multilingual applications. Her research group's Columbia Newsblaster, which has been live since 2001, is...

Tian Zheng

Associate Professor of Statistics, Graduate School of Arts and Sciences
Tian Zheng is Associate Professor of Statistics at Columbia University. She obtained her PhD from Columbia in 2002. Her research develops novel methods and improves existing methods for exploring and analyzing interesting patterns in complex data from different application...

Wednesday April 6, 2016 9:45am - 10:30am EDT
Roone Arledge Auditorium, Lerner Hall, Columbia University, 2920 Broadway, New York, NY 10027