Mihai Surdeanu, University of Arizona | Host: Hadi Amiri, UMass Lowell | Time: Feb 21, 2025, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Neuro-symbolic Approaches for Explainable Natural Language Processing
Abstract
Deep learning approaches to natural language processing (NLP) such as GPT* have achieved tremendous successes recently. However, these systems are difficult to understand, augment, or maintain as needs shift. In this talk I will discuss two of our recent efforts that aim to bring explainability back into deep learning methods for NLP. In the first part of the talk, I will introduce an explainable approach for information extraction (IE), an important language understanding task that focuses on finding structured information in text, such as who did what to whom, when, and where. Our approach mitigates the tension between generalization and explainability by jointly training for the two goals. The proposed method uses a multi-task learning architecture that jointly trains a classifier for information extraction and a sequence model that labels words in the context that explain the decisions of that classifier. We show that, even with minimal guidance for what makes a good explanation, the sequence model learns to provide accurate explanations. Further, we show that the joint training generally improves the performance of the IE classifier. In the second part of the talk, I will discuss a neuro-symbolic architecture for information extraction that preserves the advantages of both directions, i.e., the generalization power of neural methods and the pliability of symbolic approaches. Our modular approach contains two components: a declarative rule-based model and a neural component. The former implements information extraction with a set of explainable rules that rely on syntax; the latter increases the generalizability of the rules by semantically matching them over text. I'll show that the proposed approach outperforms all neural models on a challenging IE task. More importantly, I'll show that the underlying symbolic representation can be locally modified to correct model mistakes without retraining the neural component.
Bio
Dr. Surdeanu works on natural language processing (NLP) systems that process and extract meaning from natural language texts, in tasks such as question answering (answering natural language questions), information extraction (converting free text into structured relations and events), and textual entailment. He focuses mostly on interpretable models, i.e., approaches where the computer can explain in human-understandable terms why it made a decision, and on machine reasoning, i.e., methods that approximate the human capacity to understand bigger things from knowing smaller facts. He has published more than 150 peer-reviewed articles, including four that were among the three most cited articles at their respective venues that year. Dr. Surdeanu's work has been funded by several United States government organizations (DARPA, NIH, NSF) as well as private foundations (the Allen Institute for Artificial Intelligence, the Bill & Melinda Gates Foundation). Dr. Surdeanu also co-founded an NLP startup that applies these technologies to several domains, including biomedicine and agriculture.
Jessy Li, University of Texas at Austin | Host: Hadi Amiri, UMass Lowell | Time: Apr 18, 2025, 11:00–12:00pm ET | Location: DAN 321
Title: Discourse models with language models
Abstract
How are sentences in a document connected, and why do they make the document feel “coherent”? Computational models of discourse aim to unravel this mystery by recovering the structural organization of texts, through which writers convey intent and meaning. In the first part of this talk, I will discuss our efforts on modeling human curiosity through question generation, and on understanding its connection with discourse representations based on the linguistic theory of Questions Under Discussion. We show that LLMs, with appropriate design and training, resurface curiosity-driven questions and ground their elicitation and answers in text. Next, I will demonstrate how such generative discourse models can be used to measure discourse similarities in LLM-generated texts, as well as to derive explainable measures of information salience in LLMs using summarization as a behavioral probe.
Bio
Jessy Li is an Associate Professor in the Linguistics Department at the University of Texas at Austin. She received her Ph.D. (2017) from the Department of Computer and Information Science at the University of Pennsylvania. Her research interests are in computational linguistics and NLP, specifically discourse and document-level processing, natural language generation, and pragmatics. She is a recipient of an NSF CAREER Award, ACL and EMNLP Outstanding Paper Awards, and an ACM SIGSOFT Distinguished Paper Award, among other honors. Jessy is on the leadership team of the newly established NSF-Simons CosmicAI Institute. She is also the Secretary of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL).
Greg Durrett, University of Texas at Austin | Host: Hadi Amiri, UMass Lowell | Time: Apr 11, 2025, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Specializing LLMs for Reliability
Abstract
Large language models (LLMs) have advanced the frontiers of AI reasoning: they can synthesize information from multiple sources, derive new conclusions, and explain those conclusions to their users. However, LLMs do not do this reliably. They hallucinate facts, convincingly state incorrect deductions, and exhibit logical fallacies like confirmation bias. In this talk, I will describe my lab's work on making LLM systems reliable by introspecting their behavior. First, I will argue that automating fine-grained evaluation of LLM output provides a level of understanding necessary for further progress. I will describe the ingredients of effective automated evaluators and a state-of-the-art factuality evaluation system, MiniCheck, showing that analyzing the nature of hallucinations can help reduce them. Second, I will demonstrate that better understanding of LLMs’ internal reasoning processes helps us train them to be more reliable. Our work shows that model interpretation techniques can advance training methodology and dataset curation for reasoning models. Finally, I will describe how deeper understanding of LLMs will let us tackle their most fundamental limitations, such as their inconsistency when given different inputs. I will propose how these pieces might soon be combined to form reliable AI systems.
Bio
Greg Durrett is an associate professor of Computer Science at UT Austin. His research is broadly in the areas of natural language processing and machine learning. His group develops techniques for reasoning about knowledge in text, verifying factuality of LLM generations, and specializing LLMs to make them more reliable. He is a 2023 Sloan Research Fellow and a recipient of a 2022 NSF CAREER award. His work has been recognized by paper awards at EMNLP 2024 and EMNLP 2013. He was a founding organizer of the Workshop on Natural Language Reasoning and Structured Explanations at ACL 2023 and ACL 2024 and is a current member of the NAACL board. He received his BS in Computer Science and Mathematics from MIT and his PhD in Computer Science from UC Berkeley, where he was advised by Dan Klein.
Jiawei Han, University of Illinois | Host: Jie Wang, UMass Lowell | Time: Apr 4, 2025, 11:00–12:00pm ET | Location: DAN 321
Title: A Retrieval and Structuring Approach for LLM-Enhanced, Theme-Focused Science Discovery
Abstract
Large Language Models (LLMs) may bring unprecedented power to scientific discovery. However, current LLMs may still encounter major challenges in effective scientific exploration due to their lack of in-depth, theme-focused data and knowledge. Retrieval-augmented generation (RAG) has recently become a promising approach for augmenting LLMs with grounded, theme-specific datasets. We discuss the challenges of RAG and propose a retrieval and structuring (RAS) approach, which enhances RAG by improving retrieval quality and mining structures (e.g., extracting entities and relations and building knowledge graphs) to ensure the effective integration of theme-specific data with LLMs. We show the promise of this approach for augmenting LLMs and discuss its potential power for LLM-enabled science exploration.
Bio
Jiawei Han is the Michael Aiken Chair Professor in the Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign. He received the ACM SIGKDD Innovation Award (2004), the IEEE Computer Society Technical Achievement Award (2005), the IEEE Computer Society W. Wallace McDowell Award (2009), and Japan's Funai Achievement Award (2018), and was elevated to Fellow of the Royal Society of Canada (2022). He is a Fellow of ACM and of IEEE. He served as Director of the Information Network Academic Research Center (INARC) (2009-2016), supported by the Network Science-Collaborative Technology Alliance (NS-CTA) program of the U.S. Army Research Lab, and as co-Director of KnowEnG, a Center of Excellence in Big Data Computing (2014-2019) funded by the NIH Big Data to Knowledge (BD2K) Initiative. Currently, he serves on the executive committees of two NSF-funded research centers: MMLI (the Molecule Maker Lab Institute), one of the NSF-funded national AI institutes, since 2020, and I-GUIDE, the NSF Institute for Geospatial Understanding through an Integrative Discovery Environment, since 2021.
Tom McCoy, Yale University | Host: Hadi Amiri, UMass Lowell | Time: Mar 28, 2025, 11:00–12:00pm ET | Location: DAN 321
Title: Understanding the abilities of AI systems: Memorization, generalization, and points in between
Abstract
Large language models (LLMs) can perform a wide range of tasks impressively well. To what extent are these abilities driven by shallow heuristics vs. deeper abstractions? I will argue that, to answer this question, we must view LLMs through the lens of generalization. That is, we should consider the data that LLMs were trained on so that we can identify whether and how their abilities go beyond their training data. In the analyses of LLMs that I will discuss, this perspective reveals both impressive strengths and surprising limitations. For instance, LLMs often produce sentence structures that are well-formed but that never appeared in their training data, yet they also struggle on some seemingly simple algorithmic tasks (e.g., decoding simple ciphers) in ways that are well-explained by training data statistics. In sum, to understand what AI systems are, we must understand what we have trained them to be.
Bio
Tom McCoy is an Assistant Professor of Linguistics at Yale University, with a secondary appointment in Computer Science. His research aims to bridge the divide between linguistics and natural language processing: how can we create AI systems that replicate the rapid learning and robust generalization that humans display when processing language? Much of this work involves analyzing the performance and internal processing of neural network language models. He received his PhD from the Department of Cognitive Science at Johns Hopkins, and his PhD thesis received a Glushko Dissertation Prize from the Cognitive Science Society. He then did a postdoc in Computer Science at Princeton before joining the faculty at Yale. Outside of research, he is an organizer and problem writer for NACLO, a contest that introduces high school students to linguistics and natural language processing.
Byron Wallace, Northeastern University | Host: Hadi Amiri, UMass Lowell | Time: Mar 21, 2025, 11:00–12:00pm ET | Location: DAN 321
Title: LLMs for healthcare: Risks and interpretability methods to (possibly) mitigate them
Abstract
Large Language Models (LLMs) are poised to transform specialist fields like healthcare. Such models promise to free domain experts, including physicians, from drudgery, enabling better care to be delivered at scale. But the use of LLMs in healthcare—and similar high-stakes, specialized domains—brings real risks. Used naively, such models may worsen existing biases in practice. They might also result in medical errors owing to “hallucinations”. In this talk I will discuss a few recent efforts designing and critically evaluating LLMs for medical language processing tasks, e.g., summarizing clinical notes in patient electronic health records (EHRs). I will highlight current limitations and associated risks of LLMs in the context of these applications, particularly related to robustness and bias. Finally, I will discuss recent work on adopting “mechanistic” interpretability methods in the space of healthcare as a potential means of mitigating these issues.
Bio
Byron Wallace is the Sy and Laurie Sternberg Interdisciplinary Professor in the Khoury College of Computer Sciences at Northeastern University. His research is primarily in natural language processing (NLP) methods with focus on interpretability and healthcare applications.
Emily Prud'hommeaux, Boston College | Host: Hadi Amiri, UMass Lowell | Time: Mar 7, 2025, 11:00–12:00pm ET | Location: DAN 321
Title: Overcoming Obstacles in NLP for Endangered Languages
Abstract
A majority of the world's languages lack sufficient resources to train the state-of-the-art NLP models we've come to expect for high-resource languages like English or Mandarin. The situation is particularly dire for endangered languages, which could benefit enormously from these technologies but will never have abundant high-quality training resources. In this talk, I will discuss some approaches for addressing these challenges in automatic speech recognition and machine translation, with a focus on several different endangered and under-resourced languages.
Bio
Emily Prud’hommeaux is an Associate Professor in the Department of Computer Science at Boston College. She received her BA (Harvard) and MA (UCLA) in Linguistics, and her PhD in Computer Science and Engineering (OHSU/OGI). Her research centers on NLP in low-resource settings, with a particular focus on under-resourced languages and the language of individuals with conditions impacting communication and cognition.
Wei Xu, Georgia Tech | Host: Hadi Amiri, UMass Lowell | Time: Feb 28, 2025, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Human-AI Collaboration in Evaluating Large Language Models
Abstract
To support real-world applications more responsibly and further improve large language models (LLMs), it is essential to design reliable and reusable frameworks for their evaluation. In this talk, I will discuss three forms of human-AI collaboration for evaluation that combine the strengths of both: (1) the reliability and user-centric aspects of human evaluation, and (2) the cost efficiency and reproducibility offered by automatic evaluation. The first part focuses on systematically assessing LLMs' favoritism towards Western culture, using a hybrid approach of manual effort and automated analysis. The second part will showcase an LLM-powered privacy preservation tool designed to safeguard users against the disclosure of personal information. I will share some interesting findings from an HCI user study in which real Reddit users utilized our tool, which in turn informs our ongoing efforts to improve the design of NLP models. Lastly, we will delve into the evaluation of LLM-generated texts, where human judgments can be used to train automatic evaluation metrics to detect errors. We also highlight the opportunity to engage both laypeople and experts in evaluating LLM-generated simplified medical texts in high-stakes healthcare applications.
Bio
Wei Xu is an Associate Professor in the College of Computing and the Machine Learning Center at the Georgia Institute of Technology. Her research interests are in natural language processing and machine learning, with a focus on generative AI, multilingual LLMs, robustness and cultural adaptation, as well as AI for education and privacy. She is a recipient of the NSF CAREER Award, Faculty Research Awards from Google, Sony, and Criteo, the CrowdFlower AI for Everyone Award, and Best Paper Awards and Honorable Mentions at COLING'18, ACL'23, and ACL'24. She has also received research funding from NIH, DARPA, and IARPA.
Rada Mihalcea, University of Michigan | Host: Hadi Amiri, UMass Lowell | Time: Feb 14, 2025, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Why AI Is W.E.I.R.D. And Shouldn't Be This Way
Abstract
Recent years have witnessed remarkable advancements in AI, with language and vision models that have enabled progress in numerous applications and opened the door to the integration of AI in areas such as communication, transportation, healthcare, and the arts. Yet, many of these models and their corresponding datasets are W.E.I.R.D. (Western, Educated, Industrialized, Rich, Democratic), and they are reflective of only a small fraction of the population.(*) In this talk, I will show some of the limitations and the lack of representation in current AI models, and highlight the need for cross-cultural language and vision models that can capture the diversity of behaviors, beliefs, and language expressions across different groups. I will also explore ways in which we can address these limitations by developing models that are re-centered around people and their unique characteristics.
(*) W.E.I.R.D. is an acronym widely used in psychology to indicate the limitations of many of the studies carried out in the field.
Bio
Rada Mihalcea is the Janice M. Jenkins Professor of Computer Science and Engineering at the University of Michigan and the Director of the Michigan Artificial Intelligence Lab. Her research interests are in computational linguistics, with a focus on lexical semantics, multilingual natural language processing, and computational social sciences. She serves or has served on the editorial boards of the journals Computational Linguistics, Language Resources and Evaluation, Natural Language Engineering, Journal of Artificial Intelligence Research, IEEE Transactions on Affective Computing, and Transactions of the Association for Computational Linguistics. She was a program co-chair for EMNLP 2009 and ACL 2011, and a general chair for NAACL 2015 and *SEM 2019. She is an ACM Fellow and a AAAI Fellow, and served as ACL President (with vice- and past-president terms spanning 2018-2022). She is the recipient of a Sarah Goddard Power Award (2019) for her contributions to diversity in science, an honorary citizen of her hometown of Cluj-Napoca, Romania (2013), and the recipient of a Presidential Early Career Award for Scientists and Engineers awarded by President Obama (2009).
Diyi Yang, Stanford University | Host: Hadi Amiri, UMass Lowell | Time: Feb 7, 2025, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Enabling and Evaluating Human-Agent Collaboration
Abstract
Recent advances in large language models (LLMs) have revolutionized human-AI interaction, but their success depends on addressing key challenges like privacy and effective collaboration. In this talk, we first explore PrivacyLens, a general framework to evaluate privacy leakage in LLM agents’ actions, by extending privacy-sensitive seeds into agent trajectories. By evaluating state-of-the-art models, PrivacyLens reveals contextual and long-tail privacy vulnerabilities, even under privacy-enhancing instructions. We then introduce Co-Gym, a novel framework for studying and enhancing human-agent collaboration across various tasks. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance. Via PrivacyLens and Co-Gym, this talk highlights how to develop AI systems that are trustworthy and capable of fostering meaningful collaboration with human users.
Bio
Diyi Yang is an assistant professor in the Computer Science Department at Stanford University. Her research focuses on human-centered natural language processing and computational social science. She is a recipient of a Microsoft Research Faculty Fellowship (2021), an NSF CAREER Award (2022), an ONR Young Investigator Award (2023), and a Sloan Research Fellowship (2024). Her work has received multiple paper awards or nominations at top NLP and HCI conferences.
Sijia Liu, Michigan State University | Host: Hadi Amiri, UMass Lowell | Time: Dec 6, 2024, 11:00–12:00pm ET | Location: DAN 321 / https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Machine Unlearning for Generative AI: A Model-Based Perspective
Abstract
In this talk, I will introduce the concept of Machine Unlearning (MU), a transformative approach to removing undesirable data influence or associated model capabilities from learned discriminative and generative models. To bridge the gap between exact and approximate unlearning, I will present a novel model-based perspective that integrates model sparsity, gradient-based weight saliency, and weight influence attribution. This model-centric approach achieves significant advancements in MU for vision and language models, balancing effectiveness, preserved utility, and enhanced efficiency. Additionally, I will explore the practical implications of MU in addressing critical challenges in AI safety.
Bio
Dr. Sijia Liu is an Assistant Professor in the Department of Computer Science and Engineering at Michigan State University, where he leads the Optimization and Trustworthy ML (OPTML) lab. He also serves as an Affiliated Professor at IBM Research. His research focuses on trustworthy and scalable ML, bridging foundational areas such as optimization and learning theory with applied research in vision and language modeling. Recently, his work has emphasized machine unlearning for generative AI. Dr. Liu has been recognized with numerous prestigious honors, including the NSF CAREER Award in 2024, the Best Paper Runner-Up Award at the Conference on Uncertainty in Artificial Intelligence (UAI) in 2022, and the Best Student Paper Award at the 42nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) in 2017. Additionally, he has been the lead organizer of the New Frontiers in Adversarial ML (AdvML-Frontiers) workshop series from 2022 to 2024.
Danielle S. Bitterman, Mass General Brigham, Harvard Medical School | Host: Hadi Amiri, UMass Lowell | Time: Nov 22, 2024, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Bridging the AI translational gap in oncology
Abstract
There is immense enthusiasm about the potential of artificial intelligence (AI) in healthcare. In particular, cancer care is multi-dimensional and requires the integration of multiple data sources, priming it to be transformed by AI. However, despite significant advances in the research domain, few AI tools have made their way into the clinic. In this lecture, I will discuss existing and future applications of AI in the cancer clinic, with a focus on my lab's research developing natural language processing methods to enhance the value of electronic health records. I will discuss our work in pre-clinical and clinical testing of emerging technologies and highlight opportunities to facilitate safe clinical translation.
Bio
Dr. Danielle Bitterman is an Assistant Professor at Harvard Medical School, whose research is dedicated to developing and implementing AI advances for safe, equitable cancer care. She is a physician-scientist in the Artificial Intelligence in Medicine Program at Mass General Brigham, and a radiation oncologist in the Department of Radiation Oncology at Brigham and Women's Hospital/Dana-Farber Cancer Institute. Dr. Bitterman's lab uses natural language processing to transform medical records into systems that actively support data-driven care of patients with cancer. Her interests include language models' medical knowledge grounding and bias assessments, automated information extraction from electronic health records, and translational studies of AI in the clinic. Dr. Bitterman's research has been published in high-impact journals and conference proceedings, including Nature Medicine, Lancet Digital Health, the Journal of Clinical Oncology, NeurIPS, and EMNLP. Her research is funded by the National Cancer Institute, the American Cancer Society, and the American Society for Radiation Oncology. Dr. Bitterman received her undergraduate degree at Columbia University and attended medical school at New York University School of Medicine. She completed an internship in internal medicine at Brigham and Women's Hospital and her residency at the Harvard Radiation Oncology Program. She completed her post-doctoral fellowship in natural language processing at the Computational Health Informatics Program at Boston Children's Hospital.
Liang Zhao, Emory University | Host: Hong Yu, UMass Lowell | Time: Nov 20, 2024, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/5961664779 | Password: –
Title: Graph Representation Learning for Network Generation, Optimization, and Verbalization
Abstract
Graphs are a ubiquitous data structure for denoting entities and their relations, with examples including social networks, citation graphs, and neural networks. The topology of a graph is discrete, which prevents it from enjoying the numerous mathematical and statistical tools that require continuous representations. Graph representation learning aims to map graphs to vector representations without substantial information loss, hence paving a new pathway for solving graph problems without discrete algorithms. In this talk, I will first introduce our recent works on graph representation learning that can preserve graphs' geometric information and properties. Then, I will exemplify several interesting research areas whose problem-solving benefits from leveraging graph representations. The first area solves graph optimization problems, such as influence maximization and source localization, using continuous optimization over graph representations. The second area captures and predicts deep learning models' dynamics under data distribution drifts, where the graph representation of neural networks is learned to reflect their functional space. The third area investigates the correlations and differences between the two views of graphs, in mathematical language and in natural language, where the graph representation acts as their bridge, with the help of large language models.
Bio
Dr. Liang Zhao is an associate professor in the Department of Computer Science at Emory University. Before that, he was an assistant professor in the Department of Information Science and Technology and the Department of Computer Science at George Mason University. He obtained his Ph.D. in 2016 from the Department of Computer Science at Virginia Tech, where he was recognized as an outstanding Ph.D. student. His research interests include data mining and machine learning, with special interests in spatiotemporal and network data mining, deep learning on graphs, language and multimodal foundation models, distributed optimization, and interpretable machine learning. He has published over a hundred papers in top-tier conferences and journals such as KDD, TKDE, ICDM, ICLR, NeurIPS, Proceedings of the IEEE, TKDD, CSUR, IJCAI, AAAI, and WWW. He won an NSF CAREER Award and the IEEE Computer Society Smart Computing Middle-Career Award. He has also obtained many prestigious awards from industry, such as a Meta Research Award, an Amazon Research Award, a Cisco Faculty Research Award, and a Jeffress Trust Award. He was recognized as one of the "Top 20 Rising Stars in Data Mining" by Microsoft Academic Search in 2016. He has won several best paper awards and shortlists, including the Best Paper Award at ICDM 2022, the Best Poster Runner-up Award at ACM SIGSPATIAL 2022, a Best Paper Award Shortlist at WWW 2021, Best Paper Candidate at ACM SIGSPATIAL 2022, the Best Paper Award at ICDM 2019, and Best Paper Candidate at ICDM 2021. He was recognized as a "Computing Innovation Fellows Mentor" in 2021 by the Computing Research Association. He is a senior member of the IEEE.
Marco Gaboardi, Boston University | Host: Paul Downen, UMass Lowell | Time: Nov 15, 2024, 11:00–12:00pm ET | Location: DAN 321
Title: Reasoning about Programs’ Adaptivity, with applications to Adaptive Data Analysis
Abstract
An adaptive program is a program that interacts with other components and whose choice of the next interaction depends on the results of previous interactions. Adaptive programs find applications in many areas of computer science, such as adaptive data analysis, the analysis of interactive protocols in security and privacy, and database systems. In many of these applications, it is important to quantify the level of adaptivity of a program.
Bio
Marco Gaboardi is an associate professor at Boston University. Prior to joining Boston University, he was an assistant professor at the University at Buffalo, SUNY, and at the University of Dundee, Scotland. Marco received his PhD from the University of Torino, Italy, and the Institut National Polytechnique de Lorraine, France. He has been a visiting scholar at the University of Pennsylvania, at Harvard University's CRCS center, at the Simons Institute at UC Berkeley, and at Chalmers University. He is a recipient of the Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies, of the NSF CAREER award, of an EU Marie Curie Fellowship, and of a Google Research Award. He served as Conference Chair for LICS 2023 and as General Chair for ICFP 2024. He is also Co-Founder and Chief Scientist of the startup DPella AB. Marco's research is in programming languages, formal verification, and differential privacy.
Tianyi Zhou, University of Maryland, College Park | Host: Hadi Amiri, UMass Lowell | Time: Nov 8, 2024, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/99101366772 | Password: cstalks
Title: Synthetic Data for Self-Evolving AI
Abstract
Data is the new oil for training large AI models. However, the “oil” created by humans may run out someday, or grow much more slowly than the speed at which AI consumes it. Moreover, human-created data is less controllable in terms of quality, opinions, format, style, etc., and may lead to biases or privacy concerns when used for model training. Can we leverage the power of generative AI to automatically create synthetic data in a more efficient, controllable, and safe manner, for training or benchmarking purposes? How can we avoid the model collapse caused by continuously training a model on self-generated synthetic data? In this talk, I will present our recent works that investigate whether and how synthetic data can be created to improve large language models (LLMs) and vision-language models (VLMs), especially when the real data is imperfect. These works include Mosaic-IT (compositional data augmentation for instruction tuning), DEBATunE (data generation by LLM debate), Diffusion Curriculum (generative curriculum learning of low-quality images), and AutoHallusion (hallucination benchmark generation via automatic image editing). These projects are led by Ming Li, Yijun Liang, Xiyang Wu, and Tianrui Guan.
Bio
Dr. Tianyi Zhou is a tenure-track assistant professor of Computer Science at the University of Maryland, College Park (UMD). He received his Ph.D. from the University of Washington and worked as a research scientist at Google before joining UMD. His research interests are machine learning, natural language processing, and multi-modal generative AI. His team has published >100 papers at ML (NeurIPS, ICML, ICLR), NLP (ACL, EMNLP, NAACL), and CV (CVPR, ICCV, ECCV) venues and in journals such as IEEE TPAMI, TIP, TNNLS, and TKDE, with >7000 citations. His recent research topics are (1) human-AI hybrid intelligence (humans help AI, AI helps humans, human-AI teaming); (2) controllable multi-modal generative AI via post-training and prompting; (3) synthetic data, self-evolving AI, and auto-benchmarking; and (4) neurosymbolic world models and System-2 embodied agents.
Ankush Das, Boston University | Host: Paul Downen, UMass Lowell | Time: Nov 1, 2024, 11:00–12:00pm ET | Location: DAN 321
Title: Programming Language Principles for Distributed Systems
Abstract
With the proliferation of distributed systems, the design of safe, secure, and efficient software has become an ever more complex task. The heterogeneous nature of these distributed systems has further introduced domain-specific programming requirements, such as inferring execution cost, accounting for randomized behavior, and preventing communication errors. To develop programming languages and reasoning tools for such multi-threaded environments, we need two main ingredients: concurrency and domain-specific support. In this talk, I will use session types as a base type system that already comes equipped with reasoning capabilities for message-passing concurrent systems. On top, I will introduce domain-specific support for three different domains: digital transactions, randomized systems, and program verification. Programming smart contracts comes with its unique challenges, which include enforcing protocols of interaction, tracking linear assets, and analyzing execution cost. To address these challenges, the talk introduces Nomos, which employs linear session types to enforce protocols and prevent assets from being duplicated or discarded. To predict execution cost, Nomos uses resource-aware types and automatic amortized resource analysis, a type-based technique for inferring cost bounds. For randomized systems, Nomos is further enhanced with probabilistic types that track the probability distribution of message exchanges in a distributed system. Finally, to verify concurrent programs, I will introduce dependent refinement session types that can naturally track intrinsic properties, such as the sizes and values in the types of messages, which can then be used for lightweight verification. The talk concludes with my future plans on exploring how programming languages can aid in the specification, verification, and possibly synthesis of cryptographic protocols.
Bio
Ankush Das is an Assistant Professor at Boston University. Before that, he worked for two years as an Applied Scientist at Amazon AWS. He completed his Ph.D. in 2021 at Carnegie Mellon University, where he was advised by Prof. Jan Hoffmann and worked closely with Prof. Frank Pfenning. He is broadly interested in programming languages, with a specific focus on concurrency, type systems, complexity analysis, and program verification. His research has received numerous awards, including a distinguished paper award at POPL 2024 and the best system description paper award at FSCD 2020.
Weiyan Shi, Northeastern University | Host: Hong Yu, UMass Lowell | Time: Oct 25, 2024, 11:00–12:00pm ET | Location: https://uml.zoom.us/j/5961664779 | Password: cstalks
Title: Persuasion for Social Good: How to Build and Break Persuasive Chatbots
Abstract
AI research has so far focused on modeling common human skills, such as building systems to see, read, or talk. As these systems gradually reach human level on standard benchmarks, it is increasingly important to develop next-generation interactive AI systems with more advanced human skills, able to function in realistic and critical applications such as providing personalized emotional support. In this talk, I will cover (1) how to build such expert-like AI systems specialized in social influence that can persuade, negotiate, and cooperate with other humans during conversations. (2) I will also discuss how humans perceive such specialized AI systems. This study validates the necessity of Autobot Law and proposes guidance to regulate such systems. (3) As these systems become more powerful, AI safety problems become more important. I will describe how to persuade AI models in order to jailbreak them, and use this to study AI safety problems. Finally, I will conclude with my long-term vision of building a natural interface between human intelligence and machine intelligence via dialogues, taking a multi-angle approach that combines Artificial Intelligence, Human-Computer Interaction, and the social sciences to develop expert AI systems for everyone.
Bio
Weiyan Shi is an assistant professor at Northeastern University. Her research interests are persuasive dialogue systems and AI safety. She has been recognized as one of MIT Technology Review's 35 Innovators Under 35, a Rising Star in Machine Learning, and a Rising Star in EECS. She has received a Best Social Impact Paper award, an Outstanding Paper award, and a Best Paper Nomination for her work on persuasive dialogues at ACL 2019 and ACL 2024. She was also a core team member behind a Science publication on the first negotiation AI agent, Cicero, which achieved human-level performance in the game of Diplomacy. This work has been featured in The New York Times, The Washington Post, MIT Technology Review, Forbes, and other major media outlets.
COLLOQUIUM, Fall 2020 – Spring 2021
Vanessa Frias-Martinez, University of Maryland | Host: Hadi Amiri, UMass Lowell | Time: Apr 30, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/91315501257 | Password: cstalks
Title: Data-driven decision making for cities and communities
Abstract
The pervasiveness of cell phones, mobile applications, and social media generates vast amounts of digital traces that can reveal a wide range of human behavior. From mobility patterns to social networks, these signals expose insights about human behaviors and social interactions. In this talk, I will discuss approaches that can help local governments and non-profit organizations better understand the spatial dynamics of cities and communities, offering additional insights beyond more traditional sources of information. I will first give a high-level overview of the research that I lead in the Urban Computing Lab, followed by an in-depth description of two projects. The first project presents a novel approach to creating cycling safety maps at large scale. These maps are used by departments of transportation to understand barriers to cycling adoption. In the second project, I will describe a new method to enhance the fairness of mobility-based crime prediction models by addressing data bias. This work highlights that controlling for under-reporting can improve both the fairness and the accuracy of mobility-based crime prediction models. My ultimate goal is to illuminate the determinants of human dynamics and to understand the role that context - represented as physical infrastructure and social fabric - plays in people's mobility experiences, which can in turn assist in the design of more efficient and inclusive cities.
Bio
Vanessa Frias-Martinez is an associate professor in the iSchool and UMIACS, and an affiliate associate professor in the Department of Computer Science at the University of Maryland (UMD) where she also leads the Urban Computing Lab. Frias-Martinez's research areas are data-driven behavioral modeling and spatio-temporal data mining. Her research focuses on the use of large-scale ubiquitous data to model the interplay between human mobility patterns, social networks and the built environment. Specifically, Frias-Martinez develops methodologies to model and predict human behaviors in different contexts as well as tools to aid decision makers in areas such as poverty, natural disasters or urban planning. Before coming to UMD, she spent five years at Telefonica Research developing algorithms to analyze mobile digital traces. Frias-Martinez is the recipient of a National Science Foundation (NSF) CAREER Award and a La Caixa Fellowship. She received her PhD in Computer Science from Columbia University.
He He, New York University | Host: Hadi Amiri, UMass Lowell | Time: Apr 23, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/96595540049 | Password: cstalks
Title: Guarding Against Spurious Correlations in Natural Language Understanding
Abstract
While we have made great progress in natural language understanding, transferring the success from benchmark datasets to real applications has not always been smooth. Notably, models sometimes make mistakes that are confusing and unexpected to humans. In this talk, I will discuss shortcuts in NLP tasks and present our recent works on guarding against spurious correlations in natural language understanding tasks (e.g. textual entailment and paraphrase identification) from the perspectives of both robust learning algorithms and better data coverage. Motivated by the observation that our data often contains a small amount of “unbiased” examples that do not exhibit spurious correlations, we present new learning algorithms that better exploit these minority examples. On the other hand, we may want to directly augment such “unbiased” examples. While recent works along this line are promising, we show several pitfalls in the data augmentation approach.
Bio
He He is an assistant professor in the Center for Data Science and the Courant Institute at New York University. Before joining NYU, she spent a year at Amazon Web Services and was a postdoc at Stanford University. She received her PhD from the University of Maryland, College Park. She is broadly interested in machine learning and natural language processing. Her current research interests include text generation, dialogue systems, and robust language understanding.
Jacob Andreas, Massachusetts Institute of Technology | Host: Hadi Amiri, UMass Lowell | Time: Apr 16, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/92245469889 | Password: cstalks
Title: Implicit Symbolic Representation and Reasoning in Deep Neural Networks
Abstract
Standard neural network architectures can *in principle* implement symbol processing operations like logical deduction and simulation of complex automata. But do current neural models, trained on standard tasks like image recognition and language understanding, learn to perform symbol manipulation *in practice*? I'll survey two recent findings about implicit symbolic behavior in deep networks. First, I will describe a procedure for automatically labeling neurons with compositional logical descriptions of their behavior. These descriptions surface interpretable learned abstractions in models for vision and language, reveal implicit logical “definitions” of visual and linguistic categories, and enable the design of simple adversarial attacks that exploit errors in definitions. Second, I'll describe ongoing work showing that neural models for language generation perform implicit simulation of entities and relations described by text. Representations in these language models can be (linearly) translated into logical representations of world state, and can be directly edited to produce predictable changes in generated output. Together, these results suggest that highly structured representations and behaviors can emerge even in relatively unstructured models trained on natural tasks. Symbolic models of computation can play a key role in helping us understand these models.
Bio
Jacob Andreas is the X Consortium Assistant Professor at MIT. His research focuses on building intelligent systems that can communicate effectively using language and learn from human guidance. Jacob earned his Ph.D. from UC Berkeley, his M.Phil. from Cambridge (where he studied as a Churchill scholar) and his B.S. from Columbia. He has been the recipient of an NSF graduate fellowship, a Facebook fellowship, and paper awards at NAACL and ICML.
Noah Smith, University of Washington | Host: Hadi Amiri, UMass Lowell | Time: Apr 9, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/97794620415 | Password: cstalks
Title: Language Models: Challenges and Progress
Abstract
Probabilistic language models are once again foundational to many advances in natural language processing research, bringing the exciting opportunity to harness raw text to build language technologies. With the emergence of deep architectures and protocols for finetuning a pretrained language model, many NLP solutions are being cast as simple variations on language modeling. This talk is about challenges in language model-based NLP and some of our work toward solutions. First, we'll consider evaluation of generated language. I'll present some alarming findings about humans and models and make some recommendations. Second, I'll turn to a ubiquitous design limitation in language modeling – the vocabulary – and present a linguistically principled, sample-efficient solution that enables modifying the vocabulary during finetuning and/or deployment. Finally, I'll delve into today's most popular language modeling architecture, the transformer, and show how its attention layers' quadratic runtime can be made linear without affecting accuracy. Taken together, we hope these advances will broaden the population of people who can effectively use and contribute back to NLP.
Bio
Noah Smith is a Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, as well as a Senior Research Manager at the Allen Institute for Artificial Intelligence. Previously, he was an Associate Professor of Language Technologies and Machine Learning in the School of Computer Science at Carnegie Mellon University. He received his Ph.D. in Computer Science from Johns Hopkins University in 2006 and his B.S. in Computer Science and B.A. in Linguistics from the University of Maryland in 2001. His research interests include statistical natural language processing, machine learning, and applications of natural language processing, especially to the social sciences. His book, Linguistic Structure Prediction, covers many of these topics. He has served on the editorial boards of the journals Computational Linguistics (2009–2011), Journal of Artificial Intelligence Research (2011–present), and Transactions of the Association for Computational Linguistics (2012–present), as the secretary-treasurer of SIGDAT (2012–2015 and 2018–present), and as program co-chair of ACL 2016. Alumni of his research group, Noah's ARK, are international leaders in NLP in academia and industry; in 2017 UW's Sounding Board team won the inaugural Amazon Alexa Prize. He was named an ACL Fellow in 2020, “for significant contributions to linguistic structure prediction, computational social sciences, and improving NLP research methodology.” Smith's work has been recognized with a UW Innovation award (2016–2018), a Finmeccanica career development chair at CMU (2011–2014), an NSF CAREER award (2011–2016), a Hertz Foundation graduate fellowship (2001–2006), numerous best paper nominations and awards, and coverage by NPR, BBC, CBC, New York Times, Washington Post, and Time.
Shuchin Aeron, Tufts University | Host: Anna Rumshisky, UMass Lowell | Time: Mar 26, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/95541491953 | Password: cstalks
Title: Excursions into applying ML for Learning Sciences: Initial results and lessons learned
Abstract
An important challenge in the learning sciences is the coding of qualitative data for evidence of students' engagement in scientific practices. Such evidence may appear as novel ideas, expressions of puzzlement, or idiosyncratic lines of reasoning. Typically, identifying such evidence requires time-consuming labor by trained analysts. In this work we explore the possibility for statistical machine learning (ML) methods to aid learning sciences researchers in qualitative coding. We start with a human-coded set of lab reports from a biology course that were scored using an adapted version of the domain-general Structure of Observed Learning Outcomes (SOLO) taxonomy (Biggs and Collis, 1992). The adapted four-level scheme assigns higher scores to lab reports that exhibit desirable features of scientific writing, specifically: more complex claim structures, use of multiple pieces of evidence, and appropriately qualified conclusions that address or acknowledge uncertainty. We will present and evaluate a novel ML workflow that achieves good performance in scoring the lab reports for evidence of scientific thinking in student writing. This finding is subsequently corroborated via a blind re-coding experiment, wherein the reports that were consistently mis-classified by the ML algorithm in the majority of the cross-validation steps were re-evaluated by human coders. We found that the ML predictions agreed well with the re-coded scores, thereby indicating the possibility that computational NLP tools can approach the reliability of human coding and may assist researchers in automatic coding at scale.
Bio
Shuchin Aeron is an Associate Professor in the Dept. of ECE at Tufts University. He received his B. Tech from IIT Kanpur in 2002, MS from Boston University in 2004, and PhD from Boston University in 2009, all in Electrical Engineering. Prior to joining Tufts University in 2011, he was a post-doctoral fellow at Schlumberger Doll Research from 2009-2011. He received the NSF CAREER award in 2016. He is currently a senior member of the IEEE, an associate editor for IEEE Transactions on Geosciences and Remote Sensing, and is on the technical committee of Machine Learning for Signal Processing, IEEE SPS. He is also the co-director of the Data Science programs (MS and BS) in the School of Engineering at Tufts University. Prof. Aeron’s research interests lie in Information Theory, Tensor Data Analytics, and more recently in Mathematical Statistics and Optimal Transport.
Sameer Singh, University of California, Irvine | Host: Hadi Amiri, UMass Lowell | Time: Mar 19, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/93777620265 | Password: cstalks
Title: Evaluating and Testing Natural Language Processing Models
Abstract
Current evaluation of the generalization of natural language processing (NLP) systems, and much of machine learning, primarily consists of measuring the accuracy on held-out instances of the dataset. Since the held-out instances are often gathered using a similar annotation process as the training data, they include the same biases that act as shortcuts for machine learning models, allowing them to achieve accurate results without requiring actual natural language understanding. Thus, held-out accuracy is often a poor proxy for measuring generalization, and further, aggregate metrics have little to say about where the problem may lie.
In this talk, I will introduce a number of approaches we are investigating to perform a more thorough evaluation of NLP systems. I will first provide an overview of automated techniques for perturbing instances in the dataset that identify loopholes and shortcuts in NLP models, including semantic adversaries and universal triggers. I will then describe recent work in creating comprehensive and thorough tests and evaluation benchmarks for NLP that aim to directly evaluate comprehension and understanding capabilities. The talk will cover a number of NLP tasks, including sentiment analysis, textual entailment, paraphrase detection, and question answering.
Bio
Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine (UCI). He works primarily on the robustness and interpretability of machine learning algorithms, along with models that reason with text and structure for natural language processing. Sameer was a postdoctoral researcher at the University of Washington and received his PhD from the University of Massachusetts, Amherst, during which he also worked at Microsoft Research, Google Research, and Yahoo! Labs. He was selected as a DARPA Riser, has been awarded the grand prize in the Yelp dataset challenge, a Yahoo! Key Scientific Challenges award, and the UCI Mid-Career Excellence in Research award, and recently received the Hellman and the Noyce Faculty fellowships. His group has received funding from the Allen Institute for AI, Amazon, NSF, DARPA, Adobe Research, Base 11, and FICO. Sameer has published extensively at machine learning and natural language processing conferences and workshops, including paper awards at KDD 2016, ACL 2018, EMNLP 2019, AKBC 2020, and ACL 2020.
Timothy Bickmore, Northeastern University | Host: Hadi Amiri, UMass Lowell | Time: Mar 12, 2021, 3:30–4:30pm ET | Location: https://uml.zoom.us/j/92439134605 | Password: cstalks
Title: Health Counseling Dialog Systems: Promise and Peril
Abstract
The current pandemic provides a compelling case for automated solutions to health behavior change, from mask wearing and social distancing, to behaviors that can prevent or treat psychological distress or substance misuse, to vaccination intent. Changing these behaviors can play a major role in how the pandemic is managed, and will help determine when it ends and the trajectory of our societal recovery. I will present a range of embodied conversational agents that have been used in medicine and public health to promote compliance with recommended healthcare regimens, and discuss how they could be used to help control COVID-19 and help us prepare for the next pandemic. I will discuss how conversational agents have been shown to be particularly effective at addressing health disparities for underserved populations, and why this is crucially important in pandemic response. I will also discuss some of the inherent risks in using natural language interfaces for medical counseling systems and outline some solutions to prevent users from harming themselves by following incorrect advice.
Bio
Dr. Timothy Bickmore is a Professor and Associate Dean for Research in the Khoury College of Computer Sciences at Northeastern University in Boston. The focus of his research is on the development and evaluation of embodied conversational agents, virtual and robotic, that emulate face-to-face interactions between health providers and patients. These agents have been used in automated health education and long-term health behavior change interventions, spanning preventive medicine and wellness promotion, chronic disease management, inpatient care, substance misuse screening and treatment, mental health treatment, and palliative care. His systems have been evaluated in multiple clinical trials with results published in medical journals including JAMA and The Lancet. Prior to Northeastern, Dr. Bickmore served as an Assistant Professor of Medicine at the Boston University School of Medicine. Dr. Bickmore received his Ph.D. from MIT, doing his dissertation work in the Media Lab studying interactions between people and embodied conversational agents in task contexts, such as healthcare, in which social-emotional behavior can be used to improve outcomes.
Matthew Lease, University of Texas at Austin | Host: Hadi Amiri, UMass Lowell | Time: Mar 5, 2021, 2:00–3:00pm ET | Location: https://uml.zoom.us/j/97264164573 | Password: cstalks
Title: Adventures in Crowdsourcing: Toward Safer Content Moderation and Better Supporting Complex Annotation Tasks
Abstract
I'll begin the talk by discussing content moderation. While most user content posted on social media is benign, other content, such as violent or adult imagery, must be detected and blocked. Unfortunately, such detection is difficult to automate, due to high accuracy requirements, the costs of errors, and nuanced rules for acceptable content. Consequently, social media platforms today rely on a vast workforce of human moderators. However, mounting evidence suggests that exposure to disturbing content can cause lasting psychological and emotional damage to some moderators. To mitigate such harm, we investigate a set of blur-based moderation interfaces for reducing exposure to disturbing content whilst preserving moderators' ability to quickly and accurately flag it. We report experiments with Mechanical Turk workers to measure moderator accuracy, speed, and emotional well-being across six alternative designs. Our key findings show interactive blurring designs can reduce emotional impact without sacrificing moderation accuracy and speed. See our online demo at: http://ir.ischool.utexas.edu/CM/demo/. The second part of my talk will discuss aggregation modeling. Though many models have been proposed for binary or categorical labels, prior methods do not generalize to complex annotations (e.g., open-ended text, multivariate, or structured responses) without devising new models for each specific task. To obviate the need for task-specific modeling, we propose to model distances between labels, rather than the labels themselves. Our models are largely agnostic to the distance function; we leave it to requesters to specify an appropriate distance function for their given annotation task. We propose three models of annotation quality, including a Bayesian hierarchical extension of multidimensional scaling, which can be trained in an unsupervised or semi-supervised manner. Results show the generality and effectiveness of our models across diverse complex annotation tasks: sequence labeling, translation, syntactic parsing, and ranking.
Bio
Matthew Lease is an Associate Professor in the School of Information at the University of Texas at Austin, where he is co-leading Good Systems (http://goodsystems.utexas.edu/), an eight-year Grand Challenge to design responsible AI technologies. In addition, Lease is an Amazon Scholar, working on Amazon Mechanical Turk, SageMaker Ground Truth, and Augmented Artificial Intelligence (A2I). He also worked previously at CrowdFlower. Lease received the Best Paper award at the 2016 AAAI Conference on Human Computation and Crowdsourcing, as well as three early career awards for crowdsourcing (NSF, DARPA, IMLS). From 2011 to 2013, Lease co-organized the National Institute of Standards and Technology (NIST) Text Retrieval Conference (TREC) crowdsourcing track.
Philip Resnik, University of Maryland Host: Hadi Amiri, UMass Lowell Time: Feb 26, 2021 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/99309189281 Password: cstalks
Title: Computational Analysis of Language and the Assessment of Suicide Risk
Abstract
This talk, to be given remotely in the middle of a pandemic, will be about a problem that already existed long prior to COVID-19 as a kind of international pandemic in its own right. Suicide has a worldwide death toll approaching 800,000 people per year, and in the U.S. in 2016 it became the second leading cause of death among those aged 10–34. Now compounding these existing problems is an “echo pandemic” of suicide and mental illness emerging in the wake of COVID-19, as people struggle with isolation, stress, and sustained disruptions of day-to-day life. I'll talk about computational linguistics research related to the problem of suicide, raising issues connected with computational research on mental health more generally, and covering not only the technological angle but also questions of data access, ethical considerations, and the role of computational technologies in the mental health ecosystem.
Bio
Philip Resnik is Professor at University of Maryland in the Department of Linguistics and Institute for Advanced Computer Studies. He earned his bachelor's in Computer Science at Harvard and his PhD in Computer and Information Science at the University of Pennsylvania, and does research in computational linguistics. Prior to joining UMD, he was an associate scientist at BBN, a graduate summer intern at IBM T.J. Watson Research Center (subsequently awarded an IBM Graduate Fellowship) while at UPenn, and a research scientist at Sun Microsystems Laboratories. Resnik's most recent research focus has been in computational social science, with an emphasis on connecting the signal available in people's language use with underlying mental state – this has applications in computational political science, particularly in connection with ideology and framing, and in mental health, focusing on the ways that linguistic behavior may help to identify and monitor depression, suicidality, and schizophrenia. Outside his academic research, Resnik has been a technical co-founder of CodeRyte (NLP for electronic health records, acquired by 3M in 2012), and is an advisor to Converseon (social strategy and analytics), FiscalNote (machine learning and analytics for government relations), and SoloSegment (web site search and content optimization). He was named an ACL Fellow in 2020.
Heng Ji, University of Illinois at Urbana-Champaign Host: Hadi Amiri, UMass Lowell Time: Feb 19, 2021 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/94521966294 Password: cstalks
Title: How to Write a History Book?
Abstract
Understanding events and communicating about events are fundamental human activities. However, it's much more difficult to remember event-related information than entity-related information. For example, most people in the United States can answer the question “Which city is Columbia University located in?”, but very few can give a complete answer to “Who died from COVID-19?”. Human-written history books are often incomplete and highly biased because “history is written by the victors”. In this talk I will present a new research direction on event-centric knowledge base construction from multimedia, multilingual sources, followed by consistency checking and reasoning. Our minds represent events at various levels of granularity and abstraction, which allows us to quickly access and reason about old and new scenarios. Progress in natural language understanding and computer vision has helped automate some parts of event understanding, but the current, first-generation, automated event understanding is overly simplistic: it is local, sequential, and flat, whereas real events are hierarchical and probabilistic. Understanding them requires knowledge in the form of a repository of abstracted event schemas (complex event templates), understanding the progress of time, using background knowledge, and performing global inference. Our approach to second-generation event understanding builds on an incidental supervision approach to inducing an event schema repository that is probabilistic, hierarchically organized, and semantically coherent. This facilitates inducing higher-level event representations that analysts can interact with, allowing them to guide further reasoning and to extract events by constructing a novel structured cross-media, cross-lingual common semantic space. When complex events unfold in an emergent and dynamic manner, the multimedia, multilingual digital data from traditional news media and social media often convey conflicting information. To understand the many facets of such complex, dynamic situations, we have developed novel methods to induce hierarchical narrative graph schemas and apply them to enhance end-to-end joint neural information extraction, event coreference resolution, and event time prediction.
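To make the event-centric representation concrete, here is a minimal, hypothetical sketch of how events with typed arguments, times, and subevents might be stored and queried. All type and role names below are illustrative assumptions, not the speaker's actual schema repository.

```python
# Illustrative sketch of an event-centric record: events carry typed
# arguments, time stamps, and subevents, so the hierarchy can be
# traversed for questions like "who died from X?".
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:
    etype: str                       # hypothetical type, e.g., "Life.Die"
    args: dict                       # role -> entity, e.g., {"victim": ...}
    time: str = ""                   # normalized time expression
    subevents: List["Event"] = field(default_factory=list)

def find_events(root: Event, etype: str):
    """Depth-first search over the event hierarchy for a given type."""
    hits = [root] if root.etype == etype else []
    for sub in root.subevents:
        hits.extend(find_events(sub, etype))
    return hits

outbreak = Event("Disaster.Outbreak", {"disease": "COVID-19"}, "2020",
                 subevents=[Event("Life.Die", {"victim": "John Doe"}, "2020-04")])
print([e.args["victim"] for e in find_events(outbreak, "Life.Die")])
```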
Bio
Heng Ji is a professor in the Computer Science Department, and an affiliated faculty member in the Electrical and Computer Engineering Department, at the University of Illinois at Urbana-Champaign. She is an Amazon Scholar. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially Multimedia Multilingual Information Extraction, Knowledge Base Population, and Knowledge-driven Generation. She was selected as a “Young Scientist” and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. Her awards include the “AI's 10 to Watch” Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, Google Research Awards in 2009 and 2014, IBM Watson Faculty Awards in 2012 and 2014, Bosch Research Awards in 2014–2018, and the ACL 2020 Best Demo Paper award. She was invited by the Secretary of the U.S. Air Force and AFRL to join the Air Force Data Analytics Expert Panel to inform the Air Force Strategy 2030. She leads many multi-institution projects and tasks, including the U.S. ARL projects on information fusion and knowledge network construction, the DARPA DEFT Tinker Bell team, and the DARPA KAIROS RESIN team. She has coordinated the NIST TAC Knowledge Base Population task since 2010, has served as Program Committee Co-Chair of many conferences including NAACL-HLT 2018, and was elected secretary of the North American Chapter of the Association for Computational Linguistics (NAACL) for 2020–2021. Her research has been widely supported by U.S. government agencies (DARPA, ARL, IARPA, NSF, AFRL, DHS) and industry (Amazon, Google, Bosch, IBM, Disney).
Alexander Rush, Cornell University Host: Hadi Amiri, UMass Lowell Time: Dec 18, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/93871406559 Password: cstalks
Title: Deployable Language Systems
Abstract
Natural language models for translation and classification now work relatively well, and there is demand for their widespread use in real systems. Models developed for research, however, do not naturally translate to deployment scenarios, particularly on resource-constrained devices like mobile phones. In this talk I will discuss two axes that make it difficult to deploy NLP models in practice: (a) serial generation in translation models makes them difficult to optimize, and (b) fine-tuned parameter size in classification makes models difficult to deploy to end users. I propose two approaches that aim to circumvent these issues, and discuss some practical work on deploying large NLP models on edge devices.
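The serial-generation bottleneck can be seen in a toy decoder: an autoregressive model must produce token t before token t+1, so generation is an inherently sequential loop, one full model call per output token. The `next_token` stub below is an assumption standing in for a real translation model; the talk's actual approaches are not reproduced here.

```python
# Why serial generation is slow, in miniature: decoding cannot be
# parallelized across positions because each token conditions on the
# tokens generated so far.

def next_token(prefix):
    # Stub "model": echoes a fixed continuation; a real model would run
    # a full forward pass here, once per generated token.
    continuation = ["la", "maison", "bleue", "<eos>"]
    return continuation[len(prefix)]

def greedy_decode(max_len=10):
    tokens = []
    for _ in range(max_len):          # one model call per output token
        tok = next_token(tokens)
        if tok == "<eos>":
            break
        tokens.append(tok)
    return tokens

print(greedy_decode())                # -> ['la', 'maison', 'bleue']
```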
Bio
Alexander “Sasha” Rush is an Associate Professor at Cornell Tech in NYC. His group's research is in the intersection of natural language processing, deep learning, and structured prediction with applications in text generation and efficient inference. He contributes to several open-source projects in NLP and works part time on HuggingFace Transformers. He was recently senior Program Chair of ICLR and developed the MiniConf tool used to run ML/NLP virtual conferences. His work has received paper and demo awards at major NLP, visualization, and hardware conferences, an NSF Career Award, and several industrial faculty awards.
Dan Roth, University of Pennsylvania Host: Hadi Amiri, UMass Lowell Time: Dec 10, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/99389698582 Password: cstalks
Title: It's Time for Reasoning
Abstract
The fundamental issue underlying natural language understanding is that of semantics: we need to move toward understanding natural language at an appropriate level of abstraction in order to support natural language understanding and communication. Machine learning has become ubiquitous in our attempts to induce semantic representations of natural language and to support decisions that depend on them; however, while we have made significant progress over the last few years, this progress has focused on classification tasks for which we have large amounts of annotated data. Supporting high-level decisions that depend on natural language understanding is still beyond our capabilities, partly because most of these tasks are very sparse and generating supervision signals for them does not scale. I will discuss some of the challenges underlying reasoning – making natural language understanding decisions that depend on multiple, interdependent models – and exemplify them using the domain of Reasoning about Time, as it is expressed in natural language. If time permits, I will touch on other inference problems that challenge our ability to understand natural language, addressing issues in Information Pollution.
Bio
Dan Roth is the Eduardo D. Glandt Distinguished Professor at the Department of Computer and Information Science, University of Pennsylvania, and a Fellow of the AAAS, the ACM, AAAI, and the ACL. In 2017 Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers; he was recognized “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.” Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory, and has developed advanced machine-learning-based tools for natural language applications that are widely used. Until February 2017 Roth was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR). Roth has been involved in several startups; most recently he was a co-founder and chief scientist of NexLP, a startup that leverages the latest advances in Natural Language Processing (NLP), Cognitive Analytics, and Machine Learning in the legal and compliance domains. NexLP was sold to Reveal in 2020. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.
Marinka Zitnik, Harvard University Host: Hadi Amiri, UMass Lowell Time: Dec 4, 2020 at 2:00–3:00pm ET Location: https://uml.zoom.us/j/91465138533 Password: cstalks
Title: Graph Neural Networks for Biomedical Data
Abstract
The success of machine learning depends heavily on the choice of representations used for downstream tasks. Graph neural networks have emerged as a predominant choice for learning representations of networked data. Still, existing methods require abundant label information and focus on either nodes or entire graphs. In this talk, I describe our efforts to expand the scope and ease the applicability of graph representation learning. First, I outline SubGNN, the first subgraph neural network, for learning disentangled subgraph representations. Second, I describe G-Meta, a novel meta-learning approach for graphs. G-Meta uses subgraphs to generalize to completely new graphs and never-before-seen labels using only a handful of nodes or edges. G-Meta is theoretically justified and scales to datasets orders of magnitude larger than prior work. Finally, I discuss applications in biology and medicine. The new methods have enabled the repurposing of drugs for new diseases, including COVID-19, where our predictions were experimentally verified in the wet laboratory. Further, the methods enabled discovering dozens of drug combinations safe for patients, with considerably fewer unwanted side effects than today's treatments. The methods also enable molecular phenotyping, performing much better than more complex algorithms. Lastly, I describe our efforts in learning actionable representations that allow users of our models to receive predictions that can be interpreted meaningfully.
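The local-subgraph idea behind G-Meta can be illustrated with a short sketch: each node contributes its h-hop neighborhood as the unit of learning, rather than the full graph. The sketch below only shows subgraph extraction with networkx on a stand-in graph; G-Meta's meta-learning loop itself is not reproduced.

```python
# A minimal sketch of "local subgraphs" as learning units: a node's
# signal comes from its h-hop neighborhood, not the whole graph.
import networkx as nx

G = nx.karate_club_graph()            # small stand-in for a large graph

def local_subgraph(graph, node, hops=2):
    """Return the induced subgraph on the h-hop neighborhood of `node`."""
    return nx.ego_graph(graph, node, radius=hops)

sub = local_subgraph(G, node=0, hops=1)
print(sub.number_of_nodes(), sub.number_of_edges())
```

Because each subgraph has bounded size regardless of the full graph, batches of such neighborhoods can be processed independently, which is one reason this style of approach scales to much larger graphs.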
Bio
Marinka Zitnik is an Assistant Professor at Harvard University with appointments in the Department of Biomedical Informatics, the Blavatnik Institute, the Broad Institute of MIT and Harvard, and Harvard Data Science. Dr. Zitnik is a computer scientist studying machine learning, focusing on challenges brought forward by data in science, medicine, and health. She has published extensively on representation learning, knowledge graphs, data fusion, and graph ML (NeurIPS, JMLR, IEEE TPAMI, KDD, ICLR), with applications to biomedicine (Nature Methods, Nature Communications, PNAS). Her algorithms are used by major institutions, including Baylor College of Medicine, the Karolinska Institute, Stanford Medical School, and Massachusetts General Hospital. Her work has received several best paper, poster, and research awards from the International Society for Computational Biology. She has recently been named a Rising Star in Electrical Engineering and Computer Science (EECS) by MIT and a Next Generation in Biomedicine by the Broad Institute, the only young scientist to receive such recognition in both EECS and Biomedicine.
Rongxing Lu, University of New Brunswick Host: Xinwen Fu, UMass Lowell Time: Nov 20, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/99726805928 Password: cstalks
Title: Privacy-Preserving Computation Offloading for Time-Series Activities Classification in eHealthcare
Abstract
The convergence of the Internet of Things (IoT) and smart healthcare technologies has opened up various promising applications that can significantly improve the quality of healthcare services. Among these, predicting patients' physical health from routine activity data collected by IoT devices is one of the most popular: patients' data are treated as time-series activities, and their physical health can be predicted by a classification model. Although many existing works have explored this application, they either impose the computational cost of classification on the healthcare center (e.g., a hospital) or delegate the classification to the cloud without considering privacy. However, since the healthcare center may not be computationally powerful and the cloud is not fully trusted, there is high demand for offloading the healthcare center's computational cost to the cloud while preserving the privacy of the classification result against the cloud. To address this challenge, in this work we present a novel privacy-preserving time-series activity classification algorithm based on hidden Markov models (HMMs). Specifically, we first design a variant of the HMM forward algorithm and then introduce a privacy-preserving variant of forward (PPVF) protocol for it. Based on the PPVF protocol, we propose our classification algorithm, which offloads the healthcare center's computational cost to the cloud while preserving the privacy of the classification result. Finally, security analysis and performance evaluation show that our proposal is not only privacy-preserving but also efficient, with low computational cost.
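For background, the standard (plaintext) HMM forward algorithm that the PPVF protocol builds on can be written in a few lines. The toy parameters below are assumptions; the cryptographic offloading, which is the work's actual contribution, is not shown.

```python
# Background sketch: the classic HMM forward recursion, which computes
# the likelihood P(observations) of a discrete observation sequence.
import numpy as np

A = np.array([[0.7, 0.3],            # state-transition matrix
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],            # emission matrix: state x symbol
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])            # initial state distribution

def forward(obs):
    """Return P(obs) under the HMM via the forward recursion."""
    alpha = pi * B[:, obs[0]]                     # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]             # induction step
    return alpha.sum()                            # termination

print(forward([0, 1, 0]))             # likelihood of a short activity trace
```

In the privacy-preserving setting, the healthcare center would hold the observations and the cloud would carry out (an encrypted variant of) these matrix-vector steps without learning the classification result.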
Bio
Rongxing Lu (S’99-M’11-SM’15) has been an associate professor at the Faculty of Computer Science (FCS), University of New Brunswick (UNB), Canada, since August 2016. Before that, he was an assistant professor at the School of Electrical and Electronic Engineering, Nanyang Technological University (NTU), Singapore, from April 2013 to August 2016, and a postdoctoral fellow at the University of Waterloo from May 2012 to April 2013. He was awarded the prestigious “Governor General’s Gold Medal” when he received his PhD from the Department of Electrical & Computer Engineering, University of Waterloo, Canada, in 2012, and won the 8th IEEE Communications Society (ComSoc) Asia Pacific (AP) Outstanding Young Researcher Award in 2013. He is a senior member of the IEEE Communications Society. His research interests include applied cryptography, privacy-enhancing technologies, and IoT-big data security and privacy. He has published extensively in his areas of expertise (with 20,700+ citations and an h-index of 71 on Google Scholar as of November 2020), and is the recipient of nine best (student) paper awards from reputable journals and conferences. Currently, Dr. Lu serves as Vice-Chair (Conferences) of the IEEE ComSoc Communications and Information Security Technical Committee (CIS-TC). Dr. Lu is the winner of the 2016–17 Excellence in Teaching Award, FCS, UNB.
Kenneth Mandl, Harvard University Host: Hadi Amiri, UMass Lowell Time: Nov 13, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/91745228147 Password: cstalks
Title: Parsimonious Standards for Extraordinary Outcomes: a Universal, Regulated API for Healthcare
Bio
Mandl directs the Computational Health Informatics Program at Boston Children's Hospital and is the Donald A.B. Lindberg Professor of Pediatrics and Professor of Biomedical Informatics at Harvard Medical School. His work at the intersection of population and individual health has had a unique and sustained influence on the developing field of biomedical informatics. He was a pioneer of the first personally controlled health record systems, the first participatory surveillance system, and real-time biosurveillance. Mandl co-developed SMART, a widely adopted approach that enables a health app written once to access digital data and run anywhere in the healthcare system. The 21st Century Cures Act made SMART a universal property of the healthcare system, enabling innovators to rapidly reach market scale, and patients and doctors to access data and an “app store for health.” He applies open source inventions to lead EHR research networks and is a leader of the Genomics Research and Innovation Network. Mandl was an advisor to two Directors of the CDC and chaired the Board of Scientific Counselors of the NIH's National Library of Medicine. He has been elected to multiple honor societies, including the American Society for Clinical Investigation, the Society for Pediatric Research, the American College of Medical Informatics, and the American Pediatric Society. He received the Presidential Early Career Award for Scientists and Engineers and the Donald A.B. Lindberg Award for Innovation in Informatics.
Ted Pedersen, University of Minnesota, Duluth Host: Hadi Amiri, UMass Lowell Time: Nov 6, 2020 at 4:30–5:30pm ET Location: https://uml.zoom.us/j/94850255401 Password: cstalks
Title: Automatically Identifying Islamophobia in Social Media
Abstract
Social media continues to grow in its scope, importance, and toxicity. Hate speech is ever-present in today's social media, and causes or contributes to dangerous real-world situations for those it targets. Anti-Muslim bias and hatred have escalated in both public life and social media in recent years. This talk will overview a new and ongoing project on identifying Islamophobia in social media using techniques from Natural Language Processing. I will describe our methods of data collection and annotation, and discuss some of the challenges we have encountered thus far. In addition, I'll describe some of the pitfalls that exist for any effort attempting to identify hate speech (automatically or not).
Bio
Ted Pedersen is a Professor in the Department of Computer Science at the University of Minnesota, Duluth. His research interests are in Natural Language Processing and are most recently focused on computational humor and identifying hate speech. His research has previously been supported by the National Institutes of Health (NIH) and a National Science Foundation (NSF) CAREER award. More details are available at http://www.d.umn.edu/~tpederse.
Dina Demner-Fushman, NIH, National Library of Medicine Host: Hadi Amiri, UMass Lowell Time: Oct 30, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/98198637523 Password: cstalks
Title: Looking for information and answers during a pandemic
Abstract
COVID-19 caused the first-ever infodemic – an avalanche of scientific publications, as well as official and unofficial communications related to the disease caused by the novel coronavirus. Most of these publications intend to inform clinicians, researchers, policy makers, and patients about the health, socio-economic, and cultural consequences of the pandemic. Leveraging this stream of information is essential for developing policies, guidelines, and strategies during the pandemic, for recovery afterwards, and for designing measures to prevent the recurrence of similar threats.
In collaboration with researchers at the National Institute of Standards and Technology (NIST), Ai2, UTHealth, and OHSU, we have developed datasets for retrieval of COVID-19 information and automatic question answering. These datasets allowed us to (1) conduct community-wide evaluations of information retrieval and question answering systems; (2) develop novel approaches to meeting information needs as they evolve during pandemics; and (3) automatically detect misinformation. I will discuss the resources and some of the lessons learned in the five rounds of the TREC-COVID evaluation, the ongoing Epidemic Question Answering Challenge (EPIC-QA), and our approaches to detecting misinformation about COVID-19 within the TREC 2020 Misinformation track evaluation.
Bio
Dr. Dina Demner-Fushman is an Investigator at the Lister Hill National Center for Biomedical Communications, NLM, NIH. Her group studies approaches to Information Extraction for Clinical Decision Support, Clinical Data Processing, and Image and Text Indexing for Clinical Decision Support and Education. The outgrowths of this research are the evidence-based decision support system in use at the NIH Clinical Center since 2009, an image retrieval engine, Open-i, launched in 2012, and an automatic question answering service, CHiQA, launched in 2018. Dina Demner-Fushman is a Fellow of the American College of Medical Informatics (ACMI), an Associate Editor of the Journal of the American Medical Informatics Association (JAMIA), and a founding member of the Association for Computational Linguistics Special Interest Group on biomedical natural language processing. As the secretary of this group, she has been an essential organizer of the yearly ACL BioNLP Workshop since 2007. Dr. Demner-Fushman has received sixteen staff recognition and special act NLM awards since 2002. She is a recipient of the 2012 NIH Award of Merit, a 2013 NLM Regents Award for Scholarship or Technical Achievement, and a 2014 NIH Office of the Director Honor Award.
Jordan Boyd-Graber, University of Maryland, College Park Host: Hadi Amiri, UMass Lowell Time: Oct 23, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/94630726928 Password: cstalks
Title: Artificial intelligence isn't a game show (but it should be)
Abstract
Artificial intelligence is viewed as a goal in science (let's build intelligent machines) and in education (let's train software engineers to build smart assistants). Despite the serious implications for the economy and society, the most widely accepted view of the end goal of artificial intelligence is a parlor game: a trivial “imitation game” (known today as the Turing Test). Likewise, many of the watersheds in the public understanding of AI progress have been in frivolous games like chess or Go. Sometimes, they're a literal game show, like Jeopardy! After discussing why existing game show exhibitions have given an inaccurate impression of how well we're doing with question answering, I'll discuss how we can use the skills and strategies of high school trivia competitions to improve the science of AI, to communicate the limitations of AI, and to broaden participation in computer science and artificial intelligence.
Bio
Jordan Boyd-Graber is an associate professor in the University of Maryland's Computer Science Department, iSchool, UMIACS, and Language Science Center. Jordan's research focuses on applying machine learning and Bayesian probabilistic models to problems that help us better understand social interaction or the human cognitive process. He and his students have won “best of” awards at NIPS (2009, 2015), NAACL (2016), and CoNLL (2015), and Jordan won the British Computer Society's 2015 Karen Spärck Jones Award and a 2017 NSF CAREER award. His research has been funded by DARPA, IARPA, NSF, NCSES, ARL, NIH, and Lockheed Martin, and has been featured by CNN, the Huffington Post, New York Magazine, and the Wall Street Journal.
Antonio Torralba, Massachusetts Institute of Technology (MIT) Host: Hadi Amiri, UMass Lowell Time: Oct 16, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/93071258047 Password: cstalks
Title: Learning from vision, touch and audition
Abstract
Babies learn with very little supervision, and, even when supervision is present, it comes in the form of an unknown spoken language that also needs to be learned. How can kids make sense of the world? In this talk, I will discuss several ways in which one can discover meaningful representations without requiring manually annotated data. I will show that an agent with access to multimodal data (vision, audition, or touch) can use the correlation between images and sounds to discover objects in the world without supervision. I will show that ambient sounds can be used as a supervisory signal for learning to see, and vice versa (the sound of crashing waves, the roar of fast-moving cars – sound conveys important information about the objects in our surroundings). I will describe an approach that learns, by watching videos without annotations, to locate the image regions that produce sounds and to separate the input sounds into a set of components that represents the sound from each pixel. I will also discuss our recent work on capturing tactile information, and show how Generative Adversarial Networks (GANs) can learn meaningful internal representations without supervision.
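A toy sketch of the audio-visual correspondence idea: embeddings of a video frame and its co-occurring sound should score higher than mismatched pairs, and training can exploit exactly this free signal. The random "embeddings" below stand in for real network outputs; nothing here reproduces the speaker's models.

```python
# Illustrative sketch: paired image/audio embeddings should be more
# similar to each other than to mismatched clips.
import numpy as np

rng = np.random.default_rng(0)
image_emb = rng.normal(size=(4, 16))                     # 4 clips, 16-dim
audio_emb = image_emb + 0.1 * rng.normal(size=(4, 16))   # paired audio

def correspondence_scores(img, aud):
    """Cosine similarity between every image/audio pair."""
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    aud = aud / np.linalg.norm(aud, axis=1, keepdims=True)
    return img @ aud.T                # [i, j] = sim(image_i, audio_j)

S = correspondence_scores(image_emb, audio_emb)
# True pairs lie on the diagonal; training pushes them above the rest.
print((S.argmax(axis=1) == np.arange(4)).all())  # -> True in this toy setup
```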
Bio
Antonio Torralba is the Thomas and Gerd Perkins Professor and head of the AI+D faculty in the Department of Electrical Engineering and Computer Science (EECS) at the Massachusetts Institute of Technology (MIT). From 2017 to 2020, he was the MIT director of the MIT-IBM Watson AI Lab, and, from 2018 to 2020, the inaugural director of the MIT Quest for Intelligence, an MIT campus-wide initiative to discover the foundations of intelligence. He is also a member of CSAIL and the Center for Brains, Minds and Machines. He received his degree in telecommunications engineering from Telecom BCN, Spain, in 1994 and his Ph.D. in signal, image, and speech processing from the Institut National Polytechnique de Grenoble, France, in 2000. From 2000 to 2005, he received postdoctoral training at the Brain and Cognitive Sciences Department and the Computer Science and Artificial Intelligence Laboratory at MIT, where he is now a professor. Prof. Torralba is an Associate Editor of the International Journal of Computer Vision and served as program chair for the Computer Vision and Pattern Recognition conference in 2015. He received the 2008 National Science Foundation (NSF) CAREER award, the best student paper award at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in 2009, the 2010 J. K. Aggarwal Prize from the International Association for Pattern Recognition (IAPR), the 2017 Frank Quick Faculty Research Innovation Fellowship and Louis D. Smullin (’39) Award for Teaching Excellence, and the 2020 PAMI Mark Everingham Prize.
Eduard Hovy, Carnegie Mellon University (CMU) Host: Hadi Amiri, UMass Lowell Time: Oct 9, 2020 at 5:00–6:00pm ET Location: https://uml.zoom.us/j/92887951351 Password: cstalks
Title: From Simple to Complex QA
Abstract
Recent automated QA systems achieve strong results using a variety of techniques. How do complex/deep/neural QA approaches differ from simple/shallow ones? In early QA, pattern-learning and -matching techniques identified the appropriate factoid answer(s). In deep QA, neural architectures learn and apply more flexible, generalized word/type-sequence 'patterns'. However, many QA tasks require some sort of intermediate reasoning or other inference procedures that go beyond generalized patterns of words and phrases. One approach focuses on learning small access functions to locate the answer in structured resources like tables or databases. But much (or most) online information is not structured, and what to do in this case is unclear. Most current 'deep' QA research takes a one-size-fits-all approach, based on the hope that a multi-layer neural architecture will somehow learn to encode inference steps automatically. The main problem facing this approach is the difficulty of determining exactly what reasoning is required, and what knowledge resources are needed to support it. How should the QA community address this challenge? In this talk I outline the problem, define four levels of QA, and propose a general direction for future research.
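The 'simple/shallow' end of this spectrum is easy to illustrate: a hand-written surface pattern that answers one family of factoid questions. The example below is deliberately brittle and purely illustrative; the talk's point is that learned, generalized patterns, and eventually explicit reasoning, must go beyond it.

```python
# An illustration of early pattern-matching QA: a regex maps a question
# template to an answer-extraction pattern over the context.
import re

CONTEXT = "Yale University was founded in 1701 in New Haven."

def pattern_qa(question, context):
    m = re.match(r"When was (.+) founded\?", question)
    if m:
        entity = re.escape(m.group(1))
        hit = re.search(entity + r" was founded in (\d{4})", context)
        if hit:
            return hit.group(1)
    return None          # the pattern fails on any paraphrase

print(pattern_qa("When was Yale University founded?", CONTEXT))  # -> 1701
```

A single reworded context ("established in 1701") already defeats this pattern, which is precisely the generalization gap that neural QA addresses, at the cost of the transparency such explicit patterns provide.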
Bio
Eduard Hovy is a research professor at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University. He also holds adjunct professorships in CMU's Machine Learning Department and at USC (Los Angeles) and BUPT (Beijing). Dr. Hovy completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987, and was awarded honorary doctorates from the National Distance Education University (UNED) in Madrid in 2013 and the University of Antwerp in 2015. He is one of the initial 17 Fellows of the Association for Computational Linguistics (ACL) and is also a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI). Dr. Hovy’s research focuses on computational semantics of language, and addresses various areas in Natural Language Processing and Data Analytics, including in-depth machine reading of text, information extraction, automated text summarization, question answering, the semi-automated construction of large lexicons and ontologies, and machine translation. In late 2019 his Google h-index was 80, with over 30,000 citations. Dr. Hovy is the author or co-editor of six books and over 400 technical articles and is a popular invited speaker. From 2003 to 2015 he was co-Director of Research for the Department of Homeland Security’s Center of Excellence for Command, Control, and Interoperability Data Analytics, a distributed cooperation of 17 universities. In 2001 Dr. Hovy served as President of the international Association of Computational Linguistics (ACL), in 2001–03 as President of the International Association of Machine Translation (IAMT), and in 2010–11 as President of the Digital Government Society (DGS). Dr. Hovy regularly co-teaches Ph.D.-level courses and has served on Advisory and Review Boards for both research institutes and funding organizations in Germany, Italy, Netherlands, Ireland, Singapore, and the USA.
Wolfgang Gatterbauer, Northeastern University Host: Tingjian Ge, UMass Lowell Time: Oct 5, 2020 at 2:00–3:00pm ET Location: https://uml.zoom.us/j/95441518174 Password:
Title: Algebraic Amplification for Semi-Supervised Learning from Sparse Data
Abstract
Node classification is an important problem in graph data management. It is commonly solved by various label propagation methods that work iteratively, starting from a few labeled seed nodes. For graphs with arbitrary compatibilities between classes, these methods crucially depend on knowing the compatibility matrix, which must be provided by either domain experts or heuristics. Can we instead directly estimate the correct compatibilities from a sparsely labeled graph in a principled and scalable way? We answer this question affirmatively and suggest a method called distant compatibility estimation that works even on extremely sparsely labeled graphs (e.g., 1 in 10,000 nodes is labeled) in a fraction of the time it later takes to label the remaining nodes. Our approach first creates multiple factorized graph representations (with size independent of the graph) and then performs estimation on these smaller graph sketches. We refer to algebraic amplification as the more general idea of leveraging the algebraic properties of an algorithm's update equations to amplify sparse signals. We show that our estimator is orders of magnitude faster than an alternative approach and that the end-to-end classification accuracy is comparable to using gold-standard compatibilities. This makes it a cheap preprocessing step for any existing label propagation method and removes the current dependence on heuristics.
VLDB 2015: Linearized and single-pass belief propagation (link1, link2, link3)
SIGMOD 2020: Factorized Graph Representations for Semi-Supervised Learning from Sparse Data (link1, link2, link3)
CODE
https://github.com/northeastern-datalab/factorized-graphs/
Bio
Wolfgang Gatterbauer is an Associate Professor in the Khoury College of
Computer Sciences at Northeastern University. Prior to joining
Northeastern, he was a postdoctoral fellow in the database group at the
University of Washington and an Assistant Professor in the Tepper School
of Business at Carnegie Mellon University. One major focus of his
research is to extend the capabilities of modern data management systems
in generic ways and to allow them to support novel functionalities that
seem hard at first. Examples of such functionalities are managing trust,
provenance, explanations, and uncertain & inconsistent data. He is a
recipient of the NSF Career award and “best-of-conference” mentions from
VLDB 2015, SIGMOD 2017, and WALCOLM 2017. In earlier times, he won a
Bronze medal at the International Physics Olympiad, worked in the steam
turbine development department of ABB Alstom Power, and in the German
office of McKinsey & Company.
https://db.khoury.northeastern.edu/
Brendan T. O'Connor, UMass Amherst Host: Hadi Amiri, UMass Lowell Time: Oct 2, 2020 at 3:30–4:30pm ET Location: https://uml.zoom.us/j/98102693544 Password: cstalks
Title: Social Factors in Natural Language Processing
Abstract
What can text analysis tell us about society? News, social media, and historical documents record events, beliefs, and culture. Natural language processing has the promise to quickly discover patterns and themes in large text collections.
At the same time, findings from the social sciences can better inform the design of artificial intelligence. We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter, using a demographically supervised model to identify AAE-like language in geolocated public messages. We verify that this language follows well-known AAE linguistic phenomena, and, furthermore, that existing tools like language identification, part-of-speech tagging, and dependency parsing fail on this AAE-like language more often than on text associated with white speakers. We leverage our model to fix racial bias in some of these tools, and discuss future implications for fairness and artificial intelligence.
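The bias finding rests on a simple kind of measurement that can be sketched in a few lines: compare a tool's error rate on messages associated with different demographic groups. The data and the `tool_is_correct` stub below are illustrative assumptions, not the study's actual tools or results.

```python
# Sketch of a per-group disparity measurement: how much more often does
# a tool fail on one group's messages than on another's?

def error_rate(messages, tool_is_correct):
    errors = sum(not tool_is_correct(m) for m in messages)
    return errors / len(messages)

def tool_is_correct(message):
    # Stub standing in for, e.g., a language identifier's judgment;
    # here it has a toy failure mode on informal variants.
    return "gonna" not in message

aae_assoc   = ["he gonna be there", "she was there"]
white_assoc = ["he is going to be there", "she was there"]
gap = (error_rate(aae_assoc, tool_is_correct)
       - error_rate(white_assoc, tool_is_correct))
print(f"error-rate gap: {gap:.2f}")   # -> 0.50 under this toy setup
```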
Bio
Brendan O'Connor (http://brenocon.com) is an associate professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst, who works at the intersection of computational social science and natural language processing – studying how social factors influence language technologies, and how to better understand social trends with text analysis. For example, he has investigated racial bias in NLP technologies, political events reported in news, language in Twitter, and crowdsourcing foundations of NLP. He is a recipient of the NSF CAREER and Google Faculty Research awards, has received a best paper award, and his research has been cited thousands of times and featured in the media. At UMass Amherst, he is affiliated with the Computational Social Science Institute and the Center for Data Science. He completed his PhD in 2014 at Carnegie Mellon University's Machine Learning Department, was previously a Visiting Fellow at the Harvard Institute for Quantitative Social Science, worked in the Facebook Data Science group and at the company CrowdFlower, and started studying the intersection of AI and social science in Symbolic Systems (BS/MS) at Stanford University.
Hadi Amiri: hadi_amiri@uml.edu | hadi@cs.uml.edu | @amirieb