Selected Publications
You can find all my articles at my Google Scholar profile.
Recent Preprints!
EAGLE: A Domain Generalization Framework for AI-generated Text Detection. Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, Huan Liu. paper link
Selected Accepted Publications
Towards Inference-time Category-wise Safety Steering for Large Language Models. Amrita Bhattacharjee*, Shaona Ghosh*, Traian Rebedea, Christopher Parisien. paper link [NeurIPS 2024 - Safe Generative AI] [Work done at NVIDIA!]
Defending Against Social Engineering Attacks in the Age of LLMs. Lin Ai*, Tharindu Kumarage*, Amrita Bhattacharjee*, Zizhou Liu, Zheng Hui, Michael Davinroy, James Cook, Laura Cassani, Kirill Trapeznikov, Matthias Kirchner, Arslan Basharat, Anthony Hoogs, Joshua Garland, Huan Liu, Julia Hirschberg. paper link [EMNLP 2024]
Large Language Models for Data Annotation: A Survey. Zhen Tan*, Dawei Li*, Song Wang*, Alimohammad Beigi, Bohan Jiang, Amrita Bhattacharjee, Mansooreh Karami, Jundong Li, Lu Cheng, Huan Liu. paper link [EMNLP 2024]
Zero-shot LLM-guided Counterfactual Generation: A Case Study on NLP Model Evaluation. Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, Huan Liu. preprint link [IEEE BigData 2024]
Towards LLM-guided Causal Explainability for Black-box Text Classifiers. Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, Huan Liu. paper link [AAAI ReLM 2024]
Adversarial Text Purification: A Large Language Model Approach for Defense. Raha Moraffah, Shubh Khandelwal, Amrita Bhattacharjee, Huan Liu. paper link [PAKDD 2024]
[Outstanding Paper Award] ConDA: Contrastive Domain Adaptation for AI-generated Text Detection. Amrita Bhattacharjee, Tharindu Kumarage, Raha Moraffah, Huan Liu. paper link [IJCNLP-AACL 2023]
Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales. Ayushi Nirmal*, Amrita Bhattacharjee*, Paras Sheth, Huan Liu. (* Equal contribution) paper link [NAACL WOAH 2024]
J-Guard: Journalism Guided Adversarially Robust Detection of AI-generated News. Tharindu Kumarage, Amrita Bhattacharjee, Djordje Padejski, Kristy Roschke, Dan Gillmor, Scott Ruston, Huan Liu, Joshua Garland. paper link [IJCNLP-AACL 2023]
Fighting Fire with Fire: Can ChatGPT Detect AI-generated Text? Amrita Bhattacharjee, Huan Liu. paper link [ACM SIGKDD Explorations, Vol. 25, No. 2]
Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection. Tharindu Kumarage*, Amrita Bhattacharjee*, Joshua Garland. [Book Chapter]
Text Transformations in Contrastive Self-Supervised Learning: A Review. Amrita Bhattacharjee*, Mansooreh Karami*, Huan Liu. paper link [IJCAI 2022 (Survey)]