Project

CS 695

Natural Language Processing (Special Topics)

Instructor

Ziyu Yao (ziyuyao [at] gmu [dot] edu)
Please email the instructor if you have any questions regarding the project.

Requirements

Students may complete the final project either individually or in a group of no more than 3 students. If working individually, the project can be less ambitious but should not be less complete; if working in a group, all students should contribute equally, and all group members will receive the same grade for the project. Students are allowed to combine this project with their research or other course projects; however, the project must still involve NLP concepts from this course. Students who are unsure whether another project can be combined with this course project should email the instructor for confirmation. Note that any external resources used in this project must be clearly cited in the reports.

Project Proposal

The project proposal should include at least the following items:
  • What problem you want to address, and the motivation (especially if it is a new or less-studied problem);
  • What dataset(s) you plan to use; if you plan to collect a new dataset, describe the procedure and the source data;
  • A literature survey on existing work that studies the same/a similar problem and/or investigates the same/a similar method; note that students are expected to compare their proposed methods with existing work, as a way of justifying the novelty of their work;
  • What you plan to do to pursue this project.
A PDF proposal must be submitted through Blackboard by the due date of Assignment 2. There is no other format requirement for the proposal. Students are strongly encouraged to discuss their project idea with the instructor before the deadline. Feedback will be provided, based on which students may revise their proposal and reflect the changes in the proposal presentation. After the submission, the instructor may provide second-round feedback.

Final Project Report

The final project report typically expands on the project proposal (except that all "plans" should be replaced by what has actually been done). Specifically, students should clearly describe:
  • What problem you have addressed and the motivation;
  • What datasets you have used or collected (with procedure) for the project;
  • Existing work that studies the same/a similar problem and/or investigates the same/a similar method;
  • The technical details of the proposed methods;
  • Experimental results comparing the proposed methods with the baselines, plus analysis and discussion of the results;
  • Possible future work extending from this project.
Essentially, students should treat the report as a conference submission and include all necessary details.

Format requirement: the report should follow an *ACL style (LaTeX/Word template here, under “Paper Submission and Templates”). This also means that your report (or paper) should begin with an abstract and an introduction, just like a typical paper. If you work individually, the paper should target 4 pages (i.e., an *ACL short paper); if you work in a group, the paper should be 8 pages (i.e., an *ACL long paper); both page limits exclude references.

Source code submission: when submitting the final project report (through Blackboard), you should also submit your source code implementation (with the two bundled in a single .zip file). A README document is required in the source code submission, so that the TA/instructor can reproduce your results.
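For example, a minimal README might cover the following (the specific file names and commands here are illustrative assumptions, not requirements):
  • Setup: Python version and dependencies, e.g., pip install -r requirements.txt;
  • Data: where to obtain the dataset(s) and where to place them;
  • Training: the exact command(s) used, e.g., python train.py --config configs/base.json;
  • Evaluation: the command(s) that reproduce each table/figure in the report.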

Grading: the final project will not be graded solely on how well your model works; you can still do well as long as you justify the novelty of your work and convincingly show that your model is doing something (e.g., through careful experimental design, comprehensive quantitative/qualitative analysis of the results, thoughtful discussion, etc.). Good writing, e.g., clearly describing what you have done and how it compares with existing work, is also important.

Topic Selection

Students are encouraged to explore any topics they are interested in, as long as the topics involve NLP concepts discussed in the course. A comprehensive set of NLP topics can be found in the Call for Papers of any *ACL conference (e.g., the CFP of ACL 2022). Students who plan to do research with the instructor over a longer term (e.g., one year), or who will work on NLP for their PhD study, are strongly encouraged to discuss possible research ideas with the instructor.

The final project is expected to be a novel research contribution that (1) introduces new techniques for one of the existing tasks from the assignments, utilizing one of the more advanced techniques introduced in class; (2) tackles a new NLP task (potentially with a neural network model motivated by the unique problems posed by the application domain); or (3) presents a novel, meaningful analysis of existing methods and their potential failures.

The following are a few useful data resources (the list is not comprehensive); students are not required to use any of them in their projects.
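Many of these datasets are publicly available and can often be loaded directly with the Hugging Face datasets library. Below is a minimal sketch for the Spider dataset listed under Semantic Parsing; the dataset identifier and field names are assumptions based on the public Hub copy and may change, so verify them against the dataset card before relying on them.

    from datasets import load_dataset

    # Load the Spider text-to-SQL dataset from the Hugging Face Hub.
    # The identifier "spider" and the field names below are assumptions;
    # check the dataset card on the Hub before use.
    spider = load_dataset("spider")

    example = spider["train"][0]
    print(example["question"])  # natural language question
    print(example["query"])     # annotated gold SQL query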

Semantic Parsing

Text to SQL:

  • Yu, Tao, et al. "Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
  • Suhr, Alane, et al. "Exploring unexplored generalization challenges for cross-database semantic parsing." Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.
  • Elgohary, Ahmed, Saghar Hosseini, and Ahmed Hassan Awadallah. "Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback." Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.
Conversational semantic parsing:
  • Yu, Tao, et al. "SParC: Cross-Domain Semantic Parsing in Context." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
  • Semantic Machines, "Task-oriented dialogue as dataflow synthesis." Transactions of the Association for Computational Linguistics 8 (2020): 556-571.
  • Cheng, Jianpeng, et al. "Conversational Semantic Parsing for Dialog State Tracking." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
Exploring compositional generalization:
  • Lake, Brenden, and Marco Baroni. "Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks." International conference on machine learning. PMLR, 2018.
  • Keysers, Daniel, et al. "Measuring Compositional Generalization: A Comprehensive Method on Realistic Data." International Conference on Learning Representations. 2020.
  • Kim, Najoung, and Tal Linzen. "COGS: A Compositional Generalization Challenge Based on Semantic Interpretation." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.

Question Answering

QA over structured/semi-structured data:
  • Pasupat, Panupong, and Percy Liang. "Compositional Semantic Parsing on Semi-Structured Tables." Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015.
  • Chen, Wenhu, et al. "TabFact: A Large-scale Dataset for Table-based Fact Verification." International Conference on Learning Representations. 2019.
  • Chen, Wenhu, et al. "HybridQA: A Dataset of Multi-Hop Question Answering over Tabular and Textual Data." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings. 2020.
  • Gu, Yu, et al. "Beyond IID: three levels of generalization for question answering on knowledge bases." Proceedings of the Web Conference 2021. 2021.
QA over textual data:
  • Yang, Zhilin, et al. "HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
  • Kwiatkowski, Tom, et al. "Natural Questions: a benchmark for question answering research." Transactions of the Association for Computational Linguistics 7 (2019): 453-466.
  • Fisch, Adam, et al. "MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension." Proceedings of the 2nd Workshop on Machine Reading for Question Answering. 2019.
  • Dua, Dheeru, et al. "DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs." Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
  • Saeidi, Marzieh, et al. "Interpretation of Natural Language Rules in Conversational Machine Reading." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

Language Generation

Data to text:
  • Parikh, Ankur, et al. "ToTTo: A Controlled Table-To-Text Generation Dataset." Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
  • Nan, Linyong, et al. "DART: Open-Domain Structured Data Record to Text Generation." Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021.
Dialogue Generation:
  • Zhang, Saizheng, et al. "Personalizing Dialogue Agents: I have a dog, do you have pets too?." Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018.
  • Zhou, Kangyan, Shrimai Prabhumoye, and Alan W. Black. "A Dataset for Document Grounded Conversations." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
  • Dinan, Emily, et al. "Wizard of Wikipedia: Knowledge-powered conversational agents." arXiv preprint arXiv:1811.01241 (2018).
  • Rashkin, Hannah, et al. "Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
  • Zang, Xiaoxue, et al. "MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines." Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI. 2020.

Language and Code

  • Husain, Hamel, et al. "CodeSearchNet challenge: Evaluating the state of semantic code search." arXiv preprint arXiv:1909.09436 (2019).
  • Puri, Ruchir, et al. "Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks." arXiv preprint arXiv:2105.12655 (2021).
  • Yin, Pengcheng, et al. "Learning to mine aligned code and natural language pairs from Stack Overflow." 2018 IEEE/ACM 15th International Conference on Mining Software Repositories (MSR). IEEE, 2018.
  • Hendrycks, Dan, et al. "Measuring Coding Challenge Competence With APPS." arXiv preprint arXiv:2105.09938 (2021).
  • Agashe, Rajas, Srinivasan Iyer, and Luke Zettlemoyer. "JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation." Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
  • Austin, Jacob, et al. "Program Synthesis with Large Language Models." arXiv preprint arXiv:2108.07732 (2021).

NLP + CV/Robotics

Robot/Agent Instruction:
  • Suhr, Alane, et al. "Executing Instructions in Situated Collaborative Interactions." Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
  • Shridhar, Mohit, et al. "ALFRED: A benchmark for interpreting grounded instructions for everyday tasks." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
  • Banerjee, Shurjo, Jesse Thomason, and Jason J. Corso. "The RobotSlang Benchmark: Dialog-guided Robot Localization and Navigation." arXiv preprint arXiv:2010.12639 (2020).
  • Küttler, Heinrich, et al. "The NetHack Learning Environment." Proceedings of the Conference on Neural Information Processing Systems (NeurIPS). 2020.
Visual QA:
  • Johnson, Justin, et al. "CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
  • Yi, Kexin, et al. "CLEVRER: Collision Events for Video Representation and Reasoning." International Conference on Learning Representations. 2020.
  • Grunde-McLaughlin, Madeleine, Ranjay Krishna, and Maneesh Agrawala. "AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.