Merge pull request #17 from NhatHoang2002/main
Add nhat info
knmnyn authored Jul 30, 2024
2 parents ffd22af + c49b07f commit d719fd6
Showing 14 changed files with 275 additions and 2 deletions.
9 changes: 7 additions & 2 deletions .gitignore
@@ -2,6 +2,11 @@
.idea/

# Hugo
/resources/
resources/
public/
assets/jsconfig.json
jsconfig.json
node_modules/
go.sum
.hugo_build.lock

.DS_Store
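The hunk above adds Hugo build artifacts (`resources/`, `public/`, `node_modules/`, the build lock) to `.gitignore`. One way to confirm such rules behave as intended is `git check-ignore`; a minimal sketch in a scratch repository, with patterns mirroring the hunk:

```shell
# Scratch repo to exercise the new ignore rules (hypothetical paths).
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
printf '%s\n' '/resources/' 'resources/' 'public/' 'node_modules/' '.hugo_build.lock' > .gitignore
mkdir -p resources public
# -v reports which .gitignore line matched each path.
git check-ignore -v resources/ public/ .hugo_build.lock
```

Note that `/resources/` (anchored to the repo root) is subsumed by the unanchored `resources/` pattern added on the next line, so either one alone would suffice.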
74 changes: 74 additions & 0 deletions content/authors/nhat/_index.md
@@ -0,0 +1,74 @@
---
# Display name
title: Nhat M. Hoang

# Full name (for SEO)
first_name: Nhat
middle_name: Minh
last_name: Hoang

# Is this the primary user of the site?
superuser: true

# Role/position
role: Research Intern

# Organizations/Affiliations
organizations:
- name: Nanyang Technological University
url: https://www.ntu.edu.sg/

# Short bio (displayed in user profile at end of posts)
bio: My name is Nhat, a research assistant at NTU Nail Lab, Singapore. I'm interested in generative AI, multimodal learning, and large language models.

interests:
- Generative AI
- Large Language Models
- Multimodal Learning

education:
courses:
- course: BE in Computer Science
institution: Nanyang Technological University
year: 2020 - 2024

# Social/Academic Networking
# For available icons, see: https://docs.hugoblox.com/getting-started/page-builder/#icons
# For an email link, use "fas" icon pack, "envelope" icon, and a link in the
# form "mailto:[email protected]" or "#contact" for contact widget.
social:
- icon: house
icon_pack: fas
link: https://nhathoang2002.github.io/
- icon: github
icon_pack: fab
link: https://github.com/NhatHoang2002
- icon: linkedin
icon_pack: fab
link: https://www.linkedin.com/in/nhathoang2002/
- icon: envelope
icon_pack: fas
link: '/#contact'
- icon: google-scholar
icon_pack: ai
link: https://scholar.google.com.sg/citations?hl=en&user=d6ixOGYAAAAJ&view_op=list_works
# Link to a PDF of your resume/CV from the About widget.
# To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
# - icon: cv
# icon_pack: ai
# link: files/cv.pdf

# Enter email to display Gravatar (if Gravatar enabled in Config)
email: 'mnhat.hoang2002 [at] gmail.com'

# Highlight the author in author lists? (true/false)
highlight_name: false

# Organizational groups that you belong to (for People widget)
# Set this to `[]` or comment out if you are not using People widget.
user_groups:
- Visitors / Interns
# - Researchers
---

Nhat M. Hoang is a Research Intern at the WING Research Group starting from August 2024. He is currently a Research Assistant at [NTU Nail Lab](https://ntu-nail.github.io/), with main interests in Multimodal Learning and Generative AI.
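The profile above lives at `content/authors/nhat/_index.md`; in Hugo Blox, the directory name (`nhat`) is the author key that publication pages list in their `authors:` front matter. A sketch of scaffolding a new profile in the same layout, using a hypothetical author key `jane`:

```shell
# Create a minimal author profile (hypothetical key "jane";
# field names follow the front matter shown above).
mkdir -p content/authors/jane
cat > content/authors/jane/_index.md <<'EOF'
---
title: Jane Doe
superuser: false
role: Research Intern
user_groups:
  - Visitors / Interns
---
EOF
# Publication pages would then credit this profile by listing
# "jane" in their authors: list.
```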
Binary file added content/authors/nhat/avatar.jpg
10 changes: 10 additions & 0 deletions content/publication/fang-2023-corner-cutmix/cite.bib
@@ -0,0 +1,10 @@
@INPROCEEDINGS{10222009,
author={Fang, Fen and Hoang, Nhat M. and Xu, Qianli and Lim, Joo-Hwee},
booktitle={2023 IEEE International Conference on Image Processing (ICIP)},
title={Data Augmentation Using Corner CutMix and an Auxiliary Self-Supervised Loss},
year={2023},
volume={},
number={},
pages={830-834},
keywords={Training;Computer vision;Image color analysis;Self-supervised learning;Data augmentation;Distortion;Convolutional neural networks;Data augmentation;region cut and mix;auxiliary loss;self-supervised learning},
doi={10.1109/ICIP49359.2023.10222009}}
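Site tooling sometimes needs individual fields out of a `cite.bib` like the one above; a quick sed sketch for the simple `doi={...}` form used here (a real BibTeX parser handles quoting, nesting, and line wrapping better):

```shell
# Write a trimmed copy of the entry above, then extract its doi field.
cat > /tmp/cite.bib <<'EOF'
@INPROCEEDINGS{10222009,
  title={Data Augmentation Using Corner CutMix and an Auxiliary Self-Supervised Loss},
  doi={10.1109/ICIP49359.2023.10222009}}
EOF
# Capture the text between "doi={" and the first closing brace.
doi=$(sed -n 's/.*doi={\([^}]*\)}.*/\1/p' /tmp/cite.bib)
echo "$doi"
```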
30 changes: 30 additions & 0 deletions content/publication/fang-2023-corner-cutmix/index.md
@@ -0,0 +1,30 @@
---
title: Data Augmentation Using Corner CutMix and an Auxiliary Self-Supervised Loss
subtitle: ""
authors:
- Fen Fang
- nhat
- Qianli Xu
- Joo-Hwee Lim

author_notes: []
doi: ""

# Schedule page publish date (NOT publication's date).
publishDate: '2023-10-08T17:06:11.943Z'
publication_types: ['paper-conference']

# Publication name and optional abbreviated publication name.
publication: In *2023 IEEE International Conference on Image Processing*
publication_short: In *ICIP 2023*

abstract: Deep convolutional neural networks (CNNs) have achieved remarkable success in computer vision tasks, but their training is susceptible to overfitting when the training sample size is insufficient. In this paper, we introduce Corner CutMix, a novel data augmentation technique for CNN training. During training, Corner CutMix randomly selects a region from one of four corner areas in an image and replaces it with a randomly chosen region from a distractor image. Additionally, we design an auxiliary self-supervised loss function to learn the position of the selected corner region, thereby improving the transferability and generalizability of the learned representation. Corner CutMix is easy to implement, adding little computational overhead, and can be combined with other augmentation methods such as random cropping, color distortion, and flipping. Our extensive classification task experiments in self-supervised learning on public datasets (e.g., CIFAR10, CIFAR100, and STL10) demonstrate the effectiveness of Corner CutMix, which consistently outperforms strong baselines such as CutOut and CutMix.

# Display this page in the Featured widget?
featured: true

url_pdf: 'https://doi.org/10.1109/ICIP49359.2023.10222009'
image:
caption: "Visualization examples of Corner CutMix (CCM). CCM can be applied directly to original images as well as combined with other augmentation methods, including cropping, flipping, and color distortion."
preview_only: false
---
8 changes: 8 additions & 0 deletions content/publication/hoang-2024-motionmix/cite.bib
@@ -0,0 +1,8 @@
@misc{hoang2024motionmix,
title={MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation},
author={Nhat M. Hoang and Kehong Gong and Chuan Guo and Michael Bi Mi},
year={2024},
eprint={2401.11115},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
41 changes: 41 additions & 0 deletions content/publication/hoang-2024-motionmix/index.md
@@ -0,0 +1,41 @@
---
title: "MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation"
subtitle: ""
authors:
- nhat
- Kehong Gong
- Chuan Guo
- Michael Bi Mi
author_notes: []
doi: ""

# Schedule page publish date (NOT publication's date).
publishDate: '2024-01-24T00:00:00Z'
publication_types: ['paper-conference']

# Publication name and optional abbreviated publication name.
publication: In *The 38th Annual AAAI Conference on Artificial Intelligence*
publication_short: In *AAAI 2024*

abstract: "Controllable generation of 3D human motions becomes an important topic as the world embraces digital transformation. Existing works, though making promising progress with the advent of diffusion models, heavily rely on meticulously captured and annotated (e.g., text) high-quality motion corpus, a resource-intensive endeavor in the real world. This motivates our proposed MotionMix, a simple yet effective weakly-supervised diffusion model that leverages both noisy and unannotated motion sequences. Specifically, we separate the denoising objectives of a diffusion model into two stages: obtaining conditional rough motion approximations in the initial T−T∗ steps by learning the noisy annotated motions, followed by the unconditional refinement of these preliminary motions during the last T∗ steps using unannotated motions. Notably, though learning from two sources of imperfect data, our model does not compromise motion generation quality compared to fully supervised approaches that access gold data. Extensive experiments on several benchmarks demonstrate that our MotionMix, as a versatile framework, consistently achieves state-of-the-art performances on text-to-motion, action-to-motion, and music-to-dance tasks."

# Display this page in the Featured widget?
featured: true

links:
- name: Project Page
url: https://nhathoang2002.github.io/MotionMix-page/

url_pdf: 'https://arxiv.org/abs/2401.11115'
url_code: 'https://github.com/NhatHoang2002/MotionMix'
url_dataset: ''
url_poster: 'https://nhathoang2002.github.io/MotionMix-page/static/pdfs/MotionMix_poster.pdf'
url_project: ''
url_slides: ''
url_source: ''
url_video: 'https://nhathoang2002.github.io/MotionMix-page/static/videos/demo_vid.mp4'

image:
caption: "Diffusion model trained with two sources of *imperfect* data while still achieving state-of-the-art performance on different motion generation tasks."
preview_only: false
---
20 changes: 20 additions & 0 deletions content/publication/hoang-etal-2024-toxcl/cite.bib
@@ -0,0 +1,20 @@
@inproceedings{hoang-etal-2024-toxcl,
title = "{T}o{XCL}: A Unified Framework for Toxic Speech Detection and Explanation",
author = "Hoang, Nhat and
Do, Xuan Long and
Do, Duc Anh and
Vu, Duc Anh and
Luu, Anh Tuan",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.naacl-long.359",
doi = "10.18653/v1/2024.naacl-long.359",
pages = "6460--6472",
abstract = "The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit one consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This draws a unique need for unified frameworks that can effectively detect and explain implicit toxic speech. Prior works mainly formulated the task of toxic speech detection and explanation as a text generation problem. Nonetheless, models trained using this strategy can be prone to suffer from the consequent error propagation problem. Moreover, our experiments reveal that the detection results of such models are much lower than those that focus only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and explanation of implicit toxic speech. Our model consists of three modules: a (i) Target Group Generator to generate the targeted demographic group(s) of a given post; an (ii) Encoder-Decoder Model in which the encoder focuses on detecting implicit toxic speech and is boosted by a (iii) Teacher Classifier via knowledge distillation, and the decoder generates the necessary explanation. ToXCL achieves new state-of-the-art effectiveness, and outperforms baselines significantly.",
}
38 changes: 38 additions & 0 deletions content/publication/hoang-etal-2024-toxcl/index.md
@@ -0,0 +1,38 @@
---
title: "ToXCL: A Unified Framework for Toxic Speech Detection and Explanation"
subtitle: ""
authors:
- nhat
- Xuan Long Do
- Duc Anh Do
- Duc Anh Vu
- Luu Anh Tuan
author_notes: [Equal Contribution, Equal Contribution]
doi: ""

# Schedule page publish date (NOT publication's date).
publishDate: '2024-03-26T00:00:00Z'
publication_types: ['paper-conference']

# Publication name and optional abbreviated publication name.
publication: In *2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics*
publication_short: In *NAACL 2024*

abstract: "The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit one consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This draws a unique need for unified frameworks that can effectively detect and explain implicit toxic speech. Prior works mainly formulated the task of toxic speech detection and explanation as a text generation problem. Nonetheless, models trained using this strategy can be prone to suffer from the consequent error propagation problem. Moreover, our experiments reveal that the detection results of such models are much lower than those that focus only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and explanation of implicit toxic speech. Our model consists of three modules: a (i) Target Group Generator to generate the targeted demographic group(s) of a given post; an (ii) Encoder-Decoder Model in which the encoder focuses on detecting implicit toxic speech and is boosted by a (iii) Teacher Classifier via knowledge distillation, and the decoder generates the necessary explanation. ToXCL achieves new state-of-the-art effectiveness, and outperforms baselines significantly."

# Display this page in the Featured widget?
featured: true

url_pdf: 'https://arxiv.org/abs/2403.16685'
url_code: 'https://github.com/NhatHoang2002/ToXCL'
url_dataset: ''
url_poster: ''
url_project: ''
url_slides: ''
url_source: ''
url_video: ''

image:
caption: "A sample input post from the Implicit Hate Corpus test set, with its ground-truth explanation, fed to two models. The baseline RoBERTa model failed to detect the implicit toxic speech, while our proposed ToXCL model successfully detected it and generated a toxic explanation closely matching the ground truth."
preview_only: false
---
9 changes: 9 additions & 0 deletions content/publication/vanlong-2024-chatgpt/cite.bib
@@ -0,0 +1,9 @@
@misc{vanlong2024chatgptmathquestionerevaluating,
title={ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions},
author={Phuoc Pham Van Long and Duc Anh Vu and Nhat M. Hoang and Xuan Long Do and Anh Tuan Luu},
year={2024},
eprint={2312.01661},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2312.01661},
}
38 changes: 38 additions & 0 deletions content/publication/vanlong-2024-chatgpt/index.md
@@ -0,0 +1,38 @@
---
title: "ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions"
subtitle: ""
authors:
- Phuoc Pham Van Long
- Duc Anh Vu
- nhat
- Xuan Long Do
- Anh Tuan Luu

author_notes: [Equal Contribution, Equal Contribution, Equal Contribution, Equal Contribution]
doi: ""

# Schedule page publish date (NOT publication's date).
publishDate: '2024-02-28T00:00:00Z'
publication_types: ['paper-conference']

# Publication name and optional abbreviated publication name.
publication: In *The 39th ACM/SIGAPP Symposium On Applied Computing 2024*
publication_short: In *ACM/SIGAPP SAC 2024*

abstract: "Mathematical questioning is crucial for assessing students' problem-solving skills. Since manually creating such questions requires substantial effort, automatic methods have been explored. Existing state-of-the-art models rely on fine-tuning strategies and struggle to generate questions that heavily involve multiple steps of logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as ChatGPT have excelled in many NLP tasks involving logical and arithmetic reasoning. Nonetheless, their applications in generating educational questions are underutilized, especially in the field of mathematics. To bridge this gap, we take the first step to conduct an in-depth analysis of ChatGPT in generating pre-university math questions. Our analysis is categorized into two main settings: context-aware and context-unaware. In the context-aware setting, we evaluate ChatGPT on existing math question-answering benchmarks covering elementary, secondary, and tertiary classes. In the context-unaware setting, we evaluate ChatGPT in generating math questions for each lesson from pre-university math curriculums that we crawl. Our crawling results in TopicMath, a comprehensive and novel collection of pre-university math curriculums collected from 121 math topics and 428 lessons from elementary, secondary, and tertiary classes. Through this analysis, we aim to provide insight into the potential of ChatGPT as a math questioner."

# Display this page in the Featured widget?
featured: true

url_pdf: 'https://arxiv.org/abs/2312.01661'
url_code: 'https://github.com/dxlong2000/ChatGPT-as-a-Math-Questioner'
url_dataset: ''
url_poster: ''
url_project: ''
url_slides: ''
url_source: ''
url_video: ''

image:
preview_only: false
---
