- I am currently a third-year Ph.D. candidate at Shanghai Jiao Tong University and Shanghai AI Laboratory.
- My research interests include computer vision and music generation, with a focus on vision-language models.
- You can contact me via wangzhaokai [at] sjtu [dot] edu [dot] cn.
- Homepage
About some of my repositories
Publications
- CNMT: Confidence-aware Non-repetitive Multimodal Transformers for TextCaps (AAAI 2021)
- CMT: Video Background Music Generation with Controllable Music Transformer (ACM MM 2021 Best Paper Award)
- SymMV: Video Background Music Generation: Dataset, Method and Evaluation (ICCV 2023)
- PIIP: Parameter-Inverted Image Pyramid Networks (NeurIPS 2024 Spotlight)
- ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning (EMNLP 2024 Industry Track & KDD UrbComp 2024 Best Paper Award)
- VMB: Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation
Research notes
- COCO-leaderboard: Notes on the COCO object detection leaderboard
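Fun games and tools
- Sanguosha: A text-based version of the card game Sanguosha
- GPT-turtlesoup: AI Turtle Soup (situation puzzles) implemented with ChatGPT; GPT sets the puzzles, plays as a contestant, and acts as the judge
- Scraper: Scrapers for Xiaohongshu, WeChat Official Accounts, and Mafengwo
- Pokemon-Types-PageRank: A ranking of Pokémon types computed with the PageRank algorithm
- wordle-solver: A solver for the game Wordle
- HRM-architecture: CPU, compiler, and other architecture designs based on the game Human Resource Machine
- wzk-Game-Collection: A collection of small Python games, including Aeroplane Chess, Minesweeper, Texas Hold'em, 2048, Gomoku, and more
- Arxiv-Assistant: Automatically fetches the daily list of new arXiv papers, filters them with GPT, and sends email notifications
- luna: A simple version control system
- hahaha: Automatically generates memes
- wzk-pypi-package: My own Python package, a collection of just-for-fun code such as small games and scrapers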
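University coursework
- BUAA-CS-course-notes: Code and final-exam review notes for computer science courses at Beihang University (BUAA), with code for many courses
- BUAA-getscore: A small tool for checking grades at BUAA
- PhysicsExperiment: Data calculation programs for the Fundamental Physics Experiment course
- pku-nsd-double-major: Review materials for the economics double-major courses at Peking University's National School of Development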