Projects
Research & Technical Projects
Showcasing my work in Vision and Language.
🔬 Research Projects
Ongoing Projects (WIP)
Duration: 2025
Large-Scale Indoor Object Search Engine
Conferences: RSJ 2024, JSAI 2024
- JSAI 2024 Excellence Award (Top 3% of 900+ presentations)
RSJ24 Presentation:
JSAI24 Presentation:
Referring Expression Segmentation with Diffusion Models
Conference: RSJ 2023
- RSJ 2023 Excellence Award (Top 1% of 800+ presentations)
RSJ23 Presentation:
🏆 Competition Projects
DialFRED Challenge Winner @ CVPR 2023
- 1st place in an international competition held at a top-tier venue
💼 Industry Projects
Sarashina2-Vision: Japanese-Specialized Vision-Language Models
Company: SB Intuitions Corp.
Duration: Aug 2024 - Present
Role: Research Intern
Contributing to the development of large-scale vision-language models specialized for Japanese cultural and linguistic contexts.
- Key Contributions:
  - Performance evaluation of 8B and 14B parameter models
  - Development of an internal evaluation platform
  - Technical blog writing and community engagement
- Publications:
Production ML Systems
Companies: pluszero Inc., Elith Inc.
Duration: 2023 - 2024
Role: ML Engineer Intern
Developed and deployed machine learning solutions for real-world applications across multiple startups.
📚 Conference Presentations & Research Resources
Journal Club Presentations
Hyperbolic Image-Text Representations
Improving Cross-Modal Retrieval with Diverse Embeddings
Cost Aggregation with 4D Convolutional Swin Transformer
Technical Writing
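- Zenn.dev: "Making Friends with the Intermediate Layers of a Model Written in PyTorch" (PyTorchで書いたモデルの中間層と友達になろう)
- Book Review: Public review of "Deep Learning from Scratch 5: Generative Models" (『ゼロから作るDeep Learning⑤ -生成モデル編』) (2024)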
🎓 Educational Projects
Advanced Machine Learning Course Instruction
Institution: Keio AI and Advanced Programming Consortium
Duration: Apr 2023 - Dec 2023
Focus: Diffusion Model Theory and Applications
- Curriculum Development: Created comprehensive course materials on diffusion models
- Teaching: Led practical sessions and theoretical discussions for 4th-year students
High School AI Education
Institution: Yokohama Science Frontier High School
Duration: 2023
Course: Science Literacy I
- Outreach: Delivered specialized lectures on AI and machine learning
- Engagement: Inspired the next generation of researchers through hands-on demonstrations
🎭 Creative & Technical Projects
Theater Production Technology
Organizations: 劇団二進数, 創像工房in front of.
Duration: 2020 - 2024
Roles: Performer, Audio Engineer, Technical Director
Combining technical skills with creative expression in live theater productions.
- Technical Skills: Sound design, audio engineering, live mixing
- Creative Skills: Performance, script development, production management
- Recognition: Multiple awards, including the Audience Award (観客賞) and the Jury's Encouragement Award (審査員奨励賞)
Notable Productions:
- 『死して尚、生きてナオ』 - Audio Staff (Sato Sakichi Theater Festival 2024, 佐藤佐吉演劇祭2024)
- 『有象無象³』 - Audio Staff
- 『脇役人生の転機』 - Audio Staff (competition winner)
- 『大海を知るかもめ』 - Chief Audio Engineer
🛠️ Technical Expertise by Domain
Multimodal AI & Vision-Language Models
- Cross-modal retrieval and ranking systems
- Vision-language model fine-tuning and evaluation
- Japanese-specialized multimodal models
Computer Vision & Deep Learning
- Object detection and segmentation
- Diffusion models for generative tasks
- Referring expression understanding
Natural Language Processing
- Instruction following and dialog systems
- Dense text processing for retrieval
Robotics & Applications
- Indoor navigation and object search
- Embodied AI and instruction following
💡 Interested in collaboration? I’m always excited to work on innovative projects at the intersection of AI, robotics, and real-world applications. Feel free to reach out!
This portfolio showcases projects from 2020 to 2025. For the most recent updates, check my GitHub or research presentations.