About

I am passionate about solving complex data problems. As a software and data engineer at Japan’s largest telecommunications company, I built a data platform and learned to work across cultural and language barriers. This experience has equipped me with strong technical and communication skills and the ability to collaborate with many stakeholders. I have worked on data projects involving data migration, anomaly detection, data visualization, data warehouse design, ML model development, MLOps, and ETL.

Work Experience 💼

Slalom Build

    July 2023 ~ Current

    Senior Data Engineer

    • Migration of Forecast Model to Snowflake (2024)

      • Transitioned the forecasting model from AWS to Snowflake using Python and Snowpark, improving integration and performance (see the Snowpark sketch below).
      • Reduced ML pipeline execution time from 2 hours to 30 minutes by optimizing pipeline design and following Snowflake best practices.
      • Improved model accuracy through extensive experiments, including sampling and feature engineering, and enforced data quality checks at each step of the pipeline for more reliable forecasting results.
      • Designed the project architecture and introduced software engineering best practices to the Data Science team, enabling more flexible experimentation, a streamlined development workflow, and increased team productivity.
      • Served as Tech Lead for the project, responsible for setting weekly milestones, steering the technical direction, and leading customer interactions for requirement gathering and alignment.
    • SQL Server to Snowflake Migration (2023)

      • Migrated 13 TB of customer data to Snowflake, using Airflow for streamlined job orchestration and ensuring a seamless data transition.
      • Used efficient export tools such as BCP, and optimized Snowflake stages, for swift data ingestion and storage.
      • Integrated AWS S3 for secure and scalable data transfers, preserving data integrity and accessibility throughout the migration.
      • Implemented a dynamic data pipeline for ongoing synchronization, continuously replicating SQL Server changes into Snowflake through an incremental copy approach (see the MERGE sketch below); this real-time data availability improves analytical capabilities and reporting accuracy.
    • Data Visualization and Anomaly Detection (2023)

      • Deployed the project’s entire infrastructure with Terraform Cloud on Google Cloud Platform, including a serverless data pipeline built on Cloud Functions and Cloud Scheduler (see the Cloud Function sketch below).
      • Designed an end-to-end anomaly detection and email alert system using GCP’s Application Integration.
      • Applied CI/CD and infrastructure deployment best practices.
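
A minimal sketch of the Snowpark pattern behind the forecast migration: read training data where it lives, fit in pandas, and write predictions back as a table. The table names, columns, and the `fit_and_predict` helper are hypothetical stand-ins, not the project’s actual code.

```python
# Sketch of a Snowpark forecast step. All names below are hypothetical.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col


def fit_and_predict(pdf):
    # Hypothetical stand-in for the real model: naive per-region mean.
    out = pdf.groupby("REGION", as_index=False)["UNITS_SOLD"].mean()
    return out.rename(columns={"UNITS_SOLD": "FORECAST_UNITS"})


def run_forecast(session: Session) -> None:
    # Read training data inside Snowflake -- no export step needed.
    history = (
        session.table("SALES_HISTORY")
        .filter(col("UNITS_SOLD").is_not_null())
        .select("REGION", "SALE_DATE", "UNITS_SOLD")
    )

    # Pull only the needed slice into pandas for model fitting.
    predictions = fit_and_predict(history.to_pandas())

    # Persist predictions as a table for downstream consumers.
    session.create_dataframe(predictions).write.mode("overwrite").save_as_table(
        "SALES_FORECAST"
    )


if __name__ == "__main__":
    # Credentials would come from a secrets manager in practice.
    session = Session.builder.configs(
        {
            "account": "<account>",
            "user": "<user>",
            "password": "<password>",
            "warehouse": "<warehouse>",
            "database": "<database>",
            "schema": "<schema>",
        }
    ).create()
    run_forecast(session)
```

Avoiding round trips out of the warehouse is one of the optimizations such a migration enables.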
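The incremental copy approach can be sketched as a staged COPY followed by a MERGE keyed on a change timestamp. The stage, table, and column names are hypothetical; in practice an Airflow task would run a step like this on a schedule.

```python
# Sketch of an incremental SQL Server -> Snowflake sync step.
# Stage, table, and column names are hypothetical.
from snowflake.snowpark import Session

MERGE_SQL = """
MERGE INTO CUSTOMERS AS tgt
USING CUSTOMERS_STAGE AS src
  ON tgt.CUSTOMER_ID = src.CUSTOMER_ID
WHEN MATCHED AND src.UPDATED_AT > tgt.UPDATED_AT THEN UPDATE SET
  tgt.NAME = src.NAME,
  tgt.EMAIL = src.EMAIL,
  tgt.UPDATED_AT = src.UPDATED_AT
WHEN NOT MATCHED THEN INSERT (CUSTOMER_ID, NAME, EMAIL, UPDATED_AT)
  VALUES (src.CUSTOMER_ID, src.NAME, src.EMAIL, src.UPDATED_AT)
"""


def sync_increment(session: Session) -> None:
    # Load the latest BCP export from the S3-backed external stage...
    session.sql("COPY INTO CUSTOMERS_STAGE FROM @customer_stage").collect()
    # ...then fold new and changed rows into the target table.
    session.sql(MERGE_SQL).collect()
```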
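For the serverless GCP pipeline, the entry point might look like the HTTP-triggered Cloud Function below, invoked by Cloud Scheduler. The metric source (BigQuery here) and the anomaly rule are assumptions; the write-up only names the building blocks.

```python
# Sketch of a Cloud Scheduler -> Cloud Function anomaly check.
# The BigQuery table and the z-score-style rule are hypothetical.
import functions_framework
from google.cloud import bigquery

ANOMALY_QUERY = """
SELECT metric_name, value
FROM `my-project.monitoring.latest_metrics`
WHERE ABS(value - expected_value) > 3 * stddev_value
"""


@functions_framework.http
def detect_anomalies(request):
    client = bigquery.Client()
    anomalies = list(client.query(ANOMALY_QUERY).result())
    if anomalies:
        # The real system sent email alerts via GCP Application Integration;
        # here we just report the count.
        return f"{len(anomalies)} anomalies detected", 200
    return "no anomalies", 200
```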

    Tech Stack

NTT Communications & Docomo Business

    April 2017 ~ June 2023

    Software Engineer (Data)

    • Data Engineering

      • Contributed to the development and maintenance of a data analysis platform that collected and analyzed data from various internal departments. Designed the system architecture of core platform components: the Trino SQL engine for compute, HDFS for storage, the authentication workflow, and the surrounding Hadoop ecosystem.
      • Collaborated with the data analysis team to improve BI dashboard performance by tuning data retrieval queries and improving the underlying physical schema.
      • Developed a data pipeline ingesting network traffic flow data from Apache Kafka into HDFS using PySpark (see the streaming sketch below). The pipeline processed over a million records per second and powered real-time anomaly detection to mitigate DDoS attacks.
      • Integrated the Trino SQL engine with other internal data platform components such as BI tools, authentication systems, and data warehouse tools. Developed custom Trino connectors in Java and contributed them to the open source community.
      • Developed a YouTube playback statistics collector using Java, Docker, and JavaScript, which gathered runtime playback statistics from over 1,000 nodes to compare end-user Quality of Experience of the Internet over QUIC versus HTTP. Analyzed and visualized the collected data in Python with Plotly, and presented the research at ACM CoNEXT 2018 in Greece.
    • Software Engineering

      • Developed a data catalog metadata platform using ReactJS to increase team productivity, with features such as advanced search for Japanese text, table search, schema information, and metadata aggregated from over 10 internal systems and tools.
      • Developed a JP-EN / EN-JP translation feature and Furigana support (Hiragana readings for Japanese Kanji) in the open source markdown documentation tool CodiMD, using the DeepL API, JavaScript, and Node.js (see the DeepL sketch below). The feature improved communication and collaboration between non-Japanese-speaking engineers and Japanese engineers.
      • Presented NTT Communications’ Trino usage and architecture design at Trino Japan Virtual Meetup 2021.
      • Trained new team members on the overall architecture of the data platform, creating tutorials and guides that improved team productivity.
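
A sketch of the Kafka-to-HDFS ingestion pattern with Spark Structured Streaming, the shape the DDoS-mitigation pipeline above would take. Broker addresses, the topic, the flow schema, and paths are hypothetical.

```python
# Sketch of a Kafka -> HDFS ingestion job with Spark Structured Streaming.
# Brokers, topic, schema, and paths below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import LongType, StringType, StructField, StructType

spark = SparkSession.builder.appName("netflow-ingest").getOrCreate()

flow_schema = StructType([
    StructField("src_ip", StringType()),
    StructField("dst_ip", StringType()),
    StructField("bytes", LongType()),
    StructField("ts", LongType()),
])

flows = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "netflow")
    .load()
    # Kafka delivers raw bytes; parse the JSON payload into typed columns.
    .select(from_json(col("value").cast("string"), flow_schema).alias("f"))
    .select("f.*")
)

# Append parsed records to HDFS as Parquet; the checkpoint makes the
# stream restartable without duplicating data.
query = (
    flows.writeStream.format("parquet")
    .option("path", "hdfs:///data/netflow")
    .option("checkpointLocation", "hdfs:///checkpoints/netflow")
    .start()
)
query.awaitTermination()
```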
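The translation feature itself lives in CodiMD’s Node.js codebase; this Python sketch only illustrates the shape of the DeepL API call it relies on. The API key placeholder is not real.

```python
# Sketch of the DeepL translate call behind the JP<->EN feature.
# The real feature is JavaScript/Node.js; this only shows the API shape.
import requests


def translate(text: str, target_lang: str = "EN") -> str:
    resp = requests.post(
        "https://api-free.deepl.com/v2/translate",
        headers={"Authorization": "DeepL-Auth-Key <your-key>"},
        data={"text": text, "target_lang": target_lang},
    )
    resp.raise_for_status()
    return resp.json()["translations"][0]["text"]


print(translate("こんにちは、世界"))  # -> roughly "Hello, world"
```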

    Tech Stack

Persistent Systems

    June 2016 ~ January 2017

    Software Developer (UI)

    • Developed SQL queries and dashboard widgets for IBM’s Tivoli Netcool Performance Manager product.
    • Designed database relations for a full-stack application monitoring live water quality via sensors and an internet gateway.

    Tech Stack

Certifications 🏆

Education 🎓

    University of Texas at Austin

    Remote

    Master of Artificial Intelligence

    Vishwakarma Institute of Technology

    Pune, India

    Bachelor of Technology in Computer Engineering
