Trend Micro is a global leader in cybersecurity solutions. We are one of the cloud backend teams responsible for building a data lake platform that enables data analysis across various data sources in Consumer CBU. The group's vision is to provide a robust platform for data-driven decision making and for delivering machine learning/AI models. To achieve this goal, we use data streaming techniques to build data pipelines that are scalable, repeatable, and secure. We also maintain a machine learning platform for developing more advanced applications. As a developer, you will design and develop data pipelines with high availability and high data quality. We rely heavily on automation to increase productivity and system reliability. Most of our systems run on AWS, though not all of them, so you need to be familiar with AWS services and understand parallel computing concepts to ensure the performance of your designs.
- Communicate with the stakeholders of each data source to understand its data types, value ranges, and update frequency. Also learn how the source system transitions between states and which events it triggers.
- Design highly reliable, scalable, fault-tolerant data pipelines that feed extremely high volumes of data into the data lake.
- Apply machine-learning techniques to design and operate models.
- Proactively solve problems and define process improvements.
- Master's degree in Computer Science or a related discipline.
- 3 to 5+ years of hands-on development experience.
- Python programming.
- Familiar with AWS data streaming, data computing and machine learning services.
- Familiar with Hadoop or Apache Spark development.
- TensorFlow, Keras development.
- Experience with open-source projects related to distributed computing or data streaming.
- Experienced or interested in data requirement analysis.
- Strong command of data structures and algorithms.