Role: Full-Stack Data Engineer
Job Description
We are seeking a highly skilled Full-Stack Data Engineer to lead the design, development, and management of a data-driven solution that integrates diverse car-related data sources and ensures that all integrated data is validated, structured, and accurate, serving as the single source of truth for all car-related data services. This role combines expertise in back-end engineering and data science with working knowledge of front-end development to build a scalable, efficient, and maintainable platform.
You will be responsible for building APIs, implementing data aggregation pipelines, creating a rule engine for intelligent data unification and routing, and extending a monitoring/admin panel for solution management. The ideal candidate has a strong understanding of AWS services, data classification, and data modeling techniques, as well as fundamental web development knowledge.
Key Responsibilities
1. Data Integration and Processing
• Design, implement, and maintain pipelines that gather data from various sources, including websites, APIs, databases, and FTP servers.
• Classify and categorize data into a well-structured taxonomy for use in analytics and decision-making systems.
• Implement a data aggregation pipeline using the ELK Stack for analytics.
• Develop and configure a rule engine that unifies incoming data and makes intelligent routing decisions based on predefined rules (see the sketch after this list).
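As an illustration of the kind of rule engine meant here, the TypeScript sketch below matches incoming records against predefined rules and routes each one to a destination. All types and names (CarRecord, RoutingRule, the destination labels) are hypothetical, not part of an existing codebase.

interface CarRecord {
  source: string;                            // e.g. "api", "ftp", "scraper"
  payload: Record<string, unknown>;          // raw, unvalidated fields
}

interface RoutingRule {
  name: string;
  matches: (record: CarRecord) => boolean;   // predicate over the record
  destination: "analytics" | "enrichment" | "quarantine";
}

const rules: RoutingRule[] = [
  {
    name: "missing-vin-goes-to-quarantine",
    matches: (r) => typeof r.payload["vin"] !== "string",
    destination: "quarantine",
  },
  {
    name: "ftp-batches-need-enrichment",
    matches: (r) => r.source === "ftp",
    destination: "enrichment",
  },
];

// First matching rule wins; unmatched records default to analytics.
function route(record: CarRecord): RoutingRule["destination"] {
  return rules.find((rule) => rule.matches(record))?.destination ?? "analytics";
}

console.log(route({ source: "ftp", payload: { vin: "1HGCM82633A004352" } })); // "enrichment"

In production the rules would be loaded from the Admin Panel's database rather than hard-coded, so routing behavior can change without a redeploy.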
2. Full-Stack Development
• Extend and maintain an intuitive, responsive Admin Panel built with Node.js, NestJS/TypeORM, and TypeScript for monitoring the solution and managing data pipelines, rules, and configurations (see the sketch after this list).
• Design and maintain a robust database schema in PostgreSQL (hosted on AWS RDS) to store and retrieve data efficiently.
• Design and optimize complex queries in PostgreSQL, ensuring efficient performance through proper use of indexing.
• Build scalable RESTful APIs using AWS API Gateway and AWS Lambda.
• Develop front-end views using the Handlebars template engine.
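For a sense of the NestJS/TypeORM work involved, here is a minimal sketch of an entity and controller for managing pipeline rules from the Admin Panel. The pipeline_rules table, PipelineRule entity, and routes are hypothetical examples, not an existing schema.

import { Controller, Get, Param, ParseIntPipe } from "@nestjs/common";
import { InjectRepository } from "@nestjs/typeorm";
import { Column, Entity, Index, PrimaryGeneratedColumn, Repository } from "typeorm";

@Entity("pipeline_rules")
export class PipelineRule {
  @PrimaryGeneratedColumn()
  id: number;

  @Index()                       // the Admin Panel filters on name frequently
  @Column({ length: 128 })
  name: string;

  @Column({ type: "jsonb" })     // rule body stored as PostgreSQL jsonb
  definition: Record<string, unknown>;

  @Column({ default: true })
  enabled: boolean;
}

@Controller("rules")
export class RulesController {
  constructor(
    @InjectRepository(PipelineRule)
    private readonly repo: Repository<PipelineRule>,
  ) {}

  @Get()
  findAll(): Promise<PipelineRule[]> {
    return this.repo.find({ where: { enabled: true } });
  }

  @Get(":id")
  findOne(@Param("id", ParseIntPipe) id: number): Promise<PipelineRule | null> {
    return this.repo.findOneBy({ id });
  }
}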
3. Cloud Infrastructure and Deployment
• Utilize AWS services to deploy and manage APIs, ensuring scalability and security.
• Implement AWS S3 + CloudFront for efficient storage and delivery of image assets (see the sketch after this list).
• Extend and implement data security and access-control measures using tools such as IAM, API keys, and Amazon Cognito.
• Develop and maintain systems in Linux environments, working comfortably at the command line.
• Apply basic knowledge of Docker and Kubernetes for containerization and orchestration.
• Manage code repositories effectively using Git version control.
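As a concrete example of the S3 + CloudFront pattern, the sketch below uploads an image to S3 with long-lived cache headers and returns its CloudFront URL. The bucket name, region, and distribution domain are placeholders.

import { PutObjectCommand, S3Client } from "@aws-sdk/client-s3";
import { readFile } from "node:fs/promises";

const s3 = new S3Client({ region: "eu-central-1" });    // region is illustrative

const BUCKET = "car-image-assets";                      // placeholder bucket
const CDN_DOMAIN = "dxxxxxxxxxxxx.cloudfront.net";      // placeholder distribution

async function uploadImage(localPath: string, key: string): Promise<string> {
  await s3.send(
    new PutObjectCommand({
      Bucket: BUCKET,
      Key: key,
      Body: await readFile(localPath),
      ContentType: "image/jpeg",
      // Immutable assets let CloudFront cache aggressively at the edge.
      CacheControl: "public, max-age=31536000, immutable",
    }),
  );
  // Clients fetch through CloudFront rather than directly from S3.
  return `https://${CDN_DOMAIN}/${key}`;
}

uploadImage("./photo.jpg", "cars/12345/photo.jpg").then(console.log);

Keeping the bucket private and granting CloudFront access via origin access control is the usual hardening step, in line with the IAM/access-control bullet above.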
4. Collaboration
• Work closely with data scientists, product managers, and other developers to ensure the platform meets business requirements.
• Document processes, workflows, and designs for team collaboration and knowledge sharing.
Required Skills & Qualifications
Technical Skills
• Experience with PostgreSQL, including schema design, query optimization, and AWS RDS deployment.
• Experience with classification and categorization of complex datasets.
• Expertise in data aggregation tools like the ELK Stack (Elasticsearch, Logstash, Kibana).
• Familiarity with AWS services, including API Gateway, Lambda, S3, RDS, and CloudFront.
• Knowledge of building and using rule engines for data unification and routing.
• Deep knowledge of TypeScript.
• Excellent knowledge of Node.js and NestJS/TypeORM, including custom module development, decorators, etc.
• Proficiency in SQL, with the ability to write and optimize complex queries and to use PostgreSQL indexes properly (see the sketch after this list).
• Proficiency in Linux at the command-line level, as needed for day-to-day development.
• Basic knowledge of Docker and Kubernetes.
• Knowledge of the Git version control system.
• Experience developing Web APIs.
• Experience with front-end development and the Handlebars template engine.
• Knowledge of Python (for data gathering and pipeline development) is a plus.
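To illustrate the query-and-index proficiency expected, here is a small sketch using the node-postgres (pg) driver. The listings table, its columns, and the index are hypothetical.

import { Pool } from "pg";

const pool = new Pool();   // connection settings come from the standard PG* env vars

// One-time DDL, shown for context: a composite index covering the WHERE
// equality columns plus the ORDER BY column lets PostgreSQL return rows in
// index order instead of sorting.
//   CREATE INDEX idx_listings_make_model_price
//     ON listings (make, model, price);

async function cheapestListings(make: string, model: string) {
  const { rows } = await pool.query(
    `SELECT id, vin, price
       FROM listings
      WHERE make = $1 AND model = $2
      ORDER BY price
      LIMIT 20`,
    [make, model],
  );
  return rows;   // verify the plan with EXPLAIN (ANALYZE) while tuning
}

cheapestListings("Honda", "Civic").then(console.log);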
Soft Skills
• Strong analytical and problem-solving skills.
• Ability to work independently and collaboratively in a team environment.
• Excellent communication and documentation skills.
Nice-to-have
• Familiarity with VIN decoding and car industry-specific datasets (see the sketch below).
• Experience with modern data engineering tools such as Apache Airflow, AWS Glue, or AWS Step Functions.
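As a small taste of VIN work, the sketch below validates the check digit used in 17-character North American VINs (letters transliterate to numbers, each position carries a fixed weight, and position 9 holds the checksum). It is a simplified illustration, not a full decoder.

const VALUES: Record<string, number> = {
  A: 1, B: 2, C: 3, D: 4, E: 5, F: 6, G: 7, H: 8,
  J: 1, K: 2, L: 3, M: 4, N: 5, P: 7, R: 9,
  S: 2, T: 3, U: 4, V: 5, W: 6, X: 7, Y: 8, Z: 9,
};
const WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2];

function isValidVin(vin: string): boolean {
  if (!/^[A-HJ-NPR-Z0-9]{17}$/.test(vin)) return false;   // I, O, Q never appear in VINs
  const sum = [...vin].reduce((acc, ch, i) => {
    const value = /\d/.test(ch) ? Number(ch) : VALUES[ch];
    return acc + value * WEIGHTS[i];
  }, 0);
  const check = sum % 11;                                  // a remainder of 10 is written as "X"
  return vin[8] === (check === 10 ? "X" : String(check));
}

console.log(isValidVin("1HGCM82633A004352"));   // true (a commonly cited sample VIN)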