[About the job]
Our team is looking for experienced engineers to work on exciting technical problems for marketing AI. On the 8ndpoint team, you’ll have the opportunity to own and drive projects that help us scale the platform. As a Data Engineer, you’ll be responsible for making sure millions of events are backed up, transformed, and delivered to their destinations on time. You’ll have the full support of engineering to ship greenfield solutions, and you’ll also advise our product managers on how we can best deliver the power of AI to our customers.
MoBagel has an extraordinarily open and relaxed work culture. There’s immense freedom to work on what you think is most important, and generous support for personal development. We’re a small and rapidly growing team with great work-life balance, a generous remote work policy, open and supportive teammates, and free food. Come meet the team!
Since this role is highly cross-functional, there are ample opportunities (and support) for diving into DevOps, MLOps, machine learning systems, and machine learning modeling projects.
[Scope]
1. Maintain and iterate on our end-to-end data pipeline lifecycle: work with stakeholders, gather requirements, design new systems, implement data pipelines and validation pipelines.
2. Maintain and iterate on ingestion and orchestration infrastructure while anticipating future scalability risks.
3. Build and maintain complex Airflow DAGs (or equivalent) to meet changing data science, product, and engineering requirements.
4. Develop CI/CD and data integration testing pipelines in conjunction with Data Science and Engineering.
[Requirements]
1. 2+ years of professional data engineering experience.
2. 2+ years of professional software development experience or strong Computer Science fundamentals, including data structures and algorithms.
3. Expertise in SQL, data warehousing, and ETL/ELT processes; significant experience with data orchestration tools such as Airflow or Prefect.
4. Familiarity with cloud services (e.g., GCP, AWS, Azure).
5. Demonstrated experience working cross-functionally across different teams and functions.
[Nice to Have]
1. Interest in building your own ML models.
2. Expertise in designing production data models and analyzing database usage patterns.
3. Experience on a data or machine learning team in a production environment.
4. Experience with event streaming architectures and protocol buffers or other serialization protocols.
5. Familiarity with distributed data systems fundamentals.
[Tech Stack]
We prioritize the ability to learn new technologies quickly over existing domain knowledge, but the following is a snapshot of what our day-to-day looks like:
1. Development: Python, Java, Scala
2. Machine Learning: Python (pandas, scikit-learn, XGBoost, etc.)
3. Orchestration: Airflow, Kubernetes, Docker
4. GCP: BigQuery, GKE, Artifact Registry, Cloud SQL, Pub/Sub
5. Frontend: React, TypeScript
6. DevOps: ArgoCD, Rancher/Longhorn
7. VCS & CI/CD: GitLab