|Job title||Data Engineer|
|Contact name:||Jacob Evident|
|Published:||November 22, 2021 5:17|
Flexible remote options (onsite also available)
- Act as a technical leader in developing the data stack and resolving problems, with end-to-end ownership of data quality in our core datasets and data pipelines
- Manage, mentor, coach, and guide colleagues through technical challenges
- Design, test, install and maintain highly scalable and data-intensive systems
- Orchestrate data projects such as real-time data exchange with third parties, and data migration projects
- Translate software designs and user requirements into specific data models that are efficient, scalable, and easy to work with
- Identify and solve issues with data pipelines regarding consistency, integrity, and completeness
- Review, maintain and extend distributed systems in production. Support other teams in using and integrating with those systems
- Maintain the technical excellence of the Data Engineering team
- Drive a culture of data quality and best practices across the business, advocating for Data Engineering with both technical and non-technical audiences
- Proven professional experience as a Data Engineer or in a related position, working with systems and data infrastructure at scale
- Proficiency in Python; Scala proficiency is a plus
- Experience designing and building large-scale data and ETL pipelines in distributed environments with technologies such as Kafka, ClickHouse, Elastic, Cassandra, Spark, etc.
- Experience optimizing data models, pipelines and procedures for performance, cost, and usability
- Knowledge of the main architecture models and concepts, such as replication, sharding, consistency, horizontal and vertical scaling, quorum, and idempotency
- Experience in supervising and mentoring team members
- Ability to drive and lead projects from a technical perspective
- Understanding of basic analytics and machine learning concepts
- Preferably a university degree in Software Engineering or another relevant field
- Excellent communication (written and spoken) and stakeholder management
Preferred skills and tool experience include:
- Production-level experience with a multi-regional Kafka platform (brokers, connectors, mirrors)
- Proven experience with SQL, NoSQL and OLAP databases, preferably PostgreSQL, Cassandra and ClickHouse or another OLAP database such as Druid or Pinot.
- Pipeline orchestration with Apache Airflow or other tools like Luigi.
- Google Cloud environment.
- Docker and Kubernetes.
- Distributed processing frameworks like Apache Spark, Dask or Hadoop.
- Infrastructure provisioning automation tools such as Ansible and Terraform.
- Proficiency with the PyData stack is a plus.
- Production-level experience with the Elastic stack (Elasticsearch, Logstash, and Kibana) is a plus.
If you are interested in this position, please reach out!