Data Engineer - Spark - Search Technologies

We’re searching for a talented Data Engineer to join the Platform team at one of our key clients, Lucidworks. Lucidworks is the commercial company behind Apache Lucene/Solr, the world’s leading open source search platform. The Platform team at Lucidworks builds the foundation of the company’s cloud-native microservices architecture, orchestrated by Kubernetes. The team owns the design and implementation of the API gateway, security, cloud ops, workflows and job scheduling, Apache Spark integration, messaging framework (Apache Pulsar), and ML model ops / serving infrastructure (Seldon Core / Argo). To be successful in this role, you should be passionate about solving data analytics problems at scale using SQL, Java, and Scala. Some exposure to Kubernetes and cloud platforms is preferred.

Job Responsibilities

  • Use the Spark Scala API to build data processing and SQL analytics jobs
  • Build reusable libraries and utilities in Scala to support common tasks
  • Support and refactor an existing codebase containing many diverse Spark jobs
  • Design data-intensive workflows that read/write large data sets from/to cloud storage (GCS, S3)
  • Maintain and improve an existing Spark job execution framework on Kubernetes
  • Maintain and improve the spark-solr open source project, including porting to Spark 3
  • Provide example Jupyter notebooks for common analytics tasks

Required Skills & Qualifications

  • BS in computer science or a similar field; Master’s degree or higher preferred
  • Deep knowledge of Spark fundamentals required
  • Mastery of Git, Gradle, Jenkins, Bash, SQL, Scala, and Java
  • A minimum of 5 years of experience with large-scale distributed systems
  • Experience with Kafka Streams, Flink, Spark, Storm, or a similar streaming data platform
  • A minimum of 3 years of experience using messaging platforms like RabbitMQ, ActiveMQ, Kafka, or Apache Pulsar
  • Solid understanding of Kubernetes, Helm, and Docker
  • Experience with big data analytics highly preferred
  • Resourcefulness – willing to jump in, work with both opportunity and constraint, and leverage existing resources to accomplish goals
  • Team player – confident collaborating with a diverse community of people and personalities across geographies, backgrounds, and professional abilities
  • Strong interpersonal, written, and verbal communication skills
  • Empathy and care for all stakeholders of Lucidworks, including employees, executives, partners, and guests
