Principal ML Pipeline and Data Architect
Company: Lucid Motors
Location: Union City
Posted on: September 16, 2021
Leading the future in luxury electric and
At Lucid, we set out to introduce the most captivating, luxury
electric vehicles that elevate the human experience and transcend
the perceived limitations of space, performance, and intelligence.
Vehicles that are intuitive, liberating, and designed for the
future of mobility.
We plan to lead in this new era of luxury electric by
returning to the fundamentals of great design – where every
decision we make is in service of the individual and environment.
Because when you are no longer bound by convention, you are free to
define your own experience.
Come work alongside some of the most accomplished minds in the
industry. Beyond providing competitive salaries, we’re providing a
community for innovators who want to make an immediate and
significant impact. If you are driven to create a better, more
sustainable future, then this is the right place for you.
Notice regarding COVID-19 Vaccination for positions
located in Newark, California
At Lucid (the “Company”), we prioritize the health and
wellbeing of our employees, families, and friends above all else.
In response to the novel Coronavirus, and the increased
transmissibility with recent variants, all Lucid Employees whose
employment is based in Newark, current and future, must be fully
vaccinated and provide proof thereof as a condition of continued or
future employment with the Company.
Accommodations due to medical or religious exemptions will be
The Principal ML Pipeline and Data Architect is responsible for
defining and leading the Data and Machine Learning Architecture,
Performance, and Scalability of Lucid ML operations, for both
Vehicle and Operational data, ingesting, processing, and storing
trillions of rows of data per day. This hands-on role helps solve
real big data problems that most of the standard tools on the
market are not capable of handling. You will be designing
solutions, writing code and automation, defining standards, and
establishing best practices across the company.
- Lead the design and implementation of ML training and inference
- Design, implement, and lead Data and Machine Learning
Architecture, Performance, and Scalability of Lucid ML
- Lead the design and deployment of large-scale ML pipelines using
open source technologies such as Kubeflow, MLflow, and Airflow
- Design and implement robust, automated, production-level
software using horizontally scalable components
- Work effectively with cross-functional teams of engineers,
product managers, and domain experts
- Design and build the next generation of ML architecture that
will power large-scale data science projects
- Present project metrics and complex ML concepts to both
technical and non-technical audiences
- Deep understanding of data design systems and experience
handling large data sets
- Implement and manage industry best practice tools and processes
such as Data Lake, Delta Lake, S3, Spark ETL, Airflow, Hive
Catalog, Ranger, Redshift, Spline, Kafka, MQTT, Timeseries
Database, Cassandra, Redis, Presto, Kubernetes, Docker, CI/CD,
- Contribute to the overall architecture, implementation and
ongoing maintenance of our codebase
- Optimize the performance and scale of our data ingestion and
processing infrastructure to serve ever-increasing volume.
- Translate big data and analytics requirements into data models
that operate at large scale with high performance, and guide
the data analytics engineers on these data models.
- Provide direction and focus in areas of high ambiguity
- Mentor junior team members
- M.S. or PhD in Computer Science, or equivalent.
- 10+ years of hands-on experience in ML pipeline, ETL, data
- 5+ years of hands-on experience in productionizing and
deploying Big Data platforms and applications. Hands-on experience
working with relational/SQL databases, distributed columnar data
stores/NoSQL databases, time-series databases, Spark streaming,
Kafka, Hive, Parquet, Avro, and more
- ML engineering or data engineering background with more than 10
years of industry experience.
- Expert in Spark, Kafka, Presto, Kubeflow, Airflow, or similar
- Experience with Kubernetes-based ML Architecture.
- Proven hands-on experience building solutions for large-scale
data infrastructures and ML pipelines
- Have hands-on experience with Scala, Spark, Python, and GoLang
to implement large-scale data flows.
- Have production experience with open source technologies like
Hive, Kafka, Airflow, HBase, etc.
- Experience developing in a highly concurrent, multi-processor,
and multi-threaded environment
- Experience with heterogeneous computing and GPGPU
- Strong knowledge and understanding of machine learning
pipelines, from standardization and normalization through
clustering, modeling, scoring, and validation
- Understanding of ETL engineering and tools so you can interface
with data integration teams
- Experience working with various data infrastructure
technologies such as datastores (SQL/NoSQL dbs) and data
streaming/processing (Spark, Kafka, Airflow, AWS Kinesis) is
At Lucid, we don’t just
welcome diversity - we celebrate it! Lucid Motors is proud to be an
equal opportunity workplace and is an affirmative action employer.
We are committed to equal employment opportunity regardless
of race, color, national or ethnic origin, age, religion,
disability, sexual orientation, gender, gender identity and
expression, marital status, and any other characteristic protected
under applicable State or Federal laws and
To all recruitment
agencies: Lucid Motors does not accept agency
resumes. Please do not forward resumes to our careers alias or
other Lucid Motors employees. Lucid Motors is not responsible for
any fees related to unsolicited resumes.