Data Engineer
Atlanta, GA
Full Time
Mid Level
About GAINSystems:
GAINS is a leading provider of cloud supply chain solutions based in the Chicago neighborhood of Wicker Park. As part of the Francisco Partners portfolio of specialized companies, we are rapidly growing and expanding our global teams to drive innovation, deliver customer value, and accelerate market leadership.
Supply chain volatility has made it difficult for businesses to plan and keep their customer promises. GAINS helps companies address these challenges with innovative solutions leveraging proven AI and ML techniques. Our team of industry and technology experts rapidly delivers transformational value resulting in sustainable and measurable ROI-based impact for our global customers. If you are a technology enthusiast who wants to make an impact, then GAINS is for you.
Description:
We are seeking a Data Engineer to join our dynamic team and play a key role in designing, building, and optimizing our data pipelines, ML operations, and streaming architectures. The ideal candidate will have strong expertise in Python, PySpark, Databricks, and Kafka, with experience handling large-scale data processing, real-time data streaming, and cloud-based data solutions.
Key Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines using Python, PySpark, and Databricks (a minimal illustrative sketch follows this list).
- Implement real-time data streaming solutions using Apache Kafka.
- Optimize data ingestion, processing, and transformation workflows for performance and reliability.
- Work closely with data scientists, analysts, and business stakeholders to provide high-quality data solutions.
- Ensure data integrity, security, and compliance with industry best practices.
- Monitor and troubleshoot data pipeline performance and real-time streaming issues.
- Implement DevOps and CI/CD practices for data engineering workflows.
- Collaborate with cross-functional teams to support business intelligence, machine learning, and analytics initiatives.
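For illustration only, the sketch below shows the kind of pipeline this role builds: a PySpark Structured Streaming job that reads a hypothetical Kafka topic and appends the parsed events to a Delta table. The broker address, topic name, event schema, and storage paths are placeholders, and the example assumes a Databricks-style runtime where the Spark Kafka connector and Delta Lake are already available.

```python
# Illustrative sketch only: stream hypothetical order events from Kafka into a Delta table.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Hypothetical schema for JSON order events on the Kafka topic.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("sku", StringType()),
    StructField("quantity", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the raw stream from Kafka (broker and topic names are placeholders).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "orders")
       .load())

# Kafka delivers values as bytes; cast to string and parse the JSON payload into typed columns.
orders = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), schema).alias("o"))
          .select("o.*"))

# Append parsed events to a Delta table, with a checkpoint location for fault-tolerant recovery.
query = (orders.writeStream
         .format("delta")
         .option("checkpointLocation", "/tmp/checkpoints/orders")
         .outputMode("append")
         .start("/tmp/delta/orders"))
```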
Requirements
- Proven experience as a Data Engineer or in a similar role, with 3 or more years of industry experience.
- Strong proficiency in Python for data processing and automation.
- Hands-on experience with Databricks for big data processing and analytics.
- Expertise in Apache Kafka for building scalable real-time data streaming solutions.
- Experience with SQL and NoSQL databases for structured and unstructured data storage.
- Familiarity with cloud platforms (AWS, Azure, or GCP) and their data services.
- Knowledge of CI/CD pipelines, containerization (Docker, Kubernetes), and orchestration tools like Airflow.
- Strong problem-solving and debugging skills in distributed computing environments.
- Strong understanding of DataOps principles.
- Experience with Microsoft Azure, Delta Lake, and Lakehouse architecture.
- Experience with machine learning workflows, specifically MLflow.