Big Data Engineer

Dexian
Whitefield, Bengaluru, Karnataka

Job details

Pay

  • Up to ₹28,00,000 a year

Job type

  • Full-time

Full job description

Your Responsibilities:

  • Design, develop, and maintain scalable data processing pipelines using Hadoop and Spark.
  • Implement data integration and ETL processes to ingest and transform large datasets.
  • Collaborate with data scientists, analysts, business partners and other stakeholders to understand data requirements and deliver solutions.
  • Optimize and tune Hadoop and Spark jobs for performance and efficiency.
  • Manage and maintain data storage solutions, ensuring data integrity and security.
  • Utilize GitHub for version control and collaboration on code development.
  • Work with CDP (Cloudera Data Platform) to manage and deploy data applications.
  • Integrate and manage data solutions on Azure and Snowflake, ensuring seamless data flow and accessibility.
  • Monitor and troubleshoot data processing workflows, resolving issues promptly.
  • Stay updated with the latest industry trends and technologies in big data and cloud computing.
  • Explore and develop skills in newer technologies according to the big data technology roadmap.
  • Integrate and manage data solutions on Azure and on-premises infrastructure, ensuring seamless data flow and accessibility.

Qualifications:

  • Strong experience with Apache Spark, Hadoop ecosystem, Hive, Iceberg, HBase, Apache Kafka, Spark Streaming or Flink, and Oozie (Airflow/NiFi good to have).
  • Proficient in Scala, Java, and/or Python with solid coding and debugging skills.
  • Applied experience with Solr for indexing and search workloads.
  • Familiarity with Git/GitHub, GitHub Actions, and Azure DevOps; cloud experience (Azure/AWS) is good to have.
  • Knowledge of SQL, data modeling (nice to have), and experience working in UNIX or UNIX‑like environments.
  • Experience with SCRUM or similar Agile frameworks; strong problem‑solving skills; able to produce clear technical documentation.
  • Self‑starter and effective individual contributor; Oil & Gas domain knowledge is good to have.

Most Required Skillset for the job:

  • Extensive work on the Hadoop ecosystem, including Apache Hadoop, Apache Spark, Apache Hive, Apache Kafka, HDFS, YARN, and Oozie, for building scalable Big Data processing and ETL pipelines.
  • Hands-on experience working with Cloudera Data Platform (CDP) for managing enterprise-scale Big Data workloads, distributed data processing, cloud-integrated analytics, data governance, and lakehouse architectures.
  • Strong experience with Cloudera Distribution Hadoop (CDH) environments for deploying and managing Hadoop ecosystem components, including Spark, Hive, HDFS, Impala, Kafka, and distributed ETL/data ingestion workflows.
  • Experience designing and maintaining scalable data pipelines on CDP/CDH clusters with expertise in distributed computing, performance tuning, workflow orchestration, and large-scale data processing.

Pay: Up to ₹2,800,000.00 per year

Benefits:

  • Health insurance
  • Provident Fund

Experience:

  • Hadoop Ecosystem: 6 years (Required)
  • Cloudera Data Platform: 1 year (Required)
  • Cloudera Distribution Hadoop: 1 year (Required)

Work Location: In person