Job Description

Job Title: Software Engineer IV - Big Data, Cloud
Job Location: Pennsylvania

Responsibilities:

  • Develop custom batch-oriented and real-time streaming data pipelines within the MapReduce ecosystem, migrating flows from ELT to ETL (a representative streaming sketch follows this list)
  • Ensure proper data governance policies are followed by implementing or validating data lineage, quality checks, classifications, etc.
  • Act in a technical leadership capacity
  • Mentor junior engineers and new team members and apply technical expertise to challenging programming and design problems
  • Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
  • Bring a quality mindset: squash bugs with a passion and work to prevent them in the first place through unit testing, test-driven development, version control, and continuous integration and deployment
  • Lead change, be bold, and innovate to challenge the status quo
  • Be passionate about solving customer problems and develop solutions that build a loyal customer and community following
  • Conduct design and code reviews
  • Analyze and improve efficiency, scalability, and stability of various system resources
  • Contribute to the design and architecture of the project
  • Operate within an Agile development environment and apply its methodologies
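
For illustration, a minimal Scala sketch of the kind of real-time streaming pipeline described above: ingest from Kafka with Spark Structured Streaming and land curated data on HDFS. The topic name, broker address, and output paths are hypothetical placeholders, not a prescribed design.

    // Minimal sketch of a Kafka -> Spark Structured Streaming -> HDFS flow.
    // Topic name, broker address, and output paths are hypothetical placeholders.
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object StreamingPipelineSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("streaming-pipeline-sketch")
          .getOrCreate()
        import spark.implicits._

        // Ingest raw events from Kafka (hypothetical topic and broker).
        val raw = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()

        // Kafka delivers key/value as binary; cast the value to string and derive a partition date.
        val events = raw.selectExpr("CAST(value AS STRING) AS payload", "timestamp")
          .withColumn("ingest_date", to_date($"timestamp"))

        // Write the transformed stream to partitioned Parquet on HDFS.
        val query = events.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/events/curated")
          .option("checkpointLocation", "hdfs:///data/events/_checkpoints")
          .partitionBy("ingest_date")
          .outputMode("append")
          .start()

        query.awaitTermination()
      }
    }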

Required Qualifications:

  • Bachelor's degree or equivalent experience
  • Four+ years of experience in software engineering
  • Proficient understanding of distributed computing principles
  • Experience developing ETL processing flows using MapReduce technologies like Spark and Hadoop (see the batch ETL sketch after this list)
  • Experience developing with ingestion and clustering frameworks, such as Kafka, ZooKeeper, and YARN
  • Experience building stream-processing systems using solutions such as Storm or Spark Streaming
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Good knowledge of Big Data querying tools, such as Pig or Hive
  • Good understanding of Lambda Architecture, along with its advantages and drawbacks
  • Proficiency with MapReduce and HDFS
  • Experience with integration of data from multiple data sources
  • Ability to troubleshoot and resolve ongoing issues with operating the cluster
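
As a point of reference, a minimal Scala sketch of the kind of batch ETL flow named above: extract from two sources on HDFS, apply a simple transform and quality check, and load a partitioned Hive table. The paths, column names, and table name are hypothetical.

    // Minimal sketch of a batch ETL flow that integrates two sources and loads a Hive table.
    // Input paths, column names, and the target table are hypothetical placeholders.
    import org.apache.spark.sql.SparkSession

    object BatchEtlSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("batch-etl-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Extract: pull records from two hypothetical sources on HDFS.
        val orders    = spark.read.parquet("hdfs:///raw/orders")
        val customers = spark.read.option("header", "true").csv("hdfs:///raw/customers")

        // Transform: join the sources and apply a simple quality check (drop rows missing keys).
        val curated = orders
          .join(customers, Seq("customer_id"), "inner")
          .na.drop(Seq("customer_id", "order_id"))

        // Load: write the curated result to a partitioned Hive table for downstream querying.
        curated.write
          .mode("overwrite")
          .partitionBy("order_date")
          .saveAsTable("analytics.curated_orders")

        spark.stop()
      }
    }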

Preferred Qualifications:

  • Master's Degree
  • Eight+ years of experience in software engineering
  • Demonstrable advanced knowledge of data architectures, data pipelines, real time processing, streaming, networking, and security
  • One+ year of experience with Databricks and Spark
  • Management of Spark or Hadoop clusters with all included services
  • One+ year of experience with NoSQL databases, such as HBase, Cassandra, or MongoDB
  • One+ year of experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O (see the sketch after this list)
  • Demonstrable understanding of Service Oriented Architecture
  • One+ year of experience with technical writing, system documentation, and design document management
  • Two+ years of experience with Scala or Java as it relates to product development
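
For the Big Data ML toolkit item above, a minimal Scala sketch of a Spark ML pipeline (feature assembly plus logistic regression) over a hypothetical feature table; the table and column names are placeholders.

    // Minimal sketch of a Spark ML pipeline: assemble features, fit logistic regression, score a holdout.
    // Table name, feature columns, and label column are hypothetical placeholders.
    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.feature.VectorAssembler
    import org.apache.spark.sql.SparkSession

    object MlPipelineSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ml-pipeline-sketch")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical feature table with numeric columns and a binary label.
        val data = spark.table("analytics.churn_features")

        // Assemble raw numeric columns into a single feature vector.
        val assembler = new VectorAssembler()
          .setInputCols(Array("tenure_days", "monthly_spend", "support_tickets"))
          .setOutputCol("features")

        val lr = new LogisticRegression()
          .setLabelCol("churned")
          .setFeaturesCol("features")

        val pipeline = new Pipeline().setStages(Array(assembler, lr))

        val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42L)
        val model = pipeline.fit(train)

        // Score the held-out split; evaluation metrics are omitted in this sketch.
        model.transform(test).select("churned", "prediction").show(10)

        spark.stop()
      }
    }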

QBH#: 2088

Application Instructions

Please click on the link below to apply for this position. A new window will open and direct you to apply at our corporate careers page. We look forward to hearing from you!

Apply Online