
Big Data | Data Software Engineer Remote

Big Data | Data Software Engineer Description

Job #: 82859
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

You are curious, persistent, logical, and clever – a true techie at heart. You enjoy living by the code of your craft and developing elegant solutions for complex problems. If this sounds like you, this could be the perfect opportunity to join EPAM as a Data Software Engineer (formerly "Big Data"). Scroll down to learn more about the position's responsibilities and requirements.

Requirements

  • 3+ years of experience in software development with one (preferably two) of the following programming languages: Python/Java/Scala
  • Experience with Linux OS: ability to configure services and write basic shell scripts; understanding of network fundamentals
  • Good knowledge of SQL (DML, DDL, advanced functions) and NoSQL (MongoDB, Cassandra, Elasticsearch)
  • Object-oriented or functional programming design
  • Advanced software development experience, e.g., configuration management, monitoring, debugging, performance tuning, documentation, and Agile practices
  • Experience with testing (unit, integration, end-to-end)
  • Engineering experience and practice in Data Management, Data Storage, Data Visualization, Integration, Operation, Security
  • Experience building data ingestion pipelines and orchestrating data pipelines
  • Experience with data modeling; hands-on development experience with modern Big Data components
  • Our tech stack includes Hadoop, Apache Hive, Spark, Kafka, Apache Beam, Storm, Jenkins, Streaming, Databricks, and Airflow
  • Cloud experience in at least one of the following clouds: Azure, GCP, AWS
  • Good understanding of CI/CD principles and best practices (Jenkins, Travis CI, Spinnaker)
  • Container experience (Docker, Kubernetes)
  • Analytical approach to problem solving; excellent interpersonal, mentoring, and communication skills
  • English proficiency
  • Advanced understanding of distributed computing principles

Technologies

  • Programming Languages: Python/Java/Scala/Kotlin and SQL
  • Cloud-native stack: Databricks, Azure Data Factory, AWS Glue, AWS EMR, Athena, GCP Dataproc, GCP Dataflow
  • Big Data stack: Spark Core, Spark SQL, Spark ML, Kafka, Kafka Connect, Airflow, NiFi, StreamSets
  • NoSQL: Cosmos DB, DynamoDB, Cassandra, HBase, MongoDB
  • Queues and stream processing: Kafka Streams, Flink, Spark Streaming

We offer

  • Health Insurance
  • Life Insurance (SVO)
  • Occupational Risk Insurance (ART)
  • Paid Time Off – 14 calendar days of vacation per year, increasing with seniority in line with local law
  • Sick leave
  • Exceptional Leave. Take paid time off for your major life changes (childbirth, marriage, etc.)
  • Compensation of costs for internet, electricity, and personal laptop usage (if applicable)
  • Stable full-time workload
  • Thousands of projects for top brands
  • Stable income
  • Referral Program
  • Certification opportunities
  • Unlimited access to LinkedIn learning solutions
  • Language courses
  • Relocation Assistance Package

Conditions

  • By applying to our role, you are agreeing that your personal data may be used as set out in EPAM's Privacy Notice (https://www.epam.com/applicant-privacy-notice) and Policy (https://www.epam.com/privacy-policy)

About the Project

The term "Big Data" is becoming obsolete: most Big Data technologies are now de facto standards for data management, so we have rebranded the "Big Data" primary skill as "Data Software Engineering".
A Data Software Engineer enables data-driven decision making by collecting, transforming, and publishing data. A Data Software Engineer should be able to design, build, operationalize, secure, and monitor data processing systems with a particular emphasis on security and compliance; scalability and efficiency; reliability and fidelity; and flexibility and portability. A Data Software Engineer should also be able to leverage, deploy, and continuously train pre-existing machine learning models.

A Data Software Engineer must have solid knowledge of data processing languages, such as JVM-based languages (Scala, Java, ...) or Python, plus SQL, Linux, and shell scripting. They also need to understand parallel data processing and data architecture patterns.
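To make the collect–transform–publish pattern above concrete, here is a minimal sketch in Python using only the standard library. All names and data are hypothetical (in practice the sources and sinks would be tools from the stack above, such as Kafka or Spark); the thread pool merely hints at the parallel data processing the role involves.

```python
from concurrent.futures import ThreadPoolExecutor

def collect():
    """Collect raw records (stand-in for reading from Kafka, files, etc.)."""
    return [
        {"user": "a", "amount": "10.5"},
        {"user": "b", "amount": "4.0"},
        {"user": "a", "amount": "2.5"},
    ]

def transform(record):
    """Clean and type a single record; independent, so safe to run in parallel."""
    return {"user": record["user"], "amount": float(record["amount"])}

def publish(records):
    """Aggregate and 'publish' results (stand-in for writing to a warehouse)."""
    totals = {}
    for r in records:
        totals[r["user"]] = totals.get(r["user"], 0.0) + r["amount"]
    return totals

def run_pipeline():
    raw = collect()
    # Transform records concurrently, mirroring parallel data processing.
    with ThreadPoolExecutor(max_workers=4) as pool:
        cleaned = list(pool.map(transform, raw))
    return publish(cleaned)

print(run_pipeline())  # {'a': 13.0, 'b': 4.0}
```

Real pipelines replace each stage with a distributed equivalent (e.g. a Spark job or an Airflow-orchestrated task), but the stage boundaries stay the same.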
