Closed or Expired Job Posting
This job posting is closed or has expired and is no longer open for applications.
Job Description
- Develop and maintain data pipelines that perform ETL processes.
- Work closely with data scientists and software developers to implement pipelines for advanced analytics, including machine learning models.
- Select features and build and optimize pipelines using supervised and unsupervised machine learning techniques.
- Extend the company's and customers' data with third-party sources of information when needed.
- Enhance data collection procedures to include information relevant to building analytic systems.
- Process, cleanse, and validate the integrity of data used for analysis.
- Build high-quality data services.
- Prepare unit test cases and write unit test code.
- Write automated build, execution, and deployment scripts for software artifacts.
- Fix bugs in open-source software supported by the company and in software products developed by the company.
Personal Skills
- Self-starter with a proactive attitude.
- Excellent written and verbal communication skills for coordinating across teams.
- Individual contributor able to work effectively across different teams.
- Excellent organizational, time management, and presentation skills.
- Ability to communicate effectively with peers, management, and business groups.
Technical Skills
· Experience with Big Data platforms such as the Hadoop ecosystem and tools such as Spark, Kafka, and NiFi.
· Experience building ETL pipelines over a variety of data sources and formats.
· Experience with NoSQL databases such as MongoDB, Cassandra, HBase, ArangoDB, Neo4j, and Elasticsearch.
· Strong coding skills in Python are required; knowledge of and experience with Java and Scala are highly desirable.
· Ability to write scripts that preprocess training data and feed it to a model automatically.
· Prior knowledge of container management (e.g., Docker) is preferred.
· Experience with Linux administration.
· Deep understanding of, and applied experience with, OOP concepts and writing modular code.
· Experience developing applications with Python frameworks such as Django is preferred.
· Proficiency with common Python data science packages such as scikit-learn, NumPy, SciPy, Matplotlib, and Plotly.
· Proficiency in query languages such as SQL.
· Proven hands-on experience with a variety of data models, including structured, semi-structured, and unstructured data.
· Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages and drawbacks is preferred.
· Python
· ETL
· Django
· Apache Spark
· Apache NiFi
· Graph Data Model
· SQL / NoSQL
· Machine Learning
· Streaming Analytics
· OOP
Education
Bachelor’s degree or equivalent experience; computer science or engineering preferred.
About This Company
Giza Systems, a leading systems integrator in the MEA region, designs and deploys industry-specific technology solutions for asset-intensive industries such as Telecoms, Utilities, Oil & Gas, Transportation, and other market sectors. We help our clients streamline their operations and businesses through our portfolio of solutions, managed services, and consultancy practice. Our team of 1,000 professionals is spread throughout the region, with anchor offices in Cairo, Riyadh, Dubai, Doha, Nairobi, Dar es Salaam, Abuja, Kampala, and New Jersey, allowing us to serve an ever-increasing client base in over 40 countries.